I'm wondering if there is an easy way to add sequences/taxonomy to a reference database created with QIIME.
I'm working on an eDNA metabarcoding project and want to detect species that are not present in NCBI's GenBank database. I have tissue samples from those species, and can thus sequence them with the same metabarcoding primers that I'm using for my study.
I would either Sanger sequence them with the metabarcoding primers, or would sequence them on the Illumina platform, then get a consensus sequence for that individual (although that seems a bit overkill).
I could add these directly to GenBank, then use rescript get-ncbi-data to download them in the proper format. However, I'm wondering if there's an easy way to add them manually instead.
Thanks for your help! I'd love to know if there is a standard procedure for this.
1. Import your new sequences and taxonomy as FeatureData[Sequence] and FeatureData[Taxonomy].
2. Use qiime feature-table merge-seqs to add the new sequences to your database.
3. Use qiime feature-table merge-taxa to add the new taxonomies to your database (rescript also has a merge-taxa action that could be used for this, but that is for a more advanced case).
Based on these instructions, I created a headerless taxonomy file with just the accession number from my FASTA. As with NCBI, I assumed QIIME would treat everything left of the first space in the header as the accession number.
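For anyone following along, here is a minimal Python sketch of how such a headerless taxonomy file could be generated from the FASTA headers. It assumes, as described above, that the ID is everything between ">" and the first whitespace in the header. The function names, sequence IDs, and taxonomy strings are all hypothetical placeholders, not part of QIIME.

```python
def fasta_ids(fasta_text):
    """Return sequence IDs, taken as the text between '>' and the first whitespace."""
    ids = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:  # skip malformed headers that have no ID at all
                ids.append(fields[0])
    return ids


def headerless_taxonomy_tsv(fasta_text, taxonomy_by_id):
    """Build a two-column, headerless TSV: sequence ID, a tab, then the taxonomy string."""
    rows = []
    for seq_id in fasta_ids(fasta_text):
        # A KeyError here means a sequence has no taxonomy assigned yet.
        rows.append(seq_id + "\t" + taxonomy_by_id[seq_id])
    return "\n".join(rows) + "\n"


# Hypothetical example data (placeholder IDs, sequences, and taxonomy strings).
example_fasta = (
    ">voucher001 Sanger consensus from tissue sample\n"
    "ACGTACGTACGGTTAACC\n"
    ">voucher002\n"
    "TTGACCGTAAGGCTA\n"
)
example_taxonomy = {
    "voucher001": "d__Eukaryota; g__Placeholdergenus; s__Placeholdergenus_alpha",
    "voucher002": "d__Eukaryota; g__Placeholdergenus; s__Placeholdergenus_beta",
}

print(headerless_taxonomy_tsv(example_fasta, example_taxonomy), end="")
```

Writing the returned text to a file should give something importable as a headerless taxonomy TSV, provided the taxonomy strings follow the same rank format as the rest of the reference database.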
Both the taxonomy and sequences import without a problem, but if I just merge them with my GenBank taxonomy and sequences, I don't really have a way of knowing whether the new taxonomy and sequences that I imported are properly connected.
Is there a way to verify that? Or can you affirm that this should work?
Hi @alexkrohn, one quick way would be to create a visualization of your final taxonomy file using qiime metadata tabulate ... Then you can use the search box in the upper right of the visualization to search for your newly added sequence IDs.
You can also do this with the sequence file, but there is no search box for this visualization. You can simply use the browser search function to search for your newly added IDs.
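As a complement to searching the visualizations by eye, a small script can cross-check the IDs before importing and merging. The sketch below is only an illustration: it assumes a plain FASTA file and a headerless, tab-separated taxonomy file, with the ID taken as everything left of the first whitespace in the FASTA header. The function names and example records are hypothetical.

```python
def read_fasta_ids(fasta_text):
    """Collect IDs from FASTA headers (text between '>' and the first whitespace)."""
    ids = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.append(fields[0])
    return ids


def read_taxonomy_ids(tsv_text):
    """Collect IDs from the first column of a headerless, tab-separated taxonomy file."""
    ids = []
    for line in tsv_text.splitlines():
        if line.strip():  # ignore blank lines
            ids.append(line.split("\t")[0].strip())
    return ids


def compare_ids(fasta_text, tsv_text):
    """Report IDs that appear in only one of the two files."""
    seq_ids = set(read_fasta_ids(fasta_text))
    tax_ids = set(read_taxonomy_ids(tsv_text))
    return {
        "missing_taxonomy": sorted(seq_ids - tax_ids),
        "missing_sequence": sorted(tax_ids - seq_ids),
    }


# Hypothetical example with one deliberate mismatch in each direction.
example_fasta = ">voucher001 tissue sample\nACGTACGT\n>voucher002\nTTGACCGT\n"
example_tsv = (
    "voucher001\td__Eukaryota; s__Placeholder_alpha\n"
    "voucher003\td__Eukaryota; s__Placeholder_gamma\n"
)

report = compare_ids(example_fasta, example_tsv)
print(report)
```

If both lists in the report come back empty, every new sequence has a taxonomy entry with the same ID and vice versa. With real data, the two text arguments would be read from your own FASTA and taxonomy files.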