Dear QIIME2 users,
I have a few general questions about filtering 16S data. I understand this is not a QIIME2 software problem, but I’d appreciate insights from senior biologists and QIIME2 users who work with 16S data.
I have Illumina 16S data (V4 with 515-806 primers), and I’m running QIIME2-2018.11 to examine microbial abundance and diversity in my samples. I used SILVA132 (99_16S.fna and 99_taxonomy_7_levels.txt) as the reference database and taxonomy database.
$ time qiime feature-classifier classify-consensus-blast \
  --i-query fig2a/rep-seqs.qza \
  --i-reference-taxonomy fig2a/silva132_taxonomy.qza \
  --i-reference-reads fig2a/silva132-db.qza \
  --o-classification fig2a/classify2a \
  --p-perc-identity 0.90 \
  --p-maxaccepts 1 \
  --verbose
I’ve read about taxa filtering here https://docs.qiime2.org/2018.11/tutorials/filtering/ and on this thread https://forum.qiime2.org/t/high-yield-of-d-2-alphaproteobacteria-d-3-rickettsiales-d-4-mitochondria-in-samples-from-wild-bats-artifact/7161.
After classification I scanned my exported taxonomy TSV file and realized I would have to filter out the “Unassigned” features to improve the relative-abundance estimates downstream. To my dismay, about 65% were “Unassigned”! How is this possible? Any suggestions for improving the taxonomic assignment? I thought silva_132_99_16S.fna was appropriate for this (I could be wrong!), as several posts online seem to suggest, e.g. https://www.researchgate.net/post/Can_anyone_help_with_pulling_specific_sequences_that_correspond_to_OTU_IDs_from_Qiime
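To make the “about 65%” figure reproducible, here is a minimal Python sketch that computes the fraction of features labelled “Unassigned” in an exported taxonomy table. It assumes a tab-separated file with `Feature ID` and `Taxon` columns; the function name and defaults are hypothetical. It counts features, not reads, so the result can differ from a read-weighted relative abundance.

```python
import csv
import io
from collections import Counter


def unassigned_fraction(tsv_text, taxon_column="Taxon", label="Unassigned"):
    """Return the fraction of rows whose taxon equals `label`.

    Assumes a tab-separated table with a header row containing
    `taxon_column` (an assumption about the exported file layout).
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    counts = Counter()
    for row in reader:
        taxon = (row.get(taxon_column) or "").strip()
        counts["unassigned" if taxon == label else "assigned"] += 1
    total = sum(counts.values())
    # Avoid dividing by zero on an empty table.
    return counts["unassigned"] / total if total else 0.0
```

It could be called as `unassigned_fraction(open("taxonomy.tsv").read())`, where `taxonomy.tsv` is a placeholder path for the exported table.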
I also have quite a few other assignments, including “metagenome” hits (approx. 17%). Do I need to filter these out too? I’m interested in Family, Genus and Species assignments, yet these metagenome hits (though they hit bacteria) are only classified to Phylum or Class. If I do filter them, I’m worried I’ll retain only 16% of the original data!
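Before deciding what to filter, it may help to tally how deeply each feature is classified. The sketch below is only an illustration: it assumes SILVA-style taxonomy strings of the form `D_0__Bacteria;D_1__Proteobacteria;...`, and it treats names containing “metagenome”, “uncultured” or “unidentified” as uninformative. That marker list, the rank names and the function names are assumptions, not something defined by QIIME2 or SILVA.

```python
from collections import Counter

# Rank names for the 7-level taxonomy (assumed order: D_0 .. D_6).
RANKS = ["Domain", "Phylum", "Class", "Order", "Family", "Genus", "Species"]
# Hypothetical markers treated as "no real classification at this level".
AMBIGUOUS_MARKERS = ("metagenome", "uncultured", "unidentified")


def deepest_rank(taxon):
    """Return the deepest informative rank of one taxonomy string."""
    taxon = taxon.strip()
    if not taxon or taxon == "Unassigned":
        return "Unassigned"
    depth = 0
    for level in taxon.split(";")[:len(RANKS)]:
        # Drop the "D_n__" prefix, if present, and normalize case.
        name = level.strip().split("__", 1)[-1].strip().lower()
        if not name or any(marker in name for marker in AMBIGUOUS_MARKERS):
            break
        depth += 1
    return RANKS[depth - 1] if depth else "Unassigned"


def rank_summary(taxa):
    """Count how many taxonomy strings stop at each rank."""
    return Counter(deepest_rank(t) for t in taxa)
```

Stopping at the first uninformative level is a deliberate simplification: a string with a “metagenome” label at Class counts as classified to Phylum, even if deeper levels carry names.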
Lastly, I want to confirm whether some of the “Unassigned” OTUs are truly unassignable by running a quick NCBI BLAST on them. How do I get the sequences corresponding to specific OTU IDs from my silva_132_99_16S.fna file? I’ve noted that filter_fasta.py (in QIIME1) could do this, but I’m running QIIME2. Where would I find an equivalent of that script in QIIME2?
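For reference, pulling records out of a FASTA file by ID can also be done in plain Python. This is a minimal sketch, not a QIIME2 feature: it assumes each record ID is the first whitespace-delimited token after `>`, and the function names are hypothetical. It works on any FASTA file, whether that is the SILVA reference or representative sequences exported from an artifact such as rep-seqs.qza (the latter being an assumption about the workflow).

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            # Sequence lines may be wrapped; join them back together.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def select_records(lines, wanted_ids):
    """Yield only the records whose ID is in `wanted_ids`."""
    wanted = set(wanted_ids)
    for header, sequence in read_fasta(lines):
        # The ID is taken as the first token of the header line.
        seq_id = header.split(None, 1)[0] if header.strip() else ""
        if seq_id in wanted:
            yield header, sequence
```

Passing an open file handle as `lines` and a list of IDs as `wanted_ids` yields the matching records, which can then be written out in FASTA format and pasted into a BLAST search.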
If any of my questions are too basic, my apologies. I am at my wits’ end (almost 2 weeks on this) and I appreciate all the help I can get. This forum has been my lifesaver on all things QIIME2!!
Thanks for your help.