I've been processing a subset of demultiplexed paired-end sequences through QIIME2 and DADA2. The subset comes from a larger set of sequences that was previously processed into an OTU table with QIIME1.
I imported the sequences, joined their paired ends, denoised using DADA2, and ended up with a feature frequency table as well as a representative sequences file. From here I wanted to produce a biom table with taxonomic labels at the species level, so I took one of the pretrained Naive Bayes classifiers and applied it to my representative sequences file:
qiime feature-classifier classify \
  --i-classifier gg-13-8-99-515-806-nb-classifier.qza \
  --i-reads rep-seqs.qza \
  --o-classification taxonomy.qza
I tested the classifier: classifiertest.tsv (110.7 KB)
And then collapsed the taxa
qiime taxa collapse \
  --i-table table.qza \
  --i-taxonomy taxonomy.qza \
  --p-level 7 \
  --o-collapsed-table table-l7.qza
Then I exported the collapsed table and converted it to TSV
qiime tools export table-l7.qza --output-dir exported-l7-table
biom convert -i exported-l7-table/feature-table.biom -o exported-l7-table/feature-table.tsv --to-tsv
And got the resulting TSV file: feature-table.tsv (15.0 KB)
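For anyone who wants to check the count, a rough way to tally the rows of the exported TSV would be something like the sketch below. The helper names `count_taxa` and `count_species_level` are made up for illustration. The sketch assumes the header lines written by `biom convert --to-tsv` start with `#`, and that the collapsed labels are Greengenes-style strings whose last level looks like `s__<name>` when a species was assigned and a bare `s__` or `__` when it was not.

```shell
#!/bin/sh
# Count the taxa (data rows) in a TSV written by `biom convert --to-tsv`.
# Assumption: header/comment lines start with '#'; blank lines are ignored.
count_taxa() {
    awk '!/^#/ && NF { n++ } END { print n + 0 }' "$1"
}

# Count the rows whose label ends in a non-empty species name.
# Assumption: labels are ';'-separated and the species level is written
# "s__<name>"; a bare "s__" or "__" padding is treated as unassigned.
count_species_level() {
    awk -F'\t' '!/^#/ && $1 ~ /(^|;)[ ]*s__[^;]+$/ { n++ } END { print n + 0 }' "$1"
}
```

With the export above, `count_taxa exported-l7-table/feature-table.tsv` would print the number of distinct level-7 labels, and `count_species_level` on the same file would print how many of those actually carry a species name rather than an empty placeholder.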
It looks like only about 100 distinct classifications were made from my sequences, which is low compared to what was generated when these sequences were run through QIIME1. I believe de novo OTU clustering in QIIME1 identified ~300 taxa. I'm curious whether I did something incorrect in my workflow so that fewer were identified, or whether this is a product of the higher-resolution identification from DADA2 and the use of ASVs?
Thanks!