Sir, there's no fly in my digital soup. What gives?

devonorourke · November 30, 2018, 6:44pm

Thanks @Nicholas_Bokulich. Indeed, the earlier threads in which you helped me were instructive about not setting such a high --p-maxaccepts value, and I made sure to follow that advice when generating the data discussed in this post.

These data were generated by first identifying and removing any host-associated COI reads. That filtered feature table was the object used in my vsearch classification. The database used for taxonomic assignment consists of arthropod sequences only (an earlier iteration included both arthropod and chordate sequences). The vsearch parameters in this instance were the defaults:

qiime feature-classifier classify-consensus-vsearch \
--i-query ../reads/arthseqs.qza \
--i-reference-reads "$REF"/ref_seqs.qza \
--i-reference-taxonomy "$REF"/ref_taxonomy.qza \
--p-perc-identity 0.97 \
--p-strand both \
--p-threads 24 \
--o-classification oro.v97.qza

Nothing special there, I suspect.

Regarding your questions about the plot and the labeling:

The only thing I was intending with this plot was to highlight the fact that the most frequently detected ASVs are missing most of their taxonomic information, and that strikes me as odd. It's not to say the plot shows anything about a correlation - that would require the plot you suggested, and I wouldn't be just highlighting those outliers in that case.
My intention with the plot was to show that, of my five most frequent ASVs, 1 is classified through to Genus level, only 2/5 are classified to at least Order level, and the other 3 carry no information beyond indicating that they are insects (so they weren't deemed "Unclassified"). I haven't even plotted the Unclassified ones, which I need to do now that I think of it!
The taxonomic information was generated with a generic web-based BLAST search against the NCBI nr database.
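
For anyone who wants to tally "how deep does each ASV's classification go" rather than eyeball it, here is a minimal sketch. It assumes a tab-separated taxonomy table with the feature ID in column 1 and a semicolon-delimited taxon string in column 2, and it treats the label "Unassigned" as depth 0. The file contents, feature IDs, and taxon strings below are invented placeholders, not the data from this thread, and the column layout of a real export may differ.

```shell
#!/bin/sh
# Toy stand-in for an exported taxonomy table. All IDs and taxon strings
# here are hypothetical and only illustrate the assumed layout.
tax_file=$(mktemp)
printf 'Feature ID\tTaxon\n' > "$tax_file"
printf 'asv1\tAnimalia;Arthropoda;Insecta;Diptera;Culicidae;Aedes\n' >> "$tax_file"
printf 'asv2\tAnimalia;Arthropoda;Insecta;Lepidoptera\n' >> "$tax_file"
printf 'asv3\tAnimalia;Arthropoda;Insecta\n' >> "$tax_file"
printf 'asv4\tUnassigned\n' >> "$tax_file"

# For each feature (skipping the header), count the non-empty
# semicolon-delimited ranks in column 2; "Unassigned" counts as 0.
rank_depths=$(awk -F'\t' 'NR > 1 {
    n = split($2, ranks, ";")
    depth = 0
    for (i = 1; i <= n; i++) {
        gsub(/^ +| +$/, "", ranks[i])   # trim stray spaces around a rank
        if (ranks[i] != "" && ranks[i] != "Unassigned") depth++
    }
    print $1 "\t" depth
}' "$tax_file")

# One line per feature: ID, then number of ranks present.
printf '%s\n' "$rank_depths"
rm -f "$tax_file"
```

Joining these depths to per-ASV read counts would give the frequency-versus-depth view suggested earlier in the thread. Counting fields only works if empty ranks are truly empty, so a database that pads missing ranks with prefixes or "Ambiguous" labels would need an extra filter.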

Thanks for the advice on running vsearch. Perhaps I should also run the same taxonomy assignment with QIIME's BLAST classifier.

Hope that clarifies your questions. Appreciate the quick reply!