2017.3-taxa-bar-plots.qzv (349.9 KB) 2019.3-taxa-bar-plots.qzv (454.7 KB) merged3-taxa-bar-plots.qzv (415.5 KB)
I've been struggling with two 18S amplicon datasets that are being classified inconsistently. Both were prepared the same way; however, one contains paired-end 2 x 250 reads, while the other contains paired-end 2 x 300 reads. I ran DADA2 on the two datasets separately and then merged them into one dataset before doing any analysis. My lab produced a custom classifier for these datasets, and I've looked it over pretty extensively without finding any errors.
Regardless, when I run "qiime feature-classifier classify-sklearn" with my custom classifier, one particular protist genus (Cthulu) that SHOULD be present in my samples doesn't show up in the merged dataset. This is strange, because when I produce taxa bar plots for the two datasets independently, Cthulu shows up in one dataset but not the other.
I also noticed that, in the dataset where Cthulu is present, the taxa bar plot indicates that my samples were classified to taxonomic level 9. In the legend of the taxa bar plot, this appears as two empty fields (two trailing semicolons) after the species level for every taxon in the samples (e.g. "k__Eukaryota;p__Parabasalia;c__Spriotrychonympa;o__Spirotrichonymphida;f__Holomastigatoididae;g__Holomastigatoides;s__tenuis;;"). In contrast, in the dataset where Cthulu does not show up, my samples were classified to level 7. That part makes sense, because my classifier only classifies to level 7 (species level); what does not make sense is that Cthulu isn't found in a single sample from that dataset.
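
To make the level 9 versus level 7 difference concrete, here is a minimal Python sketch. It assumes the level count in the bar plot reflects the number of semicolon-separated fields in each taxon string, which I have not confirmed. The helper names are made up for illustration; the example string is the one from my legend.

```python
def rank_depth(taxon: str) -> int:
    """Count semicolon-separated fields, including empty ones."""
    return len(taxon.split(";"))


def empty_trailing_ranks(taxon: str) -> int:
    """Count empty fields at the end of the taxon string."""
    fields = [field.strip() for field in taxon.split(";")]
    count = 0
    for field in reversed(fields):
        if field:
            break
        count += 1
    return count


# Taxon string copied from the legend of the level-9 bar plot.
legend_taxon = (
    "k__Eukaryota;p__Parabasalia;c__Spriotrychonympa;"
    "o__Spirotrichonymphida;f__Holomastigatoididae;"
    "g__Holomastigatoides;s__tenuis;;"
)

print(rank_depth(legend_taxon))            # 9
print(empty_trailing_ranks(legend_taxon))  # 2
```

If that assumption holds, the two extra levels are just the two empty fields at the end of the string, not real taxonomic ranks.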
When I merge the two datasets and produce a taxa bar plot of all the samples together, the results match the second dataset: everything is classified to taxonomic level 7, and Cthulu does not show up in any of the samples. Strange!
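
One check I could imagine running is to compare the classifier output of the two runs feature by feature, instead of relying on the bar plots. The sketch below is only an illustration under assumptions: it supposes each run's taxonomy has been loaded into a plain dict mapping feature ID to taxon string. The feature IDs, dict names, and helper functions are hypothetical placeholders, and the lineage is the example string from my legend rather than real output for Cthulu.

```python
def normalize(taxon: str, levels: int = 7) -> str:
    """Trim whitespace and keep only the first `levels` fields."""
    fields = [field.strip() for field in taxon.split(";")]
    return ";".join(fields[:levels])


def features_with_genus(taxonomy: dict, genus: str) -> list:
    """Return sorted feature IDs whose taxon has an exact g__<genus> field."""
    wanted = "g__" + genus
    return sorted(
        feature_id
        for feature_id, taxon in taxonomy.items()
        if wanted in normalize(taxon).split(";")
    )


lineage = (
    "k__Eukaryota;p__Parabasalia;c__Spriotrychonympa;"
    "o__Spirotrichonymphida;f__Holomastigatoididae;"
    "g__Holomastigatoides;s__tenuis"
)

# Hypothetical placeholder tables: one run with trailing empty fields, one without.
run_a = {"feature_a1": lineage + ";;"}
run_b = {"feature_b1": lineage}

# After normalizing to 7 levels, the two strings are identical.
print(normalize(run_a["feature_a1"]) == normalize(run_b["feature_b1"]))

# List the features assigned to a genus of interest in each run.
print(features_with_genus(run_a, "Holomastigatoides"))
print(features_with_genus(run_b, "Cthulu"))
```

Running the genus lookup on the real tables for both runs would show whether Cthulu is assigned to any feature at all before the tables are merged.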
I attached the qzv files of the taxa bar plots to demonstrate the discrepancies. Has anyone encountered a similar problem? I'm not really sure what I'm doing wrong. I'm pretty new to QIIME 2, but here is what I've tried so far: trimming my reads to different lengths in DADA2 (including making sure both datasets were trimmed to the same length before merging them), changing the confidence level on my classifier from 0.7 to 0.6 and 0.5, and looking through the files used to make my classifier. I still can't figure out the issue!
Any help is appreciated. Thanks!
Nicole