Issues with Classifiers in QIIME 2 - Unusual Assignments Over 99%

SoilRotifer · September 19, 2024, 7:43pm

If you look at the denoising-stats.qzv file, you'll notice that you are losing many reads to failed paired-end merges: only ~10% of your reads are merging. This means you need to adjust your --p-trunc-len-* parameters. Based on your quality score plots, I'd recommend the following:

--p-trunc-len-f 149
--p-trunc-len-r 149

The issue is likely that there is not enough overlap between the paired-end reads to merge (you need at least 12 bases of overlap for DADA2), that there are too many mismatches in the region of overlap, which results in an unsuccessful merge, or both.
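
To make the overlap requirement concrete, here is a rough back-of-the-envelope sketch in Python. The two amplicon lengths (253 and 292 bases) are assumed values chosen purely for illustration and are not measured from your data, and `expected_overlap` is just a hypothetical helper, not part of QIIME 2 or DADA2. It only does the arithmetic: overlap is what is left of the two truncated reads after they span the amplicon.

```python
MIN_OVERLAP = 12  # minimum overlap DADA2 needs to merge, as noted above


def expected_overlap(trunc_len_f: int, trunc_len_r: int, amplicon_len: int) -> int:
    """Bases shared by the truncated forward and reverse reads.

    A negative value means the two reads do not reach each other at all.
    """
    return trunc_len_f + trunc_len_r - amplicon_len


# Assumed amplicon lengths, for illustration only (not taken from the data).
scenarios = {
    "shorter amplicon (assumed 253 bases)": 253,
    "longer amplicon (assumed 292 bases)": 292,
}

for label, amplicon_len in scenarios.items():
    overlap = expected_overlap(149, 149, amplicon_len)
    verdict = "enough to merge" if overlap >= MIN_OVERLAP else "too short to merge"
    print(f"{label}: overlap = {overlap} bases ({verdict})")
```

If the number that comes out is below 12 for your real amplicon length, no amount of tuning the truncation values downward will help, since truncating more only reduces the overlap further.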

Do you know if these data were generated using the EMP approach? If so, a 2x150 sequencing run should be sufficient and read quality is the issue, in which case you will need to experiment with the truncation parameters.

But I am assuming you will likely be unable to merge the reads, as I can see the PCR primers within your reads, matching the patterns GTGYCAGCMGCCGCGGTAA near the 5' end and ATTAGAWACCCBNGTAGTCC (reverse complement) near the 3' end. These data would require a 2x250 sequencing run, see here. I suspect that you may be limited to processing your data using only the forward reads. This also means you'd need to run cutadapt to remove the primers prior to analysis.
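
If you want to check for the primers yourself, one simple way is to turn the degenerate (IUPAC) primer patterns into regular expressions and search your reads with them. The snippet below is only a minimal sketch: the helper names (`primer_regex`, `find_primers`) are made up for this example, the example read is synthetic, and it only finds exact matches to the degenerate pattern (no mismatches or indels, unlike cutadapt).

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_regex(primer: str) -> re.Pattern:
    """Compile a degenerate primer into a regex (exact matches only)."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


FWD = primer_regex("GTGYCAGCMGCCGCGGTAA")      # pattern seen near the 5' end
REV_RC = primer_regex("ATTAGAWACCCBNGTAGTCC")  # reverse complement, near the 3' end


def find_primers(read: str):
    """Return (forward_start, reverse_rc_start); None where not found."""
    fwd = FWD.search(read)
    rev = REV_RC.search(read)
    return (fwd.start() if fwd else None, rev.start() if rev else None)


# Synthetic read built for this example, not taken from real data.
example = "GTGCCAGCAGCCGCGGTAA" + "ACGT" * 5 + "ATTAGATACCCTGGTAGTCC"
print(find_primers(example))
```

Running something like this over a handful of reads from your FASTQ files should show whether the primers sit where expected before you decide on the cutadapt settings.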

-Mike