Low frequency counts after DADA2 (7%)

Mehrbod_Estaki · July 19, 2018, 5:42pm

Thanks for the details about your issue!
You're right that the output does seem rather low, but this may just be the nature of your data, with DADA2 doing its job properly.
With the 341F/785R primers we expect an amplicon of ~444 bp (785 - 341), so with a 2x300 run you should have an overlap of roughly 600 - 444 = 156 bp. That means the total number of bases truncated from the forward and reverse reads combined should not exceed 156 - (20 bp minimum overlap required + 20 bp for natural length variation, to be safe) = 116 bp. Based on that calculation, the truncation parameters in both of your scenarios are reasonable. What I suspect is happening, however, is that the quality of your reverse reads drops low enough for DADA2 to discard them. This would make sense given that your second attempt kept more of the 3' tail of the reverse reads, which retains more low-quality bases and makes it more likely that a read is dropped.
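If it helps to check other truncation settings, here is a minimal Python sketch of the overlap arithmetic above. The function names are made up for illustration and are not part of QIIME 2 or DADA2, the 20 bp minimum overlap and 20 bp safety margin are simply the figures used in this post, and the truncation lengths in the example calls are hypothetical values, not your actual parameters.

```python
def max_total_truncation(amplicon_len, read_len, min_overlap=20, safety_margin=20):
    """Bases that can be trimmed in total (forward + reverse) before the
    overlap falls below min_overlap + safety_margin."""
    overlap = 2 * read_len - amplicon_len  # e.g. 600 - 444 = 156
    return overlap - (min_overlap + safety_margin)


def truncation_is_safe(trunc_len_f, trunc_len_r, amplicon_len,
                       min_overlap=20, safety_margin=20):
    """True if the truncated reads still overlap by the required amount."""
    remaining_overlap = trunc_len_f + trunc_len_r - amplicon_len
    return remaining_overlap >= min_overlap + safety_margin


# Numbers from this thread: ~444 bp amplicon, 2x300 reads.
print(max_total_truncation(444, 300))      # 116

# Hypothetical truncation lengths, for illustration only.
print(truncation_is_safe(280, 220, 444))   # True  (56 bp overlap left)
print(truncation_is_safe(250, 220, 444))   # False (26 bp overlap left)
```

This only covers the merging side of the problem. It says nothing about how many reads the quality filter discards, which is what the denoising stats would show.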
Can you share the result of your denoising-stats.qza? This should tell us a bit more about what is happening.
An easy solution would be to discard the reverse reads and denoise only the forward reads, since they are in pretty good shape. This should yield a much higher read count, though at the cost of shorter sequences.