Low percentage of merged reads after denoising/merging with dada2.

Hello!

qiime2 version: 2020.6
Data: 16S rRNA metagenome 2x250 sequences from Illumina MiSeq. 12 samples from various sites of the human body (skin, oral, some negative controls, etc.).
Problem: Low read count after merging and denoising paired end reads with dada2.
Screenshots:



Trimming was performed with the qiime2 cutadapt plugin, removing both the forward and reverse V3-V4 primers and also their reversed versions from the opposite directions.
As can be seen, the sequence quality is not good, and trimming did not improve it much (demux summary before trimming not shown).
I have done a little research, and from what I found, the usual suggestion when the read count is low after denoising/merging is to use only the forward reads. In this case, however, as the demux summary shows, the forward reads look even worse than the reverse reads.
Trying different --p-trunc-len-f/r values made little to no difference.

Question: Is the quality of the reads the most likely reason for the low count after denoising/merging in this case?
Should I use only the reverse or only the forward reads, or what other options do I have?

Any help would be greatly appreciated.

Not directly, @rsak-384. Take a look at your denoising stats. You’ll notice that most of your attrition happens during the merging step. You have trimmed too many nucleotides, and so reads are not long enough to overlap and join.

Imagine you’re trying to sequence V3 and V4 from 337F to 805R - you’d need 468 nt to cover that region, plus or minus a few nt of natural variation. In addition, dada2 needs at least 12 nt of overlap so it can join reads properly. That’s ~480 nt out of your available 500 (2x250), so you don’t have much room to trim away low-quality data. Your actual numbers may differ, but that’s the basic idea.
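
To make that arithmetic concrete, here is a minimal Python sketch of the length budget. The primer positions, read length and 12 nt minimum overlap are the figures quoted above; the function names are made up for illustration and are not part of QIIME 2 or dada2, and the calculation ignores natural length variation in the amplicon.

```python
# Rough length budget for merging paired-end reads.
# All numbers come from the example above; swap in your own primers/run.
FORWARD_PRIMER_POS = 337   # 337F
REVERSE_PRIMER_POS = 805   # 805R
READ_LENGTH = 250          # 2x250 MiSeq run
MIN_OVERLAP = 12           # minimum overlap quoted above for dada2


def amplicon_length(start, end):
    """Length of the region spanned by the two primers."""
    return end - start


def overlap_after_truncation(trunc_len_f, trunc_len_r, amplicon_len):
    """Nucleotides shared by the forward and reverse read after truncation.

    A negative value means the reads no longer reach each other.
    """
    return trunc_len_f + trunc_len_r - amplicon_len


def can_merge(trunc_len_f, trunc_len_r, amplicon_len, min_overlap=MIN_OVERLAP):
    """True if the truncated reads still overlap by at least min_overlap."""
    return overlap_after_truncation(trunc_len_f, trunc_len_r, amplicon_len) >= min_overlap


amplicon = amplicon_length(FORWARD_PRIMER_POS, REVERSE_PRIMER_POS)
required = amplicon + MIN_OVERLAP          # total read length needed to merge
slack = 2 * READ_LENGTH - required         # nt you can afford to truncate in total

print(f"amplicon: {amplicon} nt, required: {required} nt, slack: {slack} nt")
print("trunc 240/240 merges:", can_merge(240, 240, amplicon))
print("trunc 240/230 merges:", can_merge(240, 230, amplicon))
```

If the two truncation lengths you pass to --p-trunc-len-f/r add up to less than the required total, most reads will be lost at the merging step no matter how good their quality is.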

This is where quality comes in - if you increase your effective read length by loosening trim/trunc parameters to allow read joining, you will lose more sequences to quality filtering. Can’t hurt trying it, but don’t expect a big improvement.

Using only single-end sequences here saves you the trouble of balancing quality against length. I’ve never used this hack, and don’t know if it will raise any issues for you during your downstream analysis, but it is possible to trick QIIME 2 into treating your reverse reads as if they were forward reads.

Good luck!
Chris :mosquito:


Thank you for your quick reply.
So I skipped cutting the reversed primers from the forward and reverse reads and only did the normal primer trimming, and for the truncation parameters I chose the 2% sequence length values from the demux summary after trimming.
In the end I got a pretty significant improvement, at least in my mind: a 10x improvement for one sample, and about 2x for the others.
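
For anyone reading later, here is a small Python sketch of what picking a truncation length from a low percentile of the read lengths amounts to. It assumes the "2%" value refers to the 2nd-percentile row of the sequence length summary; the read lengths below are invented for illustration, and the helper uses a simple nearest-rank percentile, which may differ slightly from how QIIME 2 computes its summary.

```python
def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of
    the data at or below it. pct is an integer between 1 and 100."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, with a floor of rank 1.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Hypothetical post-trimming read lengths for one direction.
read_lengths = [250] * 90 + [230] * 8 + [200] * 2

trunc_len = percentile_nearest_rank(read_lengths, 2)
kept = sum(1 for n in read_lengths if n >= trunc_len)

print(f"truncation length: {trunc_len}")
print(f"reads at least that long: {kept} of {len(read_lengths)}")
```

Truncating at a low percentile like this keeps nearly all reads, since only reads shorter than the truncation length are discarded, but the forward and reverse values still have to add up to enough length for the reads to overlap.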

Maybe I will try using only the forward reads, because now they look the same as the reverse reads, but for now I will mark your response as the answer.

It did help me greatly, thank you very much!
