Low percentage of reads merged in QIIME 2 DADA2 despite sufficient overlap length

denoising-stats.qzv (1.2 MB)

Hello QIIME 2 team,

We are analyzing the V3–V4 region of the 16S rRNA gene, sequenced with 2 × 300 bp paired-end reads. The primers used are:
Forward: 341F
Reverse: 785R

We trimmed Illumina universal adapters using Cutadapt outside QIIME 2, retaining reads with a minimum length of 250 bp. Then we used the QIIME 2 Cutadapt plugin to trim the primers (341F and 785R) as shown below.

qiime cutadapt trim-paired \
  --i-demultiplexed-sequences qiime-cutadapt-file.qza \
  --p-cores 14 \
  --p-front-f CCTACGGGNGGCWGCAG GACTACHVGGGTATCTAATCC \
  --p-front-r GACTACHVGGGTATCTAATCC CCTACGGGNGGCWGCAG \
  --p-match-read-wildcards \
  --p-match-adapter-wildcards \
  --p-discard-untrimmed \
  --o-trimmed-sequences primer-trimmed-cutadapt.qza \
  --verbose

After this, we ran qiime dada2 denoise-paired with the following parameters (no length truncation applied):

qiime dada2 denoise-paired \
  --i-demultiplexed-seqs primer-trimmed-cutadapt.qza \
  --p-trunc-len-f 0 \
  --p-trunc-len-r 0 \
  --p-n-threads 15 \
  --o-table table.qza \
  --o-representative-sequences rep-seqs.qza \
  --o-denoising-stats denoising-stats.qza \
  --verbose

However, the percentage of merged reads is very low, and we are losing a significant number of reads during the merging step. The statistics after denoising are attached.

Could you please help us understand why merging might be failing, even though we are using full-length reads that are long enough to overlap?

Hello @Deepika_J,

Does the quality drop off significantly at the ends of the unmerged reads? What is the mean read length before merging? Attaching a demux visualization of the post-trimmed reads would be helpful.

Hello @colinvwood

Thank you for your reply.

The demux summary after primer trimming with the QIIME 2 Cutadapt plugin is attached. As it shows, the median (50th percentile) read length after primer trimming is 280 bp, and quality does not drop below a Phred score of 20. Also, we have neither trimmed nor truncated within DADA2. Still, we are losing data during merging.


primer-trimmed-cutadapt.qzv (346.0 KB)

Hello @Deepika_J,

I agree, neither the quality nor the read lengths seem to be the issue. I'm unaware of other factors beyond these that can affect merging, so I'll leave this post queued in case others have ideas. Another option is to open an issue on the dada2 GitHub repository; the developer is often responsive to questions there.

This is the key detail:

The default for DADA2 is to allow 0 mismatches when joining. While this setting can be changed in R and in a future version of the QIIME 2 plugin (see q2-dada2#179), it was set to zero during your run.
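
To make the zero-mismatch rule concrete, here is a rough Python sketch of the idea. It is not DADA2's actual code: DADA2 aligns the denoised forward and reverse sequences to find the overlap, whereas this sketch assumes the overlap length is already known. The function names and the toy sequences are made up for illustration; the defaults of 0 mismatches and a 12-base minimum overlap mirror the documented defaults of `mergePairs` in the R package.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def overlap_mismatches(fwd: str, rev: str, overlap: int) -> int:
    """Count mismatches between the 3' end of the forward read and the
    reverse-complemented reverse read, for an overlap of known length."""
    rev_rc = reverse_complement(rev)
    fwd_tail = fwd[-overlap:]
    rev_head = rev_rc[:overlap]
    return sum(a != b for a, b in zip(fwd_tail, rev_head))


def would_merge(fwd: str, rev: str, overlap: int,
                max_mismatch: int = 0, min_overlap: int = 12) -> bool:
    """Simplified merge decision: enough overlap and few enough mismatches."""
    if overlap < min_overlap:
        return False
    return overlap_mismatches(fwd, rev, overlap) <= max_mismatch


# Toy 64-base "amplicon" (hypothetical sequence, for illustration only).
amplicon = "ACGTTGCAAGCTTAGG" * 4
fwd = amplicon[:40]                       # forward read covers bases 0-39
rev = reverse_complement(amplicon[25:])   # reverse read covers bases 25-63
overlap = 15                              # bases 25-39 are covered by both reads

# Introduce a single error at the 3' end of the reverse read,
# which is where quality is usually lowest.
wrong_base = "A" if rev[-1] != "A" else "C"
rev_bad = rev[:-1] + wrong_base

print(would_merge(fwd, rev, overlap))                      # perfect overlap
print(would_merge(fwd, rev_bad, overlap))                  # one mismatch, limit 0
print(would_merge(fwd, rev_bad, overlap, max_mismatch=2))  # one mismatch, limit 2
```

With a limit of zero, a single disagreement anywhere in the overlap is enough to reject the pair, so a long overlap gives more chances for a pair to fail.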

Allowing some mismatches during merging (for example 2, 5, or 10) could also address this issue. This would require the next version of QIIME 2 or using the R package directly.

Until then, you could truncate the reverse reads as short as possible while still keeping enough overlap to merge. A shorter overlap leaves fewer positions where a mismatch can occur, so fewer read pairs should fail to merge.
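
As a back-of-the-envelope check on how short the reverse read can go, the arithmetic can be sketched in Python. The amplicon length of 427 bases is my assumption, not a number from this thread: it is a rough estimate for 341F/785R after primer removal, and the real length varies by taxon, so please check your own data. The 20-base safety margin and the function names are also hypothetical; the 12-base minimum overlap is the `mergePairs` default in the R package.

```python
# ASSUMPTION: approximate V3-V4 amplicon length after primer removal.
# This varies between taxa; replace it with a value from your own data.
ASSUMED_AMPLICON_LEN = 427


def expected_overlap(len_f: int, len_r: int, amplicon_len: int) -> int:
    """Overlap between forward and reverse reads spanning one amplicon."""
    return len_f + len_r - amplicon_len


def min_reverse_length(len_f: int, amplicon_len: int,
                       min_overlap: int = 12, margin: int = 20) -> int:
    """Shortest reverse read length that still leaves min_overlap bases of
    overlap, plus a safety margin for amplicon length variation."""
    return max(0, amplicon_len + min_overlap + margin - len_f)


# Read length of 280 is the median reported earlier in this thread.
current = expected_overlap(280, 280, ASSUMED_AMPLICON_LEN)
shortest_rev = min_reverse_length(280, ASSUMED_AMPLICON_LEN)
trimmed = expected_overlap(280, shortest_rev, ASSUMED_AMPLICON_LEN)

print(f"overlap without truncation: {current} bases")
print(f"shortest reverse length under these assumptions: {shortest_rev}")
print(f"overlap after truncating the reverse read: {trimmed} bases")
```

A value chosen this way would go into `--p-trunc-len-r` in the command above. Keep in mind that DADA2 discards reads shorter than the truncation length, so it is worth checking the read length distribution before picking a value.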
