DADA2: truncating low-quality regions

Hello everyone.

I have some questions about truncating low-quality regions with DADA2. I would appreciate your help!

After running ITSxpress, I chose truncation lengths based on the quality plot, but very few sequences passed the filter. However, when my lab mate ran the same data with the parameters I had used for a previous dataset (because she assumed they didn’t need to be changed), the filtered sequence count was much higher. How should I set the trimming parameters for DADA2? Why did I get so few filtered sequences even though I trimmed based on the quality plot?

My code:
nohup qiime dada2 denoise-paired \
  --i-demultiplexed-seqs F_TrimPri_itsx.qza \
  --p-trim-left-f 0 \
  --p-trim-left-r 0 \
  --p-trunc-len-f 221 \
  --p-trunc-len-r 235 \
  --p-n-threads 40 \
  --p-max-ee-f 2 \
  --p-max-ee-r 2 \
  --p-trunc-q 2 \
  --p-min-overlap 12 \
  --o-table F_dada2_Tab.qza \
  --o-representative-sequences F_dada2_Reps_LSR.qza \
  --o-denoising-stats F_dada2_Stats_LSR.qza \
  --verbose &

My lab mate's code:
nohup qiime dada2 denoise-paired \
  --i-demultiplexed-seqs F_TrimPri_itsx.qza \
  --p-trim-left-f 0 \
  --p-trim-left-r 0 \
  --p-trunc-len-f 209 \
  --p-trunc-len-r 219 \
  --p-n-threads 40 \
  --p-max-ee-f 2 \
  --p-max-ee-r 2 \
  --p-trunc-q 2 \
  --p-min-overlap 12 \
  --o-table F_dada2_Tab.qza \
  --o-representative-sequences F_dada2_Reps_CXC.qza \
  --o-denoising-stats F_dada2_Stats_CXC.qza \
  --verbose &
F_TrimPri_itsx.qzv (344.7 KB)
F_dada2_Stats_CXC.qzv (1.3 MB)
F_dada2_Stats_LSR.qzv (1.2 MB)

Hello @lishaoran0917,

Here is the key detail: you ran ITSxpress before DADA2.

This means that the reads have been trimmed twice: 1) when running the ITSxpress pipeline and 2) when running DADA2.

During the second trimming, DADA2 discards every read that is shorter than the truncation length you set. In your example:

--p-trunc-len-f 221
--p-trunc-len-r 235

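So a read that was 219 bp forward or 230 bp reverse would be removed. With your lab mate's shorter settings (209 forward, 219 reverse), both of those reads would be kept, which is why her run retained more sequences.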
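
If you want to estimate how many reads would survive a given truncation length before running DADA2, something like the sketch below could help. It assumes the reads have been exported to an uncompressed FASTQ file with four lines per record, and `count_reads_at_least` is a made-up helper name, not part of QIIME 2 or DADA2. It only looks at read length; DADA2 also applies its quality filters (--p-trunc-q, --p-max-ee-f, --p-max-ee-r), so the real number retained can be lower.

```shell
# Hypothetical helper: count reads in a 4-line-per-record FASTQ that are
# at least MIN_LEN bases long, i.e. reads that a truncation length of
# MIN_LEN would not discard for being too short.
# Usage: count_reads_at_least MIN_LEN [FILE.fastq]   (reads stdin if no file)
count_reads_at_least() {
  awk -v min="$1" '
    NR % 4 == 2 && length($0) >= min { n++ }   # line 2 of each record is the sequence
    END { print n + 0 }                        # print 0 when nothing matched
  ' "${2:--}"
}
```

For example, `count_reads_at_least 221 forward.fastq` (where forward.fastq is a placeholder for one exported forward-read file) prints how many forward reads are long enough for --p-trunc-len-f 221.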

You can run qiime demux summarize on your ITSxpress outputs to see the read-length distribution that is being passed into DADA2.
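
As a rough sketch, the summarize step could look like this. It is wrapped in a small function (the name `summarize_reads` is my own, not a QIIME 2 command) so the same call can be reused on the artifacts from before and after ITSxpress; the file names in the usage line are taken from your post or are placeholders.

```shell
# Hypothetical wrapper around `qiime demux summarize`.
# $1 = demultiplexed reads artifact (.qza), $2 = output visualization (.qzv)
summarize_reads() {
  qiime demux summarize \
    --i-data "$1" \
    --o-visualization "$2"
}
```

Usage: `summarize_reads F_TrimPri_itsx.qza F_TrimPri_itsx.qzv`, then open the .qzv and compare the read lengths it reports against your --p-trunc-len-f and --p-trunc-len-r values.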

Let me know if you can confirm that ITSxpress is cutting your data short!

Yes, ITSxpress cut my data short.

Before ITSxpress:

After ITSxpress:

Yes, I agree with your conclusion!

Let us return to your first question, how to set the trimming parameters for DADA2:

I usually run DADA2 several times with different --p-trunc-len-* settings and select the ones that allow most of my data to merge successfully.

Now that you have run DADA2 twice, try running it a few more times to maximize the values in the "percentage of input merged" column! Your coworker found good settings already, so see if you can find even better ones!
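
One way to script those repeated runs is sketched below. The function name `sweep_trunc_len`, the forward:reverse pair syntax, and the output file names are all my own inventions for illustration; the other parameters are simply copied from your command. The truncation pairs in the usage line are examples, not recommendations.

```shell
# Hypothetical helper: run DADA2 once per candidate truncation pair.
# $1 = demultiplexed reads artifact; remaining args = FORWARD:REVERSE pairs.
# Each run writes to its own files so results are not overwritten.
sweep_trunc_len() {
  input="$1"; shift
  for pair in "$@"; do
    f="${pair%%:*}"   # forward truncation length (text before the colon)
    r="${pair##*:}"   # reverse truncation length (text after the colon)
    tag="f${f}_r${r}"
    qiime dada2 denoise-paired \
      --i-demultiplexed-seqs "$input" \
      --p-trunc-len-f "$f" \
      --p-trunc-len-r "$r" \
      --p-max-ee-f 2 \
      --p-max-ee-r 2 \
      --p-trunc-q 2 \
      --p-min-overlap 12 \
      --p-n-threads 40 \
      --o-table "table_${tag}.qza" \
      --o-representative-sequences "reps_${tag}.qza" \
      --o-denoising-stats "stats_${tag}.qza"
  done
}
```

Usage: `sweep_trunc_len F_TrimPri_itsx.qza 200:210 209:219 215:225`. Afterwards, compare the "percentage of input merged" column across the resulting stats files and keep the pair that merges the most reads.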

For a detailed discussion of how to choose DADA2 truncation settings, see:

Okay!! Thank you very much!!