Changing one parameter to improve the DADA2 filtering percentage for CCS sequences by 50%? I'm not sure if that's feasible.

Hello everyone,

This is my first time using QIIME2 to process PacBio CCS sequences. I noticed that in qiime2 run_dada.R, the minQ parameter is set to 3, instead of the default value of 0 in dada2 filter.R (dada2/R/filter.R at master · benjjneb/dada2 · GitHub).

The version is 24.10 and I installed it in conda.

Could anyone kindly explain why the default parameter is modified and what benefits this change might bring?

Thank you!

When I used the qiime2 default value (3), the percentage of input that passed the filter was very low (~15%).

When I set it to the dada2 default (0), the percentage was much higher (~70%).

Hello @tianrong_chen,

Thank you for bringing this to our attention. As to why the parameter defaults differ between the qiime2 wrapper and the R software: dada2's default may have been different when the qiime2 wrapper was implemented, or the qiime2 default could simply have been set to a different value, either on purpose or by accident. Regardless, I think it probably makes the most sense to bring the qiime2 wrapper's default back in line with dada2's default.

It's surprising that you're seeing such a large difference between minQ=0 and minQ=3. Could you attach both of these dada2-stats visualizations, and could you also attach your demux visualization--if you feel comfortable doing so?

Thank you for your reply! I’ve attached the stats.qzv for both parameter sets and the demux.qzv files. I hope these files can help identify the reasons for the significant differences in sequence counts.
stats_given_minQ0.qzv (1.2 MB)
stats_default_minQ3.qzv (1.2 MB)
test_reads_250121_summary.qzv (301.0 KB)

Hello @tianrong_chen,

There are quite a few variables between your two dada2 runs. For example, the --p-max-mismatch and --p-max-ee parameters differ, among others. I would recommend isolating and changing only the --p-trunc-len --p-trunc-q to test any hypothesis about this parameter (which confusingly is not different between these two runs).
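As a quick way to check that two runs differ in only one setting, here is a minimal Python sketch that compares two parameter sets. The helper name `differing_parameters` and all the values are hypothetical, chosen only for illustration; they are not read from the attached visualizations.

```python
def differing_parameters(run_a, run_b):
    """Return {name: (value_in_a, value_in_b)} for every parameter that differs."""
    names = sorted(set(run_a) | set(run_b))
    return {
        name: (run_a.get(name), run_b.get(name))
        for name in names
        if run_a.get(name) != run_b.get(name)
    }


# Hypothetical values, for illustration only (not taken from the attached files).
run_1 = {"--p-max-mismatch": 2, "--p-max-ee": 2.0, "--p-trunc-q": 2, "--p-trunc-len": 0}
run_2 = {"--p-max-mismatch": 4, "--p-max-ee": 5.0, "--p-trunc-q": 2, "--p-trunc-len": 0}

changed = differing_parameters(run_1, run_2)
print(changed)
if len(changed) != 1:
    # With several settings changed at once, no single one can be blamed.
    print("More than one parameter differs; the comparison is confounded.")
```

If the result contains more than one entry, any change in the passed-filter percentage cannot be attributed to a single parameter.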

Thanks @colinvwood
Sorry, I mistakenly uploaded the files for minQ=0.
These are the results after changing only minQ to 0.
stats_default_minQ0.qzv (1.2 MB)

As you know, I’ve tried modifying many parameters. I believe you mean keeping all other parameters the same, leaving minQ at the QIIME2 default of 3, and setting --p-trunc-len to, for example, 1400. And this is the result.
stats_p-trunc-len_1400_minQ0.qzv (1.2 MB)


The passed-filter percentage is higher than with the defaults, but still very low (~20%).

Hello @tianrong_chen,

The minQ parameter to dada2's filterAndTrim function is actually not exposed through the qiime2 wrapper, so any differences you're seeing between qiime2 dada2 denoise-ccs runs are due to other variables. The reason minQ is set to 3 is likely that it is modeled after this tutorial. Let me know if this answers your questions.
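For anyone who finds this thread later, here is a minimal Python sketch of what a minQ-style filter does. It follows the dada2 documentation's description that reads containing a quality score below minQ are discarded. It is an illustration only, not dada2's or qiime2's actual code: the function names and toy reads are made up, and Phred+33 encoding is assumed.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def passes_min_q(quality_string, min_q):
    """Return True if no base in the (non-empty) read has a quality below min_q."""
    return min(phred_scores(quality_string)) >= min_q


def fraction_passing(quality_strings, min_q):
    """Fraction of reads that survive the min_q filter."""
    kept = sum(passes_min_q(q, min_q) for q in quality_strings)
    return kept / len(quality_strings)


# Hypothetical toy reads (quality strings only).
# '"' encodes Q1, '#' encodes Q2, 'I' encodes Q40, '~' encodes Q93.
toy_reads = ["IIII~~II", "II#III~~", "~~~~~~~~", 'I"IIIIII']

print(fraction_passing(toy_reads, min_q=0))  # no read is dropped by this criterion
print(fraction_passing(toy_reads, min_q=3))  # reads with a Q1 or Q2 base are dropped
```

Under this rule, a threshold of 3 only removes reads that contain at least one base with a quality score of 0, 1, or 2, so how much it matters depends on how often such bases occur in the data.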

Thank you very much! I think this answers my question well.
