This is my first time using QIIME2 to process PacBio CCS sequences. I noticed that in qiime2 run_dada.R, the minQ parameter is set to 3, instead of the default value of 0 in dada2 filter.R (dada2/R/filter.R at master · benjjneb/dada2 · GitHub).
The version is 24.10 and I installed it in conda.
Could anyone kindly explain why the default parameter is modified and what benefits this change might bring?
Thank you for bringing this to our attention. As to why the parameter defaults differ between the qiime2 wrapper and the R software, it could be the case that minQ=0 was the dada2 default at the time the qiime2 wrapper was implemented, or the qiime2 default could have simply been set to a different value, either on purpose or by accident. Regardless, I think it probably makes the most sense to bring the qiime2 wrapper's default back in line with dada2's default.
It's surprising that you're seeing such a large difference between minQ=0 and minQ=3. Could you attach both of these dada2-stats visualizations, and could you also attach your demux visualization--if you feel comfortable doing so?
Thank you for your reply! I’ve attached the stats.qzv for both parameter sets and the demux.qzv files. I hope these files can help identify the reasons for the significant differences in sequence counts. stats_given_minQ0.qzv (1.2 MB) stats_default_minQ3.qzv (1.2 MB) test_reads_250121_summary.qzv (301.0 KB)
There are quite a few variables between your two dada2 runs. For example, the --p-max-mismatch and --p-max-ee parameters differ, among others. I would recommend isolating and changing only the --p-trunc-len--p-trunc-q to test any hypothesis about this parameter (which confusingly is not different between these two runs).
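One quick way to check which settings actually differ between two runs is to diff their parameter sets before comparing the stats. Below is a minimal Python sketch of that idea. The helper name `differing_params` and all the parameter values are hypothetical and only for illustration; in practice you would fill in the values recorded for each of your runs.

```python
def differing_params(run_a, run_b):
    """Return {name: (value_a, value_b)} for every parameter that differs.

    A parameter missing from one run is reported with None on that side.
    """
    names = sorted(set(run_a) | set(run_b))
    return {
        name: (run_a.get(name), run_b.get(name))
        for name in names
        if run_a.get(name) != run_b.get(name)
    }


# Hypothetical parameter values, NOT taken from the attached .qzv files.
run_a = {"max_mismatch": 2, "max_ee": 2.0, "trunc_q": 2, "trunc_len": 0}
run_b = {"max_mismatch": 3, "max_ee": 5.0, "trunc_q": 2, "trunc_len": 0}

diff = differing_params(run_a, run_b)
for name, (a, b) in diff.items():
    print(f"{name}: run A = {a}, run B = {b}")
```

If more than one parameter shows up in the diff, a change in read counts cannot be attributed to any single one of them.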
Thanks @colinvwood,
Sorry, I mistakenly uploaded the files for minQ=0.
These are the results after changing only minQ to 0. stats_default_minQ0.qzv (1.2 MB)
As you know, I’ve tried modifying many parameters. I believe you mean keeping all other parameters the same, leaving minQ at the QIIME2 default of 3, and setting --p-trunc-len to, for example, 1400. And this is the result. stats_p-trunc-len_1400_minQ0.qzv (1.2 MB)
The minQ parameter to dada2's filterAndTrim function is actually not exposed through the qiime2 wrapper, so any differences you're seeing between qiime2 dada2 denoise-ccs runs are due to other variables. The reason minQ is set to 3 is likely that it is modeled after this tutorial. Let me know if this answers your questions.
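For anyone wondering what minQ does in practice: the dada2 documentation describes it as discarding reads that, after truncation, contain any quality score below minQ. Here is a minimal Python sketch of that rule. It is not dada2's actual implementation; the function names are hypothetical, and Phred+33 quality encoding is assumed.

```python
def decode_phred33(quality_string):
    """Convert a FASTQ quality string to integer Phred scores (Phred+33 assumed)."""
    return [ord(char) - 33 for char in quality_string]


def passes_min_q(quality_string, min_q):
    """True if every base has a quality score >= min_q (the read would be kept)."""
    return all(score >= min_q for score in decode_phred33(quality_string))


# Illustrative quality strings, not real data.
high_quality = "~~~~"  # every base decodes to Q93
one_low_base = "I#II"  # '#' decodes to Q2, 'I' decodes to Q40

for label, qual in [("high_quality", high_quality), ("one_low_base", one_low_base)]:
    for min_q in (0, 3):
        kept = passes_min_q(qual, min_q)
        print(f"{label}, minQ={min_q}: {'kept' if kept else 'discarded'}")
```

Under this rule, minQ=0 and minQ=3 only give different results for reads that contain at least one base with a quality score below 3.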