Hello @Nicholas_Bokulich ,
Thank you for your response.
That's true (I could serve as an example). The thing is that, as you say:
So maybe there is no need to fine-tune these parameters and we are just overthinking it. The only way to know if it is worth it is, as you say:
So what I could do is:
- Get mock data, e.g. from mockrobiota, as is done in the Fungal ITS analysis tutorial.
- Follow the tutorial up to the denoising step.
- Export the sequences, then run DADA2 in R, trying combinations over a range of values of KDIST_CUTOFF and BAND_SIZE.
- Go back to QIIME2 and do taxonomic classification for each test.
- Evaluate accuracy against the known mock composition and see whether the best combinations differ enough from the default values to matter.
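To make the plan a bit more concrete, here is a rough Python sketch of how I might organise the parameter grid and the scoring. Everything in it is an assumption for illustration: the candidate values are placeholders (not recommendations), the function names are hypothetical, and scoring by presence/absence of expected taxa is just one simple way to measure accuracy. The actual DADA2 runs would still happen in R, and the classification in QIIME2; this only covers the bookkeeping around them.

```python
from itertools import product

# Placeholder candidate values, chosen only for illustration.
KDIST_CUTOFF_VALUES = [0.30, 0.42, 0.50]
BAND_SIZE_VALUES = [16, 32, 64]


def parameter_grid(kdist_values, band_values):
    """Return every (KDIST_CUTOFF, BAND_SIZE) combination as a dict."""
    return [
        {"KDIST_CUTOFF": k, "BAND_SIZE": b}
        for k, b in product(kdist_values, band_values)
    ]


def score_against_mock(observed_taxa, expected_taxa):
    """Compare observed taxa with the known mock composition.

    Presence/absence only; abundances are ignored in this sketch.
    """
    observed, expected = set(observed_taxa), set(expected_taxa)
    true_pos = len(observed & expected)
    precision = true_pos / len(observed) if observed else 0.0
    recall = true_pos / len(expected) if expected else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def rank_runs(results, metric="f1"):
    """Sort (params, scores) pairs from best to worst on one metric."""
    return sorted(results, key=lambda run: run[1][metric], reverse=True)


if __name__ == "__main__":
    grid = parameter_grid(KDIST_CUTOFF_VALUES, BAND_SIZE_VALUES)
    print(f"{len(grid)} combinations to test")
    for params in grid:
        print(params)
```

Each entry of the grid would correspond to one DADA2 run; after classification, the taxa found in that run go into `score_against_mock` together with the expected mock taxa, and `rank_runs` shows whether any combination clearly beats the default settings.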
I'm currently focusing on my ITS QIIME2 Snakemake pipeline, but I can spend some time on the benchmarking and then share my findings here. If changing the default values of those parameters turns out to improve results, I could even try to open a pull request against the q2-dada2 GitHub repository, although I would first need to do some research on plugin creation, structure and philosophy.
Best wishes