I used dada2 to denoise my paired-end reads in qiime2-2019.7 and in R (dada2 version 1.12.1), and got 3533 and 2535 features in the raw ASV table, respectively. The differences in the parameters I used in R were setting pool=TRUE for sample inference and using method="pooled" for chimera removal. I was expecting to see more features in the ASV table denoised by dada2 with pool=TRUE, as singleton reads were retained. While there's a big difference in the number of features, the total number of sequences after denoising is quite similar: 7,717,544 and 7,540,610, respectively.
I noticed that dada2 in R uses 1e8 bases by default to learn the error model, whereas qiime2-2019.7 uses 1 million reads, but I doubt this is the cause. Could someone explain why I'm seeing such a big difference in the number of unique features obtained in qiime2 and in R?