Add DADA2 parameters


(Stephanie Orchanian) #1

My labmate and I have been working with a dataset in which we are specifically interested in rarer OTUs. However, when running DADA2 via QIIME2, we noticed that an OTU found only once in a sample is removed, even if it is also found once in another sample. DADA2 in R has a parameter that prevents the removal of OTUs that are found only once in one or more samples. Would it be possible to add these options (pool="pseudo" or pool=TRUE) to the QIIME2 pipeline? I took the description of these parameters from the DADA2 pipeline tutorial 1.8 and posted it below.

“By default, the dada function processes each sample independently. However, pooling information across samples can increase sensitivity to sequence variants that may be present at very low frequencies in multiple samples. The dada2 package offers two types of pooling. dada(..., pool=TRUE) performs standard pooled processing, in which all samples are pooled together for sample inference. dada(..., pool="pseudo") performs pseudo-pooling, in which samples are processed independently after sharing information between samples, approximating pooled sample inference in linear time.”
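
For reference, here is a minimal sketch of how the three pooling modes described above might be selected in R. The wrapper name `infer_variants` and the `filt_paths` argument are hypothetical names chosen for illustration. The sketch assumes the dada2 package is installed and that `filt_paths` points to quality-filtered FASTQ files; it is not how QIIME2 calls DADA2.

```r
# Sketch only: requires the dada2 package and quality-filtered FASTQ files.
# `infer_variants` and `filt_paths` are hypothetical names, not part of dada2.
infer_variants <- function(filt_paths, pool = FALSE) {
  # Learn the error model from the filtered reads.
  err <- dada2::learnErrors(filt_paths, multithread = TRUE)

  # Dereplicate each sample into unique sequences with abundances.
  derep <- dada2::derepFastq(filt_paths)

  # pool = FALSE    : each sample is processed independently (the default)
  # pool = TRUE     : all samples are pooled for sample inference
  # pool = "pseudo" : samples are processed independently after sharing
  #                   information between samples
  dada2::dada(derep, err = err, pool = pool, multithread = TRUE)
}
```

Calling `infer_variants(filt_paths, pool = "pseudo")` would then request pseudo-pooling, and `pool = TRUE` would request full pooling.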


(Nicholas Bokulich) #2

Hi @Stephanieorch,
Thanks for the suggestion! It looks like we already have an open issue for this feature. I do not have an ETA for when it might be added, but you can keep an eye on that issue to track progress.