Add DADA2 parameters

My labmate and I have been working with a dataset in which we are specifically interested in the rarer OTUs. However, when running DADA2 via QIIME2, we noticed that an OTU found only once in a sample is removed, even if it is also found once in another sample. DADA2 in R has a parameter that prevents it from removing OTUs that are found only once, whether in one sample or in multiple samples. Would it be possible to add these options (pool="pseudo" or pool=TRUE) to the QIIME2 pipeline? I took the description of these parameters from the DADA2 pipeline tutorial 1.8 and posted it below.

“By default, the dada function processes each sample independently. However, pooling information across samples can increase sensitivity to sequence variants that may be present at very low frequencies in multiple samples. The dada2 package offers two types of pooling. dada(..., pool=TRUE) performs standard pooled processing, in which all samples are pooled together for sample inference. dada(..., pool="pseudo") performs pseudo-pooling, in which samples are processed independently after sharing information between samples, approximating pooled sample inference in linear time.”
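
To make the difference concrete, here is a toy Python sketch of independent versus pooled processing. It is not DADA2's algorithm: the `MIN_ABUNDANCE` threshold, the function names, and the example reads are all hypothetical, with a simple count cutoff standing in for DADA2's statistical test. It only shows why a sequence seen once in each of several samples can pass when the counts are pooled.

```python
from collections import Counter

# Hypothetical cutoff standing in for DADA2's statistical test (assumption).
MIN_ABUNDANCE = 2


def detect_independent(samples, min_abundance=MIN_ABUNDANCE):
    """Keep a sequence in a sample only if that sample alone supports it."""
    return {
        name: {seq for seq, n in Counter(reads).items() if n >= min_abundance}
        for name, reads in samples.items()
    }


def detect_pooled(samples, min_abundance=MIN_ABUNDANCE):
    """Judge support from counts summed over all samples, then report per sample."""
    pooled = Counter(read for reads in samples.values() for read in reads)
    supported = {seq for seq, n in pooled.items() if n >= min_abundance}
    return {name: set(reads) & supported for name, reads in samples.items()}


# Made-up reads: "TTGA" occurs once in two samples, "GGCC" once in one sample.
samples = {
    "sampleA": ["ACGT", "ACGT", "ACGT", "TTGA"],
    "sampleB": ["ACGT", "ACGT", "TTGA"],
    "sampleC": ["ACGT", "ACGT", "GGCC"],
}

independent = detect_independent(samples)
pooled = detect_pooled(samples)

for name in samples:
    print(name, "independent:", sorted(independent[name]), "pooled:", sorted(pooled[name]))
```

In this toy setup, "TTGA" is dropped from every sample when the samples are processed independently but kept under pooling, while "GGCC", which occurs only once in the whole dataset, is dropped either way.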

Best,
Stephanie


Hi @Stephanieorch,
Thanks for the suggestion! It looks like we already have an open issue for this feature. I do not have an ETA for when it might be added, but you can keep an eye on that issue to track progress.
Thanks!


Hi Nicholas,
Since DADA2 1.8 has been released on Bioconductor, do you know if pseudo-pooling and pooling will make it into the upcoming QIIME2-2019.7 release? I’m really looking forward to using pseudo-pooling for my dataset within QIIME2; for now it has to be done in R instead.
Thanks!


I just got an answer from Dr. Callahan on the GitHub issue page. He said that he didn’t have the new Q2 release on his calendar but would try to make it happen before the PR deadline. Fingers crossed.
