q2-breakaway and q2-dada2 pseudo-pooling

Hello,

I was curious as to whether the q2-dada2 pooling parameter

  --p-pooling-method pseudo

was sufficient for maintaining the reads/singletons required for q2-breakaway diversity analysis?

Additionally, I was wondering whether the q2-dada2 chimera parameter

--p-chimera-method TEXT

needed to be changed from the 'consensus' default to 'pooled' and if that would benefit downstream q2-breakaway analysis?

Thank you for your time.

Best,
Daniel

Hello, I was wondering if anyone could provide input on whether the q2-dada2 parameter

--p-pooling-method pseudo

was sufficient for maintaining the reads that q2-breakaway requires?

Hi @dann818,

If you use the "pseudo" method for pooling, you'll want to use "pooled" for chimera detection too. In fact, the original idea was to create an "auto" option that would match the chimera method to the pooling method (as per this issue), and it may still be implemented in a future version.
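For reference, here is a minimal sketch of a denoising call that pairs --p-pooling-method pseudo with --p-chimera-method pooled (the chimera option accepts "consensus", "pooled", or "none"). The file names and the truncation lengths of 0 (no truncation) are placeholders I have assumed for illustration, not values from this thread, and the function only prints the command so you can review it before running anything.

```shell
# Print a DADA2 call that uses pseudo-pooling for denoising and pooled
# chimera removal. File names and truncation lengths are placeholders
# (assumptions for illustration), not values from this thread.
dada2_cmd() {
  echo qiime dada2 denoise-paired \
    --i-demultiplexed-seqs demux.qza \
    --p-trunc-len-f 0 \
    --p-trunc-len-r 0 \
    --p-pooling-method pseudo \
    --p-chimera-method pooled \
    --o-table table.qza \
    --o-representative-sequences rep-seqs.qza \
    --o-denoising-stats denoising-stats.qza
}

# Show the assembled command; copy it (with real paths and truncation
# lengths) into your terminal once it looks right.
dada2_cmd
```

For single-end data the same two parameters should apply to denoise-single, with a single --p-trunc-len in place of the forward/reverse pair.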

As for your other question: if you are going to use breakaway, I would personally suggest the "pseudo" option, though the "independent" pooling method would be fine too. DADA2 doesn't keep singletons by default because they are nearly impossible to differentiate from erroneous reads, while breakaway's richness estimation assumes that every feature it sees is a real biological sequence. So this is where you need to use some discretion: make sure you are confident in the data you feed breakaway, otherwise it will build its models on inflated richness. The models are more accurate when there are more "rare features", but you also don't want those to come from erroneous reads, so it can be a bit tricky.
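To get a feel for how many rare features survive denoising, it can help to look at a sample's frequency count table, which is the kind of summary breakaway's models are fit to: how many features were seen once, twice, and so on. The sketch below builds one from a list of per-feature read counts using standard shell tools; the counts are made-up numbers and freq_table is a hypothetical helper name, not part of QIIME 2 or breakaway.

```shell
# Hypothetical per-feature read counts for one sample (made-up numbers).
counts="12 1 3 1 0 7 2 1"

# Build a frequency count table: column 1 is an abundance, column 2 is the
# number of features observed at that abundance. Zero counts (features
# absent from the sample) are dropped before tallying.
freq_table() {
  printf '%s\n' $1 | awk '$1 > 0' | sort -n | uniq -c | awk '{ print $2 "\t" $1 }'
}

freq_table "$counts"
```

The first row (abundance 1) is the singleton count. If that row is empty or implausibly large for your samples, that is a hint to think about how much you trust the rare features before handing them to breakaway.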

Thank you Mehrbod! That was incredibly informative. I will have to think hard about how much I trust my data! Additionally, I spent over an hour yesterday trying to troubleshoot the q2-breakaway alpha function without success, so I may have to table this for now.

Hi @dann818,
Glad you found it useful. The q2-breakaway plugin has not had active development in some time, so you may want to consider the native R version, which works fine for me. If you do want to troubleshoot your problems, though, feel free to start a new thread with some more details here.
Good luck!