collapseNoMismatch

Hi @M_R ,
Thanks for the suggestion. We have discussed adding this option to q2-dada2 for a few years now, and have an open issue at the moment. It has stalled for two reasons:

  1. the collapseNoMismatch option is actually a separate step in the dada2 R workflow, which basically corresponds to OTU clustering at 100%. In theory, this could be done using the q2-vsearch plugin, with de novo clustering at 100% identity, to re-cluster ASVs that were trimmed to variable lengths.
  2. in general, there is quite a bit of disagreement about whether this option is even desirable in a "typical" dada2 workflow (this is obviously where opinions diverge based on use case and biological question). For single-end reads, truncating to the same position is generally recommended, as opposed to truncating to different positions (see some discussion here). This should rarely matter for paired-end data, unless some sort of variable spacer is used so that the start positions differ (this is rare). None of this is to say that I disagree with you that having such an option would be convenient, only that it probably should not be the default.
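
To make point 1 a bit more concrete, here is a rough Python sketch of what "collapsing at 100%" means for ASVs of different lengths. This is only an illustration, not dada2's implementation: the real collapseNoMismatch is alignment-based, whereas this toy version only merges a sequence when it is fully contained in (or fully contains) a more abundant one. The function name and the input layout (a dict of sequence to total count) are made up for the example.

```python
def collapse_no_mismatch(counts):
    """Toy collapse: merge sequences that match with no mismatches.

    counts: dict mapping ASV sequence -> total read count (hypothetical layout).
    Simplifying assumption: two sequences "match" only if one is an exact
    substring of the other, i.e. they differ by length or start position.
    """
    # Visit the most abundant sequences first so they become representatives;
    # ties are broken alphabetically to keep the result deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

    collapsed = {}
    for seq, n in ordered:
        for rep in collapsed:
            if seq in rep or rep in seq:
                # Same sequence apart from length/shift: add to representative.
                collapsed[rep] += n
                break
        else:
            # No representative matched, so this sequence starts a new group.
            collapsed[seq] = n
    return collapsed


asvs = {
    "ACGTACGTAC": 100,  # full-length ASV
    "ACGTACGT": 40,     # same ASV, truncated 2 nt earlier
    "CGTACGTAC": 10,    # same ASV, starting 1 nt later
    "ACGTTCGTAC": 5,    # one real mismatch, stays separate
}
print(collapse_no_mismatch(asvs))
```

The first three sequences end up in one feature and the fourth stays on its own, which is roughly what you would expect from re-clustering at 100% identity.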

But if you are interested in contributing this option to q2-dada2, you can see the open issue and discussion here. PRs are welcome :wink:
