Why does deblur work and dada2 does not?

llenzi · July 29, 2020, 2:38pm

I certainly agree with your principle of not using a tool as a black box (although it is sometimes handy for a speedy result)!

I think we are going really off-topic now, so I'll try to add more of my thoughts here, but after this I would ask you to kindly open one (or more, if you need) additional topic(s).

On the expected minimum overlap, please note that it is now 12 bp, as discussed in the thread:

A possible way to change this threshold is described here:

On the denoising step, let me see if I can explain myself better.
When denoising sequences of different lengths (different amplicons) together, I would be worried that a single trimming setting won't fit them all, because all the sequences shorter (and expected to be shorter) than the chosen setting will be lost through dada2's normal behaviour.
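
To make that concern concrete, here is a minimal Python sketch of truncation-style filtering. It is not dada2 code, only an illustration of the behaviour described above; the function name `truncate_reads`, the read lengths and the truncation value are all made up for the example.

```python
def truncate_reads(reads, trunc_len):
    """Illustrative truncation filter: reads shorter than trunc_len are
    dropped, all others are cut down to exactly trunc_len bases."""
    kept = []
    for read in reads:
        if len(read) < trunc_len:
            # Too short for the chosen setting: the read is lost.
            continue
        kept.append(read[:trunc_len])
    return kept


# Hypothetical reads from two amplicons of different lengths.
long_amplicon_reads = ["A" * 250, "C" * 240]
short_amplicon_reads = ["G" * 180, "T" * 175]

kept = truncate_reads(long_amplicon_reads + short_amplicon_reads, trunc_len=200)
print(f"{len(kept)} of 4 reads survive a truncation length of 200")
```

With these made-up numbers, only the reads from the longer amplicon survive, which is why one setting may not suit a mix of amplicons.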

If you would like to try, there are alternative merging methods such as join-pairs (see "Join paired-end reads" in the QIIME 2 2020.6.0 documentation).
You could try this to get an idea of how well your sequences can be merged!
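
Before running anything, a rough back-of-the-envelope check of the expected overlap can also help. The sketch below is simplified arithmetic, not the logic of any QIIME 2 plugin: it assumes the overlap is the sum of the two read lengths (after truncation) minus the amplicon length, and compares it with the 12 bp minimum mentioned above. The function names and the example lengths are hypothetical.

```python
MIN_OVERLAP = 12  # minimum overlap in bp, as mentioned earlier in this post


def expected_overlap(fwd_len, rev_len, amplicon_len):
    """Rough overlap estimate: bases covered by both reads.
    A negative value means the reads leave a gap in the middle."""
    return fwd_len + rev_len - amplicon_len


def can_merge(fwd_len, rev_len, amplicon_len, min_overlap=MIN_OVERLAP):
    """True if the estimated overlap reaches the minimum overlap."""
    return expected_overlap(fwd_len, rev_len, amplicon_len) >= min_overlap


# Hypothetical example: one 420 bp amplicon, two truncation choices.
for fwd_len, rev_len in [(240, 200), (240, 160)]:
    overlap = expected_overlap(fwd_len, rev_len, 420)
    print(fwd_len, rev_len, overlap, can_merge(fwd_len, rev_len, 420))
```

Real amplicons vary in length, so treat the result as a first estimate and leave some margin above the minimum.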

In my mind, your analysis would be close to the closed-reference clustering described in "Clustering sequences into OTUs using q2-vsearch" (QIIME 2 2020.6.0 documentation).

Hope it helps