Merging (or not) ITS2 amplicons with variable length

colinbrislawn · January 18, 2019, 8:55pm

Well done! I'm so glad you included that control. This lets us answer LOTS of important questions.

This is one of the standing challenges with ITS reads. The variable-length region can confuse pipelines that expect a constant-length region like v4 16S.

This means that some reads can overlap by (300x2 - 369) = 231 bp, while others overlap completely, with overhang on both sides. This overhang on reads that overlap by more than 100% can also confuse some joining steps. I'm not sure how well dada2 works with fully overlapping reads.
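To make that arithmetic concrete, here is a minimal sketch in Python. The function name `expected_overlap` is made up for illustration and is not part of dada2 or qiime2. It assumes both reads are the same length and ignores primers and adapters.

```python
def expected_overlap(read_len: int, amplicon_len: int) -> tuple[int, int]:
    """Return (overlap_bp, overhang_bp_per_read) for a read pair.

    Hypothetical helper: assumes forward and reverse reads both have
    length read_len and that the amplicon is amplicon_len bases long.
    """
    if amplicon_len < read_len:
        # Each read runs past the far end of the amplicon, so the reads
        # overlap over the whole amplicon and overhang on both sides.
        return amplicon_len, read_len - amplicon_len
    # Otherwise the reads overlap in the middle (or not at all).
    overlap = max(0, 2 * read_len - amplicon_len)
    return overlap, 0


print(expected_overlap(300, 369))  # the 369 bp case above
print(expected_overlap(300, 250))  # amplicon shorter than one read
```

The first call gives the 231 bp overlap from the calculation above; the second shows a fully overlapping pair, where each read hangs 50 bp past the end of the amplicon.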

I wonder if this is due to the trunc option clipping the reads, so that they don't join and don't appear in the data. But even if you don't truncate your reads, they might still not join! As I discovered here, the qiime2 dada2 plugin expects zero mismatches in the reads when joining. This would mean a drop in quality could prevent joining, with or without the trunc option.
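A rough way to check whether truncation alone could explain missing reads is to work out how much overlap is left for a given amplicon length. This is only a sketch: the helper names are hypothetical, and the `min_overlap=12` default is an assumed threshold for illustration, so check what your dada2 version actually requires. It also says nothing about mismatches, which can still block joining.

```python
def overlap_after_truncation(amplicon_len: int, trunc_len_f: int, trunc_len_r: int) -> int:
    """Bases of overlap left once reads are truncated to fixed lengths.

    Hypothetical helper, not a dada2 or qiime2 function.
    """
    overlap = trunc_len_f + trunc_len_r - amplicon_len
    # Overlap cannot be negative and cannot exceed the amplicon itself.
    return max(0, min(overlap, amplicon_len))


def can_join(amplicon_len: int, trunc_len_f: int, trunc_len_r: int,
             min_overlap: int = 12) -> bool:
    """True if the remaining overlap meets the (assumed) minimum."""
    return overlap_after_truncation(amplicon_len, trunc_len_f, trunc_len_r) >= min_overlap


# A 369 bp amplicon with aggressive vs. gentler truncation.
print(can_join(369, 200, 150))  # reads total 350 bp, shorter than the amplicon
print(can_join(369, 250, 200))  # reads total 450 bp, leaving 81 bp of overlap
```

Running this over the known amplicon lengths in a mock community would show which members are expected to drop out at a given pair of truncation lengths.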

Maybe we should change the dada2 default to allow more than one mismatch so reads can actually join...

I really like this idea. You can use the mock community to choose settings that fully capture the diversity in your samples.

Let me know what you find!

Colin