You found the sequences!
We were looking for these 6 years ago.
At the time, some forum users were working with these kits and having a hard time, partly because the primers and target regions were not published.
The fact that you got this all the way through DADA2 is remarkable.
Good job!
To your questions:
It's a multi-region kit, so any workflow is going to be non-standard. It has to be!
The taxonomy assignment looks promising. Did you include any positive controls?
Sure. Cutadapt has other settings you could use for extra trimming, if that was a problem.
Yes! You need full length because you have all the regions. I don't know if SILVA has been benchmarked with multiple regions, but you could do this with the positive controls!
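If it helps, here is a rough sketch of how you could score a positive control against its known composition. Everything in it is an assumption for illustration: the function name `score_mock` is made up, and the genus lists are placeholders, not real data from any kit or mock community.

```python
def score_mock(expected, observed):
    """Compare expected vs. observed taxa (e.g. genus names) for one positive control."""
    expected, observed = set(expected), set(observed)
    found = expected & observed
    return {
        # fraction of expected taxa that were actually detected
        "recall": len(found) / len(expected) if expected else 0.0,
        # expected taxa the classifier never reported
        "missed": sorted(expected - observed),
        # reported taxa that should not be in the control
        "unexpected": sorted(observed - expected),
    }


# Placeholder genus lists, purely hypothetical
expected_genera = ["Bacillus", "Escherichia", "Lactobacillus", "Staphylococcus"]
observed_genera = ["Bacillus", "Escherichia", "Lactobacillus", "Pseudomonas"]

result = score_mock(expected_genera, observed_genera)
print(result)
```

Running the same comparison once per reference database or classifier would give you a simple side-by-side view of which setup recovers the control best.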
Yes, per-region classifiers should perform better, but you would need to make quite a few of them! I think it's easier to use a full-length sklearn classifier or switch to a search+LCA classifier like vsearch. Speaking of which...
I'm not sure any of the classifiers were optimized for multi-region classification. I'm a big fan of vsearch because it's conceptually simple and runs fast. Try it out and report back!
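To show what I mean by "conceptually simple", here is a minimal sketch of the LCA (lowest common ancestor) consensus step that a search+LCA classifier runs after the search returns its top hits. This is not vsearch's or QIIME 2's actual code. The function name `lca_consensus`, the `min_consensus` parameter, and the example taxonomy strings are all assumptions for illustration.

```python
from collections import Counter


def lca_consensus(hit_taxonomies, min_consensus=0.51):
    """Return the deepest taxonomy prefix supported by >= min_consensus of the hits.

    hit_taxonomies: list of semicolon-delimited taxonomy strings, one per search hit.
    """
    if not hit_taxonomies:
        return "Unassigned"
    split = [[rank.strip() for rank in tax.split(";")] for tax in hit_taxonomies]
    consensus = []
    depth = 0
    while True:
        # Collect every hit's lineage down to the current depth
        prefixes = [tuple(s[: depth + 1]) for s in split if len(s) > depth]
        if not prefixes:
            break
        prefix, count = Counter(prefixes).most_common(1)[0]
        # Stop when too few hits agree at this depth
        if count / len(split) < min_consensus:
            break
        # Stop if the winner does not extend the lineage accepted so far
        if list(prefix[:depth]) != consensus:
            break
        consensus = list(prefix)
        depth += 1
    return "; ".join(consensus) if consensus else "Unassigned"


# Hypothetical hits for one query sequence
hits = [
    "k__Bacteria; p__Firmicutes; c__Bacilli; g__Lactobacillus",
    "k__Bacteria; p__Firmicutes; c__Bacilli; g__Streptococcus",
    "k__Bacteria; p__Firmicutes; c__Bacilli; g__Lactobacillus",
]
print(lca_consensus(hits))                     # majority rule
print(lca_consensus(hits, min_consensus=1.0))  # strict LCA: all hits must agree
```

With a strict threshold the label stops at the last rank every hit shares, which is why this approach tends to degrade gracefully when short amplicons from different regions match several related references.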
It's hard to comment more about alpha and beta diversity without knowing more about the study design. I'll wait to write more.
Thank you for sharing your pipeline with us. And thank you for posting those primers after 6 years.