Hello again @LiyingXie,
Just as I had feared. Because the overlapping region is low quality, a longer overlap does not cause more reads to join...
OK, using only the forward read fixed the problems with quality and joining!
That is a problem, but luckily I have found the cause.
It looks like your reads start with some random bases, then an adapter, then the true read. I'm willing to bet that these 6 bp are barcodes, causing your ASVs to separate by sample.
(This paper lists actcctacgggaggcagcag as a 16S primer.)
If you trim off the barcode and adapter, your ASVs should appear across samples!
Try running with --p-trim-left 26 and see how it goes!
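To show where the 26 comes from, here is a small Python sketch. It is not part of QIIME 2 or DADA2, and the barcodes and the biological sequence are made up for illustration. The only things taken from above are the 6 bp barcode length and the 20 bp primer. It just shows that two reads from different samples stop looking different once the first 26 bases are removed.

```python
# Illustrative sketch only: the barcodes and biological sequence are invented.
PRIMER = "ACTCCTACGGGAGGCAGCAG"  # the 16S primer mentioned above, uppercased
BARCODE_LEN = 6                  # assumed barcode length (the 6 starting bp)
TRIM_LEFT = BARCODE_LEN + len(PRIMER)  # 6 + 20 = 26


def trim_left(read, n=TRIM_LEFT):
    """Drop the first n bases, mimicking what a left trim does to each read."""
    return read[n:]


# Hypothetical reads: same organism, two samples with different barcodes.
biological = "TGGGGAATATTGCACAATGG"
read_sample_a = "AACCGG" + PRIMER + biological
read_sample_b = "TTGGCC" + PRIMER + biological

print(TRIM_LEFT)
# Before trimming, the barcodes make the two reads look like different sequences.
print(read_sample_a == read_sample_b)
# After trimming, only the shared biological sequence is left.
print(trim_left(read_sample_a) == trim_left(read_sample_b))
```

If the barcode or primer in your data turns out to be a different length, the trim value would need to change to match.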
Yes. Removing low quality data should make your diversity analysis more accurate!
Those 6 starting basepairs and adapters look pretty abiotic to me!
Even better: the ASVs created by DADA2 are exact sequences, so two ASVs can differ by as little as 1 bp! The DADA2 paper explains this in more detail.
Keep in touch,
Colin
