I have 16S rRNA sequencing data targeting the V3–V9 region, generated using paired-end 2×300 bp reads. I’ve been trying different truncation length combinations to improve read merging, but the merge rate consistently stays around 65–70%, even though filtering retention is always above 99%.
I would appreciate any suggestions on how to improve the merging percentage.
If this is a recent sequencing run, the reads might have poly-G tails. These come from the newer 2-color chemistry, where the sequencer can't tell the difference between no signal and a G base, so a read that has run out of template gets padded with G's. This is further complicated by the fact that a G in the 2-color chemistry is always assigned a quality score of 40, so the tails look like high-quality sequence.
On a recent sequencing run of mine, the quality looked really good at first, but once I removed the poly-G tails it was not as good as I had originally thought. The poly-G tails could also affect merging.
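If you want a quick check before changing your pipeline, here is a minimal Python sketch of the idea. It strips a trailing run of G's from a read and compares the mean quality before and after. The function names, the 10-base minimum run length, and the Phred+33 encoding are my assumptions for illustration, not something taken from your data or from any particular tool.

```python
def trim_poly_g(seq, qual, min_run=10):
    """Remove a trailing run of G's if it is at least min_run bases long.

    Returns (trimmed_seq, trimmed_qual, run_length_removed).
    min_run=10 is an assumed threshold; a genuine G at the 3' end of the
    insert that is directly followed by the tail is removed along with it.
    """
    stripped = seq.rstrip("G")
    run = len(seq) - len(stripped)
    if run >= min_run:
        return stripped, qual[:len(stripped)], run
    return seq, qual, 0


def mean_quality(qual, offset=33):
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)


def poly_g_fraction(records, min_run=10):
    """Fraction of (seq, qual) records that end in a poly-G tail."""
    records = list(records)
    if not records:
        return 0.0
    hits = sum(1 for seq, qual in records
               if trim_poly_g(seq, qual, min_run)[2] > 0)
    return hits / len(records)


if __name__ == "__main__":
    # Toy read: 8 real bases at Q20 ('5') followed by 12 G's at Q40 ('I').
    seq = "ACGTACGT" + "G" * 12
    qual = "5" * 8 + "I" * 12
    trimmed_seq, trimmed_qual, removed = trim_poly_g(seq, qual)
    print("removed:", removed)
    print("mean Q before:", mean_quality(qual))
    print("mean Q after:", mean_quality(trimmed_qual))
```

For real data I would use a dedicated trimmer rather than this sketch; fastp (`--trim_poly_g`) and cutadapt (`--nextseq-trim`) both have options aimed at this problem. If a large fraction of your reads turn out to end in poly-G, trim them before the filtering and merging steps and then re-check the merge rate.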