Is the issue that the quality score plot looks funny and the q-scores seem unreasonably high? That is actually normal. See here:
In the overlap region the q-scores increase considerably because two independent reads agree on the base call, which makes that call much more certain. The scores are calculated as in USEARCH, as described in Edgar & Flyvbjerg (2015), doi:10.1093/bioinformatics/btv401.
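To give a feel for why the joined scores climb so high, here is a minimal sketch of the matching-base posterior from Edgar & Flyvbjerg (2015). This is my own reading of the paper, not the actual USEARCH code; the function names and the Q41 cap are my assumptions for illustration.

```python
import math

def q_to_p(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)

def p_to_q(p, max_q=41):
    """Convert an error probability back to a Phred score,
    capped at max_q (the cap value is an assumption here)."""
    return min(max_q, round(-10 * math.log10(p)))

def posterior_q_match(q1, q2):
    """Posterior Phred score for two AGREEING base calls in the
    overlap region (matching-base case from Edgar & Flyvbjerg 2015)."""
    p1, p2 = q_to_p(q1), q_to_p(q2)
    p = (p1 * p2 / 3) / (1 - p1 - p2 + 4 * p1 * p2 / 3)
    return p_to_q(p)

# Two agreeing Q30 calls produce a posterior far above Q30
# (well past the cap), which is why the overlap region of a
# joined read shows such "unreasonably high" q-scores.
print(posterior_q_match(30, 30))
```

The key point: when two independent base calls agree, their error probabilities multiply (roughly), so the posterior error probability drops sharply and the q-score jumps.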
What makes you think that? These are V4 reads, so the length looks about right (though the actual length distribution is not shown in the screenshot you shared).
Your version is nearly a year old! Please install the latest version and use that. If this is in fact a bug, we will need to reproduce it on the latest version in order to provide support. Thanks!
Thank you for your reply.
As you can see, after position 245 or so the plot indicates in red, for every position, that the box at that position was generated from a random sampling of only 1 sequence out of many more. So when selecting the trim length for Deblur for these data, what would you recommend I pick?
Here is the result with QIIME 2 2018.6. I have looked everywhere and cannot figure out from this plot what length to use for trimming when denoising with Deblur. I would be very happy for an answer to: what length do I use to trim with Deblur, and why?
It looks like the reads are probably joining correctly, and that most joined reads are ~245 nt long. The other sequences appear to be (one or more) outliers with longer joined lengths. So I still don't think the quality plot is wrong; it just looks bizarre.
I would recommend truncating at 245, because that is the joined length of all but a handful of sequences. Longer would cause the 245-nt-long seqs to be dropped, and shorter would reduce the amount of information in your sequences.
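The trade-off behind that recommendation can be sketched with a toy length distribution (the numbers below are made up for illustration, since your actual distribution isn't shown): truncating longer than the typical joined length drops every read shorter than the truncation point, while truncating shorter just throws away usable bases.

```python
def fraction_retained(lengths, trunc_len):
    """Fraction of joined reads kept when truncating at trunc_len;
    reads shorter than trunc_len are dropped entirely."""
    return sum(1 for n in lengths if n >= trunc_len) / len(lengths)

# Hypothetical distribution: 97 reads at 245 nt plus 3 longer outliers.
lengths = [245] * 97 + [251, 260, 270]

for trunc in (230, 245, 250):
    print(trunc, fraction_retained(lengths, trunc))
```

With numbers like these, truncating at 245 keeps all 100 reads, while truncating at 250 keeps only the 3 outliers, and truncating at 230 keeps everything but discards 15 informative positions from most reads.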