To decrease per-sample sequencing costs in some large studies, our data was produced on a NovaSeq instrument. When we started working through the datasets, our quality plots looked very odd, which led us to identify the new quality score binning as the cause (thanks to some posts here!).
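For anyone who wants to confirm the same thing on their own files, here is a minimal sketch of the check. It assumes an uncompressed four-line-per-record FASTQ with Phred+33 encoding; the function name `distinct_quality_scores` is ours, not from any library.

```python
from collections import Counter
from itertools import islice


def distinct_quality_scores(fastq_lines, offset=33):
    """Count how often each Phred score occurs in a FASTQ stream.

    Assumes strict 4-line records (header, sequence, '+', quality)
    and Phred+33 encoding unless another offset is given.
    """
    counts = Counter()
    # The quality string is every 4th line, starting at the 4th (index 3).
    for quality_line in islice(fastq_lines, 3, None, 4):
        counts.update(ord(ch) - offset for ch in quality_line.rstrip("\r\n"))
    return counts


if __name__ == "__main__":
    # Tiny in-memory example; replace with an open file handle for real data.
    example = [
        "@read1\n", "ACGT\n", "+\n", "FF,#\n",
        "@read2\n", "ACGT\n", "+\n", "F:F#\n",
    ]
    for score, n in sorted(distinct_quality_scores(example).items()):
        print(score, n)
```

If only a handful of distinct scores come back across millions of bases, the qualities are binned. For gzipped files, passing a handle from `gzip.open(path, "rt")` should work the same way.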
The question is: are there any best practices for working with this data? We are concerned that using any algorithm that takes quality scores into account is no longer appropriate, but we are at an impasse regarding what the best approach is.
Do you have any advice?