Strange Interactive Quality Plot after Importing the data

Hi @ari_sh70,
The newest generations of Illumina machines (including NovaSeq in your case) use a more "streamlined" binning system for quality scores. Instead of the traditional continuous Phred scores you may be familiar with, it bins the values into only 4 values, hence the very artificial-looking quality plots you see. This is totally expected. How these newer binning systems affect downstream quality control/denoising is a different discussion.
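If you want to confirm the binning yourself, here is a minimal sketch that tallies the distinct quality scores in FASTQ records. The function name `count_quality_scores` and the toy reads are made up for illustration, and it assumes standard 4-line FASTQ records with Phred+33 encoding.

```python
from collections import Counter

def count_quality_scores(fastq_lines, offset=33):
    """Tally Phred scores from an iterable of FASTQ lines (4 lines per record).

    Assumes Phred+33 encoding and no line-wrapped records.
    """
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 3:  # every fourth line holds the quality string
            counts.update(ord(ch) - offset for ch in line.strip())
    return counts

# Toy, made-up reads purely for illustration (not real data)
example = [
    "@read1\n", "ACGT\n", "+\n", "F:,#\n",
    "@read2\n", "TTGA\n", "+\n", "FFF:\n",
]
scores = count_quality_scores(example)
print(sorted(scores))  # distinct Phred values seen in the toy reads
```

For a real file you could pass an open file handle instead of the list (for gzipped files, `gzip.open(path, "rt")` from the standard library). If only a handful of distinct values come back across many reads, the scores are binned.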
It looks like the DADA2 developers are going to release a new version that specifically handles this new data type better (as per here), but this is not currently implemented in the q2-dada2 version. I'm not sure how well or poorly DADA2 would perform with the binned scores. My guess is that its error model building step may not do so well, but that is pure speculation on my part!
You could always try merging your reads with q2-vsearch and running the output through Deblur. Of note, the Deblur pre-packaged error model was based on Illumina MiSeq, and as far as I'm aware it has never been benchmarked against NovaSeq data. My guess, however, is that it would work fine: NovaSeq is meant to have more accurate base-calling than MiSeq, so applying a MiSeq-based error model would, if anything, be a more conservative approach to your QC.

Btw, the above suggestions are based on the assumption that you have amplicon data (e.g. 16S, ITS), and NOT shotgun data.
