Hi Sam, this is because the primary and recommended quality filtering parameter in DADA2 is "expected errors" (max-ee), which is based on the quality scores but is a better filter than averaging raw quality scores.
You can read more about EE filtering here: https://doi.org/10.1093/bioinformatics/btv401
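To make the difference concrete, here is a minimal sketch of how expected errors are computed from Phred scores (each score Q corresponds to an error probability of 10^(-Q/10), and EE is the sum over the read). The function names and the two example reads are made up for illustration; this is not DADA2's own code.

```python
def expected_errors(quals):
    """Sum of per-base error probabilities, where p = 10^(-Q/10)."""
    return sum(10 ** (-q / 10) for q in quals)


def mean_quality(quals):
    """Plain arithmetic mean of the Phred scores."""
    return sum(quals) / len(quals)


# Two hypothetical 10-base reads with the same mean quality (Q30).
even_read = [30] * 10                 # uniformly good
uneven_read = [40] * 7 + [10, 5, 5]   # great start, poor tail

for name, quals in [("even", even_read), ("uneven", uneven_read)]:
    print(name, mean_quality(quals), round(expected_errors(quals), 4))
```

Both reads average Q30, but the read with the poor tail has a far higher expected error count, which is the kind of difference an average-based filter cannot see.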
The quality truncation at q-score 2 is really just for older Illumina software, where a score of 2 was code for "I don't know what's going on anymore" and any bases after the first Q2 base were often poor. These days it's basically superfluous, and I'd recommend using max-ee as the quality filter in almost all cases, in conjunction with trunc-len to truncate off low-quality sequence tails.
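As a rough illustration of why the two parameters work well together, the sketch below truncates a read to a fixed length and then applies an expected-error cutoff. It is a simplified stand-in, not DADA2's implementation: the function names, the example read, and the cutoff values are all assumptions, and it assumes reads shorter than the truncation length are discarded.

```python
def expected_errors(quals):
    """Sum of per-base error probabilities, where p = 10^(-Q/10)."""
    return sum(10 ** (-q / 10) for q in quals)


def truncate_and_filter(quals, trunc_len, max_ee):
    """Return the truncated quality scores, or None if the read is discarded."""
    if len(quals) < trunc_len:
        return None  # assumed: reads shorter than trunc_len are dropped
    kept = quals[:trunc_len]
    if expected_errors(kept) > max_ee:
        return None  # too many expected errors even after truncation
    return kept


# Hypothetical read: 8 good bases followed by a low-quality tail.
read = [38] * 8 + [12, 8, 5, 3]

print(truncate_and_filter(read, trunc_len=12, max_ee=1.0))  # tail kept
print(truncate_and_filter(read, trunc_len=8, max_ee=1.0))   # tail removed
```

With the tail left on, the read exceeds the cutoff and is thrown away; with the tail truncated off, the same read passes easily. That is the main reason to pick a truncation length before relying on the expected-error filter.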