Hello
I am a newbie around here, so forgive me for the basic question.
In the post I mentioned, you say we should consider trimming when the median Q drops below 20. But which value is the median Q in the parametric seven-number summary table? Is it the 25th or the 50th percentile (or neither)?
Also, I often see the terms "trim" and "truncate". What are the differences between them?
Great question! The median quality score corresponds to the 50th percentile in the parametric seven-number summary table. It should be annotated as the median as well, for reference - here is a screenshot from an example interactive quality plot and summary table that demonstrates this:
Truncation refers to the 3' end of your reads, while trimming refers to the 5' end.
When selecting a truncation length, any reads shorter than that length will be discarded - and the remaining reads will be shortened to the truncation length (at the 3' end).
The trim length is the number of bases to remove from the 5' end of each read (and this occurs after truncation has been performed).
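To make the order of operations concrete, here is a minimal Python sketch of the behaviour described above. It is only an illustration, not the actual implementation of any tool: the function name `truncate_then_trim` and the parameter names `trunc_len` and `trim_left` are made up for this example, and reads are treated as plain strings without quality scores.

```python
def truncate_then_trim(reads, trunc_len, trim_left):
    """Illustrative only: discard short reads, truncate at 3', then trim at 5'."""
    kept = []
    for read in reads:
        # Reads shorter than the truncation length are discarded.
        if len(read) < trunc_len:
            continue
        # Truncation: keep the first trunc_len bases (cuts the 3' end).
        truncated = read[:trunc_len]
        # Trimming: drop the first trim_left bases (cuts the 5' end).
        kept.append(truncated[trim_left:])
    return kept


reads = ["ACGTACGTAC", "ACGTAC", "TTGGCCAATTGG"]
print(truncate_then_trim(reads, trunc_len=8, trim_left=2))
# prints ['GTACGT', 'GGCCAA']
```

In this toy example the 6-base read is dropped because it is shorter than the truncation length of 8, and every surviving read ends up with 8 - 2 = 6 bases.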