Interpreting denoising stats

Fatemah · October 3, 2018, 1:45am

Thank you, dear Nicholas.
I have attached a screenshot from my stat-dada2.qzv. I know the whole idea is to see how many reads you start with and how many you have left at the end.

There is an input column, which should be the number of forward and reverse reads together for each sample. Please correct me if I am wrong.
Could you please explain the other columns: filtered, denoised, merged, and non-chimeric?
Why are merged and non-chimeric usually nearly the same (shown in red)?
Could you please have a look?
Thank you for your help.

AfterDenoise.csv (2.2 KB)

Nicholas_Bokulich · October 3, 2018, 12:40pm

Since these are paired-end reads, the input column counts read pairs: N input == N forward == N reverse, not N forward + N reverse. But yes, you have the right idea.

These are the numbers of reads left after each step of the process: after filtering based on Q score, after denoising, after merging (a drop here indicates paired-end sequences that cannot merge, either because the sequences are too short to overlap or because the ends do not align), and after chimera filtering.
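To make the per-step losses easier to compare across samples, each column can be expressed as a percentage of the input. The following is only a rough sketch: the sample names and counts are made up for illustration, and the column names (input, filtered, denoised, merged, non-chimeric) are assumed to match the headers in your exported stats table.

```python
import csv
import io

# Hypothetical counts for illustration only; replace with your own exported table.
# Column names are assumed to match the headers in the stats file.
STATS_CSV = """sample-id,input,filtered,denoised,merged,non-chimeric
sampleA,10000,8000,7800,7000,7000
sampleB,20000,15000,14500,12000,11400
"""

# Steps in the order they are applied; "input" is the starting point.
STEPS = ["input", "filtered", "denoised", "merged", "non-chimeric"]


def retention(row):
    """Return the percentage of input reads remaining after each step."""
    n_input = int(row["input"])
    return {
        step: round(100 * int(row[step]) / n_input, 1)
        for step in STEPS[1:]
    }


rows = list(csv.DictReader(io.StringIO(STATS_CSV)))
summary = {row["sample-id"]: retention(row) for row in rows}

for sample, percentages in summary.items():
    print(sample, percentages)
```

A large drop between two adjacent columns points to the step responsible, for example a big gap between denoised and merged suggests reads that fail to overlap.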

Not all samples have the same value for merged and non-chimeric, but most do. Evidently your sequences contain very low levels of chimeras. That's a good thing!
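If you want a number for this, the share of merged reads removed as chimeras is simply (merged - non-chimeric) / merged. A minimal sketch, using invented counts and a hypothetical helper name (chimera_fraction):

```python
def chimera_fraction(merged, non_chimeric):
    """Fraction of merged reads removed by chimera filtering."""
    if merged == 0:
        # No merged reads, so nothing could have been removed.
        return 0.0
    return (merged - non_chimeric) / merged


# Hypothetical (merged, non-chimeric) counts per sample, for illustration only.
counts = {
    "sampleA": (7000, 7000),
    "sampleB": (12000, 11400),
}

fractions = {
    sample: chimera_fraction(merged, non_chimeric)
    for sample, (merged, non_chimeric) in counts.items()
}

for sample, fraction in fractions.items():
    print(f"{sample}: {fraction:.1%} of merged reads removed as chimeras")
```

A value of zero means merged and non-chimeric are identical for that sample, which is the pattern highlighted in red.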

I hope that helps.