Number of reads filtered after running dada2

Good morning to everyone,
I was looking at the denoising-stats.qzv file that I got after running denoising, chimera filtering and clustering through dada2.

A table looking like this is shown.

I think I need some help interpreting this data and, more specifically, judging whether the number of sequences removed is consistent with a normal dada2 analysis. I'm afraid the filtering could be too strict or not strict enough, and the number of sequences left may end up affecting further analysis steps!

Thanks in advance!

Hi @Sparkle,
The steps proceed from left to right, so this is effectively showing you how many reads remain for each sample after each stage of the dada2 pipeline.
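If it helps to put numbers on each step, here is a minimal Python sketch that turns the per-stage read counts into the percentage of input reads still remaining after each stage, plus the percentage lost at filtering. The stage names and all the counts below are assumptions for illustration only (made-up samples, not your data), and a single-end run would not have a "merged" stage, so adjust the list to match the columns in your own table.

```python
# Stage names are assumed here; match them to the columns of your own table.
STAGES = ["input", "filtered", "denoised", "merged", "non-chimeric"]


def percent_remaining(counts, stages=STAGES):
    """Percentage of input reads still present after each later stage."""
    total = counts[stages[0]]
    if total == 0:
        return {stage: 0.0 for stage in stages[1:]}
    return {stage: 100.0 * counts[stage] / total for stage in stages[1:]}


def percent_lost_at_filtering(counts):
    """Percentage of input reads discarded by the quality-filtering step."""
    total = counts["input"]
    if total == 0:
        return 0.0
    return 100.0 * (total - counts["filtered"]) / total


# Hypothetical counts for two made-up samples.
example = {
    "sample-A": {"input": 50000, "filtered": 4000, "denoised": 3800,
                 "merged": 3500, "non-chimeric": 3400},
    "sample-B": {"input": 40000, "filtered": 30000, "denoised": 29000,
                 "merged": 27000, "non-chimeric": 26000},
}

for sample_id, counts in example.items():
    lost = percent_lost_at_filtering(counts)
    print(f"{sample_id}: {lost:.1f}% of input reads lost at filtering")
    for stage, pct in percent_remaining(counts).items():
        print(f"  {stage}: {pct:.1f}% of input remaining")
```

A sample that loses most of its reads between "input" and "filtered" points at the quality filter, while a big drop at "merged" or "non-chimeric" points at a different problem.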

You are losing too many reads at the filtering stage… more than 90% in most samples!

This means that your reads contain too many errors, so dada2 throws them out before attempting to denoise them. To fix this, you need to truncate the reads more to cut off the 3’ ends where quality degrades. I recommend reading the forum archives for some similar dada2 truncation issues to get some ideas about optimizing this step.
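As a rough starting point for choosing a truncation position, here is a small Python sketch of one simple heuristic: keep everything before the first position where the median quality score drops below a cutoff. This is not something dada2 does for you; the function name, the cutoff of 25, and the quality values are all assumptions for illustration, and you would read the real per-position medians off your own quality plot.

```python
def suggest_trunc_len(median_quality, min_quality=25):
    """Return how many leading bases to keep.

    median_quality: median quality score at each read position (5' to 3').
    min_quality: hypothetical cutoff; tune it for your own data.
    """
    for position, quality in enumerate(median_quality):
        if quality < min_quality:
            # Truncate just before the first position below the cutoff.
            return position
    # Quality never dropped below the cutoff: keep the full length.
    return len(median_quality)


# Made-up profile: 200 good positions, 20 middling, then 30 poor ones.
hypothetical_medians = [36] * 200 + [30] * 20 + [22] * 30

print(suggest_trunc_len(hypothetical_medians))
```

The value it suggests is what you would pass as the truncation length when re-running the denoising step (`--p-trunc-len-f` and `--p-trunc-len-r` for `qiime dada2 denoise-paired`, or `--p-trunc-len` for `denoise-single`). With paired-end reads, keep in mind that the truncated forward and reverse reads still need to overlap enough to be merged, so you cannot always cut as far back as the quality alone would suggest.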

Good luck!
