I have doubts about the interpretation of the number of ASVs that I obtained in my data set and I would like your help. After running DADA2 I got 2,268,395 nonchimeric sequences and 18,616 ASVs. Is it okay to say that of the 2,268,395 sequences 18,616 sequences were assigned and the rest (2,249,779) were not? And that only 0.82% of my total sequences were assigned ASVs? Isn't this a very low percentage?

Can you expand on where you are seeing these numbers? I think what you are looking at is 2,268,395 total denoised sequences representing 18,616 unique ASVs. So you would say your overall depth across all samples is ~2.27 million sequences, and those are assigned to ~18.6K unique ASVs.


Hi @Mehrbod_Estaki, this is the data I'm seeing.

Thanks for the update, and yes, my assumption was correct. You have ~18.6K unique ASVs across all 36 samples, and your total number of sequences across those 36 samples is ~2.27 million. This looks just fine to me!

In conclusion, these 2,268,395 sequences were grouped into 18,616 ASVs, so it is not correct to subtract the number of ASVs from the total number of sequences (2,268,395 - 18,616) and say that 2,249,779 sequences were lost? Excuse me, I want to understand the idea well.


Here is a toy example:

Sample_name  ASV1  ASV2  ASV3
S1              5     2   100
S2              0     0    80
S3             50     0     0

In this example we have 3 unique ASVs across all 3 samples and a total of 237 sequences. Every one of those 237 sequences belongs to one of the 3 ASVs, so none are lost.
The visualization summary would therefore say:

Number of samples 3
Number of features 3
Total frequency 237
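
If it helps to see the counting spelled out, here is a minimal Python sketch that reproduces those three numbers from the toy table. The names `feature_table` and `summarize` are made up for illustration and are not part of QIIME 2 or DADA2. It only assumes that each cell holds the number of sequences assigned to that ASV in that sample.

```python
# Toy feature table from the example above: rows are samples, columns are ASVs,
# and each cell is the number of sequences assigned to that ASV in that sample.
# (Hypothetical names, for illustration only; not a QIIME 2 API.)
feature_table = {
    "S1": {"ASV1": 5, "ASV2": 2, "ASV3": 100},
    "S2": {"ASV1": 0, "ASV2": 0, "ASV3": 80},
    "S3": {"ASV1": 50, "ASV2": 0, "ASV3": 0},
}


def summarize(table):
    """Return (number of samples, number of features, total frequency)."""
    n_samples = len(table)
    # A feature (ASV) is counted once if it has at least one sequence in any sample.
    features = {
        asv
        for counts in table.values()
        for asv, n in counts.items()
        if n > 0
    }
    # Total frequency is the sum of every cell, i.e. all sequences in the table.
    total_frequency = sum(sum(counts.values()) for counts in table.values())
    return n_samples, len(features), total_frequency


n_samples, n_features, total_frequency = summarize(feature_table)
print("Number of samples", n_samples)
print("Number of features", n_features)
print("Total frequency", total_frequency)
```

The number of features counts columns and the total frequency sums cells, so subtracting one from the other does not give a meaningful quantity.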

Hope that helps.


@Wen, if it's QIIME 2 related, you would be better off starting a new post on the forum, as you'll likely get a much quicker response and other expert advice that I may not be able to provide.

