Difference in number of sequences and features per sample.

Hello. I have been using QIIME2 for quite a while. I have observed a large reduction in the number of features present in each sample after deblurring, compared to the number of original sequences in that sample. Why is this so? For example, 71247 sequences became 5231, 85005 became 7940, and so on. Does that mean only that many reads can be used for further analysis? If yes, how can I maximize the number of reads that are retained?

@Shriya_Sawant, I don’t have much experience with Deblur, but I’d be happy to work with you on troubleshooting.

Are you losing sequences during the quality filtering part of the process, or during deblur itself?
Do you know how to view/interpret the statistics produced during those steps? (If not, take a look at this section of the moving pictures tutorial.)

@ChrisKeefe Yes, I think I am losing them during deblur itself. I had a total of 1,444,088 sequences initially, but after the deblur step I was left with a total feature frequency of only 132,237.
In the quality filtering step, I lost only 56 sequences.
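
To put the counts quoted in this thread on a common scale, they can be expressed as the fraction of input reads retained. The snippet below is only a minimal sketch of that arithmetic: the `retention` helper and the sample labels are hypothetical names made up for illustration, not part of QIIME 2 or Deblur, and the numbers are the ones reported above.

```python
def retention(input_reads: int, retained_reads: int) -> float:
    """Return the fraction of input reads that survived a processing step."""
    if input_reads <= 0:
        raise ValueError("input_reads must be positive")
    return retained_reads / input_reads


# (reads before, reads after) as reported in this thread.
# The labels are placeholders; the real sample IDs were not given.
counts = {
    "sample A": (71247, 5231),
    "sample B": (85005, 7940),
    "all samples": (1444088, 132237),
}

for label, (before, after) in counts.items():
    print(f"{label}: {retention(before, after):.1%} retained")
```

All three ratios come out below 10 percent, while the 56 sequences lost at quality filtering are a negligible share of the 1,444,088 input reads, which is consistent with the loss happening in the deblur step.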

Can you share your deblur stats qzv?

deblur-stats300.qzv (207.0 KB)

Do you know how to view/interpret the statistics you just shared?

Your experience appears to be somewhat similar to data in this post, which also includes a great discussion of deblur, and links to other useful topics. Give those a read, consider reading the deblur paper, and report back here with relevant information or targeted questions.


As a side note, it’s almost always worth your time to search this forum for existing answers before posting your own question. There’s a ton of expert information here, and you’ll probably spend less time searching than you would waiting for troubleshooting support.

@ChrisKeefe Thank you for your time and reply. I will definitely have a look. :slight_smile:

