Samples with low feature counts

yeojuny · July 5, 2017, 11:33pm

I'm new to Qiime 2, and I exported the feature table from Qiime 2 to Qiime 1
so I could look at it the way I would an OTU table. But I realized that quite a few samples (around 40 of the 1400 samples in the cohort) had fewer than 10 features, which means they have only a few OTUs in Qiime 1 terms. When I used UPARSE on the same samples, the lowest OTU count for any sample was 40. Is this normal in Qiime 2 (with the DADA2 protocol)?

pjtorres · July 6, 2017, 2:44am

Hi @yeojuny,

UPARSE is an OTU clustering algorithm, whereas DADA2 and Deblur do not cluster sequences; instead they use sequence error correction to identify the sequence features. As a result, both DADA2 and Deblur filter quite heavily to remove sequences with sequencing errors. In my own data, about half of my sequences get thrown away with both Deblur and DADA2 (since about half the reads have a sequencing error).
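
To see which samples ended up with very few features, and whether they also have very few reads left after filtering, something like the following rough sketch may help. It assumes the feature table has already been exported to tab-separated text (for example with `biom convert --to-tsv`), with a header row starting with `#OTU ID` and no trailing metadata column such as taxonomy. The function names `per_sample_summary` and `low_feature_samples` and the toy numbers are made up for illustration.

```python
import csv
import io


def per_sample_summary(tsv_text):
    """Return {sample_id: (n_features, n_reads)} from a feature-table TSV.

    Assumes rows are features, columns are samples, and the header row
    starts with "#OTU ID" (the usual layout of a biom-to-TSV export).
    """
    header = None
    summary = {}
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row:
            continue
        if header is None:
            # Skip leading comment lines until the header row is found.
            if row[0].startswith("#OTU ID"):
                header = row[1:]
                summary = {sample: [0, 0.0] for sample in header}
            continue
        for sample, value in zip(header, row[1:]):
            count = float(value)
            if count > 0:
                summary[sample][0] += 1      # one more observed feature
                summary[sample][1] += count  # add to the sample's read total
    return {sample: (n, reads) for sample, (n, reads) in summary.items()}


def low_feature_samples(summary, min_features=10):
    """Return sorted IDs of samples with fewer than min_features features."""
    return sorted(s for s, (n, _) in summary.items() if n < min_features)


# Toy table with made-up numbers, for illustration only.
example = (
    "# Constructed from biom file\n"
    "#OTU ID\tS1\tS2\tS3\n"
    "f1\t10\t0\t5\n"
    "f2\t0\t0\t7\n"
    "f3\t3\t2\t1\n"
)
summary = per_sample_summary(example)
print(low_feature_samples(summary, min_features=2))
```

If the flagged samples also have very low read totals, the low feature counts probably reflect how few reads survived filtering rather than anything specific to those samples' composition.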

yeojuny · July 6, 2017, 12:58pm

Thanks for the answer!
So in the end, do you think it's better to use an OTU algorithm if I want to see the general compositional differences between certain groups?

pjtorres · July 6, 2017, 3:27pm

It is tempting to go back to what we are used to, but remember that people never really liked OTUs or their clustering approach to begin with; it was just the best we had. Both the DADA2 and Deblur papers show the advantages of these approaches, such as a reduction of “batch effects” resulting from differences in error rates caused by different methodologies. While these techniques are not perfect, I prefer to use either Deblur or DADA2 for my analysis. One of the first things I noticed is that my alpha diversity metrics were not as inflated as they were with OTU picking, but the significant differences between my groups were still there. I feel that if my samples, or yours, really are different, then I should see the general compositional differences between my groups using DADA2 or Deblur.

benjjneb · July 7, 2017, 3:32am

With the caveat that I am the developer of DADA2: we have a preprint about the substantial benefits of amplicon sequence variant (ASV) methods (aka error-correcting/denoising/...) like DADA2 or Deblur relative to OTU methods: http://www.biorxiv.org/content/early/2017/03/07/113597

And at the risk of speaking for Robert Edgar (the creator of UPARSE), I believe even he now recommends using ASV methods over OTU methods: FAQ: Should you use UPARSE or UNOISE?

yeojuny · July 7, 2017, 3:42am

I really appreciate your opinion! Thank you!
