How to cluster sequences into OTUs, not ASVs

I need to cluster sequences into OTUs at a 97% identity threshold, but the current QIIME 2 version only provides DADA2-based ASVs. Although a method is provided here (How to cluster sequences into OTUs - Microbiome marker gene analysis with QIIME 2), according to this paper (https://doi.org/10.1093/ismejo/wrae106), the result cannot be used to calculate Chao1/ACE once the data have been denoised with DADA2.

Hello!

You can use the dereplicate-sequences action of the vsearch plugin in QIIME 2 to dereplicate demultiplexed sequences, which produces feature table and representative sequences artifacts. Those can then be used for chimera detection and for clustering into OTUs at the desired threshold. This route skips DADA2, so you can use it for the metrics you mentioned above. You can check this thread for an example of how to dereplicate sequences.
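To make the dereplication step more concrete, here is a minimal Python sketch of the idea: identical reads are collapsed into unique sequences, and the number of times each one occurs per sample becomes the feature table. It is only an illustration of the concept, not how vsearch or QIIME 2 implement it, and the `dereplicate` function, sample names, and reads are all hypothetical.

```python
from collections import Counter


def dereplicate(reads_by_sample):
    """Collapse identical reads into unique sequences with per-sample counts.

    Returns a dict mapping each unique sequence to a Counter of
    sample -> number of reads, i.e. a tiny feature table.
    """
    table = {}
    for sample, reads in reads_by_sample.items():
        for read in reads:
            # Treat upper/lower case as the same sequence.
            table.setdefault(read.upper(), Counter())[sample] += 1
    return table


# Hypothetical toy data: two samples with a handful of short reads.
reads = {
    "sampleA": ["ACGT", "ACGT", "ACGA", "TTTT"],
    "sampleB": ["ACGT", "TTTT", "TTTT"],
}

table = dereplicate(reads)
for seq, counts in sorted(table.items()):
    print(seq, dict(counts))
```

Note that no reads are discarded at this step, so sequences seen only once are still present in the table that goes on to chimera detection and clustering.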

Does it help?

Thank you! However, how do I perform denoising, chimera removal, or other quality control before clustering with vsearch cluster-features-closed-reference? I cannot find any information in the new QIIME 2 docs.

Hi @SouthGateProject,
Is there a reason why you need to compute ACE and Chao1 specifically? An important point from the paper you shared is:

these algorithms are impeded by technical limitations intrinsic to Illumina amplicon data that prevent confident resolution of authentic singleton sequences.

If you're simply interested in computing richness, and don't need to specifically compute ACE and Chao1, I would recommend methods that don't rely on accurate identification of singletons in the data, because this is generally problematic regardless of how you do quality control. The metrics that are computed as part of the gut-to-soil tutorial or the Moving Pictures tutorial are generally the ones I work with, and I recommend starting there if possible.
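To illustrate why singletons matter here, this is a small Python sketch comparing observed features with the bias-corrected Chao1 estimator, S_obs + F1(F1 - 1) / (2(F2 + 1)), where F1 and F2 are the numbers of features seen exactly once and twice. The function names and the count vectors are made up for illustration; this is not the code QIIME 2 uses.

```python
def observed_features(counts):
    """Number of features with a non-zero count in one sample."""
    return sum(1 for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1*(F1-1) / (2*(F2+1))."""
    f1 = sum(1 for c in counts if c == 1)  # singletons
    f2 = sum(1 for c in counts if c == 2)  # doubletons
    return observed_features(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical per-feature counts for one sample.
with_singletons = [10, 5, 2, 2, 1, 1, 1, 1]
# The same sample if singletons are dropped or cannot be resolved.
without_singletons = [c for c in with_singletons if c > 1]

print(observed_features(with_singletons), chao1(with_singletons))        # 8 10.0
print(observed_features(without_singletons), chao1(without_singletons))  # 4 4.0
```

Once the singletons are gone, the correction term is zero and Chao1 collapses to the observed feature count, so the estimate is driven entirely by how trustworthy the singleton calls are.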

Hope this is helpful!

thank you! @gregcaporaso

Do you mean Observed Features (a qualitative measure of community richness) from the kmer-based diversity analysis?

Hi @SouthGateProject, Yes, those, but also see the boots core-metrics command. The methods there are a bit more traditional in that they're telling you about the ASV diversity (as opposed to the kmer diversity). ASV diversity and kmer diversity will almost certainly be nearly perfectly correlated, but depending on what you're interested in you may prefer one over the other.

@gregcaporaso thanks
