How to cluster sequences into OTUs, not ASVs

SouthGateProject · November 21, 2025, 8:01pm

I need to cluster sequences into OTUs at a 97% threshold, but the current QIIME2 version only provides DADA2-based ASVs. Although a method is provided here (How to cluster sequences into OTUs - Microbiome marker gene analysis with QIIME 2), according to this paper (https://doi.org/10.1093/ismejo/wrae106) that approach cannot be used to calculate Chao1/ACE after denoising with DADA2.

timanix · November 21, 2025, 8:06pm

Hello!

You can use the dereplicate-sequences action of the vsearch plugin within QIIME 2 to dereplicate demultiplexed sequences, which produces a feature table and a representative sequences artifact. Those can then be used for chimera detection and for clustering into OTUs at the desired threshold. This skips DADA2, so you can use the result for the metrics you mentioned above. You can check this thread for an example of how to dereplicate sequences.
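
For illustration, here is a rough sketch of that workflow (dereplicate, filter chimeras, cluster) as a small Python driver that only assembles the QIIME 2 command lines. The file names (`demux.qza`, the `otu-out` directory) are made up, de novo clustering is an assumption (closed-reference clustering would additionally need a reference database), and option names can differ between QIIME 2 releases, so check `qiime vsearch --help` before running anything.

```python
import os
import shutil
import subprocess

PERC_IDENTITY = 0.97  # 97% OTU threshold from the original question


def build_commands(demux="demux.qza", workdir="otu-out"):
    """Assemble QIIME 2 CLI calls as argv lists. All file names are hypothetical."""
    def out(name):
        return os.path.join(workdir, name)

    return [
        # 1. Dereplicate demultiplexed reads into a feature table + sequences.
        ["qiime", "vsearch", "dereplicate-sequences",
         "--i-sequences", demux,
         "--o-dereplicated-table", out("table.qza"),
         "--o-dereplicated-sequences", out("rep-seqs.qza")],
        # 2. De novo chimera detection on the dereplicated features.
        ["qiime", "vsearch", "uchime-denovo",
         "--i-table", out("table.qza"),
         "--i-sequences", out("rep-seqs.qza"),
         "--o-chimeras", out("chimeras.qza"),
         "--o-nonchimeras", out("nonchimeras.qza"),
         "--o-stats", out("uchime-stats.qza")],
        # 3. Keep only non-chimeric features in the table and the sequences.
        ["qiime", "feature-table", "filter-features",
         "--i-table", out("table.qza"),
         "--m-metadata-file", out("nonchimeras.qza"),
         "--o-filtered-table", out("table-nonchimeric.qza")],
        ["qiime", "feature-table", "filter-seqs",
         "--i-data", out("rep-seqs.qza"),
         "--m-metadata-file", out("nonchimeras.qza"),
         "--o-filtered-data", out("rep-seqs-nonchimeric.qza")],
        # 4. Cluster the remaining features into OTUs at the chosen identity.
        ["qiime", "vsearch", "cluster-features-de-novo",
         "--i-table", out("table-nonchimeric.qza"),
         "--i-sequences", out("rep-seqs-nonchimeric.qza"),
         "--p-perc-identity", str(PERC_IDENTITY),
         "--o-clustered-table", out("table-otu-97.qza"),
         "--o-clustered-sequences", out("rep-seqs-otu-97.qza")],
    ]


def run(commands, workdir="otu-out"):
    """Execute the commands in order; needs an activated QIIME 2 environment."""
    if shutil.which("qiime") is None:
        raise RuntimeError("qiime not found on PATH")
    os.makedirs(workdir, exist_ok=True)
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in build_commands():
        print(" ".join(cmd))
```

Running the script as-is only prints the commands; calling `run(build_commands())` inside a QIIME 2 environment would execute them. Any quality filtering or read merging would need to happen before the dereplication step and is not shown here.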

Does it help?

SouthGateProject · November 22, 2025, 4:32pm

Thank you! However, how do I perform denoising, chimera removal, or other quality control before clustering with vsearch cluster-features-closed-reference? I cannot find any information in the new QIIME2 docs.

gregcaporaso · November 23, 2025, 8:04pm

Hi @SouthGateProject,
Is there a reason why you need to compute ACE and Chao1 specifically? An important point from the paper you shared is:

these algorithms are impeded by technical limitations intrinsic to Illumina amplicon data that prevent confident resolution of authentic singleton sequences.

If you're simply interested in computing richness, and don't need to specifically compute ACE and Chao1, I would recommend methods that don't rely on accurate identification of singletons in the data, because this is generally problematic regardless of how you do quality control. The metrics that are computed as part of the gut-to-soil tutorial or the Moving Pictures tutorial are generally the ones I work with, and I recommend starting there if possible.
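
To make the singleton issue concrete, here is a minimal sketch using the bias-corrected Chao1 formula, S_obs + F1(F1 - 1) / (2(F2 + 1)), where F1 and F2 are the numbers of singleton and doubleton features. The counts are invented for illustration, and the "noisy" sample simply assumes six spurious singletons (for example from sequencing errors) were added to the same community.

```python
def observed_features(counts):
    """Number of features with a non-zero count in one sample."""
    return sum(1 for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1*(F1 - 1) / (2*(F2 + 1))."""
    s_obs = observed_features(counts)
    f1 = sum(1 for c in counts if c == 1)  # singletons
    f2 = sum(1 for c in counts if c == 2)  # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Made-up feature counts for a single sample.
clean = [120, 80, 40, 10, 5, 2, 2, 1, 1]
# Same sample with six hypothetical spurious singletons added.
noisy = clean + [1] * 6

for name, sample in [("clean", clean), ("noisy", noisy)]:
    print(name, observed_features(sample), round(chao1(sample), 2))
```

With these toy numbers, the six spurious singletons raise Observed Features by six but raise the Chao1 estimate by fifteen, because the singleton count enters the estimator quadratically.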

Hope this is helpful!

SouthGateProject · November 24, 2025, 7:23am

thank you! @gregcaporaso

Do you mean Observed Features (a qualitative measure of community richness) from the kmer-based diversity analysis?

gregcaporaso · November 25, 2025, 2:21pm

Hi @SouthGateProject, Yes, those, but also see the boots core-metrics command. The methods there are a bit more traditional in that they're telling you about the ASV diversity (as opposed to the kmer diversity). ASV diversity and kmer diversity will almost certainly be nearly perfectly correlated, but depending on what you're interested in you may prefer one over the other.

SouthGateProject · November 26, 2025, 7:11am

@gregcaporaso thanks
