Clustering the ASVs into OTUs at 98% identity

Hello. In the open-reference clustering tutorial, the reference sequences passed to --i-reference-sequences are clustered at 85% identity. The reference sequences provided by SILVA, however, are at 97% or 99% identity. Could you please tell me how to obtain reference sequences at 98% identity? And if I use de novo clustering with the 99% reference sequences as input, how do I get the reference taxonomy for the resulting 98% reference sequences?

Thanks in advance! :blush:

Mort

Hi @nmgduan,
The point of de novo clustering is that, unlike closed-reference and open-reference clustering, it does not depend on a reference database. So if you want your OTU table clustered at 98% identity, you can simply pass in your feature table and rep-seqs file and set the desired identity threshold, as in the Clustering sequences tutorial:

qiime vsearch cluster-features-de-novo \
  --i-table table.qza \
  --i-sequences rep-seqs.qza \
  --p-perc-identity 0.98 \
  --o-clustered-table 98_otu_table.qza \
  --o-clustered-sequences 98_rep-seqs.qza

If you want to use open-reference clustering, simply pass your 99% identity SILVA reference sequences to the --i-reference-sequences parameter and set --p-perc-identity to 0.98. You can always cluster at an identity lower than that of your reference set, but not higher.
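For comparison with the de novo command above, an open-reference run along those lines might look like the sketch below. The reference file name silva-99-ref-seqs.qza and the output file names are placeholders I made up, so substitute your own artifacts. The guard around the call is only there so the snippet fails gracefully outside a QIIME 2 environment.

```shell
# Placeholder names (assumptions, not files from this thread): point these at your own artifacts.
REF_SEQS="silva-99-ref-seqs.qza"   # 99% identity SILVA reference sequences
PERC_IDENTITY=0.98                 # must not exceed the identity of the reference set

if command -v qiime >/dev/null 2>&1; then
  # Features matching the reference at >= 98% are assigned to it;
  # the remaining features are then clustered de novo at the same threshold.
  qiime vsearch cluster-features-open-reference \
    --i-table table.qza \
    --i-sequences rep-seqs.qza \
    --i-reference-sequences "$REF_SEQS" \
    --p-perc-identity "$PERC_IDENTITY" \
    --o-clustered-table 98_or_otu_table.qza \
    --o-clustered-sequences 98_or_rep-seqs.qza \
    --o-new-reference-sequences 98_or_new-ref-seqs.qza
else
  echo "qiime not found; activate your QIIME 2 environment first" >&2
fi
```

Unlike the de novo command, this one produces a third output, the new reference sequences, which combine the original reference with the sequences of any newly formed de novo clusters.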

Hope this helps!

Thanks for your prompt reply!! :clap:
