97% similarity vs. 99% similarity in QIIME 2

I was looking at the paper (https://www.microbiologyresearch.org/content/journal/ijsem/10.1099/00207713-44-4-846) behind the gold-standard 97% similarity threshold for OTU clustering, and I was wondering why QIIME 2 changed the similarity threshold to 99%.

Thank You.


Good afternoon @Karen_Yike_Shen,

I think in October 1994, when that article was published, 97% was the gold standard, but times have changed! We have much faster computers and better sequencing technology today, and a new gold standard has been proposed:

100% similarity OTUs == exact sequence variants == amplicon sequence variants (ASVs)

Callahan, McMurdie and Holmes (2017), "Exact sequence variants should replace operational taxonomic units in marker-gene data analysis", The ISME Journal.
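To make the threshold idea concrete, here is a minimal toy sketch of greedy clustering at different identity cutoffs. It is not how QIIME 2, vsearch, or any denoiser actually works: identity here is a plain per-position match on equal-length strings, and `identity`, `greedy_cluster`, `mutate`, and the reads themselves are all made up for illustration.

```python
# Toy illustration only: real pipelines align reads and (for ASVs) model
# sequencing error. Here identity is a per-position match on equal-length strings.

def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("toy example: sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(seqs: list[str], threshold: float) -> list[list[str]]:
    """Put each sequence in the first cluster whose centroid it matches."""
    clusters: list[list[str]] = []
    for s in seqs:
        for c in clusters:
            # c[0] is the centroid (the first sequence placed in the cluster)
            if identity(s, c[0]) >= threshold:
                c.append(s)
                break
        else:
            clusters.append([s])  # no centroid was close enough
    return clusters


def mutate(seq: str, n: int) -> str:
    """Hypothetical helper: change the first n bases so each one mismatches."""
    swap = {"A": "C", "C": "A", "G": "T", "T": "G"}
    return "".join(swap[b] for b in seq[:n]) + seq[n:]


# Made-up 100-base reads with 0, 1, 2 and 5 mismatches to the first read
base = "ACGT" * 25
reads = [base, mutate(base, 1), mutate(base, 2), mutate(base, 5)]

for threshold in (0.97, 0.99, 1.0):
    n_features = len(greedy_cluster(reads, threshold))
    print(f"{threshold:.0%} identity -> {n_features} features")
```

With these made-up reads, the same four sequences collapse into fewer features as the threshold is lowered, and at 100% every distinct sequence stays its own feature.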


Let us know what you think! I think people still use 97% OTUs, but ASVs are gaining a lot of popularity.


Hello Colin,
Thanks for your reply and the paper. From my results, the big-picture taxonomic profiles for 97% OTUs and ASVs are similar. However, more sequences are unidentified at the family level when using ASVs, and many genera are unidentified (although genus-level resolution is a lot to ask of 16S rRNA V4 sequencing).

I used the default Greengenes database, which has not been updated for several years. I will try SILVA to see whether the 97% and 100% clustering results differ much.

Hello Yike,

I agree with your interpretation. I hope you find the ASV methods in QIIME 2 work well for your project and give you additional taxonomic resolution.

:+1: Good idea!

If you have more questions, you can always post them here.



Hi @Karen_Yike_Shen
When you say:

I’m intrigued by what you mean here. Do you mean compared to 97% OTUs from the same dataset? If so, I’m not sure I agree. If you are getting better classification of an OTU than of its corresponding ASVs, I would not trust the former over the latter. This really seems like an issue with the taxonomic classification method and reference database used, not so much OTU vs. ASV. Or perhaps you are misdiagnosing this: with the ASV method you would have 3-4 features that are not ‘identified’, at the same resolution as the 1 OTU that encompasses all of those 3-4 features.
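As a rough sketch of that last point, the snippet below compares two ways of counting "unidentified" on made-up numbers. The tables, feature names, read counts, and the helpers `unclassified_by_features` and `unclassified_by_reads` are all hypothetical; the only assumption is that one unclassified OTU splits into four unclassified ASVs carrying the same reads.

```python
from typing import Optional

# (feature name, family assignment or None if unclassified, read count)
Feature = tuple[str, Optional[str], int]


def unclassified_by_features(table: list[Feature]) -> float:
    """Fraction of features that have no family-level assignment."""
    return sum(1 for _, family, _ in table if family is None) / len(table)


def unclassified_by_reads(table: list[Feature]) -> float:
    """Fraction of reads that belong to features with no family assignment."""
    total = sum(reads for _, _, reads in table)
    unassigned = sum(reads for _, family, reads in table if family is None)
    return unassigned / total


# Made-up data: one classified feature, one unclassified OTU of 100 reads
otu_table: list[Feature] = [
    ("OTU_1", "Lachnospiraceae", 300),
    ("OTU_2", None, 100),
]

# The same reads, with the unclassified OTU split into four ASVs
asv_table: list[Feature] = [
    ("ASV_1", "Lachnospiraceae", 300),
    ("ASV_2", None, 40),
    ("ASV_3", None, 30),
    ("ASV_4", None, 20),
    ("ASV_5", None, 10),
]

for name, table in (("OTU", otu_table), ("ASV", asv_table)):
    by_feat = unclassified_by_features(table)
    by_reads = unclassified_by_reads(table)
    print(f"{name}: {by_feat:.0%} of features, {by_reads:.0%} of reads unclassified")
```

In this toy case the share of unclassified features rises from 50% to 80% while the share of unclassified reads stays at 25%, so it may be worth checking which of the two is being compared before concluding that ASVs classify worse.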