How can I classify my ASVs with a sequence reference database?

I apologize if I use the wrong words.
I have recently started working on data analysis and I have some questions.
How can I classify my ASVs with a sequence reference database? After obtaining ASVs, is it possible to cluster them against a reference at 99% identity to obtain OTUs?
And if it is, will I lose the high resolution of the ASVs (even when grouping at 99%)?

Hi @cinthya_vieyra,

ASVs are the highest-resolution version of your sequences; think of them as 100% OTUs. If you want to cluster them using a reference database (e.g. Greengenes, SILVA, GTDB), you can certainly do that using q2-vsearch; see this tutorial for more details. My recommendation is to perform denoising first with DADA2 or Deblur, and then, if you still want OTUs, to perform clustering on your ASVs (as per the example in the tutorial).
As for resolution, yes, you will technically lose a bit of resolution because you are collapsing sequences that differ from each other by 1% or less. What this means biologically depends on your samples and question. Clustering can combine two or more similar but distinct species into one OTU; on the other hand, ASVs can also split the reads from a single species into multiple features, because some species carry multiple, non-identical copies of the 16S gene.
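To make the resolution trade-off concrete, here is a toy Python sketch of closed-reference clustering at 99% identity. It is only an illustration and not how vsearch works internally: the identity function assumes equal-length, already aligned sequences, and the names `identity`, `closed_reference_cluster`, `refA`, and the ASV IDs and counts are all made up for this example.

```python
from collections import defaultdict


def identity(a: str, b: str) -> float:
    """Fraction of matching positions (toy metric: equal-length sequences only)."""
    if len(a) != len(b):
        raise ValueError("toy identity needs equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def closed_reference_cluster(asv_counts, asv_seqs, ref_seqs, threshold=0.99):
    """Sum ASV counts into the best-matching reference at or above the threshold."""
    otu_counts = defaultdict(int)
    unmatched = []
    for asv_id, seq in asv_seqs.items():
        best_ref, best_score = None, 0.0
        for ref_id, ref in ref_seqs.items():
            score = identity(seq, ref)
            if score > best_score:
                best_ref, best_score = ref_id, score
        if best_ref is not None and best_score >= threshold:
            otu_counts[best_ref] += asv_counts[asv_id]
        else:
            # No reference hit at the threshold: the ASV is left out of the OTU table.
            unmatched.append(asv_id)
    return dict(otu_counts), unmatched


# Hypothetical data: one 200-nt reference and three ASVs.
ref = "ACGT" * 50
ref_seqs = {"refA": ref}
asv_seqs = {
    "ASV1": ref,                    # identical to the reference
    "ASV2": "T" + ref[1:],          # one mismatch in 200 nt (99.5% identity)
    "ASV3": "T" * 10 + ref[10:],    # several mismatches, below 99% identity
}
asv_counts = {"ASV1": 120, "ASV2": 30, "ASV3": 7}

otus, unmatched = closed_reference_cluster(asv_counts, asv_seqs, ref_seqs)
print(otus)
print(unmatched)
```

In this sketch ASV1 and ASV2 are distinct sequences, yet both end up in the same OTU, which is the loss of resolution described above. ASV3 has no reference hit at 99% and is reported as unmatched rather than clustered.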
Hope that helps!


Thank you so much for your help!
