I have a few questions regarding the ASV table. I generated my OTU table in the taxonomy analysis step and have only recently learned a little about ASVs (if someone can explain them to me in simple words, that would be much appreciated). Since I am using DADA2 in QIIME2, I wonder whether I have already generated my ASV outputs in the form of rep-seqs-dada2.qza, stats-dada2.qza and table-dada2.qza. Am I right?
If I am right, how can I “use” this ASV table? Or how can I transform it into a readable table form? Is it by converting it from the .qza format into a .biom file? Can it be treated just like an OTU table?
You can think of ASVs as being 100% OTUs that have gone through some sort of denoising process. So instead of being operational taxonomic units clustered at some % similarity threshold, a crude method for reducing noise by collapsing sequences into clusters, they are amplicon sequence variants that recover the true signal from the sample. I recommend this paper for a better idea of the transition from OTUs to ASVs.
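To make the "100% OTUs" idea concrete, here is a toy sketch in plain Python. It is not what DADA2 or vsearch actually do: the sequences, counts and the simple greedy clustering function are all made up for illustration, and it assumes equal-length sequences so identity can be computed without alignment. It only shows that a 97% threshold merges two variants that stay separate at 100%.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("toy example assumes equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(counts, threshold):
    """Toy greedy clustering: the most abundant sequences seed clusters first."""
    centroids = {}
    for seq, n in sorted(counts.items(), key=lambda kv: -kv[1]):
        for centroid in centroids:
            if identity(seq, centroid) >= threshold:
                centroids[centroid] += n  # absorb into an existing cluster
                break
        else:
            centroids[seq] = n  # no close centroid found: start a new cluster
    return centroids


# Hypothetical 40 bp sequences and read counts (illustration only).
SEQ_A = "ACGT" * 10
SEQ_B = SEQ_A[:5] + "T" + SEQ_A[6:]  # one substitution, so 39/40 identical to SEQ_A
SEQ_C = "TTGCA" * 8                  # clearly different sequence
counts = {SEQ_A: 100, SEQ_B: 40, SEQ_C: 10}

asv_like = greedy_cluster(counts, threshold=1.0)   # every exact variant kept
otu_like = greedy_cluster(counts, threshold=0.97)  # near-identical variants merged

print(len(asv_like), "features at 100% identity")
print(len(otu_like), "features at 97% identity")
```

At 100% the three variants remain three features; at 97% the single-mismatch variant is folded into its more abundant neighbour, which is the loss of resolution that OTU clustering accepts in exchange for noise reduction.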
Hi Clara, I replied to you directly in DMs, but I want to add one weird quirk I am not sure @Nicholas_Bokulich mentioned.
The taxonomy is not stored in the ASV/OTU biom file, so you will need to combine the two manually. I think there are a couple of issues with this, but if you’re importing into phyloseq you need to take the taxonomy file and the OTU/ASV biom file and merge them by ASV/OTU ID.
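As a rough illustration of that merge step, here is a minimal Python sketch using only the standard library. It assumes you have already exported both artifacts to tab-separated text, that the table has a "#OTU ID" header row and the taxonomy file has "Feature ID" and "Taxon" columns; the feature IDs, sample names and taxon strings below are invented. It is not how phyloseq does its import, just the join-by-ID idea.

```python
import csv
import io

# Hypothetical exported feature table (TSV form of the biom file).
TABLE_TSV = """# Constructed from biom file
#OTU ID\tsampleA\tsampleB
asv1\t120.0\t0.0
asv2\t5.0\t48.0
asv3\t0.0\t7.0
"""

# Hypothetical exported taxonomy; asv3 is deliberately missing.
TAXONOMY_TSV = """Feature ID\tTaxon\tConfidence
asv1\tBacteria;Firmicutes\t0.99
asv2\tBacteria;Proteobacteria\t0.87
"""


def read_table(text):
    """Return {feature_id: {sample: count}} from a TSV feature table."""
    lines = [l for l in text.splitlines() if l and not l.startswith("# ")]
    reader = csv.reader(lines, delimiter="\t")
    samples = next(reader)[1:]  # header row: "#OTU ID", then sample names
    return {
        row[0]: {s: int(float(v)) for s, v in zip(samples, row[1:])}
        for row in reader
    }


def read_taxonomy(text):
    """Return {feature_id: taxon string} from a TSV taxonomy file."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["Feature ID"]: row["Taxon"] for row in reader}


def merge_by_feature_id(table, taxonomy):
    """Attach a taxon to every feature row, matching on the feature ID."""
    return {
        fid: {"taxon": taxonomy.get(fid, "Unassigned"), **counts}
        for fid, counts in table.items()
    }


merged = merge_by_feature_id(read_table(TABLE_TSV), read_taxonomy(TAXONOMY_TSV))
for fid, row in merged.items():
    print(fid, row)
```

Features with no entry in the taxonomy file are labelled "Unassigned" here; that fallback is my own choice for the sketch, so adjust it to whatever your downstream tool expects.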
I am a bit confused here… I usually get the output from DADA2 (rep-seqs) and classify them against the silva_132 99% (available here => https://www.arb-silva.de/download/archive/qiime/) using qiime feature-classifier classify-consensus-vsearch to get taxonomy assignment for each feature. Am I doing it right?
Thanks for all the replies, they are very helpful.
But I wonder which step makes the OTU table an OTU table? Let’s say we use DADA2 to filter and trim our raw sequence reads: the outcome will actually be an ASV table. Next, if we use this ASV output in the feature-classifier step, will the generated artifact taxonomy.qza actually be an ASV table?
You make an OTU table by using one of the OTU clustering methods in the q2-vsearch plugin. I recommend reading the overview tutorial for a better idea.
No, because the output of a taxonomy classifier is not a feature table at all; it is only a FeatureData[Taxonomy] artifact, which is effectively a list of sequence IDs and their taxonomic identities. It does not contain abundance information.
Thanks for the link. After looking at the vsearch plugin, I have 2 questions at the moment, along with my own answers to them, but I need some validation that my understanding is right:
What is the difference between qiime vsearch cluster-features-open-reference and qiime feature-classifier classify-sklearn?
Since both of these functions start with a FeatureData[Sequence], I am thinking that if I use qiime vsearch cluster-features-open-reference (taken from the tutorial),
qiime vsearch cluster-features-open-reference
an OTU table will be generated (since this is an OTU clustering step), and I will then use this rep-seqs-or-85.qza file in qiime feature-classifier classify-sklearn to generate a taxonomy.qzv file, and subsequently in qiime taxa barplot, to obtain the final OTU table?
If I do not use qiime vsearch cluster-features-open-reference (or de novo or closed reference) and classify straight away using qiime feature-classifier classify-sklearn, followed by qiime metadata tabulate and qiime taxa barplot, am I actually generating an ASV table?
BIG differences! Do not equate the two, and the overview tutorial should make that clear. The first clusters sequences into OTUs to make an OTU table and a list of OTUs (sequences). The second assigns taxonomy to those sequences.
Yes, you have the sequence correct. But adding taxonomy does not make it a “final” OTU table. The OTU table is an observation matrix of OTUs in each sample, and in QIIME 2 that is final. Taxonomy is always kept separate; it is a type of “feature metadata”, that is, information about the features (OTUs) stored inside a feature table (OTU table).
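To illustrate why taxonomy being separate "feature metadata" is useful, here is a small Python sketch of the general idea behind summarising a table by taxon, as a bar plot of taxa needs. It is not QIIME 2's implementation; the feature IDs, counts, taxon strings and the collapse_by_taxonomy helper are all hypothetical. The feature table itself is never modified, and the taxonomy is only used as a lookup when grouping.

```python
from collections import defaultdict

# Hypothetical feature table: {feature_id: {sample: count}}.
feature_table = {
    "otu1": {"sampleA": 10, "sampleB": 0},
    "otu2": {"sampleA": 3, "sampleB": 4},
    "otu3": {"sampleA": 0, "sampleB": 9},
}

# Hypothetical feature metadata, kept separately: {feature_id: taxon string}.
taxonomy = {
    "otu1": "Bacteria;Firmicutes;Bacilli",
    "otu2": "Bacteria;Firmicutes;Clostridia",
    "otu3": "Bacteria;Proteobacteria;Gammaproteobacteria",
}


def collapse_by_taxonomy(table, taxonomy, level):
    """Sum counts of all features that share the first `level` taxonomic ranks."""
    collapsed = defaultdict(lambda: defaultdict(int))
    for fid, counts in table.items():
        ranks = taxonomy.get(fid, "Unassigned").split(";")
        label = ";".join(rank.strip() for rank in ranks[:level])
        for sample, n in counts.items():
            collapsed[label][sample] += n
    return {label: dict(samples) for label, samples in collapsed.items()}


# Group at the second rank (phylum-like level in this toy taxonomy).
by_phylum = collapse_by_taxonomy(feature_table, taxonomy, level=2)
for label, samples in by_phylum.items():
    print(label, samples)
```

Because the taxonomy is only a lookup, you can swap in a different classifier's output or a different rank without touching the OTU/ASV table itself.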
It depends. Are you using dada2/deblur to denoise your data? If yes, then you have an ASV table and should not need to cluster those ASVs. I discourage it; there is a lot of discussion on this forum about the reasons for and against, and I suggest reading those discussions for more info.