Taxonomy consensus column

Hi all,

Can someone help me understand the "consensus" column in my taxonomy file after running classify-consensus-vsearch or point me in the right direction?

For example, one ASV has this output with a consensus of 0.556: d__Eukaryota; p__Annelida; c__Polychaeta; o__Phyllodocida; f__Phyllodocida; g__Phyllodocida

Is this similar to NCBI BLAST "percent match", where there is only a 55.6% match to the sequence? If so, is there any kind of quality control I should be doing to filter out low matches?

Here is my script:

module load miniconda
source activate qiime2-2023.2
qiime feature-classifier classify-consensus-vsearch \
--i-query rep-seqs.qza \
--i-reference-reads silva-138-99-seqs.qza \
--i-reference-taxonomy silva-138-99-tax.qza \
--p-maxaccepts 5 --p-query-cov 0.4 \
--p-perc-identity 0.7 \
--o-classification taxonomy \
--o-search-results searchresults \
--p-threads 72

Thanks!

Hi @areaume,
The consensus value is the fraction of the hits for a query that agree with the assigned taxonomy. It is not a percent identity like the BLAST "percent match"; that is controlled separately by --p-perc-identity. So in your example, 55.6% of the hits for your sequence agreed that it is d__Eukaryota; p__Annelida; c__Polychaeta; o__Phyllodocida; f__Phyllodocida; g__Phyllodocida.
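
To make the arithmetic concrete, here is a minimal shell sketch of how a value like 0.556 can arise. The nine hit labels are made up for illustration, and as far as I understand the real classifier compares hits rank by rank rather than as whole strings, so treat this as a simplification and not the plugin's actual code.

```shell
#!/usr/bin/env bash
# Hypothetical hits for one query (invented for illustration):
# 5 of the 9 hits carry the same label.
hits=$(printf '%s\n' \
  "o__Phyllodocida" "o__Phyllodocida" "o__Phyllodocida" \
  "o__Phyllodocida" "o__Phyllodocida" \
  "o__Terebellida" "o__Terebellida" \
  "o__Sabellida" "o__Eunicida")

# Count each label, then report the most common one and its share of all hits.
consensus=$(printf '%s\n' "$hits" | awk '
  { count[$0]++; total++ }
  END {
    for (label in count) if (count[label] > best) { best = count[label]; top = label }
    printf "%s\t%.3f\n", top, best / total
  }')

echo "$consensus"
```

Five agreeing hits out of nine gives 5/9, which rounds to 0.556.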

The default minimum for this is 0.51 (51%), but if you want a stricter threshold you can raise it with the --p-min-consensus parameter.
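
If you would rather inspect low-consensus assignments than rerun the classifier, something like the sketch below could list them. It assumes you have exported the classification to a tab-separated taxonomy.tsv with the columns Feature ID, Taxon, and Consensus; the feature IDs, values, and the 0.7 cutoff are all invented for the example.

```shell
#!/usr/bin/env bash
# Sample table standing in for an exported taxonomy.tsv.
# Feature IDs and values are made up; the column layout
# (Feature ID, Taxon, Consensus) is an assumption.
tax_file=$(mktemp)
printf 'Feature ID\tTaxon\tConsensus\n' > "$tax_file"
printf 'asv1\td__Eukaryota; p__Annelida\t0.556\n' >> "$tax_file"
printf 'asv2\td__Bacteria; p__Firmicutes\t1.0\n' >> "$tax_file"
printf 'asv3\td__Bacteria; p__Proteobacteria\t0.8\n' >> "$tax_file"

threshold=0.7   # arbitrary cutoff chosen for this example

# Print the feature IDs whose consensus is below the threshold (skipping the header row).
low_consensus=$(awk -F'\t' -v t="$threshold" \
  'NR > 1 && $3 + 0 < t + 0 { print $1 }' "$tax_file")

echo "$low_consensus"
rm -f "$tax_file"
```

Note that this is not equivalent to raising --p-min-consensus: as far as I understand, a higher minimum makes the classifier stop at a shallower rank where enough hits agree, whereas this sketch only flags rows for a closer look.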

:turtle:

Thanks @cherman2! This helps clear things up a lot.

I also have a follow-up question: do you know why order, family, and genus are all "Phyllodocida"? My understanding is that Phyllodocida is an order. I have a few other assignments like this as well.

Hi @areaume,
That is decided by the database you are querying. In your case, you are using the Silva database.
For more information on specific labels, I would look into Silva taxonomy.
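
If you want to find which of your assignments reuse one name at several ranks, a rough shell sketch like this can flag them. The first taxon string is the one from your question and the second is invented as a contrast; the rank prefix pattern (lowercase letters followed by two underscores) is an assumption based on the labels shown here.

```shell
#!/usr/bin/env bash
# Two example taxon strings: the first is from the question above,
# the second is invented as a contrast.
taxa=$(printf '%s\n' \
  "d__Eukaryota; p__Annelida; c__Polychaeta; o__Phyllodocida; f__Phyllodocida; g__Phyllodocida" \
  "d__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Lactobacillaceae; g__Lactobacillus")

# For each line, strip the rank prefix (e.g. "o__") and report
# any name that is used at more than one rank, once per line.
repeated=$(printf '%s\n' "$taxa" | awk -F'; ' '
  {
    split("", seen)                 # reset the per-line counter
    for (i = 1; i <= NF; i++) {
      name = $i
      sub(/^[a-z]+__/, "", name)
      if (++seen[name] == 2) print name
    }
  }')

echo "$repeated"
```

This only tells you where labels repeat, not why; for that the Silva taxonomy itself is the place to look.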
:turtle:
