Problem with annotation using Silva 132 db

Hi,
After obtaining the ASV table, I am using the Silva 132 db for annotation. However, I get many unassigned ASVs when I run the “feature classifier” script (and also many incomplete taxonomy assignments).
Furthermore, I rarely get the full annotation (up to the species level) from this script.
When I build a full ASV table with the corresponding sequences and blast them in NCBI (blastn, 16S db), I get much more specific annotations.

These are the scripts that I use:

qiime tools import --type 'FeatureData[Sequence]' --input-path /home/qiime2/Desktop/Datasets/Silva_132_release/silva_132_97_16S.fna --output-path silva_132_97_seqs.qza

qiime tools import --type 'FeatureData[Taxonomy]' --input-format HeaderlessTSVTaxonomyFormat --input-path /home/qiime2/Desktop/Datasets/Silva_132_release/taxonomy_7_levels.txt --output-path silva_132_97_taxonomy.qza

qiime feature-classifier classify-consensus-blast --i-query /home/qiime2/Desktop/xxx/N_2_rep-seqs.qza --i-reference-taxonomy /home/qiime2/Desktop/xxx/silva_132_97_taxonomy.qza --i-reference-reads /home/qiime2/Desktop/xxx/silva_132_97_seqs.qza --p-evalue 1e-11 --p-strand plus --p-maxaccepts 1 --p-perc-identity 0.99 --output-dir /home/qiime2/Desktop/xxx/N_3_blast --verbose

Am I doing something wrong?
thanks a lot,
Nir

Good morning Nir,

I wonder if the settings of classify-consensus-blast could be changing your result?

The default settings of blast classification in 2020.2 are super different from what I see in your script. Based on the settings you chose, this script is not classifying by consensus; it is only looking for the single best hit at 99% identity or higher.

How were these settings chosen? How does this script work with default settings?

Colin

Dear Colin,
Thanks for your prompt reply.
I looked again at the tutorial and noticed that my parameters differ from the defaults, as you mentioned (though I am not sure I understood the meaning of “consensus”).
However, I am not sure what parameters I should change and how.
Probably the maxaccepts and perc-identity…
Could you please advise? I want to get the closest result to the blastN in NCBI.
Do you think the strand direction is also a relevant parameter?
Thank you very much for your help,
Nir

You should start with the QIIME 2 defaults. The devs take a lot of care to make sure that the default settings are good, so in many cases changing the defaults will yield worse results!

All the default settings are listed in the docs that I posted. You can also just omit / remove a custom flag and the defaults will be used.

Of course, strand direction matters! The default for --p-strand is both, which is important because you want to classify a read regardless of whether it appears in the forward or reverse orientation in the database.


Good point about the meaning of “consensus”!

So both classify-consensus-blast and classify-consensus-vsearch work by searching each of your reads against a reference database and returning a list of the top hits (ten by default). The taxonomy of these top hits is then compared, and the read is classified to the deepest (most specific) taxonomic level at which the hits reach a consensus (> 51% agree).

This is why the --p-maxaccepts 1 setting is so strange. It returns only one result, so there is no chance to reach a consensus from the top 10 hits; you just get the taxonomy of that single top hit. I’m not sure why someone would set up the script like this… :thinking:
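
To make the consensus step concrete, here is a minimal Python sketch. It is a simplified illustration and not the actual q2-feature-classifier code: the function name consensus_taxonomy, the shortened example lineages, and the choice to treat the 0.51 threshold as inclusive are all assumptions made for the example.

```python
from collections import Counter


def consensus_taxonomy(hit_lineages, min_consensus=0.51, unassigned="Unassigned"):
    """Return the deepest lineage prefix that enough of the hits agree on.

    hit_lineages: semicolon-separated taxonomy strings, one per top hit.
    min_consensus: fraction of hits that must share a prefix (assumed inclusive).
    """
    if not hit_lineages:
        return unassigned

    # Split each lineage string into a list of ranks.
    split = [[rank.strip() for rank in lineage.split(";")] for lineage in hit_lineages]
    n_hits = len(split)
    max_depth = max(len(ranks) for ranks in split)

    consensus = []
    for depth in range(1, max_depth + 1):
        # Count every lineage prefix of this depth; shorter lineages cannot vote.
        prefixes = Counter(tuple(ranks[:depth]) for ranks in split if len(ranks) >= depth)
        best, count = prefixes.most_common(1)[0]
        if count / n_hits < min_consensus:
            break  # no agreement at this depth, keep the shallower result
        consensus = list(best)

    return ";".join(consensus) if consensus else unassigned


# Hypothetical top-10 hits for one read (invented, abbreviated lineages).
top_hits = (
    ["Bacteria;Firmicutes;Lactobacillus;Lactobacillus casei"] * 4
    + ["Bacteria;Firmicutes;Lactobacillus;Lactobacillus paracasei"] * 3
    + ["Bacteria;Firmicutes;Lactobacillus;Lactobacillus rhamnosus"] * 3
)

print(consensus_taxonomy(top_hits))      # all ten hits vote
print(consensus_taxonomy(top_hits[:1]))  # only the single top hit is kept
```

With all ten hits, the sketch stops at the genus because no species reaches the threshold. With only the first hit (the --p-maxaccepts 1 situation), it simply returns that hit's full lineage, so no consensus is involved at all.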

Well, NCBI blast reports 100 top hits by default. :man_shrugging:

Colin

Dear Colin,
Thank you very much for your answer.
I will change the script according to your suggestions.
Thanks again,
Nir
