Annotation Challenges with COI Amplicon Analysis in QIIME2

Hello everyone,

I'm currently analyzing the COI amplicon region using QIIME2 (version 2022.2) with samples originating from a lake and sequenced using the PE300 sequencing strategy. My amplification primers are as follows:


In my analysis workflow, I first imported the data and applied DADA2 denoising with the following commands:

qiime tools import \
  --type  'SampleData[PairedEndSequencesWithQuality]' \
  --input-path ${Data} \
  --input-format CasavaOneEightSingleLanePerSampleDirFmt \
  --output-path atcc-paired-end.qza

qiime dada2 denoise-paired \
  --i-demultiplexed-seqs atcc-paired-end.qza \
  --p-trim-left-f 26 \
  --p-trim-left-r 26 \
  --p-trunc-len-f 0 \
  --p-trunc-len-r 0 \
  --p-n-threads 20 \
  --o-table atcc_table.qza \
  --o-representative-sequences atcc_seqs.qza \
  --o-denoising-stats atcc_stats.qza

Subsequently, I trained a COI database following the tutorial "Building a COI database from BOLD references" on the QIIME2 forum and used the following commands for species annotation:

qiime feature-classifier classify-sklearn \
  --i-classifier COI.qza \
  --i-reads atcc_seqs.qza \
  --p-n-jobs 8 \
  --o-classification taxonomy.qza

My issue is that the annotation results using classify-sklearn only included 20 genus-level classifications. However, when directly comparing the same sequences with the NCBI NT database via BLAST, I can find many more COI gene matches.

To explore further, I tried classify-consensus-blast for comparison:

qiime feature-classifier classify-consensus-blast \
  --i-query atcc_seqs.qza \
  --i-reference-reads bold_derep1_seq.qza \
  --i-reference-taxonomy bold_derep1_taxa.qza \
  --p-perc-identity 0.8 \
  --o-classification otu-taxonomy.qza

This time, the results showed annotations for 700 genera. However, when adjusting the p-perc-identity parameter to 0.9, the number of annotated genera decreased to 30.

I'm curious about why there's such a significant discrepancy between the classify-sklearn annotation results and the BLAST comparisons? What's the impact of adjusting the p-perc-identity value on the annotation results? Are there recommended approaches to increase the number of genus-level annotations to better match the results from NT database comparisons?

Thank you all for your assistance!

Additionally, to eliminate the influence of the database, I utilized a new database available at:

This includes the COI database from both BOLD and NCBI.

However, the results were roughly similar. Using the command:

qiime feature-classifier fit-classifier-naive-bayes

I was able to annotate to 30 genera, while with:

qiime feature-classifier classify-consensus-blast

With p-perc-identity set to 0.8, it annotated up to 900 genera, but setting p-perc-identity to 0.9 resulted in annotations for 35 genera.

Moreover, the 900 genera annotated with p-perc-identity 0.8 had relatively low similarity, between 80-90%.

Why is this the case? How can I annotate more results effectively?

Hi @mengpf0409,

These are great questions!

This issue can be partially explained as outlined here, here, here and here. That is, the more strict your --p-perc-identity 0.9 setting, the fewer reference sequences will match. This will result in a smaller pool (this is tied to the --p-min-consensus parameter) from which to calculate the lowest common ancestor (LCA) consensus taxonomy. Thus resulting in a consensus taxonomy that is more broad, or at at higher taxonomic rank.

When using --p-perc-identity 0.8, you are allowing the retention of more hits to the reference sequences, increasing the pool of taxonomic information that can contribute to the LCA consensus taxonomy, again tied to the --p-min-consensus parameter. Kind of a "majority-rule" approach.

Keep in mind, for some datasets & primer-pair choice, it is not unusual for short-read amplicon data to be unable to classify taxa to the genus level. In some cases the more specific classification is the result of incorrect over-classification. That is, returning a more specific identification than it should be able to.

Also, feature-classifier classify-sklearn works a differently than feature-classifier classify-consensus-blast. I'd highly recommend reading the following papers:


Thank you very much, I will try it