Poor taxonomic resolution for ITS1 classification

Hello,
I am working on an ITS1 dataset, and when I classify my sequences I get very poor taxonomic resolution.
In fact, more than 50% of my sequences are classified only at kingdom level (Fungi).

Here is the pipeline I am using.

IMPORT SEQS

qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path manifest.tsv \
  --output-path raw-reads.qza \
  --input-format PairedEndFastqManifestPhred33V2

DEMUX SUMMARISE

qiime demux summarize \
  --i-data raw-reads.qza \
  --o-visualization raw-reads.qzv

DENOISING DADA2

qiime dada2 denoise-paired \
  --i-demultiplexed-seqs raw-reads.qza \
  --p-trunc-len-f 234 \
  --p-trunc-len-r 175 \
  --o-table table.qza \
  --o-representative-sequences repseqs.qza \
  --o-denoising-stats dada-stats.qza

CLASSIFICATION

qiime feature-classifier classify-sklearn \
  --i-reads repseqs.qza \
  --i-classifier unite_classifier_99_27.10.2022.qza \
  --o-classification taxonomy.qza \
  --verbose

I was wondering if I am doing something wrong here.
Can you help me improve my classification?

Thank you so much for your help!

Hi @Antani

If you search the forum you'll find several threads on this. So many of your sequences are classified only as Fungi because there are no outgroups in your reference set. That is, these sequences are likely non-fungal eukaryotes (see here, here, and here).

This issue crops up with other reference databases too, like rRNA reference sequences.
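If it helps to put a number on the problem before and after changing the reference, here is a rough sketch for measuring the kingdom-only fraction. It assumes the taxonomy has been exported to a tab-separated taxonomy.tsv (Feature ID, Taxon, Confidence) and that labels follow the UNITE "k__" style. The function name kingdom_only_pct is just something I made up for illustration.

```shell
# Hypothetical helper: print the percentage of features whose taxonomy
# string stops at kingdom level (e.g. "k__Fungi" with no lower ranks).
# Assumes a tab-separated file with a header row and the taxon in column 2.
#
# Typical use (needs a QIIME 2 environment for the export step):
#   qiime tools export --input-path taxonomy.qza --output-path exported-taxonomy
#   kingdom_only_pct exported-taxonomy/taxonomy.tsv
kingdom_only_pct() {
  awk -F'\t' '
    NR == 1 || /^#/ { next }              # skip the header row and comment lines
    { total++ }                           # every remaining row is one feature
    $2 ~ /^k__[^;]*$/ { kingdom_only++ }  # taxon has a kingdom label and nothing after it
    END { if (total > 0) printf "%.1f\n", 100 * kingdom_only / total }
  ' "$1"
}
```

Running the same check on the new taxonomy after reclassifying should show whether the kingdom-only features moved to non-fungal lineages.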

I recommend using the UNITE release that also contains non-fungal eukaryotes.
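A minimal sketch of what training a classifier on that release could look like, in case it is useful. The two input files are placeholders for whatever reference FASTA and taxonomy file come with the all-eukaryotes release you download, and the function name and output names (unite-euk-*.qza) are my own invention. I am also assuming the taxonomy file is a headerless two-column TSV. Treat this as a starting point rather than a tested recipe.

```shell
# Sketch only: wraps the three steps so they stop at the first failure.
# $1 = reference sequences (FASTA), $2 = matching taxonomy file.
# Both are placeholders; substitute the files from your UNITE download.
train_unite_euk_classifier() {
  ref_fasta=$1
  ref_taxonomy=$2

  # Import the reference sequences
  qiime tools import \
    --type 'FeatureData[Sequence]' \
    --input-path "$ref_fasta" \
    --output-path unite-euk-seqs.qza &&

  # Import the taxonomy (assumed headerless: ID <tab> taxonomy string)
  qiime tools import \
    --type 'FeatureData[Taxonomy]' \
    --input-format HeaderlessTSVTaxonomyFormat \
    --input-path "$ref_taxonomy" \
    --output-path unite-euk-tax.qza &&

  # Train the naive Bayes classifier on the full-length references
  qiime feature-classifier fit-classifier-naive-bayes \
    --i-reference-reads unite-euk-seqs.qza \
    --i-reference-taxonomy unite-euk-tax.qza \
    --o-classifier unite-euk-classifier.qza
}
```

You would then rerun classify-sklearn with --i-classifier unite-euk-classifier.qza in place of the fungi-only classifier.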

-Mike

