Poor classification using QIIME 2

Good morning,

I am having difficulty getting good results, even though my pipeline has not changed.
Specifically, the classification I obtain is poor: half of the sequences are assigned only to Bacteria or OD1, and the number of OTUs is also very low (e.g. 900). So I do not think this is a good result.

I include my commands

taxa_classi:
	$(CONDA_ACTIVATE) Miqiime2-2021.8; \
	qiime feature-classifier classify-sklearn \
	  --i-classifier gg-13-8-99-nb-classifier.qza \
	  --i-reads rep-seqs-or-85.qza \
	  --o-classification taxonomy10C.qza

joined_import_filter_derep:
	export HDF5_USE_FILE_LOCKING='FALSE'; \
	$(CONDA_ACTIVATE) Miqiime2-2021.8; \
	qiime vsearch dereplicate-sequences \
	  --i-sequences fil_joined.qza \
	  --o-dereplicated-table table.qza \
	  --o-dereplicated-sequences rep-seqs.qza

Could you please help me?

Thanks a lot

I should add that I have not yet checked whether the sequencing primers have changed.

Michela

I would very much appreciate your kind help.


Hello Michela,

The information about the primers is crucial. A change in primers is the most likely explanation for the poor performance of the classifier.

Cheers
Valentyn

Hi Valentyn, I will get back to you with information about the primers. I would certainly need some guidance on which classifier would be the best fit.

Thanks a lot for your support

Michela

After you obtain the primer sequences, you can refer to a tutorial on building a reference database here:

Cheers
Valentyn


Thanks a lot!
Could you please remind me which primers are compatible with this classifier database, gg-13-8-99-nb-classifier.qza?
This would help me a great deal in working out why classification has suddenly stopped working for data from the same facility.

I would appreciate it very much

Michela

More details are available on the data resources page.

Naive Bayes classifiers trained on:

gg-13-8-99-nb-classifier.qza is the first one. No primers were used to select a region, so the full-length 16S sequences are used for k-mer profiling and classification.

Using RESCRIPt to build a database for just your region of interest should perform better because it's more specific.
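
As a rough sketch of what region-specific training could look like, the script below uses `qiime feature-classifier extract-reads` and `fit-classifier-naive-bayes` rather than RESCRIPt itself. The primer sequences (the common 515F/806R pair), the reference file names `ref-seqs.qza` and `ref-taxonomy.qza`, and all output names are placeholders I am assuming for illustration; replace them with the primers your facility actually used and with your own reference artifacts. By default it only prints the commands so they can be reviewed first.

```shell
#!/usr/bin/env bash
# Sketch only: train a Naive Bayes classifier on one amplicon region, then classify.
# ASSUMPTIONS (not from this thread): primer sequences and file names are placeholders.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"                       # 1 = print commands only; 0 = run them
F_PRIMER="${F_PRIMER:-GTGCCAGCMGCCGCGGTAA}"   # placeholder forward primer (515F)
R_PRIMER="${R_PRIMER:-GGACTACHVGGGTWTCTAAT}"  # placeholder reverse primer (806R)
REF_SEQS="${REF_SEQS:-ref-seqs.qza}"          # hypothetical reference sequences artifact
REF_TAX="${REF_TAX:-ref-taxonomy.qza}"        # hypothetical reference taxonomy artifact
QUERY="${QUERY:-rep-seqs.qza}"                # representative sequences to classify

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

region_classifier_pipeline() {
  # 1. Trim the reference sequences to the region amplified by the primers.
  run qiime feature-classifier extract-reads \
    --i-sequences "$REF_SEQS" \
    --p-f-primer "$F_PRIMER" \
    --p-r-primer "$R_PRIMER" \
    --o-reads ref-seqs-region.qza

  # 2. Train a Naive Bayes classifier on the trimmed reference.
  run qiime feature-classifier fit-classifier-naive-bayes \
    --i-reference-reads ref-seqs-region.qza \
    --i-reference-taxonomy "$REF_TAX" \
    --o-classifier region-nb-classifier.qza

  # 3. Classify the representative sequences with the new classifier.
  run qiime feature-classifier classify-sklearn \
    --i-classifier region-nb-classifier.qza \
    --i-reads "$QUERY" \
    --o-classification taxonomy-region.qza
}

region_classifier_pipeline
```

Run it with `DRY_RUN=0` inside the activated QIIME 2 environment once the primers and file names are correct. The classifier should be trained with the same QIIME 2 (scikit-learn) version that is later used for classification.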
