Input and output files of qiime feature-classifier classify-sklearn have different Feature IDs

Hi, I have a problem when classifying sequences. The taxonomy output file has different feature IDs than the input rep-seqs file. Only a few feature IDs are the same. Otherwise the taxonomy file looks fine.
I use the latest version of Qiime2 (2020.2) and run this command:
qiime feature-classifier classify-sklearn --i-classifier silva-132-99-nb-classifier.qza --i-reads B-rep-seqs.qza --o-classification B-taxonomy.qza
I downloaded the pre-trained Silva classifier from the data resources page on the Qiime2 website (Silva 132 99% OTUs full-length sequences).
I cannot find a previous similar issue in the forum, so I am hoping it is a simple mistake on my part. Hopefully someone can help.

Hi @AsaJac, could you please post both of these files here:
B-rep-seqs.qza
B-taxonomy.qza

You can send them via DM if these data need to be kept private.

This will help us diagnose. Thanks!

Hi, the B samples worked, while the A samples did not. I have attached the files.

A-rep-seqs.qza (246.7 KB) A-taxonomy.qza (164.8 KB)

Hi @AsaJac,
It looks like the issue is a mix-up in filenames (or processing parameters).

Take a look at the provenance in those two files (you can view this with https://view.qiime2.org/)… both artifacts come from the same original data, but diverge at the filter-samples step.
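
If you would rather check this from a script than in the viewer, here is a rough sketch. It relies on the assumption that a .qza file is a zip archive holding one action.yaml per provenance step under a provenance/ directory; the function name provenance_lines is made up for this example and is not part of QIIME 2.

```python
import zipfile


def provenance_lines(qza_path, needle):
    """Return (archive member, line) pairs for provenance lines containing needle.

    Assumption: the artifact is a zip archive and each recorded action is
    stored as a file named action.yaml somewhere under a provenance/ folder.
    """
    hits = []
    with zipfile.ZipFile(qza_path) as archive:
        for member in archive.namelist():
            # Only look at the per-action provenance records.
            if "/provenance/" not in member:
                continue
            if not member.endswith("action/action.yaml"):
                continue
            text = archive.read(member).decode("utf-8", errors="replace")
            for line in text.splitlines():
                if needle in line:
                    hits.append((member, line.strip()))
    return hits
```

Calling provenance_lines("A-rep-seqs.qza", "where") and then the same on A-taxonomy.qza should, if the assumption holds, list the filter expression recorded for each artifact so the two can be compared side by side.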

Specifically, the rep-seqs.qza were filtered where "[Primers]='A'" but the taxonomy was filtered where "[Primers]='B'", so the samples (and as a consequence the features) will not overlap 100% in the resulting table.

In other words, you filtered your rep seqs two different times and are trying to compare the taxonomy to the wrong rep seqs artifact.
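
To see how far two artifacts actually overlap, a small check like the one below may help. It is only a sketch and assumes both artifacts were first exported with qiime tools export, giving a FASTA file of representative sequences and a tab-separated taxonomy file whose first line is a header and whose first column is the feature ID. The function names and the file names in the note after the code are placeholders.

```python
def fasta_ids(path):
    """Collect sequence IDs (first token after '>') from a FASTA file."""
    ids = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    ids.add(fields[0])
    return ids


def taxonomy_ids(path):
    """Collect feature IDs from the first column of a taxonomy TSV.

    Assumption: the first line is a header row and is skipped.
    """
    ids = set()
    with open(path) as handle:
        next(handle, None)  # skip the assumed header line
        for line in handle:
            line = line.rstrip("\n")
            if line and not line.startswith("#"):
                ids.add(line.split("\t")[0])
    return ids


def compare_ids(fasta_path, taxonomy_path):
    """Report which feature IDs are shared and which appear on one side only."""
    seqs = fasta_ids(fasta_path)
    tax = taxonomy_ids(taxonomy_path)
    return {
        "shared": seqs & tax,
        "only_in_rep_seqs": seqs - tax,
        "only_in_taxonomy": tax - seqs,
    }
```

For example, compare_ids("dna-sequences.fasta", "taxonomy.tsv") returns three sets. If the taxonomy really was generated from that rep-seqs file, the two "only_in" sets should be empty.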

I hope that helps!

That makes sense. I will look back through the filtering steps and see if I can fix it. I am new to Qiime2, so thank you very much for your assistance. It is highly appreciated.
