Issue with qiime rescript evaluate-fit-classifier - wrong classification

You'll have to do this prior to training the classifier. There are two ways:

  1. When running rescript get-ncbi-data, you can set the taxonomic ranks you'd like to extract via the --p-ranks option. By default it pulls kingdom, phylum, class, order, family, genus, and species. In your case, to stop at genus, you can use: --p-ranks kingdom phylum class order family genus
  2. Perhaps more easily, you can simply make use of rescript edit-taxonomy. That is, your command to remove the species labels may look something like this:
qiime rescript edit-taxonomy  \
    --i-taxonomy NCBI-diatoms-rbcL-ref-tax.qza \
    --p-use-regex \
    --p-search-strings  's__.*' \
    --p-replacement-strings ''  \
    --o-edited-taxonomy NCBI-diatoms-rbcL-ref-tax-genus-only.qza

Then use NCBI-diatoms-rbcL-ref-tax-genus-only.qza for all your downstream curation and classifier training steps. I hope I got the regex correct, but you can play around with it. :)
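If you'd like to try the regex before running the command, here is a minimal Python sketch. It assumes that edit-taxonomy applies the pattern roughly the way Python's re.sub does, and that your labels use rank prefixes joined by "; " (k__ through s__). The taxonomy string and the strip_species helper are made up for illustration and are not part of RESCRIPt.

```python
import re

# Made-up taxonomy label in the rank-prefixed style; not taken from the real reference file.
taxon = "k__KingdomA; p__PhylumA; c__ClassA; o__OrderA; f__FamilyA; g__GenusA; s__GenusA speciesA"


def strip_species(label, pattern=r"s__.*"):
    """Remove everything matched by `pattern` (by default, the species label onward)."""
    return re.sub(pattern, "", label)


# The pattern from the command above: removes the species label,
# but leaves the "; " separator that preceded it.
print(repr(strip_species(taxon)))

# A slightly tighter pattern (an assumption to test on your own data)
# that also removes the separator before the species label.
print(repr(strip_species(taxon, r";\s*s__.*")))
```

With 's__.*' the label ends in a dangling "; " after the genus. Whether that matters downstream is worth checking on your own data; if it does, a pattern along the lines of ';\s*s__.*' may be a better fit for --p-search-strings.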

-Cheers!
