Training SILVA classifiers based on NCBI taxonomy

I want to train a classifier on the SILVA database using the NCBI taxonomy. I have tried a few times with RESCRIPt but keep running into problems, so I suspect I have chosen the wrong files.

I went to the SILVA v138.2 archive and downloaded the following taxonomy files:

  • SILVA_138.2_SSURef_Nr99---tax_ncbi-species.txt
  • SILVA_138.2_SSURef_Nr99---taxmap_ncbi.txt
  • tax_slv_ssu_138.2.tre

And the following sequence file:

  • SILVA_138.2_SSURef_NR99_tax_silva_trunc.fasta

Has anyone tried this before? (I am using the EMP 515F-806R primers.)

Hi @Chengzi,

Can you provide the commands you are running? Also, if you want to use SILVA v138.2, you can do one of the following:

  • Download and process the 138.2 files manually as outlined here. Click on the drop-down menu "The gritty details". This lets you use SILVA versions that have not yet been added to RESCRIPt's get-silva-data action. Just update the version numbers in the file names.
  • Alternatively, you can install the GitHub version of RESCRIPt. :warning: Note that this may not always work, because the GitHub code often tracks the QIIME 2 development cycle, so the current RESCRIPt code on GitHub may or may not be compatible with the current QIIME 2 release.
    conda activate qiime2-amplicon-2024.5
    pip install git+https://github.com/bokulich-lab/RESCRIPt.git
    qiime dev refresh-cache

:warning: The latest version of SILVA, 138.2, uses a new taxonomy schema as outlined here, so its taxonomy is likely not compatible with any additional data downloaded from NCBI. It would be best to stick with 138.1, as the only difference between 138.1 and 138.2 is the updated taxonomy. Also, even with prior versions of SILVA, there may be differences in rank labels, e.g. "kingdom" vs. "domain", in which case you can use RESCRIPt's edit-taxonomy action.
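
To make the rank-label point concrete, here is a rough sketch of that kind of relabeling. It is not the RESCRIPt edit-taxonomy command (which works on QIIME 2 artifacts); it is a plain awk illustration on an exported taxonomy TSV. It assumes a "Feature ID <TAB> Taxon" layout in which domain-level labels start with "d__" and the kingdom-style equivalent is "k__". The function name and the example rows are made up.

```shell
# Hypothetical helper (not part of RESCRIPt): rewrite the leading rank prefix
# of the taxonomy string in column 2 from "d__" (domain) to "k__" (kingdom).
# Assumes tab-separated input; rows without a leading "d__" pass through unchanged.
relabel_domain_to_kingdom() {
  awk 'BEGIN { FS = OFS = "\t" } { sub(/^d__/, "k__", $2); print }'
}

# Made-up example rows, for illustration only.
printf 'seq1\td__Bacteria; p__Firmicutes\nseq2\td__Archaea; p__Crenarchaeota\n' \
  | relabel_domain_to_kingdom
```

Whichever tool you use, the goal is the same: make sure both taxonomies use identical rank prefixes before you merge them or train a classifier on the combined data.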
