Silva classifier training

Benedict · July 22, 2020, 11:42am

Hi,

I would like to classify my query sequences using the pre-trained SILVA classifier contributed by the QIIME 2 community, but it appears to be incompatible with the current QIIME 2 release (2020.6). I therefore planned to train my own classifier using the pre-formatted SILVA reference sequence and taxonomy files (release 138) from the data resources page: https://docs.qiime2.org/2020.6/data-resources/#taxonomy-classifiers-for-use-with-q2-feature-classifier. However, the job was killed because of an incompatible version of scikit-learn. Which version of scikit-learn was used to produce the pre-formatted SILVA 138 files? Alternatively, do you know which version of QIIME 2 is compatible with them?

Apart from this, my taxonomic classification with the pre-trained classifier is also always killed, due to a memory error. Based on your experience, how many CPU cores (I have 6) and how much memory (I have 12 GB of RAM) are needed for classification and classifier training to succeed?

Thank you.

-Benedict-

SoilRotifer · July 22, 2020, 3:19pm

Hi @Benedict,

Unless your environment was altered, the current SILVA 138 classifiers should work with 2020.6, as they were made with that version. Do you know which version of scikit-learn you have?
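
One quick way to answer that is to ask Python directly from inside the activated QIIME 2 environment. This is only a minimal sketch: the helper name `sklearn_version` is made up for illustration, and it simply reports whatever scikit-learn the current interpreter can import.

```python
# Report the scikit-learn version visible to the current Python environment.
# `sklearn_version` is a hypothetical helper name, not part of QIIME 2.
def sklearn_version():
    """Return the installed scikit-learn version string, or None if it cannot be imported."""
    try:
        import sklearn
    except ImportError:
        return None
    return sklearn.__version__


if __name__ == "__main__":
    version = sklearn_version()
    print("scikit-learn:", version if version else "not installed")
```

If the reported version differs from the one the classifier was trained with, that mismatch is the likely cause of the incompatibility error.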

We have an "escape hatch" of sorts: you can simply train the classifier yourself, using these files as input to feature-classifier fit-classifier-naive-bayes.
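
As a rough sketch of what that training step could look like, the snippet below assembles the `qiime feature-classifier fit-classifier-naive-bayes` command line. The helper name `build_training_command` is hypothetical, and the file names (`silva-138-99-seqs.qza`, `silva-138-99-tax.qza`, `classifier.qza`) are assumptions standing in for whatever you downloaded and wherever you want the output written.

```python
import shlex


# Hypothetical helper: builds the argument list for training a naive Bayes
# classifier with q2-feature-classifier. It does not run anything by itself.
def build_training_command(ref_seqs, ref_taxonomy, out_classifier):
    return [
        "qiime", "feature-classifier", "fit-classifier-naive-bayes",
        "--i-reference-reads", ref_seqs,          # reference sequences artifact
        "--i-reference-taxonomy", ref_taxonomy,   # matching taxonomy artifact
        "--o-classifier", out_classifier,         # where the trained classifier is written
    ]


if __name__ == "__main__":
    # File names below are placeholders (assumptions), adjust to your own paths.
    cmd = build_training_command(
        "silva-138-99-seqs.qza", "silva-138-99-tax.qza", "classifier.qza"
    )
    # Print a copy-pasteable shell command.
    print(" ".join(shlex.quote(part) for part in cmd))
    # To execute it from Python inside an activated QIIME 2 environment:
    # import subprocess; subprocess.run(cmd, check=True)
```

Because the classifier is trained in your own environment, it is built with whatever scikit-learn version you have installed, which sidesteps the version mismatch.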

-Mike
