Dear all,
I am re-opening this topic because I could not find a solution that works for me. The earlier discussion was between @DannyBoi97 and @SoilRotifer, who I hope will be able to help me once more.
I am trying to train a classifier myself because I cannot find a usable classifier for Silva 132. More precisely, I did get one, but QIIME 2 (I am using version 2023.2 in a Singularity container) returns the following error message:
Plugin error from feature-classifier:
The scikit-learn version (0.20.2) used to generate this artifact does not match the current version of scikit-learn installed (0.24.1). Please retrain your classifier for your current deployment to prevent data-corruption errors.
For this reason I decided to train my own classifier and found very useful instructions here: Training feature classifiers with q2-feature-classifier — QIIME 2 2022.2.0 documentation
The question is: of all the folders that come with the Silva database download, which files should I use when filling in the following commands?
qiime tools import \
  --type 'FeatureData[Sequence]' \
  --input-path INSERT_REF_SEQ_FILE.fasta \
  --output-path ref_seq.qza
qiime tools import \
  --type 'FeatureData[Taxonomy]' \
  --input-format HeaderlessTSVTaxonomyFormat \
  --input-path INSERT_TAXONOMY_FILE.txt \
  --output-path ref-taxonomy.qza
Is it rep_set or rep_set_aligned?
Unfortunately, when I open the RESCRIPt link suggested by Mike, I only see guidelines for Silva 138, which I don't need. Could you please help?
Thanks in advance for your answer!
EDIT: while waiting for an answer, I went ahead and ran the commands as follows.
qiime tools import --type 'FeatureData[Sequence]' --input-path Databases/Silva_132/rep_set/rep_set_all/99/silva132_99.fna --output-path Databases/Silva_132/99_otus_all.qza
qiime tools import --type 'FeatureData[Taxonomy]' --input-format HeaderlessTSVTaxonomyFormat --input-path Databases/Silva_132/taxonomy/taxonomy_all/99/raw_taxonomy.txt --output-path Databases/Silva_132/ref-taxonomy.qza
Both completed successfully. I will post further updates on the classifier training and its use.
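In the meantime, here is a small sanity check that could be run on the raw files before training. It is only a sketch: `missing_taxonomy_ids` is a hypothetical helper I made up, not part of QIIME 2, and it assumes that each FASTA header starts with the sequence ID and that the taxonomy file is tab-separated with the ID in the first column. It lists sequence IDs that have no taxonomy entry, using only standard tools (`grep`, `sed`, `cut`, `sort`, `comm`).

```shell
#!/bin/sh
# Hypothetical helper (not part of QIIME 2): print every sequence ID found in
# a FASTA file that has no line in a headerless, tab-separated taxonomy file.
# Assumptions: the ID is the first token of each '>' header, and the ID is the
# first tab-separated column of the taxonomy file.
missing_taxonomy_ids() {
    fasta=$1
    taxonomy=$2
    seq_ids=$(mktemp)
    tax_ids=$(mktemp)

    # Sequence IDs: header lines, minus '>' and anything after the first whitespace.
    grep '^>' "$fasta" \
        | sed -e 's/^>//' -e 's/[[:space:]].*$//' \
        | LC_ALL=C sort -u > "$seq_ids"

    # Taxonomy IDs: first column of the TSV file.
    cut -f1 "$taxonomy" | LC_ALL=C sort -u > "$tax_ids"

    # comm -23 keeps lines unique to the first file,
    # i.e. sequences without a taxonomy entry.
    LC_ALL=C comm -23 "$seq_ids" "$tax_ids"

    rm -f "$seq_ids" "$tax_ids"
}
```

Called with the two input paths from my EDIT above (the `.fna` file first, then `raw_taxonomy.txt`), empty output would mean every sequence has a taxonomy line. As far as I understand from the tutorial, the training step itself would then be `qiime feature-classifier fit-classifier-naive-bayes` with `--i-reference-reads`, `--i-reference-taxonomy` and `--o-classifier`.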