Which file to use to train the classifier?

Greetings everyone,

I hope someone can help me. I'm trying to train a classifier for my thesis, but I'm not sure which files to download from the SILVA website. I'm following the QIIME 2 tutorial (Training feature classifiers with q2-feature-classifier — QIIME 2 2024.2.0 documentation). I searched the forum, but the file names mentioned there don't match what I see on the site; perhaps the names have changed, which makes it confusing. Are the two files listed after the commands the correct ones? As I understand it, I need one file containing the sequences and another containing the reference taxonomy to run the following script:

qiime tools import \
  --type 'FeatureData[Sequence]' \
  --input-path 85_otus.fasta \
  --output-path 85_otus.qza

qiime tools import \
  --type 'FeatureData[Taxonomy]' \
  --input-format HeaderlessTSVTaxonomyFormat \
  --input-path 85_otu_taxonomy.txt \
  --output-path ref-taxonomy.qza

From: Archive
SILVA_138.1_SSURef_NR99_tax_silva.fasta

From: Archive
tax_slv_ssu_138.1.txt
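
One quick way to judge whether a sequence file and a taxonomy file belong together is to check that every sequence ID in the FASTA file also appears in the first column of the taxonomy file, since the two import commands above expect a FASTA file plus a headerless, tab-separated table keyed by the same IDs. The sketch below is only an illustration: the function name missing_taxonomy_ids is hypothetical and not part of QIIME 2, and it assumes the taxonomy file is non-empty and tab-separated.

```shell
# Hypothetical helper, not part of QIIME 2.
# Prints every FASTA sequence ID that has no row in the taxonomy table.
#   $1 = FASTA file
#   $2 = headerless TSV taxonomy file (ID<TAB>taxonomy string)
missing_taxonomy_ids() {
    awk -F '\t' '
        # First file on the command line (the taxonomy table): remember each ID.
        NR == FNR { seen[$1] = 1; next }
        # Second file (the FASTA): inspect header lines only.
        /^>/ {
            id = substr($1, 2)            # drop the leading ">"
            sub(/[ \t].*$/, "", id)       # keep text up to the first whitespace
            if (!(id in seen)) print id
        }
    ' "$2" "$1"
}
```

Calling missing_taxonomy_ids 85_otus.fasta 85_otu_taxonomy.txt should print nothing when the two files match; any IDs it prints are sequences without a taxonomy entry.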

Hi @Niyuh,

Building a reference sequence database can be complicated, so there is a QIIME 2 plugin (RESCRIPt) that automates the process, including for SILVA data. Please see this tutorial:

The sequences, taxonomy, and pre-trained classifiers prepared using the RESCRIPt plugin as described in this tutorial are also available on the QIIME 2 website data resources page: Data resources — QIIME 2 2024.2.0 documentation
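
For orientation, the core of a RESCRIPt-based workflow can be sketched as a short shell function. This is an abbreviated sketch rather than the full tutorial: it skips the filtering and dereplication steps the tutorial describes, the output file names are placeholders, and the function name build_silva_classifier is hypothetical. It assumes QIIME 2 with the RESCRIPt and q2-feature-classifier plugins installed; flag names can change between releases, so confirm them with --help for your version.

```shell
# Sketch only: assumes QIIME 2 with RESCRIPt and q2-feature-classifier installed.
# build_silva_classifier and all .qza file names are placeholders for illustration.
build_silva_classifier() {
    # 1. Download SILVA sequences and taxonomy directly as QIIME 2 artifacts.
    qiime rescript get-silva-data \
        --p-version '138.1' \
        --p-target 'SSURef_NR99' \
        --o-silva-sequences silva-138.1-rna-seqs.qza \
        --o-silva-taxonomy silva-138.1-tax.qza || return 1

    # 2. SILVA distributes RNA sequences; convert them to DNA.
    qiime rescript reverse-transcribe \
        --i-rna-sequences silva-138.1-rna-seqs.qza \
        --o-dna-sequences silva-138.1-seqs.qza || return 1

    # 3. Train a naive Bayes classifier on the reference reads and taxonomy.
    qiime feature-classifier fit-classifier-naive-bayes \
        --i-reference-reads silva-138.1-seqs.qza \
        --i-reference-taxonomy silva-138.1-tax.qza \
        --o-classifier silva-138.1-classifier.qza
}
```

Running build_silva_classifier in an activated QIIME 2 environment would execute the three steps in order and stop at the first download or conversion failure. If a pre-trained classifier from the data resources page fits your region and QIIME 2 version, none of this is needed.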


Thank you very much :slight_smile: I'll follow that tutorial. Sorry for trying to simplify the process.
