Creation of a new database with RESCRIPt
________________________________________

1. Download of NCBI resources for plant species:

qiime rescript get-ncbi-data \
  --p-query 'txid33090[ORGN] AND (ITS OR Internal Transcribed Spacer) NOT environmental sample[Title] NOT environmental samples[Title] NOT environmental[Title] NOT uncultured[Title] NOT unclassified[Title] NOT unidentified[Title] NOT unverified[Title]' \
  --p-n-jobs 5 \
  --o-sequences ncbi-refseqs-unfiltered_viridiplant.qza \
  --o-taxonomy ncbi-refseqs-taxonomy-unfiltered_viridiplant.qza

(The qiime rescript get-ncbi-data command should be run between 9pm and 5am US Eastern time, the off-peak window NCBI recommends for large requests.)

qiime metadata tabulate \
  --m-input-file ncbi-refseqs-taxonomy-unfiltered_viridiplant.qza \
  --o-visualization ncbi-refseqs-taxonomy-unfiltered_viridiplant.qzv

qiime metadata tabulate \
  --m-input-file ncbi-refseqs-unfiltered_viridiplant.qza \
  --o-visualization ncbi-refseqs-unfiltered_viridiplant.qzv

In my case study, I only needed the NCBI resources for the plants on my list. I used Excel to create a file called taxonomy_toronto_plants.txt from the Viridiplantae taxonomy, keeping only the species on my list. I left the file containing the reference sequences as is.

2. Download of NCBI resources for fungi species:

(I used a BioProject that gathers the ITS reference sequences for fungi.)

qiime rescript get-ncbi-data \
  --p-query '177353[BioProject]' \
  --p-n-jobs 5 \
  --o-sequences ncbi-refseqs-unfiltered_fungi.qza \
  --o-taxonomy ncbi-refseqs-taxonomy-unfiltered_fungi.qza

(As above, run the qiime rescript get-ncbi-data command between 9pm and 5am US Eastern time, so that the download is not interrupted before you get results.)

qiime metadata tabulate \
  --m-input-file ncbi-refseqs-taxonomy-unfiltered_fungi.qza \
  --o-visualization ncbi-refseqs-taxonomy-unfiltered_fungi.qzv

qiime metadata tabulate \
  --m-input-file ncbi-refseqs-unfiltered_fungi.qza \
  --o-visualization ncbi-refseqs-unfiltered_fungi.qzv

3. Creation of the custom database for taxonomic classification:

Import of the data and taxonomy assignment:

(Here I used my own plant taxonomy. If you want to use the whole Viridiplantae database, use ncbi-refseqs-taxonomy-unfiltered_viridiplant.qza instead.)

qiime tools import \
  --type 'FeatureData[Taxonomy]' \
  --input-format HeaderlessTSVTaxonomyFormat \
  --input-path taxonomy_toronto_plants.txt \
  --output-path taxonomy_toronto_plants.qza

Merging of the taxonomy and reference sequences of the Toronto plants and fungi:

qiime feature-table merge-taxa \
  --i-data taxonomy_toronto_plants.qza \
  --i-data ncbi-refseqs-taxonomy-unfiltered_fungi.qza \
  --o-merged-data taxonomy_toronto_plants_fungi.qza

qiime feature-table merge-seqs \
  --i-data ncbi-refseqs-unfiltered_viridiplant.qza \
  --i-data ncbi-refseqs-unfiltered_fungi.qza \
  --o-merged-data ref-seqs_toronto_plants_fungi.qza

4. Taxonomy classification (here I used a Naive Bayes classifier):

qiime feature-classifier fit-classifier-naive-bayes \
  --i-reference-reads ref-seqs_toronto_plants_fungi.qza \
  --i-reference-taxonomy taxonomy_toronto_plants_fungi.qza \
  --o-classifier classifier_plant_and_fungi.qza

qiime feature-classifier classify-sklearn \
  --i-classifier classifier_plant_and_fungi.qza \
  --i-reads rep-seqs-dada2.qza \
  --o-classification R_taxonomy_plants_toronto_fungi_NCBI.qza

qiime metadata tabulate \
  --m-input-file R_taxonomy_plants_toronto_fungi_NCBI.qza \
  --o-visualization R_taxonomy_plants_toronto_fungi_NCBI.qzv
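
Note on the plant list from step 1: I did that filtering by hand in Excel, but it could probably be scripted. Below is a minimal sketch, not what I actually ran. It assumes the Viridiplantae taxonomy has first been exported to a tab-separated file (for example with qiime tools export, which writes a taxonomy.tsv with a "Feature ID" / "Taxon" header), that the species label in each taxonomy string is whatever follows the last "s__", and that species_list.txt is a hypothetical file with one species name per line written exactly as it appears in the taxonomy. The function name filter_taxonomy is also made up for this example.

```shell
# Hypothetical helper (not part of QIIME 2 or RESCRIPt):
# keep only the taxonomy rows whose species label is in a list.
#   $1 = species list, one name per line (assumed file)
#   $2 = taxonomy TSV: Feature ID <TAB> Taxon
# Output is headerless, to match HeaderlessTSVTaxonomyFormat.
filter_taxonomy() {
  awk -F'\t' '
    { sub(/\r$/, "") }                                   # drop Windows line endings, if any
    NR == FNR { if ($0 != "") wanted[$0] = 1; next }     # first file: load the species names
    FNR == 1 && $1 == "Feature ID" { next }              # skip the header of an exported file
    {
      species = $2
      sub(/.*s__/, "", species)                          # assumption: species follows the last "s__"
      if (species in wanted) print $1 "\t" $2
    }
  ' "$1" "$2"
}
```

It would be used as: filter_taxonomy species_list.txt taxonomy.tsv > taxonomy_toronto_plants.txt. Check a few rows of your own taxonomy file first, since the match is exact and depends on how the species names are written there.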