Training a classifier with a DADA2 FASTA reference database

I have a single reference database (FASTA) file for 12S that I have been using in my DADA2 pipeline in R (see the first few lines of the file below). It is in the same format as this PR2 version.


I would like to use this for taxonomic assignment in QIIME 2, but I am having trouble figuring out how to import both the taxonomy and the sequences when they are in a single file. I am using the RESCRIPt plugin and have tried the following commands, but they are not correct for my data (note: the sequences are not all the same length):

qiime tools import \
  --input-path 12s.fasta \
  --output-path 12s.qza \
  --type 'FeatureData[AlignedSequence]'

qiime feature-classifier fit-classifier-naive-bayes \
  --i-reference-reads 12s.fasta \
  --i-reference-taxonomy 12s.qza \
  --o-classifier silva-138.1-ssu-nr99-classifier.qza

I looked through the forum posts and the RESCRIPt GitHub documentation for building databases here, but neither seems to cover what I need. Can someone point me to the right documentation, or explain how to do this?

Hi @lastewart,

There is a 12S rRNA marker gene example for metazoa provided here.

If you are interested in making an amplicon-specific version, you can try combining this approach with the current pre-release of the extract-seq-segments action.
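
If you would rather keep using your existing file, one possible route is to split it into the two files QIIME 2 expects: a FASTA of sequences with short unique IDs, and a tab-separated table mapping each ID to its taxonomy string. Below is a minimal Python sketch of that idea, not a RESCRIPt feature. It assumes each header line holds only the semicolon-delimited taxonomy (the usual DADA2 assignTaxonomy layout); the function names, the `ref` ID prefix, the output file names, and the demo records are all made up for illustration.

```python
import io


def read_records(handle):
    """Yield (header, sequence) pairs from a FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)  # sequences may wrap over several lines
    if header is not None:
        yield header, "".join(chunks)


def split_dada2_fasta(fasta_in, seqs_out, tax_out, id_prefix="ref"):
    """Write a sequence FASTA and a headerless ID<TAB>taxonomy table.

    Assumption: the whole header is the taxonomy string. IDs are
    generated (ref1, ref2, ...) because the headers are not unique IDs.
    Returns the number of records written.
    """
    n = 0
    for n, (header, seq) in enumerate(read_records(fasta_in), start=1):
        feature_id = f"{id_prefix}{n}"
        # Dropping the trailing ';' is a choice made here, not a requirement.
        taxon = header.strip().rstrip(";")
        seqs_out.write(f">{feature_id}\n{seq}\n")
        tax_out.write(f"{feature_id}\t{taxon}\n")
    return n


if __name__ == "__main__":
    # Made-up records standing in for the real 12S file.
    demo = io.StringIO(
        ">Eukaryota;Chordata;Actinopteri;\nACGTAC\nGGTT\n"
        ">Eukaryota;Chordata;Mammalia;\nTTGACA\n"
    )
    seqs, tax = io.StringIO(), io.StringIO()
    print(split_dada2_fasta(demo, seqs, tax), "records")
    print(seqs.getvalue(), end="")
    print(tax.getvalue(), end="")
    # For real files (hypothetical names):
    # with open("12s.fasta") as f, open("12s-seqs.fasta", "w") as s, \
    #         open("12s-tax.tsv", "w") as t:
    #     split_dada2_fasta(f, s, t)
```

The two outputs can then be imported separately with qiime tools import: the sequences as type 'FeatureData[Sequence]' (unaligned, so differing lengths are fine) and the table as type 'FeatureData[Taxonomy]' with --input-format HeaderlessTSVTaxonomyFormat, since the sketch writes no header row. The resulting artifacts are what fit-classifier-naive-bayes takes as --i-reference-reads and --i-reference-taxonomy.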