Hello qiime2 people!
I am having some trouble classifying my features using qiime feature-classifier classify-sklearn.
Since I work with endospore-forming Bacteria, annotating the feature table using the GG or SILVA datasets leads to a lot of sequences not being recognised (endospores do not figure in most environmental studies because of their resistance to DNA extraction).
To solve this problem, I downloaded the GreenGenes classifier (with taxonomy and reference sequences separated) from the QIIME 2 Data resources page. I then proceeded to add some endospore reference sequences and taxonomy I had. Finally, I tried to build and train my expanded classifier as shown in the "training feature classifier" tutorial.
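In case it helps, the expansion step can be sketched roughly like this. It is only a minimal sketch: it assumes the reference data are a plain FASTA file plus a tab-separated taxonomy file, and every file name and record below is a made-up toy stand-in, not my real data. It appends the extra records and then checks that the sequence IDs and taxonomy IDs still agree.

```shell
# Toy stand-ins for the real files (hypothetical names and contents).
workdir=$(mktemp -d)
printf '>ref1\nACGTACGT\n>ref2\nTTGACCGA\n' > "$workdir/ref-seqs.fasta"
printf 'ref1\tk__Bacteria; p__Firmicutes\nref2\tk__Bacteria; p__Proteobacteria\n' > "$workdir/ref-taxonomy.txt"
printf '>spore1\nGGCATTAC\n' > "$workdir/extra-seqs.fasta"
printf 'spore1\tk__Bacteria; p__Firmicutes; c__Bacilli\n' > "$workdir/extra-taxonomy.txt"

# Append the extra records to the reference sequences and taxonomy.
cat "$workdir/ref-seqs.fasta" "$workdir/extra-seqs.fasta" > "$workdir/exp-ref-seqs.fasta"
cat "$workdir/ref-taxonomy.txt" "$workdir/extra-taxonomy.txt" > "$workdir/exp-ref-taxonomy.txt"

# Collect the IDs from both merged files (FASTA header up to the first space,
# first column of the taxonomy table), sorted so they can be compared.
grep '^>' "$workdir/exp-ref-seqs.fasta" | sed 's/^>//' | cut -d' ' -f1 | sort > "$workdir/seq-ids.txt"
cut -f1 "$workdir/exp-ref-taxonomy.txt" | sort > "$workdir/tax-ids.txt"

# Sanity checks: no duplicated sequence IDs, and identical ID sets in both files.
dup_ids=$(uniq -d "$workdir/seq-ids.txt" | wc -l | tr -d ' ')
if cmp -s "$workdir/seq-ids.txt" "$workdir/tax-ids.txt"; then
    ids_match=yes
else
    ids_match=no
fi
echo "sequences: $(wc -l < "$workdir/seq-ids.txt" | tr -d ' '), duplicated IDs: $dup_ids, IDs match: $ids_match"
```

Checking the ID sets first should at least rule out a mismatch between sequences and taxonomy as the cause of a later failure.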
Training invariably fails when I use the command:
qiime feature-classifier fit-classifier-naive-bayes \
  --i-reference-reads exp-ref-seqs.qza \
  --i-reference-taxonomy exp-ref-taxonomy.qza \
  --o-classifier exp-classifier.qza
ValueError: rep_set/99_otus_TSP.fasta is not a QIIME archive
I was wondering whether there is a way to add sequences to a pre-existing classifier without causing qiime2 to freak out. Also, is it a good idea to do so? Could this lead to a bad classification?
Thank you in advance, and apologies if this question was already asked. I looked around in this forum and elsewhere but - to my surprise - found nothing.
Giacomo