We are trying to use a very specific COI database to classify our data: MZGdb Atlas Cnidaria (World Oceans)

So far we have downloaded the desired .fasta file from that website (MZGfasta-coi__T4000200__o00__A.fasta), converted it to upper case, and then ran:

qiime tools import \
--type 'FeatureData[Sequence]' \
--input-path MZGfasta-coi__T4000200__o00__A_upper.fasta \
--output-path MZGfasta-coi__T4000200__o00__A.qza

to create a .qza. We then downloaded the associated .mothur taxonomy file from the same site, made it tab-separated, and ran the following steps:

qiime tools import \
--type 'FeatureData[Taxonomy]' \
--input-format HeaderlessTSVTaxonomyFormat \
--input-path MZGmothur-coi__T4000200__o00__A.tsv \
--output-path MZGmother-coi__t4000200__o00__A.qza

qiime feature-classifier fit-classifier-naive-bayes \
--i-reference-reads MZGfasta-coi__T4000200__o00__A.qza \
--i-reference-taxonomy MZGmother-coi__t4000200__o00__A.qza \
--o-classifier COI_jellyfish_classifier.qza

qiime tools validate COI_jellyfish_classifier.qza

qiime feature-classifier classify-sklearn \
--i-classifier COI_jellyfish_classifier.qza \
--i-reads rep-seqs-dada2.qza \
--o-classification jellyfish-taxonomy-rescript.qza

qiime metadata tabulate \
--m-input-file jellyfish-taxonomy-rescript.qza \
--o-visualization jellyfish-taxonomy-rescript.qzv
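
For reference, the two preprocessing steps mentioned above (upper-casing the FASTA and tab-separating the mothur taxonomy) could be sketched roughly as below. This is only an illustration, not necessarily the exact commands that were run: the function names `fasta_upper` and `taxonomy_to_tsv` are made up, and the sketch assumes that each taxonomy line is a sequence ID followed by whitespace and then the taxonomy string.

```shell
# Hypothetical helpers for illustration; these are not QIIME 2 commands.

# Upper-case the sequence lines only. Header lines (starting with ">") are
# printed unchanged so that the sequence IDs still match the taxonomy file.
fasta_upper() {
  awk '/^>/ { print; next } { print toupper($0) }' "$1"
}

# Replace the first run of spaces/tabs on each non-empty line with one tab,
# giving "<id><TAB><taxonomy>" (assumed input: "<id><whitespace><taxonomy>").
taxonomy_to_tsv() {
  awk 'NF { id = $1; sub(/^[^ \t]+[ \t]+/, ""); print id "\t" $0 }' "$1"
}

# Example usage (file names taken from the post; the .mothur name is assumed):
# fasta_upper MZGfasta-coi__T4000200__o00__A.fasta > MZGfasta-coi__T4000200__o00__A_upper.fasta
# taxonomy_to_tsv MZGmothur-coi__T4000200__o00__A.mothur > MZGmothur-coi__T4000200__o00__A.tsv
```

Leaving the header lines untouched matters here: if the IDs were upper-cased along with the sequences, they might no longer match the IDs in the taxonomy file.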

We are unsure whether our data are really just very strange or whether something has gone wrong. In the bar chart from the visualisation, almost everything is identified as a single species, only 6 taxa are identified in total, and none of the species we expected show up at all.

We are running qiime2-2021.2 (which we realise is quite old, but we don't think this is the problem?) on Linux Mint 20.1 Cinnamon (on a virtual machine using Proxmox).

Are we doing something wrong?

Hi @asajoh,

On the surface, everything you ran looks okay. But, just in case there is a hidden formatting issue, can you DM me links to your MZGfasta-coi__T4000200__o00__A.qza and MZGmother-coi__t4000200__o00__A.qza files?
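
One kind of hidden formatting issue that can be checked locally is a mismatch between the sequence IDs in the FASTA file and the IDs in the taxonomy file. The snippet below is a rough sketch of such a check, not an official QIIME 2 tool: the function name `ids_missing_taxonomy` is made up, and it assumes that IDs contain no whitespace and that the taxonomy file has the ID in its first column.

```shell
# Hypothetical helper for illustration; this is not a QIIME 2 command.
# Prints every FASTA sequence ID that has no entry in the taxonomy file.
#   $1 = taxonomy file (ID in the first column)
#   $2 = FASTA file (ID = text after ">" up to the first whitespace)
ids_missing_taxonomy() {
  awk '
    # First file (taxonomy): remember each ID found in column 1.
    NR == FNR { if (NF) seen[$1] = 1; next }
    # Second file (FASTA): report header IDs that were never seen.
    /^>/ { id = substr($1, 2); if (!(id in seen)) print id }
  ' "$1" "$2"
}

# Example usage (file names taken from the post):
# ids_missing_taxonomy MZGmothur-coi__T4000200__o00__A.tsv \
#     MZGfasta-coi__T4000200__o00__A_upper.fasta | head
```

If this prints nothing, every sequence has a taxonomy entry under the stated assumptions. Any IDs it does print would be worth inspecting for case changes, stray characters, or truncated names.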


