Feature Classifier - Not enough values to unpack

Hi again,

I figured out what the problem was: the imported taxonomy file was not correct and did not contain the taxonomy strings. I had previously had issues importing the taxonomy with the following command:

qiime tools import --type 'FeatureData[Taxonomy]' --source-format HeaderlessTSVTaxonomyFormat --input-path /shared/c3/bio_db/Phytoref/qiime2.tax.txt --output-path phytoref_taxonomy.qza

I got an error saying that the file I was trying to import is not headerless, even though it clearly has no header. Here are the first few lines of the qiime2.tax.txt file:

202#HE610155 Eukaryota;Archaeplastida;Chlorophyta;Ulvophyceae;Ulvophyceae_X;Ulvophyceae_XX;Ulvophyceae_XXX;Ulvophyceae_XXXX;Desmochloris;Desmochloris_halophila;
803#AF514849 Eukaryota;Stramenopiles;Ochrophyta;Bacillariophyta;Bacillariophyceae;Naviculales;Naviculales_X;Naviculaceae;Haslea;Haslea_crucigera;

My workaround was to add a header line. After that, the above command ran without error and produced an output file (which I now know is not correct).

A colleague of mine figured out that the import error was related to the '#' character in the identifiers. We replaced each '#' with an underscore ('_'), in the sequence file as well, and everything worked.
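
For anyone who wants to script that replacement, here is a minimal Python sketch. It is not what we actually ran. It assumes the taxonomy file is tab-separated with the identifier in the first column, and that each FASTA identifier runs from '>' to the first whitespace. The function names and the output file names are only illustrative.

```python
import re


def sanitize_id(identifier):
    """Replace every '#' in an identifier with '_'."""
    return identifier.replace("#", "_")


def sanitize_taxonomy_line(line):
    """Fix only the identifier (first tab-separated field) of a taxonomy line."""
    ident, sep, rest = line.partition("\t")
    return sanitize_id(ident) + sep + rest


def sanitize_fasta_line(line):
    """Fix the identifier in a FASTA header line; sequence lines pass through."""
    # The pattern only matches lines starting with '>', so other lines are unchanged.
    return re.sub(r"^>(\S+)", lambda m: ">" + sanitize_id(m.group(1)), line)


def rewrite_file(in_path, out_path, fix_line):
    """Copy in_path to out_path, applying fix_line to each line."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(fix_line(line))


# Example usage (output names are hypothetical):
# rewrite_file("qiime2.tax.txt", "qiime2.tax.clean.txt", sanitize_taxonomy_line)
# rewrite_file("PhytoRef_with_taxonomy.fasta", "phytoref.clean.fasta", sanitize_fasta_line)
```

Applying the same replacement to both files keeps the identifiers in the taxonomy and sequence files matching.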

So it seems that '#' characters should be avoided in the identifiers of taxonomy files. The '#' is part of every identifier in the PhytoRef database (http://phytoref.sb-roscoff.fr/static/downloads/PhytoRef_with_taxonomy.fasta), which can be used to assess eukaryotic species based on their chloroplast 16S.
