It appears that the silva-132-99-515-806-nb-classifier.qza artifact (https://data.qiime2.org/2019.10/common/silva-132-99-515-806-nb-classifier.qza) contains whitespace in some of its taxonomy strings. This causes downstream errors when running qiime metadata tabulate or qiime taxa filter-table, such as: CategoricalMetadataColumn does not support values with leading or trailing whitespace characters. Column 'Taxon' has the following value: 'D_0__Bacteria;D_1__Gemmatimonadetes;D_2__Gemmatimonadetes;D_3__Gemmatimonadales;D_4__Gemmatimonadaceae;D_5__uncultured;D_6__uncultured bacterium '
I’d like to just filter the whitespace out of silva-132-99-515-806-nb-classifier.qza, but exporting it via qiime tools export only creates a tar file, which un-tars to a pkl file. Trying to load that file via pickle.load() fails with the following error: _pickle.UnpicklingError: invalid load key, 'D'.
Note that the whitespace in the taxon strings has been stripped.
The machine-learning classifier doesn't work like that: it is a binary file, and editing it isn't recommended.
i) There is no need to remove the whitespace; simply upgrade. Please note that 2019.10 is the only version of QIIME 2 currently supported.
ii) You are generalizing from your experience of trying to edit a binary pickle. Exporting and extracting data are first-class citizens in QIIME 2, and the resulting data is in whatever format the Semantic Type uses to represent it (TSV, JSON, fastq, pkl, etc.). You are simply trying to do something that doesn't really make sense for this kind of data.
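For anyone who cannot upgrade right away, the whitespace can be cleaned in the classification output rather than in the classifier itself. The following is only a rough sketch, not an official QIIME 2 workflow. It assumes the taxonomy result was exported to a tab-separated file whose header includes a Taxon column (as named in the error above); the function name strip_taxon_whitespace and the sample row are hypothetical.

```python
def strip_taxon_whitespace(tsv_text, taxon_column="Taxon"):
    """Strip leading/trailing whitespace from the taxon column of a TSV string.

    Assumes the first line is a tab-separated header containing taxon_column.
    """
    lines = tsv_text.splitlines()
    header = lines[0].split("\t")
    taxon_index = header.index(taxon_column)

    cleaned = [lines[0]]
    for line in lines[1:]:
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        # Only the outer whitespace is removed; spaces inside a name
        # such as "uncultured bacterium" are kept.
        fields[taxon_index] = fields[taxon_index].strip()
        cleaned.append("\t".join(fields))
    return "\n".join(cleaned) + "\n"


# Hypothetical one-row table with a trailing space in the Taxon value.
sample = (
    "Feature ID\tTaxon\tConfidence\n"
    "abc\tD_0__Bacteria;D_6__uncultured bacterium \t0.9\n"
)
cleaned_sample = strip_taxon_whitespace(sample)
```

The cleaned text could then be written back to a file and re-imported with qiime tools import. The classifier artifact itself is left untouched.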