Plugin error from diversity: 'latin-1' codec can't encode character '\u03b2' (...)

Hi!

I’ve been getting the following error when running the “qiime diversity core-metrics-phylogenetic” command:

Plugin error from diversity:
'latin-1' codec can't encode character '\u03b2' in position 3464: ordinal not in range(256)

The message suggests a bad character, but I’m unable to find one (specifically \u03b2, the Greek letter beta) in any of my files. The input files are my metadata file (which is very simple, attached below), the ASV feature table created by the dada2 plugin, and a phylogenetic tree created by the fragment-insertion plugin (SEPP algorithm).

metadata_barcodes_only.tsv (6.6 KB)

The command I ran (nothing special):

qiime diversity core-metrics-phylogenetic \
	  --i-phylogeny sepp_rooted_tree.qza \
	  --i-table feature_table.qza \
	  --p-sampling-depth 15227 \
	  --m-metadata-file metadata_barcodes_only.tsv \
	  --output-dir Diversity

Log file output:

/conda/envs/qiime2/lib/python3.6/site-packages/sklearn/metrics/pairwise.py:1575: DataConversionWarning: Data was converted to boolean for metric jaccard
warnings.warn(msg, DataConversionWarning)
/conda/envs/qiime2/lib/python3.6/site-packages/skbio/stats/ordination/_principal_coordinate_analysis.py:152: RuntimeWarning: The result contains negative eigenvalues. Please compare their magnitude with the magnitude of some of the largest positive eigenvalues. If the negative ones are smaller, it’s probably safe to ignore them, but if they are large in magnitude, the results won’t be useful. See the Notes section for more details. The smallest eigenvalue is -0.09186268763487035 and the largest is 2.8949291936192294.
RuntimeWarning
Traceback (most recent call last):
  File "/conda/envs/qiime2/lib/python3.6/site-packages/q2cli/commands.py", line 327, in __call__
    results = action(**arguments)
  File "</conda/envs/qiime2/lib/python3.6/site-packages/decorator.py:decorator-gen-390>", line 2, in core_metrics_phylogenetic
  File "/conda/envs/qiime2/lib/python3.6/site-packages/qiime2/sdk/action.py", line 240, in bound_callable
    output_types, provenance)
  File "/conda/envs/qiime2/lib/python3.6/site-packages/qiime2/sdk/action.py", line 477, in callable_executor
    outputs = self._callable(scope.ctx, **view_args)
  File "/conda/envs/qiime2/lib/python3.6/site-packages/q2_diversity/_core_metrics.py", line 59, in core_metrics_phylogenetic
    metric='unweighted_unifrac', n_jobs=n_jobs)
  File "</conda/envs/qiime2/lib/python3.6/site-packages/decorator.py:decorator-gen-482>", line 2, in beta_phylogenetic
  File "/conda/envs/qiime2/lib/python3.6/site-packages/qiime2/sdk/action.py", line 240, in bound_callable
    output_types, provenance)
  File "/conda/envs/qiime2/lib/python3.6/site-packages/qiime2/sdk/action.py", line 411, in callable_executor
    spec.qiime_type, output_view, spec.view_type, prov)
  File "/conda/envs/qiime2/lib/python3.6/site-packages/qiime2/sdk/result.py", line 273, in _from_view
    provenance_capture=provenance_capture)
  File "/conda/envs/qiime2/lib/python3.6/site-packages/qiime2/core/archive/archiver.py", line 316, in from_data
    Format.write(rec, type, format, data_initializer, provenance_capture)
  File "/conda/envs/qiime2/lib/python3.6/site-packages/qiime2/core/archive/format/v5.py", line 21, in write
    provenance_capture)
  File "/conda/envs/qiime2/lib/python3.6/site-packages/qiime2/core/archive/format/v1.py", line 26, in write
    prov_dir, [root / cls.METADATA_FILE, archive_record.version_fp])
  File "/conda/envs/qiime2/lib/python3.6/site-packages/qiime2/core/archive/provenance.py", line 313, in finalize
    self.write_citations_bib()
  File "/conda/envs/qiime2/lib/python3.6/site-packages/qiime2/core/archive/provenance.py", line 304, in write_citations_bib
    self.citations.save(str(self.path / self.CITATION_FILE))
  File "/conda/envs/qiime2/lib/python3.6/site-packages/qiime2/core/cite.py", line 71, in save
    bp.dump(db, f, writer=writer)
  File "/conda/envs/qiime2/lib/python3.6/site-packages/bibtexparser/__init__.py", line 111, in dump
    bibtex_file.write(writer.write(bib_database))
UnicodeEncodeError: 'latin-1' codec can't encode character '\u03b2' in position 3464: ordinal not in range(256)

Additional details:
qiime version: 2019.7.0

Any help would be much appreciated!
Thanks,
Efrat

Hey @efratm!

I think this is the first time I’ve seen someone with the latin-1 encoding (which is weird as it used to be a super common standard).

There aren’t necessarily any bad characters here; rather, latin-1 just cannot express some symbols. What strikes me as odd is that QIIME 2 is attempting to use latin-1 at all. It looks like perhaps we did not specify utf-8 (which can encode every Unicode character) to our bibtex library, and so it must be using a default?
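
To make the point about latin-1 concrete, here is a minimal Python sketch. The string `"β-diversity"` is only an illustrative stand-in, not the actual citation text that QIIME 2 writes. It shows that latin-1 (which covers only code points 0 to 255) rejects the Greek beta, while utf-8 encodes it without complaint.

```python
# Illustration only: 'β' is U+03B2, which lies outside latin-1's 0..255 range.
text = "β-diversity"  # hypothetical sample string

try:
    text.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError as err:
    latin1_ok = False
    print(f"latin-1 failed: {err}")

# UTF-8 can represent the same character as a two-byte sequence.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes)  # b'\xce\xb2-diversity'
```

The same failure happens when text is written to a file that was opened with a latin-1 encoding, which is what the last frame of the traceback points at.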

Where it is getting that default is less clear. I would guess that it’s your environment configuration, so let’s see if that’s true:

env | grep "LC\|LANG\|LANGUAGE"

That command will tell us how your environment’s locale is configured, and we can brainstorm from there.
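
The same check can be made from inside Python, which also shows the encoding the interpreter derives from those variables. This is a rough sketch: `locale_report` is a hypothetical helper name, and it assumes (as holds for the Python 3.6 in this environment) that `open()` without an `encoding=` argument falls back to `locale.getpreferredencoding(False)`.

```python
import locale
import os


def locale_report(env=os.environ):
    """Collect locale-related variables and the encoding Python infers."""
    names = [k for k in env if k.startswith("LC_") or k in ("LANG", "LANGUAGE")]
    return {
        "variables": {k: env[k] for k in sorted(names)},
        # Encoding used for text files when no encoding= is passed (assumption
        # stated above; newer Python versions document this slightly differently).
        "preferred_encoding": locale.getpreferredencoding(False),
    }


report = locale_report()
print(report)
```

If `preferred_encoding` comes back as something like ISO-8859-1 rather than UTF-8, that would be consistent with the error in the traceback.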

Thanks @ebolyen,
This is what I got when running your command:

LANG=en_US
LANGUAGE=en_IL:en_US:en_GB:en

Also, it just happened with another dataset on our server, which confirms that something’s off in the environment, not in the specific files.

Many many thanks for your help.

Thanks for the output @efratm!

Looks like it is indeed just a locale misconfiguration. If you run this command:

locale -a

You will get a list of all available locales. We want one whose prefix matches the language you want and that ends in some variant of utf8/UTF-8/UTF8. Once you’ve identified one, try setting this variable:

export LC_ALL="<the one you picked, no angle-brackets>"

That should clear up the issue you are having with QIIME 2. You may want to put the above command in something like a ~/.bashrc file so that it persists beyond a single session.
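
If the list from `locale -a` is long, the selection rule described above can be sketched as a small filter. Everything here is illustrative: `utf8_locales` is a hypothetical helper, the `en_` prefix is an assumed language choice, and the sample list is made up rather than taken from any real machine.

```python
import re


def utf8_locales(locale_names, language_prefix="en_"):
    """Return names that start with language_prefix and end in a UTF-8 codeset."""
    # Matches ".utf8", ".UTF-8", ".UTF8", and similar case variants.
    pattern = re.compile(r"\.utf-?8$", re.IGNORECASE)
    return [
        name
        for name in locale_names
        if name.startswith(language_prefix) and pattern.search(name)
    ]


# Hypothetical `locale -a` output; the real list depends on the machine.
sample = ["C", "C.UTF-8", "POSIX", "en_US", "en_US.iso88591", "en_US.utf8", "en_IL.UTF-8"]
print(utf8_locales(sample))  # ['en_US.utf8', 'en_IL.UTF-8']
```

Any one of the returned names would be a candidate value for `LC_ALL`.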

Thanks again @ebolyen!! That indeed solved the issue.

Hello,

I am having the same plugin error with the same version of QIIME 2. When I check my environment configuration, though, I get something ending in UTF-8 (output below):

LC_ALL=en_US
LANG=C.UTF-8

I was wondering if you had any advice?

The command that I tried to run:

qiime diversity core-metrics-phylogenetic \
  --i-phylogeny rooted-tree.qza \
  --i-table table.qza \
  --p-sampling-depth 1200 \
  --m-metadata-file arctic_manifest.txt \
  --output-dir core-metric-results

arctic_manifest.txt (63.7 KB)
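
One thing that may be worth checking in that output, offered as a guess rather than a confirmed diagnosis: under the POSIX rules, `LC_ALL` takes precedence over the individual `LC_*` categories and over `LANG`, so with `LC_ALL=en_US` set, the `C.UTF-8` in `LANG` would not be consulted. A name like `en_US` with no codeset suffix often resolves to a non-UTF-8 encoding such as ISO-8859-1 on glibc systems. The sketch below only models that precedence order; `effective_ctype` is a hypothetical helper, not part of QIIME 2 or the Python standard library.

```python
def effective_ctype(env):
    """Return (variable, value) for the first set, non-empty variable.

    Follows the POSIX precedence for the character-type category:
    LC_ALL, then LC_CTYPE, then LANG.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:  # empty strings are treated as unset
            return name, value
    return None, "C"  # default locale when nothing is set


print(effective_ctype({"LC_ALL": "en_US", "LANG": "C.UTF-8"}))  # ('LC_ALL', 'en_US')
```

If that is what is happening here, setting `LC_ALL` to a name ending in a UTF-8 variant, as suggested earlier in the thread, would be the thing to try.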