Better location information for "utf-8 can't decode codec..." error

Hi @nick-youngblut! Python raises this error when it attempts to read the TSV file. I agree it's not the most user-friendly message. It states the value of the byte that can't be decoded (0xc4) and where that byte sits in the file. I think this would be the 7725th byte in your file (the word "position" makes me think the number is 0-based rather than 1-based).
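If it helps to pin down exactly where the bad byte is, here is a minimal sketch in plain Python. It is not part of qiime2; the function name `locate_decode_error` and the sample bytes are made up for illustration. It relies on the `start` attribute of `UnicodeDecodeError`, which is a 0-based byte offset.

```python
def locate_decode_error(data: bytes, encoding: str = "utf-8"):
    """Return where the first undecodable byte sits, or None if data decodes cleanly.

    Hypothetical helper for illustration; not a qiime2 function.
    """
    try:
        data.decode(encoding)
    except UnicodeDecodeError as err:
        offset = err.start  # 0-based byte offset of the offending byte
        # Count newlines before the offset to get a 1-based line number.
        line = data.count(b"\n", 0, offset) + 1
        # Byte index where the current line begins (0 if there is no earlier newline).
        line_start = data.rfind(b"\n", 0, offset) + 1
        return {
            "offset": offset,
            "byte": hex(data[offset]),
            "line": line,
            "column": offset - line_start + 1,  # 1-based, counted in bytes
        }
    return None


# Made-up two-line TSV with a stray 0xc4 byte on the second line.
sample = b"id\tname\n" + b"s1\t\xc4bc\n"
result = locate_decode_error(sample)
print(result)
```

For a real file, you could pass in the contents read in binary mode, for example `open(path, "rb").read()`, and then look at the reported line in a text editor.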

You can check what encoding your file has by using one of the Unix tools described here. That may shed some light on the issue: it looks like your file is encoded with something other than UTF-8. Re-encoding it to UTF-8 may solve the issue.
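Once you know the actual encoding, re-encoding can be done in a few lines of Python. This is only a sketch: the function names are made up, and `latin-1` in the example is an assumption on my part, not something I know about your file. Decoding with the wrong source encoding will silently produce the wrong characters, so check the result by eye.

```python
from pathlib import Path


def reencode_to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Decode bytes using the given source encoding and re-encode them as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


def reencode_file(src: str, dst: str, source_encoding: str) -> None:
    """Write a UTF-8 copy of src to dst, leaving the original file untouched."""
    Path(dst).write_bytes(reencode_to_utf8(Path(src).read_bytes(), source_encoding))


# Assumption for illustration: the file is latin-1, where byte 0xc4 is "Ä".
converted = reencode_to_utf8(b"s1\t\xc4\n", "latin-1")
print(converted.decode("utf-8"))
```

Writing to a new file rather than overwriting the original makes it easy to start over if the guessed encoding turns out to be wrong.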

I'm working on overhauling Metadata in qiime2 for this month's release (2017.12), and part of the work will include better error messages for these types of situations. Would you mind sharing your metadata file with me so that I can make sure this situation is handled better? Feel free to send me a DM if you don't want to share the file publicly. Thanks!
