The different line endings are one of those little technical things that seem custom-made to confuse and frustrate non-technical users. They are literally invisible!
But at least the workaround is easy:
Could we make this easier on our users?
Would it be possible to transparently support flat text files with either DOS or POSIX newlines?
P.S. Where would I open an issue for this? Maybe
P.P.S. Everyone has the same problem with the em-dash…
I’m pretty certain that we’re using universal newline handling everywhere in QIIME 2 (it’s the default in Python 3, so we’d have to very specifically ask the computer not to transparently handle newlines), but it’s possible there’s an issue somewhere.
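For anyone curious what "universal newline handling" means in practice, here is a minimal sketch in plain Python 3. It is not QIIME 2 code; the helper name `read_text` and the file name `manifest.csv` are made up for illustration. It writes a file with DOS line endings and reads it back twice: once with the default `newline=None` (translation on) and once with `newline=""` (translation off).

```python
import os
import tempfile


def read_text(path, newline=None):
    """Read a text file; newline=None is Python 3's default (universal newlines)."""
    with open(path, "r", encoding="utf-8", newline=newline) as fh:
        return fh.read()


def demo_newlines():
    # Hypothetical file contents, written as raw bytes with DOS (\r\n) endings.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "manifest.csv")
        with open(path, "wb") as fh:
            fh.write(b"sample-id,direction\r\n50064,forward\r\n")
        translated = read_text(path)            # \r\n is translated to \n
        untouched = read_text(path, newline="")  # \r\n is kept as-is
    return translated, untouched


if __name__ == "__main__":
    translated, untouched = demo_newlines()
    print(repr(translated))
    print(repr(untouched))
```

So as long as a file is opened in text mode with the default settings, DOS and POSIX newlines should look the same to the code that reads it.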
Are we certain that it is newlines causing the issue?
I’m not sure what’s causing this issue.
@Nicholas_Bokulich tried to troubleshoot this too, but I’m not sure we made much progress.
I’m glad that these sorts of issues are being handled in QIIME 2! I’ll close this thread for now while we investigate this specific case.
Turns out it was a byte-order mark:
Just to provide some closure, I think I found the issue:
(via hexdump -C <filename>)
00000000 ef bb bf 73 61 6d 70 6c 65 2d 69 64 2c 61 62 73 |...sample-id,abs|
00000010 6f 6c 75 74 65 2d 66 69 6c 65 70 61 74 68 2c 64 |olute-filepath,d|
00000020 69 72 65 63 74 69 6f 6e 0d 0a 35 30 30 36 34 2c |irection..50064,|
The first column is the position (offset) in the file, the next two columns show the bytes, each as a two-digit hex number, and the last column shows the ASCII character for each byte (if it is a visible character; otherwise it is shown as a dot).
Here we see tha…
We definitely need to fix that! It comes up surprisingly often, although this is the first case of a UTF-8 BOM. Most of the BOMs are only 2 bytes and from UTF-16.
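To illustrate the effect of those three leading bytes (`ef bb bf`), here is a rough sketch in plain Python 3, assuming a file that starts with the same header as the hexdump above. This is not the QIIME 2 fix, just a demonstration; `read_header` and `demo_bom` are hypothetical helper names. Decoding as `utf-8` leaves the BOM glued to the first column name, while Python's built-in `utf-8-sig` codec drops it.

```python
import codecs
import csv
import os
import tempfile

# Same header bytes as in the hexdump: a UTF-8 BOM, the column names, then \r\n.
HEADER_WITH_BOM = codecs.BOM_UTF8 + b"sample-id,absolute-filepath,direction\r\n"


def read_header(path, encoding):
    """Return the first CSV row of the file, decoded with the given encoding."""
    # newline="" is what the csv module documentation recommends for file objects.
    with open(path, "r", encoding=encoding, newline="") as fh:
        return next(csv.reader(fh))


def demo_bom():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "manifest.csv")
        with open(path, "wb") as fh:
            fh.write(HEADER_WITH_BOM)
        with_bom = read_header(path, "utf-8")         # BOM stays in the first field
        without_bom = read_header(path, "utf-8-sig")  # BOM is stripped if present
    return with_bom, without_bom


if __name__ == "__main__":
    with_bom, without_bom = demo_bom()
    print(with_bom)
    print(without_bom)
```

With plain `utf-8`, the first column name comes back as `'\ufeffsample-id'`, which does not compare equal to `'sample-id'` even though the two look identical on screen. The `utf-8-sig` codec also reads files without a BOM normally, so it is a reasonable choice when the input may or may not have one.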