Problem importing FASTQ data into QIIME 2


This is the first time we have tried to import our data into QIIME 2, but creating the .qza has failed many times and we cannot find the reason.
The error message is:
An unexpected error has occurred:

'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

We have already created our “manifest”, as shown in the picture.

If you know how to fix this, please help! Thank you very much.
PS: My data is already demultiplexed and is paired-end.

Hey @Doc.chen!

Welcome to the forum!

Would you be able to upload your manifest file to the forum as an attachment?

I strongly suspect that your file is UTF-16-LE (instead of UTF-8), as 0xff is the first byte of the byte-order mark for UTF-16-LE. When you saved the file, what editor did you use, and which save option did you pick (a screenshot would be great)?
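
In case it helps to check this yourself, here is a minimal Python sketch that looks at the first few bytes of a file for a byte-order mark. The function names `sniff_bom` and `sniff_file` are made up for illustration and are not part of QIIME 2; a file without a BOM may still not be valid UTF-8, so treat the result as a hint only.

```python
import codecs

def sniff_bom(raw: bytes) -> str:
    """Guess an encoding from a leading byte-order mark, if there is one."""
    # UTF-32 is checked first because its little-endian BOM begins with
    # the same two bytes (FF FE) as the UTF-16-LE BOM.
    boms = [
        (codecs.BOM_UTF32_LE, "utf-32-le"),
        (codecs.BOM_UTF32_BE, "utf-32-be"),
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF16_LE, "utf-16-le"),
        (codecs.BOM_UTF16_BE, "utf-16-be"),
    ]
    for bom, name in boms:
        if raw.startswith(bom):
            return name
    return "no BOM found"

def sniff_file(path: str) -> str:
    """Read the first four bytes of a file and report the BOM, if any."""
    with open(path, "rb") as fh:
        return sniff_bom(fh.read(4))
```

For example, `sniff_file("se-33-manifest.txt")` would report what the attached manifest starts with, assuming the file sits in the current directory.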

Thanks for your help! I used Excel and saved the file as txt.
se-33-manifest.txt (836 Bytes)


Thanks @Doc.chen!

Just as I suspected, you are saving as UTF-16 Unicode (Little-endian). In that dropdown, is there an equivalent txt format for UTF-8 Unicode (.txt)?

(I’m guessing a little bit as to what the specific name is)

If there is a UTF-8 option, try saving it using that and see if that fixes the issue.
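
If the editor does not offer a UTF-8 option, the file could also be converted afterwards. This is only a sketch using Python's standard codecs; `utf16_to_utf8` and `convert_file` are hypothetical helper names, and the output filename in the usage note is just an example.

```python
from pathlib import Path

def utf16_to_utf8(raw: bytes) -> bytes:
    """Re-encode UTF-16 bytes (with a BOM) as UTF-8 without a BOM."""
    # The "utf-16" codec reads the BOM to pick the byte order and drops it.
    text = raw.decode("utf-16")
    # Use plain "\n" line endings in the converted text.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.encode("utf-8")

def convert_file(src: str, dst: str) -> None:
    """Convert a UTF-16 text file on disk to a UTF-8 copy."""
    Path(dst).write_bytes(utf16_to_utf8(Path(src).read_bytes()))
```

Something like `convert_file("se-33-manifest.txt", "se-33-manifest-utf8.txt")` would then write a UTF-8 copy and leave the original untouched.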


Thank you very much! After following your guidance, I imported my data successfully. But I found that the problem was not only UTF-8 but also other things; for example, the manifest cannot use $PWD on macOS and some other systems. If you follow the instructions on the official QIIME 2 website exactly, you will get errors. Again, thank you very much for your help!

PS: I have no idea how other people work, but when I have a lot of data to analyze, does that mean I need to spend a lot of time writing my “manifest”? In this study I have 10 samples, and the error only shows up after the import has run; with 200 samples it would take a long time and the errors would be hard to correct.

Hey @Doc.chen,

As always, the error message (and log) is helpful. The $PWD is kind of a shortcut and is calculated from your current position in your terminal (if you aren’t sure what that is, that’s ok, you can type pwd to find out!). This should work fine on OS X, but it’s certainly finicky for other reasons.
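
To make the shortcut concrete, here is a rough Python sketch of what $PWD stands for: the directory you are currently in gets substituted for it. `expand_pwd` is a made-up helper for illustration only, and the example path is hypothetical.

```python
import os
from typing import Optional

def expand_pwd(path: str, cwd: Optional[str] = None) -> str:
    """Replace a leading $PWD with the current working directory."""
    # cwd can be passed explicitly; otherwise ask the operating system.
    base = os.getcwd() if cwd is None else cwd
    if path.startswith("$PWD"):
        return base + path[len("$PWD"):]
    # Paths that do not start with $PWD are returned unchanged.
    return path
```

So if the terminal is in /Users/me/run1, `$PWD/reads/s1_R1.fastq.gz` means /Users/me/run1/reads/s1_R1.fastq.gz, and the manifest only points at real files when the command is run from that same directory.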

I tend to use this format personally, because it doesn’t take any effort. HOWEVER, you must have your files named a certain way (as described there). This is usually the default output of Illumina sequencers, but every facility has its own scheme, so this one may not be yours.
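
If your files don’t follow that naming, a manifest for many samples can also be generated rather than typed by hand. The sketch below assumes a hypothetical scheme of `<sample-id>_R1.fastq.gz` and `<sample-id>_R2.fastq.gz` and writes the comma-separated layout with the header `sample-id,absolute-filepath,direction`; newer QIIME 2 releases use a tab-separated manifest with different column names, so adjust the header and separator to your version. The function names are made up for illustration.

```python
import re
from pathlib import Path

# Assumed (hypothetical) naming scheme: <sample-id>_R1.fastq.gz / _R2.fastq.gz
READ_PATTERN = re.compile(r"^(?P<sample>.+)_R(?P<read>[12])\.fastq\.gz$")

def manifest_lines(filenames, directory):
    """Build manifest lines for the files that match the assumed scheme."""
    lines = ["sample-id,absolute-filepath,direction"]
    for name in sorted(filenames):
        match = READ_PATTERN.match(name)
        if match is None:
            continue  # skip anything that does not fit the assumed scheme
        direction = "forward" if match["read"] == "1" else "reverse"
        lines.append(f"{match['sample']},{directory}/{name},{direction}")
    return lines

def write_manifest(directory, out_path):
    """Scan a directory and write a UTF-8 manifest with absolute paths."""
    directory = Path(directory).resolve()
    names = [p.name for p in directory.iterdir() if p.is_file()]
    text = "\n".join(manifest_lines(names, directory.as_posix())) + "\n"
    # Encoding explicitly as UTF-8 avoids the UTF-16 problem from earlier.
    Path(out_path).write_bytes(text.encode("utf-8"))
```

Because the paths are written out in full, this does not rely on $PWD, and with 200 samples a naming mistake shows up as a missing line in the manifest before the import is run.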

Yes! Thank you very much!
