I am new to QIIME and am running it in VirtualBox on Windows 10 with Windows Subsystem for Linux. I am currently going through the “Moving Pictures” tutorial. When I import the data with the tutorial's import command, I get this error:
There was a problem importing emp-single-end-sequences:
emp-single-end-sequences/sequences.fastq.gz is not a(n) FastqGzFormat file:
The typographical error in the fastq.gz sequence file also changes every time I close and reopen the terminal.
Lowercase case sequence on line 22334
Quality score length doesn’t match sequence length for record beginning in line 58001
Quality score length doesn’t match sequence length for record beginning in line 14001
Quality score length doesn’t match sequence length for record beginning in line 45469
I am wondering whether this problem is caused by a corrupted emp-single-end-sequences/sequences.fastq.gz file from the “Moving Pictures” tutorial, or by something else.
Error messages are almost always right, so I’m inclined to trust this:
It seems to be telling you that your FASTQ is corrupted. Every record needs exactly one quality score per nucleotide; if the sequence line and the quality line of any record differ in length, it’s not valid FASTQ data.
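If you want to see the mismatch for yourself, here is a minimal Python sketch that walks a gzipped FASTQ four lines at a time and reports records whose quality string is not the same length as the sequence. This is my own illustration, not QIIME 2's validator: the function name `find_length_mismatches` is made up, and it assumes plain four-line records with no line wrapping.

```python
import gzip


def find_length_mismatches(path, limit=10):
    """Return (start_line, seq_len, qual_len) for records whose lengths differ.

    Hypothetical helper: assumes each record is exactly four lines
    (header, sequence, '+' separator, quality). Stops after `limit` hits.
    """
    problems = []
    with gzip.open(path, "rt") as handle:
        start_line = 1  # 1-based line number where the current record begins
        while True:
            header = handle.readline()
            if not header:
                break  # reached end of file
            sequence = handle.readline().rstrip("\n")
            handle.readline()  # skip the '+' separator line
            quality = handle.readline().rstrip("\n")
            if len(sequence) != len(quality):
                problems.append((start_line, len(sequence), len(quality)))
                if len(problems) >= limit:
                    break
            start_line += 4
    return problems


# Example call (path taken from the error message above):
# find_length_mismatches("emp-single-end-sequences/sequences.fastq.gz")
```

An empty list would suggest the record lengths are consistent, at least under the four-line assumption. If the list changes from one run to the next on the same file, the problem is probably not in the FASTQ content itself.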
There are myriad ways that could have happened (bad connection, file opened with a word processor, etc.). Try re-downloading those files from the terminal, while you’re in your QIIME 2 environment, and then re-run the tutorial command.
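Because the reported line number changes between runs, it may also be worth checking whether the bytes on disk are stable and whether the gzip stream itself decompresses cleanly. A rough sketch using only the Python standard library; `file_md5` and `gzip_is_intact` are names I made up for illustration.

```python
import gzip
import hashlib
import zlib


def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of the file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def gzip_is_intact(path, chunk_size=1 << 20):
    """Return True if the whole gzip stream decompresses without error."""
    try:
        with gzip.open(path, "rb") as handle:
            while handle.read(chunk_size):
                pass  # discard the data; we only care whether reading succeeds
    except (OSError, EOFError, zlib.error):
        # Bad header or CRC, truncated stream, or corrupt compressed data
        return False
    return True
```

Run both on the file before and after re-downloading. If `file_md5` gives a different digest each time you reopen the terminal without touching the file, that would point at the storage layer (for example a shared or mounted folder) rather than the download.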
If that doesn’t help, get back to me with clarifications on the following, and we’ll see what we can figure out together.
Questions/Notes
You mention using both WSL and a VirtualBox image. Is the QIIME 2 instance you’re using a “native” installation in WSL? Is it in the VirtualBox image you’re running? Are you running QIIME 2 within the VirtualBox within WSL?
Is this line part of your error message, or something you wrote? I assume the latter, but it’s a little unclear.
If you wrap code blocks in triple-backticks (```), they’ll show up separately from your normal text - I edited your post, just in case you want to see an example.