Fastq.gz and quality score length do not match using type: EMPPairedEndSequences from MiSeq

I have three fastq.gz files from our MiSeq.
They are in their own directory called EXP1_ToProcess (I’m running QIIME 2 version 2019.7 in conda on a Mac), and I ran the command:

qiime tools import --type EMPPairedEndSequences --input-path EXP1_ToProcess/ --output-path EXP1_paired_end_seqs

This has worked in the past, but now I get the error message:

There was a problem importing EXP1_ToProcess/:

EXP1_ToProcess/forward.fastq.gz is not a(n) FastqGzFormat file:

Quality score length doesn’t match sequence length for record beginning on line 43126365

From other threads, I gather that this is pretty far down in the file and might be hard to troubleshoot, but I was hoping someone had an idea of what’s going on! We have processed these exact files in QIIME before, but would like to process them in QIIME 2 now.
Also, I tried running the code with --verbose, but it just gives the error “no such option: --verbose”

Thank you!

Hi @chelsea.brisson.423!

If I had to guess, the file forward.fastq.gz wasn’t completely transferred when moving it to the Mac that you’re running QIIME 2 on. This can happen: network errors sometimes make a file look like it transferred completely when in reality part of it is missing. I think that is the case here because of the specific error message. As you pointed out, the error is down near the bottom of the file (which is the last part of the file to be transferred). The error message you quoted is also complaining that the quality scores in a record don’t match the length of the sequence for the same record, which is what a record that was cut off partway through looks like.


That can happen if the network connection died or hit an issue mid-transfer, leaving a partially transferred file whose last record is incomplete.
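If you want to look at the suspect record yourself, here is a rough Python sketch that walks a gzipped FASTQ file and reports the first record where the quality string and the sequence differ in length, or where the gzip stream ends early. The function name `first_bad_record` is made up for this example, and it assumes plain four-line records (no wrapped sequence lines). It is not what QIIME 2 runs internally, just a quick independent check.

```python
import gzip


def first_bad_record(path):
    """Return (line_number, reason) for the first malformed record, or None.

    line_number is the 1-based line on which the record begins.
    Assumes each FASTQ record is exactly four lines long.
    """
    line_number = 1
    try:
        with gzip.open(path, "rt") as handle:
            while True:
                header = handle.readline()
                if not header:
                    return None  # reached the end without finding a problem
                seq = handle.readline().rstrip("\n")
                plus = handle.readline()
                qual = handle.readline().rstrip("\n")
                if not header.startswith("@") or not plus.startswith("+"):
                    return line_number, "unexpected header or separator line"
                if len(seq) != len(qual):
                    reason = "quality length %d != sequence length %d" % (
                        len(qual), len(seq))
                    return line_number, reason
                line_number += 4
    except EOFError:
        # Python's gzip module raises EOFError when the compressed stream
        # stops before its end-of-stream marker, i.e. the file is truncated.
        return line_number, "gzip stream ended early near this line"
```

Calling `first_bad_record("EXP1_ToProcess/forward.fastq.gz")` on a complete file should return `None`. If it returns a line number close to the last line of the file, that points toward a truncated transfer rather than a problem with the run itself.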

Double check that you have the complete files (comparing md5 checksums against the original copies can help with that, or just comparing the file sizes).
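In case it helps, here is a small Python sketch of that check. It prints the size and MD5 digest of each file you pass it, so you can run it once on the machine holding the original files and once on the Mac, then compare the output. The script name and the helper names (`md5_of_file`, `describe`) are just placeholders for this example.

```python
import hashlib
import os
import sys


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def describe(path):
    """Return (size_in_bytes, md5_hex) for one file."""
    return os.path.getsize(path), md5_of_file(path)


if __name__ == "__main__":
    # Example (hypothetical script name):
    #   python check_files.py EXP1_ToProcess/*.fastq.gz
    for path in sys.argv[1:]:
        size, checksum = describe(path)
        print("%s\t%d bytes\t%s" % (path, size, checksum))
```

If the size or the digest differs between the two copies, the copy on the Mac is not identical to the original and is worth transferring again.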

Keep us posted! :qiime2: