Import not recognising fastq.gz files

Hello,

I’m currently trying to import some demultiplexed single end reads using qiime tools import. However, I’m getting the following error:

There was a problem importing Chpt_4_R1_fastq:

Chpt_4_R1_fastq/SID2R19-141220_S193_L001_R1_001.fastq.gz is not a(n) FastqGzFormat file:

File is uncompressed

Since the file is in fastq.gz format, I’m at a loss as to what the import is having an issue with. I’m using qiime2-2020.11 with conda in the Mac terminal.
Below is the command I ran:

qiime tools import --type 'SampleData[SequencesWithQuality]' --input-path Chpt_4_R1_fastq --input-format CasavaOneEightSingleLanePerSampleDirFmt --output-path demux-single-end.qza

Thank you in advance!

Hello! Welcome to the forum! :wave:

Are you sure? The filepath in your error message says Chpt_4_R1_fastq (which is different from Chpt_4_R1_fastq.gz), and the error message also says:

File is uncompressed

The “.gz” part of the filename usually indicates that the file has been compressed.

If you don’t have compressed fastq.gz files, no worries, you can use a manifest format to import:

https://docs.qiime2.org/2020.11/tutorials/importing/#fastq-manifest-formats

This import mechanism allows you to import uncompressed files, which will then be compressed as necessary during the import step. That behavior isn’t on by default because it can be a bit slow (turns out zipping files is a lot of work for the computer), but hopefully that can get you moving in the right direction.

:qiime2:

Hello,

Thank you for responding!

The input path I have refers to a folder containing all of my compressed fastq files, which end in .fastq.gz (~70 of them). When I test with one or two in a test folder, the command runs without issue. It’s only when all 70 of them are in the folder that it fails. I’ve double-checked all the files: they are in a compressed format, and my computer lists them as compressed in the file info. Maybe I should try decompressing them and then creating a manifest.

Thanks again!

Oops, thanks for pointing that out! Well, here’s the thing though:

Just putting .gz on the end of the filename doesn’t actually mean a file is compressed, unfortunately.

I would recommend against decompressing them and creating a manifest, for now. Let’s figure out what is going on first! This is a little bit of a red flag, and you probably want to make sure everything is in order before moving forward (at least I know I would if I were in your shoes!).

Maybe only one or two files are uncompressed. You can start by checking SID2R19-141220_S193_L001_R1_001.fastq.gz. If you’re on a Mac, in your QIIME 2 env you can run the following (and share the results here):

file Chpt_4_R1_fastq/SID2R19-141220_S193_L001_R1_001.fastq.gz
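With ~70 files it may be quicker to test the whole folder in one go rather than one file at a time. Here is a rough sketch that uses gzip -t (gzip’s built-in integrity test) instead of file; the function name find_bad_gz is just something made up for this example, and it assumes every file of interest ends in .fastq.gz:

```shell
# Hypothetical helper (not part of QIIME 2): print every *.fastq.gz file in a
# directory that gzip cannot read, i.e. files that are uncompressed despite
# the .gz suffix, or that are truncated/corrupted.
find_bad_gz() {
    gz_dir=$1
    for gz_file in "$gz_dir"/*.fastq.gz; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$gz_file" ] || continue
        # gzip -t exits non-zero if the file is not valid gzip data.
        if ! gzip -t "$gz_file" 2>/dev/null; then
            printf '%s\n' "$gz_file"
        fi
    done
}

# Example usage:
#   find_bad_gz Chpt_4_R1_fastq
```

Any path it prints is worth a closer look with the file command above. If it prints nothing, all of the files at least pass gzip’s own check.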

If you do find that some are compressed and others aren’t, then you should totally use the manifest method. Don’t worry about compressing or uncompressing: just make a manifest with all of your samples and the paths to the files, and QIIME 2 will figure out the rest for you.
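In case it helps, here is a minimal sketch of how a single-end manifest could be generated from a folder. It assumes the tab-separated layout described in the importing tutorial linked above (a sample-id column and an absolute-filepath column), and it assumes the sample ID is everything before the first underscore in the filename, which may not hold for your data. The function name make_manifest is made up for this example:

```shell
# Hypothetical helper (not part of QIIME 2): write a single-end manifest
# to stdout for every fastq or fastq.gz file in a directory.
make_manifest() {
    fq_dir=$1
    # Header row; columns are tab-separated.
    printf 'sample-id\tabsolute-filepath\n'
    for fq_file in "$fq_dir"/*.fastq "$fq_dir"/*.fastq.gz; do
        # Skip unexpanded patterns when nothing matches.
        [ -e "$fq_file" ] || continue
        fq_name=$(basename "$fq_file")
        # Assumption: sample ID is the text before the first underscore.
        sample_id=${fq_name%%_*}
        # The manifest needs absolute paths.
        abs_dir=$(cd "$(dirname "$fq_file")" && pwd)
        printf '%s\t%s/%s\n' "$sample_id" "$abs_dir" "$fq_name"
    done
}

# Example usage (Phred 33 quality scores are an assumption here):
#   make_manifest Chpt_4_R1_fastq > manifest.tsv
#   qiime tools import --type 'SampleData[SequencesWithQuality]' \
#     --input-path manifest.tsv \
#     --input-format SingleEndFastqManifestPhred33V2 \
#     --output-path demux-single-end.qza
```

It is worth opening manifest.tsv before importing to confirm the sample IDs are unique and look the way you expect.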


Turns out some files were corrupted during an SFTP transfer from a server, so there was nothing to zip :stuck_out_tongue: After I fixed that, all was good! Thank you for the help, and sorry for the late reply. I’m in full thesis mode at the moment and time is flying!

Best!
