Problem importing fastq.gz files to create seqs.qza file

Hello All,

I am very new to QIIME 2. I used the Linux command line to analyze 16S data with QIIME 1 about 2 years ago, and now have new data to analyze with QIIME 2.

I received fastq.gz files from a collaborator, which were already demultiplexed and quality filtered, and placed them into a folder called “UCLA”. I am trying to create a seqs.qza file containing my data, which seems to be required for clustering into an OTU table in QIIME 2…

I kept getting the following error message when I ran this script:

qiime tools import \
  --input-path /work/tayb_worm/UCLA \
  --output-path /work/tayb_worm/seqs.qza \
  --type 'SampleData[Sequences]'

Error message: There was a problem importing /work/tayb_worm/UCLA:

Unrecognized file
(/work/tayb_worm/UCLA/UCLA-0A4_247_L001_R1_001.fastq.gz) for
QIIME1DemuxDirFmt.

When I got this message, I thought it might be because not all of the fastq.gz files were named like this:
UCLA-XXX_XXX_L00X_RX_00X.fastq.gz

Some had varying digits, such as
UCLA-XX_XXX_L00X_RX_00X.fastq.gz
or
UCLA-XXX_XX_L00X_RX_00X.fastq.gz

So I renamed all of the files to match “UCLA-XXX_XXX_L00X_RX_00X.fastq.gz” and placed them into a folder called “UCLA2”. But I am still getting the same error message back.

There was a problem importing /work/tayb_worm/UCLA2:

Unrecognized file
(/work/tayb_worm/UCLA2/UCLA-0A4_247_L001_R1_001.fastq.gz) for
QIIME1DemuxDirFmt.

I’m not sure what I should do next…

Thanks!

I also tried renaming the files with “UCLA_” instead of “UCLA-”, and still got the same error message.

Hey,
Assuming you are using QIIME 2 2018.8, you have to import as described in the Importing data tutorial.
Try the Casava format:

qiime tools import \
  --type 'SampleData[SequencesWithQuality]' \
  --input-path casava-18-single-end-demultiplexed \
  --input-format CasavaOneEightSingleLanePerSampleDirFmt \
  --output-path demux-single-end.qza

Should work :slightly_smiling_face:
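Before importing this way, it can help to confirm that every file in the folder is named in the Casava 1.8 style, i.e. sample, barcode, lane, read direction, and set number separated by underscores. Here is a minimal sketch, assuming bash. `non_casava_names` is a hypothetical helper, and the regex is my approximation of the naming rule, not QIIME 2's own check.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the names of files in a directory that do NOT
# look like <sample>_<barcode>_L<lane>_R<1|2>_001.fastq.gz.
# Assumption: the regex below approximates the Casava 1.8 naming rule.
# Hidden files (names starting with ".") are reported as well.
non_casava_names() {
  local dir="$1" f name
  for f in "$dir"/* "$dir"/.[!.]*; do
    [ -f "$f" ] || continue            # skip unmatched globs and subfolders
    name=$(basename "$f")
    if ! printf '%s\n' "$name" |
        grep -Eq '^[^.].*_.+_L[0-9]{3}_R[12]_001\.fastq\.gz$'; then
      printf '%s\n' "$name"
    fi
  done
}

# Example call (path taken from this thread):
# non_casava_names /work/tayb_worm/UCLA2
```

If the function prints nothing, every file name fits the pattern. Anything it does print is worth renaming or moving out of the folder before running the import.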

Hi @tayb,
Welcome to QIIME 2!
There are lots of new things in QIIME 2, and I think you would greatly benefit from going over one of the tutorials listed here. I would start with the “Moving Pictures” tutorial, though as @Yos.Dos describes (thanks @Yos.Dos!), your initial importing step will be different; the link he has provided should help you get your files into QIIME 2 easily.
The Casava format might work for you if ALL of your sample file names follow a scheme like L2S357_15_L001_R1_001.fastq.gz. If you’re unsure about this, you can just use the manifest importing option, which has no naming format requirements.
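For the manifest route, here is a rough sketch of how a manifest could be generated from a folder of paired-end files, assuming bash. It writes the comma-separated layout used by the 2018.8 importing tutorial (`sample-id,absolute-filepath,direction`). `make_manifest` is a hypothetical helper. It assumes that the sample ID is everything before the first underscore and that `_R1_` / `_R2_` mark forward and reverse reads.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a CSV manifest for the *.fastq.gz files in a
# directory. Assumptions: sample ID = text before the first "_" in the file
# name; "_R1_" means forward reads and "_R2_" means reverse reads.
make_manifest() {
  local dir f name sample direction
  dir=$(cd "$1" && pwd) || return 1     # resolve to an absolute path
  echo 'sample-id,absolute-filepath,direction'
  for f in "$dir"/*.fastq.gz; do        # this glob skips hidden "._" files
    [ -f "$f" ] || continue
    name=$(basename "$f")
    sample=${name%%_*}
    case "$name" in
      *_R1_*) direction=forward ;;
      *_R2_*) direction=reverse ;;
      *) continue ;;                    # direction unknown, leave it out
    esac
    printf '%s,%s,%s\n' "$sample" "$f" "$direction"
  done
}

# Example (paths from this thread; Phred33 is an assumption about the data):
# make_manifest /work/tayb_worm/UCLA2 > /work/tayb_worm/manifest.csv
# qiime tools import \
#   --type 'SampleData[PairedEndSequencesWithQuality]' \
#   --input-path /work/tayb_worm/manifest.csv \
#   --input-format PairedEndFastqManifestPhred33 \
#   --output-path /work/tayb_worm/paired-end-demux.qza
```

It is worth opening the resulting file and checking the sample IDs by eye before importing, since the ID rule above is only a guess based on the file names shown in this thread.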

Finally, I noticed you said that the files you received were already quality filtered. Depending on what kind of quality filtering was performed, this may or may not be an issue if you are planning on using the DADA2 denoising method in QIIME 2, which takes the place of OTU picking. This is because DADA2 does its own filtering and in fact expects unfiltered sequences. If you have access to the demultiplexed but not quality-filtered files, that is your best option. If not, could you look into exactly what kind of quality filtering was carried out, so we can see whether it will be an issue?
Good luck and let us know how it goes.

I’ve just realized that the cluster I am using at my institution has qiime2-2017.9 installed, not qiime2-2018.8. Maybe this is why I was having some issues.

I have asked them to install the latest QIIME 2…

The virtual box option is not feasible for me at the moment, unless I free up space on my computer (I’ve had it for several years so it has acquired many files).


My collaborator said that the reads are paired-end and demultiplexed, so I used the Casava 1.8 paired-end demultiplexed fastq format, with this script:

qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path /work/tayb_worm/UCLA2 \
  --input-format CasavaOneEightSingleLanePerSampleDirFmt \
  --output-path /work/tayb_worm/demux-paired-end.qza

this is the error message that is returned:

Error: no such option: --input-format

But maybe that is because the older version of QIIME 2 is installed?
Regardless, I will ask for more specifics on how the quality filtering was carried out…
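On the `no such option: --input-format` error: to my knowledge, releases before 2018.8 spelled this option `--source-format`, which would be consistent with the older install. Here is a small sketch, assuming bash and an activated QIIME 2 environment, that asks the installed `qiime tools import --help` which spelling it accepts. `import_format_flag` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which format option the installed
# `qiime tools import` understands, based on its own --help text.
# Assumption: the option is spelled either --input-format or --source-format.
import_format_flag() {
  local help_text
  help_text=$(qiime tools import --help 2>&1)
  if printf '%s\n' "$help_text" | grep -q -- '--input-format'; then
    echo '--input-format'
  elif printf '%s\n' "$help_text" | grep -q -- '--source-format'; then
    echo '--source-format'
  else
    echo 'unknown'
  fi
}

# Example:
# qiime --version        # shows which release is actually on the PATH
# import_format_flag
```

Checking the help text avoids guessing from the version number alone, which is handy on a shared cluster where several installs may exist.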

Hi @tayb, I agree with all of @Mehrbod_Estaki’s comments above. I’ll just jump in to confirm that you will want to have your system administrator update to the latest version of QIIME 2. A lot has changed in the year since the 2017.9 release came out, including tons of cool new functionality that you’re likely to want to have access to.

You can point your system administrator at the installation page here - there is a note about upgrading at the bottom of that page. If your system administrator runs into any issues, we’d be happy to help them on the forum.

Hi!

Okay, so qiime2-2018.8 has finally been installed on the cluster.

I ran this script:

qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path /work/tayb_worm/UCLA2 \
  --input-format CasavaOneEightSingleLanePerSampleDirFmt \
  --output-path /work/tayb_worm/UCLA2_demux-paired-end.qza

and got the error message:

There was a problem importing /work/tayb_worm/UCLA2:

/work/tayb_worm/UCLA2/._UCLA-0A1_089_L001_R1_001.fastq.gz is not a(n) FastqGzFormat file:

File is uncompressed

So, I noticed on this QIIME 2 thread that @ebolyen mentioned:

In addition to the command, I noticed something interesting about your filename: it starts with a dot which means it is “hidden” by default. It’s probably a leftover from a text editor. Would you be able to run ls -alh in your directory? If I’m right, you’ll see both ._SB-01 and SB-01 in there.

I ran the ls -alh command, and indeed I do see files matching ._UCLA*

I tried removing them using rm -rf ._UCLA-*
but it deleted all of my files. So I re-downloaded them into my folder and am unsure of what to do next…

Please disregard the last post!

I just used this script:

rm -r ._UCLA-*

and deleted each file by pressing “y” when it asked me if I wanted to remove it…

then I got the message: Imported /work/tayb_worm/UCLA2 as CasavaOneEightSingleLanePerSampleDirFmt to /work/tayb_worm/UCLA2_demux-paired-end.qza

Success!!! :smiley:
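For anyone who hits the same "File is uncompressed" error: here is a slightly more cautious way to do this cleanup, sketched for bash. It lists the `._` files first and deletes them only after you have looked at the list. Files like these are often AppleDouble metadata files written by macOS when data is copied to another filesystem, though that is an assumption about this particular case. `list_dot_underscore` and `remove_dot_underscore` are hypothetical helper names, and `-maxdepth` / `-delete` assume GNU or BSD `find`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for cleaning up "._" files in one directory.
# Assumption: GNU or BSD find (for -maxdepth and -delete).

# Print regular files directly inside the directory whose names start
# with "._"; nothing is modified.
list_dot_underscore() {
  find "$1" -maxdepth 1 -type f -name '._*' -print
}

# Delete those same files. The quoted pattern is handled by find itself,
# so the shell never expands it against other files in the folder.
remove_dot_underscore() {
  find "$1" -maxdepth 1 -type f -name '._*' -delete
}

# Example (path from this thread):
# list_dot_underscore /work/tayb_worm/UCLA2     # review this list first
# remove_dot_underscore /work/tayb_worm/UCLA2
```

Listing before deleting makes it much harder to remove the real fastq.gz files by accident.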
