Problem with importing paired end demultiplexed fastq.gz

Hi, I am new to QIIME 2 and I am trying to import my own paired-end demultiplexed data. I went through the tutorial and everything worked. However, when I follow the same steps with my own data, I keep running into the same error:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/qiime2/miniconda/envs/qiime2-2017.12/lib/python3.5/site-packages/q2cli/tools.py", line 116, in import_data
    view_type=source_format)
  File "/home/qiime2/miniconda/envs/qiime2-2017.12/lib/python3.5/site-packages/qiime2/sdk/result.py", line 180, in import_data
    view_type = qiime2.sdk.parse_format(view_type)
  File "/home/qiime2/miniconda/envs/qiime2-2017.12/lib/python3.5/site-packages/qiime2/sdk/util.py", line 93, in parse_format
    raise TypeError("No format: %s" % format_str)
TypeError: No format: ZooPooSingleLanePerSampleDirFmt

An unexpected error has occurred:

No format: ZooPooSingleLanePerSampleDirFmt

Hi @manpflo615!

Could you post your entire command? I think you may be mixing up what some of the arguments mean.

From your error I see “No format: ZooPooSingleLanePerSampleDirFmt”, which makes sense: QIIME 2 only recognizes a specific set of named formats, each describing one way the data may be represented. Since bioinformatics has so many different formats, we always start by having you tell QIIME 2 what your data looks like.

Let me know if that makes sense!

Here’s the command:

(qiime2-2017.12) [email protected]:~/zoo-samples$ qiime tools import \
> --type 'SampleData[SequencesWithQuality]' \
> --input-path zoo-poo \
> --source-format ZooPooSingleLanePerSampleDirFmt \
> --output-path zoo-poo.qza
Traceback (most recent call last):
  File "/home/qiime2/miniconda/envs/qiime2-2017.12/lib/python3.5/site-packages/qiime2/sdk/util.py", line 91, in parse_format
    format_record = pm.formats[format_str]
KeyError: 'ZooPooSingleLanePerSampleDirFmt'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/qiime2/miniconda/envs/qiime2-2017.12/lib/python3.5/site-packages/q2cli/tools.py", line 116, in import_data
    view_type=source_format)
  File "/home/qiime2/miniconda/envs/qiime2-2017.12/lib/python3.5/site-packages/qiime2/sdk/result.py", line 180, in import_data
    view_type = qiime2.sdk.parse_format(view_type)
  File "/home/qiime2/miniconda/envs/qiime2-2017.12/lib/python3.5/site-packages/qiime2/sdk/util.py", line 93, in parse_format
    raise TypeError("No format: %s" % format_str)
TypeError: No format: ZooPooSingleLanePerSampleDirFmt

An unexpected error has occurred:

  No format: ZooPooSingleLanePerSampleDirFmt

See above for debug info.

Also, could you clarify what you meant by telling QIIME 2 what the data looks like?

Hi @manpflo615!

Of course. QIIME 2 uses the format to keep track of the kind of data it’s holding onto. For example, data might be a newick file, a fastq file, a bunch of fastq files with some naming convention, etc.

We use that format name to dynamically figure out what code to invoke. That code is “in charge” of reading the data and sometimes turning it into other formats. It’s basically a technical detail which lets us convert data on the fly between methods (super handy for bioinformatics).

But in order to do that, we need a starting point, which is what the import command is for. It gives QIIME 2 something to start with, and it can work out the rest of the details from there.

In your particular case, you are giving QIIME 2 the format ZooPooSingleLanePerSampleDirFmt. It then goes searching for some code to understand this format, but it can’t find any.

I think what you want is CasavaOneEightSingleLanePerSampleDirFmt which is a particular naming scheme for fastq files produced by Illumina instruments (more specifically the Casava 1.8 software package by Illumina).

If that doesn’t sound right, could you provide an ls of the zoo-poo directory? I should be able to figure out your best import strategy from that.
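In case it helps while you gather that listing, here is a rough sketch of how you could check the file names yourself. It assumes the Casava 1.8 scheme looks like `<sample>_<barcode-id>_L<lane>_R<1|2>_001.fastq.gz`; the regular expression is my approximation of that pattern, and `is_casava_name` and `check_casava_dir` are hypothetical helper names, not part of QIIME 2.

```shell
# Hypothetical helper: succeeds if a file name has the assumed shape
# <sample>_<barcode-id>_L<lane>_R<1|2>_001.fastq.gz
is_casava_name() {
    printf '%s\n' "$1" | grep -Eq '^.+_.+_L[0-9]{3}_R[12]_001\.fastq\.gz$'
}

# Print every file in a directory whose name does not match the pattern,
# and return non-zero if any such file was found.
check_casava_dir() {
    status=0
    for f in "$1"/*; do
        [ -e "$f" ] || continue
        name=$(basename "$f")
        if ! is_casava_name "$name"; then
            echo "does not match: $name"
            status=1
        fi
    done
    return "$status"
}

# Example call (directory name taken from this thread):
# check_casava_dir zoo-poo
```

This only looks at the names. It does not check that every R1 file has a matching R2 file, or that the files are valid fastq.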

Thanks!

Ahhhhh. Now I understand the point of importing. As far as the format goes and what you mentioned, I was working under the notion that CasavaOneEightSingleLanePerSampleDirFmt was a naming scheme that I could modify to better suit the naming of my data. I have an ls of the zoo-poo below:

(qiime2-2017.12) [email protected]:~/zoo-samples$ cd zoo-poo
(qiime2-2017.12) [email protected]:~/zoo-samples/zoo-poo$ ls
RID230Blank_S121_L001_R1_001.fastq.gz
RID230Blank_S121_L001_R2_001.fastq.gz
VanLaarA07_S116_L001_R1_001.fastq.gz
VanLaarA07_S116_L001_R2_001.fastq.gz
VanLaarB01_S95_L001_R1_001.fastq.gz
VanLaarB01_S95_L001_R2_001.fastq.gz
VanLaarC02_S100_L001_R1_001.fastq.gz
VanLaarC02_S100_L001_R2_001.fastq.gz
VanLaarC03_S104_L001_R1_001.fastq.gz
VanLaarC03_S104_L001_R2_001.fastq.gz
VanLaarC07_S117_L001_R1_001.fastq.gz
VanLaarC07_S117_L001_R2_001.fastq.gz
VanLaarC09_S120_L001_R1_001.fastq.gz
VanLaarC09_S120_L001_R2_001.fastq.gz
VanLaarE01_S97_L001_R1_001.fastq.gz
VanLaarE01_S97_L001_R2_001.fastq.gz
VanLaarE02_S101_L001_R1_001.fastq.gz
VanLaarE02_S101_L001_R2_001.fastq.gz
VanLaarE04_S107_L001_R1_001.fastq.gz
VanLaarE04_S107_L001_R2_001.fastq.gz
VanLaarE06_S114_L001_R1_001.fastq.gz
VanLaarG01_S98_L001_R1_001.fastq.gz
VanLaarG01_S98_L001_R2_001.fastq.gz
VanLaarG02_S102_L001_R1_001.fastq.gz
VanLaarG02_S102_L001_R2_001.fastq.gz
VanLaarG03_S105_L001_R1_001.fastq.gz
VanLaarG03_S105_L001_R2_001.fastq.gz
VanLaarG04_S108_L001_R1_001.fastq.gz
VanLaarG04_S108_L001_R2_001.fastq.gz
VanLaarG05_S111_L001_R1_001.fastq.gz
VanLaarG05_S111_L001_R2_001.fastq.gz
VanLaarG06_S115_L001_R1_001.fastq.gz
VanLaarG06_S115_L001_R2_001.fastq.gz
VanLaarG08_S118_L001_R1_001.fastq.gz
VanLaarG08_S118_L001_R2_001.fastq.gz
Zymov4SDSBd12_S12_L001_R1_001.fastq.gz
Zymov4SDSBd12_S12_L001_R2_001.fastq.gz

Hi @manpflo615! It looks like your paired-end data match the Casava 1.8 naming conventions that @ebolyen mentioned, so you should be able to import the zoo-poo directory using the CasavaOneEightSingleLanePerSampleDirFmt format (see this section of the importing tutorial for examples).
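For reference, a sketch of what the corrected command might look like, wrapped in a small function so the paths are easy to swap. The option names are the ones from the command posted earlier in this thread (QIIME 2 2017.12; I believe newer releases call the format option `--input-format`). Using the `SampleData[PairedEndSequencesWithQuality]` type is an assumption based on the R1/R2 files in the listing, and `import_paired_end` is just an illustrative name.

```shell
# Sketch of a paired-end Casava 1.8 import, assuming QIIME 2 2017.12 flags.
# $1: directory of fastq.gz files, $2: path of the artifact to write
import_paired_end() {
    qiime tools import \
      --type 'SampleData[PairedEndSequencesWithQuality]' \
      --input-path "$1" \
      --source-format CasavaOneEightSingleLanePerSampleDirFmt \
      --output-path "$2"
}

# Example call with the names used in this thread:
# import_paired_end zoo-poo zoo-poo.qza
```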

Hi @jairideout! I noticed that the file names had an ‘S’ in the barcode sequence identifier, so I removed that ‘S’ and the files then imported. Before that, they would not import.
