Can't import my data (QIIME 2 2021.11)

Hi, sorry if I'm a little rusty on this. I reviewed the Importing data tutorial and I believe the way to go is the Casava 1.8 paired-end demultiplexed fastq format. My files are in a folder called fastq, which I unzipped, and I ended up with two files for each paired sample (.fastq and .fastq.gz).

When I run the following command, I get the error:

qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path fastq --input-format CasavaOneEightSingleLanePerSampleDirFmt --output-path demux-paired-end.qza

**There was a problem importing fastq:**

**fastq/baby014_201_L001_R2_001.fastq.gz is not a(n) FastqGzFormat file:**

**File is uncompressed**

But it is a compressed file in my folder.

I also tried the PairedEndFastqManifestPhred33V2 format and got a different error:

qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path fastq --output-path paired-end-demux.qza --input-format PairedEndFastqManifestPhred33V2

**There was a problem importing fastq:**

**fastq is not a file.**

I have also attached a view of how my files are organized in the directory I'm working in.

I would appreciate any help around this!
Thank you

@malenaamer,

Based on the file names, they do appear to be Casava 1.8 formatted. I think what happened is that the .gz extension somehow got added to some of the files without them actually being compressed.
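A quick way to test that guess before renaming anything is to ask gzip itself. The sketch below is not from the thread: the function name `find_mislabeled_gz` is made up for illustration, and it only assumes the reads sit in a single directory. It prints every `*.gz` file that fails `gzip -t`, i.e. files that carry the extension without containing gzip data.

```shell
# List files whose .gz extension does not match their contents.
# Hypothetical helper; $1 is the directory holding the reads (e.g. "fastq").
find_mislabeled_gz() {
    dir=$1
    for f in "$dir"/*.gz; do
        [ -e "$f" ] || continue      # no matches: the glob stays literal, skip it
        # gzip -t tests integrity and fails on anything that is not gzip data
        if ! gzip -t "$f" 2>/dev/null; then
            printf '%s\n' "$f"
        fi
    done
}
```

Running `find_mislabeled_gz fastq` from the directory that contains `fastq` would list the suspect files. An empty result means every `.gz` file really is gzip data, although it does not rule out a file that was compressed twice.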

First, let's remove the .gz extension. From the directory that contains your fastq directory (where it looks like you had been running your commands from), you can use this command in your terminal to remove it:

for i in fastq/*.gz ; do mv "$i" "${i%%.gz}" ; done

Next you can compress all of the files in that folder using:

gzip -r fastq

Try this out and let us know how it goes!

Thank you for the quick response!

I now seem to be getting a different error. I don't know if it's related to the fact that I installed QIIME 2 on my computer's hard drive but am working with a dataset that is on an external hard drive.

Do you know how I could solve the following error?

(qiime2-2021.11) MacBook-Pro-de-Malena:HS mae$ qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path fastq --input-format CasavaOneEightSingleLanePerSampleDirFmt --output-path demux-paired-end.qza
Traceback (most recent call last):
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/q2cli/builtin/tools.py", line 157, in import_data
    artifact = qiime2.sdk.Artifact.import_data(type, input_path,
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/sdk/result.py", line 277, in import_data
    return cls._from_view(type_, view, view_type, provenance_capture,
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/sdk/result.py", line 305, in _from_view
    result = transformation(view, validate_level)
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/core/transform.py", line 68, in transformation
    self.validate(view, level=validate_level)
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/core/transform.py", line 143, in validate
    view.validate(level)
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/plugin/model/directory_format.py", line 173, in validate
    getattr(self, field)._validate_members(collected_paths, level)
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/plugin/model/directory_format.py", line 103, in _validate_members
    self.format(path, mode='r').validate(level)
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/plugin/model/file_format.py", line 26, in validate
    self._validate_(level)
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/q2_types/per_sample_sequences/_format.py", line 284, in _validate_
    self._check_n_records(record_count_map[level])
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/q2_types/per_sample_sequences/_format.py", line 244, in _check_n_records
    for i, record in file_:
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0x8b in position 1: ordinal not in range(128)

An unexpected error has occurred:

  'ascii' codec can't decode byte 0x8b in position 1: ordinal not in range(128)

See above for debug info.
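
A note on reading this error: 0x1f 0x8b are the first two bytes of every gzip stream, so a `0x8b` at position 1 suggests that the importer decompressed the file and found gzip data again where it expected FASTQ text. That reading is an inference from the message, not something the traceback states. A minimal sketch for checking it, using the hypothetical helper names `first_two_bytes` and `classify_fastq`:

```shell
# Print the first two bytes of stdin as hex, e.g. "1f8b" for gzip data.
first_two_bytes() {
    od -An -tx1 -N2 | tr -d ' \n'
}

# Classify a file as "plain", "gzip" (one layer) or "double-gzip".
# Hypothetical helper; $1 is the path to one read file.
classify_fastq() {
    outer=$(first_two_bytes < "$1")
    if [ "$outer" != "1f8b" ]; then
        echo "plain"
        return
    fi
    # Decompress one layer and look at what is underneath.
    inner=$(gzip -dc < "$1" 2>/dev/null | first_two_bytes)
    if [ "$inner" = "1f8b" ]; then
        echo "double-gzip"
    else
        echo "gzip"
    fi
}
```

Something like `for f in fastq/*; do printf '%s\t%s\n' "$f" "$(classify_fastq "$f")"; done` would then show which files are not in the expected single-layer `gzip` state.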

Hi again,

I moved my working directory to the Desktop to make sure the problem wasn't caused by using the external hard drive, but the error persisted.

I would appreciate any help around it :slight_smile:

    (qiime2-2021.11) MacBook-Pro-de-Malena:HS mae$ qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path fastq --input-format CasavaOneEightSingleLanePerSampleDirFmt --output-path demux-paired-end.qza
Traceback (most recent call last):
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/q2cli/builtin/tools.py", line 157, in import_data
    artifact = qiime2.sdk.Artifact.import_data(type, input_path,
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/sdk/result.py", line 277, in import_data
    return cls._from_view(type_, view, view_type, provenance_capture,
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/sdk/result.py", line 305, in _from_view
    result = transformation(view, validate_level)
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/core/transform.py", line 68, in transformation
    self.validate(view, level=validate_level)
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/core/transform.py", line 143, in validate
    view.validate(level)
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/plugin/model/directory_format.py", line 173, in validate
    getattr(self, field)._validate_members(collected_paths, level)
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/plugin/model/directory_format.py", line 103, in _validate_members
    self.format(path, mode='r').validate(level)
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/plugin/model/file_format.py", line 26, in validate
    self._validate_(level)
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/q2_types/per_sample_sequences/_format.py", line 284, in _validate_
    self._check_n_records(record_count_map[level])
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/q2_types/per_sample_sequences/_format.py", line 244, in _check_n_records
    for i, record in file_:
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0x8b in position 1: ordinal not in range(128)

An unexpected error has occurred:

  'ascii' codec can't decode byte 0x8b in position 1: ordinal not in range(128)

See above for debug info.

I also read this post: UnicodeDecodeError: ‘ascii’ codec can’t decode byte 0x8b in position 513 - #6 by ariel

I did what Ariel said worked for her, but I don't know what errors to look for when all the files appear in the list.

(qiime2-2021.11) MacBook-Pro-de-Malena:HS mae$ for f in fastq/*; do file $f;
> done

Hi again,

I'm trying more things as I go and hope that it helps to see all this when you get a chance to respond.

A friend recommended that I try the "PairedEndFastqManifestPhred33V2" method and make a manifest file. I made one as a text file with just one of my samples.
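
For reference, a V2 paired-end manifest is a tab-separated text file with the header columns `sample-id`, `forward-absolute-filepath` and `reverse-absolute-filepath`, and `--input-path` must point at that file rather than at the fastq directory (which is why the earlier attempt reported "fastq is not a file"). The sketch below shows one way such a file could be generated from Casava-style names. The function name `make_manifest` is hypothetical, and taking the sample ID from the text before the first underscore is an assumption, not a QIIME 2 requirement.

```shell
# Write a V2 paired-end manifest to stdout for every *_R1_001.fastq.gz
# file in a directory, pairing each one with its *_R2_001.fastq.gz mate.
make_manifest() {
    dir=$(cd "$1" && pwd) || return 1     # the manifest needs absolute paths
    printf 'sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n'
    for fwd in "$dir"/*_R1_001.fastq.gz; do
        [ -e "$fwd" ] || continue         # no forward reads found
        rev=${fwd%_R1_001.fastq.gz}_R2_001.fastq.gz
        if [ ! -e "$rev" ]; then
            echo "missing reverse read for $fwd" >&2
            continue
        fi
        name=${fwd##*/}                   # file name without the directory
        sample=${name%%_*}                # assumption: ID is the text before the first "_"
        printf '%s\t%s\t%s\n' "$sample" "$fwd" "$rev"
    done
}
```

With this, `make_manifest fastq > manifest.txt` would produce a file to pass as `--input-path`. It does not check that the files are validly compressed, so it would not by itself avoid a decoding error.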

And then I ran the following code:

(qiime2-2021.11) MacBook-Pro-de-Malena:HS mae$ qiime tools import \
>   --type 'SampleData[PairedEndSequencesWithQuality]' \
>   --input-path manifest.txt \
>   --output-path demux.qza \
>   --input-format PairedEndFastqManifestPhred33V2

But I got a similar error:

Traceback (most recent call last):
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/q2cli/builtin/tools.py", line 157, in import_data
    artifact = qiime2.sdk.Artifact.import_data(type, input_path,
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/sdk/result.py", line 277, in import_data
    return cls._from_view(type_, view, view_type, provenance_capture,
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/sdk/result.py", line 305, in _from_view
    result = transformation(view, validate_level)
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/core/transform.py", line 73, in transformation
    other.validate(new_view)
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/core/transform.py", line 143, in validate
    view.validate(level)
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/plugin/model/directory_format.py", line 173, in validate
    getattr(self, field)._validate_members(collected_paths, level)
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/plugin/model/directory_format.py", line 103, in _validate_members
    self.format(path, mode='r').validate(level)
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/plugin/model/file_format.py", line 26, in validate
    self._validate_(level)
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/q2_types/per_sample_sequences/_format.py", line 284, in _validate_
    self._check_n_records(record_count_map[level])
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/q2_types/per_sample_sequences/_format.py", line 244, in _check_n_records
    for i, record in file_:
  File "/Users/mae/miniconda3/envs/qiime2-2021.11/lib/python3.8/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xae in position 37: ordinal not in range(128)

An unexpected error has occurred:

  'ascii' codec can't decode byte 0xae in position 37: ordinal not in range(128)

See above for debug info.

This makes me think it might be caused by some configuration on my computer.

I would appreciate any help around this!
Malena

@malenaamer,

Ok, I had to do a bit of digging on this. It looks like some of your original files were unzipped but with the .gz appended while some were actually zipped. I was able to replicate your error by manually stripping the .gz extension off of a file that I was planning to import, then compressing it again with gzip. I don't have a good automated solution tonight for this but I think these basic steps should fix your issue:

  1. Unzip all files.
  2. Rename all files so that they end in .gz again, so that you can try unzipping them a second time.
  3. Try unzipping all files again. This will have to be done either in a try/except manner or individually.
  4. Strip all .gz endings.
  5. Recompress all files.
  6. Retry import.

Hopefully I will have time to work out a script to do this in the morning, but I wanted to at least lay out what I believe to be the problem and an outline of a solution to it right now.
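
Until then, one possible shape for such a script is sketched below. It is an illustration rather than the script promised above, and `normalize_fastq` is a hypothetical name. Instead of renaming in passes, it peels gzip layers off a copy of each file for as long as `gzip -t` succeeds (the shell equivalent of try/except), then compresses exactly once and writes the result with a single `.gz` ending. It assumes no two files share a stem, so a directory holding both `x.fastq` and `x.fastq.gz` should be cleaned up first, and it is safest to run on a copy of the data.

```shell
# Leave one file with exactly one gzip layer and a name ending in .gz.
# Hypothetical helper; $1 is the path to one read file.
normalize_fastq() {
    src=$1
    work=$(mktemp) || return 1
    cp "$src" "$work" || return 1
    layers=0
    # "try": gzip -t succeeds only while the contents are still gzip data.
    # The cap of 5 layers guards against looping forever on odd input.
    while [ "$layers" -lt 5 ] && gzip -t < "$work" 2>/dev/null; do
        gzip -dc < "$work" > "$work.next" || { rm -f "$work" "$work.next"; return 1; }
        mv "$work.next" "$work"
        layers=$((layers + 1))
    done
    base=${src%.gz}                       # strip one trailing .gz if present
    # Compress once, then swap the result in for the original file.
    gzip -c "$work" > "$base.gz.tmp" || { rm -f "$work" "$base.gz.tmp"; return 1; }
    rm -f "$src" "$work"
    mv "$base.gz.tmp" "$base.gz"
}
```

Usage would be along the lines of `for f in fastq/*; do normalize_fastq "$f"; done`, followed by retrying the import.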

Hi!

Thank you for your reply. I just want to make sure I understand. Should I be doing this manually? Is this different from what you first advised me to do?

for i in fastq/*.gz ; do mv "$i" "${i%%.gz}" ; done

Next you can compress all of the files in that folder using:

gzip -r fastq

As far as I understand, these are the first two steps. But then, for step 3, what does doing it in a "try/except" manner mean? Should I do this from the Finder?

For step 4, by stripping do you mean repeating step 1? When I delete the .gz, the file is just decompressed.

For step 5, how is this different from step 2? Can I use the same command?

Sorry if my questions are too basic. I tried googling your terms but I'm still confused.

Thanks in advance,
Malena

Hi @Keegan-Evans,

I figured out how to fix it! I think some of the files must have been corrupted. I redownloaded the data and skipped a step I had done before (I had replaced some files with ones from another folder inside fastq.zip, and those might have been the ones messing up my process).

I reran the same code and this time it worked.
I appreciate your time and help!
Malena

@malenaamer I am glad you got it working and thanks for the update!
