Error messages after cutadapt, but got trimmed files.

gihyeon · October 23, 2024, 1:53am

Hi,

I encountered a problem with running cutadapt. I am sorry if this problem has been solved before, but I could not find a solution.

I ran the following command:

parallel --xapply --jobs 30 'cutadapt --pair-filter any --no-indels --discard-untrimmed -g CCTACGGGNGGCWGCAG -G GACTACHVGGGTATCTAATCC -o 01_primer_trimmed_fastqs/cutadapt_{1/} -p 01_primer_trimmed_fastqs/cutadapt_`basename {=s/_1/_2/;s/\.fastq.gz//=}.fastq.gz` {1} {=s/_1/_2/=} > 01_primer_trimmed_fastqs/{1/}_cutadapt_log.txt' ::: raw_data/*_1.fastq.gz

and then I got the following error message:

  File "/home/gihyeon/anaconda3/envs/qiime2-amplicon-2024.5/bin/cutadapt", line 10, in <module>
    sys.exit(main_cli())
  File "/home/gihyeon/anaconda3/envs/qiime2-amplicon-2024.5/lib/python3.9/site-packages/cutadapt/cli.py", line 1149, in main_cli
    main(sys.argv[1:])
  File "/home/gihyeon/anaconda3/envs/qiime2-amplicon-2024.5/lib/python3.9/site-packages/cutadapt/cli.py", line 1243, in main
    stats = runner.run(pipeline, progress, outfiles)
  File "/home/gihyeon/anaconda3/envs/qiime2-amplicon-2024.5/lib/python3.9/site-packages/cutadapt/runners.py", line 423, in run
    (n, total1_bp, total2_bp) = pipeline.process_reads(
  File "/home/gihyeon/anaconda3/envs/qiime2-amplicon-2024.5/lib/python3.9/site-packages/cutadapt/pipeline.py", line 137, in process_reads
    for reads in self._reader:
  File "/home/gihyeon/anaconda3/envs/qiime2-amplicon-2024.5/lib/python3.9/site-packages/dnaio/pairedend.py", line 96, in __iter__
    for r1, r2 in zip(self.reader1, self.reader2):
  File "src/dnaio/_core.pyx", line 581, in dnaio._core.FastqIter.__next__
  File "src/dnaio/_core.pyx", line 512, in dnaio._core.FastqIter._read_into_buffer
  File "/home/gihyeon/anaconda3/envs/qiime2-amplicon-2024.5/lib/python3.9/gzip.py", line 300, in read
    return self._buffer.read(size)
igzip_lib.IsalError: Error -1 Invalid deflate block found

The strange thing is that the output files were still produced. I also checked the log file and found no error in it.

Thank you for your help in advance.

Best,
Gihyeon

SF-999_1.fastq.gz_cutadapt_log.txt (1.8 KB)

colinbrislawn · October 23, 2024, 1:58am

Hello @gihyeon,

Welcome to the forums!

Here's the core of the error:

igzip_lib.IsalError: Error -1 Invalid deflate block found

'Deflate' is the compression method used in .gz files.
This means that one of your fastq.gz files was probably corrupted during download.
Redownloading it should fix this issue!
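
To find out which file is affected before redownloading, one option is to decompress every input file from start to end and see which ones fail. The following is only a minimal sketch using Python's standard library: the function names are made up for illustration, and `raw_data` is the input directory taken from the command in the first post.

```python
import gzip
import zlib
from pathlib import Path


def is_gzip_intact(path, chunk_size=1 << 20):
    """Return True if the whole file decompresses without error."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read to the end of the stream; corruption or truncation
            # surfaces as an exception somewhere along the way.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        return False
    return True


def find_corrupted(directory, pattern="*.fastq.gz"):
    """Return the sorted list of files in `directory` that fail the check."""
    return sorted(p for p in Path(directory).glob(pattern) if not is_gzip_intact(p))


if __name__ == "__main__":
    # "raw_data" is the directory used in the original command (an assumption
    # about where the script is run from).
    for bad in find_corrupted("raw_data"):
        print(f"corrupted: {bad}")
```

Any file reported here is a candidate for redownloading. Running `gzip -t` on each file from the shell performs a similar check.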

P.S. I've moved this question to 'other bioinformatic tools' as it looks like you are using cutadapt and GNU parallel.
Have you considered importing your data and using the q2-cutadapt plugin? :qiime2:

gihyeon · October 23, 2024, 5:04am

Hi @colinbrislawn,

I appreciate your really fast answer!
(To clarify, I used the cutadapt package installed in the qiime2 environment, not the q2-cutadapt plugin.)

Unfortunately, if my fastq files were corrupted during download, there is no way to fix it.

I am trying to run qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path ./list.csv --output-path 01_importing_file/importing_data --input-format PairedEndFastqManifestPhred33.
The list.csv file includes the paths of the trimmed files produced by the previous command (possibly from damaged fastq.gz files).
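
For context, a manifest like list.csv could be generated roughly as follows. This is a sketch, not the exact file used here: it assumes the legacy CSV manifest header `sample-id,absolute-filepath,direction`, the `cutadapt_<sample>_1.fastq.gz` / `cutadapt_<sample>_2.fastq.gz` naming produced by the command above, and a hypothetical helper name `build_manifest`.

```python
import csv
from pathlib import Path


def build_manifest(trimmed_dir, manifest_path, prefix="cutadapt_"):
    """Write a paired-end CSV manifest and return the number of samples."""
    rows = []
    for fwd in sorted(Path(trimmed_dir).glob(f"{prefix}*_1.fastq.gz")):
        # Sample ID = file name without the prefix and the _1.fastq.gz suffix
        # (an assumption about the naming scheme).
        sample = fwd.name[len(prefix):-len("_1.fastq.gz")]
        rev = fwd.with_name(f"{prefix}{sample}_2.fastq.gz")
        if not rev.exists():
            continue  # skip samples whose reverse-read file is missing
        rows.append((sample, str(fwd.resolve()), "forward"))
        rows.append((sample, str(rev.resolve()), "reverse"))
    with open(manifest_path, "w", newline="") as handle:
        writer = csv.writer(handle, lineterminator="\n")
        writer.writerow(["sample-id", "absolute-filepath", "direction"])
        writer.writerows(rows)
    return len(rows) // 2
```

For example, `build_manifest("01_primer_trimmed_fastqs", "list.csv")` would list every sample that has both read files present.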

By the way, can I use the q2-cutadapt plugin with GNU parallel to shorten the run time? Or is there another method you would recommend?

Thank you so much!!

Sincerely,
Gihyeon.

colinbrislawn · October 24, 2024, 7:55pm

Yes, unfortunately.

Many QIIME 2 plugins, including q2-cutadapt, have built-in options for processing things in parallel.

>Usage: qiime cutadapt trim-paired [OPTIONS]

  Search demultiplexed paired-end sequences for adapters and remove them. The
  parameter descriptions in this method are adapted from the official cutadapt
  docs - please see those docs at https://cutadapt.readthedocs.io for complete
  details.

Inputs:
  --i-demultiplexed-sequences ARTIFACT
    SampleData[PairedEndSequencesWithQuality]
                          The paired-end sequences to be trimmed.   [required]
Parameters:
  --p-cores NTHREADS      Number of CPU cores to use.             [default: 1]
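
As a rough illustration of how `--p-cores` fits into a full call, here is a small Python sketch that assembles the command line. The artifact names are hypothetical, the primers are copied from the original cutadapt command, and `--p-front-f`, `--p-front-r`, `--p-discard-untrimmed`, and `--o-trimmed-sequences` are the q2-cutadapt options as I understand them from the plugin's help text, so please check `qiime cutadapt trim-paired --help` before relying on this. The `--pair-filter any` and `--no-indels` settings from the original command are not mapped here.

```python
import shlex


def trim_paired_command(demux_qza, trimmed_qza, front_f, front_r, cores=1):
    """Build the argument list for `qiime cutadapt trim-paired`."""
    return [
        "qiime", "cutadapt", "trim-paired",
        "--i-demultiplexed-sequences", demux_qza,
        "--p-front-f", front_f,
        "--p-front-r", front_r,
        "--p-discard-untrimmed",
        "--p-cores", str(cores),
        "--o-trimmed-sequences", trimmed_qza,
    ]


cmd = trim_paired_command(
    "demux.qza",              # hypothetical artifact imported from the raw reads
    "demux-trimmed.qza",      # hypothetical output name
    "CCTACGGGNGGCWGCAG",      # forward primer from the original command
    "GACTACHVGGGTATCTAATCC",  # reverse primer from the original command
    cores=30,
)
# Print the command so it can be reviewed or pasted into a shell.
print(shlex.join(cmd))
# To execute it inside the QIIME 2 environment instead:
#   import subprocess; subprocess.run(cmd, check=True)
```

Note that this route imports the raw (untrimmed) reads first and lets the plugin do the trimming, so GNU parallel is not needed.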

Good luck sorting out the corrupted file. I've had luck with BBTools repair.sh in the past. That's another 3rd party tool!

gihyeon · November 7, 2024, 2:31am

Thank you so much for your help!
Have a good day.

Best,
Gihyeon
