Qiime cutadapt trim-paired failing silently in Qiime 2018.6


(Paul Czechowski) #1

Dear all,

thank you for reviewing this post. I am invoking qiime cutadapt trim-paired on a cluster as follows:

# Trim primers from both imported data sets (array indices 1 and 2), logging the verbose output
for ((i=1;i<=2;i++)); do
    qiime cutadapt trim-paired \
        --i-demultiplexed-sequences "$trpth"/"${inpth[$i]}" \
        --p-cores "$cores" \
        --p-front-f "${fwdcut[$i]}" \
        --p-front-r "${revcut[$i]}" \
        --p-error-rate 0.1 \
        --o-trimmed-sequences "$trpth"/"${otpth[$i]}" \
        --verbose | tee "$trpth"/"${log[$i]}"
done

I am looping over an imported .qza file containing 18S data, and receive a trimmed .qza file as expected. However, this is not the case for a 16S data set, where only the log file (from --verbose | tee "$trpth"/"${log[$i]}") gets written. So trimming appears to be taking place for both files, but for the 16S data no output file is written. Is this possibly a known bug? Perhaps I should just skip this step, as no reads are filtered out anyway? I can’t easily update Qiime in the target environment, as it’s a big remote machine. Looking forward to hearing from you in due course.

Thank you.
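
One possible reason a failure like this goes unnoticed: in a pipeline such as `command | tee logfile`, bash reports the exit status of `tee`, not of the command before it, unless `pipefail` is set. Below is a minimal sketch of a wrapper that surfaces such failures. The function name `run_logged` is hypothetical, and it is an assumption that the failing command either returns a non-zero exit status or writes a line starting with `ERROR` to its output.

```shell
#!/usr/bin/env bash
# Make a pipeline fail if any command in it fails, not only the last one (tee).
set -o pipefail

# run_logged is a hypothetical helper: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
    local logfile="$1"
    shift
    # 2>&1 sends error messages written to stderr into the log as well
    if ! "$@" 2>&1 | tee "$logfile"; then
        echo "FAILED (non-zero exit status): $*" >&2
        return 1
    fi
    # Fallback (assumption): treat a log line starting with ERROR as a failure,
    # even if the command itself exited with status 0
    if grep -q '^ERROR' "$logfile"; then
        echo "FAILED (ERROR found in log): $logfile" >&2
        return 1
    fi
}
```

Inside the loop above, the call would then look like `run_logged "$trpth"/"${log[$i]}" qiime cutadapt trim-paired ... --verbose || echo "data set $i failed" >&2`, with the options unchanged.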


(Nicholas Bokulich) #2

(Paul Czechowski) #3

Dear all,

thank you very much, I resolved this. The error message was in the log file, but quite far up. It looks like the data contains some errors that were not caught during importing. I will exclude the offending data and retry. The error message was:

This is cutadapt 1.16 with Python 3.5.5
Command line parameters: --cores 40 --error-rate 0.1 --times 1 --overlap 3 -o /tmp/q2-CasavaOneEightSingleLanePerSampleDirFmt-bpmdr3da/585A_150_L001_R1_001.fastq.gz -p /tmp/q2-CasavaOneEightSingleLanePerSampleDirFmt-bpmdr3da/585A_151_L001_R2_001.fastq.gz --front AGAGTTTGATCMTGGCTCAG -G GWATTACCGCGGCKGCTG /tmp/qiime2-archive-e7k4n5h7/49b8914c-474b-4d85-acc2-5978cc10d5ef/data/585A_150_L001_R1_001.fastq.gz /tmp/qiime2-archive-e7k4n5h7/49b8914c-474b-4d85-acc2-5978cc10d5ef/data/585A_151_L001_R2_001.fastq.gz
Running on 40 cores
Trimming 2 adapters with at most 10.0% errors in paired-end mode ...
ERROR: Traceback (most recent call last):
  File "/programs/Anaconda2/envs/qiime2-2018.6/lib/python3.5/site-packages/cutadapt/pipeline.py", line 456, in run
    (n, bp1, bp2) = self._pipeline.process_reads()
  File "/programs/Anaconda2/envs/qiime2-2018.6/lib/python3.5/site-packages/cutadapt/pipeline.py", line 284, in process_reads
    for read1, read2 in self._reader:
  File "/programs/Anaconda2/envs/qiime2-2018.6/lib/python3.5/site-packages/cutadapt/seqio.py", line 414, in __iter__
    r2 = next(it2)
  File "src/cutadapt/_seqio.pyx", line 214, in __iter__ (src/cutadapt/_seqio.c:5619)
cutadapt.seqio.FormatError: Line 3 in FASTQ file is expected to start with '+', but found '@M01153:41'
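
To locate malformed records like the one reported above before importing, something along these lines may help. It is only a rough sketch: `check_fastq` is a hypothetical helper, and it assumes plain four-line FASTQ records (no wrapped sequence or quality lines). It does not reproduce cutadapt's own parser; it only checks that line 1 of each record starts with '@', that line 3 starts with '+', and that the file is not truncated mid-record.

```shell
#!/usr/bin/env bash

# check_fastq is a hypothetical helper: it reads uncompressed FASTQ text on
# stdin, prints the first structural problem it finds, and returns 1 if any.
check_fastq() {
    awk '
        BEGIN { bad = 0 }
        # Line 1 of every 4-line record must be the @ header
        NR % 4 == 1 && substr($0, 1, 1) != "@" {
            printf "line %d: expected @ header, found: %s\n", NR, $0
            bad = 1
            exit
        }
        # Line 3 of every 4-line record must be the + separator
        NR % 4 == 3 && substr($0, 1, 1) != "+" {
            printf "line %d: expected +, found: %s\n", NR, $0
            bad = 1
            exit
        }
        END {
            # A line count that is not a multiple of 4 means a truncated record
            if (!bad && NR % 4 != 0) {
                printf "truncated: %d lines is not a multiple of 4\n", NR
                bad = 1
            }
            exit bad
        }
    '
}

# Usage sketch (the directory name is an assumption):
#   for f in raw_reads/*.fastq.gz; do
#       gzip -dc "$f" | check_fastq || echo "problem in $f" >&2
#   done
```

Since a .qza file is a zip archive, the same check can be run on the fastq.gz files in its data directory after extracting it with unzip.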

Kind regards,


(Nicholas Bokulich) #4