Error during deblur process

Hello everyone! I am new to bioinformatics. I have little to no background in it, and I have been assigned to help a professor with some taxonomic analysis. I installed QIIME 2 v2024.2 (amplicon distribution). The FASTQ files are from the American Gut Project, and I believe they are Human Gut Metagenome or Human Metagenome.

Here is the process I am going through:

  1. Download 600 FASTQ pairs (paired-end, so 1200 FASTQ files: R1 and R2)
  2. Create a demux file (format "CasavaOneEightSingleLanePerSampleDirFmt")
  3. Trim the paired-end reads
    a. Forward primer: "GTGCCAGCMGCCGCGGTAA"
    b. Reverse primer: "GGACTACHVGGGTWTCTAAT"
    I got the primer information from Qiita after Deblur gave me an error saying I have to remove PhiX or adapters from my sequences. I am not even sure these primers are correct.
  4. Merge the pairs
  5. Run Deblur (trim length 125). This is the error I mainly get:

Plugin error from deblur:

No sequences passed the filter. It is possible the trim_length (125) may exceed the longest sequence, that all of the sequences are artifacts like PhiX or adapter, or that the positive reference used is not representative of the data being denoised.

See above for debug info.

The other error is (This is what I get occasionally):

Plugin error from deblur:

max() arg is an empty sequence

See above for debug info.

I am using a class computer with a lot of cores. I have a Python script that saves each sample's files in a folder named after its distinct ID and runs each sample separately, in parallel, so that I can troubleshoot them individually.
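For illustration, a per-sample parallel run of this kind might be sketched as follows. This is only a sketch under stated assumptions, not the script actually used here: the file names (`filtered.qza`, `rep-seqs.qza`, `table.qza`, `deblur-stats.qza`, `deblur.log`), the helper names, and the worker count are all hypothetical. The Deblur flags are the ones from the command used in this thread.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def deblur_command(sample_dir, trim_length=125):
    """Build the Deblur command for one sample folder (file names are hypothetical)."""
    sample_dir = Path(sample_dir)
    return [
        "qiime", "deblur", "denoise-16S",
        "--i-demultiplexed-seqs", str(sample_dir / "filtered.qza"),
        "--p-trim-length", str(trim_length),
        "--p-sample-stats",
        "--o-representative-sequences", str(sample_dir / "rep-seqs.qza"),
        "--o-table", str(sample_dir / "table.qza"),
        "--o-stats", str(sample_dir / "deblur-stats.qza"),
        "--verbose",
    ]


def run_and_log(command, log_path):
    """Run one command, write stdout and stderr to log_path, return the exit code."""
    with open(log_path, "w") as log:
        completed = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return completed.returncode


def run_samples(sample_dirs, max_workers=4, make_command=deblur_command):
    """Run one command per sample folder in parallel; return {folder name: exit code}."""
    sample_dirs = [Path(d) for d in sample_dirs]

    def job(sample_dir):
        return run_and_log(make_command(sample_dir), sample_dir / "deblur.log")

    # Threads are enough here: each worker only waits on an external process.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        codes = list(pool.map(job, sample_dirs))
    return {d.name: code for d, code in zip(sample_dirs, codes)}
```

A non-zero exit code in the returned dictionary points at the folder whose `deblur.log` holds the error, so failed and successful samples can be separated without re-reading every log.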

I haven't tried the DADA2 process.

When I ran the single-end files, I didn't trim anything with cutadapt and didn't have to merge anything. I simply ran Deblur and it worked fine. I ran it on 100 files as a test and all of them worked perfectly. The issue keeps happening with the paired-end files. I ran the process on 300 files across 3 cluster computers (100 each), but after 24 hours I had only about 10 successful outputs. Many of the rest failed with the error above (failed to pass the 125 trim length), which I saved to log files. Many of the others were still running when the cluster computing time ran out.

Any perspective on trimming or possible issues would be greatly appreciated.

Thank you.

Hello @moshhoss,

Can you post the demux visualization and the command that you ran that resulted in an error?

ERR5947748_demux.qzv (311.1 KB)
ERR5947631_demux.qzv (312.0 KB)

I have attached two demux.qzv files. ERR5947631 is one that ran successfully.

My cutadapt code is
qiime cutadapt trim-paired \
  --i-demultiplexed-sequences '{input_demux}' \
  --p-front-f "GTGCCAGCMGCCGCGGTAA" \
  --p-front-r "GGACTACHVGGGTWTCTAAT" \
  --o-trimmed-sequences '{output_trimmed}' \
  --verbose

My merge code is
qiime vsearch merge-pairs \
  --i-demultiplexed-seqs '{input_demux}' \
  --o-merged-sequences '{output_joined}' \
  --o-unmerged-sequences '{output_unmerged}' \
  --verbose

And my Deblur code is
qiime deblur denoise-16S \
  --i-demultiplexed-seqs '{input_filtered_sequences}' \
  --p-trim-length 125 \
  --p-sample-stats \
  --o-representative-sequences '{output_rep_seqs}' \
  --o-table '{output_table}' \
  --o-stats '{output_stats}' \
  --verbose

ERR5947747_demux.qzv (312.0 KB)
ERR5947639_demux.qzv (311.5 KB)

I have attached two more demux.qzv files, both from samples that failed in Deblur.

All of the failed ones that I have attached gave me this error:

Plugin error from deblur:

No sequences passed the filter. It is possible the trim_length (125) may exceed the longest sequence, that all of the sequences are artifacts like PhiX or adapter, or that the positive reference used is not representative of the data being denoised.

See above for debug info.

Hi @moshhoss,

It looks like some samples do not have sequences long enough for the trim length specified. The primers have already been removed from these data, so the cutadapt step is not needed. I've never attempted merging R1/R2 for these data. Is it possible that one or a few of the samples had no successfully stitched reads?
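One way to check this per sample is to count how many merged reads are at least as long as the trim length. The sketch below is only an illustration: it assumes the merged reads have been exported to a plain or gzipped FASTQ file with standard 4-line records (for example with `qiime tools export`), and the function names are made up for this example.

```python
import gzip


def read_lengths(fastq_path):
    """Yield the length of every read in a plain or gzipped 4-line-record FASTQ file."""
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:  # second line of each record is the sequence
                yield len(line.strip())


def summarize(fastq_path, trim_length=125):
    """Count the reads in a file and how many reach trim_length."""
    lengths = list(read_lengths(fastq_path))
    usable = sum(1 for n in lengths if n >= trim_length)
    # default=0 keeps max() from raising on a file with no reads at all
    return {"total": len(lengths), "usable": usable, "longest": max(lengths, default=0)}
```

A sample that reports `usable` as 0, or `total` as 0 because nothing was stitched, would be expected to fail at the chosen trim length.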

For context, I'm the Scientific Director for the American Gut Project.

Please note too that if you'd like pre-computed Deblur feature tables, they can be obtained using redbiom against study 10317.

Best,
Daniel