Error during deblur process

moshhoss · April 1, 2024, 5:14am

Hello everyone! I am new to bioinformatics, with little to no background in it, and I have been assigned to help a professor with some taxonomic analysis. I installed QIIME2 v2024.2 amplicon. The FASTQ files are from the American Gut Project and are, I believe, Human Gut Metagenome or Human Metagenome.

Here is the process I am going through:

1. Download 600 FASTQ files (paired-end, so 1200 files counting R1 and R2).
2. Create a demux file in "CasavaOneEightSingleLanePerSampleDirFmt" format.
3. Trim the paired-end reads:
   a. Forward primer: "GTGCCAGCMGCCGCGGTAA"
   b. Reverse primer: "GGACTACHVGGGTWTCTAAT"
   I got the primer information from Qiita after I received the Deblur error suggesting that I had to remove PhiX or adapters from my sequences. I am not even sure these primers are correct.
4. Merge the pairs.
5. Run Deblur (trim length 125). I mainly get this error:

Plugin error from deblur:

No sequences passed the filter. It is possible the trim_length (125) may exceed the longest sequence, that all of the sequences are artifacts like PhiX or adapter, or that the positive reference used is not representative of the data being denoised.

See above for debug info.

The other error is (This is what I get occasionally):

Plugin error from deblur:

max() arg is an empty sequence

See above for debug info.

I am using a class computer with a lot of cores. I have a Python script that saves the files for each sample in a folder named after its distinct ID and runs each one separately in parallel, so that I can troubleshoot them individually.
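
The per-sample parallel setup described above could be sketched roughly as follows. This is a minimal illustration, not the poster's actual script: the names `run_one` and `run_all`, the worker count, and the timeout are all hypothetical, and the argument lists passed in would be the qiime commands shown later in this thread.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_one(sample_id, command, timeout_s):
    """Run one command and return (sample_id, status, captured output)."""
    try:
        proc = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # The child process is killed once the timeout is exceeded.
        return sample_id, "timeout", ""
    status = "ok" if proc.returncode == 0 else "failed"
    return sample_id, status, proc.stdout + proc.stderr


def run_all(commands, max_workers=4, timeout_s=3600):
    """commands: dict mapping a sample ID to its argument list.

    Threads are enough here because each worker only waits on a subprocess.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(run_one, sample_id, command, timeout_s)
            for sample_id, command in commands.items()
        ]
        results = {}
        for future in futures:
            sample_id, status, output = future.result()
            results[sample_id] = (status, output)
        return results
```

Giving every job its own timeout means a sample that hangs is reported as "timeout" rather than silently consuming the rest of the cluster allocation.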

I haven't tried the DADA2 process.

When I ran single-end files, I didn't trim anything with cutadapt and didn't have to merge anything. I simply ran Deblur and it ran just fine. I tested it on 100 files and all of them worked perfectly. The issue keeps happening with the paired-end files. I ran the process on 300 files across 3 cluster computers (100 each), but after 24 hours only about 10 had finished successfully. Many of them produced an error (failed to pass the 125 trim length), which I saved to a log file. Many of the others were still running when the cluster computing time ran out.
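
With hundreds of saved log files, a quick tally of which error each one contains can help separate the two failure modes. The sketch below assumes one log file per sample in a single directory with a `.log` extension; the function names and that layout are hypothetical, while the message fragments are copied from the two Deblur errors quoted above.

```python
from collections import Counter
from pathlib import Path

# Fragments of the two error messages quoted in this thread.
KNOWN_ERRORS = {
    "no_sequences_passed": "No sequences passed the filter",
    "empty_max": "max() arg is an empty sequence",
}


def classify_log(text):
    """Return a short label for the first known error found in a log."""
    for label, fragment in KNOWN_ERRORS.items():
        if fragment in text:
            return label
    return "other"


def tally_logs(log_dir, pattern="*.log"):
    """Count how many logs in log_dir fall under each label."""
    counts = Counter()
    for path in sorted(Path(log_dir).glob(pattern)):
        counts[classify_log(path.read_text(errors="replace"))] += 1
    return counts
```

Logs labeled "other" are the ones worth reading by hand, since they match neither known message.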

Any perspective on trimming or possible issues would be greatly appreciated.

Thank you.

colinvwood · April 1, 2024, 9:48pm

Hello @moshhoss,

Can you post the demux visualization and the command that you ran that resulted in an error?

moshhoss · April 2, 2024, 12:43am

ERR5947748_demux.qzv (311.1 KB)
ERR5947631_demux.qzv (312.0 KB)

I have attached two demux.qzv files. ERR5947631 is one that ran successfully.

My cutadapt code is
qiime cutadapt trim-paired \
  --i-demultiplexed-sequences '{input_demux}' \
  --p-front-f "GTGCCAGCMGCCGCGGTAA" \
  --p-front-r "GGACTACHVGGGTWTCTAAT" \
  --o-trimmed-sequences '{output_trimmed}' \
  --verbose

My merge code is
qiime vsearch merge-pairs \
  --i-demultiplexed-seqs '{input_demux}' \
  --o-merged-sequences '{output_joined}' \
  --o-unmerged-sequences '{output_unmerged}' \
  --verbose

And my Deblur code is
qiime deblur denoise-16S \
  --i-demultiplexed-seqs '{input_filtered_sequences}' \
  --p-trim-length 125 \
  --p-sample-stats \
  --o-representative-sequences '{output_rep_seqs}' \
  --o-table '{output_table}' \
  --o-stats '{output_stats}' \
  --verbose

moshhoss · April 2, 2024, 12:43am

ERR5947747_demux.qzv (312.0 KB)
ERR5947639_demux.qzv (311.5 KB)

I have attached two more demux.qzv files that failed to run deblur.

All the failed ones that I have attached, gave me this error:

Plugin error from deblur:

No sequences passed the filter. It is possible the trim_length (125) may exceed the longest sequence, that all of the sequences are artifacts like PhiX or adapter, or that the positive reference used is not representative of the data being denoised.

See above for debug info.

wasade · April 2, 2024, 5:01pm

Hi @moshhoss,

It looks like some samples do not have sequences long enough for the trim length specified. The primers have already been removed for these data, so that step is not needed. I've never attempted merging R1/R2 for these data -- is it possible that one or a few of the samples had no successfully stitched reads?
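
One way to check both possibilities for a given sample is to count how many reads in its FASTQ file reach the trim length. The following is a minimal sketch, assuming standard four-line FASTQ records in a plain or gzip-compressed file (for example, one taken from the merged output); `read_lengths` and `summarize` are hypothetical helper names, not part of QIIME 2 or Deblur.

```python
import gzip


def read_lengths(fastq_path):
    """Yield the length of every sequence in a FASTQ file (plain or .gz).

    Assumes four lines per record, with the sequence on the second line.
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:
                yield len(line.strip())


def summarize(fastq_path, trim_length=125):
    """Report how many reads are at least trim_length bases long."""
    lengths = list(read_lengths(fastq_path))
    long_enough = sum(1 for n in lengths if n >= trim_length)
    return {
        "reads": len(lengths),
        "long_enough": long_enough,
        # default=0 keeps an empty file from raising on max().
        "longest": max(lengths, default=0),
    }
```

A sample where `reads` is 0 had nothing survive the earlier steps, and a sample where `long_enough` is 0 has no read that reaches the chosen trim length. Either case would be consistent with the "No sequences passed the filter" error.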

For context, I'm the Scientific Director for the American Gut Project.

Please note too that if you'd like pre-computed Deblur feature tables, they can be obtained using redbiom against study 10317.

Best,
Daniel

moshhoss · May 1, 2024, 5:41pm

Thank you for your reply. I was able to figure out the issue. I was trying to run Deblur on the paired-end files, but the same subjects already have single-end files, and simply running the single-end files solved the problem.

Thank you again. I greatly appreciate it.

marcosandrew · May 10, 2024, 3:32pm

Hey Daniel McDonald,

I am facing the same issue, and your expertise and insights on the sequencing data challenges are truly appreciated. Your proactive approach and offer of pre-computed Deblur feature tables showcase your dedication. Thank you for your valuable contributions to the American Gut Project.

(Marcos)

wasade · May 11, 2024, 9:03pm

Hi @marcosandrew,

Thank you for the kind words. Could you describe the specific issue you're having, and include the commands and any errors observed?

Best,
Daniel
