Running denoise and deblur on paired-end sequences

Hi all, I currently have raw shotgun metagenome sequencing data. The reads are paired-end, and I would like to use the QIIME 2 pipeline to characterize the gut microbiome. My paired-end-demux.qza file is 6.7 GB and contains only the forward and reverse reads of one sample.
However, when I tried to run the denoising step, it kept giving me the same error, even after I adjusted the number of threads (24, 7, and 2).

qiime dada2 denoise-paired \
  --i-demultiplexed-seqs paired-end-demux.qza \
  --p-trim-left-f 0 \
  --p-trim-left-r 0 \
  --p-trunc-len-f 150 \
  --p-trunc-len-r 150 \
  --o-representative-sequences reps-seqs-dada2.qza \
  --o-table pet-table.qza \
  --o-denoising-stats denoising-stats.qza \
  --p-n-threads 24 \
  --verbose

Running external command line application(s). This may print messages to stdout and/or stderr.
The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.

Command: run_dada_paired.R /tmp/tmpud3kk44l/forward /tmp/tmpud3kk44l/reverse /tmp/tmpud3kk44l/output.tsv.biom /tmp/tmpud3kk44l/track.tsv /tmp/tmpud3kk44l/filt_f /tmp/tmpud3kk44l/filt_r 150 150 0 0 2.0 2.0 2 12 independent consensus 1.0 24 1000000

R version 4.0.3 (2020-10-10)
Loading required package: Rcpp
DADA2: 1.18.0 / Rcpp: 1.0.6 / RcppParallel: 5.1.2
1) Filtering .
2) Learning Error Rates
6591345150 total bases in 43942301 reads from 1 samples will be used for learning the error rates.

Traceback (most recent call last):
  File "/home/chia/miniconda3/envs/qiime2-2021.4/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 266, in denoise_paired
    run_commands([cmd])
  File "/home/chia/miniconda3/envs/qiime2-2021.4/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 36, in run_commands
    subprocess.run(cmd, check=True)
  File "/home/chia/miniconda3/envs/qiime2-2021.4/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['run_dada_paired.R', '/tmp/tmpud3kk44l/forward', '/tmp/tmpud3kk44l/reverse', '/tmp/tmpud3kk44l/output.tsv.biom', '/tmp/tmpud3kk44l/track.tsv', '/tmp/tmpud3kk44l/filt_f', '/tmp/tmpud3kk44l/filt_r', '150', '150', '0', '0', '2.0', '2.0', '2', '12', 'independent', 'consensus', '1.0', '24', '1000000']' died with <Signals.SIGKILL: 9>.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/chia/miniconda3/envs/qiime2-2021.4/lib/python3.8/site-packages/q2cli/commands.py", line 329, in __call__
    results = action(**arguments)
  File "<decorator-gen-514>", line 2, in denoise_paired
  File "/home/chia/miniconda3/envs/qiime2-2021.4/lib/python3.8/site-packages/qiime2/sdk/action.py", line 244, in bound_callable
    outputs = self._callable_executor_(scope, callable_args,
  File "/home/chia/miniconda3/envs/qiime2-2021.4/lib/python3.8/site-packages/qiime2/sdk/action.py", line 390, in _callable_executor_
    output_views = self._callable(**view_args)
  File "/home/chia/miniconda3/envs/qiime2-2021.4/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 279, in denoise_paired
    raise Exception("An error was encountered while running DADA2"
Exception: An error was encountered while running DADA2 in R (return code -9), please inspect stdout and stderr to learn more.

Plugin error from dada2:

  An error was encountered while running DADA2 in R (return code -9), please inspect stdout and stderr to learn more.

See above for debug info.

From what I understand of the error, it seems to be saying that my machine does not have enough RAM to process the data. Please correct me if I have interpreted it wrongly.
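As a side note on reading this error: Python's subprocess module reports a negative return code when the child process was terminated by a signal, so "return code -9" corresponds to signal 9 (SIGKILL). On Linux this is often, though not always, what happens when the kernel kills a process that has run out of memory. The small sketch below decodes the return code and checks the read length implied by the DADA2 log. The helper name describe_return_code is made up for illustration, and the signal lookup assumes a POSIX system.

```python
import signal


def describe_return_code(returncode: int) -> str:
    """Explain a return code as reported by Python's subprocess module."""
    if returncode < 0:
        # subprocess uses -N to mean "terminated by signal N" (POSIX only).
        sig = signal.Signals(-returncode)
        return f"killed by signal {sig.value} ({sig.name})"
    if returncode == 0:
        return "exited successfully"
    return f"exited with error status {returncode}"


# Figures copied from the "Learning Error Rates" line of the log above.
total_bases = 6591345150
total_reads = 43942301
mean_read_length = total_bases / total_reads

print(describe_return_code(-9))
print(f"mean read length: {mean_read_length:.0f} bases")
```

The decoded signal only says that the process was killed from outside; it does not by itself prove a memory problem. Checking system logs or watching memory use while the command runs would be needed to confirm that.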

After several failed attempts, I switched to Deblur (denoise-16S), but judging by the Deblur log file, the run got stuck at the deblurring step. I tried different numbers of jobs, including 24, 3, and 7.
Please refer to the picture I have attached for details.

Hello!

DADA2 and Deblur are specific to amplicon data such as 16S rRNA gene amplicons and are not appropriate for shotgun data. Please take a look at the tutorial for metagenomic samples. Also, don't forget to install and use the qiime2-metagenome edition (it will be renamed to the qiime2-moshpit edition starting from the next release).

The workflow could be (just an example, it depends on your project and questions you want to answer):

  • QC
  • Host DNA removal
  • Taxonomy annotation
  • Assemblies
  • Binning
  • Function annotations

Then you can use the resulting data for diversity and statistical analyses.

Best,
Timur

Oh dear, I just realized that I forgot to attach the picture of my Deblur log.


Hi Timur, thank you so much for the reply. I will try it out and see how it goes!