Error raised during denoising step with DADA2

Hi,

I’ve been trying to run a few paired-end fastq samples (n=3) with the DADA2 plugin:

qiime dada2 denoise-paired --i-demultiplexed-seqs demux-paired-end_OSD14.qza --p-trunc-len-f 250 --p-trunc-len-r 250 --p-trim-left-f 9 --p-trim-left-r 9 --p-n-reads-learn 200000 --p-n-threads 4 --o-representative-sequences rep-seqs-dada2_OSD14.qza --o-table table-dada2_OSD14.qza --output-dir dada2-output_OSD14

However, during this step I got the following error:

Command: run_dada_paired.R /var/folders/mn/y4mzz7jn2fv8203nf44h5df80000gn/T/tmp9gv9n6sf/forward /var/folders/mn/y4mzz7jn2fv8203nf44h5df80000gn/T/tmp9gv9n6sf/reverse /var/folders/mn/y4mzz7jn2fv8203nf44h5df80000gn/T/tmp9gv9n6sf/output.tsv.biom /var/folders/mn/y4mzz7jn2fv8203nf44h5df80000gn/T/tmp9gv9n6sf/track.tsv /var/folders/mn/y4mzz7jn2fv8203nf44h5df80000gn/T/tmp9gv9n6sf/filt_f /var/folders/mn/y4mzz7jn2fv8203nf44h5df80000gn/T/tmp9gv9n6sf/filt_r 250 250 9 9 2.0 2 consensus 1.0 4 200000

R version 3.4.1 (2017-06-30)
Loading required package: Rcpp
DADA2 R package version: 1.6.0

  1. Filtering …
  2. Learning Error Rates
    2a) Forward Reads
    Initializing error rates to maximum possible estimate.
    Sample 1 - 197259 reads in 197190 unique sequences.
    Sample 2 - 152345 reads in 152309 unique sequences.
    selfConsist step 2
    selfConsist step 3
    Convergence after 3 rounds.
    2b) Reverse Reads
    Initializing error rates to maximum possible estimate.
    Error rates could not be estimated.
    Error in err[c(1, 6, 11, 16), ] <- 1 :
    incorrect number of subscripts on matrix
    Calls: dada
    Execution halted
    Traceback (most recent call last):
    File “/Users/arenha/miniconda3/envs/qiime2-2018.4/lib/python3.5/site-packages/q2_dada2/_denoise.py”, line 229, in denoise_paired
    run_commands([cmd])
    File “/Users/arenha/miniconda3/envs/qiime2-2018.4/lib/python3.5/site-packages/q2_dada2/_denoise.py”, line 36, in run_commands
    subprocess.run(cmd, check=True)
    File “/Users/arenha/miniconda3/envs/qiime2-2018.4/lib/python3.5/subprocess.py”, line 398, in run
    output=stdout, stderr=stderr)
    subprocess.CalledProcessError: Command ‘[‘run_dada_paired.R’, ‘/var/folders/mn/y4mzz7jn2fv8203nf44h5df80000gn/T/tmp9gv9n6sf/forward’, ‘/var/folders/mn/y4mzz7jn2fv8203nf44h5df80000gn/T/tmp9gv9n6sf/reverse’, ‘/var/folders/mn/y4mzz7jn2fv8203nf44h5df80000gn/T/tmp9gv9n6sf/output.tsv.biom’, ‘/var/folders/mn/y4mzz7jn2fv8203nf44h5df80000gn/T/tmp9gv9n6sf/track.tsv’, ‘/var/folders/mn/y4mzz7jn2fv8203nf44h5df80000gn/T/tmp9gv9n6sf/filt_f’, ‘/var/folders/mn/y4mzz7jn2fv8203nf44h5df80000gn/T/tmp9gv9n6sf/filt_r’, ‘250’, ‘250’, ‘9’, ‘9’, ‘2.0’, ‘2’, ‘consensus’, ‘1.0’, ‘4’, ‘200000’]’ returned non-zero exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File “/Users/arenha/miniconda3/envs/qiime2-2018.4/lib/python3.5/site-packages/q2cli/commands.py”, line 274, in call
results = action(**arguments)
File “”, line 2, in denoise_paired
File “/Users/arenha/miniconda3/envs/qiime2-2018.4/lib/python3.5/site-packages/qiime2/sdk/action.py”, line 231, in bound_callable
output_types, provenance)
File “/Users/arenha/miniconda3/envs/qiime2-2018.4/lib/python3.5/site-packages/qiime2/sdk/action.py”, line 366, in callable_executor
output_views = self._callable(**view_args)
File “/Users/arenha/miniconda3/envs/qiime2-2018.4/lib/python3.5/site-packages/q2_dada2/_denoise.py”, line 244, in denoise_paired
" and stderr to learn more." % e.returncode)
Exception: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

It seems that the reverse fastq files are causing this, but I have no idea why.
Thanks in advance,
António

Hi @antonioggsousa,
Could you share the quality score profiles for your forward and reverse reads? Are the Q scores by any chance fake? We have seen this same error message occur when fake Q scores are used. See here:

Thanks!
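
For anyone who wants to screen for fake Q scores before sharing plots, here is a minimal Python sketch. It assumes that "fake" means every base carries the same placeholder quality character, and that the FASTQ uses plain four-line records. The function names are hypothetical and not part of QIIME 2 or DADA2.

```python
import io
from collections import Counter


def quality_char_counts(fastq_handle, max_reads=10000):
    """Count quality characters over the first max_reads four-line FASTQ records."""
    counts = Counter()
    for i, line in enumerate(fastq_handle):
        if i // 4 >= max_reads:
            break
        if i % 4 == 3:  # the fourth line of each record holds the quality string
            counts.update(line.rstrip("\n"))
    return counts


def looks_like_fake_quality(counts):
    """Heuristic (an assumption): one distinct quality value suggests placeholder scores."""
    return len(counts) == 1


# Tiny in-memory example; for a real file, pass an open text handle instead
# (for instance one returned by gzip.open(path, "rt")).
example = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGA\n+\nIIII\n")
print(looks_like_fake_quality(quality_char_counts(example)))
```

A constant quality string is only one way scores can be fake, so a negative result here does not rule it out.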

Hi @Nicholas_Bokulich,
thank you for your quick reply.

Yes, I can share the Q score plots produced by QIIME2 (see below).

I'm trying to put together a QIIME2 tutorial for people who work in the same lab as me, using fastq files retrieved from the ENA repository. I downloaded the fastq files over FTP. I do not think there is any chance of the Q scores being fake. The ENA accessions of the fastq files I'm using are: ERR770975, ERR770976, ERR771093.

To check whether the problem was with the installation or the DADA2 version, I ran the “Atacama soil microbiome” tutorial, available on the QIIME2 website. It produced results and seems to work. So the problem must lie with these samples.

Maybe I need to change the samples or use samples from a different project.

Thanks in advance,
António

Taking a quick look at the sample accession, I think this is shotgun metagenomics data?

If so that’s the problem. DADA2 is designed for amplicon data, not shotgun data, and is likely to break in various ways due to its assumptions being violated.
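
One hint is already visible in the log above: almost every read is its own unique sequence, whereas amplicon data usually contain many duplicate reads. Below is a minimal Python sketch that pulls those numbers out of the dereplication lines. The regular expression matches the log format shown in this thread, and the 0.9 cutoff is an arbitrary illustrative threshold, not a DADA2 setting.

```python
import re

# Matches lines such as "Sample 1 - 197259 reads in 197190 unique sequences."
DEREP_LINE = re.compile(r"Sample (\d+) - (\d+) reads in (\d+) unique sequences")


def unique_fractions(log_text):
    """Return {sample_number: unique_sequences / reads} for each dereplication line."""
    fractions = {}
    for match in DEREP_LINE.finditer(log_text):
        sample, reads, uniques = (int(group) for group in match.groups())
        fractions[sample] = uniques / reads
    return fractions


log_text = """
Sample 1 - 197259 reads in 197190 unique sequences.
Sample 2 - 152345 reads in 152309 unique sequences.
"""

for sample, fraction in unique_fractions(log_text).items():
    # Assumed cutoff for illustration only.
    flag = "suspicious" if fraction > 0.9 else "ok"
    print(f"Sample {sample}: {fraction:.4f} unique ({flag})")
```

For the two samples in the log, more than 99.9% of the reads are unique. That fits shotgun reads drawn from across whole genomes rather than from a single amplified region.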

Is there a Q2 plugin for shotgun data processing now?

There is a metaphlan2 plugin for QIIME 2 that @antonioggsousa could try out. It is not currently a core QIIME 2 plugin, so I am not aware of its status.

@antonioggsousa please let us know if this is shotgun data. It would still be possible to process using external software and then import a biom table into QIIME 2 for downstream analysis.

Hi @benjjneb,
thank you for your quick reply.

Yes, you are correct: these samples are shotgun metagenomics data, not amplicon data as I had thought. I checked them again and they are indeed metagenomic. The problem was that I selected the samples by name and retrieved the fastq files without checking whether they were amplicon or metagenomic. The Ocean Sampling Day 2014 project (Project ERP009703 (PRJEB8682)) has both types of data, metagenomics and amplicon, submitted under the same project.
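
To avoid this kind of mix-up when a project holds both data types, the library strategy of each run can be checked before downloading. This is a minimal Python sketch, assuming the ENA Portal API file-report endpoint and the field names `library_strategy` and `library_source`. Verify both against the current ENA documentation before relying on them. The helper names are hypothetical.

```python
import csv
import io
from urllib.parse import urlencode

# Assumed endpoint of the ENA Portal API file report; verify before use.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession):
    """Build a file-report URL requesting library metadata for one accession."""
    query = urlencode({
        "accession": accession,
        "result": "read_run",
        "fields": "run_accession,library_strategy,library_source",  # assumed field names
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


def library_strategies(tsv_text):
    """Map run accession -> library strategy from the text of a TSV report."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row["run_accession"]: row["library_strategy"] for row in reader}


def non_amplicon_runs(strategies):
    """List runs whose library strategy is anything other than AMPLICON."""
    return sorted(run for run, strategy in strategies.items()
                  if strategy.upper() != "AMPLICON")


# Print the URLs to fetch (with a browser, curl, or urllib) for the runs in this thread.
for accession in ("ERR770975", "ERR770976", "ERR771093"):
    print(filereport_url(accession))
```

Passing the downloaded report text through `library_strategies` and then `non_amplicon_runs` gives the runs that should not go into DADA2.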

Thanks a lot @benjjneb, I would never have gotten there alone.
António
