DADA2 - Filtering Error in filterAndTrim

Dear QIIME developers,
I am running a script to process 7 MiSeq runs using the approach explained in the FMT tutorial: https://docs.qiime2.org/2019.1/tutorials/fmt/

I first import and check the reads, which are already free of artificial sequences, using the following:

### Import sequences
# The input directory (${IN}) contains only properly formatted sequence files,
# organised in the following run directories

RUN1=BP
RUN2=CP 
RUN3=DD
RUN4=IB
RUN5=II
RUN6=LU
RUN7=MC

for seqs in ${RUN1} ${RUN2} ${RUN3} ${RUN4} ${RUN5} ${RUN6} ${RUN7}
do
qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' \
                   --input-path ${IN}/${seqs} \
                   --output-path ${IN}/${seqs}_reads.qza \
                   --input-format CasavaOneEightSingleLanePerSampleDirFmt

# Check this artifact to make sure that QIIME now recognizes your data
qiime tools peek ${IN}/${seqs}_reads.qza

### 'Initial' sequence quality control
qiime demux summarize \
  --i-data ${IN}/${seqs}_reads.qza  \
  --o-visualization ${IN}/${seqs}_reads.qzv  \
  --verbose
done
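
A quick pre-import check can catch badly named files before `qiime tools import` does. The sketch below assumes the usual Casava 1.8 layout, `<sample>_<barcode>_L<lane>_R<1|2>_001.fastq.gz`. The pattern is an approximation of what the importer accepts, and the helper names `is_casava_name` and `list_bad_names` are hypothetical, not part of QIIME 2.

```shell
# Approximate Casava 1.8 single-lane-per-sample file name pattern (assumption):
# <sample>_<barcode>_L<lane>_R<1|2>_001.fastq.gz
casava_pattern='^.+_.+_L[0-9]{3}_R[12]_001\.fastq\.gz$'

# Hypothetical helper: succeed if a file name matches the pattern.
is_casava_name() {
  printf '%s\n' "$1" | grep -Eq "$casava_pattern"
}

# Hypothetical helper: print every file in a run directory whose name
# does not match the pattern.
list_bad_names() {
  local dir="$1" f
  for f in "$dir"/*; do
    [ -e "$f" ] || continue   # empty directory: nothing to check
    is_casava_name "$(basename "$f")" || echo "$f"
  done
}
```

For example, `list_bad_names "${IN}/BP"` prints nothing when every file in the BP run directory follows the naming scheme.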

Then I use the following loop to run DADA2 independently on each run:

### DADA2 workflow
#https://github.com/LangilleLab/microbiome_helper/wiki/Amplicon-SOP-v2-(qiime2-2018.8)

for seqs in ${RUN1} ${RUN2} ${RUN3} ${RUN4} ${RUN5} ${RUN6} ${RUN7}
do
truncF=0
truncL=220
trimF=0
trimL=0
maxee=3
truncq=12
nreadslearn=1000000
chim=consensus


# -p avoids an error if the directory already exists from a previous attempt
mkdir -p dada2_output_${seqs}

qiime dada2 denoise-paired --i-demultiplexed-seqs ${IN}/${seqs}_reads.qza \
                           --p-trunc-len-f ${truncF} \
                           --p-trunc-len-r ${truncL} \
                           --p-trim-left-f ${trimF} \
                           --p-trim-left-r ${trimL} \
                           --p-max-ee ${maxee} \
                           --p-trunc-q ${truncq} \
                           --p-n-reads-learn ${nreadslearn} \
                           --p-n-threads ${NSLOTS} \
                           --p-chimera-method ${chim} \
                           --o-representative-sequences dada2_output_${seqs}/${seqs}_representative_sequences.qza \
                           --o-table dada2_output_${seqs}/${seqs}_table.qza \
                           --o-denoising-stats dada2_output_${seqs}/${seqs}_denoising_stats.qza \
                           --verbose

### Viewing denoising stats
# The visualization gets a .qzv extension so it is not confused with the .qza stats artifact
qiime metadata tabulate \
  --m-input-file dada2_output_${seqs}/${seqs}_denoising_stats.qza \
  --o-visualization dada2_output_${seqs}/${seqs}_denoising_stats.qzv
... 
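
When only one run out of several fails, it helps to keep a separate log per run and let the loop carry on. The following is a minimal sketch, not part of the original script: `run_logged` and the `LOG_DIR` variable are hypothetical names introduced here for illustration.

```shell
# Hypothetical helper: run a command, write its stdout and stderr to a
# per-run log file, and report the exit status without aborting the caller.
run_logged() {
  local name="$1" log rc
  shift
  log="${LOG_DIR:-logs}/${name}.log"   # LOG_DIR is an assumed variable
  mkdir -p "$(dirname "$log")"
  if "$@" >"$log" 2>&1; then
    echo "${name}: ok"
  else
    rc=$?   # exit status of the command that just failed
    echo "${name}: FAILED (exit ${rc}), see ${log}"
    return "$rc"
  fi
}
```

Inside the loop, the `qiime dada2 denoise-paired` call would then be prefixed with `run_logged "dada2_${seqs}"`, so that the R output for each run ends up in its own file.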

It works like a charm except for the first run, where I got this error:

DADA2 R package version: 1.6.0

  1. Filtering Error in filterAndTrim(unfiltsF, filtsF, unfiltsR, filtsR, minLen = 175, :
    These are the errors (up to 5) encountered in individual cores…
    Error in isFALSE(simplify) : could not find function “isFALSE”
    Error in isFALSE(simplify) : could not find function “isFALSE”
    Error in isFALSE(simplify) : could not find function “isFALSE”
    Error in isFALSE(simplify) : could not find function “isFALSE”
    Error in isFALSE(simplify) : could not find function “isFALSE”
    Execution halted
    Traceback (most recent call last):
    File “/homedir/constancias/.conda/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_dada2/_denoise.py”, line 231, in denoise_paired
    run_commands([cmd])
    File “/homedir/constancias/.conda/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_dada2/_denoise.py”, line 36, in run_commands
    subprocess.run(cmd, check=True)
    File “/homedir/constancias/.conda/envs/qiime2-2019.1/lib/python3.6/subprocess.py”, line 418, in run
    output=stdout, stderr=stderr)
    subprocess.CalledProcessError: Command ‘[‘run_dada_paired.R’, ‘/tmp/7700394.1.long.q/tmp96myw46k/forward’, ‘/tmp/7700394.1.long.q/tmp96myw46k/reverse’, ‘/tmp/7700394.1.long.q/tmp96myw46k/output.tsv.biom’, ‘/tmp/7700394.1.long.q/tmp96myw46k/track.tsv’, ‘/tmp/7700394.1.long.q/tmp96myw46k/filt_f’, ‘/tmp/7700394.1.long.q/tmp96myw46k/filt_r’, ‘0’, ‘220’, ‘0’, ‘0’, ‘3.0’, ‘12’, ‘consensus’, ‘1.0’, ‘25’, ‘1000000’]’ returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File “/homedir/constancias/.conda/envs/qiime2-2019.1/lib/python3.6/site-packages/q2cli/commands.py”, line 274, in call
results = action(**arguments)
File “</homedir/constancias/.conda/envs/qiime2-2019.1/lib/python3.6/site-packages/decorator.py:decorator-gen-442>”, line 2, in denoise_paired
File “/homedir/constancias/.conda/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py”, line 231, in bound_callable
output_types, provenance)
File “/homedir/constancias/.conda/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py”, line 365, in callable_executor
output_views = self._callable(**view_args)
File “/homedir/constancias/.conda/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_dada2/_denoise.py”, line 246, in denoise_paired
" and stderr to learn more." % e.returncode)
Exception: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

Plugin error from dada2:

An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

See above for debug info.
Running external command line application(s). This may print messages to stdout and/or stderr.
The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.

Command: run_dada_paired.R /tmp/7700394.1.long.q/tmp96myw46k/forward /tmp/7700394.1.long.q/tmp96myw46k/reverse /tmp/7700394.1.long.q/tmp96myw46k/output.tsv.biom /tmp/7700394.1.long.q/tmp96myw46k/track.tsv /tmp/7700394.1.long.q/tmp96myw46k/filt_f /tmp/7700394.1.long.q/tmp96myw46k/filt_r 0 220 0 0 3.0 12 consensus 1.0 25 1000000

Before running this, I modified the run_dada_paired.R script to include the following parameters:
minLen = 175, maxN = 0 in filterAndTrim
MAX_CONSIST=20 in dada(derepRs, selfConsist=TRUE,

Do you have any idea how to solve this issue?

Thanks a ton

Yikes --- I suspect the modification to run_dada_paired.R is the problem. Editing the installed script is probably not ideal in terms of reproducibility --- perhaps you are better off just running dada2 directly in this case? You already have it in your QIIME 2 environment.

Care to share the changes you made?

These are the changes I have made.

minLen = 175, maxN = 0 in filterAndTrim function
MAX_CONSIST=20 in dada function (derepRs, selfConsist=TRUE,

I think I have identified the issue: I was trying to update the dada2 R package in my qiime2 conda environment while the job was running. I have reinstalled the qiime2 conda environment and it is working perfectly with the two changes I have made.

perhaps you are better off just running dada2 directly in this case? You already have it in your QIIME 2 environment.

I am also exploring this option. I had trouble installing the latest version of dada2 (to use multithread = TRUE) in another conda environment on our cluster.

Are you planning to update R and the dada2 package in an upcoming release of qiime2?

Thanks

Sorry, I should've been more clear --- can you share the diff of the changes you made? As I am sure you are aware, declaring what you think you did is often quite different from what you actually did when it comes to programming (I am painfully aware of this, myself!).
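
One way to produce such a diff is to compare the edited script against an untouched copy, for example one taken from a freshly created environment. This is a sketch under that assumption. `script_diff` is a hypothetical helper and the pristine path in the usage comment is a placeholder.

```shell
# Hypothetical helper: print a unified diff between a pristine copy of a
# script and a locally modified copy. diff exits with 1 when the files
# differ, which is expected here, so only exit codes above 1 (real errors,
# such as a missing file) are passed on to the caller.
script_diff() {
  local original="$1" modified="$2" rc=0
  diff -u "$original" "$modified" || rc=$?
  [ "$rc" -le 1 ] || return "$rc"
}

# Usage (the first path is a placeholder for wherever the pristine copy lives):
#   script_diff /path/to/pristine/run_dada_paired.R "$(command -v run_dada_paired.R)"
```

Lines starting with `-` come from the pristine script and lines starting with `+` from the edited one, which shows exactly where `minLen`, `maxN` and `MAX_CONSIST` were added.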

Okay, well, as I said above, modifying these static scripts is probably not a very good idea; I would suggest not doing that in the future.

I suggest you file an issue at the official DADA2 issue tracker.

I am not sure if we are going to update R, but yes, the plan is to update DADA2 --- there have been a handful of things preventing this from happening sooner.

Thanks a lot for your reply.

I am not sure if we are going to update R

I meant installing the latest R version.

Me too! We are currently shipping R 3.4.1 with QIIME 2 2019.1. I am not sure if and when we might update that to a newer version.
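
As background for the original error: `isFALSE()` was added to base R in version 3.5.0, so R code that calls it fails on R 3.4.1. A package updated mid-job to a version that expects a newer R is one plausible way to hit this, although that is an inference and was not confirmed in this thread. The sketch below checks which case the active environment is in. `version_ge` and `check_r_for_isfalse` are hypothetical helper names, and `sort -V` assumes GNU coreutils.

```shell
# Succeed if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort, GNU coreutils).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Report whether the R on PATH is new enough to provide isFALSE().
check_r_for_isfalse() {
  local r_version
  if ! command -v Rscript >/dev/null 2>&1; then
    echo "Rscript not found on PATH"
    return 2
  fi
  r_version=$(Rscript -e 'cat(as.character(getRversion()))')
  if version_ge "$r_version" 3.5.0; then
    echo "R ${r_version}: isFALSE() is available"
  else
    echo "R ${r_version}: isFALSE() is not in base R"
    return 1
  fi
}
```

Running `check_r_for_isfalse` with the QIIME 2 conda environment activated, before submitting a long job, gives a quick indication of whether the R installation matches what the installed packages expect.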