Error was encountered while running DADA2 in R (return code 1)

Hello everyone,

I am trying to run DADA2. My command is:

qiime dada2 denoise-paired --i-demultiplexed-seqs DNA.qza --p-trunc-len-f 0 --p-trunc-len-r 240 --p-trim-left-f 18 --p-trim-left-r 20 --p-trunc-q 2 --p-n-threads 1 --o-table DNA-table.qza --o-representative-sequences DNA-rep-seqs.qza --o-denoising-stats DNA-denoisings-stats.qza

I’m getting the error below. How can I fix it? From DNA.qza, I obtained the profile shown in the picture. Are the truncation values I used correct? Thank you for your support.

(qiime2-amplicon-2024.5) login02 $ qiime dada2 denoise-paired --i-demultiplexed-seqs DNA.qza --p-trunc-len-f 0 --p-trunc-len-r 240 --p-trim-left-f 18 --p-trim-left-r 20 --p-trunc-q 2 --p-n-threads 1 --o-table DNA-table.qza --o-representative-sequences DNA-rep-seqs.qza --o-denoising-stats DNA-denoisings-stats.qza --verbose
Running external command line application(s). This may print messages to stdout and/or stderr.
The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.

Command: run_dada.R --input_directory /tmp/tmp4frygh15/forward --input_directory_reverse /tmp/tmp4frygh15/reverse --output_path /tmp/tmp4frygh15/output.tsv.biom --output_track /tmp/tmp4frygh15/track.tsv --filtered_directory /tmp/tmp4frygh15/filt_f --filtered_directory_reverse /tmp/tmp4frygh15/filt_r --truncation_length 0 --truncation_length_reverse 240 --trim_left 18 --trim_left_reverse 20 --max_expected_errors 2.0 --max_expected_errors_reverse 2.0 --truncation_quality_score 2 --min_overlap 12 --pooling_method independent --chimera_method consensus --min_parental_fold 1.0 --allow_one_off False --num_threads 1 --learn_min_reads 1000000

R version 4.3.3 (2024-02-29)
Loading required package: Rcpp
DADA2: 1.30.0 / Rcpp: 1.0.12 / RcppParallel: 5.1.6
2) Filtering
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0, :
Mismatched forward and reverse sequence files: 13324, 5856.
10: stop("Mismatched forward and reverse sequence files: ", length(fqF),

Hi @Zeina,

The root of the error is here:

Mismatched forward and reverse sequence files: 13324, 5856.

This can be caused by several things: the two paired files may contain different numbers of sequences, the sequences in one of the files may be out of order with respect to the other, and/or one or both files may have non-standard headers (causing the parser to "think" the files are mismatched).

Take a look at this related forum post; it has a few good suggestions for troubleshooting where the mismatch could be coming from. If you are still stuck, you can share your demux file and we can take a closer look.
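As a rough way to find which sample is affected, one could export the artifact (for example with qiime tools export) and compare the forward and reverse files directly. The sketch below assumes the exported files follow the usual _R1_001.fastq.gz / _R2_001.fastq.gz naming and have four lines per record; the function names (count_reads, read_ids, report_mismatches) are made up for illustration and are not part of QIIME 2 or DADA2.

```shell
# Count the reads in a gzipped FASTQ file (assumes 4 lines per record).
count_reads() {
  echo $(( $(gzip -cd "$1" | wc -l) / 4 ))
}

# Print one read ID per record: the first whitespace-delimited field of each
# header line, with any trailing /1 or /2 removed.
read_ids() {
  gzip -cd "$1" | awk 'NR % 4 == 1 { id = $1; sub(/\/[12]$/, "", id); print id }'
}

# For every *_R1_001.fastq.gz in directory $1, compare it with its R2 partner
# and print a tab-separated line for each pair that looks mismatched.
# The body runs in a subshell so its variables do not leak into the caller.
report_mismatches() (
  for fwd in "$1"/*_R1_001.fastq.gz; do
    [ -e "$fwd" ] || continue
    rev="${fwd%_R1_001.fastq.gz}_R2_001.fastq.gz"
    if [ ! -e "$rev" ]; then
      printf '%s\tno reverse file found\n' "$(basename "$fwd")"
      continue
    fi
    nf=$(count_reads "$fwd")
    nr=$(count_reads "$rev")
    if [ "$nf" -ne "$nr" ]; then
      printf '%s\tcount mismatch\t%s\t%s\n' "$(basename "$fwd")" "$nf" "$nr"
    elif [ "$(read_ids "$fwd" | cksum)" != "$(read_ids "$rev" | cksum)" ]; then
      printf '%s\tread IDs differ or are out of order\t%s\t%s\n' "$(basename "$fwd")" "$nf" "$nr"
    fi
  done
)
```

Running report_mismatches on the export directory prints nothing when every pair agrees. A pair with different counts is reported with its forward and reverse read counts, which should correspond to the two numbers in the DADA2 error message.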

Cheers :lizard:

Thank you for your reply! I just realized that I have one sample with different numbers of forward and reverse reads. How can I fix this difference? I’ve seen on the forum that some people use this command, but it didn’t work for me: https://forum.qiime2.org/t/dada2-error-return-code-1-mismatched-sequences/2770/3
$ qiime tools validate paired-end-demux.qza

How can I fix the problem? Thank you.

(qiime2-amplicon-2024.5) login02  $ qiime tools validate DNA.qza

(qiime2-amplicon-2024.5) login02  $ qiime dada2 denoise-paired \
--i-demultiplexed-seqs DNA.qza \
--p-trunc-len-f 0 --p-trunc-len-r 240 \
--p-trim-left-f 18 --p-trim-left-r 20 \
--p-trunc-q 5 --p-n-threads 1 \
--o-table DNA-table.qza --o-representative-sequences DNA-rep-seqs.qza \
--o-denoising-stats DNA-denoisings-stats.qza

Running external command line application(s). This may print messages to stdout and/or stderr.
The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.

Command: run_dada.R --input_directory /tmp/tmp1_4o0tfk/forward --input_directory_reverse /tmp/tmp1_4o0tfk/reverse --output_path /tmp/tmp1_4o0tfk/output.tsv.biom --output_track /tmp/tmp1_4o0tfk/track.tsv --filtered_directory /tmp/tmp1_4o0gth 0 --truncation_length_reverse 230 --trim_left 18 --trim_left_reverse 20 --max_expected_errors 2.0 --max_expected_errors_reverse 2.0 --truncation_quality_score 5 --min_overlap 12 --pooling_method independent --chimera_method consensus 1000000

R version 4.3.3 (2024-02-29)
Loading required package: Rcpp
DADA2: 1.30.0 / Rcpp: 1.0.12 / RcppParallel: 5.1.6
2) Filtering
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0, :
Mismatched forward and reverse sequence files: 13324, 5856.
10: stop("Mismatched forward and reverse sequence files: ", length(fqF),
", ", length(fqR), ".")
9: (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0,
0), maxLen = c(Inf, Inf), minLen = c(20, 20), trimLeft = c(0,
0), trimRight = c(0, 0), minQ = c(0, 0), maxEE = c(Inf, Inf),
rm.phix = c(TRUE, TRUE), rm.lowcomplex = c(0, 0), matchIDs = FALSE,
orient.fwd = NULL, id.sep = "\s", id.field = NULL, n = 1e+06,
OMP = TRUE, qualityType = "Auto", compress = TRUE, verbose = FALSE,

Hey @Zeina,

Thanks for following up with those details. This sample has drastically different forward and reverse read counts. In this case, I'd recommend reaching out to the sequencing center you received your raw data from to see if they're able to provide corrected read files for this particular sample. If that's not possible, it's probably best to drop this sample from your data. That said, if any @moderators have other suggestions for this, please feel free to :qiime2: in!

Hello!

If you just want to fix the sample by removing unmatched reads from the paired fastq files, check the “repair.sh” script from BBTools. I have used it a couple of times to fix similar issues.
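For reference, a repair.sh call on one exported sample might look roughly like the sketch below. The in1/in2/out1/out2/outs parameters follow my reading of the BBTools help text, so please check repair.sh's built-in help before relying on them. The wrapper function repair_pair, the REPAIR_CMD variable and the output file names are made up for illustration.

```shell
# Hypothetical wrapper: re-pair one sample's forward/reverse FASTQ files.
#   $1 = forward file, $2 = reverse file, $3 = output directory
# REPAIR_CMD is an illustrative override; it defaults to BBTools' repair.sh,
# which is assumed to be on PATH.
repair_pair() {
  mkdir -p "$3" || return 1
  # Matched pairs go to out1/out2; reads without a mate go to outs.
  "${REPAIR_CMD:-repair.sh}" \
    in1="$1" in2="$2" \
    out1="$3/$(basename "$1")" out2="$3/$(basename "$2")" \
    outs="$3/singletons.fastq.gz"
}
```

The repaired files would then need to be re-imported into a new artifact before rerunning DADA2.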

But I am also suspicious about that sample and, as @lizgehret suggested, would contact the sequencing center, or try to re-download it or demultiplex it again (if demultiplexing was not done by the sequencing center). Or just delete it if you can afford to, based on your sample size and experimental design.

I agree with @lizgehret and @timanix: the sample is suspicious, and I wouldn't want to proceed with it without getting more information about why.

I just want to :qiime2: in with one other comment re: the qiime tools validate command:

This isn't validating the data in the contained .fastq.gz files, but rather is validating that the .qza file itself is intact. This would catch, for example, if someone unzipped the .qza, modified some files, and rezipped it. So this command is almost certainly not relevant for the issue that you're trying to track down.

UPDATE: It turns out I was wrong about that; that command does validate the format of the data enclosed in the artifact. @Zeina, would you mind sharing the output you got when running your qiime tools validate command? I'm surprised that it didn't complain about this.
