Error executing DADA2: Mismatched forward and reverse sequence

Greetings to all,

I am trying to analyze a set of 30 paired-end samples with QIIME 2, and I get an error when I try to run DADA2. I have searched the forum for this issue and have already applied the recommendations I found. Below I describe the steps I followed and the error I get:

conda activate qiime2-2020.8

qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path manifest \
  --output-path paired-end-demux.qza \
  --input-format PairedEndFastqManifestPhred33V2

I successfully imported my data into a qza with my manifest file:
manifest.txt (4.3 KB)

qiime demux summarize \
  --i-data paired-end-demux.qza \
  --o-visualization demux.qzv
demux.qzv (312.7 KB)

I ran DADA2:

qiime dada2 denoise-paired \
  --i-demultiplexed-seqs paired-end-demux.qza \
  --p-trim-left-f 10 \
  --p-trim-left-r 10 \
  --p-trunc-len-f 300 \
  --p-trunc-len-r 300 \
  --o-representative-sequences rep-seqs-dada2.qza \
  --o-table pet-table.qza \
  --p-n-threads 1 \
  --o-denoising-stats denoising-stats.qza

I obtained:

Running external command line application(s). This may print messages to stdout and/or stderr.
The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.

Command: run_dada_paired.R /tmp/tmpgoh5nhbn/forward /tmp/tmpgoh5nhbn/reverse /tmp/tmpgoh5nhbn/output.tsv.biom /tmp/tmpgoh5nhbn/track.tsv /tmp/tmpgoh5nhbn/filt_f /tmp/tmpgoh5nhbn/filt_r 300 300 10 10 2.0 2.0 2 independent consensus 1.0 1 1000000

R version 3.5.1 (2018-07-02)
Loading required package: Rcpp
DADA2: 1.10.0 / Rcpp: 1.0.4.6 / RcppParallel: 5.0.0

  1. Filtering Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0, :
    Mismatched forward and reverse sequence files: 100000, 56391.
    Execution halted
    Traceback (most recent call last):
    File "/home/noe/miniconda3/envs/qiime2-2020.8/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 264, in denoise_paired
    run_commands([cmd])
    File "/home/noe/miniconda3/envs/qiime2-2020.8/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 36, in run_commands
    subprocess.run(cmd, check=True)
    File "/home/noe/miniconda3/envs/qiime2-2020.8/lib/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
    subprocess.CalledProcessError: Command '['run_dada_paired.R', '/tmp/tmpgoh5nhbn/forward', '/tmp/tmpgoh5nhbn/reverse', '/tmp/tmpgoh5nhbn/output.tsv.biom', '/tmp/tmpgoh5nhbn/track.tsv', '/tmp/tmpgoh5nhbn/filt_f', '/tmp/tmpgoh5nhbn/filt_r', '300', '300', '10', '10', '2.0', '2.0', '2', 'independent', 'consensus', '1.0', '1', '1000000']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/noe/miniconda3/envs/qiime2-2020.8/lib/python3.6/site-packages/q2cli/commands.py", line 329, in call
results = action(**arguments)
File "", line 2, in denoise_paired
File "/home/noe/miniconda3/envs/qiime2-2020.8/lib/python3.6/site-packages/qiime2/sdk/action.py", line 245, in bound_callable
output_types, provenance)
File "/home/noe/miniconda3/envs/qiime2-2020.8/lib/python3.6/site-packages/qiime2/sdk/action.py", line 390, in callable_executor
output_views = self._callable(**view_args)
File "/home/noe/miniconda3/envs/qiime2-2020.8/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 279, in denoise_paired
" and stderr to learn more." % e.returncode)
Exception: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

Seeing the mismatch error I checked the forum and took the following actions:

  1. Validated my .qza

qiime tools validate paired-end-demux.qza

paired-end-demux.qza appears to be valid at level=max.

  2. Checked all of my paths; they were fine.

  3. Checked all my sequence files for mismatched entries per FASTQ file with the command:

for f in *.fastq; do r=$(wc -l < "$f" | tr -d '[:space:]'); echo "$r $f"; done
4005172 concatenated_ET1_20C_48_Copro_1_1.fastq
4005172 concatenated_ET1_20C_48_Copro_1_2.fastq
4559244 concatenated_ET1_4C_120_copro_1_1.fastq
4559244 concatenated_ET1_4C_120_copro_1_2.fastq
4106648 concatenated_ET1_4C_120_Gut_1_1.fastq
4106648 concatenated_ET1_4C_120_Gut_1_2.fastq
4446828 concatenated_ET1_4C_48_Copro_1_1.fastq
4446828 concatenated_ET1_4C_48_Copro_1_2.fastq
3769360 concatenated_ET2_20C_48_Copro_1_1.fastq
3769360 concatenated_ET2_20C_48_Copro_1_2.fastq
4257512 concatenated_ET2_RT_120_Gut_1_1.fastq
4257512 concatenated_ET2_RT_120_Gut_1_2.fastq
3814544 concatenated_ET3_4C_48_Copro_1_1.fastq
3814544 concatenated_ET3_4C_48_Copro_1_2.fastq
4149904 concatenated_ET3_4C_48_Gut_1_1.fastq
4149904 concatenated_ET3_4C_48_Gut_1_2.fastq
3725520 concatenated_ET3_RT_48_Gut_1_1.fastq
3725520 concatenated_ET3_RT_48_Gut_1_2.fastq
293860 NG-25787_V3V4a_ET1_20C_48_Gut_lib417413_6977_1_1.fastq
293860 NG-25787_V3V4a_ET1_20C_48_Gut_lib417413_6977_1_2.fastq
252796 NG-25787_V3V4a_ET1_4C_48_Gut_lib417411_6977_1_1.fastq
252796 NG-25787_V3V4a_ET1_4C_48_Gut_lib417411_6977_1_2.fastq
291536 NG-25787_V3V4a_ET1_RT_120_copro_lib417416_6977_1_1.fastq
291536 NG-25787_V3V4a_ET1_RT_120_copro_lib417416_6977_1_2.fastq
261080 NG-25787_V3V4a_ET1_RT_120_Gut_lib417415_6977_1_1.fastq
261080 NG-25787_V3V4a_ET1_RT_120_Gut_lib417415_6977_1_2.fastq
225564 NG-25787_V3V4a_ET1_RT_48_copro_lib417410_6977_1_2.fastq
3842916 NG-25787_V3V4a_ET1_RT_48_copro_lib417410_6999_1_1.fastq
278916 NG-25787_V3V4a_ET1_RT_48_Gut_lib417409_6977_1_1.fastq
278916 NG-25787_V3V4a_ET1_RT_48_Gut_lib417409_6977_1_2.fastq
269384 NG-25787_V3V4a_ET2_20C_48_Gut_lib417423_6977_1_1.fastq
269384 NG-25787_V3V4a_ET2_20C_48_Gut_lib417423_6977_1_2.fastq
345760 NG-25787_V3V4a_ET2_4C_120_copro_lib417428_6977_1_1.fastq
345760 NG-25787_V3V4a_ET2_4C_120_copro_lib417428_6977_1_2.fastq
280564 NG-25787_V3V4a_ET2_4C_120_Gut_lib417427_6977_1_1.fastq
280564 NG-25787_V3V4a_ET2_4C_120_Gut_lib417427_6977_1_2.fastq
275408 NG-25787_V3V4a_ET2_4C_48_Copro_lib417422_6977_1_1.fastq
275408 NG-25787_V3V4a_ET2_4C_48_Copro_lib417422_6977_1_2.fastq
372664 NG-25787_V3V4a_ET2_4C_48_Gut_lib417421_6977_1_1.fastq
372664 NG-25787_V3V4a_ET2_4C_48_Gut_lib417421_6977_1_2.fastq
337712 NG-25787_V3V4a_ET2_RT_120_copro_lib417426_6977_1_1.fastq
337712 NG-25787_V3V4a_ET2_RT_120_copro_lib417426_6977_1_2.fastq
230600 NG-25787_V3V4a_ET2_RT_120_Gut_lib417425_6977_1_1.fastq
230600 NG-25787_V3V4a_ET2_RT_120_Gut_lib417425_6977_1_2.fastq
4026912 NG-25787_V3V4a_ET2_RT_120_Gut_lib417425_6999_1_1.fastq
4026912 NG-25787_V3V4a_ET2_RT_120_Gut_lib417425_6999_1_2.fastq
306068 NG-25787_V3V4a_ET2_RT_48_copro_lib417420_6977_1_1.fastq
306068 NG-25787_V3V4a_ET2_RT_48_copro_lib417420_6977_1_2.fastq
266856 NG-25787_V3V4a_ET2_RT_48_Gut_lib417419_6977_1_1.fastq
266856 NG-25787_V3V4a_ET2_RT_48_Gut_lib417419_6977_1_2.fastq
277256 NG-25787_V3V4a_ET3_20C_48_Copro_lib417434_6977_1_1.fastq
277256 NG-25787_V3V4a_ET3_20C_48_Copro_lib417434_6977_1_2.fastq
263972 NG-25787_V3V4a_ET3_20C_48_Gut_lib417433_6977_1_1.fastq
263972 NG-25787_V3V4a_ET3_20C_48_Gut_lib417433_6977_1_2.fastq
263700 NG-25787_V3V4a_ET3_4C_120_copro_lib417438_6977_1_1.fastq
263700 NG-25787_V3V4a_ET3_4C_120_copro_lib417438_6977_1_2.fastq
270972 NG-25787_V3V4a_ET3_4C_120_Gut_lib417437_6977_1_1.fastq
270972 NG-25787_V3V4a_ET3_4C_120_Gut_lib417437_6977_1_2.fastq
252164 NG-25787_V3V4a_ET3_RT_120_copro_lib417436_6977_1_1.fastq
252164 NG-25787_V3V4a_ET3_RT_120_copro_lib417436_6977_1_2.fastq
283676 NG-25787_V3V4a_ET3_RT_120_Gut_lib417435_6977_1_1.fastq
283676 NG-25787_V3V4a_ET3_RT_120_Gut_lib417435_6977_1_2.fastq
280092 NG-25787_V3V4a_ET3_RT_48_copro_lib417430_6977_1_1.fastq
280092 NG-25787_V3V4a_ET3_RT_48_copro_lib417430_6977_1_2.fastq

All the paired samples had the same counts

  4. I have already renamed the files in the manifest file to avoid underscores, as suggested in another post.

  5. I should also mention that my first step was to concatenate the forward files and the reverse files for some samples, because the raw files I received were split in some cases.
    I used the commands:

cat sample1_forward_file_1.fastq sample1_forward_file_2.fastq > concatenated_sample1_forward_file.fastq

cat sample1_reverse_file_1.fastq sample1_reverse_file_2.fastq > concatenated_sample1_reverse_file.fastq

I tested using only one concatenated sample and DADA2 ran fine, so I ruled out the concatenation as the cause of the problem.
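For anyone repeating this concatenation step, here is a minimal sketch of a safer way to do it. It is only an illustration, not part of the original workflow: the function name `concat_pair` is hypothetical, and it assumes the `_1.fastq` / `_2.fastq` suffix convention visible in the file listing above. The idea is to derive each reverse part from its forward part, so both outputs are built from the same parts in the same order, and then compare line counts.

```shell
# Hypothetical helper: concatenate split forward parts and their mates.
# Usage: concat_pair OUT_PREFIX part_a_1.fastq part_b_1.fastq ...
# Assumes each forward part X_1.fastq has a mate named X_2.fastq.
concat_pair() {
    local out_prefix="$1" fwd rev
    shift
    # Check every mate exists before writing anything.
    for fwd in "$@"; do
        rev="${fwd%_1.fastq}_2.fastq"
        if [ ! -e "$fwd" ] || [ ! -e "$rev" ]; then
            echo "missing file in pair: $fwd / $rev" >&2
            return 1
        fi
    done
    # Start from empty outputs, then append the parts in the given order.
    : > "${out_prefix}_1.fastq"
    : > "${out_prefix}_2.fastq"
    for fwd in "$@"; do
        rev="${fwd%_1.fastq}_2.fastq"
        cat "$fwd" >> "${out_prefix}_1.fastq"
        cat "$rev" >> "${out_prefix}_2.fastq"
    done
    # Succeed only if both outputs have the same number of lines.
    [ $(( $(wc -l < "${out_prefix}_1.fastq") )) -eq \
      $(( $(wc -l < "${out_prefix}_2.fastq") )) ]
}
```

A call such as `concat_pair concatenated_sample1 partA_1.fastq partB_1.fastq` (placeholder file names) would write `concatenated_sample1_1.fastq` and `concatenated_sample1_2.fastq`, and return a non-zero status if a mate is missing or the line counts differ.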

Any help would be appreciated, thank you

Hi @LuciaGG!

Thanks for the detailed information. Let's start by taking a peek at the demux.qzv that you attached:

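Sample ET1-RT-48-copro doesn't have matching read counts. The output of the bash one-liner you shared tells a similar story: the two files listed for that sample have different line counts (225564 for NG-25787_V3V4a_ET1_RT_48_copro_lib417410_6977_1_2.fastq versus 3842916 for NG-25787_V3V4a_ET1_RT_48_copro_lib417410_6999_1_1.fastq), and their names don't pair up.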

I would suggest double-checking that you transferred the files completely. As well, your concatenation step might have an issue. If that doesn't reveal anything, I suggest contacting your sequencing center.
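A mismatch like this is easy to miss by eye in a long listing, so here is a rough sketch of an automated check. It is an illustration under assumptions, not an official QIIME 2 tool: the function name `check_pairs` is made up, it assumes uncompressed four-line FASTQ records, and it relies on the `_1.fastq` / `_2.fastq` suffix convention from the listing above. It reports any forward file whose mate is missing or has a different read count, and any reverse file with no forward mate.

```shell
# Hypothetical helper: flag unpaired or mismatched FASTQ files in a directory.
# Usage: check_pairs DIRECTORY
check_pairs() {
    local dir="$1" fwd rev n_fwd n_rev status=0
    for fwd in "$dir"/*_1.fastq; do
        [ -e "$fwd" ] || continue            # glob matched nothing
        rev="${fwd%_1.fastq}_2.fastq"
        if [ ! -e "$rev" ]; then
            echo "MISSING_REVERSE $(basename "$fwd")"
            status=1
            continue
        fi
        # Four lines per record, so reads = lines / 4.
        n_fwd=$(( $(wc -l < "$fwd") / 4 ))
        n_rev=$(( $(wc -l < "$rev") / 4 ))
        if [ "$n_fwd" -ne "$n_rev" ]; then
            echo "MISMATCH $(basename "$fwd") $n_fwd $n_rev"
            status=1
        fi
    done
    for rev in "$dir"/*_2.fastq; do
        [ -e "$rev" ] || continue
        fwd="${rev%_2.fastq}_1.fastq"
        if [ ! -e "$fwd" ]; then
            echo "MISSING_FORWARD $(basename "$rev")"
            status=1
        fi
    done
    return "$status"
}
```

Running `check_pairs .` in the directory with the FASTQ files prints nothing and returns 0 when every file has a mate with the same number of reads. Note that it only compares counts; it does not verify that read IDs appear in the same order in both files.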

@thermokarst thank you! I ran DADA2 successfully; my mistake was not concatenating the files for sample ET1-RT-48-copro properly.
