DADA2: Mismatched forward and reverse reads

Hello there,

I have been hoping to use the DADA2 plugin for the filtering of my sequences and I keep hitting a wall in the process. I wanted to avoid having to only run my forward reads, and am hoping to be able to successfully merge my forward and reverse reads for further downstream analysis.

Here is a visualization of my forward and reverse sequences to give you an idea.

My reverse reads are not great (however, I have seen worse)...

The error returned to me when running the following command is:

> (qiime2-2019.4) d43-6:withnegs Admin$ qiime dada2 denoise-paired --i-demultiplexed-seqs demux.qza --o-table table-dada2.qza --o-representative-sequences rep-seqs-dada2.qza --o-denoising-stats stats-dada2.qza --p-trim-left-f 0 --p-trim-left-r 0 --p-trunc-len-f 200 --p-trunc-len-r 200 --p-max-ee 2 --p-n-threads 0 --verbose



Command: run_dada_paired.R /var/folders/s0/6_ml29fd0c1dt1617blzpn_00000gq/T/tmp_ct4mpss/forward /var/folders/s0/6_ml29fd0c1dt1617blzpn_00000gq/T/tmp_ct4mpss/reverse /var/folders/s0/6_ml29fd0c1dt1617blzpn_00000gq/T/tmp_ct4mpss/output.tsv.biom /var/folders/s0/6_ml29fd0c1dt1617blzpn_00000gq/T/tmp_ct4mpss/track.tsv /var/folders/s0/6_ml29fd0c1dt1617blzpn_00000gq/T/tmp_ct4mpss/filt_f /var/folders/s0/6_ml29fd0c1dt1617blzpn_00000gq/T/tmp_ct4mpss/filt_r 200 200 0 0 2.0 2 consensus 1.0 0 1000000

R version 3.5.1 (2018-07-02) 
Loading required package: Rcpp
DADA2: 1.10.0 / Rcpp: 1.0.1 / RcppParallel: 4.4.2 
1) Filtering Error in filterAndTrim(unfiltsF, filtsF, unfiltsR, filtsR, truncLen = c(truncLenF,  : 
  These are the errors (up to 5) encountered in individual cores...
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0,  : 
  Mismatched forward and reverse sequence files: 98, 97.
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0,  : 
  Mismatched forward and reverse sequence files: 98, 97.
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0,  : 
  Mismatched forward and reverse sequence files: 98, 97.
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0,  : 
  Mismatched forward and reverse sequence files: 20283, 97.
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0,  : 
  Mismatched forward and reverse sequence files: 98, 97.
Execution halted
Traceback (most recent call last):
  File "/Users/Admin/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 231, in denoise_paired
    run_commands([cmd])
  File "/Users/Admin/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 36, in run_commands
    subprocess.run(cmd, check=True)
  File "/Users/Admin/miniconda3/envs/qiime2-2019.4/lib/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['run_dada_paired.R', '/var/folders/s0/6_ml29fd0c1dt1617blzpn_00000gq/T/tmp_ct4mpss/forward', '/var/folders/s0/6_ml29fd0c1dt1617blzpn_00000gq/T/tmp_ct4mpss/reverse', '/var/folders/s0/6_ml29fd0c1dt1617blzpn_00000gq/T/tmp_ct4mpss/output.tsv.biom', '/var/folders/s0/6_ml29fd0c1dt1617blzpn_00000gq/T/tmp_ct4mpss/track.tsv', '/var/folders/s0/6_ml29fd0c1dt1617blzpn_00000gq/T/tmp_ct4mpss/filt_f', '/var/folders/s0/6_ml29fd0c1dt1617blzpn_00000gq/T/tmp_ct4mpss/filt_r', '200', '200', '0', '0', '2.0', '2', 'consensus', '1.0', '0', '1000000']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/Admin/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/q2cli/commands.py", line 311, in __call__
    results = action(**arguments)
  File "</Users/Admin/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/decorator.py:decorator-gen-451>", line 2, in denoise_paired
  File "/Users/Admin/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
    output_types, provenance)
  File "/Users/Admin/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/qiime2/sdk/action.py", line 365, in _callable_executor_
    output_views = self._callable(**view_args)
  File "/Users/Admin/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 246, in denoise_paired
    " and stderr to learn more." % e.returncode)
Exception: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

**Plugin error from dada2:**
**An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.**
**See above for debug info.**

I have also tried different truncation and trimming lengths, and each time the same sequence files are reported as mismatched.

My first steps in trying to find the source of the mismatched reads included:

  1. Double checked my manifest file to see that every file name was written correctly with both a forward and reverse read (all good)
    Metadata.csv (10.0 KB)

  2. Checking if my demux file and import worked ok:

(qiime2-2019.4) d43-6:import-check Admin$ qiime tools validate demux.qza
Result demux.qza appears to be valid at level=max
  3. I then ran the following command to compare the forward and reverse reads' line counts:
for f in *.fastq; do r1=$(wc -l < $f | tr -d '[:space:]'); r2=$(wc -l < ../r2/$f | tr -d '[:space:]'); echo $r1 $r2 $f; done

This is the part where I think maybe the command did not work correctly, or my files are a bit weird. The output basically shows that every forward and reverse read has different line counts... here are three for an example. These numbers also seem incorrect.

4977835 Aug 12  2017 Liz11-L127_S11_L001_R1_001.fastq.gz
6667502 Aug 12  2017 Liz11-L127_S11_L001_R2_001.fastq.gz

5059016 Aug 12  2017 Liz12-L128_S12_L001_R1_001.fastq.gz
6920420 Aug 12  2017 Liz12-L128_S12_L001_R2_001.fastq.gz

3722668 Aug 12  2017 Liz13-L130_S13_L001_R1_001.fastq.gz
5360458 Aug 12  2017 Liz13-L130_S13_L001_R2_001.fastq.gz

Please let me know if more clarification is needed on what I included. I am basically wondering if this is happening because my reverse reads are way worse than my forward reads and if the overall quality of these samples in general is creating the problem. My collaborators have run single-end on these samples through QIIME1 and I would ideally like to be more conservative by merging both if possible... even if that means dropping a few samples.

I think it is also worth mentioning that I have run through the rest of the QIIME2 tutorial with both the forward and reverse reads using Deblur and had no problems using that filtering step over DADA2.

Any guidance on next steps would be greatly appreciated!

All the Best,
Tabor

Bingo! The differing read counts are what dada2 is complaining about when it says "Mismatched forward and reverse sequence files".

This error has been reported by many on the forum, and you can read the archive to see the source and solution to this problem:
https://forum.qiime2.org/search?q=Mismatched%20forward%20and%20reverse%20sequence%20files

Chances are your sequences went through some sort of initial quality filtering outside of QIIME 2, which led to some sequences being dropped in one read or the other, causing this mismatch. Is this true? Or do you have any other ideas for the discrepancy?

It looks like the validation ensures that matching file pairs exist, but it does not ensure that paired reads exist within those files. So your wc test is more diagnostic of the problem here.
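One way to run that check on gzipped files is to decompress before counting, because `wc -l` applied directly to a `.fastq.gz` file counts newlines in the compressed bytes rather than reads. The sketch below is a minimal example and not part of QIIME 2: `count_reads` and `compare_pairs` are hypothetical helper names, and it assumes four-line FASTQ records and Illumina-style file names where the two mates sit in the same directory and differ only by `_R1_` / `_R2_`.

```shell
# Count the reads in one gzipped FASTQ file.
# Assumes the standard 4 lines per record (header, sequence, '+', quality).
count_reads() (
    lines=$(gzip -cd -- "$1" | wc -l | tr -d '[:space:]')
    echo $(( lines / 4 ))
)

# Print "R1_count R2_count STATUS R1_file_name" for every pair in a directory.
# Assumption: the R2 file has the same name as R1 with _R1_ replaced by _R2_.
compare_pairs() (
    dir=$1
    for r1 in "$dir"/*_R1_*.fastq.gz; do
        [ -e "$r1" ] || continue    # the glob matched nothing
        name=$(basename "$r1")
        r2=$dir/$(printf '%s\n' "$name" | sed 's/_R1_/_R2_/')
        if [ ! -e "$r2" ]; then
            echo "NA NA MISSING_R2 $name"
            continue
        fi
        n1=$(count_reads "$r1")
        n2=$(count_reads "$r2")
        if [ "$n1" -eq "$n2" ]; then status=OK; else status=MISMATCH; fi
        echo "$n1 $n2 $status $name"
    done
)
```

For example, `compare_pairs /path/to/fastq_dir` prints one line per sample. Equal counts do not prove that the mates are in the same order, but any `MISMATCH` line is consistent with the DADA2 error above.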

My advice:

  1. If you know, or can figure out, why the read counts differ (e.g., some pre-QIIME 2 processing step that may have dropped reads; if you know of anything, let's discuss), then go back to the rawest form of the data that you can.
  2. If not, then you are faced with a choice: figure out what went wrong, proceed with only the forward reads (after all, the reverse reads are not great), or attempt to correct the mismatch.
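Before choosing among these options, it can help to see which reads are actually unpaired. The following is a rough diagnostic sketch, not a repair tool and not a QIIME 2 or DADA2 command: `read_ids` and `pair_id_summary` are hypothetical helper names, and it assumes four-line FASTQ records in which the first whitespace-delimited token of each header is the read ID shared by both mates (an old-style `/1` or `/2` suffix is stripped).

```shell
# Print the sorted read IDs of a gzipped FASTQ file, one per line.
# Header lines are every 4th line starting at line 1 (4-line records assumed).
read_ids() {
    gzip -cd -- "$1" |
        awk 'NR % 4 == 1 { id = $1; sub(/\/[12]$/, "", id); print id }' |
        LC_ALL=C sort
}

# Report how many read IDs occur only in R1, only in R2, and in both files.
pair_id_summary() (
    a=$(mktemp) && b=$(mktemp) || exit 1
    read_ids "$1" > "$a"
    read_ids "$2" > "$b"
    only1=$(LC_ALL=C comm -23 "$a" "$b" | wc -l | tr -d '[:space:]')
    only2=$(LC_ALL=C comm -13 "$a" "$b" | wc -l | tr -d '[:space:]')
    both=$(LC_ALL=C comm -12 "$a" "$b" | wc -l | tr -d '[:space:]')
    rm -f "$a" "$b"
    echo "only_r1=$only1 only_r2=$only2 shared=$both"
)
```

Calling `pair_id_summary sample_R1.fastq.gz sample_R2.fastq.gz` (placeholder file names) prints a single summary line. A large `only_r1` or `only_r2` count suggests reads were dropped from one file independently of their mates, while a file that simply ends early points toward a truncated copy.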

Thank you for the reply!

It looks like these were the rawest form of the reads we received. The sequencing core they came from did demultiplex them before sending them our way, so the fact that they were already demultiplexed might be why I am running into this error.

Thank you for the help. I will keep trying to find the root of the problem, and if I can't, I can run the data through Deblur instead.

How did they demultiplex? That's most likely the root cause, e.g., if the demultiplexing step performed any type of QC, or if reads without a sample barcode were dropped independently of their pair.

All the samples have barcodes and the facility doesn’t do any type of filtering beforehand.

I am starting to think that the files were not downloaded completely, which would explain why the forward and reverse files all differ. Thanks for the help, and I will see if re-downloading fixes my error!
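A quick way to test for a truncated download is `gzip -t`, which checks the integrity of a compressed file without extracting it. A minimal sketch, with `check_gzip` as a hypothetical helper name:

```shell
# Test each gzipped file for truncation or corruption.
# Prints "OK <file>" or "CORRUPT <file>" and returns non-zero if any file fails.
check_gzip() (
    status=0
    for f in "$@"; do
        if gzip -t -- "$f" 2>/dev/null; then
            echo "OK $f"
        else
            echo "CORRUPT $f"
            status=1
        fi
    done
    exit "$status"
)
```

Running `check_gzip *.fastq.gz` in the download directory lists any file that ends prematurely. A file that passes could still differ from the original, so comparing checksums with the sequencing core, if they provide them, is a stronger check.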
