Dada2 mismatch error after merging 2 runs

Dear QIIME2 support team,

I am running QIIME2 on a data set of demultiplexed paired-end reads and keep getting the error below. It began when I combined fastq files from 2 separate runs with different identities.

The last error log was:

less /tmp/qiime2-q2cli-err-69szp22i.log
Mismatched forward and reverse sequence files: 100000, 39538.
Execution halted
Running external command line application(s). This may print messages to stdout and/or stderr.
The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.

Command: run_dada_paired.R /tmp/tmpqmwoe_2q/forward /tmp/tmpqmwoe_2q/reverse /tmp/tmpqmwoe_2q/output.tsv.biom /tmp/tmpqmwoe_2q/track.tsv /tmp/tmpqmwoe_2q/filt_f /tmp/tmpqmwoe_2q/filt_r 120 120 0 0 2.0 2 consensus 1.0 1 1000000

Traceback (most recent call last):
  File "/export/apps/qiime2/2018.4/lib/python3.5/site-packages/q2_dada2/_denoise.py", line 229, in denoise_paired
    run_commands([cmd])
  File "/export/apps/qiime2/2018.4/lib/python3.5/site-packages/q2_dada2/_denoise.py", line 36, in run_commands
    subprocess.run(cmd, check=True)
  File "/export/apps/qiime2/2018.4/lib/python3.5/subprocess.py", line 398, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['run_dada_paired.R', '/tmp/tmpqmwoe_2q/forward', '/tmp/tmpqmwoe_2q/reverse', '/tmp/tmpqmwoe_2q/output.tsv.biom', '/tmp/tmpqmwoe_2q/track.tsv', '/tmp/tmpqmwoe_2q/filt_f', '/tmp/tmpqmwoe_2q/filt_r', '120', '120', '0', '0', '2.0', '2', 'consensus', '1.0', '1', '1000000']' returned n/tmp/qiime2-q2cli-err-69szp22i.log.

What is the best way to share my fastq files and metadata file so that you can help me check whether there is a problem with them?
On my end I have done a thorough check and turned up nothing, yet every time I run DADA2 the same error is generated.

Thank you, and kind regards,

Ben.

This is not causing your error, but you should not merge multiple runs prior to running through dada2. dada2 must be run on each run separately to properly model the run error. See this tutorial for an example of merging multiple runs after dada2.
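To illustrate what merging after dada2 amounts to: each run yields its own feature table of per-sample counts, and those tables are combined afterward. In QIIME 2 the `feature-table merge` and `feature-table merge-seqs` actions handle this. The sketch below is only a toy illustration in plain Python, not the plugin's implementation; the sample IDs, sequences, and the `merge_feature_tables` helper are hypothetical. It assumes both runs were denoised with the same trimming and truncation settings, so that the same sequence shows up as the same feature in both tables, and that counts for a sample present in both runs should be summed.

```python
from collections import defaultdict


def merge_feature_tables(*tables):
    """Merge {sample_id: {asv_sequence: count}} tables, summing shared entries."""
    merged = defaultdict(lambda: defaultdict(int))
    for table in tables:
        for sample_id, counts in table.items():
            for asv, count in counts.items():
                merged[sample_id][asv] += count
    # Convert back to plain dicts for easy comparison and printing.
    return {sample: dict(counts) for sample, counts in merged.items()}


# Hypothetical per-run tables: sampleA was sequenced on both runs.
run1 = {"sampleA": {"ACGTAC": 10, "TTGACA": 3}}
run2 = {"sampleA": {"ACGTAC": 7}, "sampleB": {"GGCATC": 5}}

merged = merge_feature_tables(run1, run2)
print(merged)
# {'sampleA': {'ACGTAC': 17, 'TTGACA': 3}, 'sampleB': {'GGCATC': 5}}
```

The point of the design is that the error model is learned per run, while the resulting counts are simple to combine once denoising is done.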

Your error appears to be here:

Mismatched forward and reverse sequence files: 100000, 39538.

Make sure you have the correct forward/reverse files, and that both runs are paired-end (if not, scrap the reverse reads from the other run and analyze as single-end data).
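One way to check this outside QIIME 2 is to count the reads in every forward/reverse pair and list the pairs that disagree. The following is a minimal sketch, assuming the files use Illumina-style names in which `_R1_` marks forward and `_R2_` marks reverse reads, and that each FASTQ record spans exactly four lines. The function names and the `reads/` directory in the usage note are hypothetical.

```python
import gzip
from pathlib import Path


def count_fastq_reads(path):
    """Count records in a FASTQ file (plain or gzipped), assuming 4 lines each."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


def find_mismatched_pairs(directory, fwd_tag="_R1_", rev_tag="_R2_"):
    """Return (forward name, forward count, reverse count) where counts differ."""
    mismatches = []
    for fwd in sorted(Path(directory).glob(f"*{fwd_tag}*")):
        rev = fwd.with_name(fwd.name.replace(fwd_tag, rev_tag))
        n_fwd = count_fastq_reads(fwd)
        # A missing reverse file is reported as None rather than raising.
        n_rev = count_fastq_reads(rev) if rev.exists() else None
        if n_fwd != n_rev:
            mismatches.append((fwd.name, n_fwd, n_rev))
    return mismatches
```

Calling `find_mismatched_pairs("reads/")` on each run's directory returns an empty list when every pair agrees. Any pair it does list is a candidate for the sample behind the error. Equal counts do not prove the reads are in the same order in both files, so this is a first check rather than a full validation.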

I hope that helps!

Dear Nicholas, thank you for your very helpful comments, and for replying so quickly.

Regarding the first response, that I should not merge runs prior to dada2: since they are the same set of samples, at what stage should I merge them?

On the second response (make sure to have the correct forward and reverse files and that both runs are paired-end): I have checked this several times; maybe I need a second pair of eyes.
I would really prefer to use both forward and reverse reads, so unless it becomes practically impossible, I would rather find the cause of the error and continue downstream.

Otherwise, should I scrap the reverse reads? Has anyone ever done this, and how does it impact the quality of the results?
Thanks again.
Regards,

Benard.

So you have the same samples sequenced on two separate runs? See my answer above: run dada2 on each run separately, then merge the outputs afterward, as in the tutorial.

Indeed you can use both forward and reverse reads, so long as you have forward and reverse reads for both runs and they cover the same amplicon sites.

Lots of people analyze only the forward reads all the time (myself included); just search the forum for examples where reverse read quality is insufficient to support merging. Depending on the amplicon target, you may lose a bit of resolution (e.g., taxonomic specificity), but the quality is unaffected.
