Mismatched forward and reverse sequence files

Hello,

I have a problem with DADA2 filtering.
The output suggests there is a mismatch between my forward and reverse files.
I have already checked similar issues on the forum.

Here are my input and output:

#input
#import data
qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path /home/chaojuichang/qiime2/chaojui/UASB/UASB_Manifest.txt \
  --output-path /home/chaojuichang/qiime2/chaojui/UASB/UASB_demux.qza \
  --input-format PairedEndFastqManifestPhred33V2 &

#DADA2 denoise
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs UASB_demux.qza \
  --p-trim-left-f 19 \
  --p-trim-left-r 20 \
  --p-trunc-len-f 245 \
  --p-trunc-len-r 145 \
  --o-representative-sequences UASB_dada2.qza \
  --o-table table_UASB_dada2.qza \
  --o-denoising-stats stats_UASB_dada2.qza \
  --p-n-threads 0 &
#output
Running external command line application(s). This may print messages to stdout and/or stderr.
The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.

Command: run_dada_paired.R /tmp/tmpggxwghl_/forward /tmp/tmpggxwghl_/reverse /tmp/tmpggxwghl_/output.tsv.biom /tmp/tmpggxwghl_/track.tsv /tmp/tmpggxwghl_/filt_f /tmp/tmpggxwghl_/filt_r 245 145 19 20 2.0 2.0 2 12 independent consensus 1.0 1 1000000

R version 4.0.5 (2021-03-31) 
Loading required package: Rcpp
DADA2: 1.18.0 / Rcpp: 1.0.7 / RcppParallel: 5.1.4 
1) Filtering Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0,  : 
  Mismatched forward and reverse sequence files: 3296, 4292.
Execution halted
Traceback (most recent call last):
  File "/home/chaojuichang/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 266, in denoise_paired
    run_commands([cmd])
  File "/home/chaojuichang/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 36, in run_commands
    subprocess.run(cmd, check=True)
  File "/home/chaojuichang/miniconda3/envs/qiime2-2021.11/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['run_dada_paired.R', '/tmp/tmpggxwghl_/forward', '/tmp/tmpggxwghl_/reverse', '/tmp/tmpggxwghl_/output.tsv.biom', '/tmp/tmpggxwghl_/track.tsv', '/tmp/tmpggxwghl_/filt_f', '/tmp/tmpggxwghl_/filt_r', '245', '145', '19', '20', '2.0', '2.0', '2', '12', 'independent', 'consensus', '1.0', '1', '1000000']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/chaojuichang/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/q2cli/commands.py", line 339, in __call__
    results = action(**arguments)
  File "<decorator-gen-540>", line 2, in denoise_paired
  File "/home/chaojuichang/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/sdk/action.py", line 245, in bound_callable
    outputs = self._callable_executor_(scope, callable_args,
  File "/home/chaojuichang/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/sdk/action.py", line 391, in _callable_executor_
    output_views = self._callable(**view_args)
  File "/home/chaojuichang/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 279, in denoise_paired
    raise Exception("An error was encountered while running DADA2"
Exception: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

I tried to diagnose the problem with the following commands:

qiime tools export --input-path UASB_demux.qza --output-path debugging 
cd debugging
for f in *.fastq.gz; do r=$(( $(gunzip -c "$f" | wc -l | tr -d '[:space:]') / 4 )); echo "$r $f"; done

and here is the result:

40406 UASB_001_0_L001_R1_001.fastq.gz
40406 UASB_001_112_L001_R2_001.fastq.gz
47421 UASB_002_113_L001_R2_001.fastq.gz
47421 UASB_002_1_L001_R1_001.fastq.gz
37613 UASB_003_114_L001_R2_001.fastq.gz
37613 UASB_003_2_L001_R1_001.fastq.gz
44128 UASB_004_115_L001_R2_001.fastq.gz
44128 UASB_004_3_L001_R1_001.fastq.gz
45685 UASB_005_116_L001_R2_001.fastq.gz
45685 UASB_005_4_L001_R1_001.fastq.gz
49868 UASB_006_117_L001_R2_001.fastq.gz
49868 UASB_006_5_L001_R1_001.fastq.gz
40017 UASB_007_118_L001_R2_001.fastq.gz
40017 UASB_007_6_L001_R1_001.fastq.gz
42465 UASB_008_119_L001_R2_001.fastq.gz
42465 UASB_008_7_L001_R1_001.fastq.gz
44188 UASB_009_120_L001_R2_001.fastq.gz
44188 UASB_009_8_L001_R1_001.fastq.gz
39777 UASB_010_121_L001_R2_001.fastq.gz
39777 UASB_010_9_L001_R1_001.fastq.gz
40202 UASB_011_10_L001_R1_001.fastq.gz
40202 UASB_011_122_L001_R2_001.fastq.gz
43611 UASB_012_11_L001_R1_001.fastq.gz
43611 UASB_012_123_L001_R2_001.fastq.gz
44003 UASB_013_124_L001_R2_001.fastq.gz
44003 UASB_013_12_L001_R1_001.fastq.gz
44549 UASB_014_125_L001_R2_001.fastq.gz
44549 UASB_014_13_L001_R1_001.fastq.gz
41169 UASB_015_126_L001_R2_001.fastq.gz
41169 UASB_015_14_L001_R1_001.fastq.gz
44911 UASB_016_127_L001_R2_001.fastq.gz
44911 UASB_016_15_L001_R1_001.fastq.gz
39764 UASB_017_128_L001_R2_001.fastq.gz
39764 UASB_017_16_L001_R1_001.fastq.gz
44450 UASB_018_129_L001_R2_001.fastq.gz
44450 UASB_018_17_L001_R1_001.fastq.gz
44643 UASB_019_130_L001_R2_001.fastq.gz
44643 UASB_019_18_L001_R1_001.fastq.gz
50386 UASB_020_131_L001_R2_001.fastq.gz
50386 UASB_020_19_L001_R1_001.fastq.gz
45736 UASB_021_132_L001_R2_001.fastq.gz
45736 UASB_021_20_L001_R1_001.fastq.gz
41054 UASB_022_133_L001_R2_001.fastq.gz
41054 UASB_022_21_L001_R1_001.fastq.gz
45984 UASB_023_134_L001_R2_001.fastq.gz
45984 UASB_023_22_L001_R1_001.fastq.gz
44859 UASB_024_135_L001_R2_001.fastq.gz
44859 UASB_024_23_L001_R1_001.fastq.gz
72514 UASB_025_136_L001_R2_001.fastq.gz
72514 UASB_025_24_L001_R1_001.fastq.gz
42509 UASB_026_137_L001_R2_001.fastq.gz
42509 UASB_026_25_L001_R1_001.fastq.gz
40378 UASB_027_138_L001_R2_001.fastq.gz
40378 UASB_027_26_L001_R1_001.fastq.gz
51722 UASB_028_139_L001_R2_001.fastq.gz
51722 UASB_028_27_L001_R1_001.fastq.gz
41800 UASB_029_140_L001_R2_001.fastq.gz
41800 UASB_029_28_L001_R1_001.fastq.gz
70230 UASB_030_141_L001_R2_001.fastq.gz
70230 UASB_030_29_L001_R1_001.fastq.gz
41376 UASB_031_142_L001_R2_001.fastq.gz
41376 UASB_031_30_L001_R1_001.fastq.gz
52021 UASB_032_143_L001_R2_001.fastq.gz
52021 UASB_032_31_L001_R1_001.fastq.gz
39916 UASB_033_144_L001_R2_001.fastq.gz
39916 UASB_033_32_L001_R1_001.fastq.gz
66189 UASB_034_145_L001_R2_001.fastq.gz
66189 UASB_034_33_L001_R1_001.fastq.gz
44019 UASB_035_146_L001_R2_001.fastq.gz
44019 UASB_035_34_L001_R1_001.fastq.gz
42887 UASB_036_147_L001_R2_001.fastq.gz
42887 UASB_036_35_L001_R1_001.fastq.gz
47431 UASB_037_148_L001_R2_001.fastq.gz
47431 UASB_037_36_L001_R1_001.fastq.gz
44575 UASB_038_149_L001_R2_001.fastq.gz
44575 UASB_038_37_L001_R1_001.fastq.gz
56046 UASB_039_150_L001_R2_001.fastq.gz
56046 UASB_039_38_L001_R1_001.fastq.gz
40235 UASB_040_151_L001_R2_001.fastq.gz
40235 UASB_040_39_L001_R1_001.fastq.gz
44274 UASB_041_152_L001_R2_001.fastq.gz
44274 UASB_041_40_L001_R1_001.fastq.gz
52778 UASB_042_153_L001_R2_001.fastq.gz
52778 UASB_042_41_L001_R1_001.fastq.gz
46678 UASB_043_154_L001_R2_001.fastq.gz
46678 UASB_043_42_L001_R1_001.fastq.gz
45589 UASB_044_155_L001_R2_001.fastq.gz
45589 UASB_044_43_L001_R1_001.fastq.gz
41604 UASB_045_156_L001_R2_001.fastq.gz
41604 UASB_045_44_L001_R1_001.fastq.gz
37001 UASB_046_157_L001_R2_001.fastq.gz
37001 UASB_046_45_L001_R1_001.fastq.gz
43242 UASB_047_158_L001_R2_001.fastq.gz
43242 UASB_047_46_L001_R1_001.fastq.gz
38444 UASB_048_159_L001_R2_001.fastq.gz
38444 UASB_048_47_L001_R1_001.fastq.gz
46384 UASB_049_160_L001_R2_001.fastq.gz
46384 UASB_049_48_L001_R1_001.fastq.gz
40149 UASB_050_161_L001_R2_001.fastq.gz
40149 UASB_050_49_L001_R1_001.fastq.gz
43249 UASB_051_162_L001_R2_001.fastq.gz
43249 UASB_051_50_L001_R1_001.fastq.gz
54838 UASB_052_163_L001_R2_001.fastq.gz
54838 UASB_052_51_L001_R1_001.fastq.gz
45454 UASB_053_164_L001_R2_001.fastq.gz
45454 UASB_053_52_L001_R1_001.fastq.gz
42283 UASB_054_165_L001_R2_001.fastq.gz
42283 UASB_054_53_L001_R1_001.fastq.gz
44682 UASB_055_166_L001_R2_001.fastq.gz
44682 UASB_055_54_L001_R1_001.fastq.gz
42392 UASB_056_167_L001_R2_001.fastq.gz
42392 UASB_056_55_L001_R1_001.fastq.gz
39334 UASB_057_168_L001_R2_001.fastq.gz
39334 UASB_057_56_L001_R1_001.fastq.gz
39728 UASB_058_169_L001_R2_001.fastq.gz
39728 UASB_058_57_L001_R1_001.fastq.gz
39156 UASB_059_170_L001_R2_001.fastq.gz
39156 UASB_059_58_L001_R1_001.fastq.gz
41384 UASB_060_171_L001_R2_001.fastq.gz
41384 UASB_060_59_L001_R1_001.fastq.gz
73024 UASB_061_172_L001_R2_001.fastq.gz
73024 UASB_061_60_L001_R1_001.fastq.gz
39163 UASB_062_173_L001_R2_001.fastq.gz
39163 UASB_062_61_L001_R1_001.fastq.gz
41272 UASB_063_174_L001_R2_001.fastq.gz
41272 UASB_063_62_L001_R1_001.fastq.gz
101352 UASB_064_175_L001_R2_001.fastq.gz
101352 UASB_064_63_L001_R1_001.fastq.gz
47384 UASB_065_176_L001_R2_001.fastq.gz
47384 UASB_065_64_L001_R1_001.fastq.gz
38564 UASB_066_177_L001_R2_001.fastq.gz
38564 UASB_066_65_L001_R1_001.fastq.gz
40671 UASB_067_178_L001_R2_001.fastq.gz
40671 UASB_067_66_L001_R1_001.fastq.gz
104338 UASB_068_179_L001_R2_001.fastq.gz
104338 UASB_068_67_L001_R1_001.fastq.gz
42959 UASB_069_180_L001_R2_001.fastq.gz
42959 UASB_069_68_L001_R1_001.fastq.gz
41276 UASB_070_181_L001_R2_001.fastq.gz
41276 UASB_070_69_L001_R1_001.fastq.gz
41363 UASB_071_182_L001_R2_001.fastq.gz
41363 UASB_071_70_L001_R1_001.fastq.gz
38471 UASB_072_183_L001_R2_001.fastq.gz
38471 UASB_072_71_L001_R1_001.fastq.gz
40810 UASB_073_184_L001_R2_001.fastq.gz
40810 UASB_073_72_L001_R1_001.fastq.gz
39435 UASB_074_185_L001_R2_001.fastq.gz
39435 UASB_074_73_L001_R1_001.fastq.gz
42227 UASB_075_186_L001_R2_001.fastq.gz
42227 UASB_075_74_L001_R1_001.fastq.gz
43143 UASB_076_187_L001_R2_001.fastq.gz
43143 UASB_076_75_L001_R1_001.fastq.gz
41626 UASB_077_188_L001_R2_001.fastq.gz
41626 UASB_077_76_L001_R1_001.fastq.gz
42030 UASB_078_189_L001_R2_001.fastq.gz
42030 UASB_078_77_L001_R1_001.fastq.gz
43255 UASB_079_190_L001_R2_001.fastq.gz
43255 UASB_079_78_L001_R1_001.fastq.gz
104292 UASB_080_191_L001_R2_001.fastq.gz
103296 UASB_080_79_L001_R1_001.fastq.gz
45969 UASB_081_192_L001_R2_001.fastq.gz
45969 UASB_081_80_L001_R1_001.fastq.gz
43493 UASB_082_193_L001_R2_001.fastq.gz
43493 UASB_082_81_L001_R1_001.fastq.gz
48109 UASB_083_194_L001_R2_001.fastq.gz
48109 UASB_083_82_L001_R1_001.fastq.gz
45277 UASB_084_195_L001_R2_001.fastq.gz
45277 UASB_084_83_L001_R1_001.fastq.gz
49271 UASB_085_196_L001_R2_001.fastq.gz
49271 UASB_085_84_L001_R1_001.fastq.gz
40515 UASB_086_197_L001_R2_001.fastq.gz
40515 UASB_086_85_L001_R1_001.fastq.gz
47556 UASB_087_198_L001_R2_001.fastq.gz
47556 UASB_087_86_L001_R1_001.fastq.gz
41965 UASB_088_199_L001_R2_001.fastq.gz
41965 UASB_088_87_L001_R1_001.fastq.gz
40753 UASB_089_200_L001_R2_001.fastq.gz
40753 UASB_089_88_L001_R1_001.fastq.gz
43019 UASB_090_201_L001_R2_001.fastq.gz
43019 UASB_090_89_L001_R1_001.fastq.gz
48160 UASB_091_202_L001_R2_001.fastq.gz
48160 UASB_091_90_L001_R1_001.fastq.gz
43959 UASB_092_203_L001_R2_001.fastq.gz
43959 UASB_092_91_L001_R1_001.fastq.gz
42789 UASB_093_204_L001_R2_001.fastq.gz
42789 UASB_093_92_L001_R1_001.fastq.gz
43237 UASB_094_205_L001_R2_001.fastq.gz
43237 UASB_094_93_L001_R1_001.fastq.gz
52524 UASB_095_206_L001_R2_001.fastq.gz
52524 UASB_095_94_L001_R1_001.fastq.gz
72635 UASB_096_207_L001_R2_001.fastq.gz
72635 UASB_096_95_L001_R1_001.fastq.gz
47040 UASB_097_208_L001_R2_001.fastq.gz
47040 UASB_097_96_L001_R1_001.fastq.gz
38631 UASB_098_209_L001_R2_001.fastq.gz
38631 UASB_098_97_L001_R1_001.fastq.gz
104452 UASB_099_210_L001_R2_001.fastq.gz
104452 UASB_099_98_L001_R1_001.fastq.gz
48286 UASB_100_211_L001_R2_001.fastq.gz
48286 UASB_100_99_L001_R1_001.fastq.gz
41659 UASB_101_100_L001_R1_001.fastq.gz
41659 UASB_101_212_L001_R2_001.fastq.gz
39557 UASB_102_101_L001_R1_001.fastq.gz
39557 UASB_102_213_L001_R2_001.fastq.gz
50654 UASB_103_102_L001_R1_001.fastq.gz
50654 UASB_103_214_L001_R2_001.fastq.gz
44577 UASB_104_103_L001_R1_001.fastq.gz
44577 UASB_104_215_L001_R2_001.fastq.gz
43067 UASB_105_104_L001_R1_001.fastq.gz
43067 UASB_105_216_L001_R2_001.fastq.gz
44963 UASB_106_105_L001_R1_001.fastq.gz
44963 UASB_106_217_L001_R2_001.fastq.gz
46491 UASB_107_106_L001_R1_001.fastq.gz
46491 UASB_107_218_L001_R2_001.fastq.gz
45856 UASB_108_107_L001_R1_001.fastq.gz
45856 UASB_108_219_L001_R2_001.fastq.gz
49870 UASB_109_108_L001_R1_001.fastq.gz
49870 UASB_109_220_L001_R2_001.fastq.gz
38091 UASB_110_109_L001_R1_001.fastq.gz
38091 UASB_110_221_L001_R2_001.fastq.gz
37837 UASB_111_110_L001_R1_001.fastq.gz
37837 UASB_111_222_L001_R2_001.fastq.gz
47190 UASB_112_111_L001_R1_001.fastq.gz
47190 UASB_112_223_L001_R2_001.fastq.gz

I cannot find anything wrong, since the number of raw reads in the R1 and R2 files is identical.
I also tried a small subset (10 samples), and it works.

Any ideas about what the issue is and how to fix it?

Here is the manifest file I used:
UASB_Manifest.txt (17.4 KB)

Hi @chaojui,

A differing number of sequences is only one potential symptom of mismatched reads. The sequences in one of the two paired files could be out of order with respect to the other, and/or there could be nonstandard headers in one or both of the files that make the parser "think" the files are mismatched. Can you share the first one hundred headers / sequences from a given pair of R1 and R2 FASTQ files?
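
If you want to check read order yourself first, here is a rough sketch rather than a definitive test. It assumes strict four-line FASTQ records and that the read ID is the first whitespace-delimited field of each header line, optionally followed by /1 or /2. The function names `read_ids` and `compare_read_ids` are made up for this example and are not part of QIIME 2 or DADA2.

```shell
# Print the read ID of every record in a gzipped FASTQ file: the first
# whitespace-delimited field of each header line, minus any trailing /1 or /2.
# Assumes strict four-line FASTQ records.
read_ids() {
  gunzip -c "$1" | awk 'NR % 4 == 1 { id = $1; sub(/\/[12]$/, "", id); print id }'
}

# Compare read IDs record by record and print each position where R1 and R2
# disagree (a missing mate shows up as an empty ID on one side).
compare_read_ids() (
  tmp=$(mktemp -d) || exit 1
  read_ids "$1" > "$tmp/r1.ids"
  read_ids "$2" > "$tmp/r2.ids"
  paste "$tmp/r1.ids" "$tmp/r2.ids" |
    awk -F '\t' '$1 != $2 { printf "record %d: %s vs %s\n", NR, $1, $2 }'
  rm -rf "$tmp"
)

# Example, run inside the "debugging" export directory:
#   compare_read_ids UASB_001_0_L001_R1_001.fastq.gz UASB_001_112_L001_R2_001.fastq.gz | head
# To share the first 100 records of a file, take its first 400 lines:
#   gunzip -c UASB_001_0_L001_R1_001.fastq.gz | head -n 400 > R1_first100.fastq
```

No output means the IDs line up under those assumptions. Any printed record is a place where the two files diverge.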

Hi @chaojui,

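Our friendly neighborhood @thermokarst let me know that there is indeed a sequence count mismatch in your data. Specifically, in the counts you posted, sample UASB_080 has 103296 reads in R1 but 104292 reads in R2:

104292 UASB_080_191_L001_R2_001.fastq.gz
103296 UASB_080_79_L001_R1_001.fastq.gz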
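
A single mismatched pair is easy to miss in a listing of more than two hundred files, so it can help to let awk do the comparison. This is only a sketch: it assumes the exported files follow the `<sample-id>_<barcode-id>_L<lane>_R<1|2>_001.fastq.gz` naming seen in your listing, and `find_count_mismatches` is a name invented for this example.

```shell
# Read "count filename" lines on stdin and print every sample whose
# R1 and R2 read counts differ.
find_count_mismatches() {
  awk '{
    id = $2
    # Strip "_<barcode-id>_L<lane>_R<1|2>_001.fastq.gz" to recover the sample ID.
    sub(/_[^_]+_L[0-9]+_R[12]_001\.fastq\.gz$/, "", id)
    if ($2 ~ /_R1_001\.fastq\.gz$/) r1[id] = $1; else r2[id] = $1
    seen[id] = 1
  }
  END {
    for (id in seen)
      if (r1[id] != r2[id]) print id, "R1=" r1[id], "R2=" r2[id]
  }'
}

# Demo on four lines copied from the listing in the original post.
printf '%s\n' \
  '43255 UASB_079_190_L001_R2_001.fastq.gz' \
  '43255 UASB_079_78_L001_R1_001.fastq.gz' \
  '104292 UASB_080_191_L001_R2_001.fastq.gz' \
  '103296 UASB_080_79_L001_R1_001.fastq.gz' | find_count_mismatches
# prints: UASB_080 R1=103296 R2=104292
```

To check the whole export, pipe the output of your counting loop into the function, for example `for f in *.fastq.gz; do ...; done | find_count_mismatches`.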

Hi @chaojui,

We forgot to mention that if you generate a visualization of your imported / demultiplexed sequences, you'll see the read counts for each sample. This will save you from having to run the bash loop.
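
A minimal sketch of generating that visualization, assuming the `demux summarize` action from the q2-demux plugin and the artifact name from your import step. The output filename is an arbitrary choice, and the guard around the call only keeps the snippet from failing when `qiime` or the artifact is not available.

```shell
# Artifact name taken from the import step in the original post.
demux_qza="UASB_demux.qza"
# Output name is an arbitrary choice: same stem with a .qzv extension.
demux_qzv="${demux_qza%.qza}.qzv"

if command -v qiime >/dev/null 2>&1 && [ -f "$demux_qza" ]; then
  # Summarize the demultiplexed reads, including per-sample read counts.
  qiime demux summarize \
    --i-data "$demux_qza" \
    --o-visualization "$demux_qzv"
else
  echo "qiime or $demux_qza not found; nothing to summarize" >&2
fi
```

The resulting .qzv file can be opened with `qiime tools view` or at view.qiime2.org.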
