Mismatched forward and reverse reads error from dada2

I'm back....

I successfully imported my data into a qza with my manifest file:

AD_exp3_Arc_manifest-pe_sort.csv (10.9 KB)

(qiime2-2017.10) wsb255bioimac27:Fink_fermenter_Exp3 mel_local$ qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path AD_exp3_Arc_manifest-pe_sort.csv --output-path AD_exp3_Arc_pe-demux.qza --source-format PairedEndFastqManifestPhred33

And viewed it:

(qiime2-2017.10) wsb255bioimac27:Fink_fermenter_Exp3 mel_local$ qiime demux summarize --i-data AD_exp3_Arc_pe-demux.qza --o-visualization AD_exp3_Arc_pe-demux.qzv
Saved Visualization to: AD_exp3_Arc_pe-demux.qzv

AD_exp3_Arc_pe-demux.qzv (281.7 KB)

I decided on my dada2 cutoffs and ran dada2 with the following output:

(qiime2-2017.10) wsb255bioimac27:Fink_fermenter_Exp3 mel_local$ qiime dada2 denoise-paired --i-demultiplexed-seqs AD_exp3_Arc_pe-demux.qza --p-trim-left-f 15 --p-trunc-len-f 180 --p-trim-left-r 15 --p-trunc-len-r 180 --o-representative-sequences AD_exp3_Arc_rep-seqs.qza --o-table AD_exp3_Arc_table_dada2.qza --verbose --p-n-threads 0
Running external command line application(s). This may print messages to stdout and/or stderr.
The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.

Command: run_dada_paired.R /var/folders/12/2j8hq03s52lbnstw5wh008k80000gq/T/tmp79pbfgms/forward /var/folders/12/2j8hq03s52lbnstw5wh008k80000gq/T/tmp79pbfgms/reverse /var/folders/12/2j8hq03s52lbnstw5wh008k80000gq/T/tmp79pbfgms/output.tsv.biom /var/folders/12/2j8hq03s52lbnstw5wh008k80000gq/T/tmp79pbfgms/filt_f /var/folders/12/2j8hq03s52lbnstw5wh008k80000gq/T/tmp79pbfgms/filt_r 180 180 15 15 2.0 2 consensus 1.0 0 1000000

R version 3.3.2 (2016-10-31) 
Loading required package: Rcpp
There were 50 or more warnings (use warnings() to see the first 50)
DADA2 R package version: 1.4.0 
1) Filtering ..Error in fastqPairedFilter(c(unfiltsF[[i]], unfiltsR[[i]]), c(filteredFastqF,  : 
  Mismatched forward and reverse sequence files: 87112, 82484.
Execution halted
Traceback (most recent call last):
  File "/Users/mel_local/miniconda2/envs/qiime2-2017.10/lib/python3.5/site-packages/q2_dada2/_denoise.py", line 179, in denoise_paired
    run_commands([cmd])
  File "/Users/mel_local/miniconda2/envs/qiime2-2017.10/lib/python3.5/site-packages/q2_dada2/_denoise.py", line 35, in run_commands
    subprocess.run(cmd, check=True)
  File "/Users/mel_local/miniconda2/envs/qiime2-2017.10/lib/python3.5/subprocess.py", line 398, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['run_dada_paired.R', '/var/folders/12/2j8hq03s52lbnstw5wh008k80000gq/T/tmp79pbfgms/forward', '/var/folders/12/2j8hq03s52lbnstw5wh008k80000gq/T/tmp79pbfgms/reverse', '/var/folders/12/2j8hq03s52lbnstw5wh008k80000gq/T/tmp79pbfgms/output.tsv.biom', '/var/folders/12/2j8hq03s52lbnstw5wh008k80000gq/T/tmp79pbfgms/filt_f', '/var/folders/12/2j8hq03s52lbnstw5wh008k80000gq/T/tmp79pbfgms/filt_r', '180', '180', '15', '15', '2.0', '2', 'consensus', '1.0', '0', '1000000']' returned non-zero exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/mel_local/miniconda2/envs/qiime2-2017.10/lib/python3.5/site-packages/q2cli/commands.py", line 218, in __call__
    results = action(**arguments)
  File "<decorator-gen-338>", line 2, in denoise_paired
  File "/Users/mel_local/miniconda2/envs/qiime2-2017.10/lib/python3.5/site-packages/qiime2/sdk/action.py", line 220, in bound_callable
    output_types, provenance)
  File "/Users/mel_local/miniconda2/envs/qiime2-2017.10/lib/python3.5/site-packages/qiime2/sdk/action.py", line 355, in _callable_executor_
    output_views = self._callable(**view_args)
  File "/Users/mel_local/miniconda2/envs/qiime2-2017.10/lib/python3.5/site-packages/q2_dada2/_denoise.py", line 194, in denoise_paired
    " and stderr to learn more." % e.returncode)
Exception: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

Plugin error from dada2:

  An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

See above for debug info

Seeing the mismatch error I checked the forum and took the following actions:

  1. Validated my .qza

    (qiime2-2017.10) wsb255bioimac27:Fink_fermenter_Exp3 mel_local$ qiime tools validate AD_exp3_Arc_pe-demux.qza
    Artifact AD_exp3_Arc_pe-demux.qza appears to be valid at level=max.

  2. Checked all of my paths - they were fine.

  3. Checked all my sequence files for mismatched entries per fastq:

    (qiime2-2017.10) wsb255bioimac27:Fink_Project_008 mel_local$ for f in *.fastq; do r=$(wc -l < $f | tr -d '[:space:]'); echo $r $f; done
    504 BLANK_Arc_G06_S90_R1_001.fastq
    504 BLANK_Arc_G06_S90_R2_001.fastq
    504 BLANK_Arc_H06_S96_R1_001.fastq
    504 BLANK_Arc_H06_S96_R2_001.fastq
    329936 F1_0_5_Arc_S55_R1_001.fastq
    329936 F1_0_5_Arc_S55_R2_001.fastq
    348448 F1_0_Arc_S49_R1_001.fastq
    348448 F1_0_Arc_S49_R2_001.fastq
    402708 F1_101_5_Arc_S57_R1_001.fastq
    402708 F1_101_5_Arc_S57_R2_001.fastq
    331688 F1_128_5_Arc_S63_R1_001.fastq
    331688 F1_128_5_Arc_S63_R2_001.fastq
    33752 F1_155_5_Arc_S69_R1_001.fastq
    33752 F1_155_5_Arc_S69_R2_001.fastq
    337028 F1_16_Arc_S85_R1_001.fastq
    337028 F1_16_Arc_S85_R2_001.fastq
    128560 F1_182_5_Arc_S75_R1_001.fastq
    128560 F1_182_5_Arc_S75_R2_001.fastq
    363012 F1_1_Arc_S61_R1_001.fastq
    363012 F1_1_Arc_S61_R2_001.fastq
    363612 F1_24_Arc_S91_R1_001.fastq
    363612 F1_24_Arc_S91_R2_001.fastq
    87648 F1_2_Arc_S67_R1_001.fastq
    87648 F1_2_Arc_S67_R2_001.fastq
    276640 F1_41_5_Arc_S50_R1_001.fastq
    276640 F1_41_5_Arc_S50_R2_001.fastq
    399764 F1_42_5_Arc_S62_R1_001.fastq
    399764 F1_42_5_Arc_S62_R2_001.fastq
    401364 F1_42_Arc_S56_R1_001.fastq
    401364 F1_42_Arc_S56_R2_001.fastq
    237172 F1_43_5_Arc_S68_R1_001.fastq
    237172 F1_43_5_Arc_S68_R2_001.fastq
    290944 F1_45_5_Arc_S74_R1_001.fastq
    290944 F1_45_5_Arc_S74_R2_001.fastq
    379572 F1_49_5_Arc_S80_R1_001.fastq
    379572 F1_49_5_Arc_S80_R2_001.fastq
    231744 F1_4_Arc_S73_R1_001.fastq
    231744 F1_4_Arc_S73_R2_001.fastq
    433060 F1_57_5_Arc_S86_R1_001.fastq
    433060 F1_57_5_Arc_S86_R2_001.fastq
    416836 F1_65_5_Arc_S92_R1_001.fastq
    416836 F1_65_5_Arc_S92_R2_001.fastq
    463752 F1_89_5_Arc_S51_R1_001.fastq
    463752 F1_89_5_Arc_S51_R2_001.fastq
    308324 F1_8_Arc_S79_R1_001.fastq
    308324 F1_8_Arc_S79_R2_001.fastq
    540 F1_W_Arc_S78_R1_001.fastq
    540 F1_W_Arc_S78_R2_001.fastq
    315904 F2_0_5_Arc_S87_R1_001.fastq
    315904 F2_0_5_Arc_S87_R2_001.fastq
    291920 F2_0_Arc_S81_R1_001.fastq
    291920 F2_0_Arc_S81_R2_001.fastq
    299988 F2_101_5_Arc_S89_R1_001.fastq
    299988 F2_101_5_Arc_S89_R2_001.fastq
    370840 F2_128_5_Arc_S95_R1_001.fastq
    370840 F2_128_5_Arc_S95_R2_001.fastq
    337744 F2_155_5_Arc_S54_R1_001.fastq
    337744 F2_155_5_Arc_S54_R2_001.fastq
    79316 F2_16_Arc_S70_R1_001.fastq
    79316 F2_16_Arc_S70_R2_001.fastq
    326336 F2_182_5_Arc_S60_R1_001.fastq
    326336 F2_182_5_Arc_S60_R2_001.fastq
    465116 F2_1_Arc_S93_R1_001.fastq
    465116 F2_1_Arc_S93_R2_001.fastq
    288760 F2_24_Arc_S76_R1_001.fastq
    288760 F2_24_Arc_S76_R2_001.fastq
    377852 F2_2_Arc_S52_R1_001.fastq
    377852 F2_2_Arc_S52_R2_001.fastq
    362496 F2_41_5_Arc_S82_R1_001.fastq
    362496 F2_41_5_Arc_S82_R2_001.fastq
    410396 F2_42_5_Arc_S94_R1_001.fastq
    410396 F2_42_5_Arc_S94_R2_001.fastq
    280352 F2_42_Arc_S88_R1_001.fastq
    280352 F2_42_Arc_S88_R2_001.fastq
    396008 F2_43_5_Arc_S53_R1_001.fastq
    396008 F2_43_5_Arc_S53_R2_001.fastq
    407460 F2_45_5_Arc_S59_R1_001.fastq
    407460 F2_45_5_Arc_S59_R2_001.fastq
    276484 F2_49_5_Arc_S65_R1_001.fastq
    276484 F2_49_5_Arc_S65_R2_001.fastq
    233768 F2_4_Arc_S58_R1_001.fastq
    233768 F2_4_Arc_S58_R2_001.fastq
    116736 F2_57_5_Arc_S71_R1_001.fastq
    116736 F2_57_5_Arc_S71_R2_001.fastq
    99300 F2_65_5_Arc_S77_R1_001.fastq
    99300 F2_65_5_Arc_S77_R2_001.fastq
    293628 F2_89_5_Arc_S83_R1_001.fastq
    293628 F2_89_5_Arc_S83_R2_001.fastq
    416992 F2_8_Arc_S64_R1_001.fastq
    416992 F2_8_Arc_S64_R2_001.fastq
    484 F2_W_Arc_S84_R1_001.fastq
    484 F2_W_Arc_S84_R2_001.fastq
    166848 FD1_Arc_S66_R1_001.fastq
    166848 FD1_Arc_S66_R2_001.fastq
    254520 M1_Arc_S72_R1_001.fastq
    254520 M1_Arc_S72_R2_001.fastq

The error says "Mismatched forward and reverse sequence files: 87112, 82484." The closest line counts I can find to those numbers belong to this pair:

87648 F1_2_Arc_S67_R1_001.fastq
87648 F1_2_Arc_S67_R2_001.fastq

But they match?

Any other suggestions for getting dada2 to work?

Thanks!

Hi @mmelendrez!

Thanks for the count one-liner - that was super helpful!

Oops! You forgot to divide the counts from your one-liner by 4 (4 lines per fastq record). So, the counts are actually:

87112 F1_0_Arc_S49_R1_001.fastq
87112 F1_0_Arc_S49_R2_001.fastq
82484 F1_0_5_Arc_S55_R1_001.fastq
82484 F1_0_5_Arc_S55_R2_001.fastq
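
The divide-by-four step can be folded into a small check that also compares each forward file with its reverse mate. This is only a sketch: the function names `count_records` and `check_pairs` are made up for illustration, and it assumes uncompressed files that end in a newline and follow the `*_R1_001.fastq` / `*_R2_001.fastq` naming of the original files in this thread.

```shell
#!/usr/bin/env bash

# Count FASTQ records in an uncompressed file (4 lines per record).
count_records() {
    local lines
    lines=$(wc -l < "$1" | tr -d '[:space:]')
    echo $(( lines / 4 ))
}

# Compare every *_R1_001.fastq in a directory with its *_R2_001.fastq mate.
# Prints one line per pair and returns non-zero if any pair disagrees.
check_pairs() {
    local dir=$1 fwd rev nf nr status=0
    for fwd in "$dir"/*_R1_001.fastq; do
        [ -e "$fwd" ] || continue          # glob matched nothing
        rev=${fwd%_R1_001.fastq}_R2_001.fastq
        if [ ! -e "$rev" ]; then
            echo "MISSING_MATE $(basename "$fwd")"
            status=1
            continue
        fi
        nf=$(count_records "$fwd")
        nr=$(count_records "$rev")
        if [ "$nf" -eq "$nr" ]; then
            echo "OK $nf $(basename "$fwd")"
        else
            echo "MISMATCH $nf $nr $(basename "$fwd")"
            status=1
        fi
    done
    return $status
}
```

Calling `check_pairs .` in the directory with the original fastq files would list the record count per pair. It will not pair up files exported from a .qza, because there the R1 and R2 names carry different index numbers.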

So, my first thought was there was a typo in your manifest, so I pulled out just the records for those samples:

F1_0,F1_0_Arc_S49_R1_001.fastq.gz,forward
F1_0,F1_0_Arc_S49_R2_001.fastq.gz,reverse
F1_0_5,F1_0_5_Arc_S55_R1_001.fastq.gz,forward
F1_0_5,F1_0_5_Arc_S55_R2_001.fastq.gz,reverse

Bummer, looks good here. Okay, so now I have a few thoughts:

  • Something went wrong during import
  • Something is wrong with q2-dada2 passing off data to DADA2
  • Something is wrong with DADA2

Let's test the first idea by running the following:

qiime tools export AD_exp3_Arc_pe-demux.qza --output-dir import-check
cd import-check
for f in *.fastq; do r=$(( $(wc -l < $f | tr -d '[:space:]') / 4 )); echo $r $f; done

Then, can you please copy-and-paste the output of that here? Thanks! We will get to the bottom of this! :t_rex: :qiime2:

@thermokarst Yes - running it now… BTW on my distribution it’s output-dir rather than output-path:

(qiime2-2017.10) wsb255bioimac27:Fink_fermenter_Exp3 mel_local$ qiime tools export       AD_exp3_Arc_pe-demux.qza --output-path import-check
        Error: no such option: --output-path

(qiime2-2017.10) wsb255bioimac27:Fink_fermenter_Exp3 mel_local$ qiime tools export AD_exp3_Arc_pe-demux.qza --output-dir import-check

(qiime2-2017.10) wsb255bioimac27:import-check mel_local$ for f in *.fastq; do r=$(( $(wc -l < $f | tr -d '[:space:]') / 4 )); echo $r $f; done
126 BLANK1_0_L001_R1_001.fastq
126 BLANK1_48_L001_R2_001.fastq
126 BLANK2_1_L001_R1_001.fastq
126 BLANK2_49_L001_R2_001.fastq
87112 F1_0_3_L001_R1_001.fastq
87112 F1_0_51_L001_R2_001.fastq
82484 F1_0_5_2_L001_R1_001.fastq
82484 F1_0_5_50_L001_R2_001.fastq
100677 F1_101_5_53_L001_R2_001.fastq
100677 F1_101_5_5_L001_R1_001.fastq
82922 F1_128_5_54_L001_R2_001.fastq
82922 F1_128_5_6_L001_R1_001.fastq
8438 F1_155_5_55_L001_R2_001.fastq
8438 F1_155_5_7_L001_R1_001.fastq
84257 F1_16_56_L001_R2_001.fastq
84257 F1_16_8_L001_R1_001.fastq
32140 F1_182_5_57_L001_R2_001.fastq
32140 F1_182_5_9_L001_R1_001.fastq
90753 F1_1_4_L001_R1_001.fastq
90753 F1_1_52_L001_R2_001.fastq
90903 F1_24_11_L001_R1_001.fastq
90903 F1_24_59_L001_R2_001.fastq
21912 F1_2_10_L001_R1_001.fastq
21912 F1_2_58_L001_R2_001.fastq
69160 F1_41_5_13_L001_R1_001.fastq
69160 F1_41_5_61_L001_R2_001.fastq
100341 F1_42_15_L001_R1_001.fastq
99941 F1_42_5_14_L001_R1_001.fastq
99941 F1_42_5_62_L001_R2_001.fastq
100341 F1_42_63_L001_R2_001.fastq
59293 F1_43_5_16_L001_R1_001.fastq
59293 F1_43_5_64_L001_R2_001.fastq
72736 F1_45_5_17_L001_R1_001.fastq
72736 F1_45_5_65_L001_R2_001.fastq
94893 F1_49_5_18_L001_R1_001.fastq
94893 F1_49_5_66_L001_R2_001.fastq
57936 F1_4_12_L001_R1_001.fastq
57936 F1_4_60_L001_R2_001.fastq
108265 F1_57_5_19_L001_R1_001.fastq
108265 F1_57_5_67_L001_R2_001.fastq
104209 F1_65_5_20_L001_R1_001.fastq
104209 F1_65_5_68_L001_R2_001.fastq
115938 F1_89_5_22_L001_R1_001.fastq
115938 F1_89_5_70_L001_R2_001.fastq
77081 F1_8_21_L001_R1_001.fastq
77081 F1_8_69_L001_R2_001.fastq
135 F1_W_23_L001_R1_001.fastq
135 F1_W_71_L001_R2_001.fastq
72980 F2_0_25_L001_R1_001.fastq
78976 F2_0_5_24_L001_R1_001.fastq
78976 F2_0_5_72_L001_R2_001.fastq
72980 F2_0_73_L001_R2_001.fastq
74997 F2_101_5_27_L001_R1_001.fastq
74997 F2_101_5_75_L001_R2_001.fastq
92710 F2_128_5_28_L001_R1_001.fastq
92710 F2_128_5_76_L001_R2_001.fastq
84436 F2_155_5_29_L001_R1_001.fastq
84436 F2_155_5_77_L001_R2_001.fastq
19829 F2_16_30_L001_R1_001.fastq
19829 F2_16_78_L001_R2_001.fastq
81584 F2_182_5_31_L001_R1_001.fastq
81584 F2_182_5_79_L001_R2_001.fastq
116279 F2_1_26_L001_R1_001.fastq
116279 F2_1_74_L001_R2_001.fastq
72190 F2_24_33_L001_R1_001.fastq
72190 F2_24_81_L001_R2_001.fastq
94463 F2_2_32_L001_R1_001.fastq
94463 F2_2_80_L001_R2_001.fastq
90624 F2_41_5_35_L001_R1_001.fastq
90624 F2_41_5_83_L001_R2_001.fastq
70088 F2_42_37_L001_R1_001.fastq
102599 F2_42_5_36_L001_R1_001.fastq
102599 F2_42_5_84_L001_R2_001.fastq
70088 F2_42_85_L001_R2_001.fastq
99002 F2_43_5_38_L001_R1_001.fastq
99002 F2_43_5_86_L001_R2_001.fastq
101865 F2_45_5_39_L001_R1_001.fastq
101865 F2_45_5_87_L001_R2_001.fastq
69121 F2_49_5_40_L001_R1_001.fastq
69121 F2_49_5_88_L001_R2_001.fastq
58442 F2_4_34_L001_R1_001.fastq
58442 F2_4_82_L001_R2_001.fastq
29184 F2_57_5_41_L001_R1_001.fastq
29184 F2_57_5_89_L001_R2_001.fastq
24825 F2_65_5_42_L001_R1_001.fastq
24825 F2_65_5_90_L001_R2_001.fastq
73407 F2_89_5_44_L001_R1_001.fastq
73407 F2_89_5_92_L001_R2_001.fastq
104248 F2_8_43_L001_R1_001.fastq
104248 F2_8_91_L001_R2_001.fastq
121 F2_W_45_L001_R1_001.fastq
121 F2_W_93_L001_R2_001.fastq
41712 FD1_46_L001_R1_001.fastq
41712 FD1_94_L001_R2_001.fastq
63630 M1_47_L001_R1_001.fastq
63630 M1_95_L001_R2_001.fastq

OMgoodness?! The counts match, but the names are all different now?! My original files have the letters Arc in the name, and all the file names match until you get to the S##_001.fastq part. This is the ls of the directory - it has all of them, but I’m only working with the Arc files at the moment:

(qiime2-2017.10) wsb255bioimac27:import-check mel_local$ ls ../Fink_Project_008/
Analysis				F1_1_V4_S13_R2_001.fastq.gz		F1_57_5_V4_S38_R2_001.fastq.gz		F2_16_V4_S22_R2_001.fastq.gz		F2_49_5_V4_S17_R2_001.fastq.gz
BLANK_Arc_G06_S90_R1_001.fastq		F1_24_Arc_S91_R1_001.fastq		F1_65_5_Arc_S92_R1_001.fastq		F2_182_5_Arc_S60_R1_001.fastq		F2_4_Arc_S58_R1_001.fastq
BLANK_Arc_G06_S90_R2_001.fastq		F1_24_Arc_S91_R2_001.fastq		F1_65_5_Arc_S92_R2_001.fastq		F2_182_5_Arc_S60_R2_001.fastq		F2_4_Arc_S58_R2_001.fastq
BLANK_Arc_H06_S96_R1_001.fastq		F1_24_Pro_S139_R1_001.fastq.gz		F1_65_5_Pro_S140_R1_001.fastq.gz	F2_182_5_Pro_S108_R1_001.fastq.gz	F2_4_Pro_S106_R1_001.fastq.gz
BLANK_Arc_H06_S96_R2_001.fastq		F1_24_Pro_S139_R2_001.fastq.gz		F1_65_5_Pro_S140_R2_001.fastq.gz	F2_182_5_Pro_S108_R2_001.fastq.gz	F2_4_Pro_S106_R2_001.fastq.gz
BLANK_Pro_G06_S138_R1_001.fastq.gz	F1_24_V4_S43_R1_001.fastq.gz		F1_65_5_V4_S44_R1_001.fastq.gz		F2_182_5_V4_S12_R1_001.fastq.gz		F2_4_V4_S10_R1_001.fastq.gz
BLANK_Pro_G06_S138_R2_001.fastq.gz	F1_24_V4_S43_R2_001.fastq.gz		F1_65_5_V4_S44_R2_001.fastq.gz		F2_182_5_V4_S12_R2_001.fastq.gz		F2_4_V4_S10_R2_001.fastq.gz
BLANK_Pro_H06_S144_R1_001.fastq.gz	F1_2_Arc_S67_R1_001.fastq		F1_89_5_Arc_S51_R1_001.fastq		F2_1_Arc_S93_R1_001.fastq		F2_57_5_Arc_S71_R1_001.fastq
BLANK_Pro_H06_S144_R2_001.fastq.gz	F1_2_Arc_S67_R2_001.fastq		F1_89_5_Arc_S51_R2_001.fastq		F2_1_Arc_S93_R2_001.fastq		F2_57_5_Arc_S71_R2_001.fastq
BLANK_V4_G06_S42_R1_001.fastq.gz	F1_2_Pro_S115_R1_001.fastq.gz		F1_89_5_Pro_S99_R1_001.fastq.gz		F2_1_Pro_S141_R1_001.fastq.gz		F2_57_5_Pro_S119_R1_001.fastq.gz
BLANK_V4_G06_S42_R2_001.fastq.gz	F1_2_Pro_S115_R2_001.fastq.gz		F1_89_5_Pro_S99_R2_001.fastq.gz		F2_1_Pro_S141_R2_001.fastq.gz		F2_57_5_Pro_S119_R2_001.fastq.gz
BLANK_V4_H06_S48_R1_001.fastq.gz	F1_2_V4_S19_R1_001.fastq.gz		F1_89_5_V4_S3_R1_001.fastq.gz		F2_1_V4_S45_R1_001.fastq.gz		F2_57_5_V4_S23_R1_001.fastq.gz
BLANK_V4_H06_S48_R2_001.fastq.gz	F1_2_V4_S19_R2_001.fastq.gz		F1_89_5_V4_S3_R2_001.fastq.gz		F2_1_V4_S45_R2_001.fastq.gz		F2_57_5_V4_S23_R2_001.fastq.gz
F1_0_5_Arc_S55_R1_001.fastq		F1_41_5_Arc_S50_R1_001.fastq		F1_8_Arc_S79_R1_001.fastq		F2_24_Arc_S76_R1_001.fastq		F2_65_5_Arc_S77_R1_001.fastq
F1_0_5_Arc_S55_R2_001.fastq		F1_41_5_Arc_S50_R2_001.fastq		F1_8_Arc_S79_R2_001.fastq		F2_24_Arc_S76_R2_001.fastq		F2_65_5_Arc_S77_R2_001.fastq
F1_0_5_Pro_S103_R1_001.fastq.gz		F1_41_5_Pro_S98_R1_001.fastq.gz		F1_8_Pro_S127_R1_001.fastq.gz		F2_24_Pro_S124_R1_001.fastq.gz		F2_65_5_Pro_S125_R1_001.fastq.gz
F1_0_5_Pro_S103_R2_001.fastq.gz		F1_41_5_Pro_S98_R2_001.fastq.gz		F1_8_Pro_S127_R2_001.fastq.gz		F2_24_Pro_S124_R2_001.fastq.gz		F2_65_5_Pro_S125_R2_001.fastq.gz
F1_0_5_V4_S7_R1_001.fastq.gz		F1_41_5_V4_S2_R1_001.fastq.gz		F1_8_V4_S31_R1_001.fastq.gz		F2_24_V4_S28_R1_001.fastq.gz		F2_65_5_V4_S29_R1_001.fastq.gz
F1_0_5_V4_S7_R2_001.fastq.gz		F1_41_5_V4_S2_R2_001.fastq.gz		F1_8_V4_S31_R2_001.fastq.gz		F2_24_V4_S28_R2_001.fastq.gz		F2_65_5_V4_S29_R2_001.fastq.gz
F1_0_Arc_S49_R1_001.fastq		F1_42_5_Arc_S62_R1_001.fastq		F1_W_Arc_S78_R1_001.fastq		F2_2_Arc_S52_R1_001.fastq		F2_89_5_Arc_S83_R1_001.fastq
F1_0_Arc_S49_R2_001.fastq		F1_42_5_Arc_S62_R2_001.fastq		F1_W_Arc_S78_R2_001.fastq		F2_2_Arc_S52_R2_001.fastq		F2_89_5_Arc_S83_R2_001.fastq
F1_0_Pro_S97_R1_001.fastq.gz		F1_42_5_Pro_S110_R1_001.fastq.gz	F1_W_Pro_S126_R1_001.fastq.gz		F2_2_Pro_S100_R1_001.fastq.gz		F2_89_5_Pro_S131_R1_001.fastq.gz
F1_0_Pro_S97_R2_001.fastq.gz		F1_42_5_Pro_S110_R2_001.fastq.gz	F1_W_Pro_S126_R2_001.fastq.gz		F2_2_Pro_S100_R2_001.fastq.gz		F2_89_5_Pro_S131_R2_001.fastq.gz
F1_0_V4_S1_R1_001.fastq.gz		F1_42_5_V4_S14_R1_001.fastq.gz		F1_W_V4_S30_R1_001.fastq.gz		F2_2_V4_S4_R1_001.fastq.gz		F2_89_5_V4_S35_R1_001.fastq.gz
F1_0_V4_S1_R2_001.fastq.gz		F1_42_5_V4_S14_R2_001.fastq.gz		F1_W_V4_S30_R2_001.fastq.gz		F2_2_V4_S4_R2_001.fastq.gz		F2_89_5_V4_S35_R2_001.fastq.gz
F1_101_5_Arc_S57_R1_001.fastq		F1_42_Arc_S56_R1_001.fastq		F2_0_5_Arc_S87_R1_001.fastq		F2_41_5_Arc_S82_R1_001.fastq		F2_8_Arc_S64_R1_001.fastq
F1_101_5_Arc_S57_R2_001.fastq		F1_42_Arc_S56_R2_001.fastq		F2_0_5_Arc_S87_R2_001.fastq		F2_41_5_Arc_S82_R2_001.fastq		F2_8_Arc_S64_R2_001.fastq
F1_101_5_Pro_S105_R1_001.fastq.gz	F1_42_Pro_S104_R1_001.fastq.gz		F2_0_5_Pro_S135_R1_001.fastq.gz		F2_41_5_Pro_S130_R1_001.fastq.gz	F2_8_Pro_S112_R1_001.fastq.gz
F1_101_5_Pro_S105_R2_001.fastq.gz	F1_42_Pro_S104_R2_001.fastq.gz		F2_0_5_Pro_S135_R2_001.fastq.gz		F2_41_5_Pro_S130_R2_001.fastq.gz	F2_8_Pro_S112_R2_001.fastq.gz
F1_101_5_V4_S9_R1_001.fastq.gz		F1_42_V4_S8_R1_001.fastq.gz		F2_0_5_V4_S39_R1_001.fastq.gz		F2_41_5_V4_S34_R1_001.fastq.gz		F2_8_V4_S16_R1_001.fastq.gz
F1_101_5_V4_S9_R2_001.fastq.gz		F1_42_V4_S8_R2_001.fastq.gz		F2_0_5_V4_S39_R2_001.fastq.gz		F2_41_5_V4_S34_R2_001.fastq.gz		F2_8_V4_S16_R2_001.fastq.gz
F1_128_5_Arc_S63_R1_001.fastq		F1_43_5_Arc_S68_R1_001.fastq		F2_0_Arc_S81_R1_001.fastq		F2_42_5_Arc_S94_R1_001.fastq		F2_W_Arc_S84_R1_001.fastq
F1_128_5_Arc_S63_R2_001.fastq		F1_43_5_Arc_S68_R2_001.fastq		F2_0_Arc_S81_R2_001.fastq		F2_42_5_Arc_S94_R2_001.fastq		F2_W_Arc_S84_R2_001.fastq
F1_128_5_Pro_S111_R1_001.fastq.gz	F1_43_5_Pro_S116_R1_001.fastq.gz	F2_0_Pro_S129_R1_001.fastq.gz		F2_42_5_Pro_S142_R1_001.fastq.gz	F2_W_Pro_S132_R1_001.fastq.gz
F1_128_5_Pro_S111_R2_001.fastq.gz	F1_43_5_Pro_S116_R2_001.fastq.gz	F2_0_Pro_S129_R2_001.fastq.gz		F2_42_5_Pro_S142_R2_001.fastq.gz	F2_W_Pro_S132_R2_001.fastq.gz
F1_128_5_V4_S15_R1_001.fastq.gz		F1_43_5_V4_S20_R1_001.fastq.gz		F2_0_V4_S33_R1_001.fastq.gz		F2_42_5_V4_S46_R1_001.fastq.gz		F2_W_V4_S36_R1_001.fastq.gz
F1_128_5_V4_S15_R2_001.fastq.gz		F1_43_5_V4_S20_R2_001.fastq.gz		F2_0_V4_S33_R2_001.fastq.gz		F2_42_5_V4_S46_R2_001.fastq.gz		F2_W_V4_S36_R2_001.fastq.gz
F1_155_5_Arc_S69_R1_001.fastq		F1_45_5_Arc_S74_R1_001.fastq		F2_101_5_Arc_S89_R1_001.fastq		F2_42_Arc_S88_R1_001.fastq		FD1_Arc_S66_R1_001.fastq
F1_155_5_Arc_S69_R2_001.fastq		F1_45_5_Arc_S74_R2_001.fastq		F2_101_5_Arc_S89_R2_001.fastq		F2_42_Arc_S88_R2_001.fastq		FD1_Arc_S66_R2_001.fastq
F1_155_5_Pro_S117_R1_001.fastq.gz	F1_45_5_Pro_S122_R1_001.fastq.gz	F2_101_5_Pro_S137_R1_001.fastq.gz	F2_42_Pro_S136_R1_001.fastq.gz		FD1_Pro_S114_R1_001.fastq.gz
F1_155_5_Pro_S117_R2_001.fastq.gz	F1_45_5_Pro_S122_R2_001.fastq.gz	F2_101_5_Pro_S137_R2_001.fastq.gz	F2_42_Pro_S136_R2_001.fastq.gz		FD1_Pro_S114_R2_001.fastq.gz
F1_155_5_V4_S21_R1_001.fastq.gz		F1_45_5_V4_S26_R1_001.fastq.gz		F2_101_5_V4_S41_R1_001.fastq.gz		F2_42_V4_S40_R1_001.fastq.gz		FD1_V4_S18_R1_001.fastq.gz
F1_155_5_V4_S21_R2_001.fastq.gz		F1_45_5_V4_S26_R2_001.fastq.gz		F2_101_5_V4_S41_R2_001.fastq.gz		F2_42_V4_S40_R2_001.fastq.gz		FD1_V4_S18_R2_001.fastq.gz
F1_16_Arc_S85_R1_001.fastq		F1_49_5_Arc_S80_R1_001.fastq		F2_128_5_Arc_S95_R1_001.fastq		F2_43_5_Arc_S53_R1_001.fastq		M1_Arc_S72_R1_001.fastq
F1_16_Arc_S85_R2_001.fastq		F1_49_5_Arc_S80_R2_001.fastq		F2_128_5_Arc_S95_R2_001.fastq		F2_43_5_Arc_S53_R2_001.fastq		M1_Arc_S72_R2_001.fastq
F1_16_Pro_S133_R1_001.fastq.gz		F1_49_5_Pro_S128_R1_001.fastq.gz	F2_128_5_Pro_S143_R1_001.fastq.gz	F2_43_5_Pro_S101_R1_001.fastq.gz	M1_Pro_S120_R1_001.fastq.gz
F1_16_Pro_S133_R2_001.fastq.gz		F1_49_5_Pro_S128_R2_001.fastq.gz	F2_128_5_Pro_S143_R2_001.fastq.gz	F2_43_5_Pro_S101_R2_001.fastq.gz	M1_Pro_S120_R2_001.fastq.gz
F1_16_V4_S37_R1_001.fastq.gz		F1_49_5_V4_S32_R1_001.fastq.gz		F2_128_5_V4_S47_R1_001.fastq.gz		F2_43_5_V4_S5_R1_001.fastq.gz		M1_V4_S24_R1_001.fastq.gz
F1_16_V4_S37_R2_001.fastq.gz		F1_49_5_V4_S32_R2_001.fastq.gz		F2_128_5_V4_S47_R2_001.fastq.gz		F2_43_5_V4_S5_R2_001.fastq.gz		M1_V4_S24_R2_001.fastq.gz
F1_182_5_Arc_S75_R1_001.fastq		F1_4_Arc_S73_R1_001.fastq		F2_155_5_Arc_S54_R1_001.fastq		F2_45_5_Arc_S59_R1_001.fastq		index.html?C=M;O=A
F1_182_5_Arc_S75_R2_001.fastq		F1_4_Arc_S73_R2_001.fastq		F2_155_5_Arc_S54_R2_001.fastq		F2_45_5_Arc_S59_R2_001.fastq		index.html?C=M;O=D
F1_182_5_Pro_S123_R1_001.fastq.gz	F1_4_Pro_S121_R1_001.fastq.gz		F2_155_5_Pro_S102_R1_001.fastq.gz	F2_45_5_Pro_S107_R1_001.fastq.gz	index.html?C=N;O=A
F1_182_5_Pro_S123_R2_001.fastq.gz	F1_4_Pro_S121_R2_001.fastq.gz		F2_155_5_Pro_S102_R2_001.fastq.gz	F2_45_5_Pro_S107_R2_001.fastq.gz	index.html?C=N;O=D
F1_182_5_V4_S27_R1_001.fastq.gz		F1_4_V4_S25_R1_001.fastq.gz		F2_155_5_V4_S6_R1_001.fastq.gz		F2_45_5_V4_S11_R1_001.fastq.gz		index.html?C=S;O=A
F1_182_5_V4_S27_R2_001.fastq.gz		F1_4_V4_S25_R2_001.fastq.gz		F2_155_5_V4_S6_R2_001.fastq.gz		F2_45_5_V4_S11_R2_001.fastq.gz		index.html?C=S;O=D
F1_1_Arc_S61_R1_001.fastq		F1_57_5_Arc_S86_R1_001.fastq		F2_16_Arc_S70_R1_001.fastq		F2_49_5_Arc_S65_R1_001.fastq		md5.txt
F1_1_Arc_S61_R2_001.fastq		F1_57_5_Arc_S86_R2_001.fastq		F2_16_Arc_S70_R2_001.fastq		F2_49_5_Arc_S65_R2_001.fastq
F1_1_Pro_S109_R1_001.fastq.gz		F1_57_5_Pro_S134_R1_001.fastq.gz	F2_16_Pro_S118_R1_001.fastq.gz		F2_49_5_Pro_S113_R1_001.fastq.gz
F1_1_Pro_S109_R2_001.fastq.gz		F1_57_5_Pro_S134_R2_001.fastq.gz	F2_16_Pro_S118_R2_001.fastq.gz		F2_49_5_Pro_S113_R2_001.fastq.gz
F1_1_V4_S13_R1_001.fastq.gz		F1_57_5_V4_S38_R1_001.fastq.gz		F2_16_V4_S22_R1_001.fastq.gz		F2_49_5_V4_S17_R1_001.fastq.gz

But as you can see from the import check…it removed Arc from the name and it looks like things were renumbered?

Would this be the issue?

Well, that is because you renamed them when importing with your manifest file: you removed the 'Arc' portion from the sample IDs.

F1_0,F1_0_Arc_S49_R1_001.fastq.gz,forward
F1_0,F1_0_Arc_S49_R2_001.fastq.gz,reverse
F1_0_5,F1_0_5_Arc_S55_R1_001.fastq.gz,forward
F1_0_5,F1_0_5_Arc_S55_R2_001.fastq.gz,reverse

If you wanted to keep that part of the ID, the manifest would look like this:

F1_0_Arc,F1_0_Arc_S49_R1_001.fastq.gz,forward
F1_0_Arc,F1_0_Arc_S49_R2_001.fastq.gz,reverse
F1_0_5_Arc,F1_0_5_Arc_S55_R1_001.fastq.gz,forward
F1_0_5_Arc,F1_0_5_Arc_S55_R2_001.fastq.gz,reverse

Or, you can relabel entirely:

peanut,F1_0_Arc_S49_R1_001.fastq.gz,forward
peanut,F1_0_Arc_S49_R2_001.fastq.gz,reverse
butter,F1_0_5_Arc_S55_R1_001.fastq.gz,forward
butter,F1_0_5_Arc_S55_R2_001.fastq.gz,reverse

Anyway, it looks like something is truncating (or left-splitting on underscores) in a way that makes the two samples F1_0 & F1_0_5 look the same to DADA2 (I think), so that it loads the forward reads from one sample and the reverse reads from the other. Pinging @benjjneb to see if this is a known issue in DADA2. If not, we can try and hunt down what is happening in q2-dada2.
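
To make the suspected collision concrete, here is a small sketch using two of the exported file names from the import check. `id_from_left` is a hypothetical parsing rule invented for this illustration; it is not the code in q2-dada2 or DADA2. `id_from_right` assumes the exported layout is `<sample-id>_<index>_L001_R#_001.fastq`, which is only inferred from the listing in this thread.

```shell
#!/usr/bin/env bash

# Hypothetical rule: keep the first two underscore-separated fields.
# Any sample ID with more than one underscore gets truncated.
id_from_left() {
    echo "$1" | cut -d_ -f1-2
}

# Alternative: strip the known trailing fields from the right,
# so underscores inside the sample ID are left alone.
id_from_right() {
    echo "$1" | sed -E 's/_[0-9]+_L[0-9]+_R[12]_[0-9]+\.fastq$//'
}

for f in F1_0_3_L001_R1_001.fastq F1_0_5_2_L001_R1_001.fastq; do
    echo "$f left=$(id_from_left "$f") right=$(id_from_right "$f")"
done
```

Under the left-hand rule both files come out as F1_0, while stripping from the right keeps F1_0 and F1_0_5 apart. Whether anything like the left-hand rule is actually in play here is an open question at this point.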

Moving forward, one option is to come up with a new Sample ID scheme for your manifest file to prevent this collision from occurring: F1_0 & F1_0-5 (for example).

Anyway, we will wait and see what @benjjneb thinks about this.

The issue is probably the underscores. In the plugin, sample names are defined by splitting the filenames on the underscore character: https://github.com/qiime2/q2-dada2/blob/master/q2_dada2/_denoise.py#L88

The workaround is probably what @thermokarst suggested: rename so that the sample names contain no underscores.
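
One way to apply that rename without touching the file paths is to rewrite only the sample-id column of the manifest. A minimal sketch, assuming a comma-separated manifest whose first line is the header and whose first column is the sample ID; `dash_sample_ids` is a made-up helper name.

```shell
#!/usr/bin/env bash

# Replace underscores with dashes in column 1 (the sample ID) only.
# The header line and any line starting with "#" are passed through unchanged.
dash_sample_ids() {
    awk 'BEGIN { FS = OFS = "," }
         NR > 1 && $1 !~ /^#/ { gsub(/_/, "-", $1) }
         { print }' "$1"
}
```

For example, `dash_sample_ids AD_exp3_Arc_manifest-pe_sort.csv > manifest_dashes.csv` would turn the sample ID F1_0_5 into F1-0-5 while leaving the fastq paths as they are (the output file name here is arbitrary). The data then needs to be re-imported with the new manifest.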

On the plugin side, should this get checked for and an exception raised if found?

Ok - I will rename with dashes and see if that solves the problem!

It would be great if an exception/error could be raised, so that I would know that my naming format was incorrect or potentially not parse-able by DADA2.
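
A user-side check on the manifest is possible before import. The sketch below flags sample IDs that are a prefix of another ID followed by an underscore (the F1_0 / F1_0_5 pattern). It assumes a comma-separated manifest with a header line and the sample ID in column 1; `find_prefix_collisions` is a hypothetical helper, and the pairs it prints are only candidates to rename, not a confirmed diagnosis.

```shell
#!/usr/bin/env bash

# Print "shorter longer" for each pair of sample IDs where
# the longer ID starts with the shorter ID plus an underscore.
find_prefix_collisions() {
    awk -F, 'NR > 1 && $1 !~ /^#/ { ids[$1] = 1 }
             END {
                 for (a in ids)
                     for (b in ids)
                         if (a != b && index(b, a "_") == 1)
                             print a, b
             }' "$1" | sort
}
```

Running `find_prefix_collisions AD_exp3_Arc_manifest-pe_sort.csv` prints one line per suspicious pair and nothing if no ID is an underscore-prefix of another.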

Thanks @mmelendrez - we are still working out where the exact issue is coming up - we will post a link to a materialized issue ticket once we identify the source of the error. Stay tuned!

DADA2 seems to be running normally now. I renamed my files changing underscores to dashes. Thanks!

Okay, I took a closer look at reproducing the “underscore error” you reported above, and I noticed that you are using a very old and unsupported version of QIIME 2 (2017.10). As well, I tried manually reproducing the situation (by creating many variations of sample IDs with underscores all over them) and was unsuccessful - the import and DADA2 worked as expected every time with IDs like sample_1/sample_1_1 and sam_ple_1/sam_ple_1_1, etc.

If the original manifest you provided above is still producing this error when you upgrade to 2018.4 (the latest version of QIIME 2), please send us the offending SampleData[PairedEndSequencesWithQuality] so that we can debug manually. Thanks! :t_rex: :qiime2:
