Hi,
I ran into an issue during the denoising and merging step with DADA2; however, I believe the root cause is the way I imported my sequencing files. The files come from a collaborator and were sequenced on an Illumina MiSeq in the 2x300 bp configuration.
I originally imported the sequencing files using the manifest file format PairedEndFastqManifestPhred33V2 with the following code:
qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path manifest_file.txt \
  --output-path Omni_paired-end-demux.qza \
  --input-format PairedEndFastqManifestPhred33V2 &
The command completes and produces the demux artifact; however, when I try to validate it, I receive this message:
"/tmp/qiime2/kmt137/data/e1fb0b38-eb52-4391-b782-eec817cdadaa/data/18627_7_L001_R1_001.fastq.gz is not a(n) FastqGzFormat file:
Quality score length doesn't match sequence length for record beginning on line 526137"
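A check along these lines might confirm the mismatch outside of QIIME 2. It is only a sketch: check_fastq is a hypothetical helper name, it assumes a gzip-compressed file with strict four-line FASTQ records, and the path in the usage comment is a placeholder.

```shell
# Hypothetical helper: report the first FASTQ record whose quality string
# length differs from its sequence length (assumes 4 lines per record).
check_fastq() {
    gzip -cd "$1" | awk '
        NR % 4 == 2 { seqlen = length($0) }          # sequence line
        NR % 4 == 0 && length($0) != seqlen {        # quality line
            printf "mismatch in record starting on line %d\n", NR - 3
            bad = 1
            exit
        }
        END { exit bad }                             # non-zero if a mismatch was found
    '
}

# Usage (placeholder path):
# check_fastq /path/to/reads.fastq.gz
```

Running it on the original files from the collaborator, rather than on the copy inside the artifact, would show whether the problem exists before import.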
I am not sure why the file name in the message follows the Casava 1.8 convention when the actual files have names like 18627.fastq.gz. When I try the Casava 1.8 import command instead:
qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path /home/kmt137/Omni_Samples \
  --input-format CasavaOneEightSingleLanePerSampleDirFmt \
  --output-path demux-paired-end.qza
I receive this error:
There was a problem importing casava-18-paired-end-demultiplexed:
Missing one or more files for CasavaOneEightSingleLanePerSampleDirFmt: '.+_.+_L[0-9][0-9][0-9]_R[12]_001\.fastq\.gz'
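To see which file names fail that pattern, something like the following could be run against the input directory. This is a sketch only: list_non_casava is a hypothetical helper name, the regular expression is copied from the error message, and the ^ and $ anchors are my assumption about how the whole name is matched.

```shell
# Pattern copied from the import error; the ^...$ anchors are an assumption.
casava_pattern='^.+_.+_L[0-9][0-9][0-9]_R[12]_001\.fastq\.gz$'

# Hypothetical helper: print every *.fastq.gz name in a directory that
# does NOT match the Casava 1.8 pattern.
list_non_casava() {
    for f in "$1"/*.fastq.gz; do
        [ -e "$f" ] || continue                      # skip if the glob matched nothing
        name=$(basename "$f")
        printf '%s\n' "$name" | grep -Eq "$casava_pattern" || printf '%s\n' "$name"
    done
}

# Usage:
# list_non_casava /home/kmt137/Omni_Samples
```

Any name it prints would need to be renamed before the Casava 1.8 importer could accept the directory.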
Downstream, when I ran DADA2 on the demux artifact imported with the Phred33V2 manifest format, I received these error messages:
error in names(answer) <- names1 :
'names' attribute [84] must be the same length as the vector [12]
Execution halted
Traceback (most recent call last):
  File "/home/kmt137/.conda/envs/qiime2-2022.2/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 279, in denoise_paired
    run_commands([cmd])
  File "/home/kmt137/.conda/envs/qiime2-2022.2/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 36, in run_commands
    subprocess.run(cmd, check=True)
  File "/home/kmt137/.conda/envs/qiime2-2022.2/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['run_dada_paired.R', '/tmp/tmpjhwvkaw_/forward', '/tmp/tmpjhwvkaw_/reverse', '/tmp/tmpjhwvkaw_/output.tsv.biom', '/tmp/tmpjhwvkaw_/track.tsv', '/tmp/tmpjhwvkaw_/filt_f', '/tmp/tmpjhwvkaw_/filt_r', '294', '244', '6', '8', '2.0', '2.0', '2', '12', 'independent', 'consensus', '1.0', '60', '1000000']' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/kmt137/.conda/envs/qiime2-2022.2/lib/python3.8/site-packages/q2cli/commands.py", line 339, in call
    results = action(**arguments)
  File "", line 2, in denoise_paired
  File "/home/kmt137/.conda/envs/qiime2-2022.2/lib/python3.8/site-packages/qiime2/sdk/action.py", line 245, in bound_callable
    outputs = self.callable_executor(scope, callable_args,
  File "/home/kmt137/.conda/envs/qiime2-2022.2/lib/python3.8/site-packages/qiime2/sdk/action.py", line 391, in callable_executor
    output_views = self._callable(**view_args)
  File "/home/kmt137/.conda/envs/qiime2-2022.2/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 292, in denoise_paired
    raise Exception("An error was encountered while running DADA2"
Exception: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.
Any help would be really appreciated. I think this may be a file-naming issue, but I wanted to confirm before asking the collaborator to rename all the files.
Thanks!
Cake