Importing fastq files after demultiplexing

Nir_Friedman · April 26, 2019, 12:30pm

Hi,
I received fastq.gz files from my sequencing provider after split libraries and demultiplexing.
In this run I had 6 samples, and he sent me 6 forward files and 6 reverse files separately.
I used to work with QIIME 1 a lot, and now I wish to try QIIME 2.
I have several questions:

Do I need to combine all the 6 forward files into one combined file and all the 6 reverse files into another file in order to start the analysis?
I tried to understand from the tutorial how I should import the files, but I couldn't. Should I use the "Fastq manifest" formats step?
Thanks a lot,
Nir

Nicholas_Bokulich · April 26, 2019, 1:52pm

No! If your reads are already demultiplexed you can use the manifest format to import these.

However, you mention "split libraries"... if your sequencing provider is running the qiime 1 split libraries script or performing any other kind of quality control on the sequences, you should ask for the raw data.

Good luck!

Nir_Friedman · April 26, 2019, 2:30pm

Thanks Nicholas.
My sequencing provider definitely did not use QIIME; he uses Qiagen's CLC.
According to him, he used split libraries (maybe he used this term by mistake, when he actually meant demultiplexing the samples).
The output I got is 12 fastq.gz files: 6 forward (R1) and 6 reverse (R2).
Now, should I use the manifest format?
And if I do, just to be sure: I copy both the fastq files and the manifest file to the same directory and run this command:
qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path pe-64-manifest \
  --output-path paired-end-demux.qza \
  --input-format PairedEndFastqManifestPhred64

Thanks again,
Nir

Nicholas_Bokulich · April 26, 2019, 2:32pm

Thanks for clarifying! That's good — some sequence providers use qiime1 scripts or other commands that demultiplex + QC. We recommend not running any QC, since this can conflict with later steps in QIIME 2 (e.g., running dada2).

Yes, unless your filenames fit the Casava format described in the tutorial. If in doubt, always use the manifest format.
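
To judge whether a set of filenames fits the Casava naming scheme, a quick check like the following may help. It is only a sketch: the pattern is an assumption modelled on the tutorial's example name L2S357_15_L001_R1_001.fastq.gz (sample ID, barcode ID, lane, read direction, set number), the helper names are hypothetical, and QIIME 2's own validation may be stricter or looser.

```python
import re

# Assumed pattern: <sample>_<barcode>_L<lane>_R<1|2>_<set>.fastq.gz
# This is an illustration, not QIIME 2's own validator.
CASAVA_PATTERN = re.compile(
    r"^(?P<sample>.+)_(?P<barcode>[^_]+)_L(?P<lane>\d{3})"
    r"_R(?P<read>[12])_(?P<set>\d{3})\.fastq\.gz$"
)


def looks_like_casava(filename):
    """Return True if the file name matches the assumed Casava layout."""
    return CASAVA_PATTERN.match(filename) is not None


def non_casava(filenames):
    """Return the file names that do not match the assumed layout."""
    return [name for name in filenames if not looks_like_casava(name)]
```

If non_casava returns anything for your 12 files, the manifest format is the safer choice.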

With the manifest format the files can be anywhere, since you specify each file path in the manifest. But that command looks correct.
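
Writing the manifest by hand for 12 files is easy to get wrong, so here is a minimal sketch of generating one. It assumes the files are named <sample>_R1.fastq.gz and <sample>_R2.fastq.gz (a hypothetical convention; adjust the pattern to whatever your provider uses), and it writes the comma-separated layout with the header sample-id,absolute-filepath,direction used by the PairedEndFastqManifest formats. The function names are made up for this example.

```python
import csv
import re
from pathlib import Path

# Hypothetical naming convention: <sample>_R1.fastq.gz / <sample>_R2.fastq.gz
READ_PATTERN = re.compile(r"^(?P<sample>.+)_R(?P<read>[12])\.fastq\.gz$")
DIRECTIONS = {"1": "forward", "2": "reverse"}


def build_manifest_rows(fastq_dir):
    """Return (sample-id, absolute path, direction) rows for a directory."""
    rows = []
    for path in sorted(Path(fastq_dir).glob("*.fastq.gz")):
        match = READ_PATTERN.match(path.name)
        if match is None:
            raise ValueError(f"Unexpected file name: {path.name}")
        direction = DIRECTIONS[match["read"]]
        rows.append((match["sample"], str(path.resolve()), direction))

    # Paired-end import needs both a forward and a reverse file per sample.
    seen = {}
    for sample, _, direction in rows:
        seen.setdefault(sample, set()).add(direction)
    incomplete = sorted(s for s, d in seen.items() if d != {"forward", "reverse"})
    if incomplete:
        raise ValueError(f"Samples missing a read direction: {incomplete}")
    return rows


def write_manifest(rows, manifest_path):
    """Write the rows as a comma-separated manifest with a header line."""
    with open(manifest_path, "w", newline="") as handle:
        writer = csv.writer(handle, lineterminator="\n")
        writer.writerow(["sample-id", "absolute-filepath", "direction"])
        writer.writerows(rows)
```

The resulting file is what you would pass to --input-path in the import command above.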

Good luck!

Nir_Friedman · April 26, 2019, 2:41pm

Excellent.
Thanks a lot.
Will try it and get back to you if needed.
Nir

Nir_Friedman · April 29, 2019, 11:57am

Hi Nicholas,
Your answer was very helpful and I managed to upload the files.
I then ran this command to clean out non-biological sequences:
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs /home/qiime2/Desktop/Netanya_6/paired-end-demux.qza \
  --p-trim-left-f 0 \
  --p-trim-left-r 0 \
  --p-trunc-len-f 240 \
  --p-trunc-len-r 240 \
  --p-n-threads 0 \
  --o-representative-sequences 6_electra_rep-seqs.qza \
  --o-table 6_electra_table.qza \
  --o-denoising-stats 6_electra_dada2.qza \
  --verbose

I got this error:
Running external command line application(s). This may print messages to stdout and/or stderr.
The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.

Command: run_dada_paired.R /tmp/tmpv2rg0v6_/forward /tmp/tmpv2rg0v6_/reverse /tmp/tmpv2rg0v6_/output.tsv.biom /tmp/tmpv2rg0v6_/track.tsv /tmp/tmpv2rg0v6_/filt_f /tmp/tmpv2rg0v6_/filt_r 240 240 0 0 2.0 2 consensus 1.0 0 1000000

R version 3.4.1 (2017-06-30)
Loading required package: Rcpp
DADA2 R package version: 1.6.0

Filtering …
Learning Error Rates
2a) Forward Reads
Initializing error rates to maximum possible estimate.
Sample 1 - 125909 reads in 48728 unique sequences.
Sample 2 - 124740 reads in 45881 unique sequences.
Sample 3 - 112445 reads in 48095 unique sequences.
Sample 4 - 134674 reads in 55631 unique sequences.
Sample 5 - 113621 reads in 43446 unique sequences.
Sample 6 - 135461 reads in 57261 unique sequences.
selfConsist step 2
Error in dada_uniques(names(derep[[i]]$uniques), unname(derep[[i]]$uniques), :
Memory allocation failed.
Calls: dada -> dada_uniques -> .Call
Execution halted
Warning message:
system call failed: Cannot allocate memory
Traceback (most recent call last):
  File "/home/qiime2/miniconda/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 231, in denoise_paired
    run_commands([cmd])
  File "/home/qiime2/miniconda/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 36, in run_commands
    subprocess.run(cmd, check=True)
  File "/home/qiime2/miniconda/envs/qiime2-2019.1/lib/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['run_dada_paired.R', '/tmp/tmpv2rg0v6_/forward', '/tmp/tmpv2rg0v6_/reverse', '/tmp/tmpv2rg0v6_/output.tsv.biom', '/tmp/tmpv2rg0v6_/track.tsv', '/tmp/tmpv2rg0v6_/filt_f', '/tmp/tmpv2rg0v6_/filt_r', '240', '240', '0', '0', '2.0', '2', 'consensus', '1.0', '0', '1000000']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/qiime2/miniconda/envs/qiime2-2019.1/lib/python3.6/site-packages/q2cli/commands.py", line 274, in __call__
    results = action(**arguments)
  File "</home/qiime2/miniconda/envs/qiime2-2019.1/lib/python3.6/site-packages/decorator.py:decorator-gen-442>", line 2, in denoise_paired
  File "/home/qiime2/miniconda/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
    output_types, provenance)
  File "/home/qiime2/miniconda/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 365, in callable_executor
    output_views = self._callable(**view_args)
  File "/home/qiime2/miniconda/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 246, in denoise_paired
    " and stderr to learn more." % e.returncode)
Exception: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

Plugin error from dada2:

An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

See above for debug info.

What does it mean, and what should I do to fix it?
Thanks a lot,
Nir

Nicholas_Bokulich · April 29, 2019, 12:01pm

Hi @Nir_Friedman,

Here is the key line: "Memory allocation failed."

You do not have enough memory to run this command! You will probably need to find a more powerful machine to run this (usually 8 GB RAM should suffice for running dada2 on a single sequencing run, but the more the merrier). For more troubleshooting advice, search the QIIME 2 Forum for 'dada2 Memory allocation failed'.
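
Before rerunning, it can be worth confirming how much memory the environment actually sees. The sketch below parses /proc/meminfo-style text and compares MemTotal against the rough 8 GB guideline mentioned above. It assumes a Linux system (where /proc/meminfo exists), the function names are hypothetical, and the threshold is a rule of thumb rather than a hard DADA2 requirement.

```python
def parse_meminfo(text):
    """Return {field name: value in KiB} from /proc/meminfo-style text."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        # Lines look like "MemTotal:  4039112 kB"; skip anything else.
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def has_enough_memory(meminfo_text, required_gib=8.0):
    """Compare MemTotal against a threshold in GiB (8 is only a guideline)."""
    total_kib = parse_meminfo(meminfo_text).get("MemTotal", 0)
    return total_kib / (1024 ** 2) >= required_gib
```

On Linux you could read the text with open("/proc/meminfo").read() and pass it to has_enough_memory; the same number is what "free -h" reports as total.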

Nir_Friedman · April 29, 2019, 12:07pm

Hi Nicholas,
I have allocated 4 GB RAM and 3 CPUs (my laptop has 32 GB RAM and 8 CPUs).
Should I increase only the RAM, or also the CPUs?
Was this the only problem, so that I can just rerun the command?
Thanks,
Nir

Nicholas_Bokulich · April 29, 2019, 12:13pm

Increase RAM. No need to increase CPUs unless you want to.

It's the only problem reported in that error message.

Good luck!

Nir_Friedman · April 29, 2019, 1:15pm

Great…
Thanks a lot,
Nir

Nicholas_Bokulich · April 30, 2019, 1:44pm

An off-topic reply has been split into a new topic: how do I import reference sequences for feature classification?

Please keep replies on-topic in the future.
