Hi, I am having trouble figuring out where to trim my data! Any suggestions? I am trying to use dada2 denoise-paired:
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs demux-paired-end.qza \
  --p-trim-left-f VALUE \
  --p-trim-left-r VALUE \
  --p-trunc-len-f VALUE \
  --p-trunc-len-r VALUE \
  --p-n-threads 12 \
  --o-representative-sequences rep-seqs.qza \
  --o-table table.qza
I attached a copy of my visualization below for reference! The left plot is my forward reads and the right is my reverse reads.
Hi @desiree757,
Trimming/truncating has quite a bit of subjectivity to it, depending on the data, the primer pair used, the overlap region, etc. There are several discussions with detailed explanations and examples of picking these parameters floating around the forum that you would benefit from reading.
However, the plots you are showing are rather odd, at least compared to the norm we see around here, since the sequences are only 75 bp long. So we'll need a bit more info to decide.
What is the target gene here and what primers were used? I guess more importantly, is there an overlap region? If so, what is the size of that region? Also, are these Illumina reads? Has there been any other quality control/trimming done to these prior to importing?
This was done using Illumina sequencing. The target gene is 16S rRNA. The primers I used were 515F and 926R. There has been no other quality control or trimming prior to importing!
Hmm, are the plots you are showing zoomed in to 75 bp, or is this actually a 2 x 75 bp Illumina run? If they are zoomed in, then please either share the non-zoomed image or, better, share the actual .qzv file if you can.
I ask because the primers you describe (515F/926R) would give an expected amplicon size of ~411 bp (926 - 515), which is much longer than 2 x 75 bp. That means you would have sequenced two regions very far from each other without any overlap, so the reads cannot be merged. This is unconventional as far as sequencing goes, since the reads are pretty short and would suffer in resolution. If this is in fact the scenario, then I would just discard the reverse reads and use the forward reads only. From the plots above, and given that the reads are already pretty short to begin with, I would not trim or truncate anything and see how that turns out.
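The arithmetic behind that estimate can be sketched in a few lines of shell. This is only a rough check: it assumes the primer names give approximate start and end positions of the amplicon, that the run really is 2 x 75 bp, and it ignores primer lengths. The variable names are made up for illustration.

```shell
#!/bin/sh
# Rough overlap estimate for a paired-end run.
# Positions are taken from the primer names (515F / 926R); read_len is an
# assumption based on the 75 bp quality plots.
fwd_primer_pos=515
rev_primer_pos=926
read_len=75

amplicon_len=$(( rev_primer_pos - fwd_primer_pos ))   # 926 - 515 = 411
overlap=$(( 2 * read_len - amplicon_len ))            # 150 - 411 = -261

if [ "$overlap" -gt 0 ]; then
  echo "Expected overlap: ${overlap} bp"
else
  echo "No overlap: the two reads fall short by $(( -overlap )) bp"
fi
```

With these numbers the reads fall short of meeting by 261 bp, which is why merging is not possible here.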
Hi @Mehrbod_Estaki! So I am attempting to use just the forward reads right now. How do I skip the trim step but still get the tables I need to generate a tree for phylogenetic diversity analyses?
--p-trunc-len INTEGER Position at which sequences should be
truncated due to decrease in quality. This
truncates the 3' end of the of the input
sequences, which will be the bases that were
sequenced in the last cycles. Reads that are
shorter than this value will be discarded.
If 0 is provided, no truncation or length
filtering will be performed [required]
So, just put 0 to not truncate.
--p-trim-left INTEGER Position at which sequences should be
trimmed due to low quality. This trims the
5' end of the of the input sequences, which
will be the bases that were sequenced in the
first cycles. [default: 0]
And here, by default, no trimming is done, so you can just leave this option out.
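Putting the two options together, a forward-reads-only run without trimming or truncation might look like the sketch below. The file names are placeholders carried over from the command at the top of the thread, --o-denoising-stats is included on the assumption that your release expects that output, and the guard around the call is only there so the script does nothing when QIIME 2 or the input file is missing.

```shell
#!/bin/sh
# Placeholder file names; adjust to your own artifacts.
demux=demux-paired-end.qza
trunc_len=0   # 0 = no truncation or length filtering
trim_left=0   # 0 = no 5' trimming (this is also the default)

run_denoise() {
  qiime dada2 denoise-single \
    --i-demultiplexed-seqs "$demux" \
    --p-trunc-len "$trunc_len" \
    --p-trim-left "$trim_left" \
    --p-n-threads 12 \
    --o-representative-sequences rep-seqs.qza \
    --o-table table.qza \
    --o-denoising-stats denoising-stats.qza \
    --verbose
}

# Only run when QIIME 2 is on the PATH and the input artifact exists.
if command -v qiime >/dev/null 2>&1 && [ -f "$demux" ]; then
  run_denoise
else
  echo "qiime or $demux not found; command not run"
fi
```

Note the backslash at the end of every continued line; without them the shell treats each option as a separate command.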
Hi @desiree757,
Could you post the contents of /tmp/qiime2-q2cli-err-r9_lmd7m.log, or re-run your command with the --verbose flag included and paste the output here, please? Also, don't forget the backslashes at the end of each line, in case that's the exact command you typed.
Hi @Mehrbod_Estaki,
This is what I got! I did add the backslashes, but they won't paste in the forum.
Running external command line application(s). This may print messages to stdout and/or stderr.
The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.
R version 3.4.1 (2017-06-30)
Loading required package: Rcpp
The filter removed all reads: /tmp/tmp2dphk_8h/Female-1-2_S14_L001_R1_001.fastq.gz not written.
The filter removed all reads: /tmp/tmp2dphk_8h/Female-2-1_S15_L001_R1_001.fastq.gz not written.
The filter removed all reads: /tmp/tmp2dphk_8h/Female-2-2_S16_L001_R1_001.fastq.gz not written.
The filter removed all reads: /tmp/tmp2dphk_8h/Female-3-1_S17_L001_R1_001.fastq.gz not written.
The filter removed all reads: /tmp/tmp2dphk_8h/Female-3-2_S18_L001_R1_001.fastq.gz not written.
The filter removed all reads: /tmp/tmp2dphk_8h/Female-4-1_S19_L001_R1_001.fastq.gz not written.
The filter removed all reads: /tmp/tmp2dphk_8h/Female-4-2_S20_L001_R1_001.fastq.gz not written.
The filter removed all reads: /tmp/tmp2dphk_8h/Female-5-1_S21_L001_R1_001.fastq.gz not written.
The filter removed all reads: /tmp/tmp2dphk_8h/Female-5-2_S22_L001_R1_001.fastq.gz not written.
The filter removed all reads: /tmp/tmp2dphk_8h/Female-6-1_S23_L001_R1_001.fastq.gz not written.
The filter removed all reads: /tmp/tmp2dphk_8h/Female-6-2_S24_L001_R1_001.fastq.gz not written.
The filter removed all reads: /tmp/tmp2dphk_8h/Male-1-1_S1_L001_R1_001.fastq.gz not written.
The filter removed all reads: /tmp/tmp2dphk_8h/Male-1-2_S2_L001_R1_001.fastq.gz not written.
The filter removed all reads: /tmp/tmp2dphk_8h/Male-2-1_S3_L001_R1_001.fastq.gz not written.
The filter removed all reads: /tmp/tmp2dphk_8h/Male-2-2_S4_L001_R1_001.fastq.gz not written.
The filter removed all reads: /tmp/tmp2dphk_8h/Male-3-1_S5_L001_R1_001.fastq.gz not written.
The filter removed all reads: /tmp/tmp2dphk_8h/Male-3-2_S6_L001_R1_001.fastq.gz not written.
The filter removed all reads: /tmp/tmp2dphk_8h/Male-4-1_S7_L001_R1_001.fastq.gz not written.
The filter removed all reads: /tmp/tmp2dphk_8h/Male-4-2_S8_L001_R1_001.fastq.gz not written.
The filter removed all reads: /tmp/tmp2dphk_8h/Male-5-1_S9_L001_R1_001.fastq.gz not written.
The filter removed all reads: /tmp/tmp2dphk_8h/Male-5-2_S10_L001_R1_001.fastq.gz not written.
The filter removed all reads: /tmp/tmp2dphk_8h/Male-6-1_S11_L001_R1_001.fastq.gz not written.
The filter removed all reads: /tmp/tmp2dphk_8h/Male-6-2_S12_L001_R1_001.fastq.gz not written.
Some input samples had no reads pass the filter.
Learning Error Rates
Not all sequences were the same length.
Initializing error rates to maximum possible estimate.
Error rates could not be estimated.
Error in err[c(1, 6, 11, 16), ] <- 1 :
incorrect number of subscripts on matrix
Calls: dada
Execution halted
Traceback (most recent call last):
File "/home/smith5mr/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 152, in _denoise_single
run_commands([cmd])
File "/home/smith5mr/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 36, in run_commands
subprocess.run(cmd, check=True)
File "/home/smith5mr/miniconda3/envs/qiime2-2019.1/lib/python3.6/subprocess.py", line 418, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['run_dada_single.R', '/tmp/qiime2-archive-o6_pl5f1/a37454d7-9b88-487a-9b6e-047929aa36d5/data', '/tmp/tmp2dphk_8h/output.tsv.biom', '/tmp/tmp2dphk_8h/track.tsv', '/tmp/tmp2dphk_8h', '0', '0', '2.0', '2', 'Inf', 'consensus', '1.0', '1', '1000000', 'NULL', '16']' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/smith5mr/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2cli/commands.py", line 274, in call
results = action(**arguments)
File "</home/smith5mr/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/decorator.py:decorator-gen-440>", line 2, in denoise_single
File "/home/smith5mr/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
output_types, provenance)
File "/home/smith5mr/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 365, in callable_executor
output_views = self._callable(**view_args)
File "/home/smith5mr/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 187, in denoise_single
band_size='16')
File "/home/smith5mr/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 163, in _denoise_single
" and stderr to learn more." % e.returncode)
Exception: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.
Hi there @desiree757! How many samples are in the source dataset? The log indicates that many, most, or all of your samples aren't passing the filter step. As well, the R error at the end of that log might suggest that all of your reads are unique (I think), which means there are only singletons present.
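One quick way to compare against the number of samples is to count the "filter removed all reads" lines in the saved --verbose output. The sketch below works on a short excerpt copied from the log above and written to a temporary file; in practice you would point it at your own saved log (the temporary file here is only for illustration). The full log in this thread contains 23 such lines.

```shell
#!/bin/sh
# Excerpt of the DADA2 output from this thread, saved to a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
The filter removed all reads: /tmp/tmp2dphk_8h/Female-1-2_S14_L001_R1_001.fastq.gz not written.
The filter removed all reads: /tmp/tmp2dphk_8h/Female-2-1_S15_L001_R1_001.fastq.gz not written.
The filter removed all reads: /tmp/tmp2dphk_8h/Male-1-1_S1_L001_R1_001.fastq.gz not written.
Some input samples had no reads pass the filter.
EOF

# Count the samples that lost every read at the filtering step.
dropped=$(grep -c '^The filter removed all reads:' "$log")
echo "Samples with no reads after filtering: $dropped"

# List the affected sample names (the part of the file name before _S<number>).
sed -n 's|^The filter removed all reads: .*/\(.*\)_S[0-9]*_L001_R1_001\.fastq\.gz not written\.$|\1|p' "$log"

rm -f "$log"
```

If the count is close to the total number of samples, the filter settings (or the imported data) are the place to look before anything downstream.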
Hi @thermokarst, my professor suggested making a manifest file with my data first and running it that way! Any suggestions for that? I started another post on how to create the manifest file, as I do not understand how to format it in Excel!