Hello friends-
I have just been assigned the bioinformatics part of a study after someone left their position.
My end goal is to classify sequences in samples that used ANML primers for COI.
After trimming primers I looked at the output quality plots and min sequence length:
Is it weird that the min sequence length is so different between F and R? I've done this analysis twice with my own projects- and I have never had this issue.
I believe it is impacting my dada2 downstream- I am getting Plugin error from dada2:
An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.
I have tried many different trunc lengths- but I suppose that is an issue to tackle later down the line?
I am using qiime2-2019.10 because that is what my classifier was trained on.
R version 3.5.1 (2018-07-02)
Loading required package: Rcpp
DADA2: 1.10.0 / Rcpp: 1.0.2 / RcppParallel: 4.4.4
Filtering The filter removed all reads: /var/folders/yp/gmbd6_256wnbb6qyf3s3v_6r0000gn/T/tmpkg09sibn/filt_f/428_138_L001_R1_001.fastq.gz and /var/folders/yp/gmbd6_256wnbb6qyf3s3v_6r0000gn/T/tmpkg09sibn/filt_r/428_139_L001_R2_001.fastq.gz not written.
The filter removed all reads: /var/folders/yp/gmbd6_256wnbb6qyf3s3v_6r0000gn/T/tmpkg09sibn/filt_f/296-2_96_L001_R1_001.fastq.gz and /var/folders/yp/gmbd6_256wnbb6qyf3s3v_6r0000gn/T/tmpkg09sibn/filt_r/296-2_97_L001_R2_001.fastq.gz not written.
Some input samples had no reads pass the filter.
.........................................................................................................................................x....................x.................................................................................................
Learning Error Rates
289560551 total bases in 1067489 reads from 4 samples will be used for learning the error rates.
244454981 total bases in 1067489 reads from 4 samples will be used for learning the error rates.
Denoise remaining samples ..................................................................................................................................................................................................................................................Error in dada(drpR, err = errR, multithread = multithread, verbose = FALSE) :
Invalid derep$quals matrix. Quality values must be positive integers.
Execution halted
Traceback (most recent call last):
File "/Users/karasnow/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 257, in denoise_paired
run_commands([cmd])
File "/Users/karasnow/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 36, in run_commands
subprocess.run(cmd, check=True)
File "/Users/karasnow/miniconda3/envs/qiime2-2019.10/lib/python3.6/subprocess.py", line 418, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['run_dada_paired.R', '/var/folders/yp/gmbd6_256wnbb6qyf3s3v_6r0000gn/T/tmpkg09sibn/forward', '/var/folders/yp/gmbd6_256wnbb6qyf3s3v_6r0000gn/T/tmpkg09sibn/reverse', '/var/folders/yp/gmbd6_256wnbb6qyf3s3v_6r0000gn/T/tmpkg09sibn/output.tsv.biom', '/var/folders/yp/gmbd6_256wnbb6qyf3s3v_6r0000gn/T/tmpkg09sibn/track.tsv', '/var/folders/yp/gmbd6_256wnbb6qyf3s3v_6r0000gn/T/tmpkg09sibn/filt_f', '/var/folders/yp/gmbd6_256wnbb6qyf3s3v_6r0000gn/T/tmpkg09sibn/filt_r', '0', '229', '0', '0', '2.0', '2.0', '2', 'consensus', '1.0', '0', '1000000']' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/karasnow/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/q2cli/commands.py", line 328, in call
results = action(**arguments)
File "</Users/karasnow/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/decorator.py:decorator-gen-459>", line 2, in denoise_paired
File "/Users/karasnow/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/qiime2/sdk/action.py", line 240, in bound_callable
output_types, provenance)
File "/Users/karasnow/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/qiime2/sdk/action.py", line 383, in callable_executor
output_views = self._callable(**view_args)
File "/Users/karasnow/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 272, in denoise_paired
" and stderr to learn more." % e.returncode)
Exception: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.
hmm I am still trying to figure out my issue! I do hope I can get this figured out.
I am not sure how to figure out which samples might have the weird Phred scores
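One way to narrow this down is to scan each demultiplexed FASTQ file and print its read-length and quality-score ranges. The sketch below is only a rough diagnostic, not part of QIIME 2 or DADA2. It assumes plain four-line FASTQ records, Phred+33 encoding, and that the per-sample files are in one directory (for example, the files your manifest points to). The function name `summarize_fastq` and the constant `PHRED_OFFSET` are made up for this example.

```python
import gzip
import sys
from pathlib import Path

PHRED_OFFSET = 33  # assumption: Phred+33 encoded quality strings


def summarize_fastq(path):
    """Summarize read lengths and quality scores for one FASTQ file."""
    path = str(path)
    opener = gzip.open if path.endswith(".gz") else open
    stats = {"reads": 0, "min_len": None, "max_len": None,
             "min_phred": None, "max_phred": None, "length_mismatches": 0}
    with opener(path, "rt") as handle:
        while True:
            header = handle.readline()
            if not header:
                break  # end of file
            if not header.strip():
                continue  # skip stray blank lines
            seq = handle.readline().rstrip("\r\n")
            handle.readline()  # '+' separator line
            qual = handle.readline().rstrip("\r\n")
            stats["reads"] += 1
            n = len(seq)
            stats["min_len"] = n if stats["min_len"] is None else min(stats["min_len"], n)
            stats["max_len"] = n if stats["max_len"] is None else max(stats["max_len"], n)
            if len(qual) != n:
                # sequence and quality strings should always be the same length
                stats["length_mismatches"] += 1
            if qual:
                scores = [ord(ch) - PHRED_OFFSET for ch in qual]
                lo, hi = min(scores), max(scores)
                stats["min_phred"] = lo if stats["min_phred"] is None else min(stats["min_phred"], lo)
                stats["max_phred"] = hi if stats["max_phred"] is None else max(stats["max_phred"], hi)
    return stats


# Usage: python scan_fastq.py /path/to/fastq_directory
if __name__ == "__main__" and len(sys.argv) > 1:
    for fastq in sorted(Path(sys.argv[1]).glob("*.fastq*")):
        print(fastq.name, summarize_fastq(fastq))
```

Files that report zero-length reads, a `min_phred` below 0, or any `length_mismatches` would be the first ones to look at more closely.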
Is there a way to remove samples after demultiplexing? I saw that someone was able to get past this error by removing samples with extremely low read counts. I would rather not have to go through demultiplexing again.
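Since the data came in through a manifest, one low-effort option is to write a filtered copy of the manifest that leaves out the problem samples and then re-import it the same way as before. The sketch below assumes a CSV manifest whose first line is the header and which has a `sample-id` column. The helper name `drop_samples` and the output filename are made up for this example.

```python
import csv


def drop_samples(manifest_in, manifest_out, samples_to_drop):
    """Copy a CSV manifest, skipping every row whose sample-id is in samples_to_drop.

    Returns (rows_kept, rows_removed).
    """
    drop = set(samples_to_drop)
    kept = removed = 0
    with open(manifest_in, newline="") as src, \
            open(manifest_out, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames,
                                lineterminator="\n")
        writer.writeheader()
        for row in reader:
            if row["sample-id"] in drop:
                removed += 1  # forward and reverse rows are both skipped
            else:
                writer.writerow(row)
                kept += 1
    return kept, removed


# Example call (IDs are a guess based on the filenames in the DADA2 log):
# drop_samples("eDNA_ManifestHighQual.csv", "eDNA_Manifest_filtered.csv",
#              ["428", "296-2"])
```

The two samples where the filter removed all reads would be natural first candidates to drop. Their filenames in the log suggest the IDs `428` and `296-2`, but check those against your manifest.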
Could I just get an opinion on what my trunc F and R should be? I have tried multiple lengths- but maybe I am just getting this wrong. (Attached: PrimerTrimmed_eDNA_Invas3.qzv)
Did we ever figure out if the files listed in eDNA_ManifestHighQual.csv contained raw fastq data or if it was already trimmed / processed in some way?
(I'm trying to track all the possibilities!)
Ah, I meant that as two different trunc length options. I'm not sure which one will work best.
(For reference, trim removes bases from the start of the read and I don't see any need for that in your data.)
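To make the trim vs. trunc distinction concrete, here is a toy sketch of how the two settings act on a single read. It follows the behavior described in DADA2's filterAndTrim documentation: reads shorter than the truncation length are discarded, and trimming comes off the start of the read. It is not the DADA2 code itself, and the function name `trim_and_truncate` is made up for this example.

```python
def trim_and_truncate(seq, trim_left=0, trunc_len=0):
    """Apply trunc/trim settings to one read.

    trunc_len=0 means no truncation. Returns None when the read is
    shorter than trunc_len, because such reads are discarded.
    """
    if trunc_len and len(seq) < trunc_len:
        return None  # too short to truncate: the read is dropped
    if trunc_len:
        seq = seq[:trunc_len]  # cut bases off the 3' end
    return seq[trim_left:]  # remove bases from the 5' start


read = "ACGTACGTAC"
print(trim_and_truncate(read, trunc_len=6))               # keeps the first 6 bases
print(trim_and_truncate(read, trim_left=2))               # drops the first 2 bases
print(trim_and_truncate(read, trim_left=2, trunc_len=6))  # length is 6 - 2 = 4
print(trim_and_truncate("ACG", trunc_len=6))              # None: read discarded
```

The last case is the one to keep in mind when picking trunc lengths: any read shorter than the chosen value is thrown away, so a large value on a sample with short reads can leave that sample with nothing.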
That's great! Did you view the DADA2 stats file after? Want to post those here so I can take a look?
Let me know what they say! If they can give you unprocessed fastq files, that's ideal.
Hiiii
With all of the trunc lengths I have tried, I have gotten the same(ish) error about invalid quality scores. (Invalid derep$quals matrix. Quality values must be positive integers.)
I feel pretty confident that the fastq files are raw - they certainly contain primer sequences.
It seems like I have used trunc lengths very similar to what you suggested and still have gotten the error....
which leads me to believe there is something else going on here