Understanding min sequence length of F and R/ quality scores and how it is impacting dada2 downstream

Hello friends-
I have just been assigned the bioinformatics part of a study after someone left their position.
My end goal is to classify sequences in samples that used ANML primers for COI.

After trimming primers I looked at the output quality plots and min sequence length:


Is it weird that the min sequence length is so different between F and R? I've done this analysis twice with my own projects, and I have never had this issue.
I believe it is impacting my dada2 downstream- I am getting
Plugin error from dada2:

An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

I have tried many different trunc lengths- but I suppose that is an issue to tackle later down the line?

I am using qiime2-2019.10 because that is what my classifier was trained on.

Thanks for help!

Hello Kara,

Can you post the full error from DADA2? That should give us more context for what may have caused this error.

Yes, I think that is unusual. My best guess is that R2 was trimmed by another program before it was imported into a QIIME 2 artifact.

R version 3.5.1 (2018-07-02)
Loading required package: Rcpp
DADA2: 1.10.0 / Rcpp: 1.0.2 / RcppParallel: 4.4.4

  1. Filtering The filter removed all reads: /var/folders/yp/gmbd6_256wnbb6qyf3s3v_6r0000gn/T/tmpkg09sibn/filt_f/428_138_L001_R1_001.fastq.gz and /var/folders/yp/gmbd6_256wnbb6qyf3s3v_6r0000gn/T/tmpkg09sibn/filt_r/428_139_L001_R2_001.fastq.gz not written.
    The filter removed all reads: /var/folders/yp/gmbd6_256wnbb6qyf3s3v_6r0000gn/T/tmpkg09sibn/filt_f/296-2_96_L001_R1_001.fastq.gz and /var/folders/yp/gmbd6_256wnbb6qyf3s3v_6r0000gn/T/tmpkg09sibn/filt_r/296-2_97_L001_R2_001.fastq.gz not written.
    Some input samples had no reads pass the filter.
    .........................................................................................................................................x....................x.................................................................................................
  2. Learning Error Rates
    289560551 total bases in 1067489 reads from 4 samples will be used for learning the error rates.
    244454981 total bases in 1067489 reads from 4 samples will be used for learning the error rates.
  3. Denoise remaining samples ..................................................................................................................................................................................................................................................Error in dada(drpR, err = errR, multithread = multithread, verbose = FALSE) :
    Invalid derep$quals matrix. Quality values must be positive integers.
    Execution halted
    Traceback (most recent call last):
    File "/Users/karasnow/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 257, in denoise_paired
    run_commands([cmd])
    File "/Users/karasnow/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 36, in run_commands
    subprocess.run(cmd, check=True)
    File "/Users/karasnow/miniconda3/envs/qiime2-2019.10/lib/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
    subprocess.CalledProcessError: Command '['run_dada_paired.R', '/var/folders/yp/gmbd6_256wnbb6qyf3s3v_6r0000gn/T/tmpkg09sibn/forward', '/var/folders/yp/gmbd6_256wnbb6qyf3s3v_6r0000gn/T/tmpkg09sibn/reverse', '/var/folders/yp/gmbd6_256wnbb6qyf3s3v_6r0000gn/T/tmpkg09sibn/output.tsv.biom', '/var/folders/yp/gmbd6_256wnbb6qyf3s3v_6r0000gn/T/tmpkg09sibn/track.tsv', '/var/folders/yp/gmbd6_256wnbb6qyf3s3v_6r0000gn/T/tmpkg09sibn/filt_f', '/var/folders/yp/gmbd6_256wnbb6qyf3s3v_6r0000gn/T/tmpkg09sibn/filt_r', '0', '229', '0', '0', '2.0', '2.0', '2', 'consensus', '1.0', '0', '1000000']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/Users/karasnow/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/q2cli/commands.py", line 328, in call
results = action(**arguments)
File "</Users/karasnow/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/decorator.py:decorator-gen-459>", line 2, in denoise_paired
File "/Users/karasnow/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/qiime2/sdk/action.py", line 240, in bound_callable
output_types, provenance)
File "/Users/karasnow/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/qiime2/sdk/action.py", line 383, in callable_executor
output_views = self._callable(**view_args)
File "/Users/karasnow/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 272, in denoise_paired
" and stderr to learn more." % e.returncode)
Exception: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

Invalid derep$quals matrix. Quality values must be positive integers.

This is the core of the error!

Once I'm back at my computer, I can help you more with this error.

Here are two threads related to this error:

It sounds pretty rare! I'm not sure what's best here.

While you review the threads, let's see what the other mods have to say!

Hmm, I am still trying to figure out my issue! I do hope I can get this figured out.
I am not sure how to work out which samples might have the weird Phred scores.

Is there a way to remove samples after demultiplex? I saw that someone was able to get past this error by removing extremely low read samples. I would rather not have to go through demultiplex again.

Hi @KaraS ,

Yes, I think demux filter-samples is the action that you are looking for:

Good luck!
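On the earlier question of how to spot which samples carry unusual Phred scores, here is a rough Python sketch that scans FASTQ files and reports the read count and the lowest and highest quality characters in each. It is not part of QIIME 2 or DADA2; the function names `quality_range` and `describe` are made up for illustration, and it assumes plain four-line FASTQ records. The thresholds in `describe` are only a rule of thumb.

```python
import gzip
import sys


def quality_range(fastq_path):
    """Return (n_reads, min_code, max_code) for the quality lines of one FASTQ file.

    Assumes plain four-line records (no wrapped sequence or quality lines).
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    n_reads, lo, hi = 0, None, None
    with opener(fastq_path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 != 3:  # the quality string is the 4th line of each record
                continue
            n_reads += 1
            codes = [ord(c) for c in line.rstrip("\r\n")]
            if codes:
                lo = min(codes) if lo is None else min(lo, min(codes))
                hi = max(codes) if hi is None else max(hi, max(codes))
    return n_reads, lo, hi


def describe(lo, hi):
    """Rough guess at the encoding from the observed ASCII range (heuristic only)."""
    if lo is None:
        return "no quality characters found"
    if lo < 33:
        return "invalid: characters below '!' (ASCII 33)"
    if hi > 74:
        return "above the usual Phred33 range; possibly Phred64"
    return "consistent with Phred33"


if __name__ == "__main__":
    # Usage (file names are placeholders):
    #   python check_quals.py sample_R1.fastq.gz sample_R2.fastq.gz
    for path in sys.argv[1:]:
        n, lo, hi = quality_range(path)
        print(f"{path}\treads={n}\tmin={lo}\tmax={hi}\t{describe(lo, hi)}")
```

Samples that show very few reads or an odd quality range would be the first candidates to leave out when filtering.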

Could I just get an opinion on what my trunc F and R should be? I have tried multiple lengths, but maybe I am just getting this wrong.
PrimerTrimmed_eDNA_Invas3.qzv (300.6 KB)

Good afternoon,

Did we ever figure out if the files listed in eDNA_ManifestHighQual.csv contained raw fastq data or if it was already trimmed / processed in some way?
(I'm trying to track all the possibilities!)

Here's where I would try trimming.

Where did you try trimming when running DADA2?
What did the DADA2 stats tell you after using these settings?

Hi thank you for the reply!
So would this mean
for forward:
trim 257
trunc 275
and reverse
trunc 275?

Just by visually inspecting the fastq files, they look like raw data with primers still attached. But I emailed the sequencing center to get more details.

Here are different trunc lengths I have tried:

--p-trunc-len-f 187
--p-trunc-len-r 187 \

--p-trunc-len-f 276
--p-trunc-len-r 276 \

--p-trunc-len-f 291
--p-trunc-len-r 273 \

--p-trunc-len-f 200
--p-trunc-len-r 240 \

Yes!

Ah, I meant that as two different trunc length options. I'm not sure which one will work best.
(For reference, trim removes bases from the start of the read and I don't see any need for that in your data.)
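To make the trim versus trunc distinction concrete, here is a small Python sketch of how the two settings act on a single read, following the documented DADA2 behaviour: truncate to a fixed length, discard reads shorter than that length, then remove bases from the start. The function name `filter_and_trim` and the 250 nt example read are made up for illustration, and the real filter also applies quality-based steps (such as expected-error filtering) that are left out here.

```python
def filter_and_trim(seq, trunc_len=0, trim_left=0):
    """Apply trunc/trim settings to one read (simplified illustration).

    Returns the processed read, or None if the read would be discarded.
    A trunc_len of 0 means no truncation and no length filtering.
    """
    if trunc_len > 0:
        if len(seq) < trunc_len:
            return None        # reads shorter than trunc_len are dropped
        seq = seq[:trunc_len]  # keep only the first trunc_len bases
    return seq[trim_left:]     # then remove trim_left bases from the start


read = "A" * 250  # hypothetical 250 nt read

print(len(filter_and_trim(read, trunc_len=229)))                # 229
print(filter_and_trim(read, trunc_len=276))                     # None (read too short)
print(len(filter_and_trim(read, trunc_len=229, trim_left=20)))  # 209
```

This is also why a trunc length longer than most reads in a sample can lead to "The filter removed all reads" for that sample.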

That's great! Did you view the DADA2 stats file after? Want to post those here so I can take a look?


Let me know what they say! If they can give you unprocessed fastq files, that's ideal.

Hiiii :slight_smile:
With all of the trunc lengths I have tried, I have gotten the same(ish) error about negative quality scores (Invalid derep$quals matrix. Quality values must be positive integers.).
I feel pretty confident that the fastq files are raw; they certainly contain primer sequences.

It seems like I have used trunc lengths very similar to what you suggested and still have gotten the error....
which leads me to believe there is something else going on here

Totally!
(My apologies, I thought you had fixed the error and had moved on to optimizing settings.)

In the DADA2 github repo, I found this thread: Error in dada(derepFs, err = errF, multithread = TRUE) : Invalid derep$quals matrix. Quality values must be positive integers. · Issue #838 · benjjneb/dada2 · GitHub

I'm not sure we have that option in Qiime2, but we may be able to convert the quality scores when we import the data.

Let's see what the devs recommend!
EDIT: The importing tutorial is here. Looks like it may be under construction.
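For reference, "converting the quality scores" would mean something like the following Python sketch, which re-encodes a quality string from ASCII offset 64 to offset 33. This assumes the files really are Phred64-encoded, which has not been confirmed for this dataset, and the function name `phred64_to_phred33` is made up for illustration. As far as I know, QIIME 2 also has Phred64 manifest import formats (for example `PairedEndFastqManifestPhred64`) that handle this at import time, so a manual conversion may not be needed.

```python
def phred64_to_phred33(qual):
    """Re-encode one quality string from ASCII offset 64 to offset 33.

    Raises ValueError if any character is below '@' (ASCII 64), since that
    would give a negative score and suggests the input is not Phred64.
    """
    scores = [ord(c) - 64 for c in qual]
    if any(q < 0 for q in scores):
        raise ValueError("character below '@' found; input does not look like Phred64")
    return "".join(chr(q + 33) for q in scores)


# 'h' is Q40 under offset 64 and becomes 'I' under offset 33; '@' is Q0 and becomes '!'
print(phred64_to_phred33("hh@"))  # II!
```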
