Can I run dada2single on a demultiplexed paired end file?

Running dada2paired doesn’t work because my reverse reads start with very poor quality bases, followed by acceptable quality after that. I still want to run dada2 on the data, and dada2single ran without errors and gave me the output files I asked for. Are these outputs good to use?

Note: I was able to run dada2 on the paired end reads once I trimmed the first 75 bases off the reverse read, but that seems like a lot of data to lose.

Thanks.

To answer the title question:

Can I run dada2single on a demultiplexed paired end file?

Yes, absolutely! The reverse reads will simply be ignored. You can see this in the help docs for the action:

...
Inputs:
  --i-demultiplexed-seqs ARTIFACT SampleData[SequencesWithQuality |
    PairedEndSequencesWithQuality]
                         The single-end demultiplexed sequences to be
                         denoised.                                  [required]
...

On the reverse read quality: so quality improves at later base positions? That sounds very unusual. Could you please share the demux quality plots QZV?

On whether the single-end outputs are good to use: definitely. This is not an unusual situation, which is why dada2 denoise-single accepts single-end or paired-end reads (you import paired-end reads, demux, then discover the reverse reads are no good or too short to join).

On trimming the first 75 bases: it is a lot to lose, but still worth it if it improves the length of the output reads and you trust the data. Again, this quality profile sounds rather unusual.

undetermined-sep10-demux.qzv (298.9 KB)
qzv file above.

I should also note that when I run dada2paired on the whole sequence (without trimming the first 75 bases), I get the following error:

Plugin error from dada2:

No reads passed the filter. trunc_len_f (250) or trunc_len_r (250) may be individually longer than read lengths, or trunc_len_f + trunc_len_r may be shorter than the length of the amplicon + 20 nucleotides (the length of the overlap). Alternatively, other arguments (such as max_ee or trunc_q) may be preventing reads from passing the filter.

Debug info has been saved to /local_scratch/9427094/qiime2-q2cli-err-eew48kv0.log
/var/spool/slurm/job9427094/slurm_script: line 16: --verbose: command not found

And to reiterate: dada2paired works if I trim the first 75 bases of the reverse read, and dada2single works on the demultiplexed paired-end file (which, as you mentioned above, should work, so that's good).
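
For anyone reading along, the length conditions named in that error message can be checked by hand before rerunning. Here is a minimal Python sketch, not part of QIIME 2 or DADA2: `check_trunc_lengths` is a hypothetical helper, and the read and amplicon lengths are made-up values for illustration, since the thread does not state them. It does not cover the max_ee or trunc_q filters that the message also mentions.

```python
def check_trunc_lengths(read_len_f, read_len_r, trunc_len_f, trunc_len_r,
                        amplicon_len, min_overlap=20):
    """Return the length problems named in the error message, if any.

    Hypothetical helper for illustration; it only restates the conditions
    in the message text and does not call QIIME 2 or DADA2.
    """
    problems = []
    if trunc_len_f > read_len_f:
        problems.append("trunc_len_f is longer than the forward reads")
    if trunc_len_r > read_len_r:
        problems.append("trunc_len_r is longer than the reverse reads")
    if trunc_len_f + trunc_len_r < amplicon_len + min_overlap:
        problems.append("trunc_len_f + trunc_len_r is shorter than amplicon + overlap")
    return problems


# Assumed values: 250 nt reads and a 400 nt amplicon (neither is given in the thread).
ok = check_trunc_lengths(250, 250, 250, 250, amplicon_len=400)
# Same settings with a longer, equally hypothetical 490 nt amplicon.
too_long = check_trunc_lengths(250, 250, 250, 250, amplicon_len=490)
print(ok)
print(too_long)
```

If the helper returns an empty list for your real lengths, the truncation settings are probably not the culprit, which points to the quality filters instead.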

Wow, what an unusual-looking quality profile. Sometimes we see short sections of lower quality if primers are on the 5’ end, so I wonder whether this could be caused by other low-complexity regions there, or even an adapter attached to the 5’ end of the reverse reads. In any case, I think you are doing the right thing by trimming the first 75 nt and hoping that joining the remaining length to the forward sequences yields useful data.

That makes sense: all reverse reads will be filtered out if you don’t trim, since the error rate is too high.
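
To put a rough number on that: as far as I know, the max_ee filter works on expected errors, the sum of 10^(-Q/10) over the bases in a read. Below is a minimal Python sketch with made-up quality scores (Q5 for the first 75 bases, Q35 afterwards; the real values are in the QZV) and an assumed threshold of 2.0, which I believe is the q2-dada2 default for max_ee. `expected_errors` is a hypothetical helper, not a QIIME 2 or DADA2 function.

```python
def expected_errors(phred_scores):
    """Sum of per-base error probabilities, 10^(-Q/10), for one read."""
    return sum(10 ** (-q / 10) for q in phred_scores)


# Hypothetical 250 nt reverse read: 75 poor-quality bases, then good ones.
low_quality_start = [5] * 75
good_tail = [35] * 175
untrimmed = low_quality_start + good_tail
trimmed = untrimmed[75:]  # same effect as trimming 75 nt from the 5' end

MAX_EE = 2.0  # assumed threshold; check the value used in your own run

for name, read in (("untrimmed", untrimmed), ("trimmed", trimmed)):
    ee = expected_errors(read)
    verdict = "passes" if ee <= MAX_EE else "is filtered out"
    print(f"{name}: expected errors = {ee:.2f}, {verdict}")
```

With numbers like these the untrimmed read fails the filter on the first 75 bases alone, while the trimmed read passes easily, which matches what you saw.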
