With DADA2, what happens to reads that don't merge?

dree · October 18, 2024, 8:04pm

Can DADA2 give me ASV results from paired-end reads after the denoising step even if the reads don't get merged? And if I get ASVs this way (paired ends not overlapping and hence not merging), are those reliable?

I couldn't find this question asked anywhere before, hence posting.

I'm running QIIME 2 2024.5 Amplicon Distribution using conda.

colinbrislawn · October 18, 2024, 8:22pm

Hello @dree,

Welcome to the forums!

All reads that don't merge are removed! They don't appear as ASVs or as counts in the table.

> (pair-ends not overlapping and hence not merging) are those reliable?

Uh... yes, they are reliably absent, because DADA2 throws them away.

This is why it's so important to pick DADA2 trimming settings that maximize the number of reads that can be merged!
This process is covered in this part of the Atacama Soils tutorial.

Do your reads overlap enough for them to merge?

(I've edited your post so it has a short title and the full question is in the body of the post.)

dree · October 19, 2024, 10:33am

Hi! Thank you for your reply.

I'm performing denoising using DADA2 directly on my reads.qza (which are paired-end, 2×150, and primer-removed).

colinbrislawn · October 19, 2024, 12:58pm

Cool. 150 bp forward and 150 bp in reverse. I have made a simple cartoon of your reads:

Read1 forward
|--------------->
Read2 reverse
<---------------|

What amplicon did you sequence?
How long is that amplicon?

dree · October 21, 2024, 12:33pm

Hi!
It's a 313 bp long amplicon.
Is there any option to just concatenate the paired-end reads, without requiring them to overlap (merge)?

colinbrislawn · October 21, 2024, 1:00pm

There is an option to 'just concatenate' in DADA2 (`justConcatenate`), but it's not included in the QIIME 2 plugin.

So why are we missing this feature?

Well, it turns out that it breaks most other programs downstream:

github.com/qiime2/q2-vsearch

Expose `fastq_join` command to concatenate (not merge) PE reads

opened 07:45PM - 07 Aug 24 UTC

colinbrislawn

Vsearch supports --fastq_join in which "sequences are not merged as with the fastq_mergepairs command, but simply joined with a gap."

**Addition Description**
It would work like the existing [merge-pairs (q2 plugin docs)](https://docs.qiime2.org/2024.5/plugins/available/vsearch/merge-pairs/).

**Current Behavior**
Only merging is supported, via overlap.

**Proposed Behavior**
- Add a new semantic type for formerly paired (now concatenated!) Illumina reads
- Add warnings and docs about how this differs from most other methods

**Questions**
1. Who wants to make this new semantic type?
2. What programs take discontiguous amplicons as input?

**Refs for vsearch**
- Forums [x-ref](https://forum.qiime2.org/t/concatenate-r1-and-r2-for-reads-that-cant-join/23646/).
- 803f-1392r amplicons are 590 bp, using PE300 [x-ref](https://forum.qiime2.org/t/handling-non-merging-paired-end-803f-1392r-amplicon-sequences-in-qiime-2-for-asv-generation/31196)

**Refs for DADA2**
- https://forum.qiime2.org/t/dada2-option-justconcatenate/21661/2
- https://forum.qiime2.org/t/classification-of-non-overlapping-reads-treated-with-justconcatenate/17914/7

> are pair-ends 2*150
> It's 313 long amplicon.

Read1 |--------------->                 150
Read2                 <---------------| 150
amp   |-------------------------------| 313
gap                   ^ 13 bp, so close!
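The arithmetic behind that cartoon can be sketched in a few lines of Python. This is only an illustration: the helper names `paired_end_overlap` and `can_merge` are made up here, the 313 bp length is assumed to span both reads end to end, and the 12 bp minimum overlap is DADA2's default `minOverlap` as far as I know.

```python
# Hypothetical helpers for illustration; not part of DADA2 or QIIME 2.
MIN_OVERLAP = 12  # assumption: DADA2's default minimum overlap for merging


def paired_end_overlap(fwd_len: int, rev_len: int, amplicon_len: int) -> int:
    """Overlap in bp between forward and reverse reads; negative means a gap."""
    return fwd_len + rev_len - amplicon_len


def can_merge(fwd_len: int, rev_len: int, amplicon_len: int,
              min_overlap: int = MIN_OVERLAP) -> bool:
    """True if the reads overlap by at least min_overlap bases."""
    return paired_end_overlap(fwd_len, rev_len, amplicon_len) >= min_overlap


print(paired_end_overlap(150, 150, 313))  # -13, i.e. a 13 bp gap
print(can_merge(150, 150, 313))           # False
print(can_merge(250, 250, 313))           # True (hypothetical longer reads)
```

Truncating reads can only shrink the overlap, so under these assumptions no trimming setting lets 2×150 reads merge across a 313 bp amplicon.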

~~You can select settings so your reads can join, or~~ just use the forward or reverse reads for DADA2 analysis, like a single-end run!

dree · October 27, 2024, 12:55pm

Hi!
So, can I use vsearch with --fastq_join, then come back to DADA2 to get my ASVs?

colinbrislawn · October 27, 2024, 1:02pm

You can try running vsearch --fastq_join and then DADA2 in R.

This will not work with Qiime2. Do you have more questions about the two posts explaining why?

If you want to use QIIME 2 with this data, another option is Deblur.
https://docs.qiime2.org/2024.5/plugins/available/deblur/

dree · October 27, 2024, 2:43pm

Oh. Got you. And I will have to shift to other tools for the downstream analyses, am I right? So, the take-home is that I can't proceed in QIIME 2 with paired reads which are just "concatenated"?

colinbrislawn · October 27, 2024, 2:46pm

Correct!

Discontiguous amplicons break the assumptions made by many tools, including QIIME 2.
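To make "discontiguous" concrete, here is a rough Python sketch of what just-concatenating a read pair produces. It is not DADA2's or vsearch's actual code: `just_concatenate`, `reverse_complement`, and the toy reads are made up for illustration, and the 10 N spacer follows my understanding of DADA2's `justConcatenate` behavior.

```python
# Illustrative sketch only; function names and reads are hypothetical.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA sequence (uppercase A, C, G, T, N only)."""
    return seq.translate(COMPLEMENT)[::-1]


def just_concatenate(fwd: str, rev: str, spacer: str = "N" * 10) -> str:
    """Join the forward read and the reverse-complemented reverse read
    with a fixed spacer, without looking for any overlap."""
    return fwd + spacer + reverse_complement(rev)


joined = just_concatenate("ACGTAC", "TTTGGG")
print(joined)       # ACGTACNNNNNNNNNNCCCAAA
print(len(joined))  # 22
```

The spacer has a fixed length, so the joined sequence does not record how many bases are really missing in the middle (13 bp in the cartoon above). The result is two real fragments around a placeholder, not one continuous amplicon, which is likely why tools that expect a contiguous sequence struggle with it.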

There is interest in supporting them, but new tools and methods need to be made.
It sounds like you are looking to use tools, not build new ones, which is fair!

dree · October 28, 2024, 7:32am

Thank you so much Colin,

So, one more question: is there a separate, dedicated forum for questions about analysis using DADA2 in R rather than in QIIME 2? Or can I continue here?

colinbrislawn · October 28, 2024, 12:51pm

You can ask questions here! We have a whole category for that:

Also consider:
https://www.biostars.org/

For DADA2, lots of questions have been asked and answered on the GitHub:

I want to remind you that many of the questions you have while learning a new method have already been asked by other folks who learned it before you. Checking whether a question has already been asked and reading the discussion that followed is often faster than opening a new thread.

Basically, remember to do a lit review! Especially on the internet!

system · November 28, 2024, 6:51pm

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.