dada2 trim for Bacteria and Archaea

Hi everyone, I have attached an interactive quality plot of my archaeal and bacterial paired-end sequencing and was hoping someone could confirm the trimming below or suggest something better.

For archaea:
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs paired-end-demux.trim.qza \
  --p-trunc-len-f 120 \
  --p-trunc-len-r 120 \
  --o-representative-sequences rep-seqs-dada2.qza \
  --o-table table-dada2.qza \
  --o-denoising-stats dada2-stats.qza

For Bacteria:
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs paired-end-demux.trim.qza \
  --p-trunc-len-f 180 \
  --p-trunc-len-r 0 \
  --o-representative-sequences rep-seqs-dada2.qza \
  --o-table table-dada2.qza \
  --o-denoising-stats dada2-stats.qza

Would you say this is correct? I'm still unclear on how good the quality score has to be before choosing where to trim.

Best wishes

Hello Martyn,

The trimming position depends on both the quality and the region you are trying to amplify. After trimming, you still want the reads to overlap a bit so they can be merged, and whether they do depends on the length of the region sequenced.

What region did you sequence for Bacteria and how long is it? What about for Archaea?

Hi Colin, thanks for the reply. This is what I wasn't sure about. For Archaea it was the V6-V8 region.
For Bacteria it was the V3-V4 region. Both PCR products were around 500 bp.

Hope this helps?

Ah, I think this might cause a problem with the truncation lengths you provided in your --p-trunc-len-* parameters.

Based on the trimming discussion that Liz linked to you, how much overlap between archaea reads would you expect after passing --p-trunc-len-f 120 and --p-trunc-len-r 120?

:dna: :straight_ruler: :thinking:

Hi all. I am running QIIME 2 to analyse bacteria, targeting the V3-V4 region.

What is the reason for such a low number of merged and non-chimeric sequences?

I looked at this after DADA2, where I trimmed my forward and reverse reads to around 220.

Hello again Martyn,

I've merged this in with our current thread, as merging is related to the overlap settings and the length of your amplicon.

Have you been able to find the expected length of your bacteria V3-V4 amplicon and calculate the overlap after trimming at 220 bp each way?

Hi Colin, I'm not sure. It was showing around 500 bp on the gel. It worked fine for Archaea and Fungi but I am unsure about this one.

Do you think it is best to not trim at all?

Best wishes

Ah, OK. That's good to know. Do you know the specific primers used? I ask because many primers are labeled with their location on the E. coli 16S gene. For example, the 515f and 806r primers start at 515 and end at 806, making an amplicon ~291 bp long.

Hi Colin, thank you for taking the time to help me.

My primers are
Bakt_341F CCTACGGGNGGCWGCAG Bacteria forward
Bakt_805R GACTACHVGGGTATCTAATCC Bacteria reverse

I got the idea to use these ones from here:

Best wishes.

Thanks for posting that!

So 805 - 341 = 464 bp long amplicon.
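If it helps, here is the same subtraction as a small Python sketch. The function names and the regular expression are made up for illustration, and it assumes the number in a primer name really is its position on the E. coli 16S gene, which is a common naming convention but not a guarantee. Real amplicons also vary a little in length between taxa, so treat the result as an estimate.

```python
import re


def primer_position(name: str) -> int:
    """Pull the coordinate out of a primer name such as 'Bakt_341F' or '515f'.

    Assumes the name ends in digits followed by F or R (hypothetical helper).
    """
    match = re.search(r"(\d+)[FfRr]$", name)
    if match is None:
        raise ValueError(f"no position found in primer name: {name!r}")
    return int(match.group(1))


def amplicon_length(forward_primer: str, reverse_primer: str) -> int:
    """Estimate amplicon length as reverse position minus forward position."""
    return primer_position(reverse_primer) - primer_position(forward_primer)


print(amplicon_length("Bakt_341F", "Bakt_805R"))  # 464
print(amplicon_length("515f", "806r"))            # 291
```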

With trimming at 220 for both the forward and reverse reads, how long would you expect the overlap to be?
(You can refer to this guide, if ya' want:
DADA2: Decreasing feature number as more sequences are maintained - #2)

Hi Colin, thank you for the assistance. I am thinking there would not be any overlap and the reads would be too short?

I could be wrong because I am very new to this and I am still learning.

Best wishes.

I've been there too! We are always happy to help!

I agree. Do you want to show me how you calculated the overlap/gap, and how long you would expect the gap to be after trimming at 220, and how long you expect the overlap to be if you did not trim at all?

Thanks Colin, I'm not sure if this is correct but I added 220 + 220 = 440, which would be too short?

I suspect it would be fine if I did not trim at all but then it wouldn't remove poor reads?

Again, thank you for your help. I am learning a lot here.

Best wishes

That's right. 440 bp of coverage is 24 bp less than 464, resulting in a 24 bp gap.
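The gap calculation can be written as a tiny Python sketch, in case you want to try other truncation lengths. The function name is hypothetical, and it assumes the 464 bp estimate from the primer positions is the length the two reads have to span. A positive result is overlap and a negative result is a gap.

```python
def expected_overlap(amplicon_len: int, trunc_f: int, trunc_r: int) -> int:
    """Bases shared by the forward and reverse reads after truncation.

    Positive: the reads overlap by that many bases.
    Negative: there is a gap of that many bases between the reads.
    """
    return trunc_f + trunc_r - amplicon_len


# Truncating both reads at 220 against a 464 bp amplicon.
print(expected_overlap(464, 220, 220))  # -24, i.e. a 24 bp gap
```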

DADA2 requires 12 bp of overlap by default. If you used the full forward read (250 bp long), where should you trim the reverse read to get 12 bp of overlap?

Thanks Colin, so if I didn't trim the forward read and trimmed the reverse around the 230 mark I would have 16 bp of overlap. Is this correct?

qiime dada2 denoise-paired \
  --i-demultiplexed-seqs paired-end-demux.trim.qza \
  --p-trunc-len-f 0 \
  --p-trunc-len-r 230 \
  --o-representative-sequences rep-seqs-dada2.qza \
  --o-table table-dada2.qza \
  --o-denoising-stats dada2-stats.qza

Thank you very much for your time on this.

Best wishes

That's correct!

Try that command and let me know what you find. You could also run 226 and see how that compares.
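For reference, here is where 230 and 226 come from, as a rough Python sketch. The function name is invented for this example, and it assumes an untruncated 250 bp forward read, the 464 bp amplicon estimate, and the 12 bp default minimum overlap mentioned above. It ignores natural length variation in the amplicon, so leaving a few extra bases of margin is reasonable.

```python
def min_reverse_trunc(amplicon_len: int, forward_len: int, min_overlap: int = 12) -> int:
    """Shortest reverse truncation length that still leaves `min_overlap` bases shared."""
    return amplicon_len - forward_len + min_overlap


# Exactly the 12 bp default minimum overlap.
print(min_reverse_trunc(464, 250))                  # 226

# Asking for 16 bp of overlap gives the 230 used in the command above.
print(min_reverse_trunc(464, 250, min_overlap=16))  # 230
```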


Thank you so much for your help Colin, it is running now and we will see how it looks. The DADA2 step was the hardest part for me to wrap my head around.

Thank you.

