Am I overthinking this? DADA2 and Greengenes2

I have paired-end sequencing data (250 bp) and am trying to use the Greengenes2 database. I have followed the V4 region tutorial here: Introducing Greengenes2 2022.10 - #32 by wasade

I read that the ASVs are 90 nt, 100 nt, or 150 nt long and that the forward reads should be used and trimmed down to 150 nt. I might be overthinking this and getting myself confused. This is the command I have used for DADA2 to truncate my sequences to 150 nt.

qiime dada2 denoise-single \
  --i-demultiplexed-seqs Batch_1/trimmed_515F_806R/batch1_trimmed.qza \
  --p-trim-left 0 \
  --p-trunc-len 150 \
  --o-representative-sequences Batch_1/denoise_forward_only/forward_rep_seqs.qza \
  --o-table Batch_1/denoise_forward_only/forward_table.qza \
  --o-denoising-stats Batch_1/denoise_forward_only/forward_denoising_stats.qza

This is the quality of the reads:

Because this is just the forward read, should I be truncating at 150 nt so that I have bases 0-150? Or should I be trimming both the left and right ends of the sequence so that I have bases 70-220?

```
--p-trunc-len 200
--p-trim-left 70
```

Hi @newberrf,

The ASVs placed in Greengenes2 are derived from Qiita, where the standard processing only uses the forward read. The 150nt would be from the 3' end of the fwd primer (i.e., does not include the primer). Does that make sense?

Best,
Daniel
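
A toy sketch of that coordinate convention in plain bash, not Qiita's actual pipeline. The primer and read below are made-up short strings, and `keep` is a hypothetical stand-in for the 150 nt length so the example stays readable:

```bash
#!/usr/bin/env bash
# Toy illustration only: hypothetical primer and read, not real 16S data.
primer="ACGT"                      # stand-in for the forward primer
read_seq="ACGTTTGGCCAATTGGCCAA"    # primer followed by amplicon sequence
keep=6                             # stand-in for 150 nt

# Remove the primer from the 5' end of the read, if it is present.
no_primer="${read_seq#"$primer"}"

# Keep the first $keep bases that follow the 3' end of the primer.
fragment="${no_primer:0:$keep}"

echo "$fragment"
```

The kept fragment starts at the first base after the primer, so the primer itself never counts toward the 150 nt.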

Hi,
Yes, that makes sense. I am using only the forward reads instead of both forward and reverse.

So --p-trunc-len cuts the length down from the right side (3' end) like this:

It goes from sequence base 100 to 250.

Or does it cut the length down from the left side, so that it goes from sequence base 0 to 150?

I think I've just gotten myself really confused and don't know which way is up right now.

I really appreciate the help.

I'm not really sure, to be honest. I don't have experience with that plugin.

Okay, thank you for the response.

I think I've figured it out. I was being a bit of an idiot.

If you want to compare data sets, processing all of them the same way is important!
Qiita uses Deblur with these trim lengths (90, 100, 150), and other lengths like 200 and 250 too!
In DADA2 the equivalent setting is --p-trunc-len, which truncates the read (removes bases from its 3' end).
DADA2 also adds the --p-trim-left setting to trim from the very start of the read. Deblur does not have this.
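
To make the coordinates concrete, here is a small bash sketch that mimics, on a toy string, what the two options keep. It is not DADA2 itself: `trim_and_truncate` is a hypothetical helper. It assumes the DADA2 behaviour that the truncation position is counted on the original read, so the kept length is trunc-len minus trim-left, and that reads shorter than trunc-len are discarded.

```bash
#!/usr/bin/env bash
# Hypothetical helper mimicking --p-trim-left / --p-trunc-len on one read.
# Assumption: positions are counted on the original read, so the result
# has length (trunc_len - trim_left).
trim_and_truncate() {
  local seq="$1" trim_left="$2" trunc_len="$3"
  if (( ${#seq} < trunc_len )); then
    return 1   # read too short: such reads are discarded
  fi
  printf '%s\n' "${seq:trim_left:trunc_len-trim_left}"
}

read_seq="AAAAACCCCCGGGGGTTTTT"   # toy 20-base read

trim_and_truncate "$read_seq" 0 15   # keeps bases 1-15 (cut from the 3' end)
trim_and_truncate "$read_seq" 5 15   # keeps bases 6-15 (also cut from the 5' end)

# Length arithmetic for the values discussed in this thread.
echo $(( 150 - 0 ))    # trunc-len 150, trim-left 0
echo $(( 200 - 70 ))   # trunc-len 200, trim-left 70
```

Under these assumptions, `--p-trim-left 70` would need `--p-trunc-len 220` to leave 150 nt. If the primers were already removed upstream (as the `trimmed_515F_806R` path suggests, though that is a guess), `--p-trim-left 0 --p-trunc-len 150` keeps the first 150 nt after the primer.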

Correct!
