16S V1-V3, V3-V4 and V4-V5 regions with 2x300 MiSeq run

venkar · August 1, 2018, 9:46pm

Hi,
I am processing samples that were sequenced on a MiSeq using 2x300 for the V1-V3, V3-V4 and V4-V5 regions.
The amplicon sizes are 488 bp, 460 bp and 412 bp, respectively. Looking at the demux results, I see that quality drops at ~250 bp for the forward reads and ~200 bp for the reverse reads (as expected). Given this, I want to run the denoising step and have chosen DADA2. I know that I need not merge PE sequences before running DADA2, but I am not sure whether I will get an overlap for the V1-V3 region.

There are a few things I want to know more about.
1- I will trim 20 bp (as they are primers) from the left of the forward and reverse reads and limit my read length to 260 bp in forward and 200 bp in reverse. This leaves me with ~400-420 bp, where I will get a good overlap for V4-V5 and a somewhat okay overlap for V3-V4. I am not sure if I will get a good overlap for V1-V3.
How should I proceed?
2- Does DADA2 trim first and merge later? How does this quality filtering affect how DADA2 clusters the reads?
3- In a previous experiment, I was asked to use only my forward reads, as I had 2x150 bp reads to look at the V4-V5 region. Should I do the same for V1-V3 in this case, as I am not sure there will be much overlap?

I really appreciate your answer as this will help me with my future work as well.

Thanks,
RV

venkar · August 1, 2018, 10:09pm

Additionally, I used bbmerge with default settings to see how the merge goes. I see that only 5-10% of my reads are merged. The rest are too ambiguous to merge.

This now again makes me question how DADA2 would perform if the insert size is larger than the sequence length.

Nicholas_Bokulich · August 2, 2018, 2:03pm

2x300 should give plenty of overlap for a 488 bp amplicon. Trimming from the 5' ends of each read does not affect the overlap; only trimming from the 3' ends does. So if you truncate these reads to 260 forward + 200 reverse then no, you will not have enough read length to merge. See if you can adjust this to get a minimum 20 nt overlap to permit merging.
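
The overlap arithmetic in the reply above can be sketched in a few lines of Python. This is only a rough estimate under stated assumptions: each read is assumed to start at one end of the amplicon (so the quoted amplicon lengths include the primers), and the helper names `expected_overlap` and `min_total_trunc` are made up for illustration. Real amplicon lengths also vary between taxa, so treat the result as a guide rather than a guarantee.

```python
# Sketch: estimate paired-end overlap after 3' truncation.
# Assumption: each read starts at one end of the amplicon, so the
# stated amplicon length includes the primers. Helper names are
# hypothetical, not part of QIIME 2 or DADA2.

MIN_OVERLAP = 20  # minimum overlap suggested in the reply above


def expected_overlap(amplicon_len, trunc_len_f, trunc_len_r):
    """Bases shared by the two reads; a negative value means a gap."""
    return trunc_len_f + trunc_len_r - amplicon_len


def min_total_trunc(amplicon_len, min_overlap=MIN_OVERLAP):
    """Smallest trunc_len_f + trunc_len_r that still allows merging."""
    return amplicon_len + min_overlap


# Amplicon sizes quoted in the original question.
amplicons = {"V1-V3": 488, "V3-V4": 460, "V4-V5": 412}

for region, length in amplicons.items():
    overlap = expected_overlap(length, trunc_len_f=260, trunc_len_r=200)
    status = "ok" if overlap >= MIN_OVERLAP else "too short"
    print(f"{region}: overlap {overlap:+d} nt ({status}); "
          f"need trunc_f + trunc_r >= {min_total_trunc(length)}")
```

With truncation at 260 + 200, V1-V3 comes out 28 nt short of even touching, V3-V4 has zero overlap, and only V4-V5 (48 nt) clears the 20 nt threshold. Note that 5' trimming does not appear in the formula, which matches the point made above.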

dada2 denoises first and merges later. Trimming is performed manually (as shown in the tutorials), though there are parameters to perform dynamic trimming based on q-score prior to denoising.

Read the dada2 paper for more details on how this works.

Give it a try first: the dada2 stats output will tell you how many reads are dropped at each stage (including merging), so it will give you a good idea of whether you have enough read length. If you do not have enough length to overlap, then yes, I would recommend just using the forward reads.
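
One way to read per-stage read counts is to look at the fraction of input reads that survives each stage and where the largest loss occurs. The sketch below assumes stage names of `input`, `filtered`, `denoised`, `merged` and `non-chimeric`; both the names and the counts are hypothetical placeholders for illustration, not values from any real run.

```python
# Sketch: fraction of input reads retained at each DADA2 stage.
# The stage names and the counts below are hypothetical placeholders,
# not values taken from an actual stats file.

STAGES = ["input", "filtered", "denoised", "merged", "non-chimeric"]


def retention(counts):
    """Return {stage: fraction of input reads remaining}."""
    total = counts["input"]
    return {stage: counts[stage] / total for stage in STAGES}


def biggest_drop(counts):
    """Return the stage at which the most reads were lost."""
    losses = {
        later: counts[earlier] - counts[later]
        for earlier, later in zip(STAGES, STAGES[1:])
    }
    return max(losses, key=losses.get)


# Made-up counts for a single sample, chosen to show a merging problem.
example = {"input": 10000, "filtered": 8000, "denoised": 7800,
           "merged": 1500, "non-chimeric": 1400}

for stage, frac in retention(example).items():
    print(f"{stage:>12}: {frac:.1%}")
print("largest loss at:", biggest_drop(example))
```

A large loss at the merging stage, as in these invented numbers, would suggest that the truncated reads are too short to overlap.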

The low merge rate with bbmerge could be due to low-quality bases at the 3' ends of the reads. This is why we recommend trimming low-quality bases prior to using dada2 (or merging); see the tutorials for examples of this.

I hope that helps!

venkar · August 8, 2018, 7:38pm

Thanks for your answers. That was very helpful.

I did not trim, but I used --p-trunc-len 260 and --p-trim-left 20 in the dada2 denoise command, and the output is attached: output1.txt (3.0 KB).
I can see that filtering and chimera detection are taking place, but there is no information about the number of reads after merging.

I did the same experiment with only forward reads, and the number of input reads and the filtering are all very comparable. So when does dada2 merge? From your answer, it denoises first and merges later. If that is the case, I am not sure how my merging has fared.

Thanks,
Raghavee

Nicholas_Bokulich · August 9, 2018, 6:07pm

Hi @venkar,

It sounds like you may be using the wrong command, and you are certainly using an outdated version of qiime2.

Are you using dada2 denoise-paired or denoise-single? Only denoise-paired will denoise and merge paired-end reads. (You should update to the latest release so we can better support your question.)

In the latest version, filtering and joining (read merging) stats are listed in the output file.

Make sure you are using denoise-paired if you want to use the reverse reads.
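
For reference, a paired-end call takes separate forward and reverse values (the `-f` and `-r` suffixed parameters) rather than the single `--p-trunc-len` and `--p-trim-left` used by denoise-single. The Python sketch below only assembles and prints such a command; it does not run it. The file names are hypothetical, and the truncation values are placeholders picked so that their sum exceeds 488 + 20 = 508, not a recommendation. Check `qiime dada2 denoise-paired --help` in your installed release for the exact options.

```python
# Sketch: assemble a paired-end DADA2 command line without running it.
# File names are hypothetical; truncation values are placeholders,
# not recommendations. Choose real values from your quality plots.
import shlex


def denoise_paired_cmd(trunc_f, trunc_r, trim_left=20):
    """Build the argument list for `qiime dada2 denoise-paired`."""
    return [
        "qiime", "dada2", "denoise-paired",
        "--i-demultiplexed-seqs", "demux.qza",
        "--p-trim-left-f", str(trim_left),   # 5' trim, forward reads
        "--p-trim-left-r", str(trim_left),   # 5' trim, reverse reads
        "--p-trunc-len-f", str(trunc_f),     # 3' truncation, forward
        "--p-trunc-len-r", str(trunc_r),     # 3' truncation, reverse
        "--o-table", "table.qza",
        "--o-representative-sequences", "rep-seqs.qza",
        "--o-denoising-stats", "denoising-stats.qza",
    ]


cmd = denoise_paired_cmd(trunc_f=280, trunc_r=230)
print(shlex.join(cmd))
```

Whether quality at positions 280 and 230 is good enough for these reads is a separate question that only the quality profiles can answer.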

If you are still having trouble, please post the commands you are using and the run stats QZA output files.

I hope that helps!

venkar · August 15, 2018, 5:35pm

Hi,
Yes, it was because I had an older version installed.

Thanks!
