16S V3-V4 dada2 losing about 80-85% of sequence reads

Hi All,

I have sequenced the 16S V3-V4 region using the below primers presented in Klindworth et al. 2013 with a 2x300 MiSeq run.

S-D-Bact-0341-b-S-17, 5'-CCTACGGGNGGCWGCAG-3'

When using dada2 to denoise and merge, I lose about 80-90% of my reads. About 45-55% of them are lost at the filtering stage, about 25-30% at the denoising and merging stage, and about 5-10% at the chimera stage. I know the quality does not look so hot, but I would expect more than that to come through.
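To compare runs with different settings, it can help to express each stage's loss as a percentage of the input reads. The sketch below is only an illustration: the counts are hypothetical values chosen to fall inside the ranges described above, not numbers from the actual denoising stats, and the stage names are assumed labels rather than the exact column names of the stats file.

```python
def stage_losses(counts):
    """Return the percent of INPUT reads lost at each stage.

    counts: ordered list of (stage_name, reads_remaining) pairs,
    starting with the input read count.
    """
    input_reads = counts[0][1]
    losses = {}
    for (_, before), (stage, after) in zip(counts, counts[1:]):
        losses[stage] = 100 * (before - after) / input_reads
    return losses


# Hypothetical read counts for one sample (assumed, for illustration only).
example = [
    ("input", 100_000),
    ("filtered", 50_000),       # after quality filtering
    ("merged", 22_000),         # after denoising and merging
    ("non-chimeric", 15_000),   # after chimera removal
]

for stage, pct in stage_losses(example).items():
    print(f"{stage}: {pct:.1f}% of input lost")

retained = 100 * example[-1][1] / example[0][1]
print(f"retained: {retained:.1f}% of input")
```

Tabulating the losses this way for each parameter set makes it easier to see whether a change moved reads from one stage's loss to another or actually recovered them.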

I have trimmed off the first 11 base pairs from the forward reads and played around with different truncation lengths at the 3' end of both the forward and reverse reads. I have even played around with --p-min-overlap and --p-max-ee for both the forward and the reverse reads, with not much improvement.

For example my commands are:
qiime dada2 denoise-paired --i-demultiplexed-seqs demuxed-reads-16S.qza --p-trim-left-f 11 --p-trunc-len-f 280 --p-trunc-len-r 200 --o-table featuretable_frequency_16S_11_280_200.qza --o-representative-sequences featuredata_sequences_16S_11_280_200.qza --o-denoising-stats denoisingstats_16S_11_280_200.qza

qiime dada2 denoise-paired --i-demultiplexed-seqs demuxed-reads-16S.qza --p-trim-left-f 11 --p-trunc-len-f 287 --p-trunc-len-r 206 --p-min-overlap 4 --o-table featuretable_frequency_16S_11_287_206_ovlap_4.qza --o-representative-sequences featuredata_sequences_16S_11_287_206_ovlap_4.qza --o-denoising-stats denoisingstats_16S_11_287_206_ovlap_4.qza

qiime dada2 denoise-paired --i-demultiplexed-seqs demuxed-reads-16S.qza --p-trim-left-f 11 --p-trunc-len-f 280 --p-trunc-len-r 200 --p-min-overlap 4 --p-max-ee-f 2.5 --p-max-ee-r 2.5 --o-table featuretable_frequency_16S_11_280_200_ovlap_4_ee2_5.qza --o-representative-sequences featuredata_sequences_16S_11_280_200_ovlap_4_ee2_5ovlap_4_ee2_5.qza --o-denoising-stats denoisingstats_16S_11_280_200_ovlap_4_ee2_5.qza

According to Klindworth et al. 2013 the fragment size should be 464 bp. Therefore, there should be overlap. I do not know why I am losing so much (about half the sequences that passed quality filtering) when denoising and merging. Is this normal? This is my first time working with these primer pairs.
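As a rough check of the overlap, the arithmetic can be sketched as below. This assumes the 464 bp figure includes both primers, that each read starts at the first base of its primer, and that the minimum overlap is 12 bases when --p-min-overlap is not set (an assumption about the default). Trimming from the 5' end with --p-trim-left-f does not change the overlap, because the reads overlap at their 3' ends.

```python
def merge_overlap(amplicon_len, trunc_len_f, trunc_len_r):
    """Bases shared by the forward and reverse reads after truncation.

    A value of zero or less means the reads cannot be merged.
    """
    return trunc_len_f + trunc_len_r - amplicon_len


AMPLICON_LEN = 464   # from Klindworth et al. 2013, as quoted in the post
MIN_OVERLAP = 12     # assumed default when --p-min-overlap is not given

# Truncation settings taken from the commands above.
for trunc_f, trunc_r in [(280, 200), (287, 206)]:
    overlap = merge_overlap(AMPLICON_LEN, trunc_f, trunc_r)
    status = "ok" if overlap >= MIN_OVERLAP else "too short"
    print(f"trunc-len-f={trunc_f}, trunc-len-r={trunc_r}: "
          f"overlap={overlap} bp ({status})")

# V3-V4 length varies between taxa; a hypothetical amplicon 16 bp longer
# than 464 bp would leave no overlap at 280/200.
print(merge_overlap(AMPLICON_LEN + 16, 280, 200))
```

Under these assumptions the 280/200 setting leaves only a small margin above the minimum, so any taxa with amplicons somewhat longer than 464 bp would fail to merge.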

Good afternoon Pamela,

Thank you for your detailed post.

Yes. Long amplicons pose a challenge during Illumina sequencing because quality drops at the ends of the reads. This leads to a tradeoff between 1) trimming longer so reads join, but quality is low or 2) trimming shorter so quality is high, but joining is impossible.

When I have reads like this, I try to find the happy medium value that makes the reads just long enough to join while removing as much of the low quality ends as possible.
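One way to picture this tradeoff is through expected errors, the quantity the max-ee filter is based on: each base contributes 10^(-Q/10), and a read is dropped if the sum over the truncated read exceeds the threshold. The sketch below is only an illustration with a made-up quality profile (Q35 for 200 bases, then Q15), not your data, and the threshold of 2.0 is an assumed default.

```python
def expected_errors(quals):
    """Sum of per-base error probabilities for a list of Phred scores."""
    return sum(10 ** (-q / 10) for q in quals)


def passes_max_ee(quals, trunc_len, max_ee=2.0):
    """Truncate the read to trunc_len, then apply the expected-error filter.

    Reads shorter than trunc_len are discarded, which is how truncation
    behaves in dada2's filtering step.
    """
    if len(quals) < trunc_len:
        return False
    return expected_errors(quals[:trunc_len]) <= max_ee


# Hypothetical 300 bp read: good quality for 200 bases, then a sharp drop.
read = [35] * 200 + [15] * 100

for trunc_len in (250, 280):
    ee = expected_errors(read[:trunc_len])
    print(f"trunc_len={trunc_len}: expected errors={ee:.2f}, "
          f"passes={passes_max_ee(read, trunc_len)}")
```

In this made-up case the read passes when truncated at 250 but fails at 280, even though the longer truncation gives more overlap for merging. That is the tension between the filtering stage and the merging stage.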

It sounds like you are also trying multiple settings to find what works best! :+1:

IMHO, the V3-V4 region poses enough problems that it's not an obvious upgrade over V4.
When using 16S V4, the 250bp amplicon is easy to sequence in a few different ways.



Thanks for your response. With other amplicons I work with (just as long), I have never had this large a decrease; here I only have 15-20% of the reads left. I will continue to change the settings and see what I feel comfortable with.

If quality is usually higher, this Illumina run may be bad. You can resequence (especially if Illumina will pay for it) and see if quality improves.

Unfortunately this is not an option, but thanks. I will just report on what I have. I am also working on using just the forward reads to see what that gives me. I was just trying to see if others who used the same primers had similar issues.

