Hello,
I'm processing environmental 16S samples amplified with V3-V4 primers (341F/805R), and I am losing the majority of my reads at the denoising step. I know this question has been asked before, and the suggested solution seemed to be to increase the number of overlapping base pairs. I tried that, but it led to even more reads being lost.
Here is the demux file after primer trimming.
Code for primer trimming:
### --p-front-f is 341F; --p-front-r is 805R
qiime cutadapt trim-paired \
  --i-demultiplexed-sequences /qza/v3v4.qza \
  --p-cores 8 \
  --p-front-f CCTACGGGNGGCWGCAG \
  --p-front-r GACTACHVGGGTATCTAATCC \
  --p-discard-untrimmed \
  --p-no-indels \
  --o-trimmed-sequences /qza/v3v4_trimmed.qza
v3v4_trimmed.qzv (326.2 KB)
Code for DADA2 including truncation step:
###v3v4 primers: 805-341=464 amplicon; fwd trunc @ 261; rvs trunc @ 220; (261+220)-464 = 17 bp overlap
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs /qza/v3v4_trimmed.qza \
  --p-trunc-len-f 261 \
  --p-trunc-len-r 220 \
  --o-table v3v4_table.qza \
  --o-representative-sequences v3v4_rep_seqs.qza \
  --o-denoising-stats v3v4_denoising_stats.qza
I chose these truncation positions because read quality drops significantly toward the 3' ends, while still leaving more than the default 12 bp overlap.
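For reference, the overlap figures in my code comments come from the arithmetic below. This is only a quick sketch: the 464 bp amplicon length is my own estimate from the primer positions (805 - 341), and the `overlap` helper is just something I wrote for illustration, so the numbers may be off if that estimate is wrong.

```shell
# Estimated amplicon length from primer positions (an assumption, not a measured value)
amplicon_len=$((805 - 341))   # 464

# Expected overlap = (forward truncation + reverse truncation) - amplicon length
overlap() {
  echo $(( $1 + $2 - amplicon_len ))
}

overlap 261 220   # first run:  17
overlap 283 230   # second run: 49
```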
Here is the denoising stats file. I am losing 60-80% of my reads using the parameters in my code.
v3v4_denoising_stats.qzv (1.2 MB)
I tried again using less conservative truncation parameters to allow for more overlap, since I thought that was the issue, but the results were worse: I now lose even more reads (70-80%).
New DADA2 code:
###v3v4 primers: 805-341=464 amplicon; fwd trunc @ 283; rvs trunc @ 230; (283+230)-464 = 49 bp overlap
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs /qza/v3v4_trimmed.qza \
  --p-trunc-len-f 283 \
  --p-trunc-len-r 230 \
  --o-table v3v4_table2.qza \
  --o-representative-sequences v3v4_rep_seqs2.qza \
  --o-denoising-stats v3v4_denoising_stats2.qza
And the new denoising stats file:
v3v4_denoising_stats2.qzv (1.2 MB)
Thank you.