Hi there,
We’ve done some 16S rRNA amplicon sequencing with primers that Illumina says are used to sequence the V3 and V4 variable regions of the 16S rRNA gene.
Forward primer:
5’-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG-3’
Reverse primer:
5’-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC-3’
Here's my qiime dada2 denoise-paired command:
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs demux.qza \
  --p-trim-left-f 17 \
  --p-trim-left-r 21 \
  --p-trunc-len-f 294 \
  --p-trunc-len-r 216 \
  --o-table table.qza \
  --o-representative-sequences rep-seqs.qza \
  --o-denoising-stats stats.qza \
  --p-n-threads 8 \
  --verbose
And here's my qiime feature-classifier extract-reads command:
qiime feature-classifier extract-reads \
  --i-sequences silva-138-99-seqs.qza \
  --p-f-primer CCTACGGGNGGCWGCAG \
  --p-r-primer GACTACHVGGGTATCTAATCC \
  --p-min-length 400 \
  --p-max-length 500 \
  --o-reads ref-seqs.qza
My understanding is that the primers we're using produce amplicons of ~464 bp in length (see this forum post). So, by using --p-trim-left-f 17 and --p-trim-left-r 21 in the qiime dada2 denoise-paired step, I'd end up with amplicons of length (464 − 17 − 21) = 426 bp. Is that correct? If so, what would be the best values to use for --p-min-length and --p-max-length in qiime feature-classifier extract-reads?
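To make the arithmetic above easy to check, here is a minimal shell sketch. It assumes the ~464 bp figure from the forum post is right and that it includes both locus-specific primers; the variable names are made up for illustration.

```shell
# Locus-specific parts of the primers (the sequences passed to extract-reads).
f_primer="CCTACGGGNGGCWGCAG"
r_primer="GACTACHVGGGTATCTAATCC"

# Assumption: ~464 bp amplicon length, as quoted from the forum post.
amplicon_len=464

# Primer lengths; these match the --p-trim-left-f / --p-trim-left-r values.
f_len=${#f_primer}
r_len=${#r_primer}

# Expected length once both primers have been trimmed off.
trimmed_len=$((amplicon_len - f_len - r_len))

echo "forward primer: ${f_len} nt"
echo "reverse primer: ${r_len} nt"
echo "expected trimmed amplicon: ${trimmed_len} bp"
```

The two primer lengths come out as 17 and 21, which is where the trim-left values come from.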
I notice that someone used --p-min-length 400 and --p-max-length 450 in a similar situation (with the same primers and similar trimming), and got a thumbs up from @Mehrbod_Estaki. When I ran my analysis initially, I used --p-min-length 400 and --p-max-length 500 in the qiime feature-classifier extract-reads command, but I guess I'm wondering if there's any significant difference between using, say ...
- a tight interval like --p-min-length 420 and --p-max-length 430
- or, a wider interval like --p-min-length 400 and --p-max-length 450
- or, an even wider interval like --p-min-length 400 and --p-max-length 500
Is there any 'rule of thumb' people use for this? Or does it even matter very much?
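For comparison, a rough shell sketch of how much room each of those three intervals leaves around the expected length. It assumes the 426 bp estimate worked out earlier in this post; window_slack is a hypothetical helper, not part of QIIME 2, and it says nothing about how long the extracted SILVA amplicons actually are.

```shell
# Assumption: expected amplicon length after primer removal (464 - 17 - 21).
expected_len=426

# Print the slack (in bp) that a min/max window leaves below and above
# the expected length. Hypothetical helper for illustration only.
window_slack() {
  echo "$((expected_len - $1)) $(($2 - expected_len))"
}

for window in "420 430" "400 450" "400 500"; do
  set -- $window   # split "min max" into $1 and $2
  slack=$(window_slack "$1" "$2")
  echo "min=$1 max=$2: slack below/above ${expected_len} bp = ${slack}"
done
```

The tight interval only tolerates a few bp of natural length variation on either side, while the two wider ones differ only in how much extra length they allow above the estimate.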
Some relevant info (from the qiime feature-classifier extract-reads usage page):
--p-min-length INTEGER Range(0, None)
    Minimum amplicon length. Shorter amplicons are discarded. Applied after trimming and truncation, so be aware that trimming may impact sequence retention. Set to zero to disable min length filtering. [default: 50]
--p-max-length INTEGER Range(0, None)
    Maximum amplicon length. Longer amplicons are discarded. Applied before trimming and truncation, so plan accordingly. Set to zero (default) to disable max length filtering. [default: 0]
Thanks as always for the help!
Kevin