Hi there,
We’ve done some 16S rRNA amplicon sequencing with primers that Illumina says are used to sequence the V3 and V4 variable regions of the 16S rRNA gene.
Forward primer:
5’-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG-3’
Reverse primer:
5’-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC-3’
Here's my qiime dada2 denoise-paired command:
qiime dada2 denoise-paired \
--i-demultiplexed-seqs demux.qza \
--p-trim-left-f 17 \
--p-trim-left-r 21 \
--p-trunc-len-f 294 \
--p-trunc-len-r 216 \
--o-table table.qza \
--o-representative-sequences rep-seqs.qza \
--o-denoising-stats stats.qza \
--p-n-threads 8 \
--verbose
And here's my qiime feature-classifier extract-reads command:
qiime feature-classifier extract-reads \
--i-sequences silva-138-99-seqs.qza \
--p-f-primer CCTACGGGNGGCWGCAG \
--p-r-primer GACTACHVGGGTATCTAATCC \
--p-min-length 400 \
--p-max-length 500 \
--o-reads ref-seqs.qza
My understanding is that the primers we're using produce amplicons of ~464 bp in length (see this forum post). So, by using --p-trim-left-f 17 and --p-trim-left-r 21 in the qiime dada2 denoise-paired step, I'd end up with amplicons of length (464 − 17 − 21) = 426 bp. Is that correct? If so, what would be the best values to use for --p-min-length and --p-max-length in qiime feature-classifier extract-reads?
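For reference, the length arithmetic in the paragraph above can be sanity-checked with a few lines of shell. This is only a sketch: the 464 bp figure is taken from the question and is assumed to include both primers, the variable names are made up for illustration, and the 12 nt minimum overlap is an assumption about the read-merging step rather than something stated in this post.

```shell
# Values copied from the denoise-paired command in this post.
amplicon_len=464   # assumption: includes both primers
trim_left_f=17
trim_left_r=21
trunc_len_f=294
trunc_len_r=216
min_overlap=12     # assumption: minimum overlap needed to merge a read pair

# Amplicon length once both primers are trimmed off.
trimmed_len=$(( amplicon_len - trim_left_f - trim_left_r ))

# Bases kept per read: truncation position minus bases trimmed from the 5' end.
kept_f=$(( trunc_len_f - trim_left_f ))
kept_r=$(( trunc_len_r - trim_left_r ))

# Overlap available for merging the forward and reverse reads.
overlap=$(( kept_f + kept_r - trimmed_len ))

echo "trimmed amplicon: ${trimmed_len} bp"
echo "kept bases: forward ${kept_f}, reverse ${kept_r}"
echo "overlap: ${overlap} bp"
if [ "$overlap" -ge "$min_overlap" ]; then
  echo "overlap meets the assumed minimum of ${min_overlap} nt"
else
  echo "overlap is below the assumed minimum of ${min_overlap} nt"
fi
```

Under these assumptions the trimmed amplicon comes out at 426 bp, with 277 and 195 bases kept from the forward and reverse reads and 46 bp of overlap between them.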
I notice that someone used --p-min-length 400 and --p-max-length 450 in a similar situation (with the same primers and similar trimming), and got a thumbs up from @Mehrbod_Estaki. When I ran my analysis initially, I used --p-min-length 400 and --p-max-length 500 in the qiime feature-classifier extract-reads command, but I guess I'm wondering if there's any significant difference between using, say ...
- a tight interval like: --p-min-length 420 and --p-max-length 430
- or, a wider interval like: --p-min-length 400 and --p-max-length 450
- or, an even wider interval like: --p-min-length 400 and --p-max-length 500
Is there any 'rule of thumb' people use for this? Or does it even matter very much?
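One way to frame the comparison is to ask whether each candidate interval contains the expected trimmed length, and how much slack it leaves on either side. Here is a rough shell sketch, assuming the 426 bp estimate from the question; `check_interval` is a hypothetical helper written for this post, not part of QIIME 2.

```shell
expected_len=426   # 464 - 17 - 21, the estimate from the question

# check_interval MIN MAX: hypothetical helper that reports whether
# expected_len lies within [MIN, MAX] and the slack on each side.
check_interval() {
  min=$1
  max=$2
  if [ "$expected_len" -ge "$min" ] && [ "$expected_len" -le "$max" ]; then
    echo "${min}-${max}: contains ${expected_len} (slack -$(( expected_len - min )) / +$(( max - expected_len )))"
  else
    echo "${min}-${max}: excludes ${expected_len}"
  fi
}

check_interval 420 430
check_interval 400 450
check_interval 400 500
```

All three intervals contain 426, so the practical difference is the slack: amplicon length can differ between taxa, so a tighter interval may drop more reference sequences whose amplicon is shorter or longer than the estimate.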
Some relevant info (from the qiime feature-classifier extract-reads usage page):
--p-min-length INTEGER Range(0, None)
    Minimum amplicon length. Shorter amplicons are discarded. Applied after trimming and truncation, so be aware that trimming may impact sequence retention. Set to zero to disable min length filtering. [default: 50]
--p-max-length INTEGER Range(0, None)
    Maximum amplicon length. Longer amplicons are discarded. Applied before trimming and truncation, so plan accordingly. Set to zero (default) to disable max length filtering. [default: 0]
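The ordering described in that help text (maximum checked before trimming and truncation, minimum checked after) can be pictured with a small shell sketch. It only illustrates my reading of the help text, not QIIME 2's actual code: `keep_read` and its arguments are hypothetical, and the relative order of truncation and trimming inside it is an assumption.

```shell
# keep_read LEN TRIM_LEFT TRUNC_LEN MIN_LEN MAX_LEN
# Hypothetical helper mirroring the order described in the help text.
keep_read() {
  len=$1; trim_left=$2; trunc_len=$3; min_len=$4; max_len=$5

  # Max-length check runs on the untrimmed length (0 disables it).
  if [ "$max_len" -gt 0 ] && [ "$len" -gt "$max_len" ]; then
    echo discard
    return
  fi

  # Assumption: truncate first (0 disables), then trim from the 5' end.
  if [ "$trunc_len" -gt 0 ] && [ "$len" -gt "$trunc_len" ]; then
    len=$trunc_len
  fi
  len=$(( len - trim_left ))

  # Min-length check runs on the trimmed length (0 disables it).
  if [ "$min_len" -gt 0 ] && [ "$len" -lt "$min_len" ]; then
    echo discard
    return
  fi
  echo keep
}

keep_read 426 0 0 400 500     # no trimming: both checks see 426
keep_read 426 30 0 400 500    # trimming 30 bases leaves 396, below the minimum
keep_read 510 0 450 400 500   # exceeds the maximum before truncation could shorten it
```

With no trimming or truncation options set, as in the extract-reads command in this post, both checks see the same length, so the ordering only matters once those options are used.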
Thanks as always for the help!
Kevin