We’ve done some 16S rRNA amplicon sequencing with primers that Illumina says are used to sequence the V3 and V4 variable regions of the 16S rRNA gene.
Here's my qiime dada2 denoise-paired command:
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs demux.qza \
  --p-trim-left-f 17 \
  --p-trim-left-r 21 \
  --p-trunc-len-f 294 \
  --p-trunc-len-r 216 \
  --o-table table.qza \
  --o-representative-sequences rep-seqs.qza \
  --o-denoising-stats stats.qza \
  --p-n-threads 8 \
  --verbose
And here's my qiime feature-classifier extract-reads command:
qiime feature-classifier extract-reads \
  --i-sequences silva-138-99-seqs.qza \
  --p-f-primer CCTACGGGNGGCWGCAG \
  --p-r-primer GACTACHVGGGTATCTAATCC \
  --p-min-length 400 \
  --p-max-length 500 \
  --o-reads ref-seqs.qza
My understanding is that the primers we're using produce amplicons of ~464 bp in length (see this forum post). So, by using --p-trim-left-f 17 and --p-trim-left-r 21 in the qiime dada2 denoise-paired step, I'd end up with amplicons of length (464 − 17 − 21) = 426 bp. Is that correct? If so, what would be the best --p-min-length and --p-max-length values to use for qiime feature-classifier extract-reads?
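To make that arithmetic easy to check, here is a minimal shell sketch of the lengths implied by the settings above. It assumes the ~464 bp figure is right (it is approximate, and real amplicons vary in length), and that a read truncated at position N and then trimmed by T bases at its start ends up N − T bases long. The variable names are just mine for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: lengths implied by the primer and truncation settings in this post.
amplicon_len=464   # ASSUMPTION: approximate amplicon length, primers included
trim_left_f=17     # --p-trim-left-f (length of CCTACGGGNGGCWGCAG)
trim_left_r=21     # --p-trim-left-r (length of GACTACHVGGGTATCTAATCC)
trunc_len_f=294    # --p-trunc-len-f
trunc_len_r=216    # --p-trunc-len-r

# Amplicon length once both primers are removed.
insert_len=$(( amplicon_len - trim_left_f - trim_left_r ))

# Bases kept per read after truncating the 3' end and trimming the 5' end.
kept_f=$(( trunc_len_f - trim_left_f ))
kept_r=$(( trunc_len_r - trim_left_r ))

# Bases shared by the forward and reverse reads when they are merged.
overlap=$(( kept_f + kept_r - insert_len ))

echo "insert=${insert_len} kept_f=${kept_f} kept_r=${kept_r} overlap=${overlap}"
# prints: insert=426 kept_f=277 kept_r=195 overlap=46
```

Under those assumptions, the 400 to 450 and 400 to 500 windows both contain the expected 426 bp, so the open question is only how much slack to leave on either side.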
I notice that someone used --p-min-length 400 and --p-max-length 450 in a similar situation (with the same primers and similar trimming), and got a thumbs up from @Mehrbod_Estaki. When I ran my analysis initially, I used --p-min-length 400 and --p-max-length 500 in the qiime feature-classifier extract-reads command, but I guess I'm wondering if there's any significant difference between using, say ...
- a tight interval like:
- or, a wider interval like:
- or, an even wider interval like:
Is there any 'rule of thumb' people use for this? Or does it even matter very much?
Some relevant info (from the qiime feature-classifier extract-reads usage page):
--p-min-length INTEGER Range(0, None)
    Minimum amplicon length. Shorter amplicons are discarded. Applied after trimming and truncation, so be aware that trimming may impact sequence retention. Set to zero to disable min length filtering. [default: 50]
--p-max-length INTEGER Range(0, None)
    Maximum amplicon length. Longer amplicons are discarded. Applied before trimming and truncation, so plan accordingly. Set to zero (default) to disable max length filtering. [default: 0]
Thanks as always for the help!