qiime feature-classifier p-trunc-len option

We have 250 bp paired-end reads amplified with the 515F/806R primer pair for 16S rRNA gene sequences. When running dada2, I truncated the forward reads to 204 bp and the reverse reads to 174 bp. What truncation length should I now use for the qiime2 feature classifier?

qiime feature-classifier extract-reads \
  --i-sequences 99_otus.qza \
  --p-f-primer GTGCCAGCMGCCGCGGTAA \
  --p-r-primer GGACTACHVGGGTWTCTAAT \
  --p-trunc-len ??? \
  --p-min-length 100 \
  --p-max-length 400 \
  --o-reads ref-seqs.qza

thanks

Hi @Yogesh_Gupta,
Please read the documentation (including the “notes”) for more details:
https://docs.qiime2.org/2019.4/tutorials/feature-classifier/#extract-reference-reads

It says: “The min-length parameter is applied after the trim-left and trunc-len parameters, and max-length before, so be sure to set appropriate settings to prevent valid sequences from being filtered out.”
I can take min-length and max-length from rep.seq.qzv, but how do I decide on trunc-len?

Thanks
Yogesh

You missed the first note:

The --p-trunc-len parameter should only be used to trim reference sequences if query sequences are trimmed to this same length or shorter. Paired-end sequences that successfully join will typically be variable in length. Single-end reads that are not truncated at a specific length may also be variable in length. For classification of paired-end reads and untrimmed single-end reads, we recommend training a classifier on sequences that have been extracted at the appropriate primer sites, but are not trimmed.


Hi @Nicholas_Bokulich,

As I understand it, I don’t need to use --p-trunc-len, because the lengths of paired-end read sequences may vary after the dada2 step.

Yes, exactly. Even if you used trunc-len in dada2 to trim the unmerged reads, you should not use trunc-len with extract-reads, since the reads are now merged and variable in length.
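For reference, here is a minimal sketch of what the extract-reads call from the original question might look like once --p-trunc-len is dropped. The file names, primers, and the min-length/max-length values are simply carried over from the question and are not recommendations; check them against your own data. The dry-run wrapper (building the command with `set --` and printing it first) is only an illustration and is not part of QIIME 2.

```shell
#!/bin/sh
# Reference sequences from the original question; adjust the path as needed.
ref_seqs="99_otus.qza"

# Assemble the extract-reads call WITHOUT --p-trunc-len, since merged
# paired-end reads are variable in length. The min/max values are the
# ones from the question (assumptions, not tuned recommendations).
set -- qiime feature-classifier extract-reads \
  --i-sequences "$ref_seqs" \
  --p-f-primer GTGCCAGCMGCCGCGGTAA \
  --p-r-primer GGACTACHVGGGTWTCTAAT \
  --p-min-length 100 \
  --p-max-length 400 \
  --o-reads ref-seqs.qza

# Show the command that would be run.
extract_cmd="$*"
echo "$extract_cmd"

# Only execute when QIIME 2 is installed and the input artifact exists.
if command -v qiime >/dev/null 2>&1 && [ -f "$ref_seqs" ]; then
  "$@"
fi
```

The resulting ref-seqs.qza would then be used to train the classifier on reference reads that were extracted at the primer sites but not trimmed, as the quoted note recommends.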

