Is it necessary to specify p-trunc-len when training a feature classifier for paired-end reads?

Hi
For PE MiSeq reads where we amplify a fragment of about 450 bp with 341F and 785R, is it necessary to use --p-trunc-len to set the fragment size, when that size is already determined by the F and R primers? The difference between the Moving Pictures tutorial and our real MiSeq data is that we have the sequence of the whole fragment, about 450 bp, which is generated by the F and R primer pair.

qiime feature-classifier extract-reads \
  --i-sequences 85_otus.qza \
  --p-f-primer GTGCCAGCMGCCGCGGTAA \
  --p-r-primer GGACTACHVGGGTWTCTAAT \
  --p-trunc-len 120 \
  --o-reads ref-seqs.qza

Best wishes
Sajjad

Hi Sajjad,
I have a similar problem and am also curious about the answer.

I understand that when Deblur is applied, the sequences must all have the same length. I'm also working with the 341F and 785R primers, and my p-trunc-len is set to 250.

When it is set above 250 (say 300 or 350) there is a problem: almost 100% of the sequences are shorter than that and do not pass this step. As a result a lot of sequencing information is lost, especially when the reads do not fully overlap (here V3-V4).

So is the second option to choose DADA2, which does not require the joined sequences to have the same length? I'm a bit confused and hoping for some detailed clues.

Best wishes

Hi @sajjad.sarikhan,
Thanks for asking! When analyzing paired-end reads that fully overlap, you do not need to specify --p-trunc-len when extracting reads to train a feature classifier. This is because your reads will cover the full amplified region and hence you can extract the full read. A trunc-len is only useful (but not required) for single-end data, such as shown in the moving pictures tutorial.
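As a rough sketch of what that looks like for the 341F/785R case, the command below simply leaves out --p-trunc-len so the whole primer-to-primer region is kept. The primer sequences are the commonly cited 341F/785R pair and are an assumption here, so please check them against your own protocol. The file names are taken from the question above, and the command only runs if QIIME 2 and the input artifact are actually present.

```shell
# Sketch: extract the full amplified region, with no --p-trunc-len.
# Primer sequences are assumed (commonly cited 341F/785R); substitute your own.
F_PRIMER="CCTACGGGNGGCWGCAG"      # assumed 341F
R_PRIMER="GACTACHVGGGTATCTAATCC"  # assumed 785R

# Collect the arguments once so the command can be reviewed before it runs.
set -- feature-classifier extract-reads \
  --i-sequences 85_otus.qza \
  --p-f-primer "$F_PRIMER" \
  --p-r-primer "$R_PRIMER" \
  --o-reads ref-seqs.qza

# Print the command for inspection.
cmd_preview="qiime $*"
echo "$cmd_preview"

# Run only when QIIME 2 is on PATH and the reference artifact exists.
if command -v qiime >/dev/null 2>&1 && [ -f 85_otus.qza ]; then
  qiime "$@"
fi
```

The resulting ref-seqs.qza would then be used to train the classifier in the same way as in the tutorial.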

We have an open issue to clarify this point in the tutorials.

@Jaroslaw_Grzadziel it sounds like your problem relates to denoising methods, not the extract-reads action that is used while training a feature classifier. deblur has a similar trim-length parameter, and dada2 also uses a parameter called trunc-len but this is not the same issue and is not relevant to feature extraction.
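To make the distinction concrete, here is a minimal sketch of where truncation does matter, which is the denoising step. It checks that hypothetical forward and reverse truncation lengths still leave enough overlap to merge a roughly 450 bp amplicon before calling dada2 denoise-paired. The truncation values, the minimum overlap of 12 nt, and the file names (demux.qza and the outputs) are all assumptions for illustration; real values should come from your own quality plots.

```shell
# Assumed values for illustration only; pick real ones from your quality plots.
AMPLICON_LEN=450   # approximate fragment size mentioned in this thread
TRUNC_F=280        # hypothetical forward truncation length
TRUNC_R=220        # hypothetical reverse truncation length
MIN_OVERLAP=12     # assumed minimum overlap needed to merge read pairs

# Bases shared by the truncated forward and reverse reads.
overlap=$((TRUNC_F + TRUNC_R - AMPLICON_LEN))
echo "Expected overlap after truncation: ${overlap} nt"

if [ "$overlap" -lt "$MIN_OVERLAP" ]; then
  echo "Truncation too aggressive: read pairs would not merge." >&2
elif command -v qiime >/dev/null 2>&1 && [ -f demux.qza ]; then
  # Run only when QIIME 2 and the (hypothetical) demux.qza are available.
  qiime dada2 denoise-paired \
    --i-demultiplexed-seqs demux.qza \
    --p-trunc-len-f "$TRUNC_F" \
    --p-trunc-len-r "$TRUNC_R" \
    --o-table table.qza \
    --o-representative-sequences rep-seqs.qza \
    --o-denoising-stats denoising-stats.qza
fi
```

None of these truncation settings affect extract-reads; they only shape the reads that go into denoising.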

Hi Nicholas_Bokulich
Thanks a lot for the helpful comments on my questions.

Best wishes
