Usage of three primer pairs in Extract reference reads step

steffi · July 4, 2018, 10:08am

I am trying to train feature classifiers for my datasets. In the Extract reference reads step, I need to provide the primer sequences. When I checked my sequencing report, I found that the sequencing team used three primer pairs: 27F-518R, 516F-1080R, and 1114F-1492R.
How should I proceed?
Thank you in advance.

Nicholas_Bokulich · July 4, 2018, 12:03pm

Mixtures of amplicons are tricky: they can complicate classification, and different regions may receive different classifications (e.g., more or less taxonomic specificity), so things can get messy. You have two options:

1. Use one of the full-length 16S pre-trained classifiers for your data instead of training your own classifier.
2. Separate your sequences into three files (one per primer pair) using exclude-seqs. Train one classifier per primer pair and use that classifier on the appropriate file.

Option 2 might increase accuracy slightly, but I would personally go with option 1 because it is a lot simpler.
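If you do try option 2, the training part could be scripted roughly as follows. This is only a sketch: it builds (and prints) one `qiime feature-classifier extract-reads` and one `fit-classifier-naive-bayes` command per primer pair. The primer strings are placeholders to be replaced with the actual sequences from the sequencing report, and the file names (`ref-seqs.qza`, `ref-taxonomy.qza`, and the output names) are assumptions for illustration, not files from this thread.

```python
import shlex

# Placeholder primer sequences (hypothetical): replace each string with the
# real primer sequence listed in your sequencing report.
PRIMER_PAIRS = {
    "27F-518R": ("FWD_PRIMER_27F", "REV_PRIMER_518R"),
    "516F-1080R": ("FWD_PRIMER_516F", "REV_PRIMER_1080R"),
    "1114F-1492R": ("FWD_PRIMER_1114F", "REV_PRIMER_1492R"),
}


def build_commands(name, fwd, rev,
                   ref_seqs="ref-seqs.qza", ref_tax="ref-taxonomy.qza"):
    """Return the extract-reads and fit-classifier commands for one primer pair.

    The input and output file names are assumed, illustrative names.
    """
    reads = f"ref-reads-{name}.qza"
    classifier = f"classifier-{name}.qza"
    extract = [
        "qiime", "feature-classifier", "extract-reads",
        "--i-sequences", ref_seqs,
        "--p-f-primer", fwd,
        "--p-r-primer", rev,
        "--o-reads", reads,
    ]
    fit = [
        "qiime", "feature-classifier", "fit-classifier-naive-bayes",
        "--i-reference-reads", reads,
        "--i-reference-taxonomy", ref_tax,
        "--o-classifier", classifier,
    ]
    return [extract, fit]


def all_commands(pairs=PRIMER_PAIRS):
    """Collect the commands for every primer pair, in dictionary order."""
    commands = []
    for name, (fwd, rev) in pairs.items():
        commands.extend(build_commands(name, fwd, rev))
    return commands


if __name__ == "__main__":
    # Print the commands for review instead of running them directly.
    for cmd in all_commands():
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to check the primers and paths before running anything. Each resulting classifier would then be applied only to the reads from its own primer pair.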

Good luck!
