prepare a "regional database for each primer set" for SIDLE

lea.k · May 18, 2022, 1:45pm

Hello Justine,
I'm new to the sequencing world, but I am planning a 16S amplicon sequencing experiment with two variable regions.
To be on the safe side, I'm trying to set up a pipeline and bioinformatic workflow before I start my experiment.
I think Sidle / the SMURF algorithm will fit my approach, and I went through the tutorial.
The primers I would like to use, which are suggested in the literature (for example by Parada et al. 2016), contain Ys, Ms, Ws, Rs, Hs, etc.
Now I am wondering whether I can prepare a "regional database for each primer set" and extract the regions from the database with these non-biological primer sequences, or whether that will be a problem.
If so, do you have any advice on how to deal with that?

Thank you very much!
Best
Lea

jwdebelius · May 18, 2022, 2:18pm

Hi @lea.k,

The Ys, Ms, Ws, etc. in your primers are usually referred to as degenerate nucleotides: positions where more than one nucleotide is possible (for example, Y stands for C or T). Lots of primers have these degenerate positions. Sidle (and QIIME 2 in general) is set up to handle degenerate positions. In fact, one of the primer sets Sidle is tested with contains a degenerate position.
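
To make the idea of a degenerate position concrete, here is a minimal Python sketch that expands the standard IUPAC codes into a regular expression and searches a sequence with it. This is only an illustration, not how Sidle or QIIME 2 implement primer matching internally; the function name `primer_to_regex`, the short `toy_primer`, and the `reference` string are all made up for this example.

```python
import re

# Standard IUPAC nucleotide codes and the bases each one can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def primer_to_regex(primer):
    """Turn a primer with degenerate bases into a regex string (hypothetical helper)."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]
        # Unambiguous bases stay as-is; degenerate ones become a character class.
        parts.append(options if len(options) == 1 else "[" + options + "]")
    return "".join(parts)


# Toy primer and reference sequence, invented for illustration only.
toy_primer = "GTGYCAGCM"
reference = "AAGTGTCAGCAGGG"

pattern = primer_to_regex(toy_primer)
print(pattern)  # GTG[CT]CAGC[AC]

match = re.search(pattern, reference)
print(match.start(), match.group())  # 2 GTGTCAGCA
```

The Y and the M each match either of their two possible bases, so the single degenerate primer behaves like a small pool of four exact primers. Real extraction tools typically also tolerate a few mismatches, which this sketch does not attempt.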

Best,
Justine