How to train the classifier with multiple mixed forward primers?

Hi @Birong,

It appears that all of these primers bind to the same location and differ by only a few bases. You could combine these 4 sequences into a single pseudo-sequence using the IUPAC ambiguity codes.

An extreme case would result in something like this:
GRRTTYGATYMTGGYTYAG
^^Warning: This might be too ambiguous and lead to spurious hits.
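
To make the idea of collapsing several primers into one IUPAC pseudo-sequence concrete, here is a minimal Python sketch. The three short primers are made up for illustration (they are not the primers from this thread), and `iupac_consensus` is a hypothetical helper, not a QIIME 2 function. It assumes the primers are already aligned and of equal length.

```python
# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Reverse lookup: set of bases -> single IUPAC code.
CODE_FOR = {frozenset(bases): code for code, bases in IUPAC.items()}


def iupac_consensus(primers):
    """Collapse aligned, equal-length primers into one degenerate sequence.

    Hypothetical helper for illustration only.
    """
    if len({len(p) for p in primers}) != 1:
        raise ValueError("primers must be aligned and of equal length")
    consensus = []
    for column in zip(*(p.upper() for p in primers)):
        # Union of all bases allowed at this position across the primers.
        bases = set()
        for code in column:
            bases.update(IUPAC[code])
        consensus.append(CODE_FOR[frozenset(bases)])
    return "".join(consensus)


# Made-up example primers (assumption: NOT the primers discussed above).
toy_primers = ["GAGTTTGATC", "GAGTTTGATT", "GGGTTCGATC"]
print(iupac_consensus(toy_primers))  # GRGTTYGATY
```

Every position where the primers disagree turns into an ambiguity code, which is why a mix of several primers can quickly become very permissive.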

Since we can allow for a certain number of mismatches, let's try something like you suggested, either slightly lowering the identity or making a new sequence string (see below). I retained the initial ambiguous IUPAC bases and added additional ones where the common base had a stronger bond (i.e., a G or a C).
GARTTTGATYMTGGCTYAG
^^This still might be too ambiguous, but you get the idea
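
One rough way to put a number on "too ambiguous" is to count how many distinct sequences a degenerate primer expands to. The sketch below does that for the two strings above; `degeneracy` is a hypothetical helper name, and the count is only a heuristic, not a threshold used by any particular tool.

```python
# Number of bases each IUPAC code can represent.
N_BASES = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3, "N": 4,
}


def degeneracy(primer):
    """Count the distinct plain sequences a degenerate primer encodes."""
    total = 1
    for code in primer.upper():
        total *= N_BASES[code]
    return total


print(degeneracy("GRRTTYGATYMTGGYTYAG"))  # 128
print(degeneracy("GARTTTGATYMTGGCTYAG"))  # 16
```

The second string encodes far fewer variants, so it should be less prone to spurious hits than the extreme case.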

:point_right: Another option, which I'd recommend, is to use only one of the primer sets, specifically the one that uses the 28F-YM primer. You can then use the resulting extracted sequences as a reference pool to guide the extraction of this region without the use of additional primer pairs. That is, follow the approach outlined here.

-Cheers!
-Mike
