How to train the classifier with multiple reverse primers?

Hi everyone,
I got the sequencing data from elsewhere, and the authors state that
“The 16S-specific primers contain degenerate sites or, in the case of 926R, represent a combination of three distinct oligonucleotides in order to capture broad eubacterial diversity.”
This is the forward and the reverse primer:
|518F|CCAGCAGCYGCGGTAAN |v4-v5 forward primer|
|926R1|CCGTCAATTCNTTTRAGT |v4-v5 reverse primer|
|926R3|CCGTCAATTTCTTTGAGT |v4-v5 reverse primer|
|926R4|CCGTCTATTCCTTTGANT |v4-v5 reverse primer|
As shown, they used three reverse primers. I want to use the GreenGenes 16S reference sequences. Is it OK to choose only one reverse primer when extracting the reference reads? As far as I can tell, I can only supply one reverse primer… but will that introduce any bias?

Good morning,

Using only a single primer could introduce bias, but fortunately extract-reads has other settings that can help mitigate this issue. For example:

--p-identity FLOAT   minimum combined primer match
                     identity threshold.  [default: 0.8]

Let’s see how similar your three reverse primers are!

|926R1|CCGTCAATTCNTTTRAGT |v4-v5 reverse primer|
|926R3|CCGTCAATTTCTTTGAGT |v4-v5 reverse primer|
|926R4|CCGTCTATTCCTTTGANT |v4-v5 reverse primer|
926R1 vs 926R3 == 3 differences 
926R3 vs 926R4 == 3 differences
926R1 vs 926R4 == 4 differences
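
If you want to double-check those counts, here is a minimal Python sketch. It compares the primers character by character, so a degenerate code such as N or R counts as a difference whenever the letters are not identical, which is how the counts above were made. The helper name `count_differences` is just something I made up for illustration.

```python
from itertools import combinations

# Reverse primers exactly as listed in this thread.
primers = {
    "926R1": "CCGTCAATTCNTTTRAGT",
    "926R3": "CCGTCAATTTCTTTGAGT",
    "926R4": "CCGTCTATTCCTTTGANT",
}


def count_differences(a: str, b: str) -> int:
    """Count positions where two equal-length primers have different letters.

    Degenerate IUPAC codes are treated as plain characters here,
    so 'N' vs 'C' counts as one difference.
    """
    if len(a) != len(b):
        raise ValueError("primers must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Pairwise comparison of every primer against every other primer.
pairwise = {
    (name_a, name_b): count_differences(primers[name_a], primers[name_b])
    for name_a, name_b in combinations(primers, 2)
}

for (name_a, name_b), diffs in pairwise.items():
    print(f"{name_a} vs {name_b} == {diffs} differences")
```

926R3 is the only primer that is at most 3 differences away from both of the others, which is why it is the natural one to pick.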

So now we can compare the default p-identity (0.8) to the three differences between the primers.

3 differences / 18 bp length == 17% different, i.e. 83% similar
80% similar over 18 bp == up to 3.6 bp different, i.e. at most 3 whole mismatches
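
The same arithmetic as a small Python sketch, in case you want to try other thresholds. One caveat: the help text describes p-identity as a "combined primer match" threshold, whereas this sketch follows the simplification used above and looks at the 18 bp reverse primer on its own. The function names are hypothetical.

```python
def identity(differences: int, length: int) -> float:
    """Fraction of matching positions given a number of differences."""
    return (length - differences) / length


def max_mismatches(length: int, threshold: float) -> int:
    """Largest whole number of mismatches that still meets the threshold."""
    return max(
        m for m in range(length + 1)
        if identity(m, length) >= threshold
    )


primer_length = 18

# Identity between 926R3 and either of the other two primers (3 differences).
print(f"3 differences over {primer_length} bp: {identity(3, primer_length):.0%} similar")

# How many mismatches each threshold tolerates over an 18 bp primer.
for threshold in (0.8, 0.7):
    allowed = max_mismatches(primer_length, threshold)
    print(f"--p-identity {threshold}: up to {allowed} mismatches")
```

With the default of 0.8 this allows 3 mismatches over 18 bp, and with 0.7 it allows 5.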

So you could reasonably use just the 926R3 primer, since it is within 3 differences of each of the other two, and the default p-identity will let you capture matches to them. If you want to make sure you get more reads, you could lower the identity threshold, say --p-identity 0.7
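
Putting that together, here is a rough sketch that assembles the extract-reads command with 518F and 926R3 and prints it so you can paste it into a terminal. The file names `ref-seqs.qza` and `ref-seqs-v4v5.qza` are placeholders I made up, so substitute your own GreenGenes artifact and output path. The script only builds the command; it does not run QIIME 2.

```python
import shlex

# Primers from this thread: 518F as forward, 926R3 as the single reverse primer.
FORWARD_PRIMER = "CCAGCAGCYGCGGTAAN"
REVERSE_PRIMER = "CCGTCAATTTCTTTGAGT"


def build_extract_reads_command(
    identity: float = 0.8,
    sequences: str = "ref-seqs.qza",     # placeholder input artifact
    output: str = "ref-seqs-v4v5.qza",   # placeholder output artifact
) -> list[str]:
    """Assemble the argument list for qiime feature-classifier extract-reads."""
    return [
        "qiime", "feature-classifier", "extract-reads",
        "--i-sequences", sequences,
        "--p-f-primer", FORWARD_PRIMER,
        "--p-r-primer", REVERSE_PRIMER,
        "--p-identity", str(identity),
        "--o-reads", output,
    ]


# Print a shell-ready command using the more permissive threshold.
command = build_extract_reads_command(identity=0.7)
print(shlex.join(command))
```

Leaving `identity` at its default gives the 0.8 version of the same command.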

Let me know if this helps!


Hi Colin,
Thank you for the explanation! This perfectly solved my question! :grin:


