I used feature-classifier classify-sklearn to classify my sequences with the default arguments and confidence cut-off. This worked and it's super neat, but I would like to see what the confidence value is for each sequence at each taxonomic level, e.g. Order 100%, Family 98%, Genus 95%, Species ...
So that I can filter reads and make cut-offs at different levels.
Are those values recorded in any of the outputs created by the function, or would I have to rerun it with different code?
This is not possible. The classifier performs classification from tip to root: it first attempts to classify at species level, then genus, and so on. Confidence is therefore only reported for the terminal rank, i.e. the deepest rank at which the confidence threshold is exceeded.
Confidence is the probability of a given label being correct (vs. all other possible labels); it should not be interpreted like sequence similarity. So having a dynamic threshold would not really make sense in this context.
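Since only the terminal-rank confidence is recorded, what you can do is filter on that single value after the fact. Below is a minimal sketch, assuming you have exported the classification to a tab-separated file with `Feature ID`, `Taxon`, and `Confidence` columns and semicolon-separated taxon strings. The function name `terminal_rank_confidence` and the two example rows are hypothetical, made up for illustration only.

```python
import csv
import io


def terminal_rank_confidence(tsv_text, min_confidence=0.0):
    """Return (feature_id, terminal_label, depth, confidence) for each row
    whose reported confidence is at least min_confidence.

    Assumes columns named 'Feature ID', 'Taxon' and 'Confidence', and
    semicolon-separated taxon strings (assumptions about the exported file).
    """
    kept = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        # Split the lineage into ranks; the last entry is the terminal rank.
        ranks = [r.strip() for r in row["Taxon"].split(";") if r.strip()]
        if not ranks:
            ranks = ["Unassigned"]
        confidence = float(row["Confidence"])
        if confidence >= min_confidence:
            kept.append((row["Feature ID"], ranks[-1], len(ranks), confidence))
    return kept


# Hypothetical example data, not real classifier output.
example = (
    "Feature ID\tTaxon\tConfidence\n"
    "seq1\tk__Bacteria; p__Firmicutes; c__Bacilli\t0.98\n"
    "seq2\tk__Bacteria; p__Proteobacteria\t0.75\n"
)

for record in terminal_rank_confidence(example, min_confidence=0.9):
    print(record)
```

Note that this only filters on the one confidence value reported per sequence; it does not recover per-rank confidences, which are not stored in the output.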