Hello,
I am working on a custom OTU clustering algorithm and trying to measure its sensitivity and specificity.
Should I calculate the True Positive (TP), False Positive (FP), False Negative (FN) and True Negative (TN) values for each OTU and take the average, or should I calculate them over all OTUs at once? If the latter,
how can I calculate True Negatives, since they are different for each OTU cluster?
My evaluation is:
If a sequence is correctly clustered, it is a TP.
If a sequence is not correctly clustered, it is an FP.
If a sequence could not be clustered and is left alone, it is an FN.
For True Negatives, I can calculate them for each OTU:
TN = N - (TP + FP + FN)
Should I simply calculate sensitivity and specificity for each OTU alone and then take the average?
I really appreciate your help in understanding this concept.
It sounds like you are asking about different parts of the confusion matrix. If possible, I would report on all 4 combinations (TP, FP, TN, FN), then people can choose whatever metric they like best.
For example, I like balanced accuracy, and I could calculate that from (TP, FP, TN, FN).
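To make that concrete, here is a minimal sketch of balanced accuracy computed from the four counts. The function name and the example counts are made up for illustration, and returning NaN when a denominator is zero is just one possible convention.

```python
def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity (TP rate) and specificity (TN rate)."""
    nan = float("nan")
    # Assumption: report NaN when a rate is undefined (zero denominator).
    sensitivity = tp / (tp + fn) if (tp + fn) else nan
    specificity = tn / (tn + fp) if (tn + fp) else nan
    return (sensitivity + specificity) / 2


# Hypothetical counts, not taken from any real benchmark.
print(balanced_accuracy(tp=40, fp=10, tn=45, fn=5))
```

Because it only needs the four counts, anyone who has (TP, FP, TN, FN) can recompute it later.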
The authors of OptiClust report the Matthews correlation coefficient.
The vsearch devs graphed TPR vs FDR in Figure 1, then reported the Rand index, recall, and precision in Figures 2 and 3.
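For reference, most of the metrics named above can be derived from the same four counts. The sketch below uses the textbook definitions only; it is not taken from the OptiClust or vsearch code, and the helper names and counts are hypothetical. The Rand index is left out because it is normally computed over pairs of sequences rather than over per-sequence counts.

```python
import math


def _ratio(num, den):
    # Assumption: return NaN rather than raising when a denominator is zero.
    return num / den if den else float("nan")


def summary_metrics(tp, fp, tn, fn):
    """Textbook definitions computed from the four confusion-matrix counts."""
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "recall": _ratio(tp, tp + fn),      # also called TPR or sensitivity
        "precision": _ratio(tp, tp + fp),
        "fdr": _ratio(fp, tp + fp),         # equals 1 - precision
        "mcc": _ratio(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical counts, for illustration only.
m = summary_metrics(tp=40, fp=10, tn=45, fn=5)
print(m)
```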
How do you define that? Does your custom algorithm consider something that others do not?
I think I'm beginning to understand. So you take sequences from greengenes, then cluster them against greengenes, and it's a True Positive if it aligns to the same sequence? Or it's TP if it aligned to the same genus?
Sure. The average of sensitivity and specificity is balanced accuracy. That's my favorite too.
Selected sequences are excluded from the reference database.
My TP is when it is in the same genus.
I am trying to measure how well my clustering algorithm agrees with the ground truth.
So, suppose there were 100 clusters in the ground truth and my algorithm produces some number of OTUs. I will calculate sensitivity and specificity for each OTU and take the average. I hope this measurement makes sense?
And finally, I am disregarding the specificity and sensitivity of OTUs with only 1 sequence. I hope this is also OK.
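The per-OTU averaging described above could be sketched roughly as follows. This is only an illustration under assumptions: each OTU is represented by a hypothetical dict with its size and its TP, FP and FN counts, N is taken to be the total number of sequences, TN is derived as N - (TP + FP + FN), and OTUs below a minimum size are skipped.

```python
def per_otu_average(otus, n_total, min_size=2):
    """Average sensitivity and specificity over OTUs, skipping singletons."""
    sensitivities, specificities = [], []
    for otu in otus:
        if otu["size"] < min_size:
            continue  # disregard OTUs with only one sequence
        tp, fp, fn = otu["tp"], otu["fp"], otu["fn"]
        tn = n_total - (tp + fp + fn)
        # Only include a rate when its denominator is non-zero.
        if tp + fn > 0:
            sensitivities.append(tp / (tp + fn))
        if tn + fp > 0:
            specificities.append(tn / (tn + fp))

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return mean(sensitivities), mean(specificities)


# Hypothetical per-OTU counts, for illustration only.
otus = [
    {"size": 10, "tp": 8, "fp": 2, "fn": 1},
    {"size": 5, "tp": 4, "fp": 1, "fn": 0},
    {"size": 1, "tp": 1, "fp": 0, "fn": 0},  # singleton, skipped
]
sens, spec = per_otu_average(otus, n_total=100)
print(sens, spec)
```

Averaging per OTU gives every OTU equal weight regardless of its size, whereas summing the counts first and computing the metrics once would weight large OTUs more heavily, so it helps to state which of the two was used.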
Ah OK! This sounds like "leave-one-out cross validation". That was used to test the RDP classifier and SINTAX.
It's a good method! I've never done this benchmark before, but I would be interested in seeing your results.
Based on what you have described here, it sounds like you are testing both OTU clustering, and also taxonomy assignment, is that correct? Which database are you using as ground truth?
I have randomly selected 100 genera from GreenGenes, each genus group having a minimum of 50 sequences.
I have excluded these selected sequences from GreenGenes.
The remaining GreenGenes database is used for closed reference. I am still learning. I am using the taxonomy only for ground truth purposes. I am also wondering about the difference between closed-reference OTU picking and taxonomy assignment :)))
I really appreciate your time and help. There is more I did not mention here, but I will share it in this forum too if it turns out to be something.