When we think about evaluating a classification machine learning problem, the confusion matrix comes up first. As the name suggests, it can be confusing at first, but it is not too hard to understand if you apply a little logic to the concept.
Generally, this concept is used to understand how many records the model predicted correctly and how many incorrectly. In a classification problem the target values fall into categories, which might be two or more.
Binary classification matrix.
In the target variable we have 0 for negative and 1 for positive. Let's say we have 50 target records in total, and out of those 50 we have 30 positives and 20 negatives; please keep that in mind.
Now, suppose our machine learning model's output is 35 positive and 15 negative. From these totals alone we are unable to decide which records were predicted correctly and which were not.
So, let's understand some concepts in the confusion matrix.
0--negative
1--positive.
TP is called true positive, meaning the model predicted positive and the record is really positive in the real data.
TN is called true negative, meaning the model predicted negative and the record is really negative in the real data.
FP is called false positive, meaning the model predicted positive but the record is really negative in the real data.
FN is called false negative, meaning the model predicted negative but the record is really positive in the real data.
So, according to the above terminology, our model's predictions break down as:
TP=25
TN=10
FP=10
FN=5
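These four counts can be tallied by hand. As a quick sketch, the lists y_true and y_pred below are made-up labels arranged to match the example above (30 actual positives, 20 actual negatives, with the model getting 25 positives and 10 negatives right):

```python
# Hypothetical labels matching the example: 30 actual positives, 20 actual negatives.
y_true = [1] * 30 + [0] * 20
# Model output arranged so TP=25, FN=5, FP=10, TN=10 (35 positives, 15 negatives total).
y_pred = [1] * 25 + [0] * 5 + [1] * 10 + [0] * 10

# Count each cell of the confusion matrix by comparing truth and prediction.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

print(tp, tn, fp, fn)  # 25 10 10 5
```

In practice you would get y_true from your labeled data and y_pred from your model, but the counting logic stays the same.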
If the concept is clear so far, let's now understand the rules used to calculate these metrics on our dataset.
precision = TP / (TP + FP) -- Precision is the fraction of the records the model predicted positive that are actually positive.
As per the above example we have TP = 25 and FP = 10, so precision = 25 / (25 + 10) = 25/35 ≈ 0.714.
recall = TP / (TP + FN) -- Recall is the fraction of the actual positives that the model correctly predicted as positive.
As per the above example we have TP = 25 and FN = 5, so recall = 25 / (25 + 5) = 25/30 ≈ 0.833.
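The two formulas above can be checked with a couple of small helper functions (the function names here are just for illustration):

```python
def precision(tp, fp):
    # Fraction of predicted positives that are actually positive.
    return tp / (tp + fp)

def recall(tp, fn):
    # Fraction of actual positives that the model caught.
    return tp / (tp + fn)

print(round(precision(25, 10), 3))  # 0.714
print(round(recall(25, 5), 3))      # 0.833
```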
Now let's understand type 1 and type 2 errors.
FP (false positive) is called a type 1 error.
FN (false negative) is called a type 2 error.
What is specificity?
Specificity is calculated as the number of correct negative predictions divided by the total number of actual negatives: TN / (TN + FP).
This is also called the true negative rate. The best specificity is 1 and the worst is 0.
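With the example counts (TN = 10, FP = 10), specificity works out to 0.5, which we can verify with a small helper (the function name is just for illustration):

```python
def specificity(tn, fp):
    # True negative rate: correct negatives out of all actual negatives.
    return tn / (tn + fp)

print(specificity(10, 10))  # 0.5
```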
What is accuracy?
Accuracy in the confusion matrix is calculated as the total number of correct predictions divided by the total number of records:
(TP + TN) / (TP + TN + FN + FP)
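For our example, that is (25 + 10) / 50 = 0.7. A quick sketch of the calculation:

```python
def accuracy(tp, tn, fp, fn):
    # Correct predictions (TP + TN) out of all records.
    return (tp + tn) / (tp + tn + fp + fn)

print(accuracy(25, 10, 10, 5))  # 0.7
```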
What is the F-measure? The standard F1-score is the harmonic mean of precision and recall. A perfect model has an F-score of 1.
F-measure is calculated as 2 * precision * recall / (precision + recall).
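Using the precision (25/35) and recall (25/30) from the example above, the F1-score comes out to about 0.769. A minimal sketch:

```python
def f1_score(precision, recall):
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

p = 25 / 35  # precision from the example
r = 25 / 30  # recall from the example
print(round(f1_score(p, r), 3))  # 0.769
```

The harmonic mean punishes imbalance: if either precision or recall is low, the F1-score drops toward that low value, unlike a plain average.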