A confusion matrix is used with a logistic model, where the dependent variable has a binary response. It compares the observed values of the dependent variable with the values the model predicts, and in doing so validates the accuracy of the model.
| Confusion Matrix | Predicted: Non-event | Predicted: Event |
|---|---|---|
| Observed: Non-event | True Negative (TN) | False Positive (FP) |
| Observed: Event | False Negative (FN) | True Positive (TP) |
True Negative: accounts for the instances that were predicted as non-events and actually are non-events.
False Positive: accounts for the instances that were predicted as events but are actually non-events. This is also referred to as a Type-I error, or a false alarm.
False Negative: accounts for the instances that were predicted as non-events but are actually events. This is also referred to as a Type-II error: an opportunity where we missed raising an alarm.
True Positive: accounts for the instances that were predicted as events and actually are events. The true positive rate is also referred to as the power of the model.
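The four cells above can be tallied directly from observed and predicted labels. A minimal sketch, using made-up binary labels (1 = event, 0 = non-event) purely for illustration:

```python
# Tally the four confusion-matrix cells from observed vs. predicted
# binary labels (1 = event, 0 = non-event).
def confusion_counts(observed, predicted):
    tn = fp = fn = tp = 0
    for obs, pred in zip(observed, predicted):
        if obs == 0 and pred == 0:
            tn += 1   # True Negative
        elif obs == 0 and pred == 1:
            fp += 1   # False Positive (Type-I error)
        elif obs == 1 and pred == 0:
            fn += 1   # False Negative (Type-II error)
        else:
            tp += 1   # True Positive
    return tn, fp, fn, tp

# Illustrative labels, not from a real model
observed  = [0, 0, 1, 1, 0, 1, 0, 1]
predicted = [0, 1, 1, 0, 0, 1, 0, 1]
print(confusion_counts(observed, predicted))  # → (3, 1, 1, 3)
```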
In hypothesis-testing terms, the level of significance α is the probability of a Type-I error (a false positive), while β is the probability of a Type-II error (a false negative). 1 − β is referred to as the power of the model, i.e. its true positive rate.
How do we check model performance?
Where:
TN = True Negative
TP = True Positive
FP = False Positive
FN = False Negative
Accuracy: measures the overall discriminatory power of the model: Accuracy = (TP + TN) / (TP + TN + FP + FN).
Precision: measures what proportion of the cases the model predicts as events are actual events: Precision = TP / (TP + FP).
Sensitivity: measures what proportion of the observed events the model actually classifies as events (the true positive rate): Sensitivity = TP / (TP + FN).
Specificity: measures the true negative rate, i.e. what proportion of observed non-events are classified as non-events: Specificity = TN / (TN + FP).
The ROC curve is used with a logistic model to check how well the predictive model can discriminate between good and bad cases (events and non-events).
The ROC curve plots sensitivity on the y-axis against 1 − specificity on the x-axis.
At each point on the curve, sensitivity shows what percentage of actual events (e.g. defaults) the model captures at a given threshold, while 1 − specificity shows the false positive rate at that same threshold.
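Each point on the ROC curve corresponds to one classification threshold applied to the model's predicted probabilities. A minimal sketch that traces those points by sweeping the threshold, using made-up scores and labels for illustration:

```python
# Trace ROC points (1 - specificity, sensitivity) by sweeping a
# threshold over predicted event probabilities, highest first.
def roc_points(labels, scores):
    pos = sum(labels)              # number of actual events
    neg = len(labels) - pos        # number of actual non-events
    points = []
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= thr)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= thr)
        points.append((fp / neg, tp / pos))  # (1 - specificity, sensitivity)
    return points

# Illustrative labels and predicted probabilities
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_points(labels, scores))
# → [(0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
```

A model that separates the classes well produces points that hug the top-left corner (high sensitivity at a low false positive rate); points along the diagonal are no better than random guessing.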