Once we have developed the logistic model and have our final equation, we need to check the accuracy of the model on data outside the sample used to build it. This is known as “out-of-sample validation” (when the holdout data comes from a later time period, it is also called out-of-time, or OOT, validation). The aim is to check how close the predicted probabilities are to the actual values when we rerun the model on a different data set, using the parameter estimates obtained during model development.
This can be done with the help of an “Actual vs. Predicted” curve.
This curve shows how well the predicted curve fits the actual curve; in simpler terms, it checks how closely the model's predicted probabilities match the observed outcomes.
For the accuracy test of the model we require two things: the predicted value and the actual value.
- Predicted Value: Using the parameter estimates generated during the development phase, we compute the predicted probability.
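Logistic regression gives us an equation of the following form:
ln(P / (1 - P)) = β0 + β1·X1 + β2·X2 + … + βk·Xk, or equivalently P = 1 / (1 + exp(-(β0 + β1·X1 + … + βk·Xk)))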
Now, keeping these Betas the same as in the development phase, we feed in new values of our independent variables (i.e. the X’s) and get a new predicted probability (let’s call it Phat).
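This is done with the help of Proc Score, which takes the parameter estimates of the development model and applies them to the new data. Note that Proc Score returns the linear combination of the X’s (the log-odds), so the logistic transformation Phat = 1 / (1 + exp(-score)) still has to be applied to obtain the predicted probability of the target variable.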
- Actual Value: The actual target variable from the ‘out of sample data’.
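To make the scoring step concrete, here is a minimal sketch in Python rather than SAS. It only illustrates the arithmetic behind scoring and is not a reproduction of what Proc Score does internally. The variable names (`age`, `balance`), the coefficient values, and the two out-of-sample rows are all made up for illustration.

```python
import math

def score(betas, intercept, row):
    """Return the predicted probability (Phat) for one observation."""
    # Linear predictor: beta0 + sum(beta_j * x_j); this is the log-odds.
    xbeta = intercept + sum(betas[name] * row[name] for name in betas)
    # The inverse logit turns the log-odds into a probability between 0 and 1.
    return 1.0 / (1.0 + math.exp(-xbeta))

# Hypothetical development-phase estimates (illustrative, not from a real model).
intercept = -1.0
betas = {"age": 0.02, "balance": -0.5}

# Hypothetical out-of-sample observations: same X's, new values.
oos_rows = [
    {"age": 50, "balance": 0.0},
    {"age": 30, "balance": 2.0},
]

# The Betas stay fixed; only the X values change.
phat = [score(betas, intercept, row) for row in oos_rows]
```

The key point is that nothing is re-estimated here: the coefficients are frozen from development, and the out-of-sample data only supplies new X values.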
We then plot the predicted values against the actual values. We can also divide the scored probabilities into deciles and compare, decile by decile, the average predicted probability with the actual event rate.
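The decile comparison can be sketched as follows, again in Python as an illustration. The function name `decile_table` and the toy data are assumptions made for this example. The sketch ranks observations by Phat, splits them into ten equal-sized groups, and reports the mean predicted probability next to the actual event rate for each group.

```python
def decile_table(phat, actual, n_bins=10):
    """Rank by predicted probability, split into n_bins groups, compare means."""
    # Sort so that decile 1 holds the highest predicted probabilities.
    pairs = sorted(zip(phat, actual), key=lambda pair: pair[0], reverse=True)
    n = len(pairs)
    table = []
    for b in range(n_bins):
        start, end = b * n // n_bins, (b + 1) * n // n_bins
        group = pairs[start:end]
        if not group:
            continue  # fewer observations than bins
        mean_pred = sum(p for p, _ in group) / len(group)
        event_rate = sum(a for _, a in group) / len(group)
        table.append((b + 1, len(group), mean_pred, event_rate))
    return table

# Toy out-of-sample data (made up): 20 scored probabilities and 0/1 outcomes.
phat = [(i + 0.5) / 20 for i in range(20)]
actual = [0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1]

table = decile_table(phat, actual)
for decile, size, mean_pred, event_rate in table:
    print(f"decile {decile:2d}  n={size}  predicted={mean_pred:.3f}  actual={event_rate:.3f}")
```

If the model holds up out of sample, the predicted and actual columns should track each other closely across deciles; large gaps in particular deciles show where the model over- or under-predicts.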