Named after the statistician Corrado Gini, the Gini coefficient is primarily a measure of statistical dispersion, originally intended to represent the distribution of income or wealth within a nation. In credit risk, it is used to measure how effectively a model discriminates between goods and bads compared with random selection. GINI lies between -1 and 1. A model that perfectly separates the goods from the bads has GINI = 1, whereas a model with GINI = 0 has no predictive power, i.e. it is no different from random selection. A negative GINI means the model ranks goods and bads in the wrong order.
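Explaining GINI with the help of a graph: it is the area between the Lorenz curve (dashed line) and the diagonal (45-degree line), taken as a share of the total area on that side of the diagonal so that perfect separation scores 1. This is represented below: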
It’s important to keep in mind what happens if we have a perfect Lorenz curve. A perfect Lorenz curve gives a high GINI, but it may also mean that the model fits our data completely (i.e. we have included every variable that could possibly explain the response variable), so any deviation or change in the data might not be captured well by the model. Hence an overfitted model may not be a good one either.
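To make the area definition concrete, here is a minimal Python sketch. It assumes the Lorenz curve plots the cumulative share of bads against the cumulative share of goods, with groups ordered from riskiest to safest, and it approximates the area with the trapezoidal rule. The function name gini_from_groups and the toy counts are hypothetical and chosen only for illustration; they are not the telecom data used in the example.

```python
def gini_from_groups(goods, bads):
    """Approximate GINI from per-group counts of goods and bads.

    Assumption: groups are ordered from riskiest to safest, so a good
    model puts most of the bads in the first groups.
    """
    total_goods = sum(goods)
    total_bads = sum(bads)
    if total_goods == 0 or total_bads == 0:
        raise ValueError("need at least one good and one bad")

    cum_good = 0.0  # x-axis: cumulative share of goods
    cum_bad = 0.0   # y-axis: cumulative share of bads
    area_under_curve = 0.0
    for g, b in zip(goods, bads):
        next_good = cum_good + g / total_goods
        next_bad = cum_bad + b / total_bads
        # Trapezoid between two consecutive points on the curve.
        area_under_curve += (next_good - cum_good) * (cum_bad + next_bad) / 2
        cum_good, cum_bad = next_good, next_bad

    # The area between the curve and the diagonal is (area_under_curve - 0.5);
    # dividing by the 0.5 available above the diagonal gives 2 * area - 1.
    return 2 * area_under_curve - 1


# Hypothetical counts for three groups (not taken from the article's data).
toy_goods = [20, 30, 50]
toy_bads = [6, 3, 1]
print(round(gini_from_groups(toy_goods, toy_bads), 4))
```

With these assumptions, a grouping that puts all bads ahead of all goods returns 1, identical bad rates in every group return 0, and the reverse ordering returns -1, which matches the range described above.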
Now that we know GINI captures the discriminatory power of a model, the question arises: how do we calculate GINI for our model? Let’s explain this with the help of an example:
Suppose you have data from a telecom company that has run a survey about its new scheme. Out of 5291 people (aged between 25 and 35), 182 responded NO to the new scheme, while 5109 responded YES. You have ranked the data by age and divided it into deciles (10 equal-sized groups). You need to check the overall effectiveness of this scheme: