The Gini coefficient
Wildon's Weblog 2018-04-11
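We define the Gini coefficient $G$ of a probability measure $p$ on $[0,\infty)$ with mean $\mu$ by
$$G = \frac{\mathbb{E}[|X-Y|]}{2\mu}$$
where $X$ and $Y$ are independently distributed according to $p$. Thus, for a discrete measure giving probability $p_x$ to $x$, $G = \frac{1}{2\mu}\sum_{x}\sum_{y} |x-y|\, p_x p_y$. Dividing by $\mu$ makes $G$ a dimensionless quantity. The reason for normalizing by $2$ will be seen shortly.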
The Gini coefficient is a measure of inequality. For instance, if $p_x$ is the probability that a citizen’s wealth is $x$, then we can sample $|X-Y|$ by picking two random citizens, and taking (the absolute value of) the difference of their wealths. In
- Utopia, where everyone happily owns the same (ample) amount, the Gini coefficient is $0$;
- Dystopia, where the ruler has a modest fortune of $N$ units, and the other $N-1$ 'citizens' have nothing at all, the Gini coefficient is $1 - 1/N$;
- Subtopia, where each serf has $1$ unit, and each land-owner has $2$ units (plus two serfs to boss around), the mean income is $4/3$ and the Gini coefficient is $1/6$.
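The three values can be checked by brute force. The following is a minimal sketch, reading the definition as the mean absolute difference of two independently chosen citizens divided by twice the mean wealth. The function name `gini`, the population size `N = 10`, the Utopian wealth of 5 units, and the Subtopian wealth levels (1 unit per serf, 2 units per land-owner) are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

def gini(wealths):
    """Gini coefficient E|X - Y| / (2 * mu) for a finite population.

    X and Y are drawn independently and uniformly (with replacement)
    from `wealths`.
    """
    n = len(wealths)
    mu = Fraction(sum(wealths), n)
    total_abs_diff = sum(abs(x - y) for x, y in product(wealths, repeat=2))
    mean_abs_diff = Fraction(total_abs_diff, n * n)
    return mean_abs_diff / (2 * mu)

N = 10                           # assumed population size for Dystopia
utopia = [5] * N                 # everyone owns the same amount
dystopia = [N] + [0] * (N - 1)   # the ruler owns N units, everyone else nothing
subtopia = [1, 1, 2] * 4         # two serfs (1 unit) per land-owner (2 units)

print(gini(utopia))    # 0
print(gini(dystopia))  # 9/10, which is 1 - 1/N
print(gini(subtopia))  # 1/6
```

Exact rational arithmetic via `Fraction` avoids rounding noise, at the cost of a quadratic loop over all pairs, which is fine for toy populations.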
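A striking interpretation of $G$ uses the Lorenz curve. Staying with the wealth interpretation, define $L(\theta)$ to be the proportion of all wealth owned by the poorest $\theta$ of the population. Thus $L(0) = 0$, $L(1) = 1$ and, since the poorest third (for example) cannot have more than a third of the wealth, $L(\theta) \le \theta$ for all $\theta$. When the population is large we can approximate $L$ by the following functions $[0,1] \to [0,1]$: in
- Utopia $L(\theta) = \theta$ for all $\theta$;
- Dystopia $L(\theta) = 0$ if $\theta < 1$ and $L(1) = 1$;
- Subtopia $L(\theta) = 3\theta/4$ if $0 \le \theta \le 2/3$ and $L(\theta) = 1/2 + 3(\theta - 2/3)/2$ if $2/3 \le \theta \le 1$.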
For Subtopia, since the area below the Lorenz curve is $1/6 + 1/4 = 5/12$, the orange area between the diagonal and the curve is $1/2 - 5/12 = 1/12$. This is half the Gini coefficient.
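The area computation for Subtopia can be reproduced exactly. This sketch assumes a finite population in which serfs own 1 unit and land-owners own 2 units, and joins the vertices of the Lorenz curve by straight lines. The helper names `lorenz_points` and `area_under` are made up for the example.

```python
from fractions import Fraction

def lorenz_points(wealths):
    """Vertices (theta, L(theta)) of the Lorenz curve of a finite population."""
    ordered = sorted(wealths)            # poorest first
    n, total = len(ordered), sum(ordered)
    points, running = [(Fraction(0), Fraction(0))], 0
    for k, w in enumerate(ordered, start=1):
        running += w                     # wealth of the poorest k citizens
        points.append((Fraction(k, n), Fraction(running, total)))
    return points

def area_under(points):
    """Trapezoid rule; exact here because the curve is piecewise linear."""
    return sum((t1 - t0) * (l0 + l1) / 2
               for (t0, l0), (t1, l1) in zip(points, points[1:]))

subtopia = [1, 1, 2]                     # two serfs and one land-owner
pts = lorenz_points(subtopia)
below = area_under(pts)
orange = Fraction(1, 2) - below          # area between diagonal and curve

print(pts[2])   # (Fraction(2, 3), Fraction(1, 2))
print(below)    # 5/12
print(orange)   # 1/12
```

The vertex at $\theta = 2/3$ shows that the serfs, two thirds of the population, own half the wealth.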
Theorem. $G/2$ is the area between the Lorenz curve for $p$ and the diagonal.
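Proof. Let $P$ be the cumulative distribution function for $p$, so $P(x) = \mathbb{P}[X \le x]$. If you are person $\omega$, and your wealth is $x$, so $X(\omega) = x$, then $P(x) \le \theta$ if and only if you are in the poorest $\theta$ of the population. Therefore the poorest $\theta$ of the population form the event $\{P(X) \le \theta\}$. Their proportion of the wealth is the expectation of $X$, taken over this event, divided by $\mu$. That is
$$L(\theta) = \frac{\mathbb{E}[X 1_{P(X) \le \theta}]}{\mu}.$$
The area, $A$ say, under the Lorenz curve is therefore
$$A = \int_0^1 L(\theta)\,d\theta = \frac{1}{\mu}\,\mathbb{E}\Bigl[X \int_0^1 1_{P(X) \le \theta}\,d\theta\Bigr] = \frac{\mathbb{E}[X(1 - P(X))]}{\mu} = \frac{\mathbb{E}_{X,Y}[X 1_{X < Y}]}{\mu},$$
where the final equality holds because $1 - P(x) = \mathbb{P}[Y > x]$.
Now since $X 1_{X < Y} + Y 1_{Y < X} = \min(X,Y)$, where the event $X = Y$ is assumed to be negligible, it follows from linearity of expectation and the symmetry between $X$ and $Y$ that $\mathbb{E}_{X,Y}[X 1_{X < Y}] = \frac{1}{2}\mathbb{E}_{X,Y}[\min(X,Y)]$. (Here $\mathbb{E}_{X,Y}$ denotes the expectation taken over both random variables.) Substituting we obtain
$$A = \frac{\mathbb{E}_{X,Y}[\min(X,Y)]}{2\mu}.$$
From the identity $\min(x,y) = \frac{x+y}{2} - \frac{|x-y|}{2}$, then another application of linearity of expectation, and finally the definition of $G$ we get
$$A = \frac{\mathbb{E}[X] + \mathbb{E}[Y]}{4\mu} - \frac{\mathbb{E}[|X-Y|]}{4\mu} = \frac{1}{2} - \frac{G}{2}.$$
Therefore $G/2 = 1/2 - A$, which is the area between the diagonal and the Lorenz curve, as claimed.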
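As a sanity check on the theorem, here is a small Monte Carlo sketch for one continuous example, the exponential distribution with rate $1$. Its Gini coefficient is $1/2$, and the area $A$ under its Lorenz curve $L(\theta) = \theta + (1-\theta)\log(1-\theta)$ is $1/4$. The sample size, the seed, and the variable names are arbitrary choices. The estimators below assume that for a continuous distribution $A$ can be written either as $\mathbb{E}[X 1_{X<Y}]/\mu$ or as $\mathbb{E}[\min(X,Y)]/2\mu$.

```python
import random

random.seed(0)          # fixed seed so the run is repeatable
n = 200_000             # number of independent (X, Y) pairs
xs = [random.expovariate(1.0) for _ in range(n)]
ys = [random.expovariate(1.0) for _ in range(n)]

mu = sum(xs) / n
# G = E|X - Y| / (2 mu)
gini = sum(abs(x - y) for x, y in zip(xs, ys)) / n / (2 * mu)
# A written as E[X 1_{X<Y}] / mu
area_indicator = sum(x for x, y in zip(xs, ys) if x < y) / n / mu
# A written as E[min(X, Y)] / (2 mu)
area_min = sum(min(x, y) for x, y in zip(xs, ys)) / n / (2 * mu)

print(gini, area_indicator, area_min, 0.5 - gini / 2)
```

Up to sampling error, the first number should be close to $1/2$ and the other three close to $1/4$, in line with $A = 1/2 - G/2$.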