












This document covers the calculation of the Theil-T and Theil-L inequality measures and their decomposition into between-group and within-group components. It provides a detailed example using income data from three groups, calculating the overall Theil indices as well as the group-specific indices. It also touches on the Gini coefficient, the Sen social welfare function, and the impact of adding individuals with zero income on the inequality measures.













A population is divided into four groups, each one with four individuals. The individual incomes are:

$x_1 = [1, 1, 2, 4]$; $x_2 = [1, 1, 2, 4]$; $x_3 = [2, 2, 4, 8]$; $x_4 = [4, 4, 8, 16]$

Calculate the two Theil measures of inequality, verifying that the between-group and within-group components are the same for both measures.
Solution
For the two Theil measures, we use the formulas

$$T = \sum_i y_i \ln(N y_i) \quad \text{and} \quad L = -\frac{1}{N} \sum_i \ln(N y_i),$$

where $N$ is the size of the population and $y_i$ is the share of individual $i$'s income in total income, that is, $y_i = \frac{x_i}{\sum_i x_i} = \frac{x_i}{N\mu}$. Using individual incomes, we have that

$$T = \frac{1}{N\mu} \sum_i x_i \ln\frac{x_i}{\mu} \quad \text{and} \quad L = -\frac{1}{N} \sum_i \ln\frac{x_i}{\mu}.$$

For the total population, we have that $N = 16$ and $\mu = (1 + 1 + 2 + 4 + 1 + 1 + 2 + 4 + 2 + 2 + 4 + 8 + 4 + 4 + 8 + 16)/16 = 4$, therefore $N\mu = 64$. The Theil-T index for the total population is
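$$T = \frac{1}{64}\left[4 \times 1 \times \ln\frac{1}{4} + 4 \times 2 \times \ln\frac{2}{4} + 5 \times 4 \times \ln\frac{4}{4} + 2 \times 8 \times \ln\frac{8}{4} + 1 \times 16 \times \ln\frac{16}{4}\right]$$

$$= \frac{1}{64}\left[4(-2 \ln 2) + 8(-\ln 2) + 0 + 2 \times 8 \times \ln 2 + 16 \times 2 \times \ln 2\right]$$

$$= \frac{1}{64}\left[(-8 - 8 + 16 + 32) \ln 2\right] = \frac{1}{64}(32 \ln 2) = \frac{1}{2} \ln 2$$

$T = 0.34657$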
The Theil-L for the total population is
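$$L = -\frac{1}{16}\left[4 \ln\frac{1}{4} + 4 \ln\frac{2}{4} + 5 \ln\frac{4}{4} + 2 \ln\frac{8}{4} + \ln\frac{16}{4}\right]$$

$$= -\frac{1}{16}\left(-2 \times 4 \times \ln 2 - 4 \ln 2 + 5 \times 0 + 2 \ln 2 + 2 \ln 2\right)$$

$$= -\frac{1}{16}\left[(-8 - 4 + 2 + 2) \ln 2\right] = -\frac{1}{16}(-8 \ln 2) = \frac{\ln 2}{2}$$

$L = 0.34657$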
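The two totals above can be checked numerically. The following is a minimal sketch that applies the income-based formulas for $T$ and $L$ directly; the function names `theil_t` and `theil_l` are illustrative choices, not part of the original text.

```python
import math

def theil_t(incomes):
    """Theil-T: (1 / (N * mu)) * sum(x * ln(x / mu)); assumes positive incomes."""
    n = len(incomes)
    mu = sum(incomes) / n
    return sum(x * math.log(x / mu) for x in incomes) / (n * mu)

def theil_l(incomes):
    """Theil-L: -(1 / N) * sum(ln(x / mu)); assumes positive incomes."""
    n = len(incomes)
    mu = sum(incomes) / n
    return -sum(math.log(x / mu) for x in incomes) / n

# The four groups of the exercise, pooled into one population of 16.
groups = [[1, 1, 2, 4], [1, 1, 2, 4], [2, 2, 4, 8], [4, 4, 8, 16]]
population = [x for group in groups for x in group]

t_total = theil_t(population)
l_total = theil_l(population)
print(round(t_total, 5), round(l_total, 5))  # 0.34657 0.34657, i.e. ln(2)/2
```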
Now let's decompose each index and calculate their within and between groups components. We denote the size of group $h$ by $n_h$ and the share of group $h$ in total population by $\pi_h = \frac{n_h}{N}$. The share of total income for individual $i$ from group $h$ is $y_{hi} = \frac{x_{hi}}{N\mu}$ and the fraction of total income for group $h$ is $Y_h = \sum_i y_{hi}$.
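$$L_1 = \frac{1}{4} \sum_i \ln\frac{\mu_1}{x_{1i}} = \frac{1}{4}(\ln 2 + \ln 2 + \ln 1 - \ln 2) = \frac{\ln 2}{4}$$

$$L_2 = \frac{\ln 2}{4}$$

$$L_3 = \frac{1}{4} \sum_i \ln\frac{\mu_3}{x_{3i}} = \frac{1}{4}(\ln 2 + \ln 2 + \ln 1 - \ln 2) = \frac{\ln 2}{4}$$

$$L_4 = \frac{1}{4} \sum_i \ln\frac{\mu_4}{x_{4i}} = \frac{1}{4}(\ln 2 + \ln 2 + \ln 1 - \ln 2) = \frac{\ln 2}{4}$$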
Note that $L_1 = L_2 = L_3 = L_4$, that is, inequality within each group is the same. This happens because the Theil-L index (as well as the Theil-T and the Gini) satisfies the income homogeneity condition, also known as scale independence. Observe that $x_4 = 2x_3 = 4x_2 = 4x_1$, that is, the income distribution of any given group can be written as the income distribution of any other group multiplied by a constant. The Theil-L within groups component is
$$\sum_{h=1}^{4} \pi_h L_h = 0.25 \times 4 \times \frac{\ln 2}{4} = \frac{\ln 2}{4}$$

Therefore, the between groups component is

$$L_b = L - \sum_{h=1}^{4} \pi_h L_h = \frac{\ln 2}{2} - \frac{\ln 2}{4} = \frac{\ln 2}{4}$$
We also verify for the Theil-L index that inequality between and within groups are the same.
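As a cross-check of the decomposition, here is a small sketch that computes the within and between components for both indices. It assumes the standard weights (population shares $\pi_h$ for the Theil-L, income shares $Y_h$ for the Theil-T) and the between-group formulas $\sum_h \pi_h \ln(\pi_h / Y_h)$ and $\sum_h Y_h \ln(Y_h / \pi_h)$; all variable names are illustrative.

```python
import math

def theil_t(incomes):
    n = len(incomes)
    mu = sum(incomes) / n
    return sum(x * math.log(x / mu) for x in incomes) / (n * mu)

def theil_l(incomes):
    n = len(incomes)
    mu = sum(incomes) / n
    return -sum(math.log(x / mu) for x in incomes) / n

groups = [[1, 1, 2, 4], [1, 1, 2, 4], [2, 2, 4, 8], [4, 4, 8, 16]]
population = [x for group in groups for x in group]
n_total = len(population)
total_income = sum(population)

pop_shares = [len(g) / n_total for g in groups]       # pi_h
inc_shares = [sum(g) / total_income for g in groups]  # Y_h

# Theil-L: within term weighted by population shares.
l_within = sum(p * theil_l(g) for p, g in zip(pop_shares, groups))
l_between = sum(p * math.log(p / y) for p, y in zip(pop_shares, inc_shares))

# Theil-T: within term weighted by income shares.
t_within = sum(y * theil_t(g) for y, g in zip(inc_shares, groups))
t_between = sum(y * math.log(y / p) for p, y in zip(pop_shares, inc_shares))

print(l_within, l_between, t_within, t_between)
```

In this particular population all four components come out equal to $\ln 2 / 4$, and each pair adds up to the corresponding total index.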
The individual incomes for three groups are given.

In group 1, there are six individuals with incomes $x_{11} = x_{12} = 0.5$; $x_{13} = x_{14} = x_{15} = 1$; $x_{16} = 8$.
In group 2, there are five individuals with incomes $x_{21} = x_{22} = x_{23} = x_{24} = 1$; $x_{25} = 16$.
Group 3 has only three individuals and their incomes are $x_{31} = x_{32} = x_{33} = 16$.

The three groups together constitute a total population of 14 individuals. Calculate the mean, median, mode, amplitude (range) and variance of income for the 14 individuals. Calculate the Theil-T index of inequality within each group, the index of global inequality, and its within and between groups components. Do the same for the Theil-L index. Do the same for the Gini index.
Solution
First, let's calculate the measures for the total population. We have that $N = 14$ and total income is

$$\sum_h \sum_i x_{hi} = 2 \times 0.5 + 7 \times 1 + 8 + 4 \times 16 = 80$$

The mean is $\mu = 80/14 = 40/7$. To find the median, we rank the distribution from the lowest to the highest income and take its central term. We have that $x = [1/2, 1/2, 1, 1, 1, 1, 1, 1, 1, 8, 16, 16, 16, 16]$. The median is 1, the mean of the 7th and 8th terms of the distribution. Therefore, half of the population has incomes less than or equal to 1, while the other half has incomes greater than or equal to 1. The mode is the most frequent value in the distribution and is also 1. The amplitude (range) is the difference between the extreme incomes of the distribution and is therefore $16 - \frac{1}{2} = 15.5$. We denote the variance by $\sigma^2$. We have that

$$\sigma^2 = \frac{\sum_i (x_i - \mu)^2}{N} = \frac{\sum_i (x_i - 40/7)^2}{14} = 45.59694$$

The Theil-T index for the total distribution is

$$T = \frac{1}{N\mu} \sum_i x_i \ln\frac{x_i}{\mu} = \frac{1}{80}\left[2 \times 0.5 \times \ln\frac{0.5}{40/7} + 7 \times 1 \times \ln\frac{1}{40/7} + 8 \times \ln\frac{8}{40/7} + 4 \times 16 \times \ln\frac{16}{40/7}\right]$$
The Theil-L is
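$$L = -\frac{1}{N} \sum_i \ln\frac{x_i}{\mu} = -\frac{1}{14}\left[2 \times \ln\frac{0.5}{40/7} + 7 \times \ln\frac{1}{40/7} + \ln\frac{8}{40/7} + 4 \times \ln\frac{16}{40/7}\right]$$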
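The summary statistics and the two Theil expressions for the 14 individuals can be evaluated with a short script. This is only a sketch using Python's standard `statistics` module; the variance is the population variance (divisor $N$), as in the formula above, and the variable names are illustrative.

```python
import math
import statistics

# The 14 incomes: group 1, group 2, group 3.
incomes = [0.5, 0.5, 1, 1, 1, 8] + [1, 1, 1, 1, 16] + [16, 16, 16]

n = len(incomes)
mean = sum(incomes) / n
median = statistics.median(incomes)      # mean of the 7th and 8th ranked values
mode = statistics.mode(incomes)          # most frequent value
amplitude = max(incomes) - min(incomes)  # range of the distribution
variance = statistics.pvariance(incomes) # population variance, divisor N

# Theil indices from the income-based formulas.
theil_t_total = sum(x * math.log(x / mean) for x in incomes) / (n * mean)
theil_l_total = -sum(math.log(x / mean) for x in incomes) / n

print(mean, median, mode, amplitude, round(variance, 5))
print(round(theil_t_total, 5), round(theil_l_total, 5))
```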
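Populations in each group are $n_1 = 6$; $n_2 = 5$; $n_3 = 3$.
Shares of population in each group are $\pi_1 = \frac{6}{14}$; $\pi_2 = \frac{5}{14}$; $\pi_3 = \frac{3}{14}$.
Mean incomes in each group are $\mu_1 = \frac{2 \times 0.5 + 3 \times 1 + 8}{6} = 2$; $\mu_2 = \frac{4 \times 1 + 16}{5} = 4$; $\mu_3 = \frac{3 \times 16}{3} = 16$.
Shares in total income by each group are $Y_1 = \frac{2 \times 6}{80} = \frac{12}{80} = \frac{3}{20}$; $Y_2 = \frac{4 \times 5}{80} = \frac{20}{80} = \frac{5}{20}$; $Y_3 = \frac{16 \times 3}{80} = \frac{48}{80} = \frac{12}{20}$.
Let's calculate the Theil-T indexes within each group.

For a distribution of $n$ incomes ranked from the lowest to the highest, the Gini index is

$$G = \frac{2}{n^2 \mu} \sum_i i\,x_i - \left(1 + \frac{1}{n}\right)$$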
Now let's decompose the Gini index into the within and between groups components. We rank the groups from the lowest to the highest share in total income and, within each group, we rank the individuals from the lowest to the highest income. We denote the cumulative share in total income of groups 1 to $h$ as $\Phi_h = \frac{1}{N\mu} \sum_{j=1}^{h} \mu_j n_j$ and the cumulative share in group $h$'s income of its individuals 1 to $i$ as $\Phi_{hi} = \frac{1}{n_h \mu_h} \sum_{j=1}^{i} x_{hj}$. We can decompose the Gini index as follows:

$$G = G_b + \sum_h \pi_h Y_h G_h + G_s,$$

where $G_s$ is a residual term and

$$G_b = 1 - \sum_h (\Phi_h + \Phi_{h-1})\,\pi_h \quad \text{and} \quad G_h = \frac{2}{n_h^2 \mu_h} \sum_i i\,x_{hi} - \left(1 + \frac{1}{n_h}\right).$$
We can make use of the following table:

h    n_h μ_h    π_h     N μ Φ_h    Φ_h
1    12         6/14    12         12/80
2    20         5/14    32         32/80
3    48         3/14    80         1

We have that
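$$G_b = 1 - \sum_h (\Phi_h + \Phi_{h-1})\,\pi_h = 1 - \left[\left(\frac{12}{80} + 0\right)\frac{6}{14} + \left(\frac{32}{80} + \frac{12}{80}\right)\frac{5}{14} + \left(1 + \frac{32}{80}\right)\frac{3}{14}\right]$$

$G_b = 0.43929$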
The Gini for each group is
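$$G_1 = \frac{2}{n_1^2 \mu_1} \sum_i i\,x_{1i} - \left(1 + \frac{1}{n_1}\right)$$

$$G_2 = \frac{2}{n_2^2 \mu_2} \sum_i i\,x_{2i} - \left(1 + \frac{1}{n_2}\right)$$

$$G_3 = \frac{2}{n_3^2 \mu_3} \sum_i i\,x_{3i} - \left(1 + \frac{1}{n_3}\right)$$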
The within groups component is therefore
$$\sum_h \pi_h Y_h G_h = \pi_1 Y_1 G_1 + \pi_2 Y_2 G_2 + \pi_3 Y_3 G_3$$
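The Gini decomposition can be traced numerically with the sketch below. It assumes the rank-based Gini formula and the definitions of $G_b$ and the within term given in this solution, with groups already ordered by mean income; the residual $G_s$ is obtained as the remainder. Function and variable names are illustrative.

```python
def gini(incomes):
    """Rank-based Gini: 2 / (n^2 * mu) * sum(i * x_i) - (1 + 1/n)."""
    xs = sorted(incomes)
    n = len(xs)
    mu = sum(xs) / n
    ranked_sum = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * ranked_sum / (n * n * mu) - (1 + 1 / n)

# Groups listed from the lowest to the highest mean income.
groups = [[0.5, 0.5, 1, 1, 1, 8], [1, 1, 1, 1, 16], [16, 16, 16]]
population = [x for group in groups for x in group]
n_total = len(population)
total_income = sum(population)

# Between component: 1 - sum((Phi_h + Phi_{h-1}) * pi_h).
g_between = 1.0
phi_prev = 0.0
for group in groups:
    phi = phi_prev + sum(group) / total_income  # cumulative income share
    g_between -= (phi + phi_prev) * len(group) / n_total
    phi_prev = phi

# Within component: sum(pi_h * Y_h * G_h).
g_within = sum(
    (len(g) / n_total) * (sum(g) / total_income) * gini(g) for g in groups
)

g_total = gini(population)
g_residual = g_total - g_between - g_within  # G_s, the overlap term

print(round(g_total, 5), round(g_between, 5), round(g_within, 5), round(g_residual, 5))
```

The residual is positive here because the income ranges of groups 1 and 2 overlap with those of the richer groups.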
Consider two populations, each divided into three strata. In population A, the poorest 40% have 10% of total income, the middle 40% have 40% and the richest 20% have 50%. In population B, the three strata (poorest 40%, middle 40% and richest 20%) have 20%, 20% and 60% of total income, respectively. We suppose there is no inequality within each stratum. Calculate the Gini index for each of the two populations. Do the same for the Theil-T and Theil-L indexes. Based on these results, determine in which of the two populations the income distribution is more unequal. Comment on the results taking into account the Lorenz curve for each population.
Solution
We can calculate the Gini using the formula of the between and within groups decomposition, noting that the within groups component is zero in this case. Therefore, the Gini for each population is

$$G_A = 1 - \sum_h (\Phi_h + \Phi_{h-1})\,\pi_h = 1 - [(0.1 + 0) \times 0.4 + (0.5 + 0.1) \times 0.4 + (1 + 0.5) \times 0.2] = 0.42000$$

$$G_B = 1 - \sum_h (\Phi_h + \Phi_{h-1})\,\pi_h = 1 - [(0.2 + 0) \times 0.4 + (0.4 + 0.2) \times 0.4 + (1 + 0.4) \times 0.2] = 0.40000$$
The Theil-T for each population is
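$$T_A = \sum_h Y_h \ln\frac{Y_h}{\pi_h} = 0.1 \times \ln\frac{0.1}{0.4} + 0.4 \times \ln\frac{0.4}{0.4} + 0.5 \times \ln\frac{0.5}{0.2}$$

$$T_B = \sum_h Y_h \ln\frac{Y_h}{\pi_h} = 0.2 \times \ln\frac{0.2}{0.4} + 0.2 \times \ln\frac{0.2}{0.4} + 0.6 \times \ln\frac{0.6}{0.2}$$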
The Theil-L is
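$$L_A = \sum_h \pi_h \ln\frac{\pi_h}{Y_h} = 0.4 \times \ln\frac{0.4}{0.1} + 0.4 \times \ln\frac{0.4}{0.4} + 0.2 \times \ln\frac{0.2}{0.5}$$

$$L_B = \sum_h \pi_h \ln\frac{\pi_h}{Y_h} = 0.4 \times \ln\frac{0.4}{0.2} + 0.4 \times \ln\frac{0.4}{0.2} + 0.2 \times \ln\frac{0.2}{0.6}$$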
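To compare the two populations, the three grouped-data formulas can be evaluated side by side. The sketch below assumes no inequality within strata, as stated in the exercise; the helper name `grouped_measures` is an illustrative choice.

```python
import math

def grouped_measures(pop_shares, inc_shares):
    """Gini, Theil-T and Theil-L for strata with no internal inequality.

    Strata must be ordered from the poorest to the richest.
    """
    gini = 1.0
    phi_prev = 0.0
    for p, y in zip(pop_shares, inc_shares):
        phi = phi_prev + y              # cumulative income share
        gini -= (phi + phi_prev) * p
        phi_prev = phi
    theil_t = sum(y * math.log(y / p) for p, y in zip(pop_shares, inc_shares))
    theil_l = sum(p * math.log(p / y) for p, y in zip(pop_shares, inc_shares))
    return gini, theil_t, theil_l

pop_shares = [0.4, 0.4, 0.2]
gini_a, t_a, l_a = grouped_measures(pop_shares, [0.1, 0.4, 0.5])
gini_b, t_b, l_b = grouped_measures(pop_shares, [0.2, 0.2, 0.6])

print("A:", round(gini_a, 5), round(t_a, 5), round(l_a, 5))
print("B:", round(gini_b, 5), round(t_b, 5), round(l_b, 5))
```

With these shares the Gini and the Theil-L rank population A as more unequal, while the Theil-T ranks population B as more unequal. This is consistent with Lorenz curves that cross: B's curve lies above A's at the 40% point (0.2 versus 0.1) and below it at the 80% point (0.4 versus 0.5).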
Inequality measured by the Theil-T can be obtained from its dual using the formula $U_T = 1 - \exp(-T)$. Therefore, for 2003 we have that

$$U'_{03} = 1 - \exp(-T_{2003})$$
$$0.78596 = 1 - \exp(-T_{2003})$$
$$\exp(-T_{2003}) = 0.21405$$
$$-T_{2003} = \ln 0.21405$$
$$T_{2003} = -\ln 0.21405 = 1.54155$$
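The conversion between the Theil-T and its dual is a one-line formula in each direction. A minimal sketch, with illustrative function names:

```python
import math

def dual_from_theil_t(t):
    """Dual of the Theil-T: U = 1 - exp(-T)."""
    return 1 - math.exp(-t)

def theil_t_from_dual(u):
    """Inverse of the dual: T = -ln(1 - U), valid for 0 <= U < 1."""
    return -math.log(1 - u)

t_2003 = theil_t_from_dual(0.78596)
print(round(t_2003, 3))
```

Small differences in the last digits relative to the worked solution come from rounding $1 - 0.78596$ before taking the logarithm.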
We present the mean and the inequality of per capita income, measured by the Gini index, of a hypothetical country before and after a socialist revolution. Calculate the evolution of well-being in this society taking into account the function proposed by Sen.
Solution
Sen's social welfare function is $W = \mu(1 - G)$, where $\mu$ is the mean and $G$ is the Gini index of the distribution. We have that

$$W_0 = 300 \times (1 - 0.6) = 120$$
$$W_1 = 250 \times (1 - 0.5) = 125$$

Social welfare, as measured by Sen's function, increased from 120 to 125, that is, by 4.2%.
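The same comparison written as code, as a minimal sketch (the function name `sen_welfare` is illustrative):

```python
def sen_welfare(mean_income, gini):
    """Sen's social welfare function: W = mu * (1 - G)."""
    return mean_income * (1 - gini)

w_before = sen_welfare(300, 0.6)
w_after = sen_welfare(250, 0.5)
growth = w_after / w_before - 1  # relative change in welfare

print(round(w_before, 2), round(w_after, 2), round(100 * growth, 1))
```

Mean income falls by one sixth, but the drop in the Gini from 0.6 to 0.5 more than offsets it under this welfare function.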
Answer true or false and comment. The extent of temporal variability of observed income always influences the inequality of annual incomes, keeping constant the present value of the income earned during the life cycle.
Solution
False. If the temporal variability of observed income doesn't change permanent income, it will only affect the distribution of current income, since a negative shock in one period will be compensated by a positive shock in a future one.
Write down the formula and discuss possible problems of the following indicators:
Solution
We have that $W = \mu(1 - G)$, where $\mu$ is the mean and $G$ is the Gini index of the distribution. The inequality measure used in Sen's Social Welfare Function is the Gini index, a measure that has a complex decomposition into between and within groups components. Except in cases where the income ranges of the different groups do not overlap, the decomposition leaves a residual that is very hard to interpret. Besides that, the function gives efficiency and equality the same weight, being a particular case of Graaff's Social Welfare Function, given by $W = \mu(1 - G)^{\sigma}$. In Sen's case, we have $\sigma = 1$.

We have that $V = \frac{1}{N} \sum_i \left[E[\log(y)] - \log(y_i)\right]^2$. The main advantages are that it is scale-invariant and decomposable. The decomposition follows from the properties of the logarithmic function, especially additivity. However, it is not defined for null incomes and it is not very sensitive to changes at the top of the distribution, since the logarithmic function "smooths" the distribution.
Write down the two alternative formulas and explain the logic and the intuition behind the Gini index.
Solution
$$G = \frac{1}{N(N-1)\mu} \sum_{i>j} \sum_j |x_i - x_j|$$

The Gini corresponds to the ratio between the average absolute difference between the incomes of all pairs of people in the sample and twice the mean income. Remember that there are $\frac{N(N-1)}{2}$ distinct pairs. Note that in the case of perfect equality ($x_i = \mu$ for all $i$), the sum is equal to zero and the Gini is also zero. On the other hand, in the case of perfect inequality ($x_i = N\mu$ for one individual and $x_i = 0$ for the others), the Gini is equal to one.
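The pairwise definition described above translates almost literally into code. This sketch assumes the normalization by the $N(N-1)/2$ distinct pairs mentioned in the text, which is what makes the index reach exactly one under perfect inequality; the function name is illustrative.

```python
from itertools import combinations

def gini_pairwise(incomes):
    """Mean absolute difference over distinct pairs, divided by twice the mean."""
    n = len(incomes)
    mu = sum(incomes) / n
    total_abs_diff = sum(abs(a - b) for a, b in combinations(incomes, 2))
    n_pairs = n * (n - 1) / 2
    mean_abs_diff = total_abs_diff / n_pairs
    return mean_abs_diff / (2 * mu)

print(gini_pairwise([5, 5, 5, 5]))   # perfect equality
print(gini_pairwise([0, 0, 0, 20]))  # one person holds all income
```

Note that a rank-based formula with divisor $n^2$ gives $(n-1)/n$ rather than one in the perfect-inequality case, so the two versions differ by a factor of $n/(n-1)$ in small samples.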
$x = [1, 1, 2, 6, 30]$
If we add one person with zero income to the sample, how do the indexes change?
Solution
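In the above distribution, we have $n = 5$, $\mu = (1 + 1 + 2 + 6 + 30)/5 = 8$ and therefore $n\mu = 40$.

$$T = \sum_{i=1}^{5} y_i \ln(n y_i) = \frac{1}{n\mu} \sum_{i=1}^{5} x_i \ln\frac{x_i}{\mu} = \frac{1}{40}\left[2 \times 1 \times \ln\frac{1}{8} + 2 \times \ln\frac{2}{8} + 6 \times \ln\frac{6}{8} + 30 \times \ln\frac{30}{8}\right] = 0.77488$$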
The dual is

$$U_T = 1 - \exp(-T) = 1 - \exp(-0.77488) = 0.53924$$

Adding an individual with zero income means that a share $\varphi = \frac{1}{6}$ of the new distribution has zero income. The dual of the new distribution is

$$U'_T = \varphi + (1 - \varphi)\,U_T = \frac{1}{6} + \frac{5}{6} \times 0.53924 = 0.61603$$

The Theil-T of the new distribution is therefore

$$T' = -\ln(1 - U'_T) = -\ln(1 - 0.61603) = 0.9572$$
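$$L = -\frac{1}{n} \sum_{i=1}^{5} \ln\frac{x_i}{\mu} = -\frac{1}{5}\left[2 \times \ln\frac{1}{8} + \ln\frac{2}{8} + \ln\frac{6}{8} + \ln\frac{30}{8}\right]$$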
It is not possible to calculate the changes in the Theil-L when we add a person with null income because the index is not defined for zero incomes.
$$G = \frac{2}{n^2 \mu} \sum_i i\,x_i - \left(1 + \frac{1}{n}\right)$$

The dual is the Gini itself. If we add an individual with null income, we will have that

$$G' = \varphi + (1 - \varphi)\,G$$
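The updating rule $\varphi + (1 - \varphi)U$ can be checked against a direct recomputation on the six-person distribution. This is a sketch under two assumptions that are labeled in the code: the convention $0 \ln 0 = 0$ for the Theil-T, and the rank-based Gini formula used in this solution. Names are illustrative.

```python
import math

def theil_t(incomes):
    """Theil-T with the convention 0 * ln(0) = 0 (assumption for zero incomes)."""
    n = len(incomes)
    mu = sum(incomes) / n
    return sum(x * math.log(x / mu) for x in incomes if x > 0) / (n * mu)

def gini(incomes):
    """Rank-based Gini: 2 / (n^2 * mu) * sum(i * x_i) - (1 + 1/n)."""
    xs = sorted(incomes)
    n = len(xs)
    mu = sum(xs) / n
    return 2 * sum(i * x for i, x in enumerate(xs, start=1)) / (n * n * mu) - (1 + 1 / n)

base = [1, 1, 2, 6, 30]
extended = base + [0]        # add one person with zero income
phi = 1 / len(extended)      # share of the new population with zero income

# Theil-T: update through the dual, then compare with direct recomputation.
t_base = theil_t(base)
dual_new = phi + (1 - phi) * (1 - math.exp(-t_base))
t_new_from_dual = -math.log(1 - dual_new)
t_new_direct = theil_t(extended)

# Gini: the dual is the index itself.
g_new_from_rule = phi + (1 - phi) * gini(base)
g_new_direct = gini(extended)

print(round(t_base, 5), round(t_new_from_dual, 5), round(t_new_direct, 5))
print(round(gini(base), 5), round(g_new_from_rule, 5), round(g_new_direct, 5))
```

Both routes agree, and the Theil-T rises by exactly $\ln(6/5)$ because the mean falls from 8 to $40/6$ while the positive incomes are unchanged.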
Suppose that per capita income of household A, composed of only one individual, is 8. Suppose also that there is only one other household in the economy, with incomes {1, 1, 2, 6, 30}. Calculate the level of inequality according to the following concepts:
i) Household per capita income between households
ii) Household per capita income between individuals
iii) Calculate the inequality component of individual income between groups of households (i.e., A and B). Assume now that the income of household A is 7. Recalculate it.
iv) Suppose there is no socialization of incomes inside the households. How much of total income inequality is going to be underestimated taking into account both scenarios?
Solution
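Let's use the Theil-T index to calculate the level of inequality. Household per capita income for household B is $\bar{x}_2 = \frac{x_{21} + x_{22} + x_{23} + x_{24} + x_{25}}{n_2} = \frac{1 + 1 + 2 + 6 + 30}{5} = 8$. The share in total population for each household is $\pi_1 = \frac{1}{6}$ and $\pi_2 = \frac{5}{6}$. The share in total income for each household is $Y_1 = \frac{8}{48} = \frac{1}{6}$ and $Y_2 = \frac{40}{48} = \frac{5}{6}$.

The Theil-T between households is, therefore

$$T_b = \sum_h Y_h \ln\frac{Y_h}{\pi_h} = \frac{1}{6} \ln\frac{1/6}{1/6} + \frac{5}{6} \ln\frac{5/6}{5/6} = 0$$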
Because all individuals have household per capita income of 8, the Theil-T index between individuals will also be 0. Considering now $x_1 = 7$, we have that $Y_1 = \frac{7}{47}$ and $Y_2 = \frac{40}{47}$. The Theil-T between households is

$$T'_b = \frac{7}{47} \ln\frac{7/47}{1/6} + \frac{40}{47} \ln\frac{40/47}{5/6}$$
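The between-households term depends only on each household's size and total income, so it is easy to evaluate for both scenarios. A minimal sketch, with an illustrative function name:

```python
import math

def theil_t_between(group_sizes, group_totals):
    """Between-group Theil-T: sum(Y_h * ln(Y_h / pi_h))."""
    n_total = sum(group_sizes)
    income_total = sum(group_totals)
    result = 0.0
    for size, income in zip(group_sizes, group_totals):
        pop_share = size / n_total         # pi_h
        inc_share = income / income_total  # Y_h
        result += inc_share * math.log(inc_share / pop_share)
    return result

# Household A has 1 member; household B has 5 members with total income 40.
tb_income_8 = theil_t_between([1, 5], [8, 40])
tb_income_7 = theil_t_between([1, 5], [7, 40])

print(tb_income_8, round(tb_income_7, 6))
```

When household A earns 8, both households have the same per capita income and the between term vanishes; lowering A's income to 7 produces a small positive value.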
Note that, in this case, T = Tb, because we still have that the within group components are zero since we are considering household per capita income. If we consider individual income, we have that
Solution
Lorenz dominance is a criterion that permits us to compare two different distributions and say unequivocally which one is more unequal. Dominance holds when one Lorenz curve is always above the other, that is, when they don't cross. If they cross, there is no Lorenz dominance and different inequality indexes may rank the two distributions differently. In that case, we cannot say unequivocally which one is more unequal. If there is Lorenz dominance, all the most relevant inequality indexes (including the Theil-T, the Theil-L and the Gini) will point in the same direction.
3.1 - Empirical Estimates
For the model $\ln(Y_i) = \alpha + \beta \times X_i + u_i$, we have the following estimate

$$\ln(Y_i) = \underset{(0.01768)}{0.8972} + \underset{(0.0497)}{\hat{\beta}} \times X_i$$

where
$Y_i$ = income from the main activity
$X_i$ = years of schooling
The numbers in brackets are the standard errors of the estimates. Interpret the slope coefficient (give the formula), its significance and the $R^2$ of the regression.
Solution
We have that the slope coefficient is given by

$$\hat{\beta} = \frac{Cov(X_i, \ln Y_i)}{Var(X_i)}$$

Note that the t statistic is

$$t = \frac{\hat{\beta}}{se(\hat{\beta})}$$

so that the estimate is statistically significant. The interpretation is that each additional year of schooling is associated on average with an increase in wages of approximately 15.4%. The $R^2$ of the regression indicates that approximately 45% of the variation in log wages is explained by the variation in years of schooling.
Using the regression above and the Theil-T index, discuss and explain the logic of the role of education in the determination of the labor income inequality in Brazil.
Solution
The estimates of the regression above, assuming the model is a good one, show that education has an important role in determining individuals' wages. Actually, 15.43% is a very large return to education, much larger than in most developed countries. Even with this large educational premium, educational levels in Brazil are still very low, so there is a lot of room for improvement. Education must also have an important role in determining the Theil-T index for income inequality, as wages are the most important component of individual or even household income.
Taking into account the following social welfare function, discuss how to incorporate the Principle of Transfers in the measures of inequality. What would be the case for the Gini and Atkinson's (with $\varepsilon = 1$) measures?

$$W = u(x^*) = \int_0^{\infty} u(x)\,w(x)\,f(x)\,dx$$
Solution
The Principle of Transfers can be incorporated if we consider larger weights for the poorest individuals, that is, those with lower $x$. In the case of the Gini, we have that $w(x) = 2[1 - F(x)]$. Note then that the poorest individual, for whom $F(x) = 0$, has weight $w(x) = 2$, while the richest individual, for whom $F(x) = 1$, has weight $w(x) = 0$. Another possibility is to consider individual utility functions with decreasing marginal utility. In this case, an income transfer from a relatively rich individual to a relatively poor one increases welfare, since the increase in the utility of the poorer of the two individuals is larger than the decrease in the utility of the richer. This is the case of the utility considered in Atkinson's Welfare Function, in which $u(x) = \ln(x)$.
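The Gini weights can be illustrated on a small sample. The sketch below assumes $u(x) = x$ and evaluates the empirical distribution function at the midpoint of each rank, $F_i = (i - 0.5)/n$; under that convention (an assumption made here, not stated in the text) the weighted mean reproduces $\mu(1 - G)$ exactly for the rank-based Gini. The sample incomes and all names are illustrative.

```python
def gini(incomes):
    """Rank-based Gini: 2 / (n^2 * mu) * sum(i * x_i) - (1 + 1/n)."""
    xs = sorted(incomes)
    n = len(xs)
    mu = sum(xs) / n
    return 2 * sum(i * x for i, x in enumerate(xs, start=1)) / (n * n * mu) - (1 + 1 / n)

def weighted_welfare(incomes):
    """Mean of w_i * x_i with w_i = 2 * (1 - F_i), F_i = (i - 0.5) / n (assumed)."""
    xs = sorted(incomes)
    n = len(xs)
    return sum(2 * (1 - (i - 0.5) / n) * x for i, x in enumerate(xs, start=1)) / n

incomes = [1, 1, 2, 6, 30]
mu = sum(incomes) / len(incomes)

w_direct = mu * (1 - gini(incomes))     # Sen's W = mu * (1 - G)
w_weighted = weighted_welfare(incomes)  # same quantity via rank weights

# Transfer one unit from the richest to the poorest person.
after_transfer = [2, 1, 2, 6, 29]
w_after = weighted_welfare(after_transfer)

print(round(w_direct, 4), round(w_weighted, 4), round(w_after, 4))
```

Because the weights decline with rank, the transfer raises welfare even though total income is unchanged, which is the Principle of Transfers at work.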
Calculate the Theil-T index between groups by gender for per capita individual income using the following data. Interpret.
4.4 - Empiric
Consider the labor decomposition of individual income taking into account different sources:
a. What is the rate of unemployment in the PIA (Active Age Population - 15 to 65 years old)?
b. What fraction of the growth of mean labor income in the PIA is explained by the rise in occupation?
c. If we assume a 0.5% per year growth of the PIA as a result of the recent demographic transition, what should be the growth of income from all sources?
d. Compare the impact on total income of the demographic bonus with the impact of the rise in average years of schooling of the occupied (educational bonus).
Solution
For this exercise, see handout 11.

a. The rate of unemployment in the Economically Active Population (hereafter PEA) for 2009 was $1 - \frac{\text{ocup}}{\text{PEA}} = 1 - 0.833 = 0.167$. We also know that the participation rate in the labor market corresponds to $\frac{\text{PEA}}{\text{PIA}} = 0.739$. Therefore, we can calculate the rate of unemployment in the PIA as follows:

$$\frac{\text{unemp}}{\text{PIA}} = \frac{\text{unemp}}{\text{PEA}} \times \frac{\text{PEA}}{\text{PIA}} = 0.167 \times 0.739 = 0.123$$

That is, the rate of unemployment in the PIA for 2009 was 12.3%.

b. The rise in occupation in the PIA was

$$\frac{\left(\frac{\text{ocup}}{\text{PIA}}\right)_{2009} - \left(\frac{\text{ocup}}{\text{PIA}}\right)_{2003}}{\left(\frac{\text{ocup}}{\text{PIA}}\right)_{2003}} = \frac{\left(\frac{\text{ocup}}{\text{PEA}}\right)_{2009} \left(\frac{\text{PEA}}{\text{PIA}}\right)_{2009} - \left(\frac{\text{ocup}}{\text{PEA}}\right)_{2003} \left(\frac{\text{PEA}}{\text{PIA}}\right)_{2003}}{\left(\frac{\text{ocup}}{\text{PEA}}\right)_{2003} \left(\frac{\text{PEA}}{\text{PIA}}\right)_{2003}}$$

$$= \frac{0.833 \times 0.739 - 0.803 \times 0.721}{0.803 \times 0.721} = 6.32\%$$

We have that $\text{totincome}_{2009} = 806.56$ and $\left(\frac{\text{totincome}}{\text{laborincome}}\right)_{2009} = 1.1703$. Therefore, $\text{laborincome}_{2009} = \frac{806.56}{1.1703} = 689.2$. Doing the same for 2003, we have that $\text{laborincome}_{2003} = \frac{642.65}{1.1874} = 541.2$. Therefore, the rise in labor income was

$$\frac{\text{laborincome}_{2009} - \text{laborincome}_{2003}}{\text{laborincome}_{2003}}$$

d. The impact of the demographic bonus is $\frac{0.5\%}{4.36\%} = 11.47\%$. The impact of the rise in average years of schooling of the occupied is $\frac{2.12\%}{4.36\%} = 48.62\%$. Therefore, the impact of the educational bonus is more than 4 times the impact of the demographic bonus.
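The arithmetic in items a, b and d can be reproduced from the figures quoted in the solution. This is a sketch only: the dictionaries simply hold the 2003 and 2009 rates given above, and the variable names are illustrative.

```python
# Rates quoted in the solution (occupied / PEA and PEA / PIA).
occupation_rate_pea = {2003: 0.803, 2009: 0.833}
participation_rate = {2003: 0.721, 2009: 0.739}

# a. Unemployment as a share of the PIA in 2009.
unemp_pea_2009 = 1 - occupation_rate_pea[2009]
unemp_pia_2009 = unemp_pea_2009 * participation_rate[2009]

# b. Growth of the occupied share of the PIA between 2003 and 2009.
occupied_pia = {
    year: occupation_rate_pea[year] * participation_rate[year]
    for year in (2003, 2009)
}
occupation_growth = occupied_pia[2009] / occupied_pia[2003] - 1

# Labor income recovered from total income and the total-to-labor ratio.
labor_income_2009 = 806.56 / 1.1703
labor_income_2003 = 642.65 / 1.1874
labor_income_growth = labor_income_2009 / labor_income_2003 - 1

# d. Shares of the 4.36% growth attributed to each bonus.
demographic_share = 0.5 / 4.36
education_share = 2.12 / 4.36

print(round(unemp_pia_2009, 3), round(occupation_growth, 4))
print(round(labor_income_2009, 1), round(labor_income_2003, 1), round(labor_income_growth, 4))
print(round(demographic_share, 4), round(education_share, 4))
```

The occupation growth computed from the rounded rates differs slightly in the last digit from the figure in the solution, which was presumably obtained from unrounded inputs.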