ILO LABOUR FORCE ESTIMATES AND PROJECTIONS: 1990 ..., Schemes and Mind Maps of Statistics

Step 3: Calculation of response-probabilistic weights . ... The basic data are single-year labour force participation rates by sex and age groups, ...

Typology: Schemes and Mind Maps

2022/2023

Uploaded on 02/28/2023

sheetal_101
sheetal_101 🇺🇸

4.8

(17)

234 documents

1 / 43

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
7
4
83 %
25
20
40
60
80
23 %
ILO LABOUR FORCE ESTIMATES
AND PROJECTIONS: 1990-2030
(2017 EDITION)
Methodological description
November 2017
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17
pf18
pf19
pf1a
pf1b
pf1c
pf1d
pf1e
pf1f
pf20
pf21
pf22
pf23
pf24
pf25
pf26
pf27
pf28
pf29
pf2a
pf2b

Partial preview of the text

Download ILO LABOUR FORCE ESTIMATES AND PROJECTIONS: 1990 ... and more Schemes and Mind Maps Statistics in PDF only on Docsity!

83 %

20

40

60

80 23 %

ILO LABOUR FORCE ESTIMATES

AND PROJECTIONS: 1990-2030 (2017 EDITION)

Methodological description

November 2017

Contents

  • Contents
  • Preface
    1. Introduction
    1. Concepts, definitions and theoretical background
    • 2.a. Different forms of employment
    • 2.b. Labour force participation rates.....................................................................................
    • 2.c. The determinants of the LFPR
    1. Estimation model 1990-2016: data and methodology
    • 3.a. Introduction..................................................................................................................
    • 3.b. The data obtention and selection process
      • Streamlining the data obtention process
      • Data selection process.............................................................................................
      • Resulting input data file
    • 3.c. Missing value estimation procedure
      • Overview
      • Step 1: Logistic transformation
      • Step 2: Country-level interpolation
      • Step 3: Calculation of response-probabilistic weights
      • Step 4: Weighted multivariate estimation.............................................................
      • Step 5: Data adjustments, blending of model vintages and overfitting
    1. Projection model: 2017-2030
    • 4.a. Methodologies used worldwide
    • 4.b. Methodology used in this edition
      • Projections
      • Computation of intermediate values
    1. Strengths, limitations and future work
    • 5.a. Strengths
    • 5.b. Limitations
    • 5.c. Direction for future work..............................................................................................
    1. Bibliography
  • ANNEX 1: Country composition of each sub-regional grouping....................................................
  • ANNEX 2: Tables of regression specifications by region, sex and age group
  • ANNEX 3: Harmonizing LFPR by age bands
    • a. Harmonising data to 15 years and above........................................................................
    • b. Harmonising data from large age bands to 5-year age-bands

Preface

The 2017 Edition of the ILO Labour Force Estimates and Projections (LFEP) Database was produced by the ILO Department of Statistics.

There are two important changes in this edition as compared to the previous edition. Firstly, the process for incorporating input data has been substantially modified in order to fully integrate the LFEP model within the ILOSTAT database structure, thus taking advantage of ILOSTAT’s quality control system. Secondly, the projections are obtained solely based on econometric methods, becoming more transparent and replicable.

Consistent with the previous edition (2015), the historical estimates (1990-2016) are accompanied by detailed metadata for each data point. The metadata include several fields regarding the source of collected data, the type of adjustments made to harmonise data (when needed) and the type of imputation method used to fill missing data.

This document was prepared by Evangelia Bourmpoula, Roger Gomis (ILO Department of Statistics), and Steven Kapsos (ILO Department of Statistics). This work has benefited from the valuable comments of Rafael Diez de Medina, Director of the ILO Department of Statistics.

1. Introduction

The ILO programme on labour force estimates and projections is part of a larger international effort on demographic estimates and projections to which several UN agencies contribute. Estimates and projections of the total population and its components by sex and age group are produced by the UN Population Division, the employed, unemployed and related populations by the ILO, the agricultural population by FAO and the school attending population by UNESCO.

The main objective of the ILO programme is to provide member States, international agencies and the public at large with the most comprehensive, detailed and comparable estimates and projections of the labour force for countries and territories, the world as a whole and its main geographical regions. The first edition was published by the ILO Department of Statistics in 1971 (covering 168 countries and territories, with reference period 1950-1985)^1 ; the second edition in 1977 (with 154 countries and territories and reference period 1975- 2000)^2 ; the third edition in 1986 (with 156 countries and territories and reference period 1985-2025)^3 ; the fourth edition in 1996 (with 178 countries and territories and reference period 1950-2010)^4 ; the fifth edition in 2007 (with 191 countries and reference period 1980-2020) with two subsequent updates (in August 2008 and December 2009)^5. The sixth edition (2011) covered 191 countries and territories. The reference period for the estimates was 1990-2010 and for the projections was 2011-2020. The 2013 edition covered 191 countries, with a reference period of 1990-2012 for the estimates and 2013-2030 for the projections. The 2015 edition covered 193 countries. The reference period was 1990-2014 for the estimates and 2015-2050 for the projections. The present 2017 edition covers 189 countries. The reference period is 1990-2016 for the estimates and 2017-2030 for the projections.^6

The basic data are single-year labour force participation rates by sex and age groups, of which ten groups are defined by five-year age intervals (15-19, 20-24, ..., 60-64) and the last age group is defined as 65 years and above. The data are available on ILOSTAT, the ILO’s central statistical database: http://www.ilo.org/ilostat.

This document describes the main elements of the estimation and projection methodologies adopted for the 2017 edition. This edition continues to use the enhanced methodologies that were developed in order to improve the labour force estimates and projections in the 6th^ edition in 2011 and continued in the 2013 and 2015 editions. 7 As established in the 6th^ edition, the historical estimates are accompanied by detailed metadata for each data point. The metadata include several fields regarding the source of collected data, the type of adjustments made to harmonise them (when needed) and the type of imputation method used to fill missing data.

There are, however, some important changes in this edition as compared to the previous editions. Firstly, the process of obtaining and incorporating model input data has been substantially modified in order to fully integrate the LFEP model within the ILOSTAT database structure. Thus the new edition takes advantage of ILOSTAT’s quality control system. Secondly, the projections are obtained solely based on econometric methods. Therefore, the projections have become more transparent and replicable.

(^1) ILO, Labour force projections, 1965-85 (1st (^) edition, Geneva 1971). (^2) ILO, Labour force projections, 1950-2010 (2nd (^) edition, Geneva 1976). (^3) ILO, Economically Active Population: Estimates and projections, 1950-2025 (3rd (^) edition, Geneva 1986). (^4) ILO, Economically Active Population Estimates and projections, 1950-2010 (4th (^) edition, Geneva 1996). (^5) ILO, Estimates and Projections of the Economically Active Population, 1980-2020 (5th (^) edition, Geneva 2007, Update August 2008, Update December 2009). (^6) The reduction from 193 to 189 reference areas is due to the inclusion of the data of Dominique, Martinique, French Guyana and Réunion into metropolitan France. The change in horizon from 2050 to 2030 allows to base the projections solely on econometric methods. (^7) ILO, Estimates and Projections of the Economically Active Population, 1990-2030 (2013 edition, Geneva 2013); ILO, Labour Force Estimates and Projections, 1990-2050 (2015 edition, Geneva 2015).

(b) employed persons “not at work” due to temporary absence from a job, or to working-time arrangements (such as shift work, flexitime and compensatory leave for overtime).

“For pay or profit” refers to work done as part of a transaction in exchange for remuneration payable in the form of wages or salaries for time worked or work done, or in the form of profits derived from the goods and services produced through market transactions, specified in the most recent international statistical standards concerning employment-related income. It includes remuneration in cash or in kind, whether actually received or not, and may also comprise additional components of cash or in-kind income. The remuneration may be payable directly to the person performing the work or indirectly to a household or family member.

Employed persons on “temporary absence” during the short reference period refers to those who, having already worked in their present job, were “not at work” for a short duration but maintained a job attachment during their absence.

Included in employment are: (a) persons who work for pay or profit while on training or skills-enhancement activities required by the job or for another job in the same economic unit, such persons are considered as employed “at work” in accordance with the international statistical standards on working time; (b) apprentices, interns or trainees who work for pay in cash or in kind; (c) persons who work for pay or profit through employment promotion programmes; (d) persons who work in their own economic units to produce goods intended mainly for sale or barter, even if part of the output is consumed by the household or family; (e) persons with seasonal jobs during the off season, if they continue to perform some tasks and duties of the job, excluding, however, fulfilment of legal or administrative obligations (e.g. pay taxes), irrespective of receipt of remuneration; (f) persons who work for pay or profit payable to the household or family; (g) regular members of the armed forces and persons on military or alternative civilian service who perform this work for pay in cash or in kind.

Excluded from employment are: (a) apprentices, interns and trainees who work without pay in cash or in kind; (b) participants in skills training or retraining schemes within employment promotion programmes, when not engaged in the production process of an economic unit; (c) persons who are required to perform work as a condition of continued receipt of a government social benefit such as unemployment insurance; (d) persons receiving transfers, in cash or in kind, not related to employment; (e) persons with seasonal jobs during the off season, if they cease to perform the tasks and duties of the job; (f) persons who retain a right to return to the same economic unit but who were absent; (g) persons on indefinite lay-off who do not have an assurance of return to employment with the same economic unit.

Persons in unemployment are defined as all those of working age who were not in employment, carried out activities to seek employment during a specified recent period and were currently available to take up employment given a job opportunity, where: (a) “not in employment” is assessed with respect to the short reference period for the measurement of employment; (b) to “seek employment” refers to any activity when carried out, during a specified recent period comprising the last four weeks or one month, for the purpose of finding a job or setting up a business or agricultural undertaking. This includes also part-time, informal, temporary, seasonal or casual employment, within the national territory or abroad; (c) “currently available” serves as a test of readiness to start a job in the present, assessed with respect to a short reference period comprising that used to measure employment.

It must be noted that in practice there are large differences in terms of country practices regarding definitions of employment and unemployment (see ILO 2011).

2.b. Labour force participation rates

The labour force projections are obtained by the product of two separate projections: a projection of the population ( POP ) of country i at time t+h ( t and h are respectively the projection origin and horizon) for the

age group a (e.g. those aged [20-24]) and sex s, and a projection of the labour force participation rate (LFPR) for the same subgroup of the population.

LFi , t  h , a , s  LFPRi , t  h , a , s  POPi , t  h , a , s

where: ithas

ithas

i t has POP
LF
LFPR

, , ,

, ,, , ,, 

The decomposition of the projection exercise into two phases has several advantages. Firstly, the determinants of the changes in population and the LFPR are not the same and can be identified.^9 The determinants of the changes in population are primarily due to changes in fertility, mortality and migration flows (United Nations 2011), while the changes in the LFPR can be the result of many factors, including changes in labour demand, as highlighted in the next section. Secondly, the LFPR varies by definition between 0 and 100 per cent, which is convenient, since logistic transformations can be applied to the LFPR in order to ensure that projected values within the 0-100 per cent interval are obtained.

2.c. The determinants of the LFPR

At the macroeconomic level, average aggregated labour force participation rates are observed for the whole population or for population subgroups (male, female, prime age, youth, etc.). These data are typically derived from labour force or other household surveys or from population censuses. The variable "labour force participation rate" is of dichotomous nature: either you participate or you do not. The determinants of the LFPR can be broken down into structural or long-term factors, cyclical factors and accidental factors.

Structural factors include policy and legal determinants (e.g., flexibility of working-time arrangements, taxation, family support, retirement schemes, apprenticeships, work permits, unemployment benefits, minimum wage) as well as other determinants (e.g., demographic and cultural factors, level of education, technological progress, availability of transportation).

Some key findings regarding female labour force participation rates (LFPR)^10 :

  • In countries where working-time arrangements are more flexible, there is a higher LFPR of female workers than in other countries.
  • Taxation of second earners (relative to single earners) usually has a negative impact on female LFPR.
  • Childcare subsidies and paid parental leave usually have a positive impact on female LFPR.
  • In countries where the proportion of unmarried women is higher, there is usually a higher female LFPR than in other countries.
  • Cultural factors such as strong family ties or religion have a strong impact on LFPR for some subgroups of the population. For example, in many countries, religious or social norms may discourage women from undertaking economic activities.

These types of structural factors are the main drivers of the long-term patterns in the data. Changes in policy and legal determinants (e.g., changes in retirement and pre-retirement schemes) can result in important shifts in participation rates from one year to another.

Cyclical factors refer to the overall economic and labour market conditions that influence the LFPR. In other words, demand for labour has an impact on the labour force. In times of recession, two effects on the

(^9) See Armstrong et al. (2005) for a presentation on the decomposition of complex time series and its pros and cons. (^10) For more details see Jaumotte (2003).

3. Estimation model 1990-2016: data and methodology

3.a. Introduction

The LFEP database is a collection of actual observations and ILO estimated labour force participation rates. The database is a complete cross-sectional time series database with no missing values. A key objective in the construction of the database is to generate a set of comparable labour force participation rates across both countries and time. With this in mind, the first step in the production of the historical portion of the 2017 Edition of the LFEP database is to carefully scrutinize existing labour force participation rates and to select only those observations deemed sufficiently comparable. A subsequent adjustment is done to the national LFPR data in order to increase the statistical basis (in other words, to decrease the proportion of imputed values): the harmonization of LFPR data by age bands (see Annex 3 for a detailed description). In the second step, a linear interpolation model and a weighted least squares model were developed to produce estimates of labour force participation rates for those countries and years in which no country-reported, cross-country comparable data currently exist.

This section contains two main parts. The first part provides an overview of the selection process of the baseline labour force participation rate (LFPR) data that serve as the key input into the 2017 edition LFEP Database. The section includes the description of the obtention process, a discussion of non-comparability issues that exist in the available LFPR data and concludes with a description of the LFPR data coverage, after taking into account the various selection criteria. The second part describes the econometric model developed for the treatment of missing LFPR values, both in countries that report in some but not all of the years in question, as well as for those countries for which no data are currently available.

3.b. The data obtention and selection process
Streamlining the data obtention process

One of the major changes in the 2017 edition of the LFEP model has been streamlining the data collection process. The new set up has one single source of data: the ILOSTAT database. In previous editions, a combination of sources were used: the ILOSTAT database, ad hoc data collection for the LFEP run, and the previous LFEP input data. The procedure used in this edition presents several advantages. First, it is replicable from a methodological standpoint. Downloading the LFPR data from the online database delivers the required data without carrying out any further action. Furthermore, the observations included in ILOSTAT are clearly traced by origin and data collection channel. Second, the labour intensity of the update of the model drastically declines. Not only because the time consuming ad-hoc data collection is avoided, but it eliminates the need to select between up-to-date observations and the data from the previous run. Third, the input data benefits from all the quality control procedures that the ILOSTAT database has in place. This includes the efforts of unreliable observation deletion, error correction, data update, data revision and new data channel creation. Taking advantage of all these procedures significantly increases the quality of the data without any additional burden to carry out the new LFEP run.^11 It is worth highlighting that new ILO micro-data processing efforts as well as new channels for obtaining bulk data has enhanced international comparability of the underlying input dataset. These channels account for almost 50 per cent of the input data in the standard age format. Consequently, the quality gains are substantial. The fourth and final advantage concerns the metadata associated to the LFPR. The metadata system of ILOSTAT is exhaustive and complex. At the same time, the metadata are, as are the data, subject to quality control procedures: error detection, update, revision and

(^11) Consider a simple example, the detection of an unreliable data point through one of the appropriate procedures in ILOSTAT that results in the removal from the database of the observation. With the streamlined data obtention this efficiently translates in the removal from the new LFEP edition. In contrast, in the old recursive data collection setting, the deleted observation would appear in the new model version. To detect and correct such inclusion one would have to compare not only all data existing in the previous model run and ILOSTAT, but all previous versions of ILOSTAT.

automatic collection. Taking advantage of this system – as opposed to creating a parallel system within the LFEP process – increases the quality of metadata and hence of decision making regarding the data with little additional effort. The improvements in metadata not only deliver a positive impact during the data obtention step, they facilitate a more replicable and precise data selection step. Figure 2 below illustrates the changes in the data obtention process.

Figure 2: Data obtention process (2017 edition) comparison

Data selection process

In order to generate a set of sufficiently comparable labour force participation rates across both countries and time, it is necessary to identify and address the various sources of non-comparability. Drawing heavily on the labour force participation data comparability discussion in the Key Indicators of the Labour Market (KILM), 9th Edition (Geneva, ILO 2015). The main sources of non-comparability of labour force participation rates are as follows:

Type of source – labour force participation rates are derived from several types of sources including labour force surveys, population censuses, establishment surveys, insurance records or official government estimates. Data taken from different types of sources are often not comparable.

Age group coverage – non-comparability also arises from differences in the age groupings used in measuring the labour force. While the standard age-groupings used in the LFEP database are 15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54, 55-59, 60-64 and 65+, some countries report data corresponding to other age groupings, which can adversely affect comparability. For example, some countries have adopted a cut-off point at 14 or 16 years for the lower limit and 65 or 70 years for the upper limit.

Geographic coverage – some country-reported labour force participation rates correspond to a specific geographic region, area or territory such as "urban areas". Geographically-limited data are not comparable across countries.

Resulting input data file

Together, these criteria determined the data content of the final input file, which was utilized in the subsequent econometric estimation process. Table 1 provides response rates and total observations by year. These rates represent the share of total potential (or maximum) observations for which real, cross-country comparable data (after harmonisation adjustments) exist. Compared to the previous editions a natural increase in more recent data can be observed, at the same time, there are substantial decreases in coverage for more distant years. There are two main causes of the decrease. The first is a mechanical loss due to having removed 4 reference areas with a high degree of data coverage.^12 The second is the more selective criteria for data inclusion, both due to ILOSTAT’s quality control that affects the data obtention process and more stringent time series consistency checks during the data selection process.

Table 1: Response rates by LFEP version year, both sexes combined Model: 2017 2015 2013 Year Proportion of potential observations (in percentage)

Number of observations

Proportion of potential observations (in percentage)

Number of observations

Proportion of potential observations (in percentage)

Number of observations

1990 23.9 992 27.1 1,149 26.2 1, 1991 25.4 1,058 28.4 1,206 28.2 1, 1992 23.4 975 27.0 1,147 26.5 1, 1993 25.7 1,070 29.0 1,232 28.5 1, 1994 29.1 1,208 30.3 1,286 29.7 1, 1995 30.6 1,274 33.8 1,436 33.2 1, 1996 30.6 1,272 32.6 1,386 31.5 1, 1997 29.5 1,226 31.7 1,344 31.4 1, 1998 30.1 1,250 33.6 1,427 33.4 1, 1999 33.3 1,386 37.0 1,570 36.3 1, 2000 35.3 1,466 38.5 1,635 38.1 1, 2001 33.3 1,386 40.1 1,704 39.6 1, 2002 35.9 1,492 40.0 1,698 39.5 1, 2003 34.3 1,426 41.9 1,780 41.4 1, 2004 39.0 1,623 44.9 1,908 44.8 1, 2005 38.3 1,594 48.0 2,038 47 1, 2006 40.3 1,674 46.0 1,954 45.5 1, 2007 40.2 1,670 46.6 1,978 44.7 1, 2008 40.5 1,682 48.2 2,048 46.2 1, 2009 50.0 2,078 55.5 2,356 49 2, 2010 52.0 2,164 55.0 2,336 43.8 1, 2011 48.4 2,014 50.0 2,122 29.3 1, 2012 49.8 2,072 50.7 2,153 28.3 1, 2013 48.5 2,016 50.3 2, 2014 49.4 2,056 19.0 806 2015 47.4 1, 2016 38.4 1, Total 37.1 41,688 39.4 41,835 36.6 35, In LFEP 2017 the potential number of observations for each year is 4'158 data points (11 age-groups x 189 countries x 2 sexes). Hence, the total potential number of observations during the time period 1990 to 2016 is 112’266 data points. In LFEP 2015 the potential number of observations for each year is 4'246 data points (11 age-groups x 193 countries x 2 sexes). Hence, the total potential number of observations during the time period 1990 to 2014 is 106’150 data points. In LFEP (EAPEP) 2013 the potential number of observations for each year is 4'202 data points (11 age-groups x 191 countries x 2 sexes). Hence, the total potential number of observations which covers the time period 1990 to 2012 is 96’646 data points.

(^12) The 4 areas have been integrated into the France figures.

In total, comparable data are available for 41,688 out of a possible 112,266 observations, or approximately 37.1 per cent of the total. It is important to note that while the percentage of real observations is rather low, 175 out of 189 countries (92 per cent) reported labour force participation rates in at least one year during the 1990 to 2016 reference period.^13 Thus, some information on LFPR is known about the vast majority of the countries in the sample.

There is very little difference among the 11 age-groups with respect to data availability. This is primarily due to the fact that countries that report LFPR in a given year tend to report for all age groups. On the other hand, there is a clear variation in response by year. In particular, coverage has tended to improve over time, as the lowest coverage occurred in the early 1990s. While the overall response rate is approximately 37 per cent, as will be shown in the next section, response rates vary substantially among the different regions of the world.

3.c. Missing value estimation procedure
Overview

This section describes the basic missing value estimation model developed to produce the historical LFEP database. The methodology contains four steps. First, in order to ensure within-bound estimates of labour force participation rates, a logistic transformation is applied to the input data file. Second, a simple interpolation technique is used to expand the baseline data in countries that report labour force participation rates in some years. Next, the problem of non-response bias (systematic differences between countries that report data in some or all years and countries that do not report data in any year) is addressed and a solution is developed to correct for this bias. Finally, the weighted least squares estimation model, which produces the actual country-level LFPR estimates, is explained in detail. Each of these steps is described below.

Step 1: Logistic transformation

The first step in the estimation process is to transform all labour force participation rates included in the input file. This step is necessary since using simple linear estimation techniques to estimate labour force participation rates can yield implausible results (for instance labour force participation rates of more than 100 per cent). Therefore, in order to avoid out of range predictions, the final input set of labour force participation rates is transformed logistically in the following manner prior to the estimation procedure:

it

it

it y
y
Y
ln [1]

where yit is the observed labour force participation rate by sex and age in country i and year t. This transformation ensures within-range predictions, and applying the inverse transformation produces the original labour force participation rates. The specific choice of a logistic function in the present context was chosen following Crespi (2004).

Step 2: Country-level interpolation

The second step in the estimation model is to fill in, through linear interpolation, the set of available information from countries that report in some, but not all of the years in question. In many reporting countries, some gaps in the data do exist. For instance, a country will report labour force participation rates in 1990 and 1995, but not for the years in between. In these cases, a simple linear interpolation routine is applied, in which LFPR estimates are produced using equation 2.

(^13) The 14 countries or territories for which no comparable information on labour force participation rate by sex and age is available are: Afghanistan, Central African Republic, Channel Islands, Equatorial Guinea, Eritrea, Guinea-Bissau, Democratic People's Republic of Korea, Libyan Arab Jamahiriya, Solomon Islands, Somalia, Swaziland, Turkmenistan, Uzbekistan, Western Sahara.

force participation rates for the non-reporting countries, as the sample upon which the estimates are based does not sufficiently represent the underlying heterogeneity of the population.^14

The identification problem at hand is essentially whether data in the LFEP database are missing completely at random (MCAR), missing at random (MAR) or not missing at random (NMAR).^15 If the data are MCAR, non- response is ignorable and multiple imputation techniques such as those inspired by Heckman (1979) should be sufficient for dealing with missing data. This is the special case in which the probability of reporting depends neither on observed nor unobserved variables – in the present context this would mean that reporting and non-reporting countries are essentially “similar” in both their observable and unobservable characteristics as they relate to labour force participation rates. If the data are MAR, the probability of sample selection depends only on observable characteristics. That is, it is known that reporting countries are different from non- reporting countries, but the factors that determine whether countries report data are identifiable. In this case, econometric methods incorporating a weighting scheme, in which weights are set as the inverse probability of selection (or inverse propensity score), is one common solution for correcting for sample selection bias. Finally, if the data are NMAR, there is a selection problem related to unobservable differences in characteristics among reporters and non-reporters, and methodological options are limited. In cases where data are NMAR, it is desirable to render the MAR assumption plausible by identifying covariates that impact response probability (Little and Hyonggin, 2003).

Given the important methodological implications of the non-response type, it is useful to examine characteristics of reporting and non-reporting countries in order to determine the type of non-response present in the LFEP database. Table 3 confirms significant differences between reporting and non-reporting countries in the sample.

Table 3: Per-capita GDP and population size of reporting and non-reporting countries Reporters (175 countries)

Non-reporters (14 countries) Mean per-capita GDP, 2015 (2011 International $) 19504 9024 Median per-capita GDP, 2015 (2011 International $) 13089 2377 Mean population, 2017 (millions) 41.8 9. Median population, 2017 (millions) 9.4 4. Sources: World Bank, World Development Indicators Database 2017; UN, World Population Prospects 2017 Revision Database.

The table shows that reporting countries have considerably higher per capita GDP and larger populations than non-reporting countries. In the context of the LFEP database, it is important to note that countries with low per-capita GDP also tend to exhibit higher than average labour force participation rates, particularly among women, youth and older individuals. This outcome is borne mainly due to the fact that the poor often have few assets other than their labour upon which to survive. Thus, basic economic necessity often drives the poor to work in higher proportions than the non-poor. As economies develop, many individuals (particularly women) can afford to work less, youth can attend school for longer periods, and consequently, overall participation rates in developing economies moving into the middle stages of development tend to decline.^16

It appears that factors exist that co-determine the likelihood of countries to report labour force participation rates in the LFEP input dataset and the actual labour force participation rates themselves. The missing data do not appear to be MCAR. Due to the existence of data (such as per-capita GDP and population size) for both responding and non-responding countries and that are related to response likelihood, it should be possible to

(^14) For more information, see Crespi (2004) and Horowitz and Manski (1998). (^15) See Little and Hyonggin (2003) and Nicoletti (2002). (^16) See ILO, KILM 7th (^) Edition, (Geneva, ILO, 2011) and Standing, G. Labour Force Participation and Development (Geneva, ILO, 1978).

render the MAR assumption plausible and thus to correct for the problem of non-response bias.^17 This correction can be made while using the fixed-effects panel estimation methods described below, by applying “balancing weights” to the sample of reporting countries. The remainder of the present discussion describes this weighting routine in greater detail.

The basic methodology utilized to render the data MAR and to correct for sample selection bias contains two steps. The first step is to estimate each country’s probability of reporting labour force participation rates. In the LFEP input dataset, per-capita GDP, population size, year dummy variables and membership in the Highly Indebted Poor Country (HIPC) Initiative represent the set of independent variables used to estimate response probability.^18

Following Crespi (2004) and Horowitz and Manski (1998), we characterize each country in the LFEP input dataset by a vector ( yit, xit, wit, rit ), where y is the outcome of interest (the logistically transformed labour force participation rate), x is a set of covariates that determine the value of the outcome and w is a set of covariates that determine the probability of the outcome being reported. Finally, r is a binary variable indicating response or non-response as follows:

it 0
reports
r
does not report

[3]

Equation 4 indicates that there is a linear function whereby the likelihood of reporting labour force participation rates is a function of the set of covariates:

rit  wit    it

  • ' [4]
where a country reports if this index value is positive ( rit *^ ^0 ).  is the set of regression coefficients and  it is

the error term. Assuming a symmetric cumulative distribution function, the probability of reporting labour force participation rates can be written as in equation 5.

PiFw ' it  [5]

The functional form of F depends on the assumption made about the error term it. As in Crespi (2004), we assume that the cumulative distribution is logistic, as shown in equation 6:

 

   

' '

1 exp
exp

it

it

it w
w
F w
 [6]

It is necessary to estimate equation 6 through logistic regression, which is carried out by placing each country into one of the 9 estimation groups listed in Table 2. The regressions are carried out for each of the 11 standardized age-groups and for males and females. The results of this procedure provide the predicted response probabilities for each age group within each country in the LFEP dataset.

(^17) Indeed, according to Little and Hyonggin (2003), the most useful variables in this process are those that are predictive of both the missing values (in this case labour force participation rates) and of the missing data indicator. Per-capita GDP is therefore a particularly attractive indicator in the present context. (^18) HIPC membership is utilized as an explanatory variable for response probability due to the fact that HIPC member countries are required to report certain statistics needed to measure progress toward national goals related to the program. As a result, taking all else equal, HIPC countries may be more likely to report labour force participation rates.

Table 4: Independent variables in fixed-effects panel regression Variable Source Per-capita GDP, Per-capita GDP squared World Bank, Wor 2017 and IMF, World Economic Outlookld Development Indicators April Real GDP growth rate, Lagged real GDP growth rate 2017

Share of population aged 0-14, Share of population aged 15-24, Share of population aged 25- 64

United Nations, World Population Prospects 2017 Revision Database

Within the context of the LFEP database, there are two primary considerations in selecting independent variables for estimation purposes. First, the selected variables must be robust correlates of labour force participation, so that the resulting regressions have sufficient explanatory power. Second, in order to maximize the data coverage of the final LFEP database, the selected independent variables must have sufficient data coverage.

In terms of variables related to economic growth and development, as mentioned above, per-capita GDP is often strongly associated with labour force participation rates.^22 This, together with the substantial coverage of the indicator made it a prime choice for estimation purposes. However, given that the direction of the relationship between economic development and labour force participation can vary depending on a country’s stage of development, the square of this term was also utilized to allow for this type of non-linear relationship.^23 Figure 2 depicts clearly that this relationship varies by both sex and age group. For men and women in the prime-age, there is no clear relationship. For both men and women aged 15 to 19 years, and those aged between 55 and 64 years, there is an indication of a U-relationship between per capita GDP and labour force participation rate. Furthermore, annual GDP growth rates were used to incorporate the relationship between participation and the state of the macro-economy.^24 The lag of this term was also included in order to allow for delays between shifts in economic growth and changes in participation.

(^22) See also Nagai and Pissarides (2005), Mammen and Paxon (2000) and Clark et al. (1999). (^23) Whereas economic development in the poorest countries is associated with declining labour force participation (particularly among women and youth), in the middle- and upper- income economies, growth in GDP per capita can be associated with rising overall participation rates – often driven by rising participation among newly empowered women. This phenomenon is the so-called “U- shaped” relationship between economic development and participation. See ILO, KILM 4th^ Edition and Mammen and Paxon (2000). (^24) See Nagai, L. and Pissarides (2005), Fortin and Fortin (1998) and McMahon (1986).

Figure 2: Labour force participation rates by sex and age-group, and per capita GDP, 2006 (78 countries with reported data)

Note: 2006 is selected because it represents a pre-crisis year for which the response rate is the highest.

Changes in the age-structure of populations can also affect labour force participation rates over time. This happens at the country-wide level since different age cohorts tend to have different labour force participation rates, and thus changes in the aggregate age structure of a population can affect the overall participation rate.