The University of Manchester Research
Evaluation of existing migration forecasting methods and
models
Citation for published version (APA): Disney, G., Wiśniowski, A., Forster, J. J., Smith, P. W. F., & Bijak, J. (2015). Evaluation of existing migration forecasting methods and models. (Report for the Migration Advisory Committee: commissioned research). ESRC Centre for Population Change, University of Southampton. https://www.gov.uk/government/publications/evaluation-of-existing-migration-forecasting-methods-and-models
Evaluation of existing migration forecasting
methods and models
Report for the Migration Advisory Committee
George Disney, Arkadiusz Wiśniowski, Jonathan J. Forster, Peter W.F. Smith and Jakub Bijak
ESRC Centre for Population Change, University of Southampton.
Southampton, 10 October 2015
Acknowledgments
This work has been prepared for and was funded by the Migration Advisory Committee (MAC), under the Home
Office Science contract HOS/14/040. We would like to thank Allan Findlay and the members of the MAC and the
MAC Secretariat for their comments on the drafts. All the interpretations in this report are those of the authors only.
Contents
- Tables and Figures
- Key Messages
- Executive Summary
- Introduction
- Uncertainty and Migration
- Data Audit and Assessment
- Review of Forecasting Methods
- 4.1. Review of Past Reviews
- 4.2. Deterministic Approaches
- 4.3. Probabilistic Models
- 4.3.1 Probabilistic Extrapolation of Time Series
- 4.3.2 Probabilistic Expert Based Forecasts
- 4.3.3 Bayesian Forecasts
- 4.3.4 Econometric Models with Covariate Information
- 4.3.5 Extrapolation – Historical Forecast Errors
- 4.4 Matching Forecasting Methods to Available Data
- Backward- and Forward-Look Empirical Analysis
- 5.1 Empirical Research – Framework
- 5.2 Empirical Research – Results
- 5.3 Sensitivity Analysis
- 5.3.1 Sensitivity to Prior Assumptions
- 5.3.2 Sensitivity to Expert Knowledge
- 5.3.3 Sensitivity of the econometric models
- 5.3 Forward Look Exercise
- Conclusions and Recommendations
- 6.1 Key findings and limitations of the study
- 6.2 Migration forecasting: a three-point modelling process
- 6.3 Concluding remarks: From forecasts to decisions
- References
- Glossary of the Key Terms
- Appendix A. Backward-look exercise, series truncated in
- Appendix B. Backward-look exercise, series truncated in
- Appendix C. Scoring rules for the forecast quality categories
Tables and Figures
- Table 1: Data Audit and Assessment
- Table 2: Data and Method Matrix
- Table 3: Summary of data series and models analysed in the empirical exercise
- Table 4: Backward look summary table – symmetric scores, errors and calibration similarly important
- Table 5: Backward look summary table – asymmetric scores, errors more important than calibration
- Table 6: A stylised example of a risk management matrix related to migration forecasts, from the point of view of possible policy impacts
- Appendix A: Results for the backward-look exercise based on series truncated in
- Table A1: IPS totals
- Table A2: UK citizens
- Table A3: Other EU citizens
- Table A4: Non-EU citizens
- Table A5: Administrative data
- Appendix B: Results for the backward-look exercise based on series truncated in
- Table B1: IPS totals
- Table B2: UK citizens
- Table B3: Other EU citizens
- Table B4: Non-EU citizens
- Table B5: Administrative data
- Appendix C: Backward Look Summary Tables
- Table C1: Symmetric Score Table, with errors and calibration similarly important
- Table C2: Asymmetric Score Table, with errors more important than calibration
- Figure 1: Data on migration in the United Kingdom
- Figure 2: Average error and its standard deviation by projection horizon, past official net migration assumptions for NPP, 1970-based to 2012-based
- Figure 3: Selected examples of backward-look exercise
- Figure 4: Examples of forward-look exercise
- Box 1. Conversion of error and calibration measures into quality classes: Examples
this report produced considerable uncertainty when tested against past data. For example,
the more reliable models for migration from the new EU member states predicted that there
was a 50 per cent chance that the average annual immigration between 2004 and 2013 would
range from 100 to 200 thousand people; in reality, this proved to be just above 150,000.
Some models produced smaller errors and consequently were deemed to have performed
better than others with regard to forecasting migration. Notably, the more successful
forecasts were observed when using the more stable data series, such as the migration of
UK nationals, which are less susceptible to unpredictable shocks or policy changes.
From the analysis, no particular model can be considered conclusively
superior. Instead, it is recommended that any future analysis utilise a three-step approach to
migration forecasting. Namely, future analysis should assess the nature of the migration flow
being forecast, evaluate the available data, and design a bespoke forecasting model for the
given situation. Instead of trying to do the impossible and design the ‘best possible’
migration forecasting method, further work in this area should focus on the ways of
translating uncertain forecasts into policy advice and decisions.
It is imperative that all migration forecasts emphasise the uncertainty involved in the
predictions. This is necessary to acknowledge transparently that migration cannot be
forecast without substantial error, whilst also providing an account of the possible size of
these errors. The range of errors can be shown in different ways, by means of
probabilities for various ranges of possible outcomes.
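One way to report such probabilities, sketched here under stated assumptions, is to summarise a sample of simulated outcomes from whichever forecasting model is used. The draws below are purely illustrative numbers generated from a log-normal distribution, not results from this report, and the function names `prob_in_range` and `central_50_interval` are hypothetical.

```python
import random
import statistics


def prob_in_range(draws, low, high):
    """Share of simulated outcomes falling in the half-open range [low, high)."""
    return sum(low <= x < high for x in draws) / len(draws)


def central_50_interval(draws):
    """Lower and upper quartiles of the draws, i.e. a central 50 per cent interval."""
    q1, _median, q3 = statistics.quantiles(draws, n=4)
    return q1, q3


# Illustrative only: 10,000 simulated annual flows, in thousands of people.
# The log-normal shape and its parameters are assumptions made for this sketch.
rng = random.Random(42)
draws = [rng.lognormvariate(5.0, 0.4) for _ in range(10_000)]

# Ranges of outcomes (thousands) for which probabilities are reported.
bands = [(0, 100), (100, 200), (200, float("inf"))]
for low, high in bands:
    print(f"P({low} <= flow < {high}) = {prob_in_range(draws, low, high):.2f}")

print("Central 50% interval:", central_50_interval(draws))
```

Reporting a handful of ranges alongside a central interval keeps the message readable for non-specialists while still conveying that no single number is likely to be correct.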
Since the probability of a single forecast being correct is extremely low, it is vital that the
uncertainty around migration forecasts is made explicit to decision-makers and the general
public. Emphasising the uncertainty also allows decision-makers to recognise that
migration can be affected by a wide range of events, including ‘shocks’. Although such
events are quite unlikely, their potential impact on migratory flows could be large, so
they all need to be taken into account.
Executive Summary
Forecasting migration flows is an extremely difficult task, characterised by very high
levels of error, the highest amongst the three components of demographic change (fertility,
mortality and migration). There are many social, economic, political or even environmental
drivers which can impact migration flows: from unemployment, job prospects and wage
differentials, to social networks and various institutions. In particular, migration is very
susceptible to events which are difficult to predict in terms of timing as well as scale of impact,
such as fluctuations in the economic cycle, the incidence of armed conflict, and changes in
policies or political circumstances.
There exists no perfect migration theory that can be used for forecasting purposes.
Explanations put forward for complex migration processes are often entrenched in particular
disciplines of social sciences, such as economics, sociology or human geography, so that they
tend to focus predominantly on one fragment of the picture and examine a specific group of
migration drivers (economic, social, etc.) at the expense of others. Even if credible theoretical
explanations of past migration flows do exist, their tenets tend to be difficult to extrapolate into
the future.
This report focuses on three sources of uncertainty in migration forecasts: uncertainty
about the future, errors and differences in the data, and uncertainty related to relying on a single
particular forecasting model amongst the many possibilities available (Section 2). None of these
sources of uncertainty can be eliminated completely, but it is crucial that they are acknowledged
in the forecasting process in a transparent way. Uncertainty about data can be quantified, albeit
with caution, but the uncertainty surrounding the nature and size of future shocks can be
assessed only to a very limited extent. As such, any decision-making based on migration forecasts
is particularly susceptible to error from unforeseen future events.
There exist several different sources of data on migration flows into and out of the
United Kingdom, but they differ with respect to four key attributes: the definitions of migration
they use; the particular migrant groups they cover; whether there is under- or over-reporting of
migration; and how accurate their measurements are (Section 3). Even the main source of data
used to measure migration in the UK, the International Passenger Survey (IPS), has several
weaknesses. The IPS is a sample survey, so disaggregations of the data by countries of origin or
destination of migrants can have high margins of error resulting from sampling of respondents.
There can also be some bias in the numbers related to the way the data are collected, with the
initial focus mainly on the largest airports and Channel crossings having caused problems after
the 2004 enlargement of the European Union. Additionally, the long-term IPS estimates are
based on the questions about the intended (rather than actual) length of stay in the UK or
abroad.
that did not assume stability of trends, when none was to be expected, at least described the
forecast uncertainty more accurately, and hence more honestly.
In terms of recommendations for the future, rather than suggesting any particular model
or approach for all circumstances, we recommend following a three-stage process to guide the
choice of the forecasting methodology for a given task (Section 6):
Stage 1. Develop a thorough understanding of the migration flow being forecast, with a
focus on whether it is stable or highly susceptible to external political or economic shocks or
policy interventions. For example, asylum flows, generated by war and conflict in other parts
of the world, can be expected to be less stable than return flows of UK nationals.
Stage 2. The available data need to be assessed with their relative strengths and weaknesses
taken into consideration, such as the length of data series or presence of structural breaks in
the data. Forecasts based on short series are typically characterised by higher uncertainty,
as are predictions of migration flows that were subject to shocks in the past.
Stage 3. An appropriate modelling approach needs to be selected given the characteristics
of the migration flow in question and the available data. In particular, a data series with
non-stable characteristics should not be forecast using models which assume stability (in
technical terms, stationarity) of the process, and vice versa. Short data series may ideally
require additional expert input concerning the future migration flows.
Following the process outlined above cannot guarantee that the resulting forecasts will
exhibit no or only small errors. Still, it would help safeguard against making forecasts that are
either unjustifiably too precise or just too uncertain to be useful, and thus protect from making
radically incorrect policy decisions on the basis of such predictions. It is unrealistic to expect that
there will be no uncertainty: even in good forecasts, errors are inevitable, and the longer the
forecast horizon, the higher the errors. In many cases, forecast errors become too high for the
forecasts to be useful beyond the horizon of five to ten years into the future.
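The growth of errors with the forecast horizon, and the difference between stable and non-stable series, can be illustrated with two textbook time-series processes. These are stand-ins chosen for this sketch, not the models evaluated in this report: a stationary first-order autoregression, AR(1), and a non-stationary random walk. The values of `sigma` and `phi` are arbitrary assumptions.

```python
import math


def rw_forecast_sd(sigma, horizon):
    """Forecast error s.d. of a random walk: grows without bound as sqrt(horizon)."""
    return sigma * math.sqrt(horizon)


def ar1_forecast_sd(sigma, phi, horizon):
    """Forecast error s.d. of a stationary AR(1): levels off as the horizon grows."""
    if not abs(phi) < 1:
        raise ValueError("AR(1) is stationary only for |phi| < 1")
    # Variance is sigma^2 * (1 + phi^2 + ... + phi^(2(h-1))), a geometric sum.
    return sigma * math.sqrt((1 - phi ** (2 * horizon)) / (1 - phi ** 2))


# Assumed values: annual shock s.d. of 20 (thousand migrants), persistence 0.7.
sigma, phi = 20.0, 0.7

print("horizon  AR(1) s.d.  random-walk s.d.")
for h in (1, 2, 5, 10, 20):
    print(f"{h:7d}  {ar1_forecast_sd(sigma, phi, h):10.1f}  {rw_forecast_sd(sigma, h):16.1f}")
```

Under these assumptions both processes start from the same one-year-ahead error, but applying the stationary model to a series that actually behaves like a random walk would understate the uncertainty more and more at longer horizons.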
One key recommendation is that any migration forecasts should come with explicit
statements of uncertainty, ideally expressed in terms of probabilities for various ranges of
possible migration outcomes. Instead of trying to do the impossible and design the ‘best possible’
migration forecasting method, further work in this area should focus on translating uncertain
forecasts into decisions, creating early warning systems, and providing risk management
strategies. As a caveat, forecasters should not offer methods that produce overly certain
predictions, as these will most likely fail, and decision-makers should neither expect nor require
them. Bringing together potential impacts of migration policy interventions (migration caps, visa
regulations) with the uncertainty of these impacts can also help policy makers make prudent and
more robust decisions, for example related to controlling or influencing specific migration flows.
events affecting migration are also complex and largely unpredictable with respect to the
countries involved, timing, and magnitude. In addition, migration often leads to the
establishment of networks of migrants from shared origins, which perpetuates the process
further through facilitating further migration. This further increases the complexity and
uncertainty of migration as a process.
The second source of uncertainty under consideration is associated with migration data
themselves. To illustrate this, let us consider the three components of population change – births,
deaths and migration. In comparison to births and deaths, migration is the most uncertain
component. Vital events, by their very nature, are straightforward to define and measure, with
birth registration a universal human right (UN 1966), for example. As such, there is a relatively
high level of certainty in the measurement of births or deaths.
The same cannot be said for migration, which, to begin with, is difficult to define. Sources
of migration data from different countries are often based on differing definitions (cf. Raymer et al.
2013). A further source of uncertainty is that the available data are often inaccurate, inconsistent
and incomplete. Migration into and out of the UK is no exception. The precise size of
international migration flows is difficult to measure; data collection systems used to record
migrants often produce biased and inaccurate estimates, which need correcting (Disney 2014;
Wiśniowski 2013).
The third source of uncertainty comes from the forecasting models themselves.
Applications of different models to the same data can produce different forecasts, including
different assessments of the uncertainty of the predictions. There is no perfect model, and
choosing which model to apply is a matter of judgement that therefore requires justification. If the
forecasts from various competing models are combined using formal criteria, additional
uncertainty about the model is introduced (cf. Bijak and Wiśniowski 2010).
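A minimal sketch of why combining models adds a layer of uncertainty: if each model's forecast is summarised by a mean and a standard deviation, and the models are combined as a weighted mixture, the variance of the combined forecast splits into a within-model part and a between-model part. The three models, their numbers and their weights below are hypothetical, and the weights are simply assumed rather than derived from any formal criterion.

```python
def combine_forecasts(means, sds, weights):
    """Mean and variance components of a weighted mixture of model forecasts."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to one")
    mean = sum(w * m for w, m in zip(weights, means))
    # Average of the models' own forecast variances.
    within = sum(w * s ** 2 for w, s in zip(weights, sds))
    # Extra variance caused by the models disagreeing with each other.
    between = sum(w * (m - mean) ** 2 for w, m in zip(weights, means))
    return mean, within, between


# Hypothetical forecasts of an annual flow (thousands) from three models.
means = [150.0, 220.0, 180.0]
sds = [30.0, 60.0, 40.0]
weights = [0.5, 0.2, 0.3]   # assumed model weights

mean, within, between = combine_forecasts(means, sds, weights)
total_sd = (within + between) ** 0.5
print(f"combined mean = {mean:.1f}")
print(f"within-model variance = {within:.1f}, between-model variance = {between:.1f}")
print(f"combined s.d. = {total_sd:.1f}")
```

The between-model term is zero only when all models agree on the central forecast, so any disagreement between models widens the combined forecast.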
Consequently, it is clear that any forecast of migration involves multiple
considerations of uncertainty, which first need to be fully understood and then taken into account
in an empirical analysis. Experts play a key role in developing forecasts, but their task depends on
the chosen approach (Lutz and Goldstein 2004). For example, the role of the expert could be
limited to choosing the forecast model and selecting the underlying sources of data, or providing
expert judgement that is explicitly incorporated as a parameter in the model.
3. Data Audit and Assessment
As stated above, international migration is hard to define and measure. The available data are
often inconsistent and are not designed with the purpose of monitoring migration. Hence, one of
the main sources of uncertainty in any forecast of immigration comes from the data as such in
terms of how they are collected, processed and disseminated (Kupiszewska and Nowok 2008).
With this in mind, it is vital to have a framework that one can use to understand the
extent and nature of the uncertainty in migration data. The main sources of publicly available
data are outlined and assessed in Table 1 below to aid our understanding of this uncertainty.
Each of the sources of data is assessed in relation to the ‘true flow’ (cf. Raymer et al.
2013 and Wiśniowski et al. 2013). ‘True flow’ is the unknown number that is being estimated. It
represents the number that one would obtain if one were able to monitor a given definition of
immigration perfectly, without bias and undercount and with complete coverage of the
population. A true flow for the purpose of this data audit uses the UN (1998) definition of
long-term international migration:
“A person who moves to a country other than that of his or her usual residence for a period of at least a
year (12 months), so that the country of destination effectively becomes his or her new country of usual
residence. From the perspective of the country of departure the person will be a long-term emigrant and from
that of the country of arrival the person will be a long-term immigrant” (UN 1998: page 18).
Similarly, short-term migration is defined by length of stay between three and 12 months (idem).
In general, various definitions can be used to represent the ‘true flow’ depending on its purpose.
The data collected to measure such a flow should then aim at reflecting it as closely as possible.
The quality of each source of data can be assessed in relation to the ‘true flow’ according to the
UN definition, by using the following analytical categories (Raymer et al. 2013; Disney 2014):
(i) Definition – how closely do the data match the UN definition of international migration?
(ii) Coverage – theoretically what portion of the total immigration flow does the data set cover?
(iii) Bias – is there any systematic bias as a result of the way the data are collected?
(iv) Accuracy – with regard to its intended purpose, how accurate are the data?
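Criterion (i) asks how closely a source matches the UN definition. The duration element of that definition can be written down as a simple rule, sketched below. The function name is hypothetical, and the handling of the boundaries is an assumption of this sketch: stays of exactly 12 months are treated as long-term and stays of exactly three months as short-term. The change of usual residence, which the UN definition also requires, is not modelled here.

```python
def classify_move(months_of_stay):
    """Classify a move by length of stay, following the UN (1998) duration thresholds."""
    if months_of_stay < 0:
        raise ValueError("length of stay cannot be negative")
    if months_of_stay >= 12:
        return "long-term"      # at least a year (12 months)
    if months_of_stay >= 3:
        return "short-term"     # between three and 12 months
    return "outside both definitions"


# A source with no duration-of-stay criterion counts all of these moves alike,
# which is one way a mismatch with the UN definition can produce an overcount.
for months in (1, 3, 6, 12, 36):
    print(months, "months:", classify_move(months))
```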
A summary table of the data audit and assessment is detailed below (Table 1). A traffic
light system is used to indicate how closely each source of data matches the true flow on each
assessment criterion. Green indicates a close match to the true flow; orange indicates a medium
distortion of the true flow, with resulting moderate bias and uncertainty; and red indicates a
large distortion, leading to a large level of uncertainty or bias in the data.
Mode of data collection, data availability, availability of migration characteristics such as
citizenship, country of birth, and country of previous residence, as well as an indicator as to
whether the data describe migration stocks or flows are all detailed in the table.
In Figure 1, we present the migration data as measured by various sources. The observed
increase in total immigration as measured by the International Passenger Survey (IPS) since the
late 1990s is a result of an increase of non-British migrants, and, especially after the enlargement
of the European Union (EU) in 2004, migration from the EU-8 new member states. Alongside
the IPS trends, its augmented version, the Long-Term International Migration (LTIM), is
presented, additionally including asylum seekers, migration to and from the Republic of Ireland,
as well as a correction for an estimated number of people who change their migration intentions.
A relatively smaller increase is observed for the IPS emigration, with a notable switch to
emigration of non-British being larger than emigration of British nationals. Short-Term
International Migration (STIM) remains on rather stable levels during the 2004-2012 period for
which the data are available. We also observe that between 2002 and 2014 the number of
non-EU students registered at the Higher Education Statistics Agency (HESA) more than doubled,
while the number of students from the EU grew by only 50 per cent in the same period.
Table 1: Data Audit and Assessment (continued)

Source: NINo (DWP) – Immigration
Data collection: Administrative registration data
Years available: 2014 data incomplete
Citizenship: Y; Country of birth: N; Previous residence: N
Flow/Stock: Proxy for flow
Availability: Publicly available
Definition: Does not match the UN definition. No duration of stay criteria, so includes many short-term migrants. Probable overcount.
Coverage: Does not include British migrants, full-time students and children. Includes people who work formally and/or claim social security. Probable undercount.
Bias: Only discernible bias comes from only counting circular and repeat migrants once. Possible undercount.
Accuracy: Accurately measures number of new NINo registrations. Error will be administrative. Non-random error in registration lag.

Source: HESA – Immigration
Data collection: Administrative data collected by HESA
Years available: 2002 – 2010
Citizenship: Y; Country of birth: N; Previous residence: N
Flow/Stock: Proxy for flow
Availability: Permission obtained
Definition: Does not match exactly, but closer match than NINo. Includes students who drop out but not short courses. Majority will study for at least 1 year.
Coverage: Just public HE students. Doesn’t cover private universities and FE Colleges. No British, children or older people. Probable undercount.
Bias: There is no discernible bias.
Accuracy: Accurately measures number of non-UK domiciled students by citizenship; error will be administrative.

Source: Worker Registration Scheme – Immigration
Data collection: Administrative registration data. There was a £ registration fee for migrant workers.
Years available: 2004 – 2011
Citizenship: Y; Country of birth: N; Previous residence: N
Flow/Stock: Proxy for flow
Availability: Publicly available
Definition: Does not match the UN definition. There is no requirement to de-register. Information on duration of stay is limited.
Coverage: Poor coverage as it only includes migrants from A8 countries and only includes people who migrate to work. Self-employed aren’t required to register.
Bias: The cost could be a disincentive for migrants to register, especially low paid. Possible undercount.
Accuracy: Accurately measures the number of migrants registering to work.
Table 1: Data Audit and Assessment (continued)

Source: Home Office Entry Clearance Visa Data – Immigration
Data collection: Administrative data
Years available: 2004 – 2013
Citizenship: Y; Country of birth: N; Previous residence: N
Flow/Stock: Flow data
Availability: Publicly available
Definition: There is no clear information on duration of stay. Some limited information on length of visa is published, but this may not reflect length of stay as people may leave early or be granted an extension.
Coverage: Does not include EU migrants or British return migrants. Does include all non-EU who require a visa for entry (and residence).
Bias: There is no discernible bias.
Accuracy: Accurately measures the number of people who require a visa to enter UK; error will be administrative.

Source: 2001 Census Special Migration Statistics, Table CO711B – Immigration
Data collection: Population Census data. Transition data indicating residence 12 months prior to Census night.
Years available: 2001
Citizenship: N; Country of birth: N; Previous residence: Y
Flow/Stock: Flow data (transitions)
Availability: Specially commissioned census table.
Definition: Does not match UN definition exactly; transition data means duration of stay is unknown. Respondents are usually resident in UK.
Coverage: Only includes migrants who are usually resident in England and Wales. Theoretically has complete coverage of England and Wales. Probable undercount.
Bias: Non-response bias of Census form. Hard to count groups – students, young people. Probable undercount.
Accuracy: Accurately records people who answer question in survey. Error is administrative.

Source: 2011 Census Data – Immigration
Data collection: Population Census data, transition data
Years available: 2011
Citizenship: N; Country of birth: N; Previous residence: Y
Flow/Stock: Flow data (transitions)
Availability: Specially commissioned census table.
Definition: As above
Coverage: As above
Bias: As above
Accuracy: As above
Figure 1: Data on migration in the United Kingdom
Source: Office for National Statistics; Higher Education Statistics Agency; Home Office (various years)
[Figure with four panels, each plotting numbers of migrants against year (1975 to 2010):
Immigration: IPS Total, LTIM, IPS British, IPS Non-British, IPS EU, IPS EU-, IPS EU-8, IPS Non-EU, NINO, Asylum applicants.
Emigration: IPS British, IPS Non-British, IPS Total.
Short-term (STIM) migration: Immigration and Emigration under the UN definition, for 3-12 months, and for 1-12 months.
Immigration of students (HESA): HESA EU, HESA Non-EU.]
When using multiple sources of data, especially data whose primary purpose is not to
measure migration, a key task is to establish the extent to which the true flow is distorted and the
associated uncertainty resulting from this distortion (Disney 2014). However, carrying out this
assessment in a coherent way, and formally including it in the empirical exercise, remains beyond
the scope of this study.
4. Review of Forecasting Methods
In this section, the main approaches to forecasting migration are reviewed. They are summarised
in the table below and are reviewed in detail later in this section. Broadly speaking, there are two
main types of migration forecasts. The first are deterministic forecasts (often called ‘projections’;
cf. Keilman 1990) and the second are probabilistic (stochastic) forecasts. In this section both
types of models are defined and illustrative examples of each forecasting approach are given.
4.1. Review of Past Reviews
The existing recent reviews of migration forecasting methods include the studies by Howe and
Jackson (2005), Bijak (2010, 2012), as well as – in the wider context of population projections or
forecasts – by Wilson and Rees (2005), Keilman (2007, 2008) and Shaw (2007). In Bijak (2010),
the theoretical discussion of the methods largely follows the grouping into the deterministic and
probabilistic (stochastic) classes, which are discussed in more detail in Sections 4.2 and 4.3 below.
Amongst the various methods reviewed in those sources, some (e.g. ethno-surveys or
event history analysis) require extensive micro-level information about individual migrants and
migrations, which is currently available only for a handful of very specific contexts. As such, the
micro methods remain outside of the scope of this review. Some other approaches, such as
‘sociodynamics’, with interlinked systems of dynamic differential equations, remain too complex
and have too high data requirements to be usable in practice. Similarly, macro-level
demo-economic models are quite complex and require bespoke specifications driven by the data
availability, and hence have not been widely used in practice. Some other methods, such as
assessments of ‘migration potential’ are criticised for not measuring migration, but rather a
general state of dissatisfaction with the situation in the home country (Bijak 2010). Still, such
variables can be potentially used as covariates in econometric models.
From the various methods, the ones most commonly used, both in academic literature
and in official statistical applications, include judgemental scenarios, econometric models, and
statistical time series extrapolations – with or without expert input, the former either as purely
expert-based or Bayesian approaches. The variables being predicted are typically either stocks or
shares of foreign-born populations, volumes or rates of flows, or net migration. Examples of the
particular studies making use of the various methods are available in Bijak (2010).
In a vast majority of countries and international institutions (Eurostat, or the 2012 round
of the UN World Population Prospects), argument-based scenarios are utilised, sometimes based
on past trends, or including additional parameters (target levels). This situation has
not changed much since the early 1990s (as reported by Keilman and Cruijsen 1992, see also