






Ask a half dozen econometricians what econometrics is, and you could get a half dozen different answers. One might tell you that econometrics is the science of testing economic theories. A second might tell you that econometrics is the set of tools used for forecasting future values of economic variables, such as a firm’s sales, the overall growth of the economy, or stock prices. Another might say that econometrics is the process of fitting mathematical economic models to real-world data. A fourth might tell you that it is the science and art of using historical data to make numerical, or quantitative, policy recommendations in government and business.
In fact, all these answers are right. At a broad level, econometrics is the science and art of using economic theory and statistical techniques to analyze economic data. Econometric methods are used in many branches of economics, including finance, labor economics, macroeconomics, microeconomics, marketing, and economic policy. Econometric methods are also commonly used in other social sciences, including political science and sociology.
This text introduces you to the core set of methods used by econometricians. We will use these methods to answer a variety of specific, quantitative questions from the worlds of business and government policy. This chapter poses four of those questions and discusses, in general terms, the econometric approach to answering them. The chapter concludes with a survey of the main types of data available to econometricians for answering these and other quantitative economic questions.
Many decisions in economics, business, and government hinge on understanding relationships among variables in the world around us. These decisions require quantitative answers to quantitative questions.
This text examines several quantitative questions taken from current issues in economics. Four of these questions concern education policy, racial bias in mortgage lending, cigarette consumption, and macroeconomic forecasting.
Proposals for reform of the U.S. public education system generate heated debate. Many of the proposals concern the youngest students, those in elementary schools. Elementary school education has various objectives, such as developing social skills,
44 CHAPTER 1 Economic Questions and Data
but for many parents and educators, the most important objective is basic academic learning: reading, writing, and basic mathematics. One prominent proposal for improving basic learning is to reduce class sizes at elementary schools. With fewer students in the classroom, the argument goes, each student gets more of the teacher’s attention, there are fewer class disruptions, learning is enhanced, and grades improve.
But what, precisely, is the effect on elementary school education of reducing class size? Reducing class size costs money: It requires hiring more teachers and, if the school is already at capacity, building more classrooms. A decision maker contemplating hiring more teachers must weigh these costs against the benefits. To weigh costs and benefits, however, the decision maker must have a precise quantitative understanding of the likely benefits. Is the beneficial effect on basic learning of smaller classes large or small? Is it possible that smaller class size actually has no effect on basic learning?
Although common sense and everyday experience may suggest that more learning occurs when there are fewer students, common sense cannot provide a quantitative answer to the question of what exactly is the effect on basic learning of reducing class size. To provide such an answer, we must examine empirical evidence (that is, evidence based on data) relating class size to basic learning in elementary schools.
In this text, we examine the relationship between class size and basic learning, using data gathered from 420 California school districts in 1999. In the California data, students in districts with small class sizes tend to perform better on standardized tests than students in districts with larger classes. While this fact is consistent with the idea that smaller classes produce better test scores, it might simply reflect many other advantages that students in districts with small classes have over their counterparts in districts with large classes.
For example, districts with small class sizes tend to have wealthier residents than districts with large classes, so students in small-class districts could have more opportunities for learning outside the classroom. It could be these extra learning opportunities that lead to higher test scores, not smaller class sizes. In Part II, we use multiple regression analysis to isolate the effect of changes in class size from changes in other factors, such as the economic background of the students.
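To make the confounding argument above concrete, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the sample of 2,000 hypothetical districts, the assumed true effect of -1 test-score point per extra student, and the way income is tied to class size. None of the numbers come from the California data set, and the helper functions `ols` and `residuals` are hypothetical names. Holding income constant is done by removing the part of each variable explained by income and then regressing what is left (the Frisch-Waugh partialling-out approach), which gives the same class-size coefficient as a multiple regression on class size and income.

```python
import random


def ols(x, y):
    """Return (slope, intercept) of the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(x, y):
    """Return the part of y that a straight-line fit on x leaves unexplained."""
    slope, intercept = ols(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Simulated, hypothetical districts (not the California data).
rng = random.Random(0)
TRUE_EFFECT = -1.0    # assumed effect of one extra student per class on the score
N_DISTRICTS = 2000    # assumed sample size

income = [rng.gauss(0, 1) for _ in range(N_DISTRICTS)]
# Assumption: wealthier districts tend to have smaller classes.
class_size = [20 - 1.5 * inc + rng.gauss(0, 1) for inc in income]
# Assumption: income also raises test scores directly.
test_score = [650 + TRUE_EFFECT * cs + 5 * inc + rng.gauss(0, 2)
              for cs, inc in zip(class_size, income)]

# Simple regression: mixes the class-size effect with the income advantage.
naive_slope, _ = ols(class_size, test_score)

# Hold income constant: partial income out of both variables, then regress
# the leftover test score on the leftover class size.
controlled_slope, _ = ols(residuals(income, class_size),
                          residuals(income, test_score))

print(f"simple regression slope:       {naive_slope:.2f}")
print(f"slope holding income constant: {controlled_slope:.2f}")
```

In this simulation the simple regression overstates the benefit of small classes because it attributes the income advantage to class size, while the slope that holds income constant lands close to the assumed true effect.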
Most people buy their homes with the help of a mortgage, a large loan secured by the value of the home. By law, U.S. lending institutions cannot take race into account when deciding to grant or deny a request for a mortgage: Applicants who are identical in all ways except their race should be equally likely to have their mortgage applications approved. In theory, then, there should be no racial bias in mortgage lending. In contrast to this theoretical conclusion, researchers at the Federal Reserve Bank of Boston found (using data from the early 1990s) that 28% of black applicants are
While economic theory, such as the production function for health, helps us analyze the mix of inputs that may lead to improved health outcomes, it does not tell us the actual values for parameters such as the spending elasticity for mortality. To estimate the value, we must examine empirical evidence about the returns to healthcare spending, based on variations in spending across countries, within countries over time, or both. In other words, we need to analyze the data on how health outcomes and healthcare expenditures are related.
For many years economists have attempted to address this question by considering the data on healthcare expenditures and mortality rates across countries, but such empirical research is fraught with challenges. Two of the biggest challenges concern the extensive heterogeneity across countries. The first challenge is observable heterogeneity, which concerns factors that affect countries’ mortality rates and may also be associated with healthcare expenditure, for example, the income per capita of each country. Because these factors are observable to the analyst, they can be controlled for using multiple regression analysis, as described in Part II. The second and more troublesome challenge is the presence of unobservable heterogeneity. Unobserved factors may be important in the underlying processes that determine both how much money is spent on healthcare and the overall level of health outcomes attained. These factors result in causality running in both directions: healthcare reduces mortality, but higher healthcare expenditure might be a response to unobserved factors, such as small natural disasters, that increase mortality. Methods for handling this “simultaneous causality” are described in Chapter 12, applied to the different but conceptually similar context of estimating the price elasticity of cigarette demand.
It seems that people always want a sneak preview of the future. What will sales be next year at a firm that is considering investing in new equipment? Will the stock market go up next month, and, if it does, by how much? Will city tax receipts next year cover planned expenditures on city services? Will your microeconomics exam next week focus on externalities or monopolies? Will Saturday be a nice day to go to the beach?
One aspect of the future in which macroeconomists are particularly interested is the growth of real economic activity, as measured by real gross domestic product (GDP), during the next year. A management consulting firm might advise a manufacturing client to expand its capacity based on an upbeat forecast of economic growth. Economists at the Federal Reserve Board in Washington, D.C., are mandated to set policy to keep real GDP near its potential in order to maximize employment. If they forecast anemic GDP growth over the next year, they might expand liquidity in the economy by reducing interest rates or other measures, in an attempt to boost economic activity.
Professional economists who rely on numerical forecasts use econometric models to make those forecasts. A forecaster’s job is to predict the future by using the
1.2 Causal Effects and Idealized Experiments 47
past, and econometricians do this by using economic theory and statistical techniques to quantify relationships in historical data.
The data we use to forecast the growth rate of GDP include past values of GDP and the so-called term spread in the United States. The term spread is the difference between long-term and short-term interest rates. It measures, among other things, whether investors expect short-term interest rates to rise or fall in the future. The term spread is usually positive, but it tends to fall sharply before the onset of a recession. One of the GDP growth rate forecasts we develop and evaluate in Chapter 15 is based on the term spread.
Each of these four questions requires a numerical answer. Economic theory provides clues about that answer (for example, cigarette consumption ought to go down when the price goes up), but the actual value of the number must be learned empirically, that is, by analyzing data. Because we use data to answer quantitative questions, our answers always have some uncertainty: A different set of data would produce a different numerical answer. Therefore, the conceptual framework for the analysis needs to provide both a numerical answer to the question and a measure of how precise the answer is.
The conceptual framework used in this text is the multiple regression model, the mainstay of econometrics. This model, introduced in Part II, provides a mathematical way to quantify how a change in one variable affects another variable, holding other things constant. For example, what effect does a change in class size have on test scores, holding constant or controlling for student characteristics (such as family income) that a school district administrator cannot control? What effect does your race have on your chances of having a mortgage application granted, holding constant other factors such as your ability to repay the loan? What effect does a 1% increase in the price of cigarettes have on cigarette consumption, holding constant the income of smokers and potential smokers? The multiple regression model and its extensions provide a framework for answering these questions using data and for quantifying the uncertainty associated with those answers.
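As a rough illustration of what "holding other things constant" means, the class size question could be written as a regression with two explanatory variables. The notation below is an assumed sketch for this example only (the variable names and the choice of family income as the single control are illustrative, not the specification developed in Part II):

```latex
\[
\mathit{TestScore}_i = \beta_0 + \beta_1\,\mathit{ClassSize}_i + \beta_2\,\mathit{Income}_i + u_i ,
\qquad i = 1, \dots, n
\]
```

Under this sketch, $\beta_1$ is the change in the test score associated with a one-unit change in class size when income is held fixed, and $u_i$ collects all other factors that affect the test score in district $i$. The data are used both to estimate $\beta_1$ and to gauge how precisely it is estimated.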
Like many other questions encountered in econometrics, the first three questions in Section 1.1 concern causal relationships among variables. In common usage, an action is said to cause an outcome if the outcome is the direct result, or consequence, of that action. Touching a hot stove causes you to get burned, drinking water causes you to be less thirsty, putting air in your tires causes them to inflate, putting fertilizer on your tomato plants causes them to produce more tomatoes. Causality means that a specific action (applying fertilizer) leads to a specific, measurable consequence (more tomatoes).
1.3 Data: Sources and Types 49
Forecasting is a special case of what statisticians and econometricians call prediction, which is using information on some variables to make a statement about the value of another variable. A forecast is a prediction about the value of a variable in the future, like GDP growth next year.
You do not need to know a causal relationship to make a good prediction. A good way to “predict” whether it is raining is to observe whether pedestrians are using umbrellas, but the act of using an umbrella does not cause it to rain.
When one has a small number of predictors and the data do not evolve over time, the multiple regression methods of Part II can provide reliable predictions. Predictions can often be improved, however, if there is a large number of candidate predictors. Methods for using many predictors are covered in Chapter 14.
Forecasts (that is, predictions about the future) use data on variables that evolve over time, which introduces new challenges and opportunities. As we will see in Chapter 15, multiple regression analysis allows us to quantify historical relationships, to check whether those relationships have been stable over time, to make quantitative forecasts about the future, and to assess the accuracy of those forecasts.
In econometrics, data come from one of two sources: experiments or nonexperimental observations of the world. This text examines both experimental and nonexperimental data sets.
Experimental data come from experiments designed to evaluate a treatment or policy or to investigate a causal effect. For example, the state of Tennessee financed a large randomized controlled experiment examining class size in the 1980s. In that experiment, which we examine in Chapter 13, thousands of students were randomly assigned to classes of different sizes for several years and were given standardized tests annually.
The Tennessee class size experiment cost millions of dollars and required the ongoing cooperation of many administrators, parents, and teachers over several years. Because real-world experiments with human subjects are difficult to administer and to control, they have flaws relative to ideal randomized controlled experiments. Moreover, in some circumstances, experiments are not only expensive and difficult to administer but also unethical. (Would it be ethical to offer randomly selected teenagers inexpensive cigarettes to see how many they buy?) Because of these financial, practical, and ethical problems, experiments in economics are relatively rare. Instead, most economic data are obtained by observing real-world behavior.
Data obtained by observing actual behavior outside an experimental setting are called observational data. Observational data are collected using surveys, such as telephone surveys of consumers, and administrative records, such as historical records on mortgage applications maintained by lending institutions.
TABLE 1.1 Selected Observations on Test Scores and Other Variables for California School Districts in 1999

Observation   District Average   Student–Teacher   Expenditure     Percentage of Students
(District)    Test Score         Ratio             per Pupil ($)   Learning English
Number        (fifth grade)
1             690.8              17.89             $6385           0.0%
2             661.2              21.52             5099            4.
3             643.6              18.70             5502            30.
4             647.7              17.36             7102            0.
5             640.8              18.67             5236            13.
...           ...                ...               ...             ...
418           645.0              21.89             4403            24.
419           672.2              20.20             4776            3.
420           655.8              19.04             5993            5.

Note: The California test score data set is described in Appendix 4.1.
Observational data pose major challenges to econometric attempts to estimate causal effects, and the tools of econometrics are designed to tackle these challenges. In the real world, levels of “treatment” (the amount of fertilizer in the tomato example, the student–teacher ratio in the class size example) are not assigned at random, so it is difficult to sort out the effect of the “treatment” from other relevant factors. Much of econometrics, and much of this text, is devoted to methods for meeting the challenges encountered when real-world data are used to estimate causal effects.
Whether the data are experimental or observational, data sets come in three main types: cross-sectional data, time series data, and panel data. In this text, you will encounter all three types.
Data on different entities (workers, consumers, firms, governmental units, and so forth) for a single time period are called cross-sectional data. For example, the data on test scores in California school districts are cross-sectional. Those data are for 420 entities (school districts) for a single time period (1999). In general, the number of entities on which we have observations is denoted n; so, for example, in the California data set, n = 420.
The California test score data set contains measurements of several different variables for each district. Some of these data are tabulated in Table 1.1. Each row lists data for a different district. For example, the average test score for the first district (“district 1”) is 690.8; this is the average of the math and science test scores for all fifth-graders in that district in 1999 on a standardized test (the Stanford Achievement Test). The average student–teacher ratio in that district is 17.89; that is, the number of students in district 1 divided by the number of classroom teachers in district 1
TABLE 1.3 Selected Observations on Cigarette Sales, Prices, and Taxes, by State and Year for U.S. States, 1985–

Observation   State           Year   Cigarette Sales      Average Price per Pack   Total Taxes (cigarette
Number                               (packs per capita)   (including taxes)        excise tax + sales tax)
1             Alabama         1985   116.5                $1.022                   $0.
2             Arkansas        1985   128.5                1.015                    0.
3             Arizona         1985   104.5                1.086                    0.
...           ...             ...    ...                  ...                      ...
47            West Virginia   1985   112.8                1.089                    0.
48            Wyoming         1985   129.4                0.935                    0.
49            Alabama         1986   117.2                1.080                    0.
...           ...             ...    ...                  ...                      ...
96            Wyoming         1986   127.8                1.007                    0.
97            Alabama         1987   115.8                1.135                    0.
...           ...             ...    ...                  ...                      ...
528           Wyoming         1995   112.2                1.585                    0.

Note: The cigarette consumption data set is described in Appendix 12.1.
for example, GDP grew 8.8% at an annual rate. In other words, if GDP had continued growing for four quarters at its rate during the first quarter of 1960, the level of GDP would have increased by 8.8%. In the first quarter of 1960, the long-term interest rate was 4.5%, and the short-term interest rate was 3.9%; so their difference, the term spread, was 0.6 percentage points.
By tracking a single entity over time, time series data can be used to study the evolution of variables over time and to forecast future values of those variables.
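The arithmetic in the passage above can be sketched in a few lines. The function names are hypothetical, and the annualization below assumes simple compounding of one quarter's growth over four quarters, which matches the verbal description; the text's later chapters may use a different (for example, logarithmic) convention. The implied quarterly rate is derived here from the 8.8% figure and is not a number reported in the text.

```python
def term_spread(long_rate, short_rate):
    """Long-term minus short-term interest rate, in percentage points."""
    return long_rate - short_rate


def annualized_growth(quarterly_growth):
    """Growth over four quarters if one quarter's growth (a fraction) persisted."""
    return (1 + quarterly_growth) ** 4 - 1


def quarterly_from_annualized(annual_growth):
    """Inverse of annualized_growth: the quarterly rate implied by an annual rate."""
    return (1 + annual_growth) ** 0.25 - 1


# First quarter of 1960, using the interest rates quoted in the text.
spread_1960q1 = term_spread(4.5, 3.9)

# Quarterly growth implied by 8.8% at an annual rate (a derived value,
# assuming compounding).
implied_quarterly = quarterly_from_annualized(0.088)

print(f"term spread: {spread_1960q1:.1f} percentage points")
print(f"implied quarterly growth: {implied_quarterly:.2%}")
print(f"re-annualized: {annualized_growth(implied_quarterly):.1%}")
```

Under the compounding assumption, 8.8% at an annual rate corresponds to growth of a little over 2% within the quarter itself.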
Panel data, also called longitudinal data, are data for multiple entities in which each entity is observed at two or more time periods. Our data on cigarette consumption and prices are an example of a panel data set, and selected variables and observations in that data set are listed in Table 1.3. The number of entities in a panel data set is denoted n, and the number of time periods is denoted T. In the cigarette data set, we have observations on n = 48 continental U.S. states (entities) for T = 11 years (time periods) from 1985 to 1995. Thus, there is a total of n × T = 48 × 11 = 528 observations.
Some data from the cigarette consumption data set are listed in Table 1.3. The first block of 48 observations lists the data for each state in 1985, organized alphabetically from Alabama to Wyoming. The next block of 48 observations lists the data for 1986, and so forth, through 1995. For example, in 1985, cigarette sales in Arkansas were 128.5 packs per capita (the total number of packs of cigarettes sold in Arkansas in 1985 divided by the total population of Arkansas in 1985 equals 128.5). The average price of a pack of cigarettes in Arkansas in 1985, including tax, was $1.015, of which 37¢ went to federal, state, and local taxes.
Panel data can be used to learn about economic relationships from the experiences of the many different entities in the data set and from the evolution over time of the variables for each entity.
The definitions of cross-sectional data, time series data, and panel data are summarized in Key Concept 1.1.
Key Terms
randomized controlled experiment
control group
treatment group
causal effect
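The block layout of the cigarette panel in Table 1.3 (48 states listed for 1985, then 48 for 1986, and so on through 1995) can be sketched as a mapping between observation numbers and (state, year) pairs. This is an illustrative sketch: the function names are hypothetical, and a state is identified only by its position (1 to 48) in the table's ordering, because the full list of states is not given in the text.

```python
N_STATES = 48                            # n: entities in the panel
FIRST_YEAR, LAST_YEAR = 1985, 1995
N_YEARS = LAST_YEAR - FIRST_YEAR + 1     # T: time periods
N_OBSERVATIONS = N_STATES * N_YEARS      # n x T


def observation_number(state_rank, year):
    """1-based row number for the state at position state_rank in a given year."""
    if not 1 <= state_rank <= N_STATES:
        raise ValueError("state_rank must be between 1 and 48")
    if not FIRST_YEAR <= year <= LAST_YEAR:
        raise ValueError("year must be between 1985 and 1995")
    # Each earlier year contributes one full block of N_STATES rows.
    return (year - FIRST_YEAR) * N_STATES + state_rank


def state_and_year(observation):
    """Inverse mapping: (state_rank, year) for a 1-based observation number."""
    if not 1 <= observation <= N_OBSERVATIONS:
        raise ValueError("observation must be between 1 and 528")
    block, offset = divmod(observation - 1, N_STATES)
    return offset + 1, FIRST_YEAR + block


print(N_OBSERVATIONS)                  # total number of observations
print(observation_number(1, 1986))     # first-listed state (Alabama) in 1986
print(state_and_year(528))             # last row of the panel
```

The mapping agrees with the rows shown in Table 1.3: Alabama in 1986 is observation 49, Wyoming in 1986 is observation 96, and Wyoming in 1995 is observation 528.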