Data Analysis The Truth About Linear Regression, Exercises - Engineering, Exercises of Advanced Data Analysis

Carnegie Mellon University (CMU), Advanced Data Analysis

Data Analysis The Truth About Linear Regression, Exercises - Engineering - Prof. Cosma Shalizi, Advanced Data Analysis, The Advantages of Backwardness

Typology: Exercises

2010/2011

Uploaded on 11/03/2011

Homework Assignment 2: The Advantages of Backwardness

36-402, Data Analysis, Spring 2011

Due 25 January 2011

Many theories of economic growth say that it’s easier for poor countries to grow faster than rich countries: “catching up”, or the “advantages of backwardness”. One argument for this is that poor countries can grow by copying existing, successful technologies and ways of doing business from rich ones. But rich countries are already using those technologies, so they can only grow by finding new ones, and copying is faster than innovation. So, all else being equal, poor countries should grow faster than rich ones. One way to check this is to look at how growth rates are related to other economic variables.

We will use the np package on CRAN to do kernel regression.[1] Install it, and load its oecdpanel data set. This contains growth data for many countries for 1960–1995, collected by the Organization for Economic Cooperation and Development (the OECD). We won’t use all the variables this time.

GDP is “gross domestic product”, the total value of all economic production. It’s usually reported per capita and per year. Call it $Y_{i,t}$, since it depends on the country $i$ and the year $t$. GDP isn’t perfect[2], but it is standard.

In oecdpanel, the variable growth is the logarithmic growth rate of GDP, $\log(Y_{i,t+1}/Y_{i,t})$. We look at logarithms because economic models suggest that the factors which affect growth should multiply together, rather than adding. What’s actually recorded here is the average growth rate over a five-year period, reducing year-to-year accidents.
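The definition above can be checked numerically. The sketch below uses made-up GDP figures (not values from oecdpanel) and assumes that the five-year average is the mean of the annual logarithmic growth rates; the assignment does not say exactly how the average was computed. Under that assumption the sum telescopes, so the average equals the total log change divided by five.

```r
# Hypothetical per-capita GDP for one country over six consecutive years
# (illustrative numbers only, not taken from oecdpanel).
gdp <- c(1000, 1030, 1071, 1060, 1113, 1158)

# Annual logarithmic growth rates: log(Y[t+1] / Y[t])
annual.growth <- diff(log(gdp))

# Average over the five-year period (assumed definition of the average)
avg.growth <- mean(annual.growth)

# The log growth rates sum to log(Y[6]) - log(Y[1]), so the average
# equals the total log change divided by the number of years.
telescoped <- (log(gdp[6]) - log(gdp[1])) / 5

print(avg.growth)
print(all.equal(avg.growth, telescoped))
```

Because the rates are logarithmic, a single bad year (the dip from 1071 to 1060 here) is absorbed into the average rather than dominating it.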

initgdp is $\log Y_{i,t}$, the logarithm of per-capita GDP at the start of each five-year period.

A country’s investment rate is the fraction of its GDP that goes into building or repairing productive assets (roads, harbors, power plants, factory machines, buildings, etc.). inv is the logarithm of the investment rate, so inv=-2.26 means 10.4% of output was invested.
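The conversion from inv back to a percentage is a one-line calculation. The sketch below assumes natural logarithms, which is what the 10.4% figure implies; the variable names are illustrative only.

```r
# Log investment rate quoted in the text
inv <- -2.26

# Undo the (natural) logarithm to recover the rate as a fraction of GDP
inv.rate <- exp(inv)

# Express as a percentage, rounded to one decimal place
inv.percent <- round(100 * inv.rate, 1)
print(inv.percent)
```

The same back-transformation with exp applies to popgro and, for GDP levels, to initgdp.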

popgro, similarly, is the logarithm of the population growth rate.

[1] The package has good help files, if you want to know more. Or see http://www.jstatsoft.org/v27/i05.

[2] If everyone gets worried about being robbed, GDP goes up by the amount we spend on extra locks, alarms, guards, etc., none of which would be needed if we just didn’t have so many burglars.

1. (5 points) Fit a linear model of growth on initgdp. What is the coefficient? What does it suggest about catching-up?
2. (20 points) The npreg function in the np package does kernel regression. By default, it uses a combination of cross-validation and sophisticated but very slow optimization to pick the best bandwidth. In this problem, though, we will force it to use fixed bandwidths, and do the cross-validation ourselves.

oecd.0.1 <- npreg(growth~initgdp,bws=0.1,data=oecdpanel)

does a kernel regression of growth on initgdp, using the default kernel (which is Gaussian) and bandwidth 0.1. You can run fitted, predict, etc., on the output of npreg just as you can on the output of lm. The code at the end of this assignment (also online) uses five-fold cross-validation to estimate the mean-squared error for the five bandwidths 0.1, 0.2, 0.3, 0.4, 0.5. Use it to create a plot of MSE versus bandwidth. Add to the same plot the MSEs of the five bandwidths on the whole data. What bandwidth predicts best?
3. (10 points) Make a scatterplot of initgdp versus growth. Add the line for the linear model. Add the fitted values for the kernel curve with the best bandwidth (according to the previous problem). Does the kernel regression curve suggest that poorer countries tend to grow faster? (There are at least two ways to get the fitted values for the kernel regression, using fitted or predict.)
4. (5 points) If we want to check whether poorer countries tend to grow faster, all else being equal, it seems reasonable to try to keep all else equal. Do a linear regression of growth on initgdp, along with popgro and inv. What are the new regression coefficients? Does the coefficient of initgdp have the same sign as before? What does it suggest about catching-up?
5. (10 points) npreg will also do kernel regressions with multiple input variables. This time, use the built-in bandwidth selector:

oecd.npr <- npreg(growth ~ initgdp + popgro + inv, data=oecdpanel, tol=0.1, ftol=0.1)

(The last two arguments tell the bandwidth selector not to try very hard to optimize, which in this case saves a lot of time and works out well.) What are the selected bandwidths? (Use summary.)

6. (15 points) What are the median values of popgro and inv? For countries with those median values, plot the predicted growth rate versus initial GDP, under both the linear model from problem 4 and the kernel regression from problem 5. (One way to do this is to use predict, but there are probably others.) Describe what each curve suggests about catching-up.
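To make the idea of fixed-bandwidth kernel regression with hand-rolled cross-validation concrete, here is a minimal base-R sketch on synthetic data. It is not the course's supplied code and does not use np: nw.predict and cv.mse are hypothetical helper names, the smoother is a plain Nadaraya-Watson estimator with a Gaussian kernel, and the simulated x and y stand in for real data. Its bandwidths need not match npreg's numerically.

```r
# Nadaraya-Watson kernel regression with a Gaussian kernel and a fixed
# bandwidth. A base-R stand-in for illustration, NOT np's implementation.
nw.predict <- function(x.train, y.train, x.new, bw) {
  sapply(x.new, function(x0) {
    w <- dnorm((x.train - x0) / bw)  # kernel weight of each training point
    sum(w * y.train) / sum(w)        # weighted average of the responses
  })
}

# k-fold cross-validated mean-squared error, one value per bandwidth
cv.mse <- function(x, y, bandwidths, nfolds = 5) {
  n <- length(x)
  fold <- sample(rep(1:nfolds, length.out = n))  # random fold labels
  mse <- matrix(NA, nrow = nfolds, ncol = length(bandwidths))
  for (k in 1:nfolds) {
    test <- (fold == k)
    for (j in seq_along(bandwidths)) {
      # Fit on the other folds, predict on the held-out fold
      pred <- nw.predict(x[!test], y[!test], x[test], bandwidths[j])
      mse[k, j] <- mean((y[test] - pred)^2)
    }
  }
  colMeans(mse)  # average the fold-wise errors for each bandwidth
}

# Synthetic data (an assumption for illustration, not oecdpanel)
set.seed(402)
x <- runif(200, 0, 3)
y <- sin(2 * x) + rnorm(200, sd = 0.3)

bandwidths <- c(0.1, 0.2, 0.3, 0.4, 0.5)
mse.cv <- cv.mse(x, y, bandwidths)

# In-sample MSE: fit and evaluate on the whole data set
mse.insample <- sapply(bandwidths, function(bw) {
  mean((y - nw.predict(x, y, x, bw))^2)
})

print(data.frame(bandwidth = bandwidths, cv = mse.cv, insample = mse.insample))
```

Plotting mse.cv against bandwidths with plot(..., type="b") and overlaying mse.insample with lines() gives the kind of comparison problem 2 asks for. The in-sample error rewards flexibility, which is why the held-out error is the one to use for picking a bandwidth.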
