Statistical Analysis Homework: Regression Analysis and Scatterplots - Prof. Mary Kathryn C, Assignments of Data Analysis & Statistical Methods

Instructions for Homework 4 in Statistical Methods and Computing (22S:105), taught by Professor Cowles in spring 2006. The assignment covers finding a least-squares regression equation, creating scatterplots, and analyzing residuals using SAS. Students examine the relationship between returns on U.S. and overseas investments over a 26-year period.


Uploaded on 09/17/2009

22S:105, Statistical Methods and Computing
Instructor: Cowles
Spring 2006, Homework 4
Due: Fri., 02/17 in class
Please put your name at the top of your homework, and list the names of any classmates
with whom you collaborated.
1. Investors ask about the relationship between returns on investments in the U.S. and
on investments overseas. The data file “stocks.dat” gives the total returns on U.S.
and overseas common stocks over a 26-year period. (The total return is change in
price plus any dividends paid, converted into U.S. dollars. Both returns are averages
over many individual stocks.) Use SAS to get the numbers required for answering the
following questions.
(a) Find the least-squares regression equation of overseas returns on U.S. returns.
(b) Produce a scatterplot with the regression line superimposed on it.
(c) Produce a residual plot.
(d) In 1997, the return on U.S. stocks was 33.4%. Use the regression line to predict
the return on overseas stocks. (You may either calculate this by hand or use
SAS output.) The actual overseas return was 2.1%. Are you confident that
predictions using the regression line will be quite accurate? Why?
(e) In both of your plots, circle the point that has the largest residual (either positive
or negative). What year is this?
(f) Are there any points that seem likely to be very influential? If so, identify them
in your scatterplot and residual plot.
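
A minimal SAS sketch of how parts (a) through (c) might be set up. The layout of "stocks.dat" is not given in this handout, so the column order (year, overseas return, U.S. return), the variable names, and the file path are all assumptions; adjust them to match the actual file.

```sas
/* ASSUMPTION: each row of stocks.dat holds year, overseas return, U.S. return. */
/* The variable names year, overseas, and us are hypothetical.                  */
data stocks ;
  infile 'stocks.dat' ;
  input year overseas us ;
run ;

proc reg data = stocks ;
  model overseas = us ;      /* (a) regress overseas returns on U.S. returns */
  plot overseas * us ;       /* (b) scatterplot with the fitted line         */
  plot residual. * us ;      /* (c) residuals against U.S. returns           */
run ;
quit ;
```

The intercept and slope in the "Parameter Estimates" table of the PROC REG output give the regression equation, which can then be evaluated by hand at a U.S. return of 33.4 for part (d).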
2. Textbook problem 5.10.
Use SAS. Enter the data yourself as part of the data step using the “datalines” state-
ment as shown below. Note that you need a semicolon after “datalines” and another
semicolon by itself on the line after the last row in the list of data. Answer all parts
of the question, and include relevant SAS output.
data farm ;
input year pop ;
datalines ;
1935 32.1
1940 30.5
1945 24.4
1950 23.0
1955 19.1
1960 15.6
1965 12.4
1970 9.7
1975 8.9
1980 7.2
;
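
One way to confirm that the data step read every row as intended is to print the resulting data set. This is only a sketch and assumes the data set name "farm" from the step above.

```sas
/* List the farm data set to check the year and pop values against the list above. */
proc print data = farm ;
run ;
```

The listing should show one observation per data row, with year and pop in separate columns.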
3. Textbook problems: 5.12, 5.25, 5.32, I.40, I.44, 7.3, 7.11, 7.18, 7.20, 7.41
