Problem Set 3 for Applied Regression Analysis | STAT 51200, Assignments of Statistics

Material Type: Assignment; Professor: Jennings; Class: Applied Regression Analysis; Subject: STAT-Statistics; University: Purdue University - Main Campus; Term: Fall 2008;

Uploaded on 07/31/2009

Statistics 512: Problem Set No. 3 Due September 19, 2008

  1. Consider the following data set that describes the relationship between the rate of an enzymatic reaction (V) and the substrate concentration (C). A common model used to describe the relationship between rate and concentration is the Michaelis-Menten model $V = \frac{\theta_1 C}{\theta_2 + C}$, where $\theta_1$ is the maximum rate of the reaction and $\theta_2$ describes how quickly the reaction will reach its maximum rate. With this model, $1/V$ can be written as a linear model with explanatory variable $1/C$:

$$\frac{1}{V} = \frac{1}{\theta_1} + \frac{\theta_2}{\theta_1} \cdot \frac{1}{C}$$
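The linearization follows from inverting both sides of the Michaelis-Menten model. The short derivation below also shows how the intercept and slope of the linear fit map back to the original parameters; the labels $\beta_0$ and $\beta_1$ are introduced here only for illustration and do not appear in the assignment.

```latex
\begin{align*}
V &= \frac{\theta_1 C}{\theta_2 + C} \\
\frac{1}{V} &= \frac{\theta_2 + C}{\theta_1 C}
             = \frac{1}{\theta_1} + \frac{\theta_2}{\theta_1} \cdot \frac{1}{C} \\
\beta_0 &= \frac{1}{\theta_1}, \quad \beta_1 = \frac{\theta_2}{\theta_1}
\;\Longrightarrow\;
\theta_1 = \frac{1}{\beta_0}, \quad \theta_2 = \frac{\beta_1}{\beta_0}
\end{align*}
```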

Concentration   Rate
0.02             76
0.02             47
0.06             97
0.06            107
0.11            123
0.11            139
0.22            159
0.22            152
0.56            191
0.56            201
1.10            207
1.10            200
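One possible way to enter these twelve observations in SAS is sketched below. The data set name original and the variable names c and v are taken from the code fragments in the assignment; reading several observations per line with @@ is just one convenient choice.

```sas
/* Raw data: c = substrate concentration, v = reaction rate */
data original;
  input c v @@;   /* @@ holds the line so several (c, v) pairs can be read from it */
  datalines;
0.02 76  0.02 47  0.06 97  0.06 107  0.11 123  0.11 139
0.22 159 0.22 152 0.56 191 0.56 201  1.10 207  1.10 200
;
run;

/* Quick check that all 12 observations were read */
proc print data=original;
run;
```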

(a) Generate a scatterplot of V vs. C. Comment on the shape.

(b) Define new variables for $1/V$ and $1/C$ in SAS, and generate a scatterplot of the new variables. Does the fit appear linear? Do any assumptions appear to be violated? The new variables can be defined as follows (if the data set original contains the raw data):

    data reaction; set original;
      vinv = 1/v; cinv = 1/c;
    run;

(c) How is the distribution of $1/C$ different from the distribution of C? Are there any points that may be more influential in determining the fit?

(d) Determine the least squares regression line for $1/V$ vs. $1/C$. Save the residuals and predicted values. Does the residual plot suggest any problems?

(e) Convert this regression line back into the original nonlinear model and plot the predicted curve on a scatterplot of V vs. C. Comment on the fit. To generate the predicted curve, simply take the predicted values from the regression model and "re-invert" them. For example, suppose results is the data set containing the residuals and predicted values (variable pred):

    data invert; set results;
      predv = 1/pred;
    run;
    symbol1 v = circle i = none c = black;
    symbol2 v = plus i = sm5 c = red;
    proc gplot data = invert;
      plot v*c predv*c / overlay;
    run;
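For part (d), a minimal sketch of the regression step might look like the following. It assumes the data set reaction with variables vinv and cinv from part (b), and writes the data set results with predicted values in pred, matching the names used in part (e). The residual variable name resid is an assumption made here for illustration.

```sas
/* Least squares fit of 1/V on 1/C; save predicted values and residuals */
proc reg data=reaction;
  model vinv = cinv;
  output out=results p=pred r=resid;   /* resid is an assumed name */
run;

/* Residuals against the explanatory variable, with a reference line at zero */
proc gplot data=results;
  plot resid*cinv / vref=0;
run;
```

Because the output statement carries the original variables along, results still contains v and c, which is what the re-inversion step in part (e) relies on.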

For the next 3 questions, use the grade point average data described in the text with Problem 1..

  2. Describe the distribution of the explanatory variable. Show the plots and output that were helpful in learning about this variable.
  3. Run the linear regression to predict GPA from the entrance test score, and obtain the residuals (do not include a list of the residuals in your solution).

(a) Verify that the sum of the residuals is zero by running proc univariate with the output from the regression.

(b) Plot the residuals versus the explanatory variable and briefly describe the plot, noting any unusual patterns or points.

(c) Plot the residuals versus the order in which the data appear in the data file and briefly describe the plot, noting any unusual patterns or points.

(d) Examine the distribution of the residuals by getting a histogram and a normal probability plot of the residuals by using the histogram and qqplot statements in proc univariate. What do you conclude?
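The residual checks in parts (a) through (d) could be sketched roughly as follows. The data set name a1 and the variable gpa are borrowed from the data-step example later in the assignment; the test score variable name act, the output data set names resids and resids2, and the variables resid and obs are assumptions made here for illustration and should be replaced with the names in your own data.

```sas
/* Fit the regression and save the residuals; act is an assumed variable name */
proc reg data=a1;
  model gpa = act;
  output out=resids r=resid;
run;

/* (a) and (d): moments (including the sum), histogram, and normal probability plot */
proc univariate data=resids;
  var resid;
  histogram resid;
  qqplot resid / normal(mu=est sigma=est);
run;

/* (c): record each observation's position in the file */
data resids2; set resids;
  obs = _n_;
run;

/* (b) and (c): residuals vs. explanatory variable, and residuals vs. file order */
proc gplot data=resids2;
  plot resid*act resid*obs / vref=0;
run;
```

The Moments table printed by proc univariate reports the sum of the observations, which is the quantity part (a) asks you to verify.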

  4. Change the data set by changing the value of the GPA for the last observation from 2. to 29.48 (e.g., a typo). You can do this in a data step. For example,

    data a2; set a1;
      if _n_ eq 120 then gpa = 29.48;
    run;

An alternative is simply to edit the data file.

(a) Make a table comparing the results of this analysis with the results of the analysis of the original data. Include in the table the following: fitted equation, t-test for the slope (with standard error and p-value), $R^2$, and the estimate of $\sigma^2$. Summarize the differences.

(b) Repeat parts (b), (c), and (d) from the previous problem and explain how these plots help you to detect the unusual observation.