Examples with Solutions - Applied Regression Analysis | STAT 462, Study notes of Statistics

Material Type: Notes; Class: Applied Regression Analysis; Subject: Statistics; University: Penn State - Main Campus; Term: Spring 2004;

Typology: Study notes

Pre 2010

Uploaded on 09/24/2009

koofers-user-0zg

Stat 462 March 3

Example: A study is done to compare three metal alloys used to make welds to join pipes together.

Y = a measure of the strength of the weld
X = diameter of weld
Alloy = type of alloy used to make the weld (Alloy 1, 2, or 3)

[Figure: graph of Strength versus Diameter in which alloys are indicated by different symbols.]

Note that Strength and Diameter are related and that there are differences among the alloys. There appears to be interaction, as the slopes differ for the three alloys. The slope is steeper for alloy 2.

Alloy is categorical - the numerical codes 1, 2, 3 are arbitrary. To put Alloy into a regression model, create two indicator variables:

$A_1 = 1$ if observation is alloy 1 and 0 otherwise.
$A_2 = 1$ if observation is alloy 2 and 0 otherwise.

General Rule: If a categorical variable has $k$ categories, then $k - 1$ indicator variables will fully describe the variable.

In our example, we could create $A_3 = 1$ if observation is alloy 3 and 0 otherwise. But notice that $A_1 + A_2 + A_3 = 1$ for any observation. Thus $A_3 = 1 - A_1 - A_2$, meaning that $A_3$ is perfectly predictable from the values of $A_1$ and $A_2$, so it would be redundant as a predictor in a regression equation.

In the alloy problem, a "no interaction" model is

$E(Y) = \beta_0 + \beta_1 X + \beta_2 A_1 + \beta_3 A_2$

Given the plot above, this model almost surely is wrong. An interaction model is

$E(Y) = \beta_0 + \beta_1 X + \beta_2 A_1 + \beta_3 A_2 + \beta_4 X A_1 + \beta_5 X A_2$

Notice that the interaction terms are products of the Alloy indicators and X = diameter.

Understanding the meaning of the $\beta$ coefficients

When indicator variables are present in the model, the data analyst must consider the correct interpretation of the coefficients multiplying the predictors. To do this:

Consider each category of a categorical variable separately.
For a specific category, determine the values of all indicator variables.
Substitute these values into the equation (model for $E(Y)$) and reduce as far as possible.
When this is done for each category, compare the resulting equations to determine what the individual $\beta$ coefficients measure.

Alloy Example – No Interaction model

Model for average Y is

$E(Y) = \beta_0 + \beta_1 X + \beta_2 A_1 + \beta_3 A_2$

Alloy 1. For this alloy, $A_1 = 1$ and $A_2 = 0$. So,
$E(Y) = \beta_0 + \beta_1 X + \beta_2(1) + \beta_3(0) = \beta_0 + \beta_2 + \beta_1 X$

Alloy 2. For this alloy, $A_1 = 0$ and $A_2 = 1$. So,
$E(Y) = \beta_0 + \beta_1 X + \beta_2(0) + \beta_3(1) = \beta_0 + \beta_3 + \beta_1 X$

Alloy 3. For this alloy, $A_1 = 0$ and $A_2 = 0$. So,
$E(Y) = \beta_0 + \beta_1 X + \beta_2(0) + \beta_3(0) = \beta_0 + \beta_1 X$

$\beta_1$ = the slope between Y and X, regardless of alloy. This is what "no interaction" is about: the slope between Y and X is the same for each alloy. The model actually consists of three parallel lines.

$\beta_2$ = difference between intercepts for alloys 1 and 3. More generally, it is the difference between $E(Y)$ for alloys 1 and 3 at any specified value of X.

$\beta_3$ = difference between intercepts for alloys 2 and 3. More generally, it is the difference between $E(Y)$ for alloys 2 and 3 at any specified value of X.

MINITAB RESULTS INCLUDING GRAPH OF ESTIMATED MODEL

The regression equation is
Y = - 57.3 + 6.04 X + 12.0 A1 + 29.8 A2

Predictor     Coef  SE Coef      T      P
Constant    -57.27    16.43  -3.49  0.004
X           6.0425   0.8956   6.75  0.000
A1          12.009    4.866   2.47  0.027
A2          29.798    4.597   6.48  0.000
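
The substitution procedure above can be sketched in a few lines of Python, plugging the Coef column of the Minitab output into the no-interaction model. This is only an illustration, not part of the original notes: the function names (indicators, fitted_line, predict) are hypothetical, and the diameter used in the usage note is an arbitrary value chosen for the example, not a data point from the study.

```python
# Coefficients copied from the Coef column of the Minitab output above.
B0, B1, B2, B3 = -57.27, 6.0425, 12.009, 29.798


def indicators(alloy):
    """Return (A1, A2) for an alloy coded 1, 2, or 3; alloy 3 is the baseline."""
    if alloy not in (1, 2, 3):
        raise ValueError("alloy must be 1, 2, or 3")
    return (1 if alloy == 1 else 0, 1 if alloy == 2 else 0)


def fitted_line(alloy):
    """Substitute the indicators and reduce the model to (intercept, slope)."""
    a1, a2 = indicators(alloy)
    return (B0 + B2 * a1 + B3 * a2, B1)


def predict(alloy, x):
    """Estimated mean strength for the given alloy at diameter x."""
    intercept, slope = fitted_line(alloy)
    return intercept + slope * x


for alloy in (1, 2, 3):
    a1, a2 = indicators(alloy)
    a3 = 1 - a1 - a2  # the redundant third indicator, A3 = 1 - A1 - A2
    intercept, slope = fitted_line(alloy)
    print(f"Alloy {alloy}: A1={a1} A2={a2} A3={a3}  "
          f"E(Y) = {intercept:.3f} + {slope:.4f} X")
```

Because the slope is shared, predict(1, x) - predict(3, x) equals the A1 coefficient (12.009) and predict(2, x) - predict(3, x) equals the A2 coefficient (29.798) at any diameter x, for instance a hypothetical x = 20. That is the "three parallel lines" reading of the model.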