Practice Assignment 6 - Statistical Methods | STAT 500, Assignments of Data Analysis & Statistical Methods

Material Type: Assignment; Class: STATISTICAL METHODS; Subject: STATISTICS; University: Iowa State University; Term: Unknown 1989;


Uploaded on 09/02/2009


STAT 511 Assignment 6 Name ________________ Spring 2002

Reading Assignment: Rencher: Chapter 8. Chapter 8 covers tests of hypotheses and confidence intervals for parameters in regression models. Reading Chapter 8 will help you apply the general linear model theory we are developing in Stat 511 to the analysis of multiple regression models. More information on regression analysis is given in Chapters 6, 7 and 9. This material was covered in Stat 500, and we will not repeat it in Stat 511. You can review as much of Chapters 6, 7, and 9 as you find useful. Next we will go into Chapter 14 to analyze unbalanced factorial experiments. We will consider the material on balanced factorial experiments in Chapter 13 as a special case. The material in Chapters 12 and 13 was covered in Stat 500.

Written Assignment: On-campus students: Due Wednesday, March 6, in class. Solutions will be distributed at that time. No late assignments will be accepted. Distance students: Put it in the mail or e-mail or FAX by March 14. Solutions will be posted on the course web page by 6 pm on March 14.

First Exam: The first exam will be given on Thursday, March 7, from 7-9 pm in 2245 Coover Hall. Please bring pencils, erasers, and a simple calculator. Paper and formula sheets will be provided. Distance students will be contacted about their arrangements for this exam. You will need a two-hour time period. A formula sheet will be posted on the course web page in the near future. Feel free to make suggestions for additions, deletions, clarifications or corrections. Previous exams and solutions have been posted on the course web page.

  1. Suppose you are designing a new study of the yield of a chemical process like the one partially analyzed in problems 4 through 6 on assignment 5. Suppose the engineers assigned to your project wish to run the process at the same five temperature values for each of two new catalysts. Call them catalyst A and catalyst B. The proposed model for the observed yield when the process is run with the i-th catalyst at the j-th temperature level is

$$Y_{ijk} = \mu + \alpha_i + \beta(T_{ij} - 100) + \varepsilon_{ijk}, \qquad i = 1, 2, \quad j = 1, 2, \ldots, 5, \quad k = 1, \ldots, n$$

where $T_{ij}$ is the temperature at which the process was run, and $\varepsilon_{ijk} \sim \text{NID}(0, \sigma^2)$. When the runs are made we will have n replicates for each of the ten temperature/catalyst combinations. The engineers want to test the null hypothesis $H_0: \alpha_1 = \alpha_2$ against the alternative $H_A: \alpha_1 \neq \alpha_2$ using a type I error level of $\alpha = 0.05$. Relative to the value of the error variance, $\sigma^2$, they wish to make the number of replicates (n) large enough to have probability of at least 0.90 of rejecting the null hypothesis if $\alpha_1 - \alpha_2 = 0.5\sigma$. What is the smallest value of n that satisfies these conditions?
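As a rough check on a sample-size calculation of this kind, the following Python sketch uses a normal approximation to the power of the two-sided test. It assumes that, because both catalysts are run at the same five temperatures, the estimate of the catalyst difference is the difference of the two catalyst means with variance 2σ²/(5n). The function names `approx_power` and `smallest_n` are hypothetical, and the approximation ignores the estimation of σ²; an exact calculation would use the noncentral F distribution with 1 and 10n - 3 degrees of freedom and could give a slightly different n.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n, delta_over_sigma=0.5, n_temps=5, alpha=0.05):
    """Normal-approximation power of the two-sided test of alpha_1 = alpha_2.

    Assumption: the estimated catalyst difference has variance
    2 * sigma^2 / (n_temps * n), which holds when both catalysts
    are run at the same temperatures.
    """
    std_normal = NormalDist()
    # Standardized shift: (alpha_1 - alpha_2) / sd(estimated difference).
    shift = delta_over_sigma * sqrt(n_temps * n / 2)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Sum of both rejection tails; the lower tail is negligible here.
    upper = 1 - std_normal.cdf(z_crit - shift)
    lower = std_normal.cdf(-z_crit - shift)
    return upper + lower


def smallest_n(target=0.90, **kwargs):
    """Smallest number of replicates whose approximate power reaches target."""
    n = 1
    while approx_power(n, **kwargs) < target:
        n += 1
    return n


if __name__ == "__main__":
    n = smallest_n()
    print("approximate n:", n, "approximate power:", round(approx_power(n), 4))
```

Because the error degrees of freedom are large for any n near the answer, the normal approximation should be close, but the value it returns is best confirmed with noncentral F tables or software.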

  1. Marcuse (1949, Biometrics, 5) recorded moisture content for three types of cheese made by two different methods. Two pieces of cheese were measured for each type and each method. The data are shown below.

Treatment                    Moisture Content Measurements
Type A made with Method 1    Y11 = 39.02    Y12 = 38.79
Type B made with Method 1    Y21 = 35.74    Y22 = 35.41
Type C made with Method 1    Y31 = 37.02    Y32 = 36.00
Type A made with Method 2    Y41 = 38.96    Y42 = 39.01
Type B made with Method 2    Y51 = 35.58    Y52 = 35.52
Type C made with Method 2    Y61 = 35.70    Y62 = 36.04

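Consider the model $Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$, where $\varepsilon_{ij} \sim \text{NID}(0, \sigma^2)$, $i = 1, 2, \ldots, 6$, and $j = 1, 2$. This model can be expressed in matrix form as

$$
\begin{bmatrix} Y_{11} \\ Y_{12} \\ Y_{21} \\ Y_{22} \\ Y_{31} \\ Y_{32} \\ Y_{41} \\ Y_{42} \\ Y_{51} \\ Y_{52} \\ Y_{61} \\ Y_{62} \end{bmatrix}
=
\begin{bmatrix}
1 & 1 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 1 & 0 \\
1 & 0 & 0 & 0 & 0 & 1 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} \mu \\ \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \alpha_4 \\ \alpha_5 \\ \alpha_6 \end{bmatrix}
+
\begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{21} \\ \varepsilon_{22} \\ \varepsilon_{31} \\ \varepsilon_{32} \\ \varepsilon_{41} \\ \varepsilon_{42} \\ \varepsilon_{51} \\ \varepsilon_{52} \\ \varepsilon_{61} \\ \varepsilon_{62} \end{bmatrix}
$$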

(a) What is the distribution of $\mathbf{Y} = (Y_{11}\; Y_{12}\; Y_{21}\; Y_{22}\; Y_{31}\; Y_{32}\; Y_{41}\; Y_{42}\; Y_{51}\; Y_{52}\; Y_{61}\; Y_{62})^T$?

(b) Determine which of the following are testable hypotheses. You only need to state if the hypothesis is testable or not testable.
    i. $H_0: \alpha_1 = \alpha_2 = \alpha_3$
    ii. $H_0: \alpha_1 = \alpha_2 = \alpha_3 = 0$
    iii. $H_0: \mu + \alpha_1 = 39$ and $\mu + \alpha_4 = 39$

    iv. $H_0: [\;\cdots\;]\,\boldsymbol{\beta} = [\;\cdots\;]$
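One way to explore testability for the cheese model is to check whether each linear combination in a hypothesis is estimable, that is, whether its coefficient vector lies in the row space of the model matrix. The Python sketch below builds the 12 x 7 model matrix for $Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$ and performs that check with exact rational arithmetic. The helper names (`matrix_rank`, `design_matrix`, `is_estimable`) are hypothetical, and the combinations tried at the bottom are illustrations rather than the hypotheses listed in part (b).

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank by Gauss-Jordan elimination in exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a row at or below the current pivot position with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def design_matrix(n_treatments=6, n_reps=2):
    """Columns: intercept (mu), then one indicator per treatment (alpha_i)."""
    rows = []
    for i in range(n_treatments):
        for _ in range(n_reps):
            rows.append([1] + [1 if k == i else 0 for k in range(n_treatments)])
    return rows


def is_estimable(c, x):
    """c' beta is estimable when appending c to X does not raise the rank."""
    return matrix_rank(x + [list(c)]) == matrix_rank(x)


if __name__ == "__main__":
    X = design_matrix()
    print("rank of X:", matrix_rank(X))
    print("mu + alpha_1 estimable:", is_estimable([1, 1, 0, 0, 0, 0, 0], X))
    print("alpha_1 - alpha_2 estimable:", is_estimable([0, 1, -1, 0, 0, 0, 0], X))
    print("alpha_1 alone estimable:", is_estimable([0, 1, 0, 0, 0, 0, 0], X))
```

The model matrix has seven columns but rank six, which is why some linear combinations of the parameters are estimable and others are not. A hypothesis of the form $C\boldsymbol{\beta} = \mathbf{d}$ is usually called testable when every row of $C$ passes this check and the rows are linearly independent.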

Note that more than one ice crystal was measured at some time points. A file containing S-Plus code for assisting you in answering some of the following questions is posted in the file “crystals.ssc” on the course web page. A corresponding file with SAS code is posted as “crystals.sas”. SAS users should read the data from the file posted as “crystals.dat”. You could also make use of the pull-down menus in S-Plus to analyze these data and make graphs.

(a) Compute least squares estimates for the parameters in the model $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, where $\varepsilon_i \sim \text{NID}(0, \sigma^2)$. This notation means that the random errors (and the observations) have normal distributions and satisfy the Gauss-Markov property. Report the estimates and their standard errors.

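(b) Define

$$
X = \begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix},
\qquad
\mathbf{1} = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix},
\qquad
P_X = X(X^T X)^{-1} X^T,
\qquad
P_{\mathbf{1}} = \mathbf{1}(\mathbf{1}^T \mathbf{1})^{-1} \mathbf{1}^T.
$$

Here X is the model matrix for the model in part (a). What is the distribution of the quadratic form

$$
\frac{R(\beta_1 \mid \beta_0)}{\sigma^2} = \frac{\mathbf{Y}^T (P_X - P_{\mathbf{1}}) \mathbf{Y}}{\sigma^2}\,?
$$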

(c) For the model in part (a),

$$
\frac{SS_{\text{residuals}}}{\sigma^2} = \frac{\mathbf{Y}^T (I - P_X) \mathbf{Y}}{\sigma^2}
$$

has a central chi-square distribution with (n-2) = 14 degrees of freedom. Define $MS_{\text{residuals}} = SS_{\text{residuals}} / (n - 2)$ and use the results from part (b) to derive the distribution of

$$
F = \frac{R(\beta_1 \mid \beta_0)}{MS_{\text{residuals}}}.
$$

Report degrees of freedom and a formula for the noncentrality parameter.

(d) What is the null hypothesis associated with the F statistic in part (c)? Justify your answer by showing that the noncentrality parameter in part (c) is zero if and only if the null hypothesis is true.

(e) Report the value of the test statistic in part (c) and state your conclusion.

(f) Examine the plot of the estimated line and observations and the residual plots provided by the code posted on the course web page. What do these plots suggest?
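For readers working without S-Plus or SAS, the quantities in parts (a) through (e) can be sketched in plain Python. The function name `simple_regression_f` and the data at the bottom are hypothetical: the values are made up for illustration and are not the contents of “crystals.dat”. The sketch relies on the identity that, for a straight-line model, the quadratic form $\mathbf{Y}^T (P_X - P_{\mathbf{1}}) \mathbf{Y}$ equals the squared slope estimate times the corrected sum of squares of X.

```python
from math import sqrt


def simple_regression_f(x, y):
    """Least squares fit of y = b0 + b1 * x with standard errors and F test."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    # Residual sum of squares, Y'(I - P_X)Y, and its mean square.
    ss_resid = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    ms_resid = ss_resid / (n - 2)

    # Regression sum of squares R(b1 | b0) = Y'(P_X - P_1)Y.
    r_b1_given_b0 = b1 ** 2 * sxx

    return {
        "b0": b0,
        "b1": b1,
        "se_b0": sqrt(ms_resid * (1 / n + x_bar ** 2 / sxx)),
        "se_b1": sqrt(ms_resid / sxx),
        "ms_resid": ms_resid,
        "F": r_b1_given_b0 / ms_resid,
        "df": (1, n - 2),
    }


if __name__ == "__main__":
    # Hypothetical times and axial lengths, with replication at some times.
    times = [50, 60, 60, 70, 80, 80, 90, 100]
    lengths = [19.0, 20.5, 21.0, 22.5, 25.0, 24.0, 27.5, 29.0]
    for name, value in simple_regression_f(times, lengths).items():
        print(name, value)
```

The F statistic returned here equals the square of the t statistic for the slope, which is a convenient check against the output of a statistical package.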

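  1. Suppose the model proposed in part (a) of problem 2 is incorrect. In particular, suppose that the correct model is $Y_i = \lambda_0 + \lambda_1 X_i + \lambda_2 X_i^2 + \eta_i$, where $\eta_i \sim \text{NID}(0, \omega^2)$. This model can be expressed in matrix notation as $\mathbf{Y} = Z\boldsymbol{\lambda} + \boldsymbol{\eta}$, where

$$
\mathbf{Y} = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix},
\qquad
Z = \begin{bmatrix} 1 & X_1 & X_1^2 \\ 1 & X_2 & X_2^2 \\ \vdots & \vdots & \vdots \\ 1 & X_n & X_n^2 \end{bmatrix} = [\, X \mid \mathbf{d} \,],
\qquad
\mathbf{d} = \begin{bmatrix} X_1^2 \\ X_2^2 \\ \vdots \\ X_n^2 \end{bmatrix},
\qquad
\boldsymbol{\lambda} = \begin{bmatrix} \lambda_0 \\ \lambda_1 \\ \lambda_2 \end{bmatrix}.
$$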

Assuming that this is the correct model for ice crystal growth, find

(a) The distribution of $\mathbf{Y}^T (I - P_X) \mathbf{Y}$, where X is the model matrix from problem 2, and

(b) The distribution of $\mathbf{Y}^T (P_X - P_{\mathbf{1}}) \mathbf{Y}$.

(c) Does

$$
\frac{\mathbf{Y}^T (P_X - P_{\mathbf{1}}) \mathbf{Y}}{\mathbf{Y}^T (I - P_X) \mathbf{Y} / (n - 2)}
$$

have an F-distribution? Explain.
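As a reminder of the general tool behind questions of this kind, and not as the intended solution, the standard distributional result for quadratic forms in normal vectors is sketched below. The symbols $A$, $B$, $r$, $\boldsymbol{\mu}$, $\delta$ and $\sigma^2$ (a generic common error variance) are introduced only for this sketch, and textbooks differ on whether the noncentrality parameter carries an extra factor of 1/2, so the definition of $\delta$ should be matched to the convention used in the course.

```latex
% Assumption: Y ~ N(mu, sigma^2 I), with A and B symmetric and idempotent.
\[
\operatorname{rank}(A) = r
\;\Longrightarrow\;
\frac{\mathbf{Y}^T A \mathbf{Y}}{\sigma^2} \sim \chi^2_r(\delta),
\qquad
\delta = \frac{\boldsymbol{\mu}^T A \boldsymbol{\mu}}{\sigma^2}.
\]
% Independence of two quadratic forms:
\[
AB = 0
\;\Longrightarrow\;
\mathbf{Y}^T A \mathbf{Y} \text{ and } \mathbf{Y}^T B \mathbf{Y} \text{ are independent.}
\]
```

A ratio of two such independent quadratic forms, each divided by its degrees of freedom, has a (possibly noncentral) F distribution only when the denominator is a central chi-square, so the key step is to evaluate $\boldsymbol{\mu}^T B \boldsymbol{\mu}$ under the model assumed to be correct.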

  1. Now, suppose that the model in problem 2 is correct, i.e. the coefficient of $X_i^2$ is zero ($\lambda_2 = 0$) and $\eta_i \sim \text{NID}(0, \sigma^2)$ for the model in problem 3.

(a) Find the distribution of $\mathbf{Y}^T (I - P_Z) \mathbf{Y}$, where $P_Z = Z(Z^T Z)^{-1} Z^T$ and Z is the model matrix from problem 3.

(b) Does

$$
\frac{\mathbf{Y}^T (P_X - P_{\mathbf{1}}) \mathbf{Y}}{\mathbf{Y}^T (I - P_Z) \mathbf{Y} / (n - 3)}
$$

have an F-distribution when the model in problem 2 is correct? Explain.

  1. Problems 3 and 4 illustrate some of the consequences of incorrectly specifying the model. When you have replication at some sets of values of the explanatory variables, as we do for the ice crystal data, you can construct a lack-of-fit test for a proposed model. We will apply a lack-of-fit test to the quadratic model from problem 3. Consider the larger model $Y_{ij} = \gamma_0 + \gamma_1 X_i + \gamma_2 X_i^2 + \alpha_j + \varepsilon_{ij}$, where $\varepsilon_{ij} \sim \text{NID}(0, \tau^2)$ and $Y_{ij}$ denotes the observed axial length for the j-th ice crystal measured at time $X_i$. This model can be expressed in matrix notation as $\mathbf{Y} = W\boldsymbol{\gamma} + \boldsymbol{\varepsilon}$, where