
Statistical Analysis: Tukey Method, t-Test, ANOVA, and Regression, Assignments of Systems Engineering

Solutions to various statistical analysis problems, including the calculation of t statistics for the Tukey method, the application of the paired t-test in a paired comparison design, the interpretation of results from a two-way analysis of variance (ANOVA), and the estimation of coefficients using regression analysis.

Typology: Assignments

Pre 2010

Uploaded on 08/05/2009

koofers-user-9xp

1. (Problem 2.7)

Source       Degrees of Freedom   Sum of Squares   Mean Squares
Block                 2                 520             260
Treatment             4                 498             124.5
Residual              8                  40               5
Total                14

(a) All entries can be determined as above.
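As a quick check of part (a), each mean square is the sum of squares divided by its degrees of freedom. The sketch below assumes b = 3 blocks and k = 5 treatments, which is what the degrees of freedom 2 and 4 imply; the variable names are illustrative, and the treatment F ratio is a derived quantity that does not appear in the table.

```python
# Randomized block design: fill in the ANOVA table from the sums of squares.
b, k = 3, 5  # blocks and treatments, inferred from df = 2 and df = 4 (assumption)

ss = {"Block": 520.0, "Treatment": 498.0, "Residual": 40.0}
df = {"Block": b - 1, "Treatment": k - 1, "Residual": (b - 1) * (k - 1)}

# Mean square = sum of squares / degrees of freedom
ms = {src: ss[src] / df[src] for src in ss}

# F ratio for treatments: treatment mean square over residual mean square
f_treatment = ms["Treatment"] / ms["Residual"]

total_df = b * k - 1
total_ss = sum(ss.values())

for src in ss:
    print(f"{src:<10} df={df[src]:>2}  SS={ss[src]:>7.1f}  MS={ms[src]:>7.2f}")
print(f"{'Total':<10} df={total_df:>2}  SS={total_ss:>7.1f}")
print(f"F (treatment) = {f_treatment:.2f}")
```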

(b) The t statistics for the Tukey method are calculated below:

Group    A(45)   B(58)   C(46)   D(45)   E(56)
A(45)            7.12    0.55    0       6.
B(58)                    6.57    7.12    1.
C(46)                            0.55    5.
D(45)                                    6.
E(56)

The critical value is

$$\frac{1}{\sqrt{2}}\, q_{k,(b-1)(k-1),\alpha} = \frac{1}{1.414}\, q_{5,8,0.01} = 4.68.$$

By comparing the t values with 4.68, the Tukey method declares that the following pairs are significantly different: A&B, A&E, B&C, B&D, C&E and D&E.
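The pairwise comparison can be reproduced with a short sketch. It assumes the treatment means shown in the table header (A = 45, B = 58, C = 46, D = 45, E = 56), b = 3 blocks, and the residual mean square 5 from Problem 2.7(a); the threshold 4.68 is taken directly from the solution rather than recomputed from a studentized range table.

```python
from itertools import combinations
from math import sqrt

means = {"A": 45, "B": 58, "C": 46, "D": 45, "E": 56}
b = 3             # observations per treatment mean (one per block), assumed
mse = 5.0         # residual mean square from the ANOVA table
threshold = 4.68  # q / sqrt(2), value quoted in the solution

# Standard error of a difference between two treatment means
se_diff = sqrt(mse * (1 / b + 1 / b))

# Absolute t statistic for every pair of treatments
t_stats = {
    (i, j): abs(means[j] - means[i]) / se_diff
    for i, j in combinations(means, 2)
}

significant = [pair for pair, t in t_stats.items() if t > threshold]

for (i, j), t in t_stats.items():
    flag = "significant" if t > threshold else ""
    print(f"{i} vs {j}: t = {t:.2f} {flag}")
```

Pairs whose statistic exceeds the threshold are collected in `significant`, which should match the six pairs listed in (b).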

(c) From (b), the Tukey method declares 6 pairs of treatments to be significantly different at level 0.01. The null hypothesis of the F test is that all treatment effects are equal. The conclusion of (b) shows that this null hypothesis does not hold at level 0.01. Hence the F test will reject the null hypothesis at level 0.01.

2. (Problem 2.8)

(a) In this experiment, our objective is to compare two catalysts, A and B, with respect to the yield of the chemical reaction. The catalysts are therefore the treatments. Raw material units are known to differ considerably from batch to batch, while the variation within a batch is much smaller. To obtain a better estimate of the error term and to increase the power of the statistical inference, several batches are used and both treatments are applied to raw material (our experimental units) from each batch. Blocking on batch eliminates this source of variation and so gives higher power for inferences about the treatments. Each batch is divided into two portions, one for catalyst A and the other for catalyst B. The treatments should be assigned to the two (homogeneous) portions at random, and the time order in which the treatments are applied should also be randomized. This design is a randomized block design with 2 treatments, that is, a paired comparison design.

(b) The correct t-test is the paired t-test since we are using a paired comparison design.

Batch     1    2    3    4    5    6
A         9   19   28   22   18
B        10   22   30   21   23
d_i       1    3    2   -1    5    4

With N = 6 batches we get:

$$\bar{d} = \frac{1}{N}\sum_{i=1}^{N} d_i = \frac{1}{6}\,(1 + 3 + \dots + 4) = 2.33$$

$$s_d = \left(\frac{1}{N-1}\sum_{i=1}^{N} (d_i - \bar{d})^2\right)^{1/2} = 2.16$$

and thus the paired t-statistic

$$t_{paired} = \sqrt{N}\,\bar{d}/s_d = 2.646.$$

Compare with the critical value $t_{N-1,\alpha/2} = t_{5,0.025} = 2.571$ for the α = 0.05 level.

Since the absolute value of the t-statistic is greater than the critical value (2.646 > 2.571) the paired t-test declares the two treatments as significantly different at α=0.05.
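The calculation in (b) can be checked with a few lines of standard-library Python. This is a minimal sketch that starts from the six differences d_i listed above; the critical value 2.571 is copied from the solution rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

d = [1, 3, 2, -1, 5, 4]  # differences d_i for the six batches
n = len(d)

d_bar = mean(d)          # sample mean of the differences
s_d = stdev(d)           # sample standard deviation (divisor N - 1)

# Paired t statistic: sqrt(N) * d_bar / s_d
t_paired = sqrt(n) * d_bar / s_d

t_crit = 2.571           # t_{5, 0.025}, quoted in the solution
reject = abs(t_paired) > t_crit

print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.2f}, t = {t_paired:.3f}, reject = {reject}")
```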

(c) 95% CI for the difference between A and B. Using our results from (b), the CI is given by:

$$\bar{d} \pm t_{N-1,\alpha/2}\,\frac{s_d}{\sqrt{N}} = \bar{d} \pm t_{5,0.025}\,\frac{s_d}{\sqrt{N}}.$$

So the 95% CI is: [0.0629; 4.5971].
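The interval can be recomputed from the differences d_i of part (b). The sketch below again takes the critical value 2.571 from the solution. Because it carries full precision, its limits differ slightly from the quoted ones, which follow from the rounded values 2.33 and 2.16.

```python
from math import sqrt
from statistics import mean, stdev

d = [1, 3, 2, -1, 5, 4]  # differences d_i for the six batches
n = len(d)
t_crit = 2.571           # t_{5, 0.025}, quoted in the solution

# Half-width of the interval: t * s_d / sqrt(N)
half_width = t_crit * stdev(d) / sqrt(n)
ci = (mean(d) - half_width, mean(d) + half_width)

print(f"95% CI: [{ci[0]:.4f}; {ci[1]:.4f}]")
```

The interval excludes zero, which agrees with the rejection in (b).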

3. (Problem 2.15)

Remark : It would have been desirable to include the interaction effect between tape speed and laser power. However, the degrees of freedom are not sufficient for this analysis, so I am only conducting a two-way ANOVA without interaction effect.

Two-way ANOVA: Strength versus Tape, Laser

Source   DF        SS        MS       F      P
Tape      2    48.919    24.459    9.32     0.
Laser     2   224.184   112.092   42.69     0.
Error     4    10.503     2.
Total     8   283.

From the ANOVA output we can see that there is a significant difference between the laser power levels, with a p-value of 0.002. The evidence for a difference between the tape speed levels is weaker, with a p-value of 0.03, but it is still significant at level α = 0.05.
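The F ratios and p-values can be recovered from the sums of squares in the table. The sketch below is a check rather than the original software output. It uses the closed form of the F survival function that holds when the numerator has exactly 2 degrees of freedom, P(F > f) = (1 + 2f/ν)^(-ν/2), so no statistics library is needed; the helper name `f_sf_num_df_2` is made up for this example.

```python
def f_sf_num_df_2(f, den_df):
    """Upper-tail probability of an F(2, den_df) distribution (closed form)."""
    return (1 + 2 * f / den_df) ** (-den_df / 2)

# Mean squares and error sum of squares from the ANOVA table
ms = {"Tape": 24.459, "Laser": 112.092}
ss_error, df_error = 10.503, 4
ms_error = ss_error / df_error

f_ratio = {src: ms[src] / ms_error for src in ms}
p_value = {src: f_sf_num_df_2(f_ratio[src], df_error) for src in ms}

for src in ms:
    print(f"{src:<6} F = {f_ratio[src]:.2f}  p = {p_value[src]:.4f}")
```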

Normal Probability Plot and Residual Plot:

[Figure: Normal Probability Plot of the Residuals (response is Strength); x-axis: Residual, y-axis: Percent]

[Figure: Residuals Versus the Fitted Values (response is Strength); x-axis: Fitted Value, y-axis: Residual]

The Normal Probability Plot shows that the residuals are reasonably normally distributed since the residuals do not deviate too much from the ideal line.

However, the plot of the residuals versus the fitted values shows that the variance of the residuals seems to decrease for higher fitted values. This suggests that the assumption of constant variance might be violated in our case. We should therefore consider transforming the response data before the analysis (see power transformation).

Remark: If we consider the two-way ANOVA in the context of two quantitative factors, the analysis is as follows.

We start by fitting only the lower-order terms, ignoring interaction terms that involve a quadratic component. Our fit is:

              Estimate  Std. Error  t value  Pr(>|t|)
Intercept      31.0322      0.4038   76.853  4.86e-06 ***
A.lin           8.6361      0.6994   12.348  0.00114  **
A.qua          -0.3810      0.6994   -0.545  0.
B.lin          -1.0465      0.6994   -1.496  0.
B.qua          -3.9001      0.6994   -5.577  0.01138  *
A.lin*B.lin     2.4700      1.2114    2.039  0.

We eliminate the interaction and the quadratic effect associated with A (laser power). The final ANOVA is

            Df  Sum Sq  Mean Sq    F value     Pr(>F)
Intercept    1  8667.0   8667.0  3961.6538  1.916e-08 ***
A.lin        1   223.7    223.7   102.2746  0.0001620 ***
B.lin        1     3.3      3.3     1.5018  0.
B.qua        1    45.6     45.6    20.8587  0.0060176 **
Residuals    5    10.9      2.

Residual analysis yielded nothing unusual.
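The solution does not say how the columns A.lin, A.qua, B.lin and B.qua were coded. One common choice, assumed here, is orthonormal polynomial contrasts for three equally spaced levels. The sketch below builds those columns for a 3 x 3 design and checks their sums of squares. Under this assumption the linear and quadratic columns each have sum of squares 3 and the A.lin*B.lin column has sum of squares 1, which would explain why the interaction standard error (1.2114) is sqrt(3) times the main-effect standard error (0.6994).

```python
from math import sqrt

# Orthonormal polynomial contrasts for three equally spaced levels (assumed coding)
linear = [-1 / sqrt(2), 0.0, 1 / sqrt(2)]
quadratic = [1 / sqrt(6), -2 / sqrt(6), 1 / sqrt(6)]

# Full 3 x 3 design: every combination of level index for A and B
runs = [(i, j) for i in range(3) for j in range(3)]

columns = {
    "A.lin": [linear[i] for i, j in runs],
    "A.qua": [quadratic[i] for i, j in runs],
    "B.lin": [linear[j] for i, j in runs],
    "B.qua": [quadratic[j] for i, j in runs],
    "A.lin*B.lin": [linear[i] * linear[j] for i, j in runs],
}

def dot(u, v):
    """Inner product of two equal-length columns."""
    return sum(x * y for x, y in zip(u, v))

sum_sq = {name: dot(col, col) for name, col in columns.items()}

for name, value in sum_sq.items():
    print(f"{name:<12} sum of squares = {value:.3f}")
```

Because the columns are mutually orthogonal, dropping a term such as A.qua does not change the estimates of the remaining coefficients.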

4. (Problem 2.20)

The ANOVA table is

Source     DF   Sum of Squares   Mean Square   F Ratio   Prob>F
Model      18         20132428       1118468             <.
Error       8            97028         12129
C Total    26         20229456

Source              Nparm   DF   Sum of Squares   F Ratio
length                  2    2
amplitude               2    2
load                    2    2
length*amplitude        4    4
length*load             4    4
amplitude*load          4    4

Prob>F: <. <. <. <.

Judging from the p-values of the six effects, all the main effects and all the two-factor interaction effects are significant. The most significant effects are the three main effects and the length*amplitude interaction.

Log transformation looks reasonable. It separates the linear effects from other effects very well.

Regression output :

Coefficients:
             Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)  12.66933     0.07461  169.816  < 2e-16  ***
Al            1.66477     0.09137   18.219  1.37e-12 ***
Bl           -1.26198     0.09137  -13.811  1.14e-10 ***
Cl           -0.78499     0.09137   -8.591  1.36e-07 ***
Aq           -0.05713     0.05275   -1.083  0.
Bq            0.01615     0.05275    0.306  0.
Cq           -0.04497     0.05275   -0.852  0.
AB           -0.07648     0.11191   -0.683  0.
AC           -0.13683     0.11191   -1.223  0.
BC           -0.04167     0.11191   -0.372  0.

We find that the linear effects are significant. Next we run the usual ANOVA, treating A, B and C as categorical factors:

                     Df  Sum Sq  Mean Sq   F value     Pr(>F)
factor(A)             2  50.062   25.031  301.3674  2.944e-08 ***
factor(B)             2  28.681   14.340  172.6542  2.629e-07 ***
factor(C)             2  11.201    5.600   67.4276  9.835e-06 ***
factor(A):factor(B)   4   1.605    0.401    4.8301  0.02815   *
factor(A):factor(C)   4   0.543    0.136    1.6346  0.
factor(B):factor(C)   4   0.058    0.015    0.1754  0.
Residuals             8   0.664    0.

Here, too, we find the main effects to be significant. In addition, the AB interaction is marginally significant.
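The p-values in this last ANOVA table can be checked from the F values alone. The sketch below is an independent check, not the routine that produced the output. It relies on the finite-sum form of the F survival function, which applies when the numerator degrees of freedom are even (2 and 4 here); the function name `f_sf_even_num_df` is hypothetical.

```python
def f_sf_even_num_df(f, num_df, den_df):
    """Upper-tail probability of F(num_df, den_df) for even num_df.

    With a = num_df / 2, b = den_df / 2 and x = num_df*f / (num_df*f + den_df),
    P(F > f) = (1 - x)**b * sum_{j=0}^{a-1} C(b + j - 1, j) * x**j.
    """
    if num_df % 2 != 0:
        raise ValueError("closed form requires even numerator df")
    a = num_df // 2
    b = den_df / 2
    x = num_df * f / (num_df * f + den_df)
    term, total = 1.0, 1.0
    for j in range(1, a):
        term *= (b + j - 1) / j * x  # builds C(b + j - 1, j) * x**j recursively
        total += term
    return (1 - x) ** b * total

den_df = 8  # residual degrees of freedom
# (numerator df, F value) copied from the ANOVA table
effects = {
    "factor(A)": (2, 301.3674),
    "factor(B)": (2, 172.6542),
    "factor(C)": (2, 67.4276),
    "factor(A):factor(B)": (4, 4.8301),
}

p_value = {name: f_sf_even_num_df(f, df, den_df) for name, (df, f) in effects.items()}

for name, p in p_value.items():
    print(f"{name:<20} p = {p:.4g}")
```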