










The steps and formulas for conducting a one-way analysis of variance (ANOVA) for single-factor studies. It includes examples of how to calculate sum of squares, mean squares, degrees of freedom, and the F-statistic for both treatment and error sources. It also discusses the assumptions of ANOVA and how to check for violations.











We look at testing whether or not the means from two or more populations are the same or different using the Analysis of Variance (ANOVA) method. From a “big picture” point of view, we continue to look at the statistical inference of (large n) multiple–sample mean problems.
                       mean (μ)                    variance (σ^2)   proportion (π)
one sample, large n    3.7, 3.8, 3.9, 3.10, 4.6    4.4              6.
one sample, small n    4.3, 4.
two sample, large n    3.11                        4.4              6.
two sample, small n    4.
multiple sample        chapters 7, 8, 9            not done         6.2, 6.
We look at notation related to Completely Randomized Design (CRD) experiments.
Exercise 8.1 (Notation For Completely Randomized Design (CRD))
drug 1   5.90  5.92  5.91  5.89  5.88   ȳ1+ ≈ 5.90
drug 2   5.50  5.50                     ȳ2+ = 5.50
drug 3   5.01  5.00  4.99  4.98  5.02   ȳ3+ ≈ 5.00
188 Chapter 8. Single–Factor Studies: One–Way ANOVA (ATTENDANCE 10)
(a) t = (circle one) 1 / 2 / 3 treatments
(b) N = (circle one) 2 / 5 / 12 responses
(c) n1 = (circle one) 2 / 5 / 12 drug 1 responses
(d) n2 = (circle one) 2 / 5 / 12 drug 2 responses
(e) drug 1, replication 1 response, y11 = (circle one) 5.50 / 5.90 / 5.92
(f) drug 3, replication 4 response, y34 = (circle one) 4.98 / 5.90 / 5.92
(g) sum of drug 1 responses, y1+ = (circle one) 5.50 / 5.90 / 29.5
(h) sum of drug 2 responses, y2+ = (circle one) 4.98 / 5.90 / 11.00
(i) average of drug 1 responses, ȳ1+ = (circle one) 5.50 / 5.90 / 29.5
(j) average of drug 3 responses, ȳ3+ = (circle one) 4.98 / 5.00 / 11.00
(k) variance of drug 1 responses, s1^2 = (circle one) 0.00025 / 0.0158 / 29.5
    (to get s1^2, use calculator: type responses into L1, then STAT CALC ENTER, then square Sx)
(l) variance of drug 3 responses, s3^2 = (circle one) 0.00025 / 0.436 / 11.00
(m) To say the average responses are not the same (circle one) is / is not the same thing as saying the average responses are all different from one another.
(n) If all the average responses to the three drugs are the same, then they (circle one) are / are not equal to the grand average.
(o) The grand average, ȳ++ ≈ 5.46, is given by (circle none, one or more)
    i. adding all twelve responses and dividing by 12.
    ii. (ȳ1+ + ȳ2+ + ȳ3+)/12.
    iii. (5ȳ1+ + 2ȳ2+ + 5ȳ3+)/12.
(p) The grand variance, s^2 ≈ 0.0857, is given by (circle none, one or more)
    i. determining the variance of the twelve responses.
    ii. (ȳ1+ + ȳ2+ + ȳ3+)/12.
    iii. [(5 − 1)s1^2 + (2 − 1)s2^2 + (5 − 1)s3^2]/(12 − 3).
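The notation in this exercise can be checked numerically. The following is a minimal Python sketch, assuming the twelve drug responses listed in the table above; the variable names (`responses`, `sums`, `means`, and so on) are illustrative and do not come from the workbook. It computes the treatment sums y_{i+}, the treatment averages, the sample variances s_i^2 (dividing by n_i − 1), and the grand average in two ways.

```python
from statistics import variance

# Responses from the drug table in this exercise (one list per treatment).
responses = {
    "drug 1": [5.90, 5.92, 5.91, 5.89, 5.88],
    "drug 2": [5.50, 5.50],
    "drug 3": [5.01, 5.00, 4.99, 4.98, 5.02],
}

t = len(responses)                                   # number of treatments
n = {name: len(ys) for name, ys in responses.items()}  # replications n_i
N = sum(n.values())                                  # total number of responses

sums = {name: sum(ys) for name, ys in responses.items()}       # y_{i+}
means = {name: sums[name] / n[name] for name in responses}     # average of each treatment
variances = {name: variance(ys) for name, ys in responses.items()}  # s_i^2, divides by n_i - 1

# Grand average computed directly from all responses ...
all_responses = [y for ys in responses.values() for y in ys]
grand_mean_direct = sum(all_responses) / N
# ... and as the n_i-weighted average of the treatment averages.
grand_mean_weighted = sum(n[name] * means[name] for name in responses) / N

print(t, N, n)
print({name: round(s, 2) for name, s in sums.items()})
print({name: round(m, 2) for name, m in means.items()})
print(round(grand_mean_direct, 2), round(grand_mean_weighted, 2))
```

The two grand averages agree because weighting each treatment average by its n_i simply recovers the total of all responses.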
sewers parks city hall 3 1 5 5 3 8 7 2 9 4 8 10
D. H0: means the same versus H1: means different
ii. Test. After some effort, the ANOVA table^1 is given by,

Source              Degrees of Freedom   Sum Of Squares   Mean Squares
Treatment (Drugs)   2                    2.033            1.
Error               12                   0.0022           0.
Total               14                   2.

and so the test statistic^2 is

F test statistic = MS[T] / MS[E] = (circle one) 1.02 / 123 / 5647.2.

The upper critical value at α = 0.05, with t − 1 = 3 − 1 = 2 and N − t = 15 − 3 = 12 degrees of freedom, is (circle one) 3.22 / 3.89 / 4.82
(Use PRGM INVF ENTER 2 ENTER 12 ENTER 0.95 ENTER)
iii. Conclusion. Since the test statistic, 5647.2, is larger than the critical value, 3.89, we (circle one) accept / reject the null hypothesis that the average patient responses to the three drugs are the same.

(b) P–Value Versus Level of Significance.
i. The statement of the test is (check none, one or more):
A. H0: μ1 = μ2 = μ3 versus H1: μ1 ≠ μ2, μ1 = μ3.
B. H0: μ1 = μ2 = μ3 versus H1: μ1 ≠ μ3, μ1 ≠ μ2.
C. H0: μ1 = μ2 = μ3 versus H1: at least one μi ≠ μj, i ≠ j; i, j = 1, 2, 3.
D. H0: means the same versus H1: means different
ii. Test. Since the test statistic is F = 5647.2, the p–value, with t − 1 = 3 − 1 = 2 and N − t = 15 − 3 = 12 degrees of freedom, is given by

p–value = P(F ≥ 5647.2)

which equals (circle one) 0.00 / 0.35 / 0.43.
(Use 2nd DISTR 9:Fcdf(5647.2,E99,2,12).)
The level of significance is 0.05.
iii. Conclusion. Since the p–value, 0.00, is smaller than the level of significance, 0.05, we (circle one) accept / reject the null hypothesis that the average patient responses to the three drugs are the same.

^1 Type data into L1, L2 and L3, then STAT TESTS ANOVA(L1,L2,L3).
^2 Due to round–off error, the F calculated here is not the same as the one calculated by the calculator.
Section 3. Analysis of Variance (ANOVA) (ATTENDANCE 10) 191
drug 1   5.90  5.92  5.91  5.89  5.88   ȳ1+ ≈ 5.90
drug 2   5.50  5.50                     ȳ2+ = 5.50
drug 3   5.01  5.00  4.99  4.98  5.02   ȳ3+ ≈ 5.00
Test if at least two of the three average patient responses to the drug are different at α = 0.05.
(a) Test Statistic Versus Critical Value.
i. The statement of the test is (check none, one or more):
A. H0: μ1 = μ2 = μ3 versus H1: μ1 ≠ μ2, μ1 = μ3.
B. H0: μ1 = μ2 = μ3 versus H1: μ1 ≠ μ3, μ1 ≠ μ2.
C. H0: μ1 = μ2 = μ3 versus H1: at least one μi ≠ μj, i ≠ j; i, j = 1, 2, 3.
D. H0: means the same versus H1: means different
ii. Test. After some effort, the ANOVA table is given by,

Source              Degrees of Freedom   Sum Of Squares   Mean Squares
Treatment (Drugs)   2                    2.029            1.
Error               9                    0.002            0.
Total               11                   2.

and so the test statistic is

F test statistic = MS[T] / MS[E] = (circle one) 1.02 / 123 / 4612.3.

The upper critical value at α = 0.05, with t − 1 = 3 − 1 = 2 and N − t = 12 − 3 = 9 degrees of freedom, is (circle one) 3.22 / 3.89 / 4.26
(Use PRGM INVF ENTER 2 ENTER 9 ENTER 0.95 ENTER)
iii. Conclusion. Since the test statistic, 4612.3, is larger than the critical value, 4.26, we (circle one) accept / reject the null hypothesis that the average patient responses to the three drugs are the same.

(b) P–Value Versus Level of Significance.
i. The statement of the test is (check none, one or more):
A. H0: μ1 = μ2 = μ3 versus H1: μ1 ≠ μ2, μ1 = μ3.
B. H0: μ1 = μ2 = μ3 versus H1: μ1 ≠ μ3, μ1 ≠ μ2.
Section 4. Analysis of a CRD: Computational Details (ATTENDANCE 10) 193
and so the test statistic is

F test statistic = MS[T] / MS[E] = (circle one) 12.76 / 123 / 4612.3.

The upper critical value at α = 0.05, with k − 1 = 3 − 1 = 2 and N − k = 12 − 3 = 9 degrees of freedom, is (circle one) 3.22 / 3.89 / 4.26
(Use PRGM INVF ENTER 2 ENTER 9 ENTER 0.95 ENTER)
iii. Conclusion. Since the test statistic, 12.76, is larger than the critical value, 4.26, we (circle one) accept / reject the null hypothesis that the average numbers of rats in the three locations are the same.

(b) P–Value Versus Level of Significance.
i. The statement of the test is (check none, one or more):
A. H0: μ1 = μ2 = μ3 versus H1: μ1 ≠ μ2, μ1 = μ3.
B. H0: μ1 = μ2 = μ3 versus H1: μ1 ≠ μ3, μ1 ≠ μ2.
C. H0: μ1 = μ2 = μ3 versus H1: at least one μi ≠ μj, i ≠ j; i, j = 1, 2, 3.
D. H0: means the same versus H1: means different
ii. Test. Since the test statistic is F = 12.76, the p–value, with k − 1 = 3 − 1 = 2 and N − k = 12 − 3 = 9 degrees of freedom, is given by

p–value = P(F ≥ 12.76)

which equals (circle one) 0.002 / 0.35 / 0.43.
(Use 2nd DISTR 9:Fcdf(12.76,E99,2,9).)
The level of significance is 0.05.
iii. Conclusion. Since the p–value, 0.002, is smaller than the level of significance, 0.05, we (circle one) accept / reject the null hypothesis that the average numbers of rats in the three locations are the same.
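The critical values and p–values quoted in these exercises can be cross-checked without a calculator. The sketch below is not the workbook's method: it relies on the fact that, when the numerator has exactly 2 degrees of freedom (three treatments), the upper tail of the F distribution has the closed form P(F ≥ f) = (1 + 2f/d)^(−d/2), where d is the denominator degrees of freedom. The function names are hypothetical, and the shortcut does not apply for other numerator degrees of freedom.

```python
def f_upper_tail_num_df2(f_value, den_df):
    """P(F >= f_value) for an F distribution with 2 and den_df degrees of freedom.

    Valid only when the numerator degrees of freedom equal 2.
    """
    return (1.0 + 2.0 * f_value / den_df) ** (-den_df / 2.0)


def f_critical_num_df2(alpha, den_df):
    """Upper critical value: the f with P(F >= f) = alpha, numerator df = 2."""
    return (den_df / 2.0) * (alpha ** (-2.0 / den_df) - 1.0)


# Rat example: F = 12.76 with 2 and 9 degrees of freedom.
p_value = f_upper_tail_num_df2(12.76, 9)
crit_9 = f_critical_num_df2(0.05, 9)
# Drug example with 15 responses: 2 and 12 degrees of freedom.
crit_12 = f_critical_num_df2(0.05, 12)

print(round(p_value, 3), round(crit_9, 2), round(crit_12, 2))
```

For general degrees of freedom a statistical library (or the calculator's Fcdf) is needed, since no such simple closed form exists.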
The observed F is

\[ F = \frac{MS[T]}{MS[E]} \]

where

\[ MS[T] = \frac{SS[T]}{t - 1}, \qquad MS[E] = \frac{SS[E]}{N - t} \]

where t is the number of treatments and N is the total number of replications, and where

\[ SS[T] = \sum_{i=1}^{t} n_i \left( \bar{Y}_{i+} - \bar{Y}_{++} \right)^2 = \sum_{i=1}^{t} \left( \frac{Y_{i+}^2}{n_i} \right) - CM \]

where \( CM = Y_{++}^2 / N \) is called the correction for the mean and \( n_i \) is the number of replications in each treatment and

\[ SS[E] = \sum_{i=1}^{t} \sum_{j=1}^{n_i} \left( Y_{ij} - \bar{Y}_{i+} \right)^2 = \sum_{i=1}^{t} \sum_{j=1}^{n_i} Y_{ij}^2 - \sum_{i=1}^{t} \frac{Y_{i+}^2}{n_i} = \sum_{i=1}^{t} \left( (n_i - 1) S_i^2 \right) = SS[TOT] - SS[T] \]

where

\[ SS[TOT] = \sum_{i=1}^{t} \sum_{j=1}^{n_i} Y_{ij}^2 - CM \]
A summary of this information^3 can be displayed in the following ANOVA table.
Source      Degrees of Freedom   Sum Of Squares   Mean Squares
Treatment   t − 1                SS[T]            MS[T]
Error       N − t                SS[E]            MS[E]
Total       N − 1                SS[TOT]
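The formulas of this section can be applied to raw data as follows. This is a rough Python sketch, assuming the twelve drug responses from the Section 3 exercise; the function name `one_way_anova` and the dictionary keys are illustrative only. It evaluates SS[T] both from its definition and from the shortcut with the correction for the mean CM, so the two forms can be compared. The F it reports is computed without intermediate rounding, so it need not match a hand calculation that rounds the mean squares first.

```python
def one_way_anova(groups):
    """One-way ANOVA quantities for a list of treatment samples (lists of numbers)."""
    t = len(groups)
    ns = [len(g) for g in groups]
    N = sum(ns)
    totals = [sum(g) for g in groups]            # Y_{i+}
    grand_total = sum(totals)                    # Y_{++}
    grand_mean = grand_total / N
    cm = grand_total ** 2 / N                    # correction for the mean

    # SS[T]: definition, then shortcut formula.
    ss_t_def = sum(n * (tot / n - grand_mean) ** 2 for n, tot in zip(ns, totals))
    ss_t = sum(tot ** 2 / n for n, tot in zip(ns, totals)) - cm

    # SS[TOT] by shortcut; SS[E] by definition and as SS[TOT] - SS[T].
    ss_tot = sum(y ** 2 for g in groups for y in g) - cm
    ss_e_def = sum((y - tot / n) ** 2
                   for g, n, tot in zip(groups, ns, totals) for y in g)
    ss_e = ss_tot - ss_t

    ms_t = ss_t / (t - 1)
    ms_e = ss_e / (N - t)
    return {
        "df": (t - 1, N - t, N - 1),
        "ss_t": ss_t, "ss_t_def": ss_t_def,
        "ss_e": ss_e, "ss_e_def": ss_e_def,
        "ss_tot": ss_tot,
        "ms_t": ms_t, "ms_e": ms_e,
        "F": ms_t / ms_e,
    }


drug_groups = [
    [5.90, 5.92, 5.91, 5.89, 5.88],   # drug 1
    [5.50, 5.50],                     # drug 2
    [5.01, 5.00, 4.99, 4.98, 5.02],   # drug 3
]
table = one_way_anova(drug_groups)
for key, value in table.items():
    print(key, value)
```

The shortcut forms subtract two large, nearly equal numbers, so with very small error sums of squares the definitional forms are numerically safer.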
Exercise 8.3 (Notation, the ANOVA table and F )
If \( y_{1+} = \sum_{j=1}^{4} y_{1j} = 23 \), \( \sum_{j=1}^{4} y_{1j}^2 = 2300 \), \( n_1 = 4 \), \( y_{2+} = \sum_{j=1}^{4} y_{2j} = 122 \), \( \sum_{j=1}^{4} y_{2j}^2 = 23040 \), \( n_2 = 6 \), \( y_{3+} = \sum_{j=1}^{4} y_{3j} = 32 \), \( \sum_{j=1}^{4} y_{3j}^2 = 4300 \), and \( n_3 = 6 \), then calculate F.

(a) y++ = 23 + 122 + 32 = (circle one) 145 / 177 / 256

^3 The TI–83 calculator is able to determine the ANOVA table from “raw” data, but is not able to determine the ANOVA table from “summarized” data, as given in this section.
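Because the calculator cannot build the table from summarized data, a short script is one way to check the hand computation. The sketch below is illustrative only: the function name `anova_from_summary` is hypothetical, and it takes the treatment totals, sums of squared responses, and replication counts exactly as quoted in this exercise, then applies the shortcut formulas with the correction for the mean.

```python
def anova_from_summary(totals, sums_of_squares, ns):
    """ANOVA table entries from per-treatment totals, sums of squares and sizes."""
    t = len(ns)
    N = sum(ns)
    grand_total = sum(totals)                    # y_{++}
    cm = grand_total ** 2 / N                    # correction for the mean
    ss_tot = sum(sums_of_squares) - cm
    ss_t = sum(tot ** 2 / n for tot, n in zip(totals, ns)) - cm
    ss_e = ss_tot - ss_t
    ms_t = ss_t / (t - 1)
    ms_e = ss_e / (N - t)
    return {
        "grand_total": grand_total, "N": N, "CM": cm,
        "df": (t - 1, N - t, N - 1),
        "ss_t": ss_t, "ss_e": ss_e, "ss_tot": ss_tot,
        "ms_t": ms_t, "ms_e": ms_e, "F": ms_t / ms_e,
    }


# Summary values as quoted in the exercise.
result = anova_from_summary(
    totals=[23, 122, 32],
    sums_of_squares=[2300, 23040, 4300],
    ns=[4, 6, 6],
)
for key, value in result.items():
    print(key, value)
```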
Source              Degrees of Freedom   Sum Of Squares   Mean Squares
Treatment (Drugs)   2                    2.033            1.
Error               12                   0.0022           0.
Total               14                   2.

(Confirm this by typing the three drug response data into L1, L2, and L3 respectively, then type STAT TESTS F:ANOVA(L1, L2, L3) ENTER.)
Treatment                Oak    Birch   Maple
replications, ni         5      3       7
mean, ȳi                 4.67   3.53    1.
standard deviation, si   1.06   0.27    0.

(a) Since ȳ++ = (n1 ȳ1+ + n2 ȳ2+ + n3 ȳ3+)/N = [5(4.67) + 3(3.53) + 7(1.40)]/15 = (circle one) 1.916 / 2.916 / 3.916.
(b) SS[T] = n1(ȳ1+ − ȳ++)^2 + n2(ȳ2+ − ȳ++)^2 + n3(ȳ3+ − ȳ++)^2 = 5(4.67 − 2.916)^2 + 3(3.53 − 2.916)^2 + 7(1.40 − 2.916)^2 = (circle one) 31.60 / 32.60 / 33.60.
(c) SS[E] = (n1 − 1)s1^2 + (n2 − 1)s2^2 + (n3 − 1)s3^2 = (5 − 1)1.06^2 + (3 − 1)0.27^2 + (7 − 1)1.4^2 = (circle one) 4.17 / 6.21 / 16.40.
(d) MS[T] = SS[T]/(t − 1) = 32.60/(3 − 1) = (circle one) 15.3 / 16.3 / 17.3
(e) MS[E] = SS[E]/(N − t) = 16.40/(15 − 3) = (circle one) 0.417 / 0.517 / 6.21
(f) F = MS[T]/MS[E] = 16.3/6.21 = (circle one) 2.62 / 32.5 / 33.5

and the ANOVA table is

Source               Degrees of Freedom   Sum Of Squares   Mean Squares
Treatment (Leaves)   2                    32.6             16.
Error                12                   16.40            6.
Total                14                   49.
Not covered.
Section 6. Checking For Violations of Assumptions (ATTENDANCE 10) 197
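The ANOVA assumptions for a completely randomized design (CRD) are:
(1) the responses within each treatment are normally distributed,
(2) the variance of the responses is the same (constant) for every treatment,
(3) the responses are independent of one another.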
If these conditions are satisfied, an ANOVA procedure will give correct results. We will find out how to determine if the data satisfies the first two conditions using q–q plots and e ∨ p plots, respectively^4. More than just telling us that the data does not satisfy the ANOVA assumptions, the plots can often tell us in exactly what way the data does not satisfy the ANOVA assumptions and, in fact, tell us how to transform the data so that it does satisfy the ANOVA assumptions.
See Lab 10: Q–Q Plots and e ∨ p Plots For ANOVA.
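As a rough illustration of what such a plot displays, the sketch below assumes that "e ∨ p" means residual (e) versus predicted value (p), with the predicted value for each response taken to be its treatment average; that reading is an assumption, not something stated in these notes. It uses the twelve drug responses from the Section 3 exercise, and all names are illustrative. It lists the (predicted, residual) pairs that would be plotted and the variance of the residuals within each treatment.

```python
from statistics import variance

# Twelve drug responses from the Section 3 exercise.
groups = {
    "drug 1": [5.90, 5.92, 5.91, 5.89, 5.88],
    "drug 2": [5.50, 5.50],
    "drug 3": [5.01, 5.00, 4.99, 4.98, 5.02],
}

pairs = []      # (predicted, residual) points, one per response
spread = {}     # treatment name -> (treatment average, residual variance)
for name, ys in groups.items():
    predicted = sum(ys) / len(ys)            # p: the treatment average
    residuals = [y - predicted for y in ys]  # e: response minus prediction
    pairs.extend((predicted, e) for e in residuals)
    spread[name] = (predicted, variance(ys))

for p, e in pairs:
    print(f"predicted={p:.3f}  residual={e:+.3f}")
print(spread)
```

If the residual variance changes systematically with the treatment average, the equal-variance assumption is in doubt.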
Exercise 8.4 (ANOVA Assumptions For Completely Randomized Design (CRD))
drug 1   5.90  5.92  5.91  5.89  5.
drug 2   5.51  5.50  5.50  5.49  5.
drug 3   5.01  5.00  4.99  4.98  5.

(a) Store This Data in a file called DRUG.
i. Type data into L1, L2 and L3
ii. 2nd MEM 8:Group ENTER 1:Create New
iii. type in file name, such as “DRUG”, say, using green letters above the keys, then ENTER
iv. arrow down to 4:List... ENTER and select L1 ENTER, arrow down, L2 ENTER, arrow down and finally L3, then DONE ENTER
(b) Retrieving Stored Data from DRUG.
i. 2nd MEM 8:Group ENTER

^4 Both plots can also, often, be used to detect dependence. Dependence may occur in different ways; for example, if there are confounding factors (dealt with by using randomization) or if the data is serially correlated.
[Figure 8.2 (e ∨ p Plots). Panels: (a) σ^2 = μ(1 − μ) (arcsin); (b) σ^2 = k (constant); (c) σ^2 = kμ (square root); (d) σ^2 = kμ^2 (logarithm); (e) σ^2 = kμ^−2 (power example).]

The e ∨ p plot for the data given above indicates variance is not constant with respect to the mean μ, but varies in the following way: (circle one) σ^2 = μ(1 − μ) / σ^2 = k / σ^2 = kμ / σ^2 = kμ^2 / σ^2 = kμ^−2 / none of these.
(Type drug 1, 2 and 3 responses in L1, L2, L3 respectively, then PRGM EVPPLOT ENTER 3 ENTER)
(c) It (circle one) is / is not possible to use ANOVA on this data because the variance is not constant.
(d) True / False In fact, since the e ∨ p plot does not appear to show any of the specified e ∨ p plot patterns, it is not clear how to transform the data to satisfy the ANOVA assumptions.
grade 1 soil   0.306  0.363  0.
grade 2 soil   0.787  0.899  1.272  1.
grade 3 soil   1.634  1.682  5.

(a) Normality? The q–q plot for the data given above indicates (circle one) heavy tail / light tail / normality / left skew / right skew / none of these
(b) Equal Variance? The e ∨ p plot for the data given above indicates variance is related to the mean μ, in the following way: (circle one) σ^2 = μ(1 − μ) / σ^2 = k / σ^2 = kμ / σ^2 = kμ^2 / σ^2 = kμ^−2 / none of these.
(c) It (circle one) is / is not possible to use ANOVA on this data because of the left skew and because the variance is not constant.
(d) True / False Since the e ∨ p plot appears to show σ^2 = kμ^2, we can use the logarithm transformation, g(y) = ln y, on the data to force the variance to become constant, σ^2 = k, to satisfy one of the ANOVA assumptions. Fortunately, this transformation also tends to make the data more closely follow a normal distribution, another of the required ANOVA assumptions.
pesticide 1   12  13  15  15  15  16  16  17  18  22
pesticide 2   28  30  31  32  33  33  37  37  40  41
pesticide 3   42  48  48  54  54  55  56  60  63  64
pesticide 4   70  71  72  73  75  75  77  79  79  80
pesticide 5   92  94  95  96  96  97  97  98  98  99

(a) Normality? The q–q plot for the data given above indicates (circle one) heavy tail / light tail / normality / left skew / right skew / none of these
(b) Equal Variance? The e ∨ p plot for the data given above indicates variance is related to the mean μ, in the following way: (circle one) σ^2 = μ(1 − μ) / σ^2 = k / σ^2 = kμ / σ^2 = kμ^2 / σ^2 = kμ^−2 / none of these.
(c) It (circle one) is / is not possible to use ANOVA on this data because the variance is not constant.
(d) True / False Since the e ∨ p plot appears to show σ^2 = kμ(1 − μ), we can use the arcsin transformation, g(y) = sin^−1 √y, on the data to force the variance to become constant, σ^2 = k, to satisfy one of the ANOVA assumptions. Fortunately, this transformation also tends to make the data more closely follow a normal distribution, another of the required ANOVA assumptions. In fact, the shape of the e ∨ p plot indicates the data from each of the five plots has been sampled from five different binomial distributions^5.
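Why each transformation is paired with a particular variance pattern can be seen from a first-order (delta-method) approximation. This is a standard textbook argument sketched here for illustration, not a derivation taken from these notes. For a smooth transformation g applied to a response Y with mean μ and variance σ^2,

```latex
\[
\operatorname{Var}[g(Y)] \approx \left[ g'(\mu) \right]^2 \sigma^2 .
\]
% Logarithm, when \sigma^2 = k\mu^2:
\[
g(y) = \ln y, \quad g'(\mu) = \frac{1}{\mu}
\quad\Longrightarrow\quad
\operatorname{Var}[g(Y)] \approx \frac{1}{\mu^2}\, k\mu^2 = k .
\]
% Square root, when \sigma^2 = k\mu:
\[
g(y) = \sqrt{y}, \quad g'(\mu) = \frac{1}{2\sqrt{\mu}}
\quad\Longrightarrow\quad
\operatorname{Var}[g(Y)] \approx \frac{1}{4\mu}\, k\mu = \frac{k}{4} .
\]
% Arcsin of the square root (y a proportion, angle in radians), when \sigma^2 = k\mu(1-\mu):
\[
g(y) = \sin^{-1}\sqrt{y}, \quad g'(\mu) = \frac{1}{2\sqrt{\mu(1-\mu)}}
\quad\Longrightarrow\quad
\operatorname{Var}[g(Y)] \approx \frac{k\mu(1-\mu)}{4\mu(1-\mu)} = \frac{k}{4} .
\]
```

In each case the approximate variance of the transformed response no longer depends on μ, which is the constant-variance condition ANOVA requires.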
grade 1 soil   0.306  0.363  0.
grade 2 soil   0.787  0.899  1.272  1.
grade 3 soil   1.634  1.682  5.

(a) Use the logarithm transformation (g(y) = ln y) to convert this data into one with equal variance (fill in the blanks):

grade 1 soil   -1.184  -1.013  -0.
grade 2 soil   -0.2395  0.24059  0.
grade 3 soil   0.49103  0.

(b) Normality? The q–q plot for the data given above indicates (circle one) heavy tail / light tail / normality / left skew / right skew / none of these
(c) Equal Variance? The e ∨ p plot for the data given above indicates variance is related to the mean μ, in the following way: (circle one) σ^2 = kμ(1 − μ) / σ^2 = k / σ^2 = kμ / σ^2 = kμ^2 / σ^2 = kμ^−2 / none of these.
(d) Even though there still appears to be a left skew, complete the ANOVA table (fill in the blanks)

Source             Degrees of Freedom   Sum Of Squares   Mean Squares
Treatment (Soil)
Error
Total

where F = 16.43 and the p–value is essentially zero (in other words, the soil–water fluxes are different for different grades of soil). Although this data is not the same as the original data, it is safe to say the soil–water fluxes are different.
(Type the data into L1, L2, L3, then type STAT TESTS F:ANOVA(L1, L2, L3) ENTER.)
y): Pest–Free Apples. The number of pest–free apples (out of 100 per tree) produced by apple plants in five differently applied pesticide plots of land is tabulated below. The e ∨ p plot tells us the data from each of the five plots has been sampled from five different binomial distributions.
pesticide 1   12  13  15  15  15  16  16  17  18  22
pesticide 2   28  30  31  32  33  33  37  37  40  41
pesticide 3   42  48  48  54  54  55  56  60  63  64
pesticide 4   70  71  72  73  75  75  77  79  79  80
pesticide 5   92  94  95  96  96  97  97  98  98  99
Section 7. Analysis of Transformed Data (ATTENDANCE 10) 203
(a) Use the arcsin transformation (g(y) = sin^−1 √(y/100), where MODE is set to DEGREE) to convert this data into one with equal variance (fill in the blanks):

pesticide 1   20.2  21.1  22.8  22.8  22.8  23.6  23.6  24.4  25.1  27.
pesticide 2   31.9  33.2  34.5  35.1  35.1  37.5  37.5  39.2  39.
pesticide 3   40.4  43.9  43.9  47.3  47.3  47.9  48.4  50.8  52.5  53.
pesticide 4   56.8  57.4  58.1  58.7  60  61.3  62.7  62.7  63.
pesticide 5   73.6  75.8  77.1  78.5  78.5  80.0  80.0  81.9  81.9  84.

(b) Normality? The q–q plot for the data given above indicates (circle one) heavy tail / light tail / normality / left skew / right skew / none of these
(c) Equal Variance? The e ∨ p plot for the data given above indicates variance is related to the mean μ, in the following way: (circle one) σ^2 = kμ(1 − μ) / σ^2 = k / σ^2 = kμ / σ^2 = kμ^2 / σ^2 = kμ^−2 / none of these.
(d) Complete the ANOVA table (fill in the blanks)

Source                   Degrees of Freedom   Sum Of Squares   Mean Squares
Treatment (Pesticides)
Error
Total

where F = 538.7 and the p–value is essentially zero (in other words, the pesticides are acting differently on the apple trees). Although this data is not the same as the original data, it is safe to say the pesticides are acting differently.
(Type the data into L1, ..., L5, then type STAT TESTS F:ANOVA(L1, L2, L3, L4, L5) ENTER.)
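The transform-then-test procedure of this exercise can be sketched in Python as follows. This is an illustration under stated assumptions rather than the workbook's calculator procedure: the helper names `arcsin_sqrt_degrees` and `f_statistic` are hypothetical, the counts are the pest–free apple counts tabulated above (out of 100 per tree), and the angle is reported in degrees to mirror the DEGREE mode setting. The F value is computed without intermediate rounding, so small differences from a value obtained with rounded, hand-entered data are to be expected.

```python
import math

# Pest-free apple counts (out of 100 per tree) from the table above.
counts = {
    "pesticide 1": [12, 13, 15, 15, 15, 16, 16, 17, 18, 22],
    "pesticide 2": [28, 30, 31, 32, 33, 33, 37, 37, 40, 41],
    "pesticide 3": [42, 48, 48, 54, 54, 55, 56, 60, 63, 64],
    "pesticide 4": [70, 71, 72, 73, 75, 75, 77, 79, 79, 80],
    "pesticide 5": [92, 94, 95, 96, 96, 97, 97, 98, 98, 99],
}


def arcsin_sqrt_degrees(y, trials=100):
    """g(y) = arcsin(sqrt(y / trials)), returned in degrees."""
    return math.degrees(math.asin(math.sqrt(y / trials)))


def f_statistic(groups):
    """Return (F, (treatment df, error df)) for a list of samples."""
    ns = [len(g) for g in groups]
    t, N = len(groups), sum(ns)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / N
    ss_t = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    ss_e = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    return (ss_t / (t - 1)) / (ss_e / (N - t)), (t - 1, N - t)


# Transform every count, then run the ANOVA on the transformed values.
transformed = {name: [arcsin_sqrt_degrees(y) for y in ys]
               for name, ys in counts.items()}
f_value, dfs = f_statistic(list(transformed.values()))

for name, values in transformed.items():
    print(name, [round(v, 1) for v in values])
print("F =", f_value, "with degrees of freedom", dfs)
```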
soil drainage 1   7.6  8.4  8.9  8.9  8.5  7.7  6.
soil drainage 2   9.5  10.2  10.2  9.6  8.5  7.1  5.
soil drainage 3   6.3  9.0  11.3  12.5  12.5  11.3  9.
soil drainage 4   3.3  8.4  14.0  17.5  17.5  14.6  10.

(a) Use the square root transformation (g(y) = √y) to convert this data into one with equal variance (fill in the blanks):

soil drainage 1   2.98  2.98  2.90  2.77  2.
soil drainage 2   3.09  3.20  3.20  3.10  2.91  2.66  2.
soil drainage 3   2.51  3.00  3.36  3.54  3.54  3.37  3.
soil drainage 4   1.84  2.90  3.75  4.19  4.19  3.82  3.