Analysis of Variance vs. Experimental Design: Decomposing Sums of Squares and F-Test - Pro, Exams of Statistics

University of Idaho (U of I), Statistics

Prof. Christopher J. Williams

This document discusses the relationship between analysis of variance (ANOVA) and experimental design. It explains how the total sum of squares decomposes into between-groups and within-groups components for a one-way ANOVA. It also introduces the F-test of the null hypothesis of equal group means against the alternative that at least one mean differs, and includes an example calculation and an analysis of variance table.

Typology: Exams

Pre 2010

Uploaded on 08/19/2009

koofers-user-t62 🇺🇸


1 Analysis of Variance versus Experimental Design

Not all data analyzed by ANOVA are from a designed experiment. On the other hand, some designed experiments lead to data for which ANOVA methods are inappropriate. However, there is a strong historical connection between ANOVA and Experimental Design.

2 Analysis of Variance

The key idea behind an analysis of variance involves a decomposition of the total sum of squares (TSS). For a one-way ANOVA (possibly arising from a completely randomized design) the decomposition is TSS = SSB + SSW, where SSB is the between-groups sum of squares and SSW is the within-groups sum of squares (our text uses SST instead of SSB, and SSE instead of SSW). This expression not only looks a lot like TSS = SSR + SSE from regression, but we will see later that we can set up a regression model using dummy variables for which these expressions are identical, where SSB = SSR and SSW = SSE.

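If $Y_{ij}$ is the $j$th observation in group $i$, then $Y_{ij} - \bar{Y}_{..} = (Y_{ij} - \bar{Y}_{i.}) + (\bar{Y}_{i.} - \bar{Y}_{..})$, which leads to:

$$\sum_{i=1}^{k} \sum_{j=1}^{n} (Y_{ij} - \bar{Y}_{..})^2 = \sum_{i=1}^{k} \sum_{j=1}^{n} (Y_{ij} - \bar{Y}_{i.})^2 + \sum_{i=1}^{k} \sum_{j=1}^{n} (\bar{Y}_{i.} - \bar{Y}_{..})^2,$$

or TSS = SSW + SSB.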
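
The step from the identity for a single observation to the decomposition of the sums is not spelled out in these notes. A sketch of the usual argument, assuming equal group sizes $n$ as in the text: squaring both sides and summing over $i$ and $j$ produces a cross-product term, and that term vanishes because the deviations from a group mean sum to zero within each group.

```latex
\begin{align*}
\sum_{i=1}^{k}\sum_{j=1}^{n} (Y_{ij}-\bar{Y}_{..})^2
  &= \sum_{i=1}^{k}\sum_{j=1}^{n} (Y_{ij}-\bar{Y}_{i.})^2
   + \sum_{i=1}^{k}\sum_{j=1}^{n} (\bar{Y}_{i.}-\bar{Y}_{..})^2
   + 2\sum_{i=1}^{k} (\bar{Y}_{i.}-\bar{Y}_{..}) \sum_{j=1}^{n} (Y_{ij}-\bar{Y}_{i.}), \\
\sum_{j=1}^{n} (Y_{ij}-\bar{Y}_{i.})
  &= \sum_{j=1}^{n} Y_{ij} - n\bar{Y}_{i.} = 0 \quad \text{for each } i.
\end{align*}
```

Since the inner sum in the last term of the first line is zero for every group, only the within-groups and between-groups sums of squares remain.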

As an example, consider three groups with the following data: Group 1 has $Y_{1j}$ values of 1, 2, and 3, Group 2 has $Y_{2j}$ values of 5, 3, and 4, and Group 3 has $Y_{3j}$ values of 6, 7, and 5. The overall sample mean is $\bar{Y}_{..} = (\sum_{i=1}^{k} \sum_{j=1}^{n} Y_{ij})/kn = 36/9 = 4$. Then TSS is

$$\sum_{i=1}^{k} \sum_{j=1}^{n} (Y_{ij} - \bar{Y}_{..})^2 = (1 - 4)^2 + (2 - 4)^2 + \cdots + (5 - 4)^2 = 30.$$

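The group means are $\bar{Y}_{1.} = 2$, $\bar{Y}_{2.} = 4$, and $\bar{Y}_{3.} = 6$, so SSW and SSB are

$$\text{SSW} = \sum_{i=1}^{k} \sum_{j=1}^{n} (Y_{ij} - \bar{Y}_{i.})^2 = (1 - 2)^2 + (2 - 2)^2 + (3 - 2)^2 + (5 - 4)^2 + \cdots + (5 - 6)^2 = 6,$$

and

$$\text{SSB} = \sum_{i=1}^{k} \sum_{j=1}^{n} (\bar{Y}_{i.} - \bar{Y}_{..})^2 = \sum_{i=1}^{k} n(\bar{Y}_{i.} - \bar{Y}_{..})^2 = 3(2 - 4)^2 + 3(4 - 4)^2 + 3(6 - 4)^2 = 24.$$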
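
The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch rather than part of the original notes; the helper name `sums_of_squares` is hypothetical, and the data are the three groups given above.

```python
# Example data from the text: three groups of three observations each.
groups = [[1, 2, 3], [5, 3, 4], [6, 7, 5]]


def sums_of_squares(groups):
    """Return (TSS, SSW, SSB) for a one-way layout (hypothetical helper)."""
    all_values = [y for g in groups for y in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Total: squared deviations of every observation from the grand mean.
    tss = sum((y - grand_mean) ** 2 for y in all_values)
    # Within: squared deviations of each observation from its own group mean.
    ssw = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)
    # Between: squared deviations of group means from the grand mean,
    # weighted by the group size.
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    return tss, ssw, ssb


tss, ssw, ssb = sums_of_squares(groups)
print(tss, ssw, ssb)  # 30.0 6.0 24.0
```

Weighting each group by `len(g)` rather than a fixed `n` means the same sketch also covers unequal group sizes.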

Thus TSS = SSW + SSB, or 30 = 6 + 24, partitions the total sum of squares about the overall mean into two parts, one within groups (due to error, or effects not accounted for by the model) and one between groups (measuring the differences among the sample means). Since each group here has $n$ observations, each group contributes $n - 1$ degrees of freedom to the within-group sum of squares, for a total of $k(n - 1)$ degrees of freedom for SSW. SSB is the sum of squares of $k$ sample means about their (overall) mean, so it has $k - 1$ degrees of freedom. For the example data above, $k(n - 1) = 3(2) = 6$, and $k - 1 = 3 - 1 = 2$.

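We can summarize this information in an analysis of variance table, where each mean square is MS = SS/df and F is the ratio of the between-groups mean square to the within-groups mean square:

Source            SS   df   MS    F
Between groups    24    2   12   12
Within groups      6    6    1
Total             30    8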
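
The table entries follow mechanically from the two sums of squares and the group layout. A rough sketch, assuming equal group sizes and using a hypothetical helper `anova_table` that is not part of the original notes:

```python
def anova_table(ssb, ssw, k, n):
    """Build one-way ANOVA table rows for k groups of n observations each.

    Hypothetical helper: assumes equal group sizes, as in the example.
    """
    df_between = k - 1
    df_within = k * (n - 1)
    msb = ssb / df_between   # between-groups mean square
    msw = ssw / df_within    # within-groups mean square
    return {
        "between": {"SS": ssb, "df": df_between, "MS": msb, "F": msb / msw},
        "within": {"SS": ssw, "df": df_within, "MS": msw},
        "total": {"SS": ssb + ssw, "df": df_between + df_within},
    }


# Values from the worked example: SSB = 24, SSW = 6, k = 3 groups, n = 3 each.
table = anova_table(ssb=24, ssw=6, k=3, n=3)
for source, row in table.items():
    print(source, row)
```

Note that the total degrees of freedom, $kn - 1 = 8$, are the sum of the between and within degrees of freedom, just as the total sum of squares is the sum of SSB and SSW.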


To test the null hypothesis $H_0: \mu_1 = \mu_2 = \mu_3$ against the alternative hypothesis $H_a$: some $\mu_i$'s differ, we compare the F statistic to an F distribution with numerator df $= k - 1 = 3 - 1 = 2$ and denominator df $= k(n - 1) = 3(2) = 6$. When group sample sizes are unequal we replace $n$ in the expressions by $n_i$, the sample size in the $i$th group, so that the within-groups degrees of freedom become $\sum_{i=1}^{k} (n_i - 1)$.
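
To make the comparison concrete, the upper-tail probability of the observed F statistic can be computed directly. The sketch below is not from the original notes and relies on a special case: when the numerator df is exactly 2, the F upper-tail probability has the closed form $(1 + 2f/\text{df}_2)^{-\text{df}_2/2}$. The function name is hypothetical. For other numerator df a statistics library would be needed, for example `scipy.stats.f.sf`.

```python
def f_upper_tail_df1_2(f_stat, df2):
    """P(F > f_stat) for an F distribution with 2 numerator df and df2 denominator df.

    Uses the closed form (1 + 2*f/df2) ** (-df2/2), which is valid ONLY when
    the numerator df is exactly 2 (as in the three-group example).
    """
    if f_stat < 0 or df2 <= 0:
        raise ValueError("f_stat must be non-negative and df2 must be positive")
    return (1 + 2 * f_stat / df2) ** (-df2 / 2)


# Example from the text: F = 12 with 2 and 6 degrees of freedom.
p_value = f_upper_tail_df1_2(12, 6)
print(round(p_value, 6))  # 0.008
```

A p-value this small would lead us to reject $H_0$ at conventional significance levels such as 0.05; the notes themselves do not state a level, so that threshold is an assumption here.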

3 The model for a completely randomized experiment (one-way ANOVA)

The model for ANOVA with one grouping factor is

$$Y_{ij} = \mu + \alpha_i + \varepsilon_{ij},$$

where $\mu$ is the population grand mean, $\alpha_i = \mu_i - \mu$ is the treatment effect for the $i$th group, and $\varepsilon_{ij} = Y_{ij} - \mu - \alpha_i$ is the random error effect for $Y_{ij}$. Using this notation we can write the null hypothesis $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$ in the alternate form $H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_k = 0$.

In performing ANOVA for a completely randomized experiment we assume that 1) random samples have been taken from each of the $k$ populations, 2) the errors $\varepsilon_{ij}$ have a normal distribution with mean 0, and 3) the errors $\varepsilon_{ij}$ have a common variance $\sigma^2$.

Let MSW and MSB denote the within-groups and between-groups mean squares. It can be shown that $E(\text{MSW}) = \sigma^2$ and $E(\text{MSB}) = \sigma^2 + n \sum_{i=1}^{k} \alpha_i^2/(k - 1)$. Then if $H_0$ is true, all $\alpha_i$ terms equal zero and MSW and MSB should be nearly equal. Thus $F = \text{MSB}/\text{MSW} \approx 1$ when $H_0$ is true, and $F$ tends to exceed 1 when $H_0$ is false.
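
The expected mean squares can be illustrated by simulation. The following is a rough sketch under stated assumptions, not a derivation: the parameter values ($\mu = 4$, effects $-2, 0, 2$, $\sigma = 1$, $n = 3$), the seed, and the helper name `simulate_mean_squares` are all chosen here for illustration, and the treatment effects are taken to sum to zero so that $\mu$ is the grand mean.

```python
import random


def simulate_mean_squares(mu, alphas, sigma, n, reps, seed=0):
    """Average MSB and MSW over many simulated one-way data sets.

    Hypothetical helper: draws Y_ij = mu + alpha_i + eps_ij with normal
    errors of common standard deviation sigma and n observations per group.
    """
    rng = random.Random(seed)
    k = len(alphas)
    msb_total = 0.0
    msw_total = 0.0
    for _ in range(reps):
        groups = [[mu + a + rng.gauss(0, sigma) for _ in range(n)] for a in alphas]
        means = [sum(g) / n for g in groups]
        grand = sum(means) / k  # equals the overall mean when group sizes are equal
        ssb = n * sum((m - grand) ** 2 for m in means)
        ssw = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
        msb_total += ssb / (k - 1)
        msw_total += ssw / (k * (n - 1))
    return msb_total / reps, msw_total / reps


alphas = [-2.0, 0.0, 2.0]  # illustrative treatment effects, summing to zero
sigma, n = 1.0, 3
avg_msb, avg_msw = simulate_mean_squares(
    mu=4.0, alphas=alphas, sigma=sigma, n=n, reps=20000
)
# Value given by the formula sigma^2 + n * sum(alpha_i^2) / (k - 1).
expected_msb = sigma**2 + n * sum(a**2 for a in alphas) / (len(alphas) - 1)
print(avg_msw, avg_msb, expected_msb)
```

With these settings the formula gives $E(\text{MSB}) = 1 + 3(8)/2 = 13$, and the simulated averages should land close to 1 for MSW and 13 for MSB. Setting all effects to zero should bring both averages near $\sigma^2$.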
