Single Factor Analysis of Variance (ANOVA) for Business Analysis

An overview of the single factor analysis of variance (ANOVA) model used in quantitative business analysis: the model's components, the partition of sums of squares, degrees of freedom, the test statistic for equality of treatment means, interval estimation of treatment means, and pairwise comparisons.
Analysis of Variance (ANOVA)

Factors and Factor Levels

  • The purpose of analysis of variance is to study the statistical relationship between a dependent variable and one or more independent variables.
  • The dependent variable is usually called the response variable.
  • An independent variable in ANOVA is called a factor.
  • A factor level, or treatment, is a particular outcome of the independent variable.
  • The relationship studied may be the effect of different treatments of a factor on the response variable, or the differences and similarities in the response variable between treatments.
  • Based on the number of factors studied, ANOVA is classified into one-factor ANOVA, usually called one-way ANOVA, and two-factor ANOVA, usually called two-way ANOVA.

The Single Factor ANOVA Model

  • The model

    Y_{ij} = \mu_j + e_{ij}

    where μ_j is the subpopulation mean of the j-th treatment, called the population treatment mean, and the e_ij are independent error terms.

  • Assumptions
    1. For each treatment j, the probability distribution of the response variable Y is normal with mean μ_j.
    2. All probability distributions of Y have the same variance σ².
    3. The error terms e_ij are independent and normally distributed with mean 0 and variance σ².
    4. It follows that E[Y_ij] = μ_j, E[e_ij] = 0, and σ²_{Y_ij} = σ²_{e_ij} = σ².

  • Steps in ANOVA
    1. Test whether the treatment means μ_j are equal. If they are equal, there is no factor effect and no further analysis is required.
    2. If the treatment means μ_j are not equal, study the nature of the treatment effects.
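
As an illustration (added here, not part of the original notes), the following Python sketch simulates data from the single factor ANOVA model above; the treatment means, error standard deviation, and sample size are arbitrary values chosen for the example.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical population treatment means mu_j and a common error standard deviation
    mu = np.array([27.0, 22.0, 19.0, 32.0])   # illustrative values only
    sigma = 4.3                               # same sigma for every treatment
    n_j = 6                                   # observations per treatment

    # Y_ij = mu_j + e_ij with e_ij ~ N(0, sigma^2), independent across observations
    Y = mu[:, None] + rng.normal(0.0, sigma, size=(len(mu), n_j))

    print(Y.round(1))                # simulated responses, one row per treatment
    print(Y.mean(axis=1).round(2))   # sample treatment means Ybar_j, estimates of mu_j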

Test for Equality of Treatment Means

  • Notations

    Y_ij − the i-th observation in the j-th treatment or factor level.
    Ȳ_j − the sample treatment mean, i.e., the sample mean, for the j-th treatment or factor level.
    Ȳ − the overall sample mean, or grand mean.
    n_j − the sample size, i.e., the number of observations, in the j-th treatment.
    n_T − the grand sample size, i.e., the total number of observations in all the treatments.
    k − the number of treatments or factor levels.

  • Formulas

    \bar{Y}_j = \frac{1}{n_j} \sum_{i=1}^{n_j} Y_{ij}, \qquad j = 1, \ldots, k

    n_T = \sum_{j=1}^{k} n_j

    \bar{Y} = \frac{1}{n_T} \sum_{j=1}^{k} \sum_{i=1}^{n_j} Y_{ij}

  • The sample treatment mean Ȳ_j is a point estimator of the population treatment mean μ_j.
  • It is not necessary to have n_{j₁} = n_{j₂} for j₁ ≠ j₂, i.e., the sample sizes are not necessarily the same.
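
For concreteness, these quantities can be computed in Python as in the short sketch below (added here; the data values are placeholders, and unequal sample sizes are allowed).

    import numpy as np

    # One array of observations per treatment (placeholder values)
    treatments = [np.array([24.0, 25.0, 30.0]),
                  np.array([29.0, 23.0]),
                  np.array([16.0, 17.0, 14.0, 26.0])]

    treatment_means = [y.mean() for y in treatments]   # Ybar_j, one per treatment
    n_T = sum(len(y) for y in treatments)              # grand sample size
    grand_mean = np.concatenate(treatments).mean()     # Ybar, the grand mean

    print(treatment_means, n_T, grand_mean)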

Partition of Sum of Squares

  • Total sum of squares

    SST = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (Y_{ij} - \bar{Y})^2

  • Between treatments sum of squares

    SSB = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (\bar{Y}_j - \bar{Y})^2 = \sum_{j=1}^{k} n_j (\bar{Y}_j - \bar{Y})^2

  • Within treatment sum of squares

    SSW = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (Y_{ij} - \bar{Y}_j)^2

  • Partition of the sum of squares

    Y_{ij} - \bar{Y} = (Y_{ij} - \bar{Y}_j) + (\bar{Y}_j - \bar{Y})

    Squaring both sides, summing over all observations, and combining terms (the cross-product term sums to zero), we obtain

    \underbrace{\sum_{j=1}^{k} \sum_{i=1}^{n_j} (Y_{ij} - \bar{Y})^2}_{SST} = \underbrace{\sum_{j=1}^{k} \sum_{i=1}^{n_j} (Y_{ij} - \bar{Y}_j)^2}_{SSW} + \underbrace{\sum_{j=1}^{k} \sum_{i=1}^{n_j} (\bar{Y}_j - \bar{Y})^2}_{SSB}

    Therefore, we have

    SST = SSW + SSB
  • Example

    A kitchen utensil manufacturer selected 24 similar department stores to try out four different promotional displays for a newly designed rice cooker. The display that generates the highest sales in the study is to be used in the manufacturer's national promotion program. Each display was assigned at random to six stores. Sales in units for each store during the two-week observation period are given in the following table.

    Store (i)          Display (j)
                    1       2       3       4
    1              24      29      16      32
    2              25      23      16      36
    3              30      19      17      30
    4              20      21      14      28
    5              30      22      26      35
    6              33      18      25      31
    Total         162     132     114     192     600
    Mean       Ȳ_1 = 27  Ȳ_2 = 22  Ȳ_3 = 19  Ȳ_4 = 32
    n_j        n_1 = 6   n_2 = 6   n_3 = 6   n_4 = 6   n_T = 24

    In this example, the dependent variable is sales Y, i.e., the number of units sold. The factor, or independent variable, is the display. The treatments, or factor levels, are the different displays. We assume everything else is the same; only the effects of display are studied.

    We also have k = 4 and

    \bar{Y} = \frac{1}{n_T} \sum_{j=1}^{k} \sum_{i=1}^{n_j} Y_{ij} = \frac{600}{24} = 25.

    The following table lists the Y_ij − Ȳ and (Y_ij − Ȳ)² values.

    i    Y_i1−Ȳ   Y_i2−Ȳ   Y_i3−Ȳ   Y_i4−Ȳ    (Y_i1−Ȳ)²  (Y_i2−Ȳ)²  (Y_i3−Ȳ)²  (Y_i4−Ȳ)²
    1      -1        4       -9        7           1         16         81         49
    2       0       -2       -9       11           0          4         81        121
    3       5       -6       -8        5          25         36         64         25
    4      -5       -4      -11        3          25         16        121          9
    5       5       -3        1       10          25          9          1        100
    6       8       -7        0        6          64         49          0         36
    Sum                                           140        130        348        340

    From the table, we obtain

    \sum_{i=1}^{6} (Y_{i1} - \bar{Y})^2 = 140, \quad \sum_{i=1}^{6} (Y_{i2} - \bar{Y})^2 = 130, \quad \sum_{i=1}^{6} (Y_{i3} - \bar{Y})^2 = 348, \quad \sum_{i=1}^{6} (Y_{i4} - \bar{Y})^2 = 340

    SST = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (Y_{ij} - \bar{Y})^2 = 140 + 130 + 348 + 340 = 958.

    The following table lists the Y_ij − Ȳ_j and (Y_ij − Ȳ_j)² values.

    i    Y_i1−Ȳ_1  Y_i2−Ȳ_2  Y_i3−Ȳ_3  Y_i4−Ȳ_4    (Y_i1−Ȳ_1)²  (Y_i2−Ȳ_2)²  (Y_i3−Ȳ_3)²  (Y_i4−Ȳ_4)²
    1       -3         7        -3         0             9           49            9            0
    2       -2         1        -3         4             4            1            9           16
    3        3        -3        -2        -2             9            9            4            4
    4       -7        -1        -5        -4            49            1           25           16
    5        3         0         7         3             9            0           49            9
    6        6        -4         6        -1            36           16           36            1
    Sum                                                 116           76          132           46

    From this table, we obtain

    \sum_{i=1}^{6} (Y_{i1} - \bar{Y}_1)^2 = 116, \quad \sum_{i=1}^{6} (Y_{i2} - \bar{Y}_2)^2 = 76, \quad \sum_{i=1}^{6} (Y_{i3} - \bar{Y}_3)^2 = 132, \quad \sum_{i=1}^{6} (Y_{i4} - \bar{Y}_4)^2 = 46

    SSW = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (Y_{ij} - \bar{Y}_j)^2 = 116 + 76 + 132 + 46 = 370.

    Furthermore, we have

    \bar{Y}_1 - \bar{Y} = 27 - 25 = 2, \quad \bar{Y}_2 - \bar{Y} = 22 - 25 = -3, \quad \bar{Y}_3 - \bar{Y} = 19 - 25 = -6, \quad \bar{Y}_4 - \bar{Y} = 32 - 25 = 7

    SSB = \sum_{j=1}^{k} n_j (\bar{Y}_j - \bar{Y})^2 = 6(2)^2 + 6(-3)^2 + 6(-6)^2 + 6(7)^2 = 24 + 54 + 216 + 294 = 588.
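
The partition can be checked in Python with a short sketch (added for illustration, not part of the original notes) that recomputes SST, SSW, and SSB from the sales data above; it should reproduce SST = 958 = 370 + 588 = SSW + SSB.

    import numpy as np

    # Sales data from the example, one array per display (rows are stores)
    sales = [np.array([24, 25, 30, 20, 30, 33]),
             np.array([29, 23, 19, 21, 22, 18]),
             np.array([16, 16, 17, 14, 26, 25]),
             np.array([32, 36, 30, 28, 35, 31])]

    all_obs = np.concatenate(sales)
    grand_mean = all_obs.mean()                                          # Ybar = 25

    SST = ((all_obs - grand_mean) ** 2).sum()                            # 958
    SSW = sum(((y - y.mean()) ** 2).sum() for y in sales)                # 370
    SSB = sum(len(y) * (y.mean() - grand_mean) ** 2 for y in sales)      # 588

    print(SST, SSW, SSB, np.isclose(SST, SSW + SSB))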

Computational Formula

    SST = \sum_{j=1}^{k} \sum_{i=1}^{n_j} Y_{ij}^2 - \frac{1}{n_T} \left( \sum_{j=1}^{k} \sum_{i=1}^{n_j} Y_{ij} \right)^2

    SSB = \sum_{j=1}^{k} \frac{1}{n_j} \left( \sum_{i=1}^{n_j} Y_{ij} \right)^2 - \frac{1}{n_T} \left( \sum_{j=1}^{k} \sum_{i=1}^{n_j} Y_{ij} \right)^2

    SSW = \sum_{j=1}^{k} \sum_{i=1}^{n_j} Y_{ij}^2 - \sum_{j=1}^{k} \frac{1}{n_j} \left( \sum_{i=1}^{n_j} Y_{ij} \right)^2 \qquad \text{or} \qquad SSW = SST - SSB

    The Y_ij² values are given in the following table.

    i        j = 1    j = 2    j = 3    j = 4
    1          576      841      256     1024
    2          625      529      256     1296
    3          900      361      289      900
    4          400      441      196      784
    5          900      484      676     1225
    6         1089      324      625      961
    Sum       4490     2980     2298     6190

    From this table, we obtain

    \sum_{i=1}^{6} Y_{i1}^2 = 4490, \quad \sum_{i=1}^{6} Y_{i2}^2 = 2980, \quad \sum_{i=1}^{6} Y_{i3}^2 = 2298, \quad \sum_{i=1}^{6} Y_{i4}^2 = 6190

    Therefore,

    \sum_{j=1}^{k} \sum_{i=1}^{n_j} Y_{ij}^2 = 4490 + 2980 + 2298 + 6190 = 15958.

    From the first table, we have

    \frac{1}{n_T} \left( \sum_{j=1}^{k} \sum_{i=1}^{n_j} Y_{ij} \right)^2 = \frac{(600)^2}{24} = 15000.

    Also,

    \frac{1}{n_1} \left( \sum_{i=1}^{n_1} Y_{i1} \right)^2 = \frac{(162)^2}{6} = 4374, \quad \frac{1}{n_2} \left( \sum_{i=1}^{n_2} Y_{i2} \right)^2 = \frac{(132)^2}{6} = 2904,

    \frac{1}{n_3} \left( \sum_{i=1}^{n_3} Y_{i3} \right)^2 = \frac{(114)^2}{6} = 2166, \quad \frac{1}{n_4} \left( \sum_{i=1}^{n_4} Y_{i4} \right)^2 = \frac{(192)^2}{6} = 6144,

    therefore,

    \sum_{j=1}^{k} \frac{1}{n_j} \left( \sum_{i=1}^{n_j} Y_{ij} \right)^2 = 4374 + 2904 + 2166 + 6144 = 15588.

    Using the computational formulas,

    SST = \sum_{j=1}^{k} \sum_{i=1}^{n_j} Y_{ij}^2 - \frac{1}{n_T} \left( \sum_{j=1}^{k} \sum_{i=1}^{n_j} Y_{ij} \right)^2 = 15958 - 15000 = 958

    SSB = \sum_{j=1}^{k} \frac{1}{n_j} \left( \sum_{i=1}^{n_j} Y_{ij} \right)^2 - \frac{1}{n_T} \left( \sum_{j=1}^{k} \sum_{i=1}^{n_j} Y_{ij} \right)^2 = 15588 - 15000 = 588

    SSW = \sum_{j=1}^{k} \sum_{i=1}^{n_j} Y_{ij}^2 - \sum_{j=1}^{k} \frac{1}{n_j} \left( \sum_{i=1}^{n_j} Y_{ij} \right)^2 = 15958 - 15588 = 370

    or SSW = SST − SSB = 958 − 588 = 370.
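
The computational (shortcut) formulas translate directly into Python; the sketch below (an added illustration) should again yield SST = 958, SSB = 588, and SSW = 370 for the sales data.

    import numpy as np

    # One array of sales per display, as in the example table
    sales = [np.array([24, 25, 30, 20, 30, 33]),
             np.array([29, 23, 19, 21, 22, 18]),
             np.array([16, 16, 17, 14, 26, 25]),
             np.array([32, 36, 30, 28, 35, 31])]

    n_T = sum(len(y) for y in sales)                            # 24
    sum_sq = sum((y ** 2).sum() for y in sales)                 # sum of Y_ij^2 = 15958
    grand_total = sum(y.sum() for y in sales)                   # sum of Y_ij   = 600
    treat_term = sum(y.sum() ** 2 / len(y) for y in sales)      # 15588
    correction = grand_total ** 2 / n_T                         # (600)^2 / 24  = 15000

    SST = sum_sq - correction        # 958
    SSB = treat_term - correction    # 588
    SSW = sum_sq - treat_term        # 370, equivalently SST - SSB
    print(SST, SSB, SSW)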

Partition of Degrees of Freedom

  • Degrees of freedom associated with SSB: k − 1
  • Degrees of freedom associated with SSW: n_T − k
  • Degrees of freedom associated with SST: n_T − 1
  • The partition is

    \underbrace{n_T - 1}_{df \text{ for } SST} = \underbrace{k - 1}_{df \text{ for } SSB} + \underbrace{n_T - k}_{df \text{ for } SSW}

Mean Squares

  • Between treatments mean square

    MSB = \frac{SSB}{k - 1}

  • Within treatment mean square

    MSW = \frac{SSW}{n_T - k}

  • For the example above,

    MSB = \frac{SSB}{k - 1} = \frac{588}{4 - 1} = 196, \qquad MSW = \frac{SSW}{n_T - k} = \frac{370}{24 - 4} = 18.5

ANOVA Table

  • General form

    Source of Variation     SS      df         MS      F
    Between                 SSB     k − 1      MSB     F = MSB/MSW
    Within                  SSW     n_T − k    MSW
    Total                   SST     n_T − 1

  • Example

    Source of Variation     SS      df    MS      F
    Between                 588      3    196     F = 196/18.5 = 10.59
    Within                  370     20    18.5
    Total                   958     23

  • The test statistic

    F = \frac{MSB}{MSW}

    has an F_{k−1, n_T−k} distribution when the treatment means are all equal.

  • For an F_{ν₁, ν₂} random variable, ν₁ is the numerator degrees of freedom and ν₂ is the denominator degrees of freedom.
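
As a quick check (an added sketch, not from the notes), the mean squares and F statistic for the example can be computed directly from SSB, SSW, k, and n_T:

    # ANOVA table quantities for the example (SSB and SSW from the text)
    SSB, SSW = 588.0, 370.0
    k, n_T = 4, 24

    MSB = SSB / (k - 1)       # 196.0
    MSW = SSW / (n_T - k)     # 18.5
    F = MSB / MSW             # about 10.59

    print(f"Between  SS={SSB:6.1f}  df={k - 1:2d}  MS={MSB:6.1f}  F={F:.2f}")
    print(f"Within   SS={SSW:6.1f}  df={n_T - k:2d}  MS={MSW:6.1f}")
    print(f"Total    SS={SSB + SSW:6.1f}  df={n_T - 1:2d}")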

Test of Equality of Treatment Means

  • The null and alternative hypotheses

    H_0: μ_1 = μ_2 = ... = μ_k
    H_a: not all treatment means are equal

  • The decision rule for a given significance level α:

    Do not reject H_0 if F ≤ F_{α, k−1, n_T−k}
    Reject H_0 if F > F_{α, k−1, n_T−k}

    where F_{α, k−1, n_T−k} is the value such that P(F_{k−1, n_T−k} > F_{α, k−1, n_T−k}) = α.

  • If H_0 is not rejected, we do not have significant evidence of any difference between the treatment means. If H_0 is rejected, the treatment means may all be different, or only some of them may differ.

  • Example

    For the example above, test the equality of the treatment means at significance level α = 0.01. Then F_{0.01, 3, 20} = 4.94. The decision rule is:

    Do not reject H_0 if F ≤ 4.94.
    Reject H_0 if F > 4.94.

    The conclusion is to reject H_0 because F = 10.59 > 4.94. At significance level α = 0.01, we can say that the treatment means are significantly different.
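
The critical value and the p-value can also be obtained in Python with scipy.stats (an added sketch; scipy is assumed to be available):

    from scipy import stats

    F = 10.59                   # observed F statistic from the ANOVA table
    df1, df2 = 3, 20            # k - 1 and n_T - k

    f_crit = stats.f.ppf(1 - 0.01, df1, df2)    # F_{0.01, 3, 20}, about 4.94
    p_value = stats.f.sf(F, df1, df2)           # P(F_{3, 20} > 10.59), well below 0.01

    print(f_crit, p_value, F > f_crit)          # True: reject H0 at alpha = 0.01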

MINITAB Result

  • For one-way ANOVA, MINITAB takes two input formats, One-way… and One-way (Unstacked).
  • The following is the MINITAB printout.

    One-way Analysis of Variance

    Analysis of Variance for Sales
    Source     DF        SS        MS        F        P
    Display     3     588.0     196.0    10.59    0.000
    Error      20     370.0      18.5
    Total      23     958.0

                                 Individual 95% CIs For Mean
                                 Based on Pooled StDev
    Level   N     Mean   StDev  -----+---------+---------+---------+-
    1       6   27.000   4.817              (----------)
    2       6   22.000   3.899        (----------)
    3       6   19.000   5.138    (----------)
    4       6   32.000   3.033                      (----------)
                                -----+---------+---------+---------+-
    Pooled StDev =   4.301        18.0      24.0      30.0      36.0
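
The same analysis can be reproduced in Python with scipy.stats.f_oneway (an added sketch for comparison with the MINITAB output):

    import numpy as np
    from scipy import stats

    d1 = np.array([24, 25, 30, 20, 30, 33])   # display 1
    d2 = np.array([29, 23, 19, 21, 22, 18])   # display 2
    d3 = np.array([16, 16, 17, 14, 26, 25])   # display 3
    d4 = np.array([32, 36, 30, 28, 35, 31])   # display 4

    result = stats.f_oneway(d1, d2, d3, d4)
    print(result.statistic, result.pvalue)    # F about 10.59, p well below 0.01

    # Pooled standard deviation, matching MINITAB's "Pooled StDev"
    ssw = sum(((d - d.mean()) ** 2).sum() for d in (d1, d2, d3, d4))
    pooled_sd = np.sqrt(ssw / (24 - 4))       # sqrt(MSW) = sqrt(18.5), about 4.301
    print(pooled_sd)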

Analysis of Treatment Effects

  • There are many different ways to analyze treatment effects. We only discuss a few of them.

Interval Estimation of Treatment Means

  • As noted earlier, the sample treatment mean Ȳ_j is an estimator of the population treatment mean μ_j.
  • The estimated variance of Ȳ_j is

    s_{\bar{Y}_j}^2 = \frac{MSW}{n_j}

  • The test statistic

    t = \frac{\bar{Y}_j - \mu_j}{s_{\bar{Y}_j}}

    has a t distribution with n_T − k degrees of freedom.

  • The confidence limits for μ_j given a confidence coefficient 1 − α are

    L = \bar{Y}_j - t_{\alpha/2, \, n_T - k} \, s_{\bar{Y}_j}, \qquad R = \bar{Y}_j + t_{\alpha/2, \, n_T - k} \, s_{\bar{Y}_j}

  • Example

    For the example above,

    s_{\bar{Y}_j}^2 = \frac{MSW}{n_j} = \frac{18.5}{6} = 3.083 \qquad \text{and} \qquad s_{\bar{Y}_j} = \sqrt{3.083} = 1.756,

    because all n_j are the same, i.e., n_j = 6 for all j = 1, 2, 3, 4. If the n_j were different, the s_{\bar{Y}_j}^2 would be different.

    Now construct 95% confidence intervals for the treatment means. Then α = 0.05 and

    t_{\alpha/2, \, n_T - k} = t_{0.025, 20} = 2.086.

    For μ_1,

    L = \bar{Y}_1 - t_{\alpha/2, \, n_T - k} \, s_{\bar{Y}_1} = 27 - 2.086(1.756) = 23.337
    R = \bar{Y}_1 + t_{\alpha/2, \, n_T - k} \, s_{\bar{Y}_1} = 27 + 2.086(1.756) = 30.663

    With 95% confidence we can say that μ_1 is somewhere between 23.337 and 30.663.

    For μ_2,

    L = 22 - 2.086(1.756) = 18.337, \qquad R = 22 + 2.086(1.756) = 25.663

    With 95% confidence we can say that μ_2 is somewhere between 18.337 and 25.663.

    For μ_3,

    L = 19 - 2.086(1.756) = 15.337, \qquad R = 19 + 2.086(1.756) = 22.663

    With 95% confidence we can say that μ_3 is somewhere between 15.337 and 22.663.

    For μ_4,

    L = 32 - 2.086(1.756) = 28.337, \qquad R = 32 + 2.086(1.756) = 35.663

    With 95% confidence we can say that μ_4 is somewhere between 28.337 and 35.663.
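
These interval estimates can be computed with a short Python sketch (added for illustration):

    import numpy as np
    from scipy import stats

    means = {1: 27.0, 2: 22.0, 3: 19.0, 4: 32.0}   # sample treatment means Ybar_j
    MSW, n_T, k, n_j = 18.5, 24, 4, 6

    s_mean = np.sqrt(MSW / n_j)                    # s_{Ybar_j}, about 1.756
    t_crit = stats.t.ppf(1 - 0.05 / 2, n_T - k)    # t_{0.025, 20}, about 2.086

    for j, ybar in means.items():
        lo = ybar - t_crit * s_mean
        hi = ybar + t_crit * s_mean
        print(f"mu_{j}: {lo:.3f} to {hi:.3f}")     # e.g. mu_1: 23.337 to 30.663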

Comparison of Two Treatment Means

  • Sometimes we are interested in the difference between two treatment means. Therefore, we make pairwise comparisons of the form

    \mu_{j_1} - \mu_{j_2}

  • A point estimator of this difference is

    \bar{Y}_{j_1} - \bar{Y}_{j_2}

  • The estimated variance of this estimator is

    s_{\bar{Y}_{j_1} - \bar{Y}_{j_2}}^2 = \frac{MSW}{n_{j_1}} + \frac{MSW}{n_{j_2}} = MSW \left( \frac{1}{n_{j_1}} + \frac{1}{n_{j_2}} \right)

  • The test statistic

    t = \frac{(\bar{Y}_{j_1} - \bar{Y}_{j_2}) - (\mu_{j_1} - \mu_{j_2})}{s_{\bar{Y}_{j_1} - \bar{Y}_{j_2}}}

    has a t distribution with n_T − k degrees of freedom.

  • The 1 − α confidence limits for μ_{j₁} − μ_{j₂} are

    L = (\bar{Y}_{j_1} - \bar{Y}_{j_2}) - t_{\alpha/2, \, n_T - k} \, s_{\bar{Y}_{j_1} - \bar{Y}_{j_2}}
    R = (\bar{Y}_{j_1} - \bar{Y}_{j_2}) + t_{\alpha/2, \, n_T - k} \, s_{\bar{Y}_{j_1} - \bar{Y}_{j_2}}

  • Example

    For the example above, construct 90% confidence intervals for μ_1 − μ_3 and for μ_2 − μ_3. In this problem,

    t_{\alpha/2, \, n_T - k} = t_{0.05, 20} = 1.725.

    For μ_1 − μ_3,

    s_{\bar{Y}_1 - \bar{Y}_3}^2 = MSW \left( \frac{1}{n_1} + \frac{1}{n_3} \right) = 18.5 \left( \frac{1}{6} + \frac{1}{6} \right) = 6.167 \qquad \text{and} \qquad s_{\bar{Y}_1 - \bar{Y}_3} = \sqrt{6.167} = 2.483

    L = (\bar{Y}_1 - \bar{Y}_3) - t_{0.05, 20} \, s_{\bar{Y}_1 - \bar{Y}_3} = (27 - 19) - 1.725(2.483) = 3.717
    R = (\bar{Y}_1 - \bar{Y}_3) + t_{0.05, 20} \, s_{\bar{Y}_1 - \bar{Y}_3} = (27 - 19) + 1.725(2.483) = 12.283

    With 90% confidence we can say that μ_1 − μ_3 is somewhere between 3.717 and 12.283.

    For μ_2 − μ_3,

    s_{\bar{Y}_2 - \bar{Y}_3}^2 = MSW \left( \frac{1}{n_2} + \frac{1}{n_3} \right) = 18.5 \left( \frac{1}{6} + \frac{1}{6} \right) = 6.167 \qquad \text{and} \qquad s_{\bar{Y}_2 - \bar{Y}_3} = 2.483

    L = (\bar{Y}_2 - \bar{Y}_3) - t_{0.05, 20} \, s_{\bar{Y}_2 - \bar{Y}_3} = (22 - 19) - 1.725(2.483) = -1.283
    R = (\bar{Y}_2 - \bar{Y}_3) + t_{0.05, 20} \, s_{\bar{Y}_2 - \bar{Y}_3} = (22 - 19) + 1.725(2.483) = 7.283

    With 90% confidence we can say that μ_2 − μ_3 is somewhere between −1.283 and 7.283.

  • If a 1 − α confidence interval for μ_{j₁} − μ_{j₂} covers 0, then μ_{j₁} and μ_{j₂} are not significantly different at significance level α. Otherwise, they are significantly different.
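
These pairwise intervals can be computed with the following Python sketch (added for illustration):

    import numpy as np
    from scipy import stats

    means = {1: 27.0, 2: 22.0, 3: 19.0, 4: 32.0}   # sample treatment means Ybar_j
    n = {1: 6, 2: 6, 3: 6, 4: 6}                   # treatment sample sizes
    MSW, n_T, k = 18.5, 24, 4

    def pairwise_ci(j1, j2, alpha=0.10):
        """1 - alpha confidence interval for mu_j1 - mu_j2."""
        diff = means[j1] - means[j2]
        s_diff = np.sqrt(MSW * (1 / n[j1] + 1 / n[j2]))     # about 2.483 here
        t_crit = stats.t.ppf(1 - alpha / 2, n_T - k)        # t_{0.05, 20}, about 1.725
        return diff - t_crit * s_diff, diff + t_crit * s_diff

    print(pairwise_ci(1, 3))   # about (3.717, 12.283): does not cover 0
    print(pairwise_ci(2, 3))   # about (-1.283, 7.283): covers 0, not significantly different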