
3 RANDOMIZED COMPLETE BLOCK DESIGN (RCBD)

  • The experimenter is concerned with studying the effects of a single factor on a response of interest. However, variability from another factor that is not of interest is expected.

  • The goal is to control the effects of a variable not of interest by bringing experimental units that are similar into a group called a “block”. The treatments are then randomly applied to the experimental units within each block. The experimental units are assumed to be homogeneous within each block.

  • By using blocks to control a source of variability, the mean square error (MSE) will be reduced. A smaller MSE makes it easier to detect significant results for the factor of interest.

  • Assume there are a treatments and b blocks. If we have one observation per treatment within each block, and if treatments are randomized to the experimental units within each block, then we have a randomized complete block design (RCBD). Because randomization only occurs within blocks, this is an example of restricted randomization.
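The restricted randomization described above can be sketched in a few lines of Python. This is only an illustration: the function name `randomize_rcbd`, the treatment labels A to E, and the seed are assumptions made for the example and are not part of the notes.

```python
import random

def randomize_rcbd(treatments, n_blocks, seed=None):
    """Return one random treatment order per block (restricted randomization)."""
    rng = random.Random(seed)
    layout = {}
    for block in range(1, n_blocks + 1):
        order = list(treatments)  # every treatment appears exactly once per block
        rng.shuffle(order)        # shuffle independently inside this block only
        layout[block] = order
    return layout

# Hypothetical example: a = 5 treatments (labelled A-E) and b = 3 blocks.
layout = randomize_rcbd(["A", "B", "C", "D", "E"], n_blocks=3, seed=1)
for block, order in layout.items():
    print(f"block {block}: {order}")
```

Each block receives its own independent shuffle, so a treatment can never be assigned twice within a block, which is what distinguishes this scheme from a completely randomized design.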

3.1 RCBD Notation

  • Assume $\mu$ is the baseline mean, $\tau_i$ is the $i$th treatment effect, $\beta_j$ is the $j$th block effect, and $\epsilon_{ij}$ is the random error of the observation. The statistical model for a RCBD is

$$y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij} \quad\text{and}\quad \epsilon_{ij} \sim \text{IID } N(0, \sigma^2). \tag{6}$$

  • $\mu$, $\tau_i$ ($i = 1, 2, \ldots, a$), and $\beta_j$ ($j = 1, 2, \ldots, b$) are not uniquely estimable. Constraints must be imposed. To be able to calculate estimates $\hat\mu$, $\hat\tau_i$, and $\hat\beta_j$, we need to impose two constraints.

  • Initially, we will assume the textbook constraints:

$$\sum_{i=1}^{a} \tau_i = 0 \quad\text{and}\quad \sum_{j=1}^{b} \beta_j = 0.$$

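  • These are not the default SAS constraints ($\tau_a = 0$, $\beta_b = 0$) or R constraints ($\tau_1 = 0$, $\beta_1 = 0$).
  • Applying these constraints yields the least-squares estimates

$$\hat\mu = \bar y_{\cdot\cdot}, \qquad \hat\tau_i = \bar y_{i\cdot} - \bar y_{\cdot\cdot}, \qquad \hat\beta_j = \bar y_{\cdot j} - \bar y_{\cdot\cdot},$$

where $\bar y_{\cdot\cdot}$ is the grand mean, $\bar y_{i\cdot}$ is the mean for treatment $i$, and $\bar y_{\cdot j}$ is the mean for block $j$.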

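  • Substitution of the estimates into the model yields:

$$y_{ij} = \hat\mu + \hat\tau_i + \hat\beta_j + e_{ij} = \bar y_{\cdot\cdot} + (\bar y_{i\cdot} - \bar y_{\cdot\cdot}) + (\bar y_{\cdot j} - \bar y_{\cdot\cdot}) + e_{ij}$$

where $e_{ij} = \hat\epsilon_{ij}$ is the residual of an observation $y_{ij}$ from a RCBD. The value of $e_{ij}$ is

$$e_{ij} = y_{ij} - (\bar y_{i\cdot} - \bar y_{\cdot\cdot}) - (\bar y_{\cdot j} - \bar y_{\cdot\cdot}) - \bar y_{\cdot\cdot} = y_{ij} - \bar y_{i\cdot} - \bar y_{\cdot j} + \bar y_{\cdot\cdot}.$$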

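  • The total sum of squares ($SS_{Total}$) for the RCBD is partitioned into 3 components:

$$\begin{aligned}
\sum_{i=1}^{a}\sum_{j=1}^{b} (y_{ij} - \bar y_{\cdot\cdot})^2
&= \sum_{i=1}^{a}\sum_{j=1}^{b} (\bar y_{i\cdot} - \bar y_{\cdot\cdot})^2
 + \sum_{j=1}^{b}\sum_{i=1}^{a} (\bar y_{\cdot j} - \bar y_{\cdot\cdot})^2
 + \sum_{i=1}^{a}\sum_{j=1}^{b} (y_{ij} - \bar y_{i\cdot} - \bar y_{\cdot j} + \bar y_{\cdot\cdot})^2 \\
&= b\sum_{i=1}^{a} (\bar y_{i\cdot} - \bar y_{\cdot\cdot})^2
 + a\sum_{j=1}^{b} (\bar y_{\cdot j} - \bar y_{\cdot\cdot})^2
 + \sum_{i=1}^{a}\sum_{j=1}^{b} (y_{ij} - \bar y_{i\cdot} - \bar y_{\cdot j} + \bar y_{\cdot\cdot})^2 \\
&= b\sum_{i=1}^{a} \hat\tau_i^2 + a\sum_{j=1}^{b} \hat\beta_j^2 + \sum_{i=1}^{a}\sum_{j=1}^{b} e_{ij}^2
\end{aligned}$$

OR $SS_{Total} = SS_{Trt} + SS_{Block} + SS_E$.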

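  • Alternate formulas to calculate $SS_{Total}$, $SS_{Trt}$, and $SS_{Block}$. Here $y_{i\cdot}$, $y_{\cdot j}$, and $y_{\cdot\cdot}$ (without bars) are the treatment, block, and grand totals:

$$SS_{Total} = \sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}^2 - \frac{y_{\cdot\cdot}^2}{ab}, \qquad
SS_{Trt} = \sum_{i=1}^{a} \frac{y_{i\cdot}^2}{b} - \frac{y_{\cdot\cdot}^2}{ab}, \qquad
SS_{Block} = \sum_{j=1}^{b} \frac{y_{\cdot j}^2}{a} - \frac{y_{\cdot\cdot}^2}{ab}$$

$$SS_E = SS_{Total} - SS_{Trt} - SS_{Block}$$

where $y_{\cdot\cdot}^2/(ab)$ is the correction factor.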

3.2 Cotton Fiber Breaking Strength Experiment

An agricultural experiment considered the effects of K2O (potash) on the breaking strength of cotton fibers. Five K2O levels were used (36, 54, 72, 108, 144 lbs/acre). A sample of cotton was taken from each plot, and a strength measurement was taken. The experiment was arranged in 3 blocks of 5 plots each.

K2O lbs/acre (treatment)

| Block | 36 | 54 | 72 | 108 | 144 | Totals |
|---|---|---|---|---|---|---|
| 1 | 7.62 | 8.14 | 7.76 | 7.17 | 7.46 | $y_{\cdot 1} = 38.15$ |
| 2 | 8.00 | 8.15 | 7.73 | 7.57 | 7.68 | $y_{\cdot 2} = 39.13$ |
| 3 | 7.93 | 7.87 | 7.74 | 7.80 | 7.21 | $y_{\cdot 3} = 38.55$ |
| Totals $y_{i\cdot}$ | 23.55 | 24.16 | 23.23 | 22.54 | 22.35 | $y_{\cdot\cdot} = 115.83$ |
| Treatment means $\bar y_{i\cdot}$ | 7.850 | 8.053 | 7.743 | 7.513 | 7.450 | |

Block means: $\bar y_{\cdot 1} = 7.630$, $\bar y_{\cdot 2} = 7.826$, $\bar y_{\cdot 3} = 7.710$

Grand mean: $\bar y_{\cdot\cdot} = 115.83/15 = 7.722$

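Uncorrected sum of squares $= \sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}^2 = 895.6183$

Correction factor $= y_{\cdot\cdot}^2/(ab) = 115.83^2/15 = 894.4393$

$$\sum_{i=1}^{a} \frac{y_{i\cdot}^2}{b} = \frac{23.55^2 + 24.16^2 + 23.23^2 + 22.54^2 + 22.35^2}{3} = 895.1717$$

$$\sum_{j=1}^{b} \frac{y_{\cdot j}^2}{a} = \frac{38.15^2 + 39.13^2 + 38.55^2}{5} = 894.5364$$

$SS_{Total} = 895.6183 - 894.4393 = 1.1790$

$SS_{Trt} = 895.1717 - 894.4393 = 0.7324$

$SS_{Block} = 894.5364 - 894.4393 = 0.0971$

$SS_E = 1.1790 - 0.7324 - 0.0971 = 0.3495$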

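Analysis of Variance (ANOVA) Table

| Source of Variation | Sum of Squares | d.f. | Mean Square | F Ratio | p-value |
|---|---|---|---|---|---|
| K2O lbs/acre | 0.7324 | 4 | 0.18311 | 4.19 | |
| Blocks | 0.0971 | 2 | 0.04856 | | |
| Error | 0.3495 | 8 | 0.043685 | | |
| Total | 1.1790 | 14 | | | |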
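As a check on the hand calculation, the following Python sketch applies the alternate (correction-factor) formulas from Section 3.1 to the cotton data. The variable names are chosen for this illustration only; the data are the 15 strength values from the table in this section.

```python
# Breaking strength: rows are blocks 1-3, columns are K2O levels 36, 54, 72, 108, 144.
strength = [
    [7.62, 8.14, 7.76, 7.17, 7.46],  # block 1
    [8.00, 8.15, 7.73, 7.57, 7.68],  # block 2
    [7.93, 7.87, 7.74, 7.80, 7.21],  # block 3
]
b = len(strength)     # number of blocks
a = len(strength[0])  # number of treatments

grand_total = sum(sum(row) for row in strength)
cf = grand_total ** 2 / (a * b)  # correction factor y..^2 / (ab)
trt_totals = [sum(row[i] for row in strength) for i in range(a)]
blk_totals = [sum(row) for row in strength]

ss_total = sum(y * y for row in strength for y in row) - cf
ss_trt = sum(t * t for t in trt_totals) / b - cf
ss_blk = sum(t * t for t in blk_totals) / a - cf
ss_err = ss_total - ss_trt - ss_blk

df_trt, df_blk, df_err = a - 1, b - 1, (a - 1) * (b - 1)
ms_trt, ms_blk, ms_err = ss_trt / df_trt, ss_blk / df_blk, ss_err / df_err
f_trt = ms_trt / ms_err  # F ratio for the K2O effect

print(f"SS: total={ss_total:.4f} trt={ss_trt:.4f} block={ss_blk:.4f} error={ss_err:.4f}")
print(f"MS: trt={ms_trt:.5f} block={ms_blk:.5f} error={ms_err:.6f}  F={f_trt:.2f}")
```

The p-value is deliberately left out because it needs the F distribution, which is not in the Python standard library.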

[Figure: SAS fit diagnostics panel for strength. Panels: residual vs. predicted value, RStudent vs. predicted value, RStudent vs. leverage, residual vs. normal quantile, strength vs. predicted value, Cook's D by observation, histogram of residuals (percent), and residual-fit spread plot (fit minus mean and residual vs. proportion less). Error DF = 8, parameters = 7, observations = 15.]

ANOVA RESULTS FOR STRENGTH BY TREATMENT

The GLM Procedure

[Figure: Interaction plot for strength (strength vs. k2O level, one line for each of blocks 1, 2, 3).]

[Figure: Distribution of strength by block (box plots).]

| Level of block | N | strength Mean | strength Std Dev |
|---|---|---|---|
| 1 | 5 | 7.63000000 | 0. |
| 2 | 5 | 7.82600000 | 0. |
| 3 | 5 | 7.71000000 | 0. |

Dependent Variable: strength

| Parameter | Estimate | Standard Error | t Value | Pr > \|t\| |
|---|---|---|---|---|
| K2O=36 | 0.12800000 | 0.10793208 | 1.19 | 0. |
| K2O=54 | 0.33133333 | 0.10793208 | 3.07 | 0. |
| K2O=72 | 0.02133333 | 0.10793208 | 0.20 | 0. |
| K2O=108 | -0.20866667 | 0.10793208 | -1.93 | 0. |
| K2O=144 | -0.27200000 | 0.10793208 | -2.52 | 0. |
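The estimates in this output are the treatment effects under the sum-to-zero constraint, $\hat\tau_i = \bar y_{i\cdot} - \bar y_{\cdot\cdot}$. The Python sketch below recomputes them from the cotton data. The standard error uses $\sqrt{MS_E\,(a-1)/(ab)}$, which follows from the variance of $\bar y_{i\cdot} - \bar y_{\cdot\cdot}$ under the model; that formula is supplied here as a supporting step and is not stated in the notes. The value of MSE is taken from the ANOVA table.

```python
from math import sqrt

levels = [36, 54, 72, 108, 144]
strength = [
    [7.62, 8.14, 7.76, 7.17, 7.46],  # block 1
    [8.00, 8.15, 7.73, 7.57, 7.68],  # block 2
    [7.93, 7.87, 7.74, 7.80, 7.21],  # block 3
]
b, a = len(strength), len(strength[0])
mse = 0.043685  # error mean square from the ANOVA table

grand_mean = sum(sum(row) for row in strength) / (a * b)
trt_means = [sum(row[i] for row in strength) / b for i in range(a)]

# Treatment effects under sum(tau_i) = 0, and their common standard error.
tau_hat = [m - grand_mean for m in trt_means]
se_tau = sqrt(mse * (a - 1) / (a * b))
t_values = [t / se_tau for t in tau_hat]

for lvl, est, t in zip(levels, tau_hat, t_values):
    print(f"K2O={lvl:<3d} estimate={est: .8f}  se={se_tau:.8f}  t={t: .2f}")
```

The printed estimates, standard error, and t values can be compared line by line with the ESTIMATE output above.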

Tukey's Studentized Range (HSD) Test for strength

Note: This test controls the Type I experimentwise error rate, but it generally has a higher Type II error rate than REGWQ.

| Quantity | Value |
|---|---|
| Alpha | 0.05 |
| Error Degrees of Freedom | 8 |
| Error Mean Square | 0.043685 |
| Critical Value of Studentized Range | 4. |
| Minimum Significant Difference | 0. |

Means with the same letter are not significantly different.

| Tukey Grouping | Mean | N | k2O |
|---|---|---|---|
| A | 8.0533 | 3 | 54 |
| B A | 7.8500 | 3 | 36 |
| B A | 7.7433 | 3 | 72 |
| B A | 7.5133 | 3 | 108 |
| B | 7.4500 | 3 | 144 |

Comparisons significant at the 0.05 level are indicated by ***.

| k2O Comparison | Difference Between Means | Lower 95% Limit | Upper 95% Limit | |
|---|---|---|---|---|
| 54 - 36 | 0.2033 | -0.3862 | 0.7929 | |
| 54 - 72 | 0.3100 | -0.2796 | 0.8996 | |
| 54 - 108 | 0.5400 | -0.0496 | 1.1296 | |
| 54 - 144 | 0.6033 | 0.0138 | 1.1929 | *** |
| 36 - 54 | -0.2033 | -0.7929 | 0.3862 | |
| 36 - 72 | 0.1067 | -0.4829 | 0.6962 | |
| 36 - 108 | 0.3367 | -0.2529 | 0.9262 | |
| 36 - 144 | 0.4000 | -0.1896 | 0.9896 | |
| 72 - 54 | -0.3100 | -0.8996 | 0.2796 | |
| 72 - 36 | -0.1067 | -0.6962 | 0.4829 | |
| 72 - 108 | 0.2300 | -0.3596 | 0.8196 | |
| 72 - 144 | 0.2933 | -0.2962 | 0.8829 | |
| 108 - 54 | -0.5400 | -1.1296 | 0.0496 | |
| 108 - 36 | -0.3367 | -0.9262 | 0.2529 | |
| 108 - 72 | -0.2300 | -0.8196 | 0.3596 | |
| 108 - 144 | 0.0633 | -0.5262 | 0.6529 | |
| 144 - 54 | -0.6033 | -1.1929 | -0.0138 | *** |
| 144 - 36 | -0.4000 | -0.9896 | 0.1896 | |
| 144 - 72 | -0.2933 | -0.8829 | 0.2962 | |
| 144 - 108 | -0.0633 | -0.6529 | 0.5262 | |
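The decision rule behind the comparison table can be sketched as follows: a pair of treatments is declared different when the absolute difference in means exceeds the half-width of the simultaneous interval. In this sketch the half-width is an assumption backed out of the printed limits for the 54 - 144 comparison (0.6033 - 0.0138); it is not computed from the studentized range distribution.

```python
from itertools import combinations

# Treatment means from the Tukey grouping table, keyed by K2O level.
means = {54: 8.0533, 36: 7.8500, 72: 7.7433, 108: 7.5133, 144: 7.4500}

# Interval half-width implied by the printed limits for 54 - 144 (an assumption
# for this illustration, not a value derived from the studentized range).
half_width = 0.6033 - 0.0138

significant = []
for lvl1, lvl2 in combinations(means, 2):
    diff = means[lvl1] - means[lvl2]
    if abs(diff) > half_width:  # interval for this pair excludes zero
        significant.append((lvl1, lvl2, round(diff, 4)))

print(significant)
```

Only the 54 versus 144 pair is flagged, which agrees with the two starred rows of the comparison table.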

3.3 SAS Code for Cotton Fiber Breaking Strength RCBD

DM 'LOG; CLEAR; OUT; CLEAR;';
OPTIONS NODATE NONUMBER LS=76;
ODS GRAPHICS ON;
ODS PRINTER PDF file='C:\COURSES\ST541\RCBD.PDF';

*** A RANDOMIZED COMPLETE BLOCK DESIGN ***;
* Each record is k2O level, block, strength. The trailing @@ reads several records per line;
DATA in; INPUT k2O block strength @@; CARDS;
36 1 7.62 36 2 8.00 36 3 7.93
54 1 8.14 54 2 8.15 54 3 7.87
72 1 7.76 72 2 7.73 72 3 7.74
108 1 7.17 108 2 7.57 108 3 7.80
144 1 7.46 144 2 7.68 144 3 7.21
;

* SS3 prints only the Type III table and SOLUTION prints the parameter estimates;
PROC GLM DATA=in PLOTS = (ALL);
CLASS k2O block;
MODEL strength = k2O block / SS3 SOLUTION;
MEANS block;
MEANS k2O / TUKEY CLDIFF LINES;
* Each ESTIMATE is one treatment mean minus the average of all five treatment means;
ESTIMATE 'K2O=36' K2O 4 -1 -1 -1 -1 / DIVISOR=5;
ESTIMATE 'K2O=54' K2O -1 4 -1 -1 -1 / DIVISOR=5;
ESTIMATE 'K2O=72' K2O -1 -1 4 -1 -1 / DIVISOR=5;
ESTIMATE 'K2O=108' K2O -1 -1 -1 4 -1 / DIVISOR=5;
ESTIMATE 'K2O=144' K2O -1 -1 -1 -1 4 / DIVISOR=5;
TITLE 'ANOVA RESULTS FOR STRENGTH BY TREATMENT';
RUN;

  • In both the fixed and random block cases, the ANOVA F-tests associated with treatment effects are identical. You use $F_0 = MS_{Trt}/MS_E$ to test

$$H_0: \tau_1 = \cdots = \tau_a = 0 \quad\text{against}\quad H_1: \text{not all of the } \tau_i \text{ are equal} \tag{9}$$

  • The EMS for the RCBD indicates that the correct denominator EMS for testing for a significant block effect (either fixed or random) is the EMS for the restriction error. The problem is that this is not estimable from the data.

  • Under these circumstances, the test of the hypothesis involving the combination of the block effects and the restriction error in (10) would be appropriate to test for a ‘general’ blocking effect.

  • The statistic $F = MS_{blocks}/MS_E$ is actually a test of

$$H_0: \sigma^2_\delta + \phi(\beta) = 0 \quad\text{against}\quad H_1: \sigma^2_\delta + \phi(\beta) \neq 0 \tag{10}$$

Note that even if $\beta_1 = \beta_2 = \cdots = \beta_b = 0$ (fixed) or $\sigma^2_\beta = 0$ (random) is true, we still have the restriction error in the EMS, which prevents it from matching the error EMS $= \sigma^2$.

  • Because of the restriction on randomization, A&M claim that there is no F test for blocks. That is, there is no test for $H_0: \sigma^2_\beta = 0$ if blocks are random and no test for $H_0: \beta_1 = \beta_2 = \cdots = \beta_b$ if blocks are fixed.

  • Fortunately this is not a problem because most of the time the experimenter is only interested in whether or not blocking was effective in reducing the MSE for improved testing of the effects of the treatment of interest.

3.5 Example of an Analysis With and Without Blocks

Three different disinfecting solutions are being compared to study their effectiveness in stopping the growth of bacteria in milk containers. The analysis is done in a laboratory, and only three trials can be run on any day. Because days could represent a potential source of variability, the experimenter decides to use a randomized block design with days as blocks. Observations are taken for four days. The insides of the milk containers are covered with a certain amount of bacteria. The response is the percentage of bacteria remaining after rinsing the container with a disinfecting solution.

| Solution | Day 1 | Day 2 | Day 3 | Day 4 |
|---|---|---|---|---|
| 1 | 13 | 22 | 18 | 39 |
| 2 | 16 | 24 | 17 | 44 |
| 3 | 5 | 4 | 1 | 22 |

  • The data were analyzed assuming two different models. The first model does not include blocks. The second model includes blocks. The SAS code for both analyses is given below. Here are important results:

| | Without blocks | With blocks |
|---|---|---|
| $R^2$ | 0.3778 | 0.9722 |
| MSE | 128.75 | 8.64 |
| p-value | | |

  • Note that we would fail to reject $H_0$ if blocks were not in the model because there is large variability across blocks ($MS_{day} = 368.97$).

  • If $SS_{day} = 1106.92$ with $df_{day} = 3$ is pooled with $SS_E = 51.83$ and $df_E = 6$ from the model with days (blocks), it forms $SS_E = 1158.75$ with $df_E = 9$ for the model without days (blocks).
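The pooling argument can be checked numerically. The Python sketch below uses the growth values from the SAS DATA step in this section (solutions in rows, days in columns); the variable names are illustrative only.

```python
# Percentage of bacteria remaining: rows are solutions 1-3, columns are days 1-4.
growth = [
    [13, 22, 18, 39],  # solution 1
    [16, 24, 17, 44],  # solution 2
    [5, 4, 1, 22],     # solution 3
]
a, b = len(growth), len(growth[0])  # a treatments, b blocks (days)
n = a * b

cf = sum(sum(row) for row in growth) ** 2 / n  # correction factor
ss_total = sum(y * y for row in growth for y in row) - cf
ss_sol = sum(sum(row) ** 2 for row in growth) / b - cf
day_totals = [sum(row[j] for row in growth) for j in range(b)]
ss_day = sum(t * t for t in day_totals) / a - cf

# Model with blocks: error is what is left after solutions and days.
sse_with = ss_total - ss_sol - ss_day
mse_with = sse_with / ((a - 1) * (b - 1))
r2_with = (ss_sol + ss_day) / ss_total

# Model without blocks: the day sum of squares is absorbed into the error.
sse_without = ss_total - ss_sol
mse_without = sse_without / (n - a)
r2_without = ss_sol / ss_total

print(f"with blocks:    SSE={sse_with:.2f} MSE={mse_with:.2f} R2={r2_with:.4f}")
print(f"without blocks: SSE={sse_without:.2f} MSE={mse_without:.2f} R2={r2_without:.4f}")
print(f"SS_day + SSE(with) = {ss_day + sse_with:.2f}")
```

Dropping the blocks moves the entire day sum of squares and its 3 degrees of freedom into the error term, which is why the MSE is so much larger in the model without days.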

SAS Code for RCBD Analyses With and Without Blocks

DM 'LOG;CLEAR;OUT;CLEAR';
ODS GRAPHICS ON;
ODS PRINTER PDF file='C:\COURSES\ST541\RCBD2.PDF';
OPTIONS NODATE NONUMBER LS=76 PS=54;

*** RCBD ANALYSES WITH AND WITHOUT BLOCKS ***;
* The nested DO loops assign solution and day to each growth value as it is read;
DATA IN;
  DO solution = 1 TO 3;
  DO day = 1 TO 4;
    INPUT growth @@; OUTPUT;
  END; END;
LINES;
13 22 18 39 16 24 17 44 5 4 1 22
;

*******************************************************;
*** RUN AN ANOVA WITH SOLUTION ONLY, NO DAY BLOCKS ***;
*******************************************************;
PROC GLM DATA=IN;
  CLASS solution;
  MODEL growth = solution / ss3;
TITLE 'RCBD WITHOUT DAYS (BLOCKS) IN THE MODEL';

*** RUN AN ANOVA WITH DAYS AS BLOCKS ***;
PROC GLM DATA=IN;
  CLASS day solution;
  MODEL growth = solution day / ss3;
TITLE 'RCBD WITH DAYS (BLOCKS) IN THE MODEL';
RUN;

3.6 Type I vs Type III Analyses

  • Without the /ss3 option in the MODEL statement, the SAS output will contain two ANOVA tables: one for the Type I sums of squares and one for the Type III sums of squares.
  • If there are no missing observations, the Type I and Type III analyses are identical.
  • If there are missing observations, the Type I and Type III analyses are different. To see how they differ, we will first look at the Type I analysis.

3.6.1 Type I Analysis

  • The Type I analysis is based on sequentially fitting the data to the model one factor at a time. It is often referred to as the sequential sum of squares method.

  • For the RCBD there are two possibilities that I will refer to as
    • Version 1 (V1) when fitting treatments before blocks.
    • Version 2 (V2) when fitting blocks before treatments.
  • Let $RSS_i$ be the error sum of squares ($SS_E$) after fitting the model in the $i$th step.

  • The steps for determining the ANOVA SS for V1 are:
    1. Fit $y_{ij} = \mu + \epsilon_{ij}$ and obtain $RSS_1 = SS_{Total}$.
    2. Fit $y_{ij} = \mu + \tau_i + \epsilon_{ij}$ and obtain $RSS_2 = SS_E$ for the model with treatments only.
    3. Fit $y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}$ and obtain $RSS_3 = SS_E$ for the model with treatments and blocks.
  • The steps for determining the ANOVA SS for V2 are:
    1. Fit $y_{ij} = \mu + \epsilon_{ij}$ and obtain $RSS_1 = SS_{Total}$.
    2. Fit $y_{ij} = \mu + \beta_j + \epsilon_{ij}$ and obtain $RSS_2^* = SS_E$ for the model with blocks only.
    3. Fit $y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}$ and obtain $RSS_3 = SS_E$ for the model with blocks and treatments.
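  • The ANOVA sums of squares for V1 and V2 are summarized in the following tables:

| Step | V1 Source | Fit | df | Type I SS for V1 |
|---|---|---|---|---|
| 1 | Total | $\mu$ | $N - 1$ | $RSS_1$ |
| 2 | Treatment | $\tau_i$ | $a - 1$ | $R(\tau \mid \mu) = RSS_1 - RSS_2$ |
| 3 | Blocks | $\beta_j$ | $b - 1$ | $R(\beta \mid \tau, \mu) = RSS_2 - RSS_3$ |
| 3 | Error | $\epsilon_{ij}$ | $N - a - b + 1$ | $RSS_3$ |

| Step | V2 Source | Fit | df | Type I SS for V2 |
|---|---|---|---|---|
| 1 | Total | $\mu$ | $N - 1$ | $RSS_1$ |
| 2 | Blocks | $\beta_j$ | $b - 1$ | $R(\beta \mid \mu) = RSS_1 - RSS_2^*$ |
| 3 | Treatment | $\tau_i$ | $a - 1$ | $R(\tau \mid \beta, \mu) = RSS_2^* - RSS_3$ |
| 3 | Error | $\epsilon_{ij}$ | $N - a - b + 1$ | $RSS_3$ |

  • In V1, the quantity $R(\tau \mid \mu)$ is called the reduction in SS due to $\tau$ adjusted for $\mu$, and $R(\beta \mid \tau, \mu)$ is called the reduction in SS for $\beta$ adjusted for $\tau$ and $\mu$.

  • In V2, the quantity $R(\beta \mid \mu)$ is called the reduction in SS due to $\beta$ adjusted for $\mu$, and $R(\tau \mid \beta, \mu)$ is called the reduction in SS for $\tau$ adjusted for $\beta$ and $\mu$.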

3.6.2 Type III Analysis

  • The Type III analysis is referred to as the marginal means or the Yates weighted squares of means analysis.

  • For a RCBD, the Type III $SS_{Trt}$ and $SS_{blocks}$ are computed using the following procedure:
    1. Fit the model with treatments only: $y_{ij} = \mu + \tau_i + \epsilon_{ij}$. Then $RSS_2 = SS_E$ for this model.
    2. Fit the model with blocks only: $y_{ij} = \mu + \beta_j + \epsilon_{ij}$. Then $RSS_2^* = SS_E$ for this model.
    3. Fit the model $y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}$. Then $RSS_3 = SS_E$ and $RSS_1 = SS_{Total}$ for the model with both treatments and blocks.

| Step | Source | Fit | df | Type III SS |
|---|---|---|---|---|
| 1 | Total | $\mu$ | $N - 1$ | $RSS_1$ |
| 2 | Treatment | $\tau_i$ | $a - 1$ | $R(\tau \mid \beta, \mu) = RSS_2^* - RSS_3$ |
| 3 | Blocks | $\beta_j$ | $b - 1$ | $R(\beta \mid \tau, \mu) = RSS_2 - RSS_3$ |
| 3 | Error | $\epsilon_{ij}$ | $N - a - b + 1$ | $RSS_3$ |

  • If any $y_{ij}$ values are missing, then $SS_{Trt} + SS_{blocks} + SS_E \neq SS_{Total}$ for a Type III analysis.

3.6.3 RCBD Analysis with a Missing Observation

See the example in Section 3.5 for the description of the experiment. Suppose $y_{23}$ was missing from the RCBD. The RCBD data table is:

| Solution | Day 1 | Day 2 | Day 3 | Day 4 |
|---|---|---|---|---|
| 1 | 13 | 22 | 18 | 39 |
| 2 | 16 | 24 | . | 44 |
| 3 | 5 | 4 | 1 | 22 |

  • Let us examine the Type I and Type III sums of squares. The next page contains the SAS output.
  • The top of the next page contains the Type I (V1) sum of squares and the bottom of the page contains the Type I (V2) sum of squares. Note the difference in sums of squares, mean squares, F-statistics, and p-values for the Type I analyses.

  • The reason for the difference between the V1 and V2 Type I sums of squares is that a Type I analysis is sequential, so the order in which terms enter the model is important.

  • The Type III analysis is the same for both analyses because Type III sums of squares are not calculated sequentially. That is, the order in which terms enter the model is not important.

  • The following page contains the two analyses with only one effect in each model. I included these analyses so you can see how $RSS_2$ and $RSS_2^*$ are calculated.

$RSS_2$ is the SSE for the model with only treatments and no blocks.

ANOVA RESULTS FOR THE MODEL WITH SOLUTION (TREATMENTS) ONLY

ANOVA RESULTS (SOLUTION ONLY)

The GLM Procedure

Dependent Variable: growth

| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| Model | 2 | 790.909091 | 395.454545 | 2.96 | 0. |
| Error | 8 | 1068.000000 | 133.500000 | | |
| Corrected Total | 10 | 1858.909091 | | | |

| R-Square | Coeff Var | Root MSE | growth Mean |
|---|---|---|---|
| 0.425469 | 61.10405 | 11.55422 | 18. |

| Source | DF | Type I SS | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| solution | 2 | 790.9090909 | 395.4545455 | 2.96 | 0. |

| Source | DF | Type III SS | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| solution | 2 | 790.9090909 | 395.4545455 | 2.96 | 0. |

[Figure: Distribution of growth by solution (box plots), annotated with the F statistic and Prob > F.]

$RSS_2^*$ is the SSE for the model with only blocks and no treatments.

ANOVA RESULTS FOR THE MODEL WITH DAYS (BLOCKS) ONLY

ANOVA RESULTS (DAY ONLY)

The GLM Procedure

Dependent Variable: growth

| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| Model | 3 | 1141.075758 | 380.358586 | 3.71 | 0. |
| Error | 7 | 717.833333 | 102.547619 | | |
| Corrected Total | 10 | 1858.909091 | | | |

| R-Square | Coeff Var | Root MSE | growth Mean |
|---|---|---|---|
| 0.613842 | 53.55403 | 10.12658 | 18. |

| Source | DF | Type I SS | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| day | 3 | 1141.075758 | 380.358586 | 3.71 | 0. |

| Source | DF | Type III SS | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| day | 3 | 1141.075758 | 380.358586 | 3.71 | 0. |

[Figure: Distribution of growth (box plots) for the day-only model, annotated with the F statistic and Prob > F.]

All of these calculations are done automatically in the RCBD analyses for the two models on the previous page.

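Type I SS (V1) Summary

| Residual SS | Reduction in SS |
|---|---|
| $RSS_1 = 1858.91$ | $R(\mu) = RSS_1 = 1858.91$ |
| $RSS_2 = 1068.00$ | $R(\tau \mid \mu) = RSS_1 - RSS_2 = 790.91$ |
| $RSS_3 = 47.33$ | $R(\beta \mid \tau, \mu) = RSS_2 - RSS_3 = 1020.67$ |

Type I SS (V2) Summary

| Residual SS | Reduction in SS |
|---|---|
| $RSS_1 = 1858.91$ | $R(\mu) = RSS_1 = 1858.91$ |
| $RSS_2^* = 717.83$ | $R(\beta \mid \mu) = RSS_1 - RSS_2^* = 1141.08$ |
| $RSS_3 = 47.33$ | $R(\tau \mid \beta, \mu) = RSS_2^* - RSS_3 = 670.50$ |

Type III SS Summary

| Residual SS | Reduction in SS |
|---|---|
| $RSS_1 = 1858.91$ | $R(\mu) = RSS_1 = 1858.91$ |
| $RSS_3 = 47.33$ | |
| $RSS_2 = 1068.00$ | $R(\beta \mid \tau, \mu) = RSS_2 - RSS_3 = 1020.67$ |
| $RSS_2^* = 717.83$ | $R(\tau \mid \beta, \mu) = RSS_2^* - RSS_3 = 670.50$ |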
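The four residual sums of squares in these summaries can be reproduced by fitting each model with ordinary least squares. The Python sketch below does this with the standard library only. The helper names `design` and `residual_ss`, and the reference-cell coding (dropping the first level of each factor to get a full-rank design), are choices made for this illustration; the fitted values and residual sums of squares do not depend on that coding.

```python
# Observed growth values keyed by (solution, day); y_23 is missing.
data = {(1, 1): 13, (1, 2): 22, (1, 3): 18, (1, 4): 39,
        (2, 1): 16, (2, 2): 24,             (2, 4): 44,
        (3, 1): 5,  (3, 2): 4,  (3, 3): 1,  (3, 4): 22}
cells = list(data)
y = [data[c] for c in cells]

def design(use_solution, use_day):
    """Full-rank design matrix: intercept plus indicator columns (first levels dropped)."""
    rows = []
    for sol, day in cells:
        row = [1.0]
        if use_solution:
            row += [float(sol == s) for s in (2, 3)]
        if use_day:
            row += [float(day == d) for d in (2, 3, 4)]
        rows.append(row)
    return rows

def residual_ss(X, y):
    """Solve the normal equations X'X b = X'y by Gauss-Jordan elimination; return the SSE."""
    p = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yk for r, yk in zip(X, y))] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))  # partial pivoting
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [x - f * z for x, z in zip(A[k], A[c])]
    coef = [A[i][p] / A[i][i] for i in range(p)]
    return sum((yk - sum(bi * xi for bi, xi in zip(coef, r))) ** 2
               for r, yk in zip(X, y))

rss1 = residual_ss(design(False, False), y)      # mean only
rss2 = residual_ss(design(True, False), y)       # treatments only
rss2_star = residual_ss(design(False, True), y)  # blocks only
rss3 = residual_ss(design(True, True), y)        # treatments and blocks

print(f"RSS1={rss1:.2f} RSS2={rss2:.2f} RSS2*={rss2_star:.2f} RSS3={rss3:.2f}")
print(f"Type I (V1):  R(tau|mu)={rss1 - rss2:.2f}  R(beta|tau,mu)={rss2 - rss3:.2f}")
print(f"Type I (V2):  R(beta|mu)={rss1 - rss2_star:.2f}  R(tau|beta,mu)={rss2_star - rss3:.2f}")
print(f"Type III:     R(tau|beta,mu)={rss2_star - rss3:.2f}  R(beta|tau,mu)={rss2 - rss3:.2f}")
```

With the missing cell, the Type III treatment and block sums of squares plus the error sum of squares no longer add up to $RSS_1$, which is the non-additivity noted at the end of Section 3.6.2.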

DM 'LOG; CLEAR; OUT; CLEAR;';
ODS GRAPHICS ON;
ODS PRINTER PDF file='C:\COURSES\ST541\RCBDMISS.PDF';
OPTIONS NODATE NONUMBER;

***************************************;
*** RCBD WITH A MISSING OBSERVATION ***;
***************************************;
* The lone period in the data line is the SAS missing value for solution 2 on day 3;
DATA IN;
  DO solution = 1 TO 3; DO day = 1 TO 4; INPUT growth @@; OUTPUT; END; END;
CARDS;
13 22 18 39 16 24 . 44 5 4 1 22
;

***************************************************;
*** RUN AN ANOVA WITH SOLUTION APPEARING FIRST ***;
***************************************************;
PROC GLM DATA=IN;
  CLASS solution day; MODEL growth = solution day;
TITLE 'ANOVA RESULTS (SOLUTION THEN DAY)';

**********************************************;
*** RUN AN ANOVA WITH DAY APPEARING FIRST ***;
**********************************************;
PROC GLM DATA=IN;
  CLASS day solution; MODEL growth = day solution;
TITLE 'ANOVA RESULTS (DAY THEN SOLUTION)';

****************************************;
*** RUN AN ANOVA WITH SOLUTION ONLY ***;
****************************************;
PROC GLM DATA=IN;
  CLASS solution; MODEL growth = solution;
TITLE 'ANOVA RESULTS (SOLUTION ONLY)';

***********************************;
*** RUN AN ANOVA WITH DAY ONLY ***;
***********************************;
PROC GLM DATA=IN;
  CLASS day; MODEL growth = day;
TITLE 'ANOVA RESULTS (DAY ONLY)';
RUN;

3.6.4 Type I vs Type III Hypotheses

  • Because of differences between Type I and Type III SS, there will be differences in the hypotheses associated with the F-tests (assuming the restriction on randomization is ignored).

  • Let $\mu_{ij} = \mu + \tau_i + \beta_j$ be the mean for the $i$th treatment in the $j$th block.

Hypotheses for Type III and Type I (V2) Sum of Squares

$$H_0: \mu_{1\cdot} = \mu_{2\cdot} = \cdots = \mu_{a\cdot} \qquad H_1: \mu_{i\cdot} \neq \mu_{i^*\cdot} \text{ for some } i \neq i^*, \qquad\text{where } \mu_{i\cdot} = \Big(\sum_{j=1}^{b} \mu_{ij}\Big)\Big/ b.$$

Hypotheses for Type I (V1) Sum of Squares

$$H_0: \frac{1}{n_{1\cdot}}\sum_{j=1}^{b} n_{1j}\,\mu_{1j} = \frac{1}{n_{2\cdot}}\sum_{j=1}^{b} n_{2j}\,\mu_{2j} = \cdots = \frac{1}{n_{a\cdot}}\sum_{j=1}^{b} n_{aj}\,\mu_{aj}$$

$$H_1: \frac{1}{n_{i\cdot}}\sum_{j=1}^{b} n_{ij}\,\mu_{ij} \neq \frac{1}{n_{i^*\cdot}}\sum_{j=1}^{b} n_{i^*j}\,\mu_{i^*j} \text{ for some } i \neq i^*,$$

where $n_{i\cdot}$ is the number of nonmissing $y_{ij}$ values for the $i$th treatment, $n_{ij} = 1$ if $y_{ij}$ is not missing, and $n_{ij} = 0$ if $y_{ij}$ is missing.

  • The Type III hypotheses compare the treatment means averaged across the blocks (and are the ones I want to test). Therefore I recommend using the p-values from a Type III analysis.

  • If there are no missing $y_{ij}$ values, the Type I and Type III hypotheses are the same.
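To make the difference between the two sets of hypotheses concrete, the Python sketch below writes out the weight that each $\mu_{ij}$ receives in a treatment's mean, using the missing-data pattern of Section 3.6.3 ($y_{23}$ missing). The variable names are illustrative only.

```python
# n[i][j] = 1 if y_ij was observed and 0 if it is missing
# (rows: solutions 1-3, columns: days 1-4); y_23 is the missing cell.
n = [
    [1, 1, 1, 1],
    [1, 1, 0, 1],
    [1, 1, 1, 1],
]
b = len(n[0])

# Type I (V1): weight n_ij / n_i. on mu_ij, so a missing cell drops out of the average.
type1_weights = [[nij / sum(row) for nij in row] for row in n]

# Type III: every block mean gets the same weight 1/b, whether or not the cell was observed.
type3_weights = [[1 / b] * b for _ in n]

for i, (w1, w3) in enumerate(zip(type1_weights, type3_weights), start=1):
    print(f"treatment {i}: Type I (V1) weights {w1}   Type III weights {w3}")
```

For treatment 2 the Type I (V1) average ignores day 3 entirely, so it compares that treatment over a different set of days than the other two treatments; the Type III weights keep all treatments on the same footing.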

3.7 RCBD Normal Equations

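  • For the model $y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}$, the error is $\epsilon_{ij} = y_{ij} - \mu - \tau_i - \beta_j$.
  • Substituting in estimates produces the residual $\hat\epsilon_{ij} = e_{ij} = y_{ij} - \hat\mu - \hat\tau_i - \hat\beta_j$.
  • Goal: Find $\hat\mu$, $\hat\tau_i$, and $\hat\beta_j$ that minimize $L$:

$$L = \sum_{i=1}^{a}\sum_{j=1}^{b} e_{ij}^2 = \sum_{i=1}^{a}\sum_{j=1}^{b} (y_{ij} - \hat\mu - \hat\tau_i - \hat\beta_j)^2$$

  • Solution: Solve the normal equations

$$\frac{\partial L}{\partial \hat\mu} = -2\sum_{i=1}^{a}\sum_{j=1}^{b} (y_{ij} - \hat\mu - \hat\tau_i - \hat\beta_j) = 0$$

$$\frac{\partial L}{\partial \hat\tau_i} = -2\sum_{j=1}^{b} (y_{ij} - \hat\mu - \hat\tau_i - \hat\beta_j) = 0 \quad\text{for } i = 1, 2, \ldots, a$$

$$\frac{\partial L}{\partial \hat\beta_j} = -2\sum_{i=1}^{a} (y_{ij} - \hat\mu - \hat\tau_i - \hat\beta_j) = 0 \quad\text{for } j = 1, 2, \ldots, b$$

  • After distributing the sum and then simplifying, we get:

$$\text{(i)}\quad y_{\cdot\cdot} = ab\,\hat\mu + b\sum_{i=1}^{a} \hat\tau_i + a\sum_{j=1}^{b} \hat\beta_j$$

$$\text{(ii)}\quad y_{i\cdot} = b\,\hat\mu + b\,\hat\tau_i + \sum_{j=1}^{b} \hat\beta_j \quad\text{for } i = 1, 2, \ldots, a$$

$$\text{(iii)}\quad y_{\cdot j} = a\,\hat\mu + \sum_{i=1}^{a} \hat\tau_i + a\,\hat\beta_j \quad\text{for } j = 1, 2, \ldots, b$$

  • For (i), (ii), and (iii), there is a total of $1 + a + b$ equations. If you sum the $a$ equations in (ii), you get (i). If you sum the $b$ equations in (iii), you also get (i). Thus, the rank is $a + b - 1$, which implies that $\mu$ and each $\tau_i$ and $\beta_j$ are not uniquely estimable. To get estimates of $\mu$ and each $\tau_i$ and $\beta_j$, we must impose 2 constraints. We will use

$$\sum_{i=1}^{a} \tau_i = 0 \quad\text{and}\quad \sum_{j=1}^{b} \beta_j = 0.$$

  • Substitution of these constraints into (i), (ii), and (iii) yields

$$\text{(1)}\quad ab\,\hat\mu = y_{\cdot\cdot} \qquad \text{(2)}\quad b\,\hat\mu + b\,\hat\tau_i = y_{i\cdot} \qquad \text{(3)}\quad a\,\hat\mu + a\,\hat\beta_j = y_{\cdot j}$$

  • Then, from (1), we have

$$\hat\mu = \frac{y_{\cdot\cdot}}{ab} = \bar y_{\cdot\cdot}$$

  • Substitution of $\hat\mu = \bar y_{\cdot\cdot}$ in (2) yields:

$$b\,\bar y_{\cdot\cdot} + b\,\hat\tau_i = y_{i\cdot} \;\longrightarrow\; \bar y_{\cdot\cdot} + \hat\tau_i = \bar y_{i\cdot} \;\longrightarrow\; \hat\tau_i = \bar y_{i\cdot} - \bar y_{\cdot\cdot}$$

  • Substitution of $\hat\mu = \bar y_{\cdot\cdot}$ in (3) yields:

$$a\,\bar y_{\cdot\cdot} + a\,\hat\beta_j = y_{\cdot j} \;\longrightarrow\; \bar y_{\cdot\cdot} + \hat\beta_j = \bar y_{\cdot j} \;\longrightarrow\; \hat\beta_j = \bar y_{\cdot j} - \bar y_{\cdot\cdot}$$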
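A quick numerical check of this derivation is sketched below in Python, using the cotton data of Section 3.2 as the example. It computes the estimates $\bar y_{\cdot\cdot}$, $\bar y_{i\cdot} - \bar y_{\cdot\cdot}$, and $\bar y_{\cdot j} - \bar y_{\cdot\cdot}$ and confirms that the residuals sum to zero overall, within each treatment, and within each block, which is exactly what the three sets of normal equations require. The variable names are illustrative only.

```python
# Cotton data: rows are blocks 1-3, columns are the five K2O treatments.
y = [
    [7.62, 8.14, 7.76, 7.17, 7.46],
    [8.00, 8.15, 7.73, 7.57, 7.68],
    [7.93, 7.87, 7.74, 7.80, 7.21],
]
b, a = len(y), len(y[0])

mu_hat = sum(sum(row) for row in y) / (a * b)
tau_hat = [sum(row[i] for row in y) / b - mu_hat for i in range(a)]
beta_hat = [sum(row) / a - mu_hat for row in y]

# Residuals e_ij = y_ij - mu_hat - tau_hat_i - beta_hat_j (stored as resid[j][i]).
resid = [[y[j][i] - mu_hat - tau_hat[i] - beta_hat[j] for i in range(a)]
         for j in range(b)]

eq_mu = sum(sum(row) for row in resid)                               # dL/d(mu) condition
eq_tau = [sum(resid[j][i] for j in range(b)) for i in range(a)]      # one per treatment
eq_beta = [sum(row) for row in resid]                                # one per block

print("overall residual sum:", round(eq_mu, 12))
print("per-treatment residual sums:", [round(v, 12) for v in eq_tau])
print("per-block residual sums:", [round(v, 12) for v in eq_beta])
print("constraint sums:", round(sum(tau_hat), 12), round(sum(beta_hat), 12))
```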

3.8 Matrix Forms for the RCBD

Example: The goal is to determine whether or not four different tips produce different readings on a hardness testing machine. The machine operates by pressing the tip into a metal test coupon, and from the depth of the resulting depression, the hardness of the coupon can be determined. The experimenter decides to obtain four observations for each tip. Four randomly selected coupons (blocks) were used and each tip (treatment) was tested on each coupon. The data represent deviations from a desired depth in 0.1 mm units:

Type of Tip

Type of Coupon 1 2 3 4

Alternate Approach: Keeping a + b + 1 Columns

The columns of $X$ correspond to $\mu, \tau_1, \tau_2, \tau_3, \tau_4, \beta_1, \beta_2, \beta_3, \beta_4$.

$X =$

$y =$

$X'X =$

$(X'X)^{-1} =$

$X'y =$

$$(X'X)^{-1}X'y = \begin{pmatrix} \hat\mu & \hat\tau_1 & \hat\tau_2 & \hat\tau_3 & \hat\tau_4 & \hat\beta_1 & \hat\beta_2 & \hat\beta_3 & \hat\beta_4 \end{pmatrix}'$$
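The structure of a design matrix with $a + b + 1 = 9$ columns can be illustrated without any response data. The Python sketch below builds the 16 rows for $a = 4$ tips and $b = 4$ coupons, with the row ordering (coupon by coupon) chosen arbitrarily for this illustration. It then checks the two linear dependencies among the columns, which are the reason the rank is $a + b - 1$ as stated in Section 3.7, so that $X'X$ has an ordinary inverse only once constraints are incorporated.

```python
a, b = 4, 4  # four tips (treatments) and four coupons (blocks)

# One row per (coupon, tip) observation.
# Columns: mu, tau_1..tau_4, beta_1..beta_4.
X = []
for j in range(b):
    for i in range(a):
        row = ([1]
               + [int(i == k) for k in range(a)]   # treatment indicators
               + [int(j == k) for k in range(b)])  # block indicators
        X.append(row)

intercept = [row[0] for row in X]
tau_sum = [sum(row[1:1 + a]) for row in X]   # sum of the tau columns, row by row
beta_sum = [sum(row[1 + a:]) for row in X]   # sum of the beta columns, row by row

print("rows x columns:", len(X), "x", len(X[0]))
print("tau columns add up to the mu column:", tau_sum == intercept)
print("beta columns add up to the mu column:", beta_sum == intercept)
```

Because the treatment columns and the block columns each add up to the intercept column, two of the nine columns are redundant, leaving rank $9 - 2 = 7 = a + b - 1$.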