
EST 504 (Remmenga)
Assignment 8: Subsampling and Split-Plot Designs
DATA SETS
1. Use the Seafood Storage data set shown on the back. (I will call the data set SEAFOOD.) The SEAFOOD data set can be found
at my website. You may read it into SAS however you like.
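One possible way to read the data is sketched below. The file name, the column order, and the response name LNCOUNT are assumptions made only for illustration; TEMP, UNIT, and CON follow the variable names suggested later in this assignment. Adjust everything to match the actual SEAFOOD file, and if the file holds raw counts rather than logs, add a line such as lncount = log(count); to the DATA step.

```sas
/* Sketch only: 'seafood.dat' is a placeholder path, and the INPUT
   order is an assumed layout with one record per container. */
data seafood;
  infile 'seafood.dat';
  input temp unit con lncount;  /* temperature, storage unit, container, ln(bacterial count) */
run;

/* Print the data to confirm it was read as intended. */
proc print data=seafood;
run;
```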
SAS PROGRAMMING
We are going to perform four analyses on these data. The first and second will treat the containers as subsamples; we will
perform a one-way AOV on the subsample averages and a one-way AOV accounting for subsamples, both in GLM. For the
third and fourth analyses, we will treat the containers as subplot treatments; let Container 1 be untreated and Container 2 treated
with a preservative. The third and fourth analyses will be split-plot analyses in GLM and MIXED, respectively.
2. One-way AOV on the subsample averages.
a. Obtain the subsample averages in PROC MEANS and put them in a data set using the OUTPUT statement. You will need to
sort the data by temperature and storage unit, and obtain the means by temperature and storage unit. Use the NOPRINT option
of the PROC MEANS statement and print out the OUTPUT data set with PROC PRINT.
b. Compare the ln(bacterial counts) for the different temperatures by analyzing the subsample averages. Request the temperature
least squares means, standard errors and table of p-values for pair-wise comparisons. Perform this analysis using PROC GLM,
requesting only the Type III sums of squares (SS3).
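A minimal sketch of steps 2a and 2b, assuming the data set is named SEAFOOD with variables TEMP, UNIT, and a response called LNCOUNT. LNCOUNT, the output data set name UNITMEANS, and the mean variable AVGLNCOUNT are hypothetical names chosen for illustration.

```sas
/* 2a. Sort, then average the subsamples within each storage unit. */
proc sort data=seafood;
  by temp unit;
run;

proc means data=seafood noprint;       /* NOPRINT suppresses the printed table */
  by temp unit;
  var lncount;                         /* assumed response name */
  output out=unitmeans mean=avglncount;
run;

proc print data=unitmeans;
run;

/* 2b. One-way AOV on the storage-unit averages. */
proc glm data=unitmeans;
  class temp;
  model avglncount = temp / ss3;       /* Type III sums of squares only */
  lsmeans temp / stderr pdiff;         /* means, standard errors, pair-wise p-values */
run;
quit;
```

Because each storage unit contributes a single average, the residual in this GLM is the unit-to-unit variability, so no special error term is needed here.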
3. One-way AOV with subsampling.
a. The variability due to storage units treated alike needs to be separated from the variability of the subsamples within a unit. To do
this, the experimental unit of temperature (storage unit) is put into the GLM model and the pure subsampling variability is left
as residual. (Don’t forget to put storage unit into the CLASS statement.) Again request only the SS3.
Note: If we just use the variable for storage unit (say, UNIT) then it’s as if we have 9 reps of one treatment and there will be no
df left for error. So we have to use storage units within temperature levels, UNIT(TEMP), as the reps.
b. GLM doesn’t know that the storage unit is an experimental unit and not a treatment variable of some kind. To tell SAS to use
the storage units as the error term for testing temperature effects, put in a RANDOM statement specifying the storage units as a
random effect and use the TEST option to get the correct tests.
Note: The RANDOM statement in GLM doesn’t actually treat the specified factor as random. It provides a table of
Expected Mean Squares (EMS) and allows you to request tests using the specified factor as the error term. Other parts of
the program (least squares means, standard errors, contrasts, estimates, pair-wise tests, etc.) are not affected by the
statement.
c. Request the temperature least squares means, standard errors and table of p-values as usual.
d. Again, the LSMEANS statement in GLM doesn’t know that the residual error is not the correct error term for testing TEMP so
we must specify the correct term to use as the error. To see the difference, leave in one LSMEANS statement as we would
usually do it. Then add a second one with the correct error term specified. To specify the correct error term, use the E= option of
the LSMEANS statement as in: LSMEANS TEMP/STDERR PDIFF E=UNIT(TEMP);
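Steps 3a through 3d might be put together as in the sketch below. It assumes the data set SEAFOOD with variables TEMP and UNIT and a response named LNCOUNT; LNCOUNT is a hypothetical name used only for illustration.

```sas
proc glm data=seafood;
  class temp unit;                           /* storage unit must be a CLASS variable */
  model lncount = temp unit(temp) / ss3;     /* residual = subsamples within units */
  random unit(temp) / test;                  /* EMS table; tests TEMP against UNIT(TEMP) */
  lsmeans temp / stderr pdiff;               /* default: residual (subsampling) error */
  lsmeans temp / stderr pdiff e=unit(temp);  /* storage units as the error term */
run;
quit;
```

Comparing the two LSMEANS tables shows how the standard errors and p-values change once the storage units, rather than the subsamples, are used as the error term.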
4. Split-plot Analysis in PROC GLM
a. Start by copying your GLM from 3.
b. Add the container variable (say CON) to the CLASS and MODEL statements, and add the interaction between the container and
temperature (i.e., TEMP*CON) to the model. (The container now represents a second treatment factor, preservative, so we need to analyze
the factorial treatment structure formed by having both temperature and preservative as treatment factors.)
c. To get the correct pair-wise tests of means, we need different error terms depending on whether we are comparing temperature
or preservative main effects or the simple effects of the temperature/preservative combinations. So you will have to put in two
LSMEANS statements: one to get the means, standard errors and tests for the temperature main effects using the E= option to
specify the storage units as the experimental units and a second to get means, standard errors and tests for the preservative main
effects and for the simple effects of the temperature/preservative combinations.
d. Write two contrast statements to compare Temperature 1/Container 1 to Temperature 1/Container 2 and to compare
Temperature 1/Container 1 to Temperature 2/Container 1. Include ESTIMATE statements that are copies of your contrasts so that
you can verify your choice of coefficients. Insert all the CONTRAST statements before the RANDOM statement to produce expected
mean squares (EMS) of the contrasts.
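One way the split-plot GLM could be laid out is sketched below. It assumes the data set SEAFOOD with variables TEMP, UNIT, and CON, a hypothetical response name LNCOUNT, and two containers per storage unit. The coefficients assume that SAS orders the TEMP*CON cells as T1C1, T1C2, T2C1, T2C2, and so on; trailing coefficients that are left off are taken as zero.

```sas
proc glm data=seafood;
  class temp unit con;
  model lncount = temp unit(temp) con temp*con / ss3;

  /* CONTRAST statements come before RANDOM so their EMS are printed. */
  contrast 'T1C1 vs T1C2' con 1 -1  temp*con 1 -1;
  contrast 'T1C1 vs T2C1' temp 1 -1 temp*con 1 0 -1 0;

  /* ESTIMATE copies: the estimates should equal the differences
     between the corresponding TEMP*CON least squares means. */
  estimate 'T1C1 vs T1C2' con 1 -1  temp*con 1 -1;
  estimate 'T1C1 vs T2C1' temp 1 -1 temp*con 1 0 -1 0;

  random unit(temp) / test;

  lsmeans temp / stderr pdiff e=unit(temp);  /* whole-plot factor: storage units as error */
  lsmeans con temp*con / stderr pdiff;       /* subplot factor and simple effects: residual */
run;
quit;
```

Check each estimate against the TEMP*CON least squares means, and check the log for any note that a contrast is not estimable; either problem usually points to a coefficient in the wrong position.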
5. Split-plot Analysis in PROC MIXED
a. We want a complete analysis with tests for main effects and interaction along with comparisons of pair-wise means for both
treatments and for the treatment combinations. Also include the CONTRAST statements from 4. All class variables, fixed and random,
must be in the CLASS statement. Place only fixed factors in the MODEL statement and only random factors in the RANDOM
statement. Just like GLM, use UNIT(TEMP) to identify the random storage units (our only random factor). In MIXED, we only
need one LSMEANS statement for all of our effects (main and simple), and if you request pair-wise comparisons with PDIFF,
you will get the standard errors too. We don’t need to specify error terms or anything special.
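A minimal sketch of the MIXED version, under the same assumptions as before: data set SEAFOOD, variables TEMP, UNIT, and CON, a hypothetical response name LNCOUNT, two containers per storage unit, and TEMP*CON cells ordered T1C1, T1C2, T2C1, T2C2, and so on.

```sas
proc mixed data=seafood;
  class temp unit con;                 /* all class variables, fixed and random */
  model lncount = temp con temp*con;   /* fixed effects only */
  random unit(temp);                   /* storage units: the only random factor */
  lsmeans temp con temp*con / pdiff;   /* means, differences, standard errors */
  contrast 'T1C1 vs T1C2' con 1 -1  temp*con 1 -1;
  contrast 'T1C1 vs T2C1' temp 1 -1 temp*con 1 0 -1 0;
run;
```

Unlike the GLM version, UNIT(TEMP) appears only in the RANDOM statement and no E= option is needed, since MIXED builds the standard errors from the estimated variance components.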