




























































































Material Type: Notes; Professor: Hoff; Class: DESIGN ANLYS EXPMTS; Subject: Statistics; University: University of Washington - Seattle; Term: Autumn 2005;
© December 6, 2006
2.10.2 Which two-sample t-test to use? tdiff(yA, yB) vs. t(yA, yB)
Input variables consist of
controllable factors: measured and determined by scientist
uncontrollable factors: measured but not determined by scientist
noise factors: unmeasured, uncontrolled factors (experimental variability or “error”)
For any interesting process, there are inputs such that:
variability in input → variability in output
If variability in an input factor x leads to variability in output y, we say x is a source of variation. In this class we will discuss methods of designing and analyzing experiments to determine important sources of variation.
1.3 Experiments and observational studies
Information on how inputs affect output can be gained from observational studies and from controlled experiments.
Example (Women’s Health Initiative, WHI):
Observational Study:
Experimental Study (WHI randomized controlled trial):
373,092 women determined to be eligible → 18,845 provided consent to be in the experiment → 16,608 included in the experiment
16,608 women randomized to either
x = 1 (estrogen treatment)
x = 0 (control, i.e. no estrogen treatment)
using a randomized block design: women were treated at different clinics and were of different ages.

         age group
clinic   1 (50-59)   2 (60-69)   3 (70-79)
1        $n_{11}$    $n_{12}$    $n_{13}$
2        $n_{21}$    $n_{22}$    $n_{23}$
...

$n_{i,j}$ = # of women in the study in clinic $i$ and age group $j$ = # of women in block $(i, j)$
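A randomized block design of this kind can be sketched in R as follows. This is only an illustration: the data frame `women`, the block sizes (4 women per block) and the seed are hypothetical, and the code does not claim to reproduce the procedure actually used in the WHI trial.

```r
# Hypothetical illustration: 2 clinics x 3 age groups, 4 women per block.
set.seed(1)  # arbitrary seed, for reproducibility only
women <- expand.grid(id = 1:4, clinic = 1:2, age_group = 1:3)

# Block label for each woman: (clinic i, age group j)
block <- interaction(women$clinic, women$age_group)

# Within each block, randomly assign half the women to x = 1 (treatment)
# and half to x = 0 (control).
women$x <- ave(rep(0, nrow(women)), block,
               FUN = function(z) sample(rep(c(0, 1), length.out = length(z))))

# Number of controls (0) and treated (1) within every block
table(block, women$x)
```

Because the randomization is carried out separately inside each block, treatment and control groups are balanced with respect to clinic and age group by construction.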
[Figure: two diagrams, titled "Observational study" and "Randomized experiment"; the second is labelled "randomization".]
Observational studies can suggest good experiments to run, but can’t definitively show causation.
Randomization can eliminate correlation between $x_1$ and $y$ due to a different cause $x_2$, aka a confounder.
“No causation without randomization”
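A small simulation can make the role of the confounder concrete. The model below is entirely hypothetical (a standard normal confounder `x2`, a binary input `x1` with no effect on `y`, arbitrary sample size and seed); it is a sketch of the idea, not an analysis of any real study.

```r
set.seed(1)   # arbitrary seed
n  <- 10000   # hypothetical sample size
x2 <- rnorm(n)                      # confounder: affects y

# Observational study: x1 tends to be 1 when x2 is large,
# but x1 itself has no effect on y.
x1_obs <- as.numeric(x2 + rnorm(n) > 0)
y_obs  <- x2 + rnorm(n)

# Randomized experiment: x1 is assigned by coin flip, independently of x2.
x1_rand <- rbinom(n, 1, 0.5)
y_rand  <- x2 + rnorm(n)

cor(x1_obs, y_obs)     # clearly positive, although x1 has no effect on y
cor(x1_rand, y_rand)   # close to zero
```

In the observational version the correlation between `x1` and `y` is produced entirely by `x2`; randomizing `x1` breaks the link between `x1` and `x2` and the correlation disappears.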
1.4 Steps in designing an experiment
(a) factors of interest (b) nuisance factors
(a) treatment variables (b) potential large sources of variation/blocking variables
These factors are often constrained by budgets, ethics, time,...
Three principles in Experimental Design
Example: Wheat yield
Factor of interest: Fertilizer type, A or B. One factor of interest, having two levels.
Question: Is one fertilizer better than another, in terms of yield?
Experimental material: One plot of land to be divided into 2 rows of 6 subplots.
A      A      B      B      A      B
26.9   11.4   26.6   23.7   25.3   28.5

B      B      A      A      B      A
14.2   17.9   16.5   21.1   24.3   19.6
How much evidence is there that fertilizer type is a source of yield variation? Evidence about differences between two populations is generally measured by comparing summary statistics across the two sample populations. (Recall, a statistic is any computable function of known, observed data).
2.1 Summaries of sample populations
Distribution:
$\hat{F}(y) = \#(y_i \le y)/n = \widehat{\Pr}((-\infty, y])$
Note that these summaries more or less retain all the information in the data except the unit labels.
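As a minimal sketch, the empirical distribution function can be computed directly from its definition. The function name `Fhat` and the evaluation point 20 are illustrative choices; the data are the wheat yields from the example above.

```r
# Wheat yields (field order) and fertilizer labels from the example
y <- c(26.9, 11.4, 26.6, 23.7, 25.3, 28.5, 14.2, 17.9, 16.5, 21.1, 24.3, 19.6)
x <- c("A", "A", "B", "B", "A", "B", "B", "B", "A", "A", "B", "A")
yA <- y[x == "A"]

# Empirical CDF from the definition: #(y_i <= y0) / n
Fhat <- function(y0, data) mean(data <= y0)

Fhat(20, yA)    # proportion of A yields that are <= 20
ecdf(yA)(20)    # base R's ecdf() returns the same value
```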
Location:
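sample mean: $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$
sample median: a value $y_{(1/2)}$ such that $\#(y_i \le y_{(1/2)})/n \ge 1/2$ and $\#(y_i \ge y_{(1/2)})/n \ge 1/2$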
To find the median, sort the data in increasing order, and call these values $y_{(1)}, \ldots, y_{(n)}$. If there are no ties, then if $n$ is odd, $y_{((n+1)/2)}$ is the median; if $n$ is even, all numbers between $y_{(n/2)}$ and $y_{(n/2+1)}$ are medians.
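The sorting rule can be sketched in R as follows, using the six yields under fertilizer B. Taking the midpoint of the two middle order statistics when $n$ is even is a convention (it is what R's `median()` returns), not the only valid median.

```r
yB <- c(26.6, 23.7, 28.5, 14.2, 17.9, 24.3)  # yields under fertilizer B

ys <- sort(yB)      # order statistics y_(1), ..., y_(n)
n  <- length(ys)

if (n %% 2 == 1) {
  med <- ys[(n + 1) / 2]                    # n odd: the middle value
} else {
  # n even: every value between y_(n/2) and y_(n/2+1) is a median;
  # the midpoint is the conventional choice.
  med <- (ys[n / 2] + ys[n / 2 + 1]) / 2
}
med
```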
> mean(yA)
[1] 20.13333
> mean(yB)
[1] 22.53333

> median(yA)
[1] 20.35
> median(yB)
[1] 24

> sd(yA)
[1] 5.712676
> sd(yB)
[1] 5.432004

> quantile(yA, prob=c(.25,.75))
   25%    75%
17.275 24.250
> quantile(yB, prob=c(.25,.75))
   25%    75%
19.350 26.025
So there is a difference in yield for these wheat fields. Would you recommend B over A for future plantings? Do you think these results generalize to a larger population?
2.2 Hypothesis testing via randomization
Questions:
Hypothesis tests:
A statistical hypothesis test evaluates the plausibility of $H_0$ in light of the data.
Suppose we are interested in mean wheat yields. We can evaluate $H_0$ by answering the following questions:
To answer the above, we need to compare
$|\bar{y}_B - \bar{y}_A| = 2.4$, the observed difference in the experiment, to values of $|\bar{y}_B - \bar{y}_A|$ that could have been observed if $H_0$ were true.
Hypothetical values of $|\bar{y}_B - \bar{y}_A|$ that could have been observed under $H_0$ are referred to as samples from the null distribution.
Finding a null distribution: Let
$g(Y_A, Y_B) = g(\{Y_{1,A}, \ldots, Y_{6,A}\}, \{Y_{1,B}, \ldots, Y_{6,B}\}) = |\bar{Y}_B - \bar{Y}_A|$.
This is a function of the outcome of the experiment. It is a statistic. Since we will use it to perform a hypothesis test, we will call it a test statistic.
Observed test statistic: $g(26.9, 11.4, \ldots, 24.3) = 2.4 = g_{\text{obs}}$
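The observed value can be reproduced with a few lines of R. The function name `g` follows the notation of the notes; its argument layout (yields plus a label vector) is a choice made here for illustration.

```r
# Wheat yields (field order) and fertilizer labels from the example
y <- c(26.9, 11.4, 26.6, 23.7, 25.3, 28.5, 14.2, 17.9, 16.5, 21.1, 24.3, 19.6)
x <- c("A", "A", "B", "B", "A", "B", "B", "B", "A", "A", "B", "A")

# Test statistic: absolute difference between the two group means
g <- function(y, x) abs(mean(y[x == "B"]) - mean(y[x == "A"]))

g_obs <- g(y, x)
g_obs
```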
Hypothesis testing procedure: Compare $g_{\text{obs}}$ to $g(Y_A, Y_B)$ for values of $Y_A$ and $Y_B$ that could have been observed if $H_0$ were true.
Recall the outcome of the experiment:
A      A      B      B      A      B
26.9   11.4   26.6   23.7   25.3   28.5

B      B      A      A      B      A
14.2   17.9   16.5   21.1   24.3   19.6
IDEA: To consider what types of outcomes we would see in universes where $H_0$ is true, compute $g(Y_A, Y_B)$ under every possible treatment assignment, assuming $H_0$ is true.
Under our randomization scheme, there were
$\binom{12}{6} = \frac{12!}{6!\,6!} = 924$
equally likely ways the treatments could have been assigned. For each one of these, we can calculate the value of the test statistic that would have been observed under $H_0$: $\{g_1, g_2, \ldots, g_{924}\}$
This enumerates all potential pre-randomization outcomes of our test statistic, assuming no treatment effect. Along with the fact that each treatment assignment is equally likely, these values give a null distribution: a probability distribution of possible experimental results if $H_0$ is true.
$\Pr(g(Y_A, Y_B) \le x \mid H_0) = \frac{\#\{g_k \le x\}}{924}$
This distribution is sometimes called the randomization distribution, because it is obtained by the randomization scheme of the experiment.
Is there any contradiction between $H_0$ and our data?
$\Pr(g(Y_A, Y_B) \ge 2.4 \mid H_0) = 0.47$
According to this calculation, a mean difference of 2.4 or more is not unlikely under the null hypothesis. This probability is called a p-value. Generically, a p-value is
“The probability, under the null hypothesis, of obtaining a result as or more extreme than the observed result.”
The basic idea:
small p-value → evidence against $H_0$
large p-value → no evidence against $H_0$
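The full enumeration described above is small enough to carry out directly. The following is a sketch under the stated randomization scheme (6 of the 12 plots receive B, all assignments equally likely); the helper names and the floating-point tolerance `1e-12` are choices made here, not part of the notes. The resulting `p_value` can be compared with the probability reported above.

```r
# Wheat yields (field order) and fertilizer labels from the example
y <- c(26.9, 11.4, 26.6, 23.7, 25.3, 28.5, 14.2, 17.9, 16.5, 21.1, 24.3, 19.6)
x <- c("A", "A", "B", "B", "A", "B", "B", "B", "A", "A", "B", "A")

# Test statistic for a given assignment (is_B marks the plots receiving B)
g <- function(y, is_B) abs(mean(y[is_B]) - mean(y[!is_B]))
g_obs <- g(y, x == "B")

# All choose(12, 6) ways to pick the 6 plots that receive fertilizer B;
# each column of the matrix is one assignment.
assignments <- combn(length(y), 6)

# Under H_0 the yields do not depend on the labels, so y stays fixed
# while the labels change.
g_null <- apply(assignments, 2, function(idx) g(y, seq_along(y) %in% idx))

length(g_null)   # number of possible assignments

# Proportion of assignments at least as extreme as the observed one;
# the small tolerance guards against floating-point ties with g_obs.
p_value <- mean(g_null >= g_obs - 1e-12)
p_value
```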
Figure 2.2: Randomization distribution for the wheat example. [Two panels, each with vertical axis "Density".]
Approximating a randomization distribution
We don't want to have to enumerate all $\binom{n}{n/2}$ possible treatment assignments. Instead, repeat the following $N_{\text{sim}}$ times:
(a) randomly simulate a treatment assignment from the population of possible treatment assignments, under the randomization scheme.
(b) compute the value of the test statistic, given the simulated treatment assignment and under $H_0$.
The empirical distribution of $\{g_1, \ldots, g_{N_{\text{sim}}}\}$ approximates the null distribution:
$\frac{\#(|g_k| \ge 2.4)}{N_{\text{sim}}} \approx \Pr(g(Y_A, Y_B) \ge 2.4 \mid H_0)$
The approximation improves as $N_{\text{sim}}$ increases. Here is some R-code:
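# wheat yields, in field order (row 1 then row 2)
y <- c(26.9, 11.4, 26.6, 23.7, 25.3, 28.5, 14.2, 17.9, 16.5, 21.1, 24.3, 19.6)
# fertilizer assigned to each subplot
x <- c("A", "A", "B", "B", "A", "B", "B", "B", "A", "A", "B", "A")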
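Steps (a) and (b) might then be carried out as in the following sketch. It is not the original code of the notes: the names `g_null`, `x_sim` and `p_hat`, the value of `Nsim`, the seed and the tolerance `1e-12` are all assumptions made for illustration. The data are repeated so that the block runs on its own.

```r
y <- c(26.9, 11.4, 26.6, 23.7, 25.3, 28.5, 14.2, 17.9, 16.5, 21.1, 24.3, 19.6)
x <- c("A", "A", "B", "B", "A", "B", "B", "B", "A", "A", "B", "A")

set.seed(1)    # arbitrary seed, for reproducibility only
Nsim <- 5000   # number of simulated assignments (assumed value)

# Observed absolute difference in means
g_obs <- abs(mean(y[x == "B"]) - mean(y[x == "A"]))

g_null <- numeric(Nsim)
for (k in seq_len(Nsim)) {
  # (a) simulate an assignment: a random permutation of 6 A's and 6 B's
  x_sim <- sample(x)
  # (b) difference in means under H_0, where y does not depend on the labels
  g_null[k] <- mean(y[x_sim == "B"]) - mean(y[x_sim == "A"])
}

# Monte Carlo estimate of Pr(g(Y_A, Y_B) >= 2.4 | H_0)
p_hat <- mean(abs(g_null) >= g_obs - 1e-12)
p_hat
```

Permuting the label vector with `sample(x)` keeps exactly six plots in each group, which matches the randomization scheme of the experiment. A histogram of `g_null` gives a picture of the approximate randomization distribution.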