Research Design Principles - Lecture Notes | STAT 502, Study notes of Statistics

University of Washington (UW) - Seattle Statistics

Prof. Peter Hoff

Material Type: Notes; Professor: Hoff; Class: DESIGN ANLYS EXPMTS; Subject: Statistics; University: University of Washington - Seattle; Term: Autumn 2005;

Statistics 502 Lecture Notes

Peter D. Hoff

December 6, 2006


Contents

1 Research Design Principles
- 1.1 Induction
- 1.2 Model of a process or system
- 1.3 Experiments and observational studies
- 1.4 Steps in designing an experiment
2 Comparing Two Treatments
- 2.1 Summaries of sample populations
- 2.2 Hypothesis testing via randomization
- 2.3 Essential nature of a hypothesis test
- 2.4 Basic decision theory, or “What use is a p-value?”
- 2.5 Relating samples to (super)populations
- 2.6 Normal distribution
- 2.7 Introduction to the t-test
- 2.8 Two sample tests
- 2.9 Power and Sample Size Determination
  - 2.9.1 The non-central t-distribution
  - 2.9.2 Computing the Power of a test
- 2.10 Checking Assumptions of a two-sample t-test
  - 2.10.1 Two-sample t-test with unequal variances
  - 2.10.2 Which two-sample t-test to use? tdiff(yA, yB) vs. t(yA, yB)
3 Comparing Several Treatments
- 3.1 Introduction to ANOVA
  - 3.1.1 A model for treatment variation
  - 3.1.2 Model Fitting
  - 3.1.3 Testing hypothesis with MSE and MST
  - 3.1.4 Partitioning sums of squares
  - 3.1.5 The ANOVA table
  - 3.1.6 More sums of squares geometry
  - 3.1.7 Unbalanced Designs
  - 3.1.8 Normal sampling theory for ANOVA
  - 3.1.9 Sampling distribution of the F -statistic
  - 3.1.10 Comparing group means
  - 3.1.11 Power calculations for the F-test
- 3.2 Treatment Comparisons
  - 3.2.1 Contrasts
  - 3.2.2 Orthogonal Contrasts
  - 3.2.3 Multiple Comparisons
- 3.3 Model Diagnostics
  - 3.3.1 Detecting violations with residuals
  - 3.3.2 Variance stabilizing transformations
4 Multifactor Designs
- 4.1 Factorial Designs
  - 4.1.1 Data analysis:
  - 4.1.2 Additive effects model
  - 4.1.3 Evaluating additivity:
  - 4.1.4 Inference for additive treatment effects
- 4.2 Randomized complete block designs
- 4.3 Unbalanced designs
  - 4.3.1 Non-orthogonal sums of squares:
- 4.4 Analysis of covariance
5 Nested Designs
- 5.1 Nested Designs
  - 5.1.1 Mixed-effects approach
  - 5.1.2 Repeated measures analysis:
List of Figures

2.1 Wheat yield distributions
2.2 Randomization distribution for the wheat example
2.3 The superpopulation model
2.4 The χ^2 distribution
2.5 The t-distribution
2.6 A t_8 null distribution and α = 0.05 rejection region
2.7 The t-based null distribution for the wheat example
2.8 Randomization distribution for the t-statistic
2.9 The non-central t-distribution
2.10 Critical regions and the non-central t-distribution
2.11 Null and alternative distributions for another wheat example, and power versus sample size
2.12 Normal scores plots.
3.1 Bacteria data
3.2 Randomization distribution of the F -statistic
3.3 Coagulation data
3.4 F-distributions
3.5 Normal-theory and randomization distributions of the F -statistic
3.6 Power
3.7 Power
3.8 Yield-density data
3.9 Normal scores plots of normal samples, with n ∈ { 20 , 50 , 100 }
3.10 Crab data
3.11 Crab residuals
3.12 Fitted values versus residuals
3.13 Data and log data
3.14 Diagnostics after the log transformation
3.15 Mean variance relationship of the transformed data
4.1 Marginal Plots.
4.2 Conditional Plots.
4.3 Cell plots.
4.4 Mean-variance relationship.
4.5 Mean-variance relationship for transformed data.
4.6 Plots of transformed poison data
4.7 Comparison between types I and II, without respect to delivery
4.8 Comparison between types I and II, with delivery in color.
4.9 Marginal plots of the data.
4.10 Three datasets exhibiting non-additive effects.
4.11 Experimental material in need of blocking.
4.12 Results of the experiment
4.13 Marginal plots and residuals
4.14 Marginal plots for pain data
4.15 Interaction plots for pain data
4.16 Oxygen uptake data
4.17 ANOVA and ANCOVA fits to the oxygen uptake data
5.1 Potato data.
5.2 Diagnostic plots for potato ANOVA.
5.3 Potato data
5.4 Potato data

Chapter 1 Research Design Principles

Input variables consist of

controllable factors: measured and determined by scientist

uncontrollable factors: measured but not determined by scientist

noise factors: unmeasured, uncontrolled factors (experimental variability or “error”)

For any interesting process, there are inputs such that:

variability in input → variability in output

If variability in an input factor x leads to variability in output y, we say x is a source of variation. In this class we will discuss methods of designing and analyzing experiments to determine important sources of variation.

1.3 Experiments and observational studies

Information on how inputs affect output can be gained from:

Observational studies: Input and output variables are observed from a pre-existing population. It may be hard to say what is input and what is output.
Controlled experiments: (Some) input variables are controlled and manipulated by the experimenter to determine their effect on the output.

Example (Women’s Health Initiative, WHI):

Population: Healthy, post-menopausal women in the U.S.
Input variables:
1. estrogen treatment, yes/no
2. demographic variables (age, race, diet, family history,... )
3. unmeasured variables (?)
Output variables:
1. coronary heart disease (e.g. MI)
2. invasive breast cancer
3. ...

Scientific question: How does estrogen treatment affect health outcomes?

Observational Study:

Observational population: 93,676 women enlisted starting in 1991, tracked over eight years on average. Data consist of x = input variables and y = health outcomes, gathered concurrently on existing populations.
Results: Good health/low rates of CHD generally associated with estrogen treatment.
Conclusion: Estrogen treatment is positively associated with good health outcomes, such as a low prevalence of CHD.

Experimental Study (WHI randomized controlled trial):

Experimental population:

373,092 women determined to be eligible
→ 18,845 provided consent to be in the experiment
→ 16,608 included in the experiment

16,608 women randomized to either

x = 1 (estrogen treatment)
x = 0 (control, i.e. no estrogen treatment)

using a randomized block design: women were treated at different clinics, and were of different ages.

                    age group
clinic    1 (50-59)   2 (60-69)   3 (70-79)
1         n_{1,1}     n_{1,2}     n_{1,3}
2         n_{2,1}     n_{2,2}     n_{2,3}
...

n_{i,j} = # of women in study, in clinic i and in age group j
        = # of women in block i, j
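
A randomized block design of this kind can be sketched in R. The block sizes below are made up for illustration (the actual WHI counts are not given here), and the assumption that exactly half of each block receives treatment is also ours; the names `blocks`, `assign_block` and `design` are hypothetical.

```r
# Hypothetical block sizes n[i, j]; these are NOT the real WHI counts.
set.seed(1)
blocks <- expand.grid(clinic = 1:2, age_group = 1:3)
blocks$n <- c(4, 6, 8, 4, 6, 2)   # made-up, even block sizes

# Within a block of size n, give x = 1 to half the women and x = 0 to the
# other half, in random order (assumed 1:1 allocation).
assign_block <- function(n) sample(rep(c(0, 1), each = n / 2))

# Randomize separately inside every clinic-by-age block.
design <- do.call(rbind, lapply(seq_len(nrow(blocks)), function(b) {
  data.frame(clinic    = blocks$clinic[b],
             age_group = blocks$age_group[b],
             x         = assign_block(blocks$n[b]))
}))

# Treatment counts per block: each block is balanced by construction.
table(design$clinic, design$age_group, design$x)
```

Because the shuffling happens inside each block, clinic and age group cannot be confounded with treatment.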

[Figure: causal diagrams for an observational study (X is a cause of both X1 and Y, which induces correlation between X1 and Y) and a randomized experiment (X1 is determined by randomization).]

Observational studies can suggest good experiments to run, but can’t definitively show causation.

Randomization can eliminate correlation between $x_1$ and $y$ due to a different cause $x_2$, a.k.a. a confounder.

“No causation without randomization”
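
A small simulation can illustrate the point. This is a sketch under assumed distributions (standard normal confounder and noise, 10,000 units), not an analysis of any real study; all variable names are hypothetical.

```r
set.seed(502)
n  <- 10000
x2 <- rnorm(n)        # confounder
y  <- x2 + rnorm(n)   # y depends on x2 only: x1 has NO causal effect on y

# Observational study: units with large x2 tend to end up with x1 = 1.
x1_obs <- as.numeric(x2 + rnorm(n) > 0)

# Randomized experiment: x1 is assigned by shuffling, independently of x2.
x1_rand <- sample(rep(c(0, 1), each = n / 2))

cor_obs  <- cor(x1_obs, y)    # clearly positive, despite no causal effect
cor_rand <- cor(x1_rand, y)   # close to zero
c(observational = cor_obs, randomized = cor_rand)
```

In the observational version the correlation between x1 and y is produced entirely by x2; shuffling the assignment breaks the link between x2 and x1.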


1.4 Steps in designing an experiment

1. Identify research hypotheses to be tested.
2. Choose a set of experimental units, which are the units to which treatments will be randomized.
3. Choose a response/output variable.
4. Determine potential sources of variation in response:
   (a) factors of interest
   (b) nuisance factors
5. Decide which variables to measure and control:
   (a) treatment variables
   (b) potential large sources of variation/blocking variables
6. Decide on the experimental procedure and how treatments are to be randomly assigned.

These factors are often constrained by budgets, ethics, time,...

Three principles in Experimental Design

Replication: Repetition of an experiment. Replicates are runs of an experiment or sets of experimental units that have the same values of the control variables. More replication → more precise inference.

Let
$y_{A,i}$ = response of the ith unit assigned to treatment A
$y_{B,i}$ = response of the ith unit assigned to treatment B
$i = 1, \ldots, n$.

Then $\bar{y}_A \neq \bar{y}_B$ provides evidence that treatment affects response, i.e. treatment is a source of variation (larger n → more evidence).
Randomization: Random assignment of treatments to experimental units. This removes the potential for systematic bias, i.e. any pre-experimental source of bias, and makes confounding unlikely.
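
The claim that more replication gives more precise inference can be checked with a short simulation. The sketch assumes normally distributed responses with the same mean (20) and standard deviation (5) under both treatments; these numbers and the function name `sd_of_difference` are illustrative only.

```r
set.seed(502)

# Spread of the difference in sample means when each treatment gets n units
# (assumed setup: both treatments are N(20, 5^2), i.e. no treatment effect).
sd_of_difference <- function(n, nsim = 2000) {
  diffs <- replicate(nsim, mean(rnorm(n, mean = 20, sd = 5)) -
                           mean(rnorm(n, mean = 20, sd = 5)))
  sd(diffs)
}

sd_n6  <- sd_of_difference(6)    # roughly 5 * sqrt(2 / 6)
sd_n60 <- sd_of_difference(60)   # roughly 5 * sqrt(2 / 60)
c(n6 = sd_n6, n60 = sd_n60)
```

With ten times as many replicates, chance differences between the two sample means shrink by a factor of about sqrt(10), so a real treatment effect of a given size stands out more clearly.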

Chapter 2 Comparing Two Treatments

Example: Wheat yield

Factor of interest: Fertilizer type, A or B. One factor of interest, having two levels.

Question: Is one fertilizer better than another, in terms of yield?

Experimental material: One plot of land to be divided into 2 rows of 6 subplots.

Design question: How to assign treatments/factor levels to the plots? Want to avoid confounding a treatment effect with another potential source of variation.
Potential sources of variation: Fertilizer, soil, sun, water.
Implementation: If we assign treatments randomly, we can avoid any pre-experimental bias in results: 12 playing cards, 6 red and 6 black, were shuffled and dealt.
1st card red → 1st plot gets A
2nd card red → 2nd plot gets A
3rd card black → 3rd plot gets B
...
This is our first design, a completely randomized design.
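
The card shuffle can be mimicked in R. This is a minimal sketch; the seed and the names `plots`, `treatment` and `design` are arbitrary choices, and the result will generally differ from the assignment actually dealt in the experiment.

```r
set.seed(1)
plots <- 1:12

# Shuffle 6 "A" labels and 6 "B" labels, like shuffling 6 red and 6 black cards.
treatment <- sample(rep(c("A", "B"), each = 6))

design <- data.frame(plot = plots, treatment = treatment)
design
```

Every arrangement of six A's and six B's over the twelve plots is equally likely under this scheme, which is what defines a completely randomized design.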


Results:

A     A     B     B     A     B
26.9  11.4  26.6  23.7  25.3  28.5

B     B     A     A     B     A
14.2  17.9  16.5  21.1  24.3  19.6

How much evidence is there that fertilizer type is a source of yield variation? Evidence about differences between two populations is generally measured by comparing summary statistics across the two sample populations. (Recall, a statistic is any computable function of known, observed data).

2.1 Summaries of sample populations

Distribution:

Empirical distribution: $\hat{\Pr}(a, b] = \#(a < y_i \le b)/n$
Empirical CDF (cumulative distribution function):

$\hat{F}(y) = \#(y_i \le y)/n = \hat{\Pr}(-\infty, y]$

Histograms
Kernel density estimates

Note that these summaries more or less retain all the information in the data except the unit labels.
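
These distributional summaries can be computed for the twelve wheat yields as follows. The helper `emp_prob` and the interval (15, 25] are illustrative choices; `ecdf`, `hist` and `density` are base R functions.

```r
y <- c(26.9, 11.4, 26.6, 23.7, 25.3, 28.5, 14.2, 17.9, 16.5, 21.1, 24.3, 19.6)

# Empirical probability of (a, b]: the fraction of y_i with a < y_i <= b.
emp_prob <- function(y, a, b) mean(y > a & y <= b)

Fhat <- ecdf(y)   # empirical CDF: Fhat(t) = #(y_i <= t) / n

emp_prob(y, 15, 25)   # 6 of the 12 yields fall in (15, 25], i.e. 0.5
Fhat(25) - Fhat(15)   # the same quantity via the empirical CDF

h <- hist(y, plot = FALSE)   # histogram counts (use plot = TRUE to draw it)
d <- density(y)              # kernel density estimate, default Gaussian kernel
```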

Location:

sample mean or average: $\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$

sample median: $\hat{q}(1/2)$ is a/the value $y_{(1/2)}$ such that

$\#(y_i \le y_{(1/2)})/n \ge 1/2$ and $\#(y_i \ge y_{(1/2)})/n \ge 1/2$

To find the median, sort the data in increasing order, and call these values $y_{(1)}, \ldots, y_{(n)}$. If there are no ties, then if n is odd, $y_{((n+1)/2)}$ is the median; if n is even, all numbers between $y_{(n/2)}$ and $y_{(n/2+1)}$ are medians.
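
The sorting recipe can be written out directly. The sketch below uses the six yields from the plots that received fertilizer A; when n is even it reports the midpoint of the interval of medians, which is a common convention rather than something the definition requires.

```r
yA <- c(26.9, 11.4, 25.3, 16.5, 21.1, 19.6)   # yields of the six A plots

ys <- sort(yA)     # order statistics y_(1), ..., y_(n)
n  <- length(ys)

if (n %% 2 == 1) {
  med <- ys[(n + 1) / 2]                 # n odd: the unique median
} else {
  med <- mean(ys[c(n / 2, n / 2 + 1)])   # n even: midpoint of the medians
}

med          # 20.35, the midpoint of 19.6 and 21.1
median(yA)   # R's built-in median uses the same midpoint convention
```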

> mean(yA)
[1] 20.13333
> mean(yB)
[1] 22.53333

> median(yA)
[1] 20.35
> median(yB)
[1] 24

> sd(yA)
[1] 5.712676
> sd(yB)
[1] 5.432004

> quantile(yA,prob=c(.25,.75))
   25%    75%
17.275 24.250
> quantile(yB,prob=c(.25,.75))
   25%    75%
19.350 26.025

So there is a difference in yield for these wheat fields. Would you recommend B over A for future plantings? Do you think these results generalize to a larger population?

2.2 Hypothesis testing via randomization

Questions:

Could the observed differences be due to fertilizer type?
Could the observed differences be due to plot-to-plot variation?

Hypothesis tests:

$H_0$ (null hypothesis): Fertilizer type does not affect yield.

$H_1$ (alternative hypothesis): Fertilizer type does affect yield.

A statistical hypothesis test evaluates the plausibility of $H_0$ in light of the data.

Suppose we are interested in mean wheat yields. We can evaluate $H_0$ by answering the following questions:

Is a mean difference of 2.4 plausible/probable if $H_0$ is true?
Is a mean difference of 2.4 large compared to experimental noise?

To answer the above, we need to compare

$|\bar{y}_B - \bar{y}_A| = 2.4$, the observed difference in the experiment,

to

values of $|\bar{y}_B - \bar{y}_A|$ that could have been observed if $H_0$ were true.

Hypothetical values of $|\bar{y}_B - \bar{y}_A|$ that could have been observed under $H_0$ are referred to as samples from the null distribution.

Finding a null distribution: Let

$g(Y_A, Y_B) = g(\{Y_{1,A}, \ldots, Y_{6,A}\}, \{Y_{1,B}, \ldots, Y_{6,B}\}) = |\bar{Y}_B - \bar{Y}_A|$.

This is a function of the outcome of the experiment. It is a statistic. Since we will use it to perform a hypothesis test, we will call it a test statistic.

Observed test statistic: $g(26.9, 11.4, \ldots, 24.3) = 2.4 = g_{obs}$

Hypothesis testing procedure: Compare $g_{obs}$ to $g(Y_A, Y_B)$ for values of $Y_A$ and $Y_B$ that could have been observed, if $H_0$ were true.

Recall the outcome of the experiment:

Cards were shuffled and dealt R, R, B, B,... and fertilizer types planted in subplots:

A A B B A B

B B A A B A


IDEA: To consider what types of outcomes we would see in universes where $H_0$ is true, compute $g(Y_A, Y_B)$ under every possible treatment assignment, assuming $H_0$ is true.

Under our randomization scheme, there were

$\binom{12}{6} = \frac{12!}{6!\,6!} = 924$

equally likely ways the treatments could have been assigned. For each one of these, we can calculate the value of the test statistic that would have been observed under $H_0$: $\{g_1, g_2, \ldots, g_{924}\}$

This enumerates all potential pre-randomization outcomes of our test statistic, assuming no treatment effect. Along with the fact that each treatment assignment is equally likely, these values give a null distribution, a probability distribution of possible experimental results, if $H_0$ is true.

$\Pr(g(Y_A, Y_B) \le x \mid H_0) = \frac{\#\{g_k \le x\}}{924}$

This distribution is sometimes called the randomization distribution, because it is obtained by the randomization scheme of the experiment. Is there any contradiction between $H_0$ and our data?

$\Pr(g(Y_A, Y_B) \ge 2.4 \mid H_0) = 0.47$

According to this calculation, observing a mean difference of 2.4 or more is not unlikely under the null hypothesis. This probability calculation is called a p-value. Generically, a p-value is

“The probability, under the null hypothesis, of obtaining a result as or more extreme than the observed result.”

The basic idea:

small p-value → evidence against $H_0$
large p-value → no evidence against $H_0$
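
The full enumeration is small enough to carry out directly. The following is a sketch of one way to do it with base R's `combn`; the names `assignments`, `g_null` and `p_value` and the floating-point tolerance are our own choices, and the result should be compared with the 0.47 reported above.

```r
y <- c(26.9, 11.4, 26.6, 23.7, 25.3, 28.5, 14.2, 17.9, 16.5, 21.1, 24.3, 19.6)
x <- c("A", "A", "B", "B", "A", "B", "B", "B", "A", "A", "B", "A")

g_obs <- abs(mean(y[x == "B"]) - mean(y[x == "A"]))   # observed statistic, 2.4

# Each column lists the 6 plots that receive A in one possible assignment.
assignments <- combn(12, 6)
ncol(assignments)   # choose(12, 6) = 924

# Under H_0 the yields do not depend on the labels, so only labels change.
g_null <- apply(assignments, 2, function(a) abs(mean(y[-a]) - mean(y[a])))

# Randomization p-value; the tolerance guards against floating-point ties.
p_value <- mean(g_null >= g_obs - 1e-8)
p_value
```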

Figure 2.2: Randomization distribution for the wheat example (one density panel for $\bar{Y}_B - \bar{Y}_A$, one for $|\bar{Y}_B - \bar{Y}_A|$)

Approximating a randomization distribution

We don’t want to have to enumerate all $\binom{n}{n/2}$ possible treatment assignments. Instead, repeat the following $N_{sim}$ times:

(a) randomly simulate a treatment assignment from the population of possible treatment assignments, under the randomization scheme.

(b) compute the value of the test statistic, given the simulated treatment assignment and under $H_0$.

The empirical distribution of $\{g_1, \ldots, g_{N_{sim}}\}$ approximates the null distribution:

$\frac{\#(|g_k| \ge 2.4)}{N_{sim}} \approx \Pr(g(Y_A, Y_B) \ge 2.4 \mid H_0)$

The approximation improves as $N_{sim}$ is increased. Here is some R-code:

y <- c(26.9, 11.4, 26.6, 23.7, 25.3, 28.5, 14.2, 17.9, 16.5, 21.1, 24.3, 19.6)
x <- c("A", "A", "B", "B", "A", "B", "B", "B", "A", "A", "B", "A")
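
One way steps (a) and (b) could be carried out with these data is sketched below. `Nsim = 5000`, the seed, and the names `g_obs`, `g_sim` and `p_approx` are illustrative choices, not taken from the notes.

```r
y <- c(26.9, 11.4, 26.6, 23.7, 25.3, 28.5, 14.2, 17.9, 16.5, 21.1, 24.3, 19.6)
x <- c("A", "A", "B", "B", "A", "B", "B", "B", "A", "A", "B", "A")

set.seed(1)
g_obs <- abs(mean(y[x == "B"]) - mean(y[x == "A"]))   # observed statistic, 2.4

Nsim  <- 5000
g_sim <- numeric(Nsim)
for (k in seq_len(Nsim)) {
  # (a) simulate a treatment assignment: shuffle the six A's and six B's
  x_sim <- sample(x)
  # (b) test statistic for that assignment; under H_0 the yields stay fixed
  g_sim[k] <- abs(mean(y[x_sim == "B"]) - mean(y[x_sim == "A"]))
}

# Monte Carlo approximation to Pr(g(Y_A, Y_B) >= 2.4 | H_0)
p_approx <- mean(g_sim >= g_obs - 1e-8)
p_approx
```

Shuffling the label vector `x` draws uniformly from the 924 possible assignments, so the simulated statistics are draws from the randomization distribution itself.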
