Two-Sample T-Test vs. Paired T-Test: Analyzing Differences in Means with Dependent Data (Study notes, Data Analysis & Statistical Methods)

Cornell University Data Analysis & Statistical Methods

Prof. R. Strawderman

An in-depth comparison between the two-sample t-test and the paired t-test, focusing on their application to differences in means for dependent data. It covers the assumptions, calculations, and practical examples of both tests, and highlights why the relationship between observations within a subject must be considered and how it affects the statistical analysis.

Typology: Study notes

Pre 2010

Uploaded on 12/09/2010

wk2151 🇺🇸


Two-Sample Inference for Location Parameters II

FCSM Chapter 6
Dependent samples: 6.4
Sample size: 6.6

BTRY 6010 & ILRST 6100, Two Sample Methods


Example: Growth Hormones

The paper “Growth Hormone Increase During Sleep After Daytime Exercise” (1974, J. of Endocrinology, 473-) reports results of an experiment involving 6 healthy male subjects. Blood samples were drawn from each participant during sleep on two different nights, using a venous catheter.

Night #1: no strenuous exercise the day before;
Night #2: strenuous exercise the day before.


Two-sample t-test shows no difference, whether or not we assume equal variances (p = 0.2183, p = 0.2156). Right?? What other assumptions have we made???

Wrong!! Care is required here because the observations within each subject are certainly dependent: the observations from the exercise “groups” (night #1 vs. night #2) being compared are from the same subjects, thus far from independent.

- In fact: notice that there is an increase in hormone level from night #1 to night #2 within each subject.
- Relationship between observations within a subject must be accounted for in the analysis.
- Appropriate analysis in this situation: a paired t-test or matched pairs t-test.


Basic idea:

- Suppose we have $2n$ observations that can be sensibly paired with each other (e.g., pre- vs. post-tests, measurements on both eyes, measurements on twins, etc.).
- Let $y_{ij}$ represent the $j$th observation for the $i$th pair, where $i = 1, \ldots, n$ and $j = 1, 2$.
- Consider the difference in sample means:

$$\bar{y}_2 - \bar{y}_1 = \frac{1}{n}\sum_{i=1}^{n} y_{i2} - \frac{1}{n}\sum_{i=1}^{n} y_{i1}$$

Using a little bit of algebra:

$$\bar{y}_2 - \bar{y}_1 = \frac{1}{n}\sum_{i=1}^{n} \left(y_{i2} - y_{i1}\right) = \frac{1}{n}\sum_{i=1}^{n} d_i = \bar{d}, \qquad \text{where } d_i = y_{i2} - y_{i1}$$

- With paired data: a difference between sample averages can be naturally expressed as the average of the differences between the observations on the same pair.
- Assume that distinct pairs are independent. Then a paired t-test amounts to using a standard one-sample t-test on the differences $d_1, d_2, \ldots, d_n$.
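To make the reduction to a one-sample test concrete, here is a small sketch in plain Python. The helper names `one_sample_t` and `paired_t` are hypothetical, and the statistic would be compared against a t distribution with n − 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(values, mu0=0.0):
    """One-sample t statistic: (mean - mu0) / (s / sqrt(n))."""
    n = len(values)
    return (mean(values) - mu0) / (stdev(values) / sqrt(n))

def paired_t(y1, y2, mu0=0.0):
    """Paired t statistic: one-sample t applied to d_i = y_i2 - y_i1."""
    diffs = [b - a for a, b in zip(y1, y2)]
    return one_sample_t(diffs, mu0)

# Tiny illustrative input: differences are 1, 2, 0.
t_star = paired_t([1.0, 2.0, 3.0], [2.0, 4.0, 3.0])
print(t_star)
```

Because only the differences enter the calculation, any subject-level shift shared by both measurements cancels out.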


- Compare Difference ($\bar{y}_2 - \bar{y}_1$) vs. Mean ($\bar{d}$) on Slides 4 & 9: these are equal (slide 8).
- But: Std Err Dif and Std Err Mean on Slides 4 & 9 are actually quite different: 3.303 vs. 1.346.

Note: the one-sample t-test is performed using $d_1, d_2, \ldots, d_n$; so

$$\text{Std Err Mean} = \frac{s_d}{\sqrt{n}}, \qquad \text{where } s_d^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(d_i - \bar{d}\right)^2$$

- Accounting for pairing here leads to a roughly 2.5-fold reduction in the standard error (3.303/1.346 ≈ 2.45, which is about a 6-fold reduction in the estimated variance) compared to the usual two-sample t-test. The resulting test statistics are very different: 1.32 vs. 3.24.
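The effect of pairing on the standard error can be illustrated with a short sketch. The numbers below are invented for illustration (they will not reproduce 3.303 or 1.346), and the two-sample standard error shown is the unequal-variance form, which is one of the two variants mentioned earlier.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Hypothetical paired data with large subject-to-subject variation.
night1 = [10.2, 8.1, 12.5, 9.9, 11.0, 7.4]
night2 = [13.0, 9.5, 15.1, 10.8, 14.2, 9.0]
n = len(night1)

diffs = [b - a for a, b in zip(night1, night2)]
d_bar = mean(diffs)

# Two-sample (unequal-variance) standard error: ignores the pairing.
se_two_sample = sqrt(variance(night1) / n + variance(night2) / n)

# Paired standard error: s_d / sqrt(n), computed from the differences.
se_paired = stdev(diffs) / sqrt(n)

# Same numerator, different denominators, hence different test statistics.
t_two_sample = d_bar / se_two_sample
t_paired = d_bar / se_paired
print(se_paired < se_two_sample, t_paired > t_two_sample)
```

When the two measurements on a subject are positively correlated, as here, the paired standard error is the smaller of the two.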


Aside: Var(X − Y) for dependent X & Y

- Let $X$ & $Y$ be two random variables. In general, the variance of $X - Y$, or $\mathrm{Var}(X - Y)$, is given by

$$\mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) - 2\,\mathrm{Cov}(X, Y)$$

  where $\mathrm{Cov}(X, Y)$ denotes the covariance of $X$ & $Y$.
- We say $X$ & $Y$ are
  - uncorrelated if $\mathrm{Cov}(X, Y) = 0$
  - correlated if $\mathrm{Cov}(X, Y) \ne 0$
  - positively correlated if $\mathrm{Cov}(X, Y) > 0$
  - negatively correlated if $\mathrm{Cov}(X, Y) < 0$
- Independence of $X$ & $Y$ implies $\mathrm{Cov}(X, Y) = 0$ (but not vice versa).
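The same identity holds for sample variances and the sample covariance (all with the n − 1 divisor), which gives a quick numerical check. This is a sketch on invented data; `sample_cov` is a hypothetical helper written out by hand.

```python
from statistics import mean, variance

def sample_cov(x, y):
    """Sample covariance with the usual n - 1 divisor."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

# Hypothetical dependent measurements on the same six units.
x = [13.0, 9.5, 15.1, 10.8, 14.2, 9.0]
y = [10.2, 8.1, 12.5, 9.9, 11.0, 7.4]

var_of_diff = variance([a - b for a, b in zip(x, y)])
by_formula = variance(x) + variance(y) - 2.0 * sample_cov(x, y)

# A positive covariance makes Var(X - Y) smaller than Var(X) + Var(Y).
print(abs(var_of_diff - by_formula) < 1e-9, sample_cov(x, y) > 0)
```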

Result: Exact Sampling Distribution of Normally Distributed Differences

Let $\bar{D}$ be the sample mean difference of two observations taken on a SRS of $n$ independent pairs. Assume the differences follow a $N(\mu_d, \sigma_d)$ distribution. Then:

$$\frac{\bar{D} - \mu_d}{s_d/\sqrt{n}} \sim t_{n-1}$$

If $n \ge 30$,

$$\frac{\bar{D} - \mu_d}{s_d/\sqrt{n}} \approx N(0, 1)$$

The result above forms the basis for CIs and hypothesis tests in a paired data setting.
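As one illustration of how this result is used, the usual two-sided confidence interval for the mean difference takes the form $\bar{d} \pm t_{n-1,\alpha/2}\, s_d/\sqrt{n}$. The sketch below assumes the critical value is looked up in a t table and passed in by the caller; the data are invented and `paired_ci` is a hypothetical helper.

```python
from math import sqrt
from statistics import mean, stdev

def paired_ci(diffs, t_crit):
    """CI for the mean difference: d_bar +/- t_crit * s_d / sqrt(n).

    t_crit is the t critical value with n - 1 degrees of freedom,
    supplied by the caller (e.g. from a t table).
    """
    n = len(diffs)
    d_bar = mean(diffs)
    half_width = t_crit * stdev(diffs) / sqrt(n)
    return d_bar - half_width, d_bar + half_width

# Hypothetical differences for n = 6 pairs; 2.571 is the tabulated
# two-sided 95% critical value for 5 degrees of freedom.
diffs = [2.8, 1.4, 2.6, 0.9, 3.2, 1.6]
low, high = paired_ci(diffs, t_crit=2.571)
print(low < mean(diffs) < high)
```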

Testing: paired data, $\sigma_d$ unknown

Test Statistic:

$$t^* = \frac{\bar{d} - \Delta_d}{s_d/\sqrt{n}}$$

Here $\Delta_d$ denotes the hypothesized value of $\mu_d$. As usual, we have three possible sets of hypotheses:

(i) $H_0: \mu_d \le \Delta_d$ vs. $H_a: \mu_d > \Delta_d$. RR is $t^* > t_{n-1,\alpha}$ & $p = P(t_{n-1} > t^*)$

(ii) $H_0: \mu_d \ge \Delta_d$ vs. $H_a: \mu_d < \Delta_d$. RR is $t^* < -t_{n-1,\alpha}$ & $p = P(t_{n-1} < t^*)$

(iii) $H_0: \mu_d = \Delta_d$ vs. $H_a: \mu_d \ne \Delta_d$. RR is $|t^*| > t_{n-1,\alpha/2}$ & $p = 2P(t_{n-1} > |t^*|)$

Usual comments apply:

- If $n \ge 30$, we can use normal critical points in place of $t$.
- If the sample size is very small and normality of differences is suspect, use the t-test with skepticism or use nonparametric methods (Wilcoxon signed-rank).
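For the large-sample case (n ≥ 30), the three p-values can be sketched with the standard normal distribution from Python's standard library. The function name and the `alternative` labels are hypothetical; for small n the t distribution with n − 1 degrees of freedom should be used instead, which this sketch does not implement.

```python
from statistics import NormalDist

def normal_p_value(t_star, alternative):
    """Approximate p-value for the paired test when n >= 30.

    alternative: "greater" for case (i), "less" for case (ii),
    "two-sided" for case (iii).
    """
    z = NormalDist()  # standard normal: mean 0, sd 1
    if alternative == "greater":
        return 1.0 - z.cdf(t_star)
    if alternative == "less":
        return z.cdf(t_star)
    if alternative == "two-sided":
        return 2.0 * (1.0 - z.cdf(abs(t_star)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

# A statistic of 1.96 gives a two-sided p-value close to 0.05.
print(round(normal_p_value(1.96, "two-sided"), 3))
```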

Example: Growth Hormones

Obligatory check: no “obvious” deviations from normality.

Example: Fertilized Tomatoes

Does a new fertilizer improve tomato yield? Concerns about impact of differences in soil type, light, moisture, … led to the following experiment:

- 30 plots of tomato plants, 10 plants per plot.
- New & existing fertilizer applied within each plot, 5 plants each, order randomized. Average yield for each fertilizer is computed within each plot.
- Should we use a paired or unpaired t-test to evaluate difference in yields (in pounds)?


Comments:

- Pairing (matching) can be useful in observational studies as a way to control for confounding: the impact of measured (and unmeasured) variables that may be associated with both “response” and “group” variable. It serves to reduce uncontrolled sources of variability.
- Pairing is an example of blocking, a term originating from experimental design. Blocking serves to reduce impact of uncontrolled sources of variability on a comparison of treatments (e.g., new vs. existing fertilizer; post- vs. pre-exercise) by first creating blocks (i.e., groups of similar, or relatively homogeneous, units, such as tomato plants, plots, or subjects) and then assessing treatment effects by using “within-block” differences.
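A small simulation can illustrate why within-block differences help. This is only a sketch: the plot-to-plot spread, within-plot noise, and treatment effect below are invented numbers, loosely patterned on a 30-plot layout, and are not taken from any real experiment.

```python
import random
from math import sqrt
from statistics import stdev, variance

random.seed(0)  # fixed seed so the sketch is reproducible

N_PLOTS = 30
PLOT_SD = 5.0    # hypothetical plot-to-plot variability (soil, light, moisture)
NOISE_SD = 1.0   # hypothetical within-plot variability
EFFECT = 1.0     # hypothetical true gain from the new fertilizer

existing, new = [], []
for _ in range(N_PLOTS):
    plot_level = random.gauss(20.0, PLOT_SD)  # shared by both treatments in a plot
    existing.append(plot_level + random.gauss(0.0, NOISE_SD))
    new.append(plot_level + EFFECT + random.gauss(0.0, NOISE_SD))

# Within-block differences cancel the shared plot level.
diffs = [b - a for a, b in zip(existing, new)]

se_unpaired = sqrt(variance(existing) / N_PLOTS + variance(new) / N_PLOTS)
se_paired = stdev(diffs) / sqrt(N_PLOTS)
print(se_paired < se_unpaired)
```

The plot-level term appears in both measurements from a plot, so it drops out of each difference and no longer inflates the standard error.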


- Settings involving pairs of measurements represent a special case of the more general problem of repeated measurements (two or more measurements per unit/block). Such data can arise in multiple ways, e.g., multiple treatments per block, longitudinal data on each subject, and so on.
- As in the paired setting, one expects the measurements on one unit/block to be more correlated with each other than with measurements on different units/blocks.
- Methods of analysis must deal with the various levels of correlation that may exist; otherwise, one can easily obtain incorrect assessments of sampling variability, leading to impaired statements of statistical significance and/or confidence levels (among other problems).
