The t-distribution
Patrick Breheny
October 13
Biostatistical Methods I (BIOS 5710)

Introduction

So far we’ve (thoroughly!) discussed how to carry out hypothesis tests and construct confidence intervals for categorical outcomes: success versus failure, life versus death. This week we’ll turn our attention to continuous outcomes like blood pressure, cholesterol, etc. We’ve seen how continuous data must be summarized and plotted differently, and how continuous probability distributions work very differently from discrete ones. It should come as no surprise, then, that there are also big differences in how these data must be analyzed.

Using the central limit theorem

We’ve already used the central limit theorem to construct confidence intervals and perform hypothesis tests for categorical data. The same logic can be applied to continuous data as well, with one wrinkle. For categorical data, the parameter we were interested in ($p$) also determined the standard deviation: $\sqrt{p(1 - p)}$. For continuous data, the mean tells us nothing about the standard deviation.

Estimating the standard error

In order to perform any inference using the CLT, we need a standard error. We know that $SE = SD/\sqrt{n}$, so it seems reasonable to estimate the standard error using the sample standard deviation as a stand-in for the population standard deviation. This turns out to work decently well for large $n$, but as we will see, has problems when $n$ is small.

FVC example (cont’d)

In the study, the mean difference in reduction in FVC (placebo − drug) was 137, with standard deviation 223. Performing the z-test of $H_0: \mu = 0$:

#1 $SE = 223/\sqrt{14} \approx 60$

#2 $z = \dfrac{137 - 0}{60} = 2.28$

#3 The area outside $\pm 2.28$ is $2\Phi(-2.28) = 2(0.011) = 0.022$

This is fairly substantial evidence that the drug helps prevent deterioration in lung function.
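The three steps above can be reproduced in a few lines. This is a minimal sketch using only the Python standard library; the function name `z_test` and its parameters are illustrative choices, not part of the original slides, and the sample size of 14 is taken from the $\sqrt{14}$ in step #1.

```python
import math
from statistics import NormalDist


def z_test(sample_mean, sample_sd, n, mu0=0.0):
    """Two-sided z-test that plugs the sample SD into SE = SD / sqrt(n)."""
    se = sample_sd / math.sqrt(n)
    z = (sample_mean - mu0) / se
    # Two-sided p-value: area outside +/- |z| under the standard normal curve.
    p_value = 2 * NormalDist().cdf(-abs(z))
    return se, z, p_value


# Summary statistics quoted in the FVC example.
se, z, p_value = z_test(sample_mean=137, sample_sd=223, n=14)
print(f"SE = {se:.1f}, z = {z:.2f}, p = {p_value:.3f}")
```

Because the code does not round the standard error to 60, its z statistic is slightly larger than the hand calculation (about 2.30 rather than 2.28); the p-value stays close to 0.02 either way.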

Flaws with the z-test

However, as I mentioned before, these procedures are flawed when $n$ is small. This is a completely separate flaw from the issue of “how accurate is the normal approximation?” in using the central limit theorem. Indeed, this is a problem even when the sampling distribution is perfectly normal. This flaw can be witnessed by repeatedly drawing random samples from the normal distribution, then carrying out this test and recording the type I error rate.
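The simulation described above might look like the following sketch. The sample sizes (5 and 50), the number of replications, the 1.96 cutoff for a nominal 5% test, and the seed are all assumptions chosen for illustration; the slides do not specify them.

```python
import math
import random


def z_statistic(sample, mu0=0.0):
    """z statistic that uses the sample SD as a stand-in for the population SD."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return (mean - mu0) / math.sqrt(var / n)


def type1_error_rate(n, reps=10000, crit=1.96, seed=1):
    """Fraction of N(0, 1) samples for which |z| > crit even though H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(z_statistic(sample)) > crit:
            rejections += 1
    return rejections / reps


rate_small = type1_error_rate(n=5)
rate_large = type1_error_rate(n=50)
print(f"n = 5:  type I error rate = {rate_small:.3f}")
print(f"n = 50: type I error rate = {rate_large:.3f}")
```

With data that are exactly normal and a true null hypothesis, the rejection rate at n = 5 should come out well above the nominal 5%, while at n = 50 it should sit much closer to it.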

Why isn’t the z-test working?

The flaw with the z-test is that it is ignoring one of the sources of the variability in the test statistic. We’re acting as if we know the standard error, but we’re really just estimating it from the data. In doing so, we underestimate the amount of uncertainty we have about the population based on the data.

Distribution of the sample variance

Before we get into the business of fixing the z-test, we need to discuss a more basic issue: what does the sampling distribution of the variance look like? We have this beautiful central limit theorem describing what the sampling distribution of the mean looks like for any underlying distribution. Unfortunately, there is no corresponding theorem for the sample variance.

The χ^2 distribution

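An important distribution highly related to the normal distribution is the $\chi^2$-distribution. Suppose $Z \sim N(0, 1)$; then $Z^2$ is said to follow a $\chi^2_1$ distribution, with pdf:

$$f(x) = \frac{1}{\sqrt{2\pi}}\, x^{-1/2} e^{-x/2}$$

[Figure: density of the $\chi^2_1$ distribution; horizontal axis x from 0 to 4, vertical axis Density]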
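One way to sanity-check this density is to note that $P(Z^2 \le x) = P(-\sqrt{x} \le Z \le \sqrt{x}) = 2\Phi(\sqrt{x}) - 1$, so differentiating that expression should give back the pdf. The sketch below does the differentiation numerically; the helper names are hypothetical and only the Python standard library is assumed.

```python
import math
from statistics import NormalDist


def chi2_1_pdf(x):
    """Density of the chi-squared distribution with 1 degree of freedom."""
    return x ** -0.5 * math.exp(-x / 2) / math.sqrt(2 * math.pi)


def chi2_1_cdf(x):
    """P(Z^2 <= x) for Z ~ N(0, 1), written in terms of the normal CDF."""
    return 2 * NormalDist().cdf(math.sqrt(x)) - 1


def numeric_derivative(f, x, h=1e-5):
    """Central finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


for x in (0.5, 1.0, 2.0, 4.0):
    slope = numeric_derivative(chi2_1_cdf, x)
    print(f"x = {x}: pdf = {chi2_1_pdf(x):.6f}, d/dx cdf = {slope:.6f}")
```

The two columns should agree to several decimal places. The same CDF also shows that 1.96 squared (about 3.84) cuts off roughly the upper 5% of the $\chi^2_1$ distribution.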

The χ^2 distribution: Degrees of freedom

An important generalization is to consider sums of squared observations from the normal distribution. Suppose $Z_1, Z_2, \ldots, Z_p \sim N(0, 1)$ and are mutually independent; then $\sum_{i=1}^{p} Z_i^2$ is said to follow a chi-squared distribution with $p$ degrees of freedom, denoted $\chi^2_p$:

$$f(x) = \frac{1}{\Gamma(p/2)\, 2^{p/2}}\, x^{p/2 - 1} e^{-x/2}$$

[Figure: density of the $\chi^2$ distribution with 10 degrees of freedom; horizontal axis x from 0 to 30, vertical axis Density]
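The definition can be checked by simulation: sum $p$ squared standard normal draws many times and compare the empirical proportion below some cutoff with the area under the density. In this sketch the choice of $p = 10$ matches the figure, while the cutoff of 10, the midpoint-rule integrator, the number of draws, and the seed are assumptions made for illustration.

```python
import math
import random


def chi2_pdf(x, p):
    """Density of the chi-squared distribution with p degrees of freedom."""
    return x ** (p / 2 - 1) * math.exp(-x / 2) / (math.gamma(p / 2) * 2 ** (p / 2))


def integrate(f, a, b, steps=10000):
    """Midpoint-rule approximation to the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def sum_of_squared_normals(p, rng):
    """One draw of Z_1^2 + ... + Z_p^2 with independent Z_i ~ N(0, 1)."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(p))


rng = random.Random(2)
p = 10
draws = [sum_of_squared_normals(p, rng) for _ in range(20000)]
empirical = sum(d <= 10 for d in draws) / len(draws)
theoretical = integrate(lambda x: chi2_pdf(x, p), 0.0, 10.0)
print(f"P(sum <= 10): simulated {empirical:.3f}, from the density {theoretical:.3f}")
```

The two numbers should agree up to simulation noise. Setting `p = 1` in `chi2_pdf` recovers the $\chi^2_1$ density, since $\Gamma(1/2) = \sqrt{\pi}$.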

Independence of mean and variance

By working out the joint distribution of $\bar{X}$ and $X_2 - \bar{X}, X_3 - \bar{X}, \ldots, X_n - \bar{X}$, we also arrive at the useful conclusion that the sampling distributions of $\bar{X}$ and $S^2$ are independent. In other words, for normally distributed variables, the mean and variance have no relationship whatsoever. This is obviously not true for other distributions: for example, we saw that for the binomial distribution the variance is determined by the mean, $\mathrm{Var}(X) = E(X)\,(1 - E(X)/n)$.
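A rough way to see this contrast is to simulate many samples, compute the sample mean and sample variance of each, and look at their correlation. This is only a sketch: a correlation near zero is a symptom of independence rather than a proof of it, and the sample size of 10, the Binomial(10, 0.2) comparison, the number of replications, and the seed are assumptions chosen for illustration.

```python
import math
import random


def mean_and_variance(sample):
    """Sample mean and (n - 1)-denominator sample variance."""
    n = len(sample)
    m = sum(sample) / n
    return m, sum((x - m) ** 2 for x in sample) / (n - 1)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mean_variance_correlation(draw, n=10, reps=5000):
    """Correlation between sample mean and sample variance over many samples."""
    means, variances = [], []
    for _ in range(reps):
        m, v = mean_and_variance([draw() for _ in range(n)])
        means.append(m)
        variances.append(v)
    return pearson(means, variances)


rng = random.Random(4)
corr_normal = mean_variance_correlation(lambda: rng.gauss(0.0, 1.0))
# Each binomial draw is built as a sum of ten Bernoulli(0.2) trials.
corr_binomial = mean_variance_correlation(
    lambda: sum(rng.random() < 0.2 for _ in range(10))
)
print(f"normal:   corr(mean, variance) = {corr_normal:.3f}")
print(f"binomial: corr(mean, variance) = {corr_binomial:.3f}")
```

For the normal samples the correlation should hover around zero, whereas for the binomial samples it should be clearly positive, because samples with a larger mean also tend to have a larger variance.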

Distribution of the sample mean (normal case)

Finally, it is worth mentioning that when a random variable follows a normal distribution, the distribution of its sample mean is exactly normal (i.e., the central limit theorem is an exact result, not an approximation). More formally, suppose $X_1, X_2, \ldots, X_n \sim N(\mu, \sigma^2)$ are mutually independent; then

$$\sqrt{n}\, \frac{\bar{X} - \mu}{\sigma} \sim N(0, 1)$$

The t-distribution

The problem of “What is the resulting distribution when you divide one random variable by another?” was studied by a statistician named W. S. Gosset, who showed the following. Suppose that $Z \sim N(0, 1)$, $X^2 \sim \chi^2_n$, and that $Z$ and $X^2$ are independent; then

$$\frac{Z}{\sqrt{X^2/n}} \sim t_n,$$

the t-distribution with $n$ degrees of freedom.
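Gosset's construction can be simulated directly: draw a standard normal, draw an independent chi-squared variable as a sum of squared normals, and take the ratio. The sketch below uses 4 degrees of freedom and compares the probability of landing beyond ±1.96 with the corresponding normal probability; the cutoff, the number of draws, and the seed are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist


def draw_t(n, rng):
    """One draw of Z / sqrt(X^2 / n) with Z ~ N(0, 1) and X^2 ~ chi-squared(n)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))
    return z / math.sqrt(chi2 / n)


rng = random.Random(3)
df = 4
draws = [draw_t(df, rng) for _ in range(50000)]

# Probability of falling outside +/- 1.96 under each distribution.
t_tail = sum(abs(t) > 1.96 for t in draws) / len(draws)
normal_tail = 2 * NormalDist().cdf(-1.96)
print(f"P(|T| > 1.96) with {df} df: {t_tail:.3f}")
print(f"P(|Z| > 1.96):             {normal_tail:.3f}")
```

The simulated t-distribution puts noticeably more probability in the tails than the normal distribution does, which is the extra uncertainty that comes from estimating the standard deviation.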

t-distribution vs. normal distribution, df = 4

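[Figure: densities of the standard normal distribution and the t-distribution with 4 degrees of freedom; horizontal axis from −4 to 4, vertical axis Density; legend: Normal, t]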