Notes on T Distributions - Statistical Methods | STA 2023 | Study notes Data Analysis & Statistical Methods

NOTES: t-distributions

Hypothesis tests and confidence intervals involving mean μ with unknown σ.

Suppose that hypothesis tests and confidence intervals involving μ are provided with a known standard deviation of the population (σ). This would enable us to calculate the standard deviation of the sample mean distribution, since $SD(\bar{x}) = \sigma/\sqrt{n}$, where n is the sample size. We now assume that σ is not known (as is most generally the case in real life), and therefore replace $SD(\bar{x})$ with the Standard Error of the Mean. In other words, we use $SE(\bar{x}) = S_x/\sqrt{n}$, where $S_x$ is the standard deviation of the sample.

This, however, causes a problem, since $S_x$ varies from sample to sample, and how much it varies depends on the sample size n. The result is a different (possibly not normal) distribution for each sample size n. Each of these distributions is called a t-distribution.
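The effect of replacing σ with $S_x$ can be seen in a small simulation. The sketch below is only an illustration, not part of the course materials: it draws repeated samples from a standard normal population (so μ = 0), computes $(\bar{x} - \mu)/SE(\bar{x})$ for each, and counts how often the result falls outside ±1.96, the cutoff that would leave 5% in the tails of a normal curve. The function names, the seed, and the number of trials are all assumptions chosen for the example.

```python
import math
import random
import statistics

def t_statistic(sample, mu):
    """(x-bar - mu) divided by the standard error S_x / sqrt(n)."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)  # stdev divides by n - 1
    return (statistics.mean(sample) - mu) / se

def tail_fraction(n, trials=5000, cutoff=1.96, seed=0):
    """Fraction of simulated samples of size n whose |t| exceeds cutoff."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(t_statistic(sample, 0.0)) > cutoff:
            hits += 1
    return hits / trials

small = tail_fraction(5)   # n = 5: noticeably more than 5% in the tails
large = tail_fraction(50)  # n = 50: much closer to 5%
print(small, large)
```

With a small sample the tails are heavier than the normal curve predicts, which is why a separate distribution is needed for each n.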

Facts about t-distributions:

• bell-shaped curve
• mean = 0
• symmetric about 0
• Each one is different from the standard normal curve. As n gets larger, the t-distributions get closer to the standard normal curve. When n ≥ 30, the t-distributions are almost normal (are very close to the standard normal curve).
• degrees of freedom = n − 1. The degrees of freedom (df) indicate which t-distribution you will use.
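To see how the curves approach the standard normal curve as n grows, the heights of the two curves at 0 can be compared. This is a rough sketch using the standard density formula for a t-distribution with df degrees of freedom; the helper names t_pdf and normal_pdf are made up for this example.

```python
import math

def t_pdf(t, df):
    """Density of the t-distribution with df degrees of freedom."""
    # Work with logs (lgamma) so large df values do not overflow.
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))

def normal_pdf(z):
    """Density of the standard normal curve."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

# Gap between the two curves at their peak, for several sample sizes.
for n in (5, 15, 30, 100):
    df = n - 1
    gap = abs(t_pdf(0.0, df) - normal_pdf(0.0))
    print(n, df, gap)
```

The printed gap shrinks as n increases, and the density is the same at t and −t, matching the symmetry listed above.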

For hypothesis tests, we will use a “t-test statistic” or “t-score” instead of a z-score. The equation for the t-score is $t = \frac{\bar{x} - \mu_0}{S_x/\sqrt{n}}$, where $\mu_0$ is the assumed population mean from the null hypothesis $H_0$. You will also need to use tcdf instead of normalcdf on your calculator: press 2ND VARS and select tcdf. The inputs are tcdf(LB, UB, df), where LB and UB are the lower and upper bounds of the interval and df is the degrees of freedom.
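For checking work away from the calculator, the same steps can be sketched in Python. This is an approximation under stated assumptions, not the calculator's actual routine: the tcdf function here is a stand-in with the same inputs (LB, UB, df) that integrates the t density numerically with Simpson's rule, and the sample and the null value μ₀ = 4 are invented for illustration.

```python
import math
import statistics

def t_pdf(t, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))

def tcdf(lb, ub, df, steps=2000):
    """Approximate P(lb < T < ub) by Simpson's rule (steps must be even)."""
    h = (ub - lb) / steps
    total = t_pdf(lb, df) + t_pdf(ub, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(lb + i * h, df)
    return total * h / 3

def t_score(sample, mu0):
    """t = (x-bar - mu0) / (S_x / sqrt(n))."""
    n = len(sample)
    return (statistics.mean(sample) - mu0) / (statistics.stdev(sample) / math.sqrt(n))

def two_sided_p_value(t, df):
    # The curve is symmetric about 0, so each tail is 0.5 - P(0 < T < |t|).
    return 2 * (0.5 - tcdf(0.0, abs(t), df))

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
mu0 = 4                            # hypothetical null-hypothesis mean
t = t_score(sample, mu0)
p = two_sided_p_value(t, len(sample) - 1)  # df = n - 1
print(t, p)
```

Using the symmetry about 0 avoids having to pass a very large number as the upper bound, which the simple integration above would not handle well.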

For confidence intervals we will use $t^*$ instead of $z^*$. The $t^*$ value depends on the t-distribution you are using, and so depends on the degrees of freedom (n − 1). $t^*$ cannot be obtained from your graphing calculator and so it must be looked up on a table in your textbook. The formula for a confidence interval for means (with unknown σ) is $\bar{x} \pm t^* \frac{S_x}{\sqrt{n}}$.
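The interval formula can be sketched as a short function that takes the table value of t* as an input, mirroring the table lookup described above. The function name and the sample are assumptions made for this example; the value 2.365 is the usual table entry for 95% confidence with df = 7.

```python
import math
import statistics

def t_confidence_interval(sample, t_star):
    """Return (lower, upper) for x-bar ± t* · S_x / sqrt(n)."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    margin = t_star * statistics.stdev(sample) / math.sqrt(n)
    return x_bar - margin, x_bar + margin

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, n = 8 so df = 7
t_star = 2.365                     # looked up in a t table: 95%, df = 7
low, high = t_confidence_interval(sample, t_star)
print(low, high)
```

Passing t* in as a parameter keeps the sketch honest about where that number comes from: a different confidence level or sample size needs a different table entry.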

Requirements for using t-distributions:

1. SRS: The sample mean must be computed from a simple random sample.
2. Sufficiently large sample size:

a. CASE: n < 15. The data should be very close to a Normal model. Do not use t-methods if there is strong skewness or outliers.

b. CASE: 15 ≤ n < 40. t-methods should work as long as the data is unimodal and reasonably symmetric (make a histogram). t-methods should not be used in the presence of outliers or strong skewness.

c. CASE: n ≥ 40. t-methods can be used even in the presence of strong skewness or a few outliers. In this case t-methods are called “robust.”
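As a quick reminder of which case applies, the three sample-size cases can be written as a small lookup. This is only a restatement of the rules of thumb above in code; the function name and the wording of the returned messages are made up for this sketch, and it does not check the SRS requirement.

```python
def t_method_guidance(n):
    """Return the rule of thumb from these notes for a sample of size n."""
    if n < 2:
        raise ValueError("need at least 2 observations to compute S_x")
    if n < 15:
        return "data should be very close to Normal; no strong skewness or outliers"
    if n < 40:
        return "data should be unimodal and reasonably symmetric; no outliers or strong skewness"
    return "t-methods are robust; strong skewness or a few outliers are acceptable"

for n in (10, 25, 60):
    print(n, t_method_guidance(n))
```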
