



An introduction to hypothesis testing, a statistical method used to assess claims about population parameters based on sample data. The notes cover the concept of statistical inference, its goals, and the role of hypothesis testing in making decisions about populations. They also cover the terminology of hypothesis tests, including null and alternative hypotheses and p-values, give examples of one-sided and two-sided hypothesis tests, show the use of z and t statistics, and discuss type I and type II errors.




Statistical Methods and Computing
Introduction to Hypothesis Testing
Lecture 14 Mar. 10 and 14, 2008
Kate Cowles 374 SH, 335- [email protected]
Recall that statistical inference is using data contained in a sample to draw conclusions or make decisions about the entire population from which the sample is taken.
Two main goals of statistical inference
The purpose of hypothesis testing is to "assess the evidence provided by data about some claim concerning a population."*
Example:
I claim that my husband's resting pulse rate is 45 beats per minute. This is very low and would be typical of either a highly trained athlete or a sick individual.
To test my claim, you wish to measure his resting heart rate on 5 different occasions.
Here, the "population" of interest is all possible measurements of my husband's resting pulse rate. My claim may be interpreted as saying that the mean μ of this "population" of values is 45 beats per minute.
Suppose the measurements you get are:
42 52 43 48 47
The sample mean x̄ = 46.4. Does this provide evidence against my claim?
We will consider this question by asking what would happen if my claim were true and we repeated the sample of 5 measurements many times.
Suppose first that we knew that the standard deviation of measurements of my husband's resting heart rate was σ = 4 beats per minute.
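The thought experiment above (repeat the sample of 5 measurements many times, supposing the claim is true) can be sketched as a small simulation. This is only an illustration: it assumes the individual measurements are normally distributed, which the notes do not state at this point, and the function name, seed, and number of repetitions are arbitrary choices.

```python
import random
import statistics

MU_0 = 45.0           # claimed mean pulse rate (from the example)
SIGMA = 4.0           # standard deviation assumed known (from the example)
N = 5                 # measurements per sample
OBSERVED_MEAN = 46.4  # sample mean actually observed


def simulate_sample_means(num_repeats=20000, seed=1):
    """Draw many samples of size N under the claim and return their means.

    Assumption (not from the notes): each measurement is Normal(MU_0, SIGMA).
    """
    rng = random.Random(seed)
    means = []
    for _ in range(num_repeats):
        sample = [rng.gauss(MU_0, SIGMA) for _ in range(N)]
        means.append(statistics.fmean(sample))
    return means


sample_means = simulate_sample_means()

# Spread of the simulated sample means versus the theoretical sigma / sqrt(n).
sd_of_means = statistics.stdev(sample_means)
theoretical_sd = SIGMA / N ** 0.5

# Share of simulated sample means at least as large as the observed one.
share_at_least_observed = (
    sum(m >= OBSERVED_MEAN for m in sample_means) / len(sample_means)
)

print(f"simulated sd of sample means: {sd_of_means:.3f}")
print(f"theoretical sigma/sqrt(n):    {theoretical_sd:.3f}")
print(f"share of means >= {OBSERVED_MEAN}: {share_at_least_observed:.3f}")
```

If the claim were true, a sample mean of 46.4 or more would turn up in a sizeable share of repeated samples, so by itself it is not a surprising result.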
Terminology of hypothesis tests
The null hypothesis is the statement being tested.
The alternative hypothesis is the claim for which we are trying to find evidence.
In the example about my husband's heart rate, your alternative hypothesis probably was
Ha : μ > 45
The p-value of the test is the probability, computed assuming that H0 is true, that the observed outcome would take a value as extreme as, or more extreme than, what we actually observed.
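As a worked illustration of this definition, the p-value for the pulse-rate example can be computed directly. The sketch below uses the five measurements, the claimed mean of 45, the known σ = 4, and the one-sided alternative Ha : μ > 45 from the example; treating the sample mean as normally distributed is an assumption, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, fmean

measurements = [42, 52, 43, 48, 47]  # data from the example
mu_0 = 45.0                          # mean under the null hypothesis
sigma = 4.0                          # standard deviation assumed known

x_bar = fmean(measurements)                      # sample mean
standard_error = sigma / sqrt(len(measurements)) # sd of the sample mean
z = (x_bar - mu_0) / standard_error              # standardized sample mean

# One-sided alternative mu > 45: "more extreme" means a larger z,
# so the p-value is the area to the right of z under the standard normal curve.
p_value = 1.0 - NormalDist().cdf(z)

print(f"x_bar = {x_bar:.1f}, z = {z:.3f}, p-value = {p_value:.3f}")
```

A p-value this large (a bit over 0.2) means that a sample mean of 46.4 or higher is quite likely when the true mean is 45.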
The result of a hypothesis test is a decision. The possible outcomes are called
Before we carry out the test, we must decide how strong we will require the evidence to be in order for us to reject H0. We specify this in terms of a significance level.
Two-sided hypothesis tests
Example: We wish to compare fasting serum cholesterol levels in persons over 21 living in a group of islands in the South Pacific with typical levels found in the U.S.
We know that levels in adults over 21 in the US are approximately normally distributed with
We have no idea what the relative levels of serum cholesterol are on the islands as compared with the U.S.
We will assume that the levels on the islands are normally distributed with
The hypotheses for our two-sided test are:
H0 : μ = 190
Ha : μ ≠ 190
Before we look at our data, we will decide on the significance level α for our test. Let us choose α = .05.
We then perform blood tests on 100 adults from the islands and find that the sample mean level x̄ = 181.5 mg/dl.
To carry out our hypothesis test, we note that, if H0 is true, the sampling distribution of x̄ is normal with
μ = 190, σ_x̄ = σ/√n
We will standardize the value of x̄ that we observed to find out how likely we would have been to get a value as extreme as what we got, or more extreme, if H0 were true.
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
We must find out what area under the standard normal curve lies
The answer is .017 + .017 = .034.
This is the p-value for the test. Since p < .05 we reject the null hypothesis and conclude that serum cholesterol levels are different among adult residents of the Pacific Islands than among adults in the U.S.
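The two-sided calculation can be checked with a short script. The value of σ is missing from the text as it survives here, so the sketch below assumes σ = 40 mg/dl; that value is not stated in the notes and is only inferred from the reported tail area of .017, which corresponds to z of about −2.12 when n = 100. Everything else comes from the example.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 190.0   # mean under the null hypothesis (from the example)
x_bar = 181.5  # observed sample mean in mg/dl (from the example)
n = 100        # sample size (from the example)
alpha = 0.05   # significance level chosen before looking at the data

# ASSUMPTION: sigma is not given in the surviving text. 40 mg/dl is inferred
# from the reported one-tail area of .017 and may not be the author's value.
sigma = 40.0

z = (x_bar - mu_0) / (sigma / sqrt(n))

# Two-sided test: add the area to the left of -|z| and to the right of +|z|.
one_tail_area = NormalDist().cdf(-abs(z))
p_value = 2.0 * one_tail_area
reject_null = p_value < alpha

print(f"z = {z:.3f}")
print(f"area in each tail = {one_tail_area:.3f}")
print(f"two-sided p-value = {p_value:.3f}")
print("reject H0" if reject_null else "fail to reject H0")
```

Under that assumed σ, the script reproduces the .017 + .017 = .034 figure quoted above to three decimal places.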
One sample t-tests
If we don't know the population standard deviation, then we
Example: Suppose we do not assume that we know σ for serum cholesterol levels among residents of the Pacific Islands.
From the sample of 100 adults, we compute s = 38.1 mg/dl.
We then compute
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{181.5 - 190}{38.1/\sqrt{100}} = -2.231$$
We try to use Table C to find the area to the left of -2.231 and to the right of 2.231 under a t curve with 99 degrees of freedom.
The closest we can come is that under a t curve with 100 degrees of freedom, the area in one tail would be between .01 and .02.
Thus we conclude that the p-value is somewhere between .02 and .04.
SAS can do a much better job for us! It would provide a p-value of .0279.
Thus, if we had chosen α = .05, we would reject the null hypothesis.
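For readers without SAS, the same p-value can be approximated with the Python standard library alone. This is a sketch rather than the method used in the course: the helper functions are illustrative, and the tail area is found by integrating the t density numerically with Simpson's rule, since the standard library has no built-in t distribution.

```python
from math import exp, lgamma, log, pi, sqrt


def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + x * x / df))


def two_sided_t_p_value(t_stat, df, intervals=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (intervals must be even)."""
    upper = abs(t_stat)
    h = upper / intervals
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, intervals):
        weight = 4 if i % 2 else 2
        total += weight * t_density(i * h, df)
    area_zero_to_t = total * h / 3
    # Each tail holds 0.5 minus the area between 0 and |t|.
    return 2.0 * (0.5 - area_zero_to_t)


# Values from the example.
x_bar, mu_0, s, n = 181.5, 190.0, 38.1, 100

t_stat = (x_bar - mu_0) / (s / sqrt(n))
p_value = two_sided_t_p_value(t_stat, df=n - 1)

print(f"t = {t_stat:.3f} with {n - 1} degrees of freedom")
print(f"two-sided p-value = {p_value:.4f}")
```

The result should agree closely with the SAS figure quoted above and falls inside the .02 to .04 bracket obtained from Table C.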