Statistical Inference Confidence Intervals - Tests 3 Notes - Statistics | STAT 2000, Study notes of Statistics

Test 3 notes Material Type: Notes; Professor: Morse; Class: Introductory Statistics; Subject: Statistics; University: University of Georgia; Term: Fall 2010;

Typology: Study notes

2009/2010

Uploaded on 12/09/2010

sralac1

Page ① of 89

Chapter 8: Statistical Inference: Confidence Intervals

8.1 What are Point and Interval Estimates of Population Parameters?

When we first began our discussion of statistics, we mentioned that there were two branches of statistics: descriptive and inferential. The inferential branch uses sample information to draw conclusions about the population.

One of the most common uses of the inferential branch is to use sample statistics, such as x̄, to estimate population parameters, such as μ. It makes sense that if we take a large enough sample, x̄ should be pretty close to the actual value of μ. But the chances are pretty small that x̄ turns out to be exactly μ. This is why we call x̄ a point estimate of μ.

The key here is that sample statistics estimate population parameters. For example, x̄ is a point estimate of μ and p̂ is a point estimate of p.

So we know the value of x̄ is probably pretty close to μ, but we want to get even closer. So rather than just say x̄ is close to μ, we are going to build an interval around x̄, and then say that μ probably lies somewhere on this interval. We call this an interval estimate. Here’s an example:

Page ② of 89

Suppose you were asked to estimate the average age of all the students in our class. You might survey 10 students and find their average age to be 20. This sample mean of 20 would be a point estimate of μ. BUT you could also express your guess by giving a range of ages centered around your sample mean. So your guess could be 20 give or take 2 years. This “give or take 2 years” part is what we call the margin of error which we will talk about more later. So mathematically, your guess would be 20 +/- 2 which would be the interval estimate. Suppose you were then asked how confident you were that μ, the mean age of all students, was within your interval estimate of 18 to 22 years old. You might say “I am 95% confident that the mean age of all students is within 18 to 22 years old.”

In statistics, we construct intervals for the population mean that are centered around an estimate. This estimate is x̄, the sample mean. Since we can’t get the full population mean, we go for the next best thing. We take a sample and calculate a sample mean.

And what we add and subtract from the sample mean to get the interval estimate is the margin of error. When we construct these interval estimates, we call them confidence intervals.

Page ③ of 89

So a confidence interval of a parameter consists of an interval of numbers, and this interval is our point estimate +/- our margin of error. Just as in our example above where our interval was 20 +/- 2, or in other words, 18 to 22. 20 is our point estimate and 2 is our margin of error.

We call the value we obtain when we take the point estimate minus the margin of error, in our example 20 – 2 or 18, the lower limit or lower bound. And we call the value we obtain when we take the point estimate plus the margin of error, in our example 20 + 2 or 22, the upper limit or upper bound.

You will also see the notation of the lower and upper limit in parentheses for confidence intervals. In our above example, the confidence interval may be written as (18,22).
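The arithmetic behind an interval estimate can be sketched in a few lines of Python. The function name `interval_estimate` is only an illustrative choice and does not come from the notes.

```python
def interval_estimate(point_estimate, margin_of_error):
    """Return (lower limit, upper limit) = point estimate -/+ margin of error."""
    lower = point_estimate - margin_of_error
    upper = point_estimate + margin_of_error
    return lower, upper

# Classroom age example: sample mean of 20, give or take 2 years
print(interval_estimate(20, 2))  # (18, 22)
```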

Page ④ of 89

It is also important for us to note the level of confidence of a confidence interval. In our example before, our level of confidence would have been that we were 95% confident that the mean age of all students in the class was somewhere on our interval.

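So the level of confidence tells us how often the method works: if we took many samples and built an interval from each one, about 95% of those intervals (for a 95% level of confidence) would contain the population parameter, in this case, μ.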

We will see in examples that as we increase our level of confidence, we will get wider and wider intervals.

We will be constructing two different types of confidence intervals:

  1. In Section 8.2, we will be calculating the confidence interval for the population proportion, p.
  2. In Section 8.3, we will be calculating the confidence interval for the population mean, μ (like our classroom age example).

Before we get to these sections, let’s make sure we understand the terms in the example on the next page. In this example, the confidence interval will already be constructed for us. In Sections 8.2 and 8.3, we will actually learn how to construct these confidence intervals.

Page ⑤ of 89

Example: Suppose a farmer is trying to estimate the average number of peaches per tree in his orchard. He does not want to count every peach on every tree, so he takes a random sample of a few trees and calculates a 95% confidence interval based on the sample. That 95% confidence interval for the mean number of peaches per tree in the orchard is 112 to 148 peaches per tree. This means that we are 95% confident that the population mean, μ, for the number of peaches per tree is somewhere between 112 and 148 peaches per tree.

What is the lower limit?

What is the upper limit?

What is the level of confidence?

What is the width of the confidence interval?

What is the sample mean, x̄? Remember, the sample mean is always the middle of the confidence interval. The sample mean, x̄, will always be on the confidence interval, but the population mean, μ, may or may not be on the confidence interval.

What is the margin of error?
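One way to check answers to questions like these is to work backwards from the two limits. The sketch below assumes only that the interval is point estimate +/- margin of error; the helper name `describe_interval` is made up for illustration.

```python
def describe_interval(lower, upper):
    """Recover the width, the middle, and the margin of error from the limits."""
    width = upper - lower
    center = (lower + upper) / 2   # the sample mean sits in the middle
    margin = width / 2             # half the width is the margin of error
    return center, margin, width

center, margin, width = describe_interval(112, 148)
print(center)  # 130.0
print(margin)  # 18.0
print(width)   # 36
```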

Page ⑥ of 89

Example of a Confidence Interval for a proportion: Suppose there is an election coming up, and we want to predict what proportion of the votes that candidate A will receive. Suppose we took a random sample of 200 voters and found that 112 of these voters said they would vote for candidate A. What proportion of the voters in our sample said they would vote for candidate A? In other words, what is the sample proportion for this sample?

We are trying to predict what proportion of all voters will vote for candidate A by using this sample. What if I told you that the margin of error for a 95% confidence interval to be used to predict the population proportion is equal to 0.07. Construct and interpret this interval.

Candidate A will win if he/she gets more than 50% of the votes. Using the interval, are we 95% confident that more than half of the voters will vote for candidate A?
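Here is a small sketch of how this example could be worked in Python, using the margin of error of 0.07 that was given. The variable names are illustrative assumptions.

```python
x, n = 112, 200        # voters for candidate A, sample size
p_hat = x / n          # sample proportion
margin = 0.07          # margin of error given in the example

lower = p_hat - margin
upper = p_hat + margin

# We can only be 95% confident of a win if the WHOLE interval is above 0.5.
majority_supported = lower > 0.5

print(round(lower, 2), round(upper, 2), majority_supported)  # 0.49 0.63 False
```

Because the lower limit falls below 0.5, values of 50% or less are still on the interval.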

Page ⑦ of 89

8.2 How Can We Construct a Confidence Interval to Estimate a Population Proportion?

Recall from Section 8.1 that confidence intervals can be written in the general format: point estimate +/- margin of error. The point estimate and margin of error change depending on what parameter is being estimated. For example, we looked at an example of a Confidence Interval for μ, so our point estimate was x̄. Now we will consider the format of the Confidence Interval for the population proportion, p.

The point estimate for this type of Confidence Interval is the sample proportion, p̂ = x / n, where x is the number of individuals in the sample with the desired characteristic and n is the sample size.

So we know what goes before the +/-, the point estimate, and we can calculate that easily. Now we need to know how to calculate what goes after the +/-, the margin of error.


Page ⑧ of 89

The margin of error will always be a multiple of the standard error. In Section 8.2, we discuss confidence intervals for population proportions, so the standard error will be:

\[ se = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \]

Why is p̂ now in the formula and not p?

So the margin of error will always be some number times the standard error we see above. The number we multiply the standard error by to get the margin of error TOTALLY depends on the level of confidence.

The general formula for a confidence interval for the population proportion is:

\[ \hat{p} \pm Z \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \]

You can see that the margin of error is this “Z” value times the standard error. Later on in this chapter, we will see how to get this Z value, because this Z value TOTALLY depends on our level of confidence, how confident we want to be that the population proportion is on our interval.

For now, we will just focus on 95% confidence intervals, where this Z value equals 1.96.
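As a rough sketch, the 95% interval for a proportion can be computed like this. It assumes the usual standard error for a sample proportion, the square root of p̂(1 − p̂)/n; the function name `proportion_ci` is hypothetical.

```python
import math

def proportion_ci(x, n, z=1.96):
    """Confidence interval for a population proportion.

    x: number in the sample with the characteristic
    n: sample size
    z: Z value for the level of confidence (1.96 for 95%)
    """
    p_hat = x / n                              # point estimate
    se = math.sqrt(p_hat * (1 - p_hat) / n)    # standard error
    margin = z * se                            # margin of error
    return p_hat - margin, p_hat + margin

# Election example from Section 8.1: 112 of 200 voters
lower, upper = proportion_ci(112, 200)
print(round(lower, 3), round(upper, 3))  # 0.491 0.629
```

The margin of error here works out to about 0.069, which agrees with the 0.07 quoted in the election example.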


Page ⑬ of 89

How can we use a Confidence Level Other Than 95%?

So far we have just been creating 95% confidence intervals, so our margin of error has been 1.96 * (standard error). But where does this 1.96 come from? And what if we want something different than a 95% confidence interval?

We can never have a 100% confidence interval, because we can never be 100% sure that the population proportion is on the interval if we don’t know it. But what if we want a 90% confidence interval, or a 99% confidence interval?

Here is how we get the 1.96 for a 95% confidence interval: First, when you think of a 95% confidence interval, think of a normal curve with 95% shaded in the middle like this:

If .95 or 95% is in the middle, then what is the area in each of the tails? 1 - .95 = .05 and .05/2 = .025 so .025 is in each of the tails. Now, put .025 as the little tail area to the right in your StatCrunch calculator with mean = 0 and standard deviation = 1, hit Compute and you get 1.96!

Page ⑭ of 89

This 1.96 is the Z-score that matches up with a 95% confidence interval. This is why it was important for us to find those probabilities involving Z-scores before.

This Z-score value is what we now take and multiply the standard error by to get the margin of error for a confidence interval, and it will always be a positive Z-score.

We did a lot of work to get there, let’s work through it again and see what the Z-score would be for a 90% confidence interval: First, draw the curve with .90 in the middle and find the area of both tails:

Then put the area of the tail in the StatCrunch Normal Calculator with mean = 0 and standard deviation = 1:

The z-score you get = 1.64485.

So to get the margin of error for a 90% confidence interval you multiply the standard error by 1.64485. Let’s use this in the following example:
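If you do not have StatCrunch handy, the same Z values can be looked up with Python's standard library. This is only a sketch of the "split the leftover area between two tails" idea; `z_for_confidence` is a made-up helper name.

```python
from statistics import NormalDist

def z_for_confidence(level):
    """Positive Z-score that leaves (1 - level) / 2 in the right tail."""
    tail = (1 - level) / 2                       # area in each tail
    standard_normal = NormalDist(mu=0, sigma=1)  # mean = 0, standard deviation = 1
    return standard_normal.inv_cdf(1 - tail)     # Z with area (1 - tail) to its left

print(round(z_for_confidence(0.95), 2))  # 1.96
print(round(z_for_confidence(0.90), 5))  # 1.64485
```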

Page ⑮ of 89

Example: A study of 70 randomly selected people in Atlanta was conducted to estimate the proportion of Atlantans that owned dogs. The study revealed that 42 of the 70 people were dog-owners.

a) Obtain a point estimate for the population proportion of dog-owners in Atlanta.

b) Verify that the requirements for constructing a confidence interval about p are satisfied.

c) Construct a 90% confidence interval for the proportion of Atlantans that are dog-owners.

d) Interpret the confidence interval.
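Parts a) through c) can be sketched in Python as below. The requirement check is an assumption on our part: it uses the common rule of at least 15 successes and 15 failures, on top of having a random sample.

```python
import math
from statistics import NormalDist

x, n, level = 42, 70, 0.90

# a) point estimate
p_hat = x / n

# b) assumed requirement: at least 15 dog-owners and 15 non-owners in the sample
requirements_ok = x >= 15 and (n - x) >= 15

# c) 90% confidence interval: p_hat +/- Z * standard error
z = NormalDist().inv_cdf(1 - (1 - level) / 2)
se = math.sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - z * se, p_hat + z * se

print(round(p_hat, 2), requirements_ok)  # 0.6 True
print(round(lower, 5), round(upper, 5))  # 0.50369 0.69631
```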

Page ⑯ of 89

Now, using the same example as above, construct a 99% confidence interval. Let’s see how the interval changes if we increase the confidence level.

We have the point estimate and the standard error, so we just need the new Z-score for this confidence interval: First, draw the curve with .99 in the middle and find the area of both tails:

Then put the area of the tail in the StatCrunch Normal Calculator with mean = 0 and standard deviation = 1:

The Z-score =

Now create the 99% confidence interval:

Notice that the 99% confidence interval is wider than the 90% confidence interval.

Page ⑰ of 89

In this example, we saw that:

As the level of confidence increases, the margin of error increases and the confidence interval gets wider.

ALSO, as the level of confidence decreases, the margin of error decreases and the confidence interval gets narrower.

This applies to all confidence intervals, like in the picture below:

Why is this true?

With a 95% confidence interval, we want to be 95% confident that the population parameter is on the interval.

But with a 99% confidence interval, we want to be even more confident (99% confident) that the population parameter is on the interval. So to be that much more sure the proportion is on the interval, we need a wider interval.
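To see this with numbers, here is a short sketch that keeps the dog-owner sample fixed (p̂ = 0.6, n = 70) and only changes the level of confidence. The helper name `margin_of_error` is illustrative.

```python
import math
from statistics import NormalDist

def margin_of_error(p_hat, n, level):
    """Z value for the level of confidence times the standard error."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

margins = {level: margin_of_error(0.6, 70, level) for level in (0.90, 0.95, 0.99)}
for level, m in margins.items():
    print(f"{level:.0%} confidence: margin of error = {m:.5f}")
```

Each step up in confidence gives a larger margin of error, so the interval gets wider.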

Page ⑱ of 89

We have seen what happens when we change the confidence level, what about if we change the sample size?

As the sample size increases, the margin of error decreases and the confidence interval gets narrower.

As the sample size decreases, the margin of error increases and the confidence interval gets wider.

So the opposite happens when we increase the sample size. The confidence interval gets narrower.

Why is this true?

As we increase our sample size, the sample statistic we obtain (whether we are looking for a mean or a proportion) is a better representation of the population.

So as we increase our sample size, our point estimate is a better and better estimate, and we don’t need such a wide confidence interval.
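The same kind of sketch shows the effect of sample size. The sample sizes 280 and 1120 below are hypothetical, chosen so that each one is four times the one before; p̂ is held at 0.6 and the level of confidence at 95%.

```python
import math

def margin_95(p_hat, n):
    """95% margin of error for a proportion: 1.96 times the standard error."""
    return 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)

for n in (70, 280, 1120):
    print(n, round(margin_95(0.6, n), 4))
```

Because n sits under a square root, making the sample four times as large cuts the margin of error in half.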

Page ⑲ of 89

RECAP:

The following symbols go along with the following terms when calculating the confidence interval for the population proportion:

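Term                   Symbol

Point Estimate         p̂

Margin of Error        \( Z \sqrt{\hat{p}(1 - \hat{p}) / n} \)

Standard Error         \( \sqrt{\hat{p}(1 - \hat{p}) / n} \)

Confidence Interval    \( \hat{p} \pm Z \sqrt{\hat{p}(1 - \hat{p}) / n} \)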

Page ⑳ of 89

HOW CAN STATCRUNCH CALCULATE THESE CONFIDENCE INTERVALS FOR US?

Look back at our example where we wanted to get a 90% confidence interval for the population proportion of ALL Atlantans that own dogs on page 30 of our notes.

We got the 90% confidence interval which has a lower limit of .50369 and an upper limit of .69631.

On page 31, we got the 99% confidence interval which has a lower limit of .44917 and an upper limit of .75083.

Guess what, STATCRUNCH can get these values for us. Go to Stat → Proportions → One Sample → With Summary

Here we can type in how many Atlantans owned dogs in our sample. In our sample, 42 of 70 Atlantans owned dogs. Put those numbers in just like this and hit Next:

On the next screen choose “Confidence Interval”, and we want a 90% confidence interval, so change the 0.95 to 0.90:

Page 25 of 89

So our confidence interval formula for the population mean is:

Lower limit: \( \bar{x} - T \cdot \frac{s}{\sqrt{n}} \)   Upper limit: \( \bar{x} + T \cdot \frac{s}{\sqrt{n}} \)

These intervals are valid when we:

  1. use a random sample AND
  2. either use a sample size > 30 OR when we are sampling from a normal population.

So we can get the sample mean, sample standard deviation and n value, but we haven’t yet talked about what the T value is that we want from the T Calculator.

To get the T value is just the same as getting the Z value when we were doing confidence intervals for the population proportion in Section 8.2.

The only difference is that the T value depends on BOTH the confidence level and the sample size.

Page 26 of 89

Let’s find the T value for a 95% confidence interval if the sample size we used is n = 32. First, draw a curve with .95 in the middle and find the area of both tails:

Next put in the right tail area = .025 in the T Calculator AND put DF = 32 – 1 = 31.

Hit Compute and you get T = 2.
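Python's standard library has no T Calculator, so the sketch below stands in for one by adding up area under the t curve numerically and then searching for the T value that leaves the right tail area we want. This is an illustration of the idea, not how StatCrunch does it; all function names are made up, and the search assumes the answer is below 200.

```python
import math

def t_pdf(x, df):
    """Height of the t curve with df degrees of freedom at x."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """Area to the left of t (for t >= 0): 0.5 plus Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_for_confidence(level, n):
    """Positive T value with (1 - level) / 2 in the right tail and DF = n - 1."""
    df = n - 1
    target = 1 - (1 - level) / 2      # area to the left of T
    lo, hi = 0.0, 200.0               # assumed bracket for the answer
    for _ in range(50):               # bisection: halve the bracket each time
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# 95% confidence with n = 32, so right tail area = .025 and DF = 31
print(round(t_for_confidence(0.95, 32), 4))
```

Notice that the function needs both the level of confidence and the sample size, unlike the Z value.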

Let’s do a few more: These are the same thing they are asking you to get on Homework 8.3-8.4. a) Find the t-score for a 99% confidence interval for a population mean with 5 observations in our sample. First, draw a curve with .99 in the middle and find the area of both tails:

Next put in the right tail area = .005 in the T Calculator AND put DF = 5 - 1 = 4. Hit Compute and you get

T =

So now we can construct confidence intervals for the population means. We can get all the symbols in these formulas:

Lower limit: \( \bar{x} - T \cdot \frac{s}{\sqrt{n}} \)   Upper limit: \( \bar{x} + T \cdot \frac{s}{\sqrt{n}} \)

Finally, let’s do some examples.

Example 7 in Section 8.3: Ipods are sold all the time on eBay. We have the prices for a random sample of seven Ipods that recently sold on eBay:

We will assume these prices are normally distributed. We want to find the 95% confidence interval for the population mean price of Ipods sold. In other words, we want to construct an interval of numbers, and be 95% confident that, if we averaged the price of ALL Ipods sold on eBay, the average price would be on our interval. We need to find:

Lower limit: \( \bar{x} - T \cdot \frac{s}{\sqrt{n}} \)   Upper limit: \( \bar{x} + T \cdot \frac{s}{\sqrt{n}} \)

Let’s break it down. n = 7 because we are using a sample of 7 Ipods sold on eBay. How do we get x̄ and s? Easy, we can list our seven prices in StatCrunch, go to Stat → Summary Stats → Columns, choose our column with the data and get:

so x̄ = 233.57143 and s = 14. Now finally we need to get the T score:

Page 29 of 89

First, draw a curve with .95 in the middle and find the area of both tails:

Next put in the right tail area = .025 in the T Calculator AND put DF = 7 - 1 = 6.

Hit Compute and you get

T = 2.

Page 30 of 89

Now we have everything we need, we can now construct the lower and upper limits of the 95% confidence interval:

Lower limit = \( \bar{x} - T \cdot \frac{s}{\sqrt{n}} \)

Upper limit = \( \bar{x} + T \cdot \frac{s}{\sqrt{n}} \)

So we are 95% confident that the mean price of ALL Ipods sold on eBay is somewhere between $220.03 and $247.11.

EXTRA QUESTION: According to our confidence interval, is it likely that the population mean price of ALL Ipods sold on eBay = $250?

No, $250 is not on our confidence interval, so therefore it is not a likely mean price for ALL Ipods sold on eBay. We think that mean price should be somewhere between $220.03 and $247.11.

EXTRA QUESTION #2: According to our confidence interval, is it possible that the population mean price of ALL Ipods sold on eBay = $225?

Yes, $225 is a possible mean price because it is on our interval.

Using STATCRUNCH to construct confidence intervals: Whenever we have actual data like in the above eBay example, we can put this data into StatCrunch and StatCrunch will actually calculate these intervals for us. First, put the seven eBay prices in a column on StatCrunch.

Go to Stat → T Statistics → One Sample → with data. Choose the column you have put the data in and hit “Next”. Choose “Confidence Interval” and type in 0.95. Hit Calculate and here are our results:

The same amounts we got before: Lower limit of the confidence interval = 220. Upper limit of the confidence interval = 247.

Let’s do an example like this where we have to calculate the limits using the summary statistics and not the actual data. Example: Suppose we are trying to estimate the average of a large population of test scores. Suppose a random sample of 16 test scores is taken from a normal population. If the average of these 16 test scores is x̄ = 78.2 and the sample standard deviation, s = 2.55, then construct a 90% confidence interval for the population mean of all test scores. Let’s do this by hand first.
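The by-hand steps can be sketched in Python as follows. The T value of 1.7531 is an assumption: it is the value a t table gives for a right tail area of .05 with DF = 16 − 1 = 15, typed in by hand rather than computed here.

```python
import math

x_bar, s, n = 78.2, 2.55, 16

# Assumed T value for a 90% interval: right tail area .05, DF = 15
t_value = 1.7531

margin = t_value * s / math.sqrt(n)   # margin of error = T * s / sqrt(n)
lower = x_bar - margin
upper = x_bar + margin

print(round(lower, 2), round(upper, 2))  # 77.08 79.32
```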

Page 37 of 89

What if we are not dealing with a population proportion example, but a population mean example? That is, we want to know what sample size we need so that the sample mean we get is close enough to the true population mean.

For example, maybe we want to estimate the average income for an entire company. We want to take a sample of their employees, and get a sample mean of their income. And we want this sample mean income to be within $5000 of the entire company’s mean income with 95% confidence. We can determine what sample size is needed so that whatever sample mean income we get, it will be within $5000 of the population mean income, and we can be 95% confident of that. Here is the formula we use to determine sample size for estimating the population mean:

\[ n = \frac{Z^2 \sigma^2}{m^2} \]

where σ is the provided standard deviation, m is the margin of error, and Z is obtained just like before.

Page 38 of 89

Example: An estimate is needed of the mean height of women in Ontario, Canada. A 95% confidence interval should have a margin of error of 3 inches. A study ten years ago in this province had a standard deviation of 10 inches. (a) About how large a sample of women is needed?

(b) About how large a sample of women is needed for a 99% confidence interval to have a margin of error of 3 inches?
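A sketch of the sample size calculation is below. It takes σ from the earlier study, gets Z from the level of confidence as in Section 8.2, and rounds up to the next whole person; the rounding up is our assumption, since a fraction of a person cannot be sampled. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, m, level):
    """Sample size n = Z^2 * sigma^2 / m^2, rounded up to a whole number."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return math.ceil(z ** 2 * sigma ** 2 / m ** 2)

print(sample_size_for_mean(10, 3, 0.95))  # (a) 43
print(sample_size_for_mean(10, 3, 0.99))  # (b) 74
```

Asking for more confidence with the same margin of error requires a larger sample.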

Chapter 9 Statistical Inference: Significance Tests about Hypotheses (Hypothesis Testing)

9.1: What are the steps for performing a Significance Test?

In this section we will introduce the language and steps of significance testing. The procedures will be addressed in later sections of Chapter 9.

Basics of Significance Testing

1. A statement is made about a population parameter.

2. A claim is made that this statement is incorrect.

3. Evidence (sample data) is collected in order to test the claim.

4. The data are analyzed in order to support or refute the claim.

Example: A car manufacturer advertises a mean gas mileage of 26 mpg. A consumer group claims that the mean gas mileage is less than 26 mpg. A sample of 33 cars is taken and the sample mean for these 33 cars is 25.2 mpg.

Significance testing is a procedure, based on sample evidence and probability, used to test claims regarding a characteristic of one or more populations. We use sample data to test hypotheses.

The Five Steps of a Significance Test:

1. Assumptions

2. Hypotheses

3. Test Statistic

4. P-value

5. Conclusion

Page 41 of 89

  1. Assumptions – each type of test will have certain assumptions that we need to check (ex. is the sample size large enough?)
  2. Hypotheses – Each significance test has two hypotheses about a population parameter: the null and alternative hypotheses.

The null hypothesis, denoted H0 (read “H-naught”), is a statement to be tested. The null hypothesis is assumed true until evidence indicates otherwise. In this chapter, it will be a statement regarding the value of a population parameter. In our car example, the null hypothesis is H0: μ = 26 mpg

This is the statement made by the car manufacturer that we have to accept as true before we test the claim.

The alternative hypothesis, denoted HA , is a claim to be tested. Generally, this is a statement that says the population parameter has a value different, in some way, from the value given in the null hypothesis. In experiments, we are usually trying to find evidence for the alternative hypothesis. In our car example, the alternative hypothesis is HA : μ < 26 mpg

This is the claim made by the consumer group, that the mileage is less than what the car manufacturer stated.

Page 42 of 89

There are three ways to set up the null and alternative hypotheses.

  1. Less than test (left-tailed test) H0: parameter = some value HA: parameter < some value

Example: A car manufacturer advertises a mean gas mileage of 26 mpg. A consumer group claims that the mean gas mileage is less than 26 mpg.

  2. Greater than test (right-tailed test) H0: parameter = some value HA: parameter > some value

Example: A newspaper states that a candidate will receive 46% of the votes in an upcoming election. An analyst believes the percentage will be higher than 46%.

  3. Not equal to test (two-tailed test) H0: parameter = some value HA: parameter ≠ some value

Example: Five years ago, the average daily rainfall in a jungle was 2 inches. A scientist thinks it is different now.

We always test about population parameters, like μ and p. We never test about sample statistics, like x̄ and p̂, because they change with every sample.

Example: Determine whether the significance test is left- tailed, right-tailed or two-tailed.

a) H0: μ = 26 HA: μ < 26

b) H0: p = 0. HA: p > 0.

c) H0: μ = 2 HA: μ ≠ 2

  3. Test Statistic

In all of these tests, we will be testing the population parameter based on what we get in a sample. The test statistic tells how far away the sample statistic is from the assumed population parameter. It tells us this information in terms of how many standard errors away the sample statistic we get is from the assumed population parameter.

Think of the car example. The manufacturer states that their cars get 26 mpg. We take a sample of their cars, and in our sample, their cars get 25.2 mpg on average. We want to see how far away our sample mean, 25.2, is from the assumed population parameter, 26, in terms of the standard error.

Example: If the test statistic = -1, then the sample mean of 25.2 was only one standard error below the population mean of 26 mpg. Draw a curve to represent this.

Example: If the test statistic = -2.5, then the sample mean of 25.2 was 2.5 standard errors below the population mean of 26 mpg. Draw a curve to represent this.

We will see the formula for how to calculate this test statistic for different tests in Sections 9.2 and 9.3.
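Until we get to those formulas, the general idea can be sketched as (sample statistic − assumed parameter) / standard error. The standard errors of 0.8 and 0.32 used below are hypothetical values, picked only so that the results match the two car examples above.

```python
def standardized_distance(sample_stat, assumed_value, standard_error):
    """How many standard errors the sample statistic is from the assumed value."""
    return (sample_stat - assumed_value) / standard_error

# Car example: sample mean 25.2 mpg, assumed population mean 26 mpg
print(round(standardized_distance(25.2, 26, 0.8), 2))   # -1.0
print(round(standardized_distance(25.2, 26, 0.32), 2))  # -2.5
```

A negative value just means the sample mean came out below the assumed population mean.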

Page 49 of 89

9.2: Significance Tests about Proportions

Let’s look at an example of a significance test about a population proportion before we get to the steps in the test.

Example: A magazine states that 40% of the population will vote for candidate A in the upcoming election. We claim that the proportion is higher than 40%. State the null and alternative hypotheses:

To test this claim, let’s say we take a random sample of 200 people and ask those 200 people who they are going to vote for. This will give us a sample proportion, the proportion in our sample of 200 that will vote for candidate A.

We will then calculate a test statistic (we will see the formula for this in a few pages) that will tell us how many standard errors away from .4 our sample proportion is.

If our sample proportion is really far above .4, then we will reject the null hypothesis and accept our claim, the alternative hypothesis.

If the sample proportion is not that far above .4, then we will not reject the null hypothesis.

Page 50 of 89

Steps in a Significance Test about a Population Proportion:

We will do an example of this in a few pages, for now let’s just talk through the steps.

  1. Assumptions – When performing significance tests about a population proportion, we need the following three assumptions to be true:

a) The data are categorical.

b) The data are obtained using randomization (like a random sample).

c) We need the shape of the sampling distribution of the sample proportion to be approximately normal SO we need: n * p0 ≥ 15 AND n * (1 – p0) ≥ 15 where p0 = assumed population proportion we are testing

  2. Hypotheses – We set up the hypotheses just like we have seen before.

The null hypothesis will be p = value (like p = .4)

If it is a two-tailed test, the alternative hypothesis will be p ≠ value (like p ≠ .4)

If it is a left- or right-tailed test, the alternative hypothesis will be either p < value or p > value (like p >.4)

  3. Test Statistic – The test statistic tells us how far the sample proportion we get, p̂, falls from the assumed population proportion, p0, if the null hypothesis is true.

The test statistic will tell us how many standard errors away from p0 our sample proportion, p̂, is.

To get this test statistic value, here is the formula:

\[ Z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}} \]

where p̂ = our sample proportion, p0 = assumed population proportion that we are testing, and n = sample size we used to get our sample proportion.

So again, our test statistic is how far away our sample proportion is from our assumed population proportion (p̂ − p0) in terms of the standard error (divided by \( \sqrt{p_0 (1 - p_0) / n} \)).

And as you can see, we treat this test statistic as a Z-score.

Again, we will calculate these values in examples on the next few pages.

  4. P-value – The p-value is the probability that we would get a sample proportion at least that far away from the assumed population proportion, assuming that the null hypothesis is true (in other words, assuming that the assumed population proportion is correct). Draw an example of a curve showing this:

But we have turned our sample proportion into a test statistic, we have turned it into a Z-score in Step 3 so that we can find this area using the Normal Calculator in StatCrunch.

So we now want to find the probability that we would get this extreme a Z-score under our standard normal curve. So we need to look back at our alternative hypothesis to see what type of test we are using.

If it is a right-tailed test (p > value), then the p-value is the area to the right of this Z-score under the standard normal curve. If it is a left-tailed test (p < value), then the p-value is the area to the left of this Z-score under the standard normal curve. If it is a two-tailed test (p ≠ value), then the p-value is the sum of the area to the left of the negative Z-score plus the area to the right of the positive Z-score. This sounds complicated, but it’s not that bad. We will do examples and see that we can just find the area to the right of the positive Z-score and double it because the graph is symmetrical.
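The three cases can be sketched as one small Python function. The labels "greater", "less" and "two-sided" are our own names for the right-tailed, left-tailed and two-tailed tests, and the Z-score of 1.5 is a hypothetical value used only to try the function out.

```python
from statistics import NormalDist

def p_value(z, alternative):
    """Area under the standard normal curve that matches the alternative hypothesis."""
    standard_normal = NormalDist(mu=0, sigma=1)
    if alternative == "greater":        # right-tailed: area to the right of z
        return 1 - standard_normal.cdf(z)
    if alternative == "less":           # left-tailed: area to the left of z
        return standard_normal.cdf(z)
    if alternative == "two-sided":      # two-tailed: double the area beyond |z|
        return 2 * (1 - standard_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

for alt in ("greater", "less", "two-sided"):
    print(alt, round(p_value(1.5, alt), 4))
```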

Page 53 of 89

  5. Conclusion – The p-value is going to lead us to our conclusion.

In the problems, we will always be given a significance level to compare our p-value to.

If the p-value < significance level, then the probability of us obtaining that sample proportion was really small. So we did get a sample proportion far enough away from the population proportion to reject it. So if the p-value < significance level, we reject the null hypothesis and accept our claim, the alternative hypothesis.

If the p-value > significance level, then the probability of us obtaining that sample proportion was not that small, and the sample proportion is not that far away from the population proportion. So if the p-value > significance level, we do not reject the null hypothesis. We don’t have enough evidence based on this sample to reject the null hypothesis.

The last four pages may have gone right by you, but that’s okay, these things are much easier to see in examples, so let’s take a look at one.

Page 54 of 89

Example: A magazine states that 40% of the population will vote for candidate A in the upcoming election. We claim that the proportion is higher than 40%. In a random sample of 200 people, 86 said they would vote for candidate A. Is this sufficient evidence to claim that the proportion is higher than 40% at the .05 significance level?

1. Assumptions - Is this a random sample and is the data categorical?

Is n * p0 ≥ 15 and n * (1 - p0) ≥ 15?

2. Hypotheses – Set up the null and alternative hypotheses

Before we get to step 3, what is p-hat, the sample proportion?

3. Test Statistic -

Test Statistic:

We want to see how far away our sample proportion is from the assumed population proportion in terms of the standard error.

So our sample proportion is 0.86603 standard errors above the assumed population proportion.

4. P-value -

Because HA has >, this is a right tailed test. Draw a graph with the population proportion in the middle and shade the area to the right of our sample proportion. This shaded area is the p-value.

Draw a standard normal curve and shade the area to the right of our Z-score. This shaded area is also the p-value.

We need to find that area to the right of our Z-score. We can do this in StatCrunch by putting in the Z-score and finding the area to the right of it.

So our p-value = .19324. This means that there is a .19324 or a 19.324% chance that we would get a sample proportion this far off from the population proportion, if the assumed population proportion is correct.

So if the population proportion is correct, then our sample proportion isn’t that rare. There was a 19.324% chance of us getting a sample proportion this far from the population proportion. So we probably won’t reject the null hypothesis, but we still need to compare this p-value to the level of significance that they gave us to see if it is significant.
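Steps 1, 3 and 4 of this example, plus the comparison with the significance level, can be sketched in Python like this. The variable names are illustrative; the comparison uses the rule from the Conclusion step (reject when the p-value is less than the significance level).

```python
import math
from statistics import NormalDist

x, n, p0, alpha = 86, 200, 0.40, 0.05

# Step 1: check the sample-size assumption
shape_ok = n * p0 >= 15 and n * (1 - p0) >= 15

# Step 3: test statistic (how many standard errors p_hat is from p0)
p_hat = x / n
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Step 4: right-tailed test, so the p-value is the area to the right of z
p_val = 1 - NormalDist().cdf(z)

# Step 5: compare the p-value with the significance level
reject_null = p_val < alpha

print(shape_ok, round(z, 5), round(p_val, 4), reject_null)
```

The Z-score and p-value should agree with the values worked out above, and since the p-value is larger than .05, the sketch does not reject the null hypothesis.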

Page 61 of 89

3. Test Statistic -

We want to see how far away our sample proportion is from the assumed population proportion in terms of the standard error.

So our sample proportion is 2.77128 standard errors above the assumed population proportion.

4. P-value -

Because HA has ≠, this is a two tailed test. Draw a graph with the population proportion in the middle and shade the area to the right of our sample proportion. Also, shade the same symmetrical area on the left. This shaded area of both tails is the p-value.

Draw a standard normal curve and shade the area to the right of our positive Z-score, and to the left of our negative Z-score. This shaded area is also the p-value.


We need to find the area in both tails because this is a two-tailed test. We can do this in StatCrunch by putting in the positive Z-score and finding the area to the right of it.

Our area in the left tail (area to the left of the negative Z-score) will be the same because the graph is symmetrical. So our p-value = .00279 * 2 = .00558.

This means that there is a .00558 or only a 0.558% chance that we would get a sample proportion this far off from the population proportion, if the assumed population proportion is correct.

5. Conclusion -

The P-Value ≤ the given significance level, so we will reject H0 and state that there is sufficient evidence to reject H0 and accept HA at the .01 level of significance.

Interpretation: So we are rejecting the credit card company’s statement that the population proportion of college students that carry credit card debt is 50%, and accepting our claim that the population proportion of college students that carry credit card debt is different from 50%. We got a sample proportion far enough away from 50% that we could make this claim.

This was done at the 1% level of significance. The level of significance is saying that if we wanted to reject the null hypothesis, there needed to be less than a 1% chance of us getting the sample proportion that we got, and there was. There was only a 0.558% chance of us getting a sample proportion this far from the population proportion.

Again, let’s see how StatCrunch can easily do it for us.

Go to Stat → Proportions → One Sample → with summary

In our sample, 174 of the 300 college students had credit card debt. So the number of yeses or “successes” = 174, and the number of observations = 300.

Hit Next and choose “Hypothesis Test”. The credit card company states that 50% or .5 of the college students carry credit card debt, so enter the null hypothesis that the proportion = .5. Our alternative hypothesis is our claim that the proportion is really different from 50% or .5, so choose “≠”. Hit Calculate to get the results:

Same test statistic = 2.77128 and same p-value = .00558.
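The whole test can also be reproduced outside StatCrunch. The following is a minimal Python sketch of the five-step calculation for the credit card example, using the numbers given in the notes (174 successes out of 300, null proportion .5, significance level .01). The variable names are illustrative.

```python
import math
from statistics import NormalDist

successes, n = 174, 300   # sample counts from the notes
p0 = 0.5                  # proportion stated in the null hypothesis
alpha = 0.01              # given level of significance

p_hat = successes / n                    # sample proportion
se = math.sqrt(p0 * (1 - p0) / n)        # standard error under H0
z = (p_hat - p0) / se                    # test statistic

# Two-tailed test (HA uses "≠"): double the area beyond |z|
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

reject_h0 = p_value <= alpha

print(f"p-hat = {p_hat:.2f}, z = {z:.5f}, p-value = {p_value:.5f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Changing the last few lines to use a single tail would turn this into a one-tailed test.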


Section 9.3: Significance Tests about Means

In this section, we run significance tests to test values of population means that have been given, like our car manufacturer example:

Example: A car manufacturer advertises a mean gas mileage of 26 mpg. A consumer group claims that the gas mileage is less. A sample of 33 cars is taken and the sample mean for these 33 cars is 25.2 mpg.

Before we work through examples, we need to see how the 5 steps in the test have changed. We will see the big change is the new formula for the test statistic, and the fact that we treat the test statistic as a t-value rather than a Z-score like we used in Section 9.2.

Steps in a Significance Test about a Population Mean:

  1. Assumptions – When performing significance tests about a population mean, we need the following assumptions to be true:

a) The variable is quantitative

b) The data are obtained using randomization (like a random sample)

c) The population distribution is approximately normal or we are using a sample size ≥ 30.

  2. Hypotheses – We set up the hypotheses just like we have seen before.

The null hypothesis will be μ = value (like μ = 26)

If it is a two-tailed test, the alternative hypothesis will be μ ≠ value (like μ ≠ 26)

If it is a one-tailed test, the alternative hypothesis will be either μ < value or μ > value (like μ < 26)


3. Test Statistic –

We want to see how far away our sample mean is from the assumed population mean in terms of the standard error.

So our sample mean is 1.58471 standard errors below the assumed population mean.
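The formula itself is not reproduced in this preview. As a sketch, the standard one-sample t statistic with the car example's values plugged in looks like this. The sample standard deviation s is not stated in this part of the notes, so it is left as a symbol:

```latex
t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}
  = \frac{25.2 - 26}{s/\sqrt{33}}
  \approx -1.58471, \qquad df = n - 1 = 32
```

The negative sign is what tells us the sample mean falls below the assumed population mean.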

4. P-value -

Because HA has <, this is a left tailed test. Draw a graph with the population mean in the middle and shade the area to the left of our sample mean. This shaded area is the p-value.

Draw a t curve and shade the area to the left of our t-value. This shaded area is also the p-value.


We need to find that area to the left of our t-value. We can do this in StatCrunch by putting in the t-value and finding the area to the left of it.

So our p-value = .06143. This means that there is a .06143 or 6.143% chance that we would get a sample mean this far off from the population mean, if the assumed population mean is correct.
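To see where this area comes from without StatCrunch, here is a minimal Python sketch that integrates the t density numerically. The helper functions `t_pdf` and `t_cdf` are hypothetical names written for this illustration (the standard library has no built-in t distribution). The t-value and sample size are taken from the notes.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """Area to the left of t, using Simpson's rule on [0, |t|] and symmetry."""
    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # area between 0 and |t|
    return 0.5 + area if t >= 0 else 0.5 - area

t_value = -1.58471    # test statistic reported in the notes
df = 33 - 1           # degrees of freedom = n - 1

# Left-tailed test (HA uses "<"): area to the left of the t-value
p_value = t_cdf(t_value, df)

print(f"p-value = {p_value:.5f}")
```

The result should be close to the .06143 reported above, which is larger than the .05 level of significance.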

5. Conclusion -

The P-Value > the given significance level, so we will NOT reject H0 and state that there is NOT enough evidence to reject H0 and accept HA at the .05 level of significance.

Interpretation: So we are not rejecting the car manufacturer’s statement that the mean gas mileage of all their cars is 26 mpg, and we are not accepting our claim that the mean gas mileage of all their cars is less than 26 mpg because we got a sample mean that was NOT significantly below 26 mpg.

This was done at the 5% level of significance. This means that we did not reject the null hypothesis because there was not less than a 5% chance of us getting a sample mean this far off from the population mean.

Now let’s see how StatCrunch can do it for us. Go to Stat → T Statistics → One Sample → with summary. Put in the sample mean, sample standard deviation, and sample size like this:

Hit Next and choose “Hypothesis Test”. Put in mean = 26 for the null hypothesis and it is a left-tailed test so put in “<” for the alternative hypothesis like this:

Here are the results:


Problem on Homework Chapter 9

An industrial plant claims to discharge no more than 1000 gallons of wastewater per hour, on the average, into a neighboring lake. An environmental action group decides to monitor the plant, in case this limit is being exceeded. Doing so is expensive, and only a small sample is possible. A random sample of four hours is selected over a period of a week. Test at the 0.05 significance level. Assume the distribution of wastewater is approximately normal. The observations (in gallons per hour) are: 2000, 1500, 3000, 2500

1. Assumptions - Is this a random sample and is the variable quantitative?

Is the population normally distributed?

2. Hypotheses – Set up the null and alternative hypotheses: H0: μ = 1000 and HA: μ > 1000

We aren’t given the sample mean and standard deviation, but we can put these numbers into StatCrunch (Stat → Summary Stats → Columns) and get that the sample mean = 2250 and the sample standard deviation = 645.49725.
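The summary statistics can also be checked by hand or with a few lines of code. Here is a minimal Python sketch using the four observations from the problem. The variable names are illustrative.

```python
from statistics import mean, stdev

# Wastewater observations from the problem (gallons per hour)
gallons = [2000, 1500, 3000, 2500]

x_bar = mean(gallons)   # sample mean
s = stdev(gallons)      # sample standard deviation (divides by n - 1)

print(f"sample mean = {x_bar}, sample standard deviation = {s:.3f}")
```

Note that `stdev` divides by n - 1 rather than n, which is the sample standard deviation that the t test needs.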


3. Test Statistic –

We want to see how far away our sample mean is from the assumed population mean in terms of the standard error.

So our sample mean is 3.87298 standard errors above the assumed population mean.

4. P-value -

Because HA has >, this is a right tailed test. Draw a graph with the population mean in the middle and shade the area to the right of our sample mean. This shaded area is the p-value.

Draw a t curve and shade the area to the right of our t-value. This shaded area is also the p-value.

We need to find that area to the right of our t-value. We can do this in StatCrunch by putting in the t-value and finding the area to the right of it.

So our p-value = .01523. This means that there is only a .01523 or 1.523% chance that we would get a sample mean this far off from the population mean, if the assumed population mean is correct.
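For comparison with StatCrunch, here is a minimal Python sketch of the test statistic and p-value for the wastewater problem. One shortcut is an assumption of this sketch, not something from the notes: with only four observations there are 3 degrees of freedom, and the t distribution with df = 3 has a simple closed-form area formula, so that formula is used in place of a general t table.

```python
import math
from statistics import mean, stdev

gallons = [2000, 1500, 3000, 2500]   # observations from the problem
mu0 = 1000                           # mean stated in the null hypothesis
alpha = 0.05                         # given level of significance

n = len(gallons)
x_bar = mean(gallons)
s = stdev(gallons)

se = s / math.sqrt(n)                # standard error of the sample mean
t = (x_bar - mu0) / se               # test statistic

# The closed form below is valid only for df = 3
assert n - 1 == 3
u = t / math.sqrt(3)
area_left = 0.5 + (math.atan(u) + u / (1 + u * u)) / math.pi

# Right-tailed test (HA uses ">"): area to the right of the t-value
p_value = 1 - area_left
reject_h0 = p_value <= alpha

print(f"t = {t:.5f}, p-value = {p_value:.5f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

For any other sample size the closed form no longer applies, and the area would have to come from StatCrunch or a general t distribution routine.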

5. Conclusion -

The P-Value ≤ the given significance level, so we will reject H0 and state that there is sufficient evidence to reject H0 and accept HA at the .05 level of significance.

So we are rejecting the industrial plant’s statement that the population mean of the wastewater per hour that the plant puts out is 1000 gallons, and accepting our claim that the population mean is greater than 1000 gallons.