Statistics 571 Discussion 5: Hypothesis Testing and p-values, Exams of Data Analysis & Statistical Methods

An overview of hypothesis testing, focusing on the concepts of mean and population distribution. It covers the calculation of sample mean, test statistics, null and alternative hypotheses, p-values, and their interpretation. The document also includes examples and practice problems for testing means from normal distributions and testing proportions from binomial distributions.

Typology: Exams

Pre 2010

Uploaded on 09/02/2009



STATISTICS 571 TA: Perla Reyes DISCUSSION 5

Review

  1. The notion “MEAN”:

(a) Population mean $\mu$: the mean of the entire population, usually unknown. We use the sample mean $\bar{x}$ to estimate it.

(b) Sample mean $\bar{x}$: a fixed number. Once you have a set of observations (a sample), the number $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ is the sample mean.

(c) Random variable $\bar{X}$: Suppose you decide to take a sample of size $n$ from the population. Before the experiment, you know you will get $n$ random variables $X_1, \dots, X_n$ from the population, and their average $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ is still a random variable. Different samples may give different values of $\bar{X}$. Any particular sample mean $\bar{x}$ is a realization of the random variable $\bar{X}$.
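The distinction between the random variable $\bar{X}$ and a realized sample mean $\bar{x}$ can be seen in a small simulation. This is only a sketch: the population parameters (mean 50, standard deviation 10), the sample size, and the helper name `sample_means` are illustrative assumptions, not values from the handout.

```python
import random
from statistics import mean


def sample_means(mu, sigma, n, repeats, seed=0):
    """Draw `repeats` samples of size n from N(mu, sigma^2) and return each sample mean."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        mean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(repeats)
    ]


# Hypothetical population: mu = 50, sigma = 10; five samples of size 25.
means = sample_means(mu=50.0, sigma=10.0, n=25, repeats=5)
for i, xbar in enumerate(means, start=1):
    print(f"sample {i}: x-bar = {xbar:.2f}")
```

Each printed value is one realization $\bar{x}$; the fact that they differ from sample to sample is what it means for $\bar{X}$ to be a random variable.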

  2. Steps for a significance test:

(a) Parameter of interest. The aspect of the population that is of interest: $\mu$, $p$.

(b) Formulate the null ($H_0$) and alternative ($H_A$) hypotheses.
  i. $H_0$ is the position that we wish to support unless there is strong evidence against it: a standard, previously known, or established value. $H_0: \mu = \mu_0$.
  ii. $H_A$ is the challenging assertion or new idea that one wishes to be able to check. A two-sided alternative ($H_A: \mu \neq \mu_0$) is usually preferred unless there is a strong reason to use a one-sided one ($H_A: \mu < \mu_0$ or $H_A: \mu > \mu_0$).

(c) Find the test statistic and null distribution. The test statistic and its distribution under the null hypothesis depend on:
  i. The parameter we are testing.
  ii. The distribution of our population.
  iii. The information we have about the distribution of our population.

The following table summarizes all the options we have seen so far.

| Parameter of interest | Population distribution | Information | Test statistic | Distribution of test statistic |
|---|---|---|---|---|
| $\mu$ | $N(\mu, \sigma^2)$ | $\sigma^2$ known | $Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$ | Exactly $N(0, 1)$ |
| $\mu$ | $N(\mu, \sigma^2)$ | $\sigma^2$ unknown | $T = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$ | Exactly $T_{n-1}$ |
| $\mu$ | Unknown | $\sigma^2$ known and $n$ large | $Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$ | Approximately $N(0, 1)$ |
| $\mu$ | Unknown | $\sigma^2$ unknown and $n$ large | $T = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$ | Approximately $N(0, 1)$ |
| $p$ | $Bi(n, p)$ | $np_0 \geq 5$ and $nq_0 \geq 5$ | $Z = \frac{\sum X_i - np_0}{\sqrt{np_0 q_0}}$ | Approximately $N(0, 1)$ |
| $p$ | $Bi(n, p)$ | $np_0 < 5$ or $nq_0 < 5$ | $Y = \sum X_i$ | Exactly $Bi(n, p_0)$ |

Here $S$ is the sample standard deviation and $q_0 = 1 - p_0$.
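The test statistics in the table above can be sketched in a few lines of Python. The function names and the numeric inputs are illustrative assumptions (they are not from the handout), and only the standard library is used.

```python
from math import sqrt
from statistics import mean, stdev


def z_statistic(xbar, mu0, sigma, n):
    """Z = (x-bar - mu0) / (sigma / sqrt(n)), for known sigma."""
    return (xbar - mu0) / (sigma / sqrt(n))


def t_statistic(sample, mu0):
    """T = (x-bar - mu0) / (S / sqrt(n)), with S the sample standard deviation."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(n))


def binomial_z_statistic(successes, n, p0):
    """Z = (sum X_i - n*p0) / sqrt(n*p0*q0), the large-sample statistic for p."""
    q0 = 1 - p0
    return (successes - n * p0) / sqrt(n * p0 * q0)


def use_normal_approximation(n, p0):
    """Rule from the table: approximate only if n*p0 >= 5 and n*q0 >= 5."""
    return n * p0 >= 5 and n * (1 - p0) >= 5


# Hypothetical inputs, chosen only so the arithmetic is easy to follow.
print(z_statistic(xbar=52.0, mu0=50.0, sigma=10.0, n=25))  # (52 - 50) / (10 / 5)
print(t_statistic([4.0, 5.0, 6.0], mu0=4.0))               # (5 - 4) / (1 / sqrt(3))
print(binomial_z_statistic(successes=60, n=100, p0=0.5))   # (60 - 50) / 5
print(use_normal_approximation(n=10, p0=0.1))              # n*p0 = 1, so exact test
```

Which reference distribution to compare each statistic against ($N(0,1)$, $T_{n-1}$, or $Bi(n, p_0)$) is still read off the last column of the table.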

(d) Calculate the p-value. The p-value is the probability, if $H_0$ is true, of observing an event as extreme as or more extreme than what we observed, where $H_A$ determines what kinds of data count as "extreme". These are the possible options for two cases; all other cases have a similar construction.

email: [email protected]   Office: 248 MSC, M 2:30-3:30, R 3:30-4:30


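Testing $\mu$ from a population with Normal distribution and $\sigma^2$ known.

| $H_A$ | Observed value | p-value |
|---|---|---|
| $\mu < \mu_0$ | $z$ | $P\{Z < z\}$ |
| $\mu > \mu_0$ | $z$ | $P\{Z > z\}$ |
| $\mu \neq \mu_0$ | $z$ | $P\{Z < -\lvert z \rvert\} + P\{Z > \lvert z \rvert\}$ |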
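As a rough sketch, the three p-values for the Normal case can be computed with `statistics.NormalDist` from the Python standard library. The function name `z_test_p_value` and the labels `"less"`, `"greater"`, and `"two-sided"` are naming assumptions made for this example, and the observed value $z = -1.96$ is illustrative.

```python
from statistics import NormalDist


def z_test_p_value(z, alternative):
    """p-value for an observed z under H0, where Z ~ N(0, 1).

    alternative: "less" (mu < mu0), "greater" (mu > mu0), or "two-sided".
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "less":
        return std_normal.cdf(z)               # P{Z < z}
    if alternative == "greater":
        return 1 - std_normal.cdf(z)           # P{Z > z}
    if alternative == "two-sided":
        return 2 * std_normal.cdf(-abs(z))     # P{Z < -|z|} + P{Z > |z|}
    raise ValueError(f"unknown alternative: {alternative!r}")


# Hypothetical observed value z = -1.96.
for alt in ("less", "greater", "two-sided"):
    print(alt, round(z_test_p_value(-1.96, alt), 4))
```

The two-sided case uses the symmetry of the standard Normal distribution, so both tail probabilities equal $P\{Z < -\lvert z \rvert\}$.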

Testing $p$ from a population with Binomial distribution and $np_0 < 5$ or $nq_0 < 5$.

| $H_A$ | Observed value | p-value |
|---|---|---|
| $p < p_0$ | $y$ | $P\{Y \leq y\}$ |
| $p > p_0$ | $y$ | $P\{Y \geq y\}$ |
| $p \neq p_0$ | $y$ | $P\{Y \leq np_0 - \lvert np_0 - y \rvert\} + P\{Y \geq np_0 + \lvert np_0 - y \rvert\}$ |

(e) Interpretation of the p-value and conclusions
  i. If we have a defined $100\alpha\%$ significance level:
    A. p-value $\leq \alpha$ → reject $H_0$
    B. p-value $> \alpha$ → do not reject $H_0$
  ii. If we do not have a defined $100\alpha\%$ significance level:

| p-value | Conclusion |
|---|---|
| p-value $\geq 0.10$ | no evidence against $H_0$ |
| $0.05 \leq$ p-value $< 0.10$ | weak evidence against $H_0$ |
| $0.01 \leq$ p-value $< 0.05$ | moderate evidence against $H_0$ |
| $0.001 \leq$ p-value $< 0.01$ | strong evidence against $H_0$ |
| p-value $< 0.001$ | very strong evidence against $H_0$ |
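The exact Binomial p-values and the evidence scale above can be sketched as follows. The helper names, the small rounding tolerance, and the inputs ($n = 8$, $p_0 = 0.5$, $y = 1$) are assumptions made for illustration and do not come from the handout.

```python
from math import ceil, comb, floor


def binom_range_prob(lo, hi, n, p):
    """P{lo <= Y <= hi} for Y ~ Bi(n, p), with the range clipped to 0..n."""
    lo, hi = max(lo, 0), min(hi, n)
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(lo, hi + 1))


def exact_binomial_p_value(y, n, p0, alternative):
    """Exact p-value for an observed count y under H0: p = p0."""
    if alternative == "less":
        return binom_range_prob(0, y, n, p0)       # P{Y <= y}
    if alternative == "greater":
        return binom_range_prob(y, n, n, p0)       # P{Y >= y}
    if alternative == "two-sided":
        center = n * p0
        dist = abs(center - y)
        # Small tolerance guards against floating-point rounding at integers.
        lower = floor(center - dist + 1e-9)        # Y <= np0 - |np0 - y|
        upper = ceil(center + dist - 1e-9)         # Y >= np0 + |np0 - y|
        if lower >= upper:                         # the two tails cover everything
            return 1.0
        tails = binom_range_prob(0, lower, n, p0) + binom_range_prob(upper, n, n, p0)
        return min(1.0, tails)
    raise ValueError(f"unknown alternative: {alternative!r}")


def evidence_label(p_value):
    """Map a p-value to the verbal scale used when no alpha is fixed."""
    if p_value >= 0.10:
        return "no evidence against H0"
    if p_value >= 0.05:
        return "weak evidence against H0"
    if p_value >= 0.01:
        return "moderate evidence against H0"
    if p_value >= 0.001:
        return "strong evidence against H0"
    return "very strong evidence against H0"


# Hypothetical data: y = 1 success in n = 8 trials, testing p0 = 0.5 (n*p0 = 4 < 5).
p = exact_binomial_p_value(y=1, n=8, p0=0.5, alternative="two-sided")
print(p, evidence_label(p))
```

With these inputs the two tails are $P\{Y \leq 1\}$ and $P\{Y \geq 7\}$, each equal to $9/256$, so the p-value is $18/256$.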

Practice Problem

  1. Consider taking a random sample from a $N(\mu, \sigma^2)$ distribution with $\sigma = 6$. Consider testing the hypothesis $H_0: \mu = 35$ versus $H_A: \mu \neq 35$. Suppose that a random sample of size 9 is taken and that $\bar{x} = 32$. What is the p-value for your hypothesis test? Are the results significant at 5%?
  2. A particular strain of bacteria is used for nitrogen fixation on a certain variety of alfalfa. The nitrogen fixation is known to follow (approximately) a normal distribution. A scientist claims the mean amount of nitrogen fixed in a plant is 26.7 mg. Data are available on a random sample of 12 plants:

23.9, 26.2, 27.9, 22.2, 24.4, 25.8, 25.6, 28.1, 26.6, 26.0, 24.9, 23.

State symbolically the null and alternative hypotheses. Find the p-value for the test of the claim. Are the results significant at α = 5%? At α = 1%?

  3. A five-year-old census recorded that 20% of the families in a large community lived below the poverty level. To determine if this percentage has changed, a random sample of 400 families is studied and 70 are found to be living below the poverty level. Does this finding indicate that the current percentage of families earning incomes below the poverty level has changed from what it was five years ago?
  4. My cat Felix likes to hunt mice. I claim that each day his probability of hunting success (catching some mice) is 0.6. Assume that his hunting success is independent from day to day. I observe him carefully for 10 days and find that he has hunting success on 9 days. Perform a test of my claim.
