Estimating Parameters with Confidence Intervals in Math Course - Prof. Glenn Tesler, Study notes of Mathematics

A lecture note from a mathematics 186 class taught by prof. Tesler on february 25, 2009. The notes cover the topics of confidence intervals and sample variance, with a focus on estimating parameters of the normal distribution and binomial distribution from data. Examples, formulas, and explanations of concepts such as z-scores, confidence intervals, and sample variance.

Typology: Study notes

Pre 2010

Uploaded on 03/28/2010

koofers-user-d5n
koofers-user-d5n 🇺🇸

10 documents

1 / 31

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
Estimating parameters
5.3 Confidence Intervals
5.4 Sample Variance
Prof. Tesler
Math 186
February 25, 2009
Prof. Tesler () Ch. 5: Confidence Intervals & Sample Variance Math 186 / February 25, 2009 1 / 31
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17
pf18
pf19
pf1a
pf1b
pf1c
pf1d
pf1e
pf1f

Partial preview of the text

Download Estimating Parameters with Confidence Intervals in Math Course - Prof. Glenn Tesler and more Study notes Mathematics in PDF only on Docsity!

Estimating parameters

5.3 Confidence Intervals

5.4 Sample Variance

Prof. Tesler

Math 186 February 25, 2009

Estimating parameters of the normal distribution (μ, σ)

or the binomial distribution (p) from data

We will assume throughout that the SAT math test was designed to have a normal distribution.

Secretly, μ = 500 and σ = 100 , but we don’t know those are the values so we want to estimate them from data.

Chapter 5.3: Pretend we know σ but not μ and we want to estimate μ from experimental data. Chapter 5.4: Estimate both μ and σ from experimental data.

Sample data

  • 1 720 490 660 520 390 390 528. Trial # x 1 x 2 x 3 x 4 x 5 x 6 m = ¯x
  • 2 380 260 390 630 540 440 440.
  • 3 800 450 580 520 650 390 565.
  • 4 510 370 530 290 460 540 450.
  • 5 580 500 540 540 340 340 473.
  • 6 500 490 480 550 390 450 476.
  • 7 530 680 540 510 520 590 561.
  • 8 480 600 520 600 520 390 518.
  • 9 340 520 500 650 400 530 490.
  • 10 460 450 500 360 600 440 468.
  • 11 540 520 360 500 520 640 513.
  • 12 440 420 610 530 490 570 510.
  • 13 520 570 430 320 650 540 505.
  • 14 560 380 440 610 680 460 521.
  • 15 460 590 350 470 420 740 505.
  • 16 430 490 370 350 360 470 411.
  • 17 570 610 460 410 550 510 518.
  • 18 380 540 570 400 360 500 458.
  • 19 410 730 480 600 270 320 468.
  • 20 490 390 450 610 320 440 450. - Average 491.

Sample mean notation

Variable name

Common notations are ¯x or ¯y, etc. (bar over the variable name means to average it). Another notation is m (sample mean). Latin letters used for sample mean m = ¯x, sample standard deviation s, sample variance s^2. Greek letters used for true mean μ and true variance σ^2.

Lowercase/Uppercase

Lowercase: Given specific numbers x 1 ,... , xn, the sample mean evaluates to a number as well. Uppercase: We will study performing this computation repeatedly with different data, treating the data X 1 ,... , Xn as random variables. This makes the sample mean a random variable. m = ¯x =

x 1 + · · · + xn n

M = X =

X 1 + · · · + Xn n

Confidence intervals

We will rearrange this equation to isolate μ: P(− 1. 96 6 Z 6 1. 96 ) = P(− 1. 96 6

M − μ σ/

n

Interpretation: in ≈ 95 % of the trials of this experiment, the value M = m satisfies − 1. (^96 6) σ/m−√μn 6 1. 96

Solve for bounds on μ from the upper limit on Z: m−μ σ/√n 6 1.^96 ⇔^ m^ −^ μ^6 1.^96 √σ n ⇔^ m^ −^1.^96 √σ n 6 μ Notice the 1.96 turned into − 1. 96 and we get a lower limit on μ.

Also solve for an upper bound on μ from the lower limit on Z: − 1. (^96 6) σ/m−√μn ⇔ − 1. 96 √σn 6 m − μ ⇔ μ 6 m + 1. 96 √σn

Together, (^) m − 1. 96 √σn 6 μ 6 m + 1. 96 √σn

Confidence intervals

In ≈ 95 % of the trials of this experiment, the value M = m satisfies m − 1. 96

σ √ n

6 μ 6 m + 1. 96

σ √ n So, 95 % of the time we perform this experiment, the true mean μ is in the interval (^) ( m − 1. 96

σ √ n

, m + 1. 96

σ √ n

which is called a (two-sided) 95% confidence interval.

For a 100 ( 1 − α)% C.I., use ±zα/ 2 instead of ± 1. 96. Other commonly used percentages: For a 99 % confidence interval, use ± 2. 58 instead of ± 1. 96. For a 90 % confidence interval, use ± 1. 64 instead of ± 1. 96. For demo purposes: For a 75 % confidence interval, use ± 1. 15 instead of ± 1. 96.

Confidence intervals

σ = 100 known, μ = 500 unknown, n = 6 points per trial, 20 trials

Confidence intervals not containing point μ = 500 are marked (393.05,486.95). Trial # x 1 x 2 x 3 x 4 x 5 x 6 m = ¯x 75% conf. int. 95% conf. int. 1 720 490 660 520 390 390 528.33 (481.38,575.28) (448.32,608.35) 2 380 260 390 630 540 440 440.00 (393.05,486.95) (359.98,520.02) 3 800 450 580 520 650 390 565.00 (518.05,611.95) (484.98,645.02) 4 510 370 530 290 460 540 450.00 (403.05,496.95) (369.98,530.02) 5 580 500 540 540 340 340 473.33 (426.38,520.28) (393.32,553.35) 6 500 490 480 550 390 450 476.67 (429.72,523.62) (396.65,556.68) 7 530 680 540 510 520 590 561.67 (514.72,608.62) (481.65,641.68) 8 480 600 520 600 520 390 518.33 (471.38,565.28) (438.32,598.35) 9 340 520 500 650 400 530 490.00 (443.05,536.95) (409.98,570.02) 10 460 450 500 360 600 440 468.33 (421.38,515.28) (388.32,548.35) 11 540 520 360 500 520 640 513.33 (466.38,560.28) (433.32,593.35) 12 440 420 610 530 490 570 510.00 (463.05,556.95) (429.98,590.02) 13 520 570 430 320 650 540 505.00 (458.05,551.95) (424.98,585.02) 14 560 380 440 610 680 460 521.67 (474.72,568.62) (441.65,601.68) 15 460 590 350 470 420 740 505.00 (458.05,551.95) (424.98,585.02) 16 430 490 370 350 360 470 411.67 (364.72,458.62) (331.65,491.68) 17 570 610 460 410 550 510 518.33 (471.38,565.28) (438.32,598.35) 18 380 540 570 400 360 500 458.33 (411.38,505.28) (378.32,538.35) 19 410 730 480 600 270 320 468.33 (421.38,515.28) (388.32,548.35) 20 490 390 450 610 320 440 450.00 (403.05,496.95) (369.98,530.02)

Confidence intervals

σ = 100 known, μ = 500 unknown, n = 6 points per trial, 20 trials

In the 75% confidence interval column, 14 out of 20 (70%) intervals contain the mean (μ = 500 ). This is close to 75%.

In the 95% confidence interval column, 19 out of 20 (95%) intervals contain the mean (μ = 500 ). This is exactly 95% (though if you do it 20 more times, it wouldn’t necessarily be exactly 19 the next time).

A k% confidence interval means if we repeat the experiment a lot of times, approximately k% of the intervals will contain μ. It is not a guarantee that exactly k% will contain it.

Note: If you really don’t know the true value of μ, you can’t actually mark the intervals that do or don’t contain it.

One-sided confidence intervals

In a two-sided 95% confidence interval, we excluded the highest and lowest 2. 5 % of values and keep the middle 95%. One-sided removes the whole 5% from one side.

One-sided to the right: remove the highest (right) 5 % values of Z P(Z 6 z. 05 ) = P(Z 6 1. 64 ) =. 95

95% of experiments have

m − μ σ/

n

6 1. 64 so μ > m − 1. 64

σ √ n So the one-sided (right) 95% CI for μ is (m − 1. 64 √σn , ∞)

One-sided to the left: remove lowest (left) 5 % of values of Z P(−z. 05 6 Z) = P(− 1. 64 6 Z) =. 95 The one-sided (left) 95% CI for μ is (−∞, m + 1. 64 √σn )

Confidence intervals for p in the binomial distribution

An election has two candidates, A and B.

There are no other candidates and no write-ins.

Let p be the fraction of votes cast for A when the election is held and 1 − p be the fraction of votes cast for B.

A poll is taken in advance to estimate p. A single point estimate is called ˆp, and we also want a 95 % confidence interval for it.

Poll assumptions

The pollster only polls from a random sample of the N people who actually do vote, and gets n responses. The sample of people is representative. The respondents tell the truth and don’t change their minds.

Hypergeometric distribution

Notation

Actual election Sample polled before election N voters n people polled NA voting for A k voting for A NB voting for B n − k voting for B p = NA/N ˆp = k/n

Hypergeometric distribution (for sampling w/o replacement)

If we do sampling without replacement (i.e., once a person is polled, they cannot be polled again in the same poll), it is really the hypergeometric distribution (chapter 3.2): P(X = k) =

(NA

k

)( NB

n−k

) / (N

n

The expected number in the poll voting for A is still np with p = NA/N. The variance is np(^1 (−Np−)( 1 N) −n). If n  N, this pdf approximately equals the binomial pdf for n, p.

Estimating p for a poll with binomial distribution

Assuming n  N, we use the binomial distribution: The probability k out of n respondents say they’ll vote for A is P(X = k) =

n k

pk( 1 − p)n−k

The fraction of people who say they’ll vote for A is P̂ = X = X/n, with E(X) = p and Var(X) = p( 1 − p)/n. The ̂ (caret) notation indicates it’s a point estimate. We already use P for too many things, so we’ll use the X notation. Since X is a random variable, E(X) = p means that if you conduct polls of all combinations of n voters, the average is p. This is a theoretical statement, and ignores the fact that people would not cooperate with being polled multiple times.

Interval estimate of p using margin of error

Polls often report a margin of error instead of a confidence interval.

The half-width of the 95% confidence interval is 1. 96

p( 1 − p)/n, and before we estimated p by the point estimate ˆp.

The margin of error is the maximum that this half-width could be over all possible values of p ( 0 6 p 6 1 ); this is at p = 1 / 2 , giving margin of error 1. 96

( 1 / 2 )( 1 / 2 )/n = 1. 96 /( 2

n).

Interval estimate of p using margin of error

The margin of error is the maximum that this half-width could be over all possible values of p ( 0 6 p 6 1 ); this is at p = 1 / 2 , giving margin of error 1. 96

( 1 / 2 )( 1 / 2 )/n = 1. 96 /( 2

n).

With 1000 people, the margin of error is 1. 96 /( 2

or about 3 %. With 700 A’s, report ˆp =. 70 ±. 03.

A 3 % margin of error means that if a large number of polls are conducted, each on 1000 people, then at least 95% of the polls will give values of ˆp such that the true p is between ˆp ± 0. 03.

The reason it is “at least 95 %” is that 1. 96

p( 1 − p)/n 6 0. 03 and only = 0. 03 when p = 1 / 2 exactly. If the true p is not equal to 1 / 2 , then 0. 03 /

p( 1 − p)/n > 1. 96 so it would be a higher percent confidence interval than 95%.