Estimating Averages and Standard Errors in Statistics, Study notes of Statistics

Methods for estimating averages and standard errors in statistics, using examples and formulas. Topics include mean and standard deviation for averages, approximations based on the Central Limit Theorem, confidence intervals for averages, and Gauss models. The document also covers the difference between standard deviation and standard error, and provides worked problems.

Typology: Study notes

Pre 2010

Uploaded on 08/19/2009

koofers-user-6l1

Models for Averages
In this class we consider situations where we are interested in the
average value of a population. We sample at random from a box
containing tickets with different values and we want to estimate the
accuracy of the average.
We consider the following specific topics:
Mean and standard deviation for averages
Approximations based on the Central Limit Theorem
Confidence intervals for averages
Gauss models
AMS-5: Statistics
Estimating Averages
We want to estimate the accuracy of samples obtained from box
models where the population is not divided into two groups. That is,
when a sample is obtained by drawing tickets with replacement from a
box and the numbers are recorded, what can we say about the average?
Consider the box
1 2 3 4 5 6 7
Suppose 25 draws are made from the box with replacement.
The average of the box is 4, so the expected value of the sum is
25 × 4 = 100
The SD of the box is 2, so the SE for the sum is
√25 × 2 = 10
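The arithmetic for this box can be checked with a short Python sketch. The variable names are illustrative, and the SD is computed the way the notes appear to use it, as the root-mean-square deviation of the tickets (dividing by the number of tickets, not by one less).

```python
import math

box = [1, 2, 3, 4, 5, 6, 7]
n_draws = 25

# Average of the box
box_avg = sum(box) / len(box)

# SD of the box: root-mean-square deviation from the average
box_sd = math.sqrt(sum((x - box_avg) ** 2 for x in box) / len(box))

# Expected value and SE for the sum of the draws (square root law)
ev_sum = n_draws * box_avg
se_sum = math.sqrt(n_draws) * box_sd

print(box_avg, box_sd, ev_sum, se_sum)
```

The square root of the number of draws, not the number of draws itself, multiplies the SD of the box when computing the SE for the sum.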
What about the average?
The expected value for the average of draws = average of box
SE for the average of draws = (SE for sum) / (number of draws)
So, for the previous box we have that the expected value of the
average is 4 and the SE is
10 / 25 = 0.4
so the average of draws will be equal to 4, give or take 0.4.
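One way to see the "4, give or take 0.4" statement in action is a small simulation. The sketch below is only an illustration: the seed and the number of repetitions are arbitrary choices, not part of the notes.

```python
import random
import statistics

random.seed(0)  # arbitrary seed, only to make the run repeatable

box = [1, 2, 3, 4, 5, 6, 7]
n_draws = 25
n_repeats = 20_000  # arbitrary number of repetitions

# Repeat the experiment many times: draw 25 tickets with replacement
# and record the average of the draws each time.
averages = [
    statistics.fmean(random.choices(box, k=n_draws))
    for _ in range(n_repeats)
]

sim_mean = statistics.fmean(averages)  # should be close to 4
sim_sd = statistics.pstdev(averages)   # should be close to 0.4

print(round(sim_mean, 2), round(sim_sd, 2))
```

The spread of the simulated averages plays the role of the SE for the average: it describes how far a single average of 25 draws is likely to be from 4.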
Q: What if we want to calculate the probability that the average of
draws will be above 4.4?
A: We can use the normal curve as an approximation. This is
because the normal approximation is valid for the sum, and the
average is just a change of scale.
Changing to standard units we have that
(4.4 − 4.0) / 0.4 = 1
and the probability that a standard normal will be above 1 is
approximately 16%.
When the number of draws is increased by a factor of k, the SE for
the average decreases by a factor of √k.
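The standard-units calculation and the square-root effect of the number of draws can be sketched in Python. The helper names `normal_upper_tail` and `se_average` are made up for this illustration; the tail area uses the complementary error function from the standard library.

```python
import math

def normal_upper_tail(z):
    """Area under the standard normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

box_avg, box_sd = 4.0, 2.0

def se_average(n_draws):
    """SE for the average of n_draws draws from the box."""
    return box_sd / math.sqrt(n_draws)

# Probability that the average of 25 draws is above 4.4
z = (4.4 - box_avg) / se_average(25)
p = normal_upper_tail(z)

# Increasing the number of draws by a factor of k = 4 (25 -> 100)
# divides the SE for the average by sqrt(4) = 2.
ratio = se_average(25) / se_average(100)

print(round(z, 2), round(p, 3), ratio)
```

To cut the SE for the average in half, the number of draws therefore has to be multiplied by four.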
The sample average

Suppose a city manager wants to know the average income of the 25,000 families living in his town. A simple random sample of 1,000 is taken and the total income of those families turns out to be $32,396,714. So we can obtain the average income by

$32,396,714 / 1,000 ≈ $32,397

We can use this number as an approximation to the average income of the 25,000 families, but we would like to have an estimate of the chance error. We need the SE for the sample average, but this depends on the SD of the box that produced the sample, which is not available. For a large sample size, the SD of the box can be estimated using the SD of the sample.

Suppose the SD of the sample is $19,000. The SE for the sum is then estimated as

√1,000 × $19,000 ≈ $600,000

and the SE for the average can thus be estimated as

$600,000 / 1,000 = $600

and the average income in the town is

$32,400 ± $600

We can now use the normal approximation to create confidence intervals. In fact a 95% confidence interval for the average is given by

$32,400 ± 2 × $600 = $31,200 to $33,600
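The income example can be reproduced with a few lines of Python. This is a sketch using the figures quoted in the notes; the variable names are illustrative, and the multiplier 2 is the usual rounded value for a 95% interval under the normal approximation.

```python
import math

n = 1_000                  # sample size
total_income = 32_396_714  # total income of the sampled families, in dollars
sample_sd = 19_000         # SD of the sample, used in place of the unknown SD of the box

sample_avg = total_income / n

# SE for the sum, then SE for the average
se_sum = math.sqrt(n) * sample_sd
se_avg = se_sum / n

# Approximate 95% confidence interval: average plus or minus 2 SE
low = sample_avg - 2 * se_avg
high = sample_avg + 2 * se_avg

print(round(sample_avg), round(se_avg), round(low, -2), round(high, -2))
```

Without intermediate rounding the SE for the average comes out near $601 rather than $600, so the endpoints differ slightly from the rounded figures but agree to the nearest hundred dollars.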

NOTICE: This interval does NOT mean that 95% of the families in town have an income between $31,200 and $33,600. This is an interval for the average income of the families in town.

NOTICE: The normal approximation may not be valid for the sample but still be valid for the sample average. That is, using the CLT we can approximate the probability histogram of the sample averages with a normal curve, even if the probability histogram of the box is far from normal.
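The point about the Central Limit Theorem can be illustrated with a simulation. The box below is a made-up, strongly right-skewed box (it does not come from the notes), and the seed, sample size, and number of repetitions are arbitrary. If the normal approximation holds for the sample average, about 95% of the simulated averages should fall within 2 SE of the box average.

```python
import math
import random
import statistics

random.seed(1)  # arbitrary seed, only to make the run repeatable

# Hypothetical box whose histogram is far from normal (long right tail)
box = [1, 1, 1, 2, 2, 3, 5, 9, 20]
box_avg = statistics.fmean(box)
box_sd = statistics.pstdev(box)

n_draws = 100
se_avg = box_sd / math.sqrt(n_draws)

n_repeats = 10_000
inside = 0
for _ in range(n_repeats):
    avg = statistics.fmean(random.choices(box, k=n_draws))
    # Count the averages that land within 2 SE of the box average
    if abs(avg - box_avg) <= 2 * se_avg:
        inside += 1

fraction_inside = inside / n_repeats
print(fraction_inside)
```

With a smaller number of draws the fraction would be expected to drift further from 95%, since the approximation improves as the sample size grows.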

Sodium Chloride Concentration. In 36 randomly selected samples of seawater, the mean sodium chloride concentration was 23 cc/cubic meter and the SD was 6.7 cc/cubic meter. Find a 95% confidence interval for the mean sodium chloride concentration.

We can estimate the SE as

6.7 / √36 ≈ 1.12 cc/cubic meter

then, a 95% confidence interval is approximately

23 ± 2 × 1.12 = 23 ± 2.24 cc/cubic meter

What is the probability that the concentration will be above 24 cc/cubic meter? Changing to standard units

(24 − 23) / 1.12 ≈ 0.89

The probability that a standard normal will be above 0.89 is about 18%.
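The seawater calculation can be sketched in Python as follows. The variable names are illustrative. Because the code keeps the unrounded SE, its results differ slightly in the last digit from hand calculations that round the SE to 1.12 first.

```python
import math

n = 36
sample_mean = 23.0  # cc/cubic meter
sample_sd = 6.7     # cc/cubic meter

# SE for the average
se = sample_sd / math.sqrt(n)

# Approximate 95% confidence interval: mean plus or minus 2 SE
low = sample_mean - 2 * se
high = sample_mean + 2 * se

# Standard units for 24 cc/cubic meter and the normal upper-tail area
z = (24 - sample_mean) / se
p_above = 0.5 * math.erfc(z / math.sqrt(2))

print(round(se, 2), round(low, 2), round(high, 2), round(z, 2))
```

The upper-tail area computed this way falls between 18% and 19%.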

If the data show a trend or a pattern over time then a box model does not apply.

Consider the daily maximum temperatures at San Francisco Airport. These do not correspond to a box model, since during the summer we expect to see higher temperatures than during the winter, so the data will show a seasonal pattern.

Data that correspond to box models are irregularly scattered around their mean value. About the same proportion of the data will be below and above the average.

Gauss Models

A Gauss model is a model for measurement error. Each time a measurement is done, a ticket is drawn at random with replacement from the error box. The number on the ticket is the chance error. This is added to the exact value. The average error is equal to 0.

When the Gauss model is applied, the SD of a series of repeated measurements can be used to estimate the SD of the error box. The estimate is good if there are enough measurements.

We can write the Gauss model as

measurement = exact value + chance error

where the chance error has an expected value of 0.
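A small simulation may help make the Gauss model concrete. Everything numeric below is hypothetical: the exact value, the SD of the error box, and the number of measurements are invented for the illustration, and the errors are drawn from a normal curve only for convenience (the model as stated only requires an error box whose average is 0).

```python
import random
import statistics

random.seed(2)  # arbitrary seed, only to make the run repeatable

exact_value = 10.0      # hypothetical exact value being measured
error_sd = 0.05         # hypothetical SD of the error box
n_measurements = 500    # hypothetical number of repeated measurements

# measurement = exact value + chance error, with chance errors averaging 0
measurements = [
    exact_value + random.gauss(0.0, error_sd)
    for _ in range(n_measurements)
]

# The SD of the repeated measurements estimates the SD of the error box
estimated_error_sd = statistics.pstdev(measurements)

# The average of the measurements estimates the exact value,
# give or take an SE of about SD / sqrt(number of measurements)
estimated_value = statistics.fmean(measurements)
se_avg = estimated_error_sd / n_measurements ** 0.5

print(round(estimated_value, 3), round(estimated_error_sd, 3), round(se_avg, 4))
```

Adding the same exact value to every error shifts the measurements but does not change their spread, which is why the SD of the measurements can stand in for the SD of the error box.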

Problems

Problem 1: A survey organization takes a simple random sample of 625 households from a city of 80,000 households. On the average, there are 2.30 persons per sample household, and the SD is 1.75. Say whether each of the following statements is true or false and explain:

  1. The SE for the sample average is 0.07.

This is true since

SE = 1.75 / √625 = 1.75 / 25 = 0.07

  2. A 95% confidence interval for the sample average is 2.16 to 2.44.

This is false, since the sample average is a known quantity.

  3. A 95% confidence interval for average household size in the city is 2.16 to 2.44.

True, the average household size in the city is the population parameter that we are trying to estimate.

  4. 95% of the households in the city contain between 2.16 and 2.44 persons.

This is false. The confidence interval is a statement about the average household size.

  5. The 95% confidence level is about right because the household size follows the normal curve.

This is false. It is the sample average, not the household size, that is approximately normal.
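The numbers in Problem 1 can be verified with a short Python sketch. The variable names are illustrative, and the interval uses the average plus or minus 2 SE.

```python
import math

n_households = 625
sample_avg = 2.30  # persons per household in the sample
sample_sd = 1.75

# SE for the sample average
se = sample_sd / math.sqrt(n_households)

# Approximate 95% confidence interval for the average household size in the city
low = sample_avg - 2 * se
high = sample_avg + 2 * se

print(round(se, 2), round(low, 2), round(high, 2))
```

The interval describes where the city-wide average is likely to be, not the range of sizes of individual households.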

Problem 2: In a long series of trials, a computer program is found to take on average 58 seconds of CPU time to execute and the SD is 2 seconds. There is no trend or pattern in the data.

  1. How long can you expect it to take in order for the program to run 100 times?

This is like considering a box with an average of 58 seconds. We take 100 draws from the box. The expected value is

100 × 58 = 5,800 seconds

  2. What is a 95% confidence interval for the time it takes to run the program 100 times?

The SD is 2 seconds so the SE is

√100 × 2 = 20 seconds

So we have that a 95% confidence interval is given by

5,800 ± 2 × 20 = 5,800 ± 40 seconds
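Problem 2 concerns a sum rather than an average, so the SE grows with the square root of the number of runs. A minimal Python check, with illustrative variable names:

```python
import math

n_runs = 100
avg_time = 58  # seconds of CPU time per run
sd_time = 2    # seconds

# Expected value and SE for the total time of 100 runs
expected_total = n_runs * avg_time
se_total = math.sqrt(n_runs) * sd_time

# Interval for the total time: expected value plus or minus 2 SE
low = expected_total - 2 * se_total
high = expected_total + 2 * se_total

print(expected_total, se_total, low, high)
```

The interval for the total is 5,760 to 5,840 seconds.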