



Methods for estimating averages and standard errors in statistics, using examples and formulas. Topics include mean and standard deviation for averages, approximations based on the central limit theorem, confidence intervals for averages, and gauss models. The document also covers the difference between standard deviation and standard error, and provides examples of problems to solve.
Typology: Study notes




Models for Averages
In this class we consider situations where we are interested in the average value of a population. We sample at random from a box containing tickets with different values and we want to estimate the accuracy of the average. We consider the following specific topics:
Mean and standard deviation for averages
Approximations based on the Central Limit Theorem
Confidence intervals for averages
Gauss models
Estimating Averages
We want to estimate the accuracy of samples obtained from box models where the population is not divided in two groups; that is, a sample is obtained by drawing tickets with replacement from a box and the numbers are recorded. What can we say about the average?
Consider a box of numbered tickets, and suppose 25 draws are made from the box with replacement.
The average of the box is 4, so the expected value of the sum is \( 25 \times 4 = 100 \).
The SD of the box is 2, so the SE for the sum is \( \sqrt{25} \times 2 = 10 \).
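The expected value and SE for the sum can be checked with a short Python sketch. The contents of the box are not given in these notes, so the box `[1, 2, 3, 4, 5, 6, 7]` below is a hypothetical stand-in, chosen only because its average is 4 and its SD is 2; the variable names are illustrative as well.

```python
import math
import statistics

# Hypothetical box: NOT taken from the notes. It is used only because
# its average is 4 and its SD is 2, matching the figures in the text.
box = [1, 2, 3, 4, 5, 6, 7]
n_draws = 25

box_avg = statistics.mean(box)    # average of the box
box_sd = statistics.pstdev(box)   # SD of the box (divides by N, not N - 1)

ev_sum = n_draws * box_avg             # expected value of the sum
se_sum = math.sqrt(n_draws) * box_sd   # SE for the sum

print(f"box average = {box_avg}, box SD = {box_sd}")
print(f"expected sum = {ev_sum}, SE for sum = {se_sum}")
```

Note that `statistics.pstdev` is used rather than `statistics.stdev`, since the SD of a box treats the tickets as the whole population.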
AMS-5: Statistics
What about the average? The expected value for the average of draws = average of box
\[ \text{SE for the average of draws} = \frac{\text{SE for sum}}{\text{number of draws}} \]
So, for the previous box we have that the expected value of the average is 4 and the SE is \( 10/25 = 0.4 \), so the average of draws will be equal to 4 give or take 0.4.
Q: What if we want to calculate the probability that the average of draws will be above 4.4?
A: We can use the normal curve as an approximation. This is because the normal approximation is valid for the sum, and the average is just a change of scale.
Changing to standard units we have that
\[ \frac{4.4 - 4}{0.4} = 1 \]
and the probability that a standard normal will be above 1 is approximately 16%.
When the number of draws is increased by a factor of \( k \), the SE for the average decreases by a factor of \( \sqrt{k} \).
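As a rough check of the normal approximation and of the \( \sqrt{k} \) rule, the following Python sketch uses the box average of 4 and SD of 2 quoted in the text. The helper `se_for_average` and the choice `k = 4` are illustrative assumptions, not part of the original notes.

```python
import math
from statistics import NormalDist

box_avg, box_sd = 4, 2   # figures quoted in the text


def se_for_average(sd, n_draws):
    """SE for the average = SE for sum / number of draws."""
    se_sum = math.sqrt(n_draws) * sd
    return se_sum / n_draws


se_avg = se_for_average(box_sd, 25)
z = (4.4 - box_avg) / se_avg          # 4.4 expressed in standard units
p_above = 1 - NormalDist().cdf(z)     # area under the normal curve above z

# Multiplying the number of draws by k divides the SE by sqrt(k).
k = 4  # illustrative choice
ratio = se_for_average(box_sd, 25) / se_for_average(box_sd, 25 * k)

print(f"SE for average = {se_avg:.2f}, z = {z:.2f}, P(average > 4.4) = {p_above:.3f}")
print(f"SE shrinks by a factor of {ratio:.2f} when draws increase by a factor of {k}")
```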
The sample average
Suppose a city manager wants to know the average income of the 25,000 families living in his town. A simple random sample of 1,000 is taken and the total income of those families turns out to be $32,396,714. So we can obtain the average income by
\[ \frac{\$32{,}396{,}714}{1{,}000} \approx \$32{,}397 \]
We can use this number as an approximation to the average income of the 25,000 families, but we would like to have an estimate of the chance error. We need the SE of the sample, but this depends on the SD of the box that produced the sample, which is not available. For a large sample size, the SD of the box can be estimated using the SD of the sample.
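Suppose the SD of the sample is $19,000. The SE for the sum is then approximately equal to
\[ \sqrt{1{,}000} \times \$19{,}000 \approx \$600{,}000 \]
and the SE for the average can thus be estimated as
\[ \frac{\$600{,}000}{1{,}000} = \$600 \]
and the average income in the town is estimated as about $32,400, give or take $600 or so.
We can now use the normal approximation to create confidence intervals. In fact a 95% confidence interval for the average is given by
\[ \$32{,}400 \pm 2 \times \$600, \quad \text{that is, } \$31{,}200 \text{ to } \$33{,}600 \]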
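The income calculation can be reproduced with a few lines of Python. This is only a sketch: the variable names are illustrative, the interval uses 2 SEs as in the notes, and the unrounded results differ slightly from the rounded figures quoted in the text.

```python
import math

sample_size = 1_000
sample_total = 32_396_714   # total income of the sampled families, in dollars
sample_sd = 19_000          # SD of the sample, standing in for the SD of the box

sample_avg = sample_total / sample_size
se_sum = math.sqrt(sample_size) * sample_sd   # SE for the sum
se_avg = se_sum / sample_size                 # SE for the average

# Approximate 95% confidence interval: average plus or minus 2 SEs.
low = sample_avg - 2 * se_avg
high = sample_avg + 2 * se_avg

print(f"sample average = {sample_avg:,.0f}")
print(f"SE for sum = {se_sum:,.0f}, SE for average = {se_avg:,.0f}")
print(f"approximate 95% interval: {low:,.0f} to {high:,.0f}")
```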
NOTICE: The former does not mean that 95% of the families in town have an income between $31,200 and $33,600. This is an interval for the average income of the families in town.
The normal approximation may not be valid for the sample but still be valid for the sample average. That is, using the CLT we can approximate the probability histogram of the sample averages with a normal, even if the probability histogram of the box is far from normal.
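A small simulation can illustrate this point. The sketch below draws repeatedly from a deliberately skewed box and counts how often the sample average lands within 2 SEs of the box average; if the normal approximation is reasonable, that share should be close to 95%. The box `[1, 1, 1, 1, 10]`, the sample size, the number of repetitions and the seed are all assumptions made for illustration.

```python
import math
import random
import statistics

# Hypothetical, strongly skewed box (not from the notes).
box = [1, 1, 1, 1, 10]
n_draws = 100       # draws per sample
n_repeats = 2_000   # number of simulated samples

box_avg = statistics.mean(box)
se_avg = statistics.pstdev(box) / math.sqrt(n_draws)   # SE for the average

rng = random.Random(0)   # fixed seed so that reruns use the same draws
hits = 0
for _ in range(n_repeats):
    draws = rng.choices(box, k=n_draws)   # draws with replacement
    avg = sum(draws) / n_draws
    if abs(avg - box_avg) <= 2 * se_avg:
        hits += 1

coverage = hits / n_repeats
print(f"share of sample averages within 2 SEs of the box average: {coverage:.3f}")
```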
Sodium Chloride Concentration. In 36 randomly selected samples of seawater, the mean sodium chloride concentration was 23 cc/cubic meter and the SD was 6.7 cc/cubic meter. Find a 95% confidence interval for the mean sodium chloride concentration.
We can estimate the SE as
\[ \frac{6.7}{\sqrt{36}} \approx 1.12 \text{ cc/cubic meter} \]
then, a 95% confidence interval is approximately
\[ 23 \pm 2 \times 1.12 = 23 \pm 2.24 \text{ cc/cubic meter} \]
What is the probability that the concentration will be above 24 cc/cubic meter? Changing to standard units
\[ \frac{24 - 23}{1.12} \approx 0.89 \]
The probability that a standard normal will be above 0.89 is about 18%.
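The seawater example can be sketched in Python as follows. The variable names are illustrative, and because the code does not round the SE or the standard units, its tail probability (roughly 18.5%) differs slightly from the rounded, table-based figure in the notes.

```python
import math
from statistics import NormalDist

n = 36              # number of seawater samples
sample_mean = 23.0  # cc/cubic meter
sample_sd = 6.7     # cc/cubic meter

se = sample_sd / math.sqrt(n)   # SE for the average

# Approximate 95% confidence interval: mean plus or minus 2 SEs.
low = sample_mean - 2 * se
high = sample_mean + 2 * se

z = (24 - sample_mean) / se          # 24 cc/cubic meter in standard units
p_above = 1 - NormalDist().cdf(z)    # area under the normal curve above z

print(f"SE = {se:.2f}, interval = {low:.2f} to {high:.2f}")
print(f"z = {z:.2f}, P(above 24) = {p_above:.3f}")
```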
If the data show a trend or a pattern over time then a box model does not apply.
Consider the daily maximum temperatures at San Francisco Airport. These do not correspond to a box model, since during the summer we expect to see higher temperatures than during the winter, so the data will show a seasonal pattern.
Data that correspond to box models are irregularly scattered around their mean value. About the same proportion of the data will be below and above the average.
Gauss Models
The Gauss Model is a model for measurement error. Each time a measurement is done a ticket is drawn at random with replacement from the error box. The number on the ticket is the chance error. This is added to the exact value. The average error is equal to 0.
When the Gauss model is applied the SD of a series of repeated measurements can be used to estimate the SD of the error box. The estimate is good if there are enough measurements.
We can write the Gauss model as
measurement = exact value + chance error
where the chance error has an expected value of 0.
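To make the Gauss model concrete, here is a minimal Python sketch that treats a handful of repeated measurements as exact value plus chance error. The measurement values are made up for illustration and do not come from the notes; with so few of them, the SD estimate would be rough in practice.

```python
import math
import statistics

# Hypothetical repeated measurements of the same quantity (made-up numbers).
measurements = [10.02, 9.98, 10.01, 9.97, 10.03, 9.99]

# The average of the measurements estimates the exact value.
estimate = statistics.mean(measurements)

# The SD of the measurements estimates the SD of the error box.
error_sd = statistics.pstdev(measurements)

# SE for the average of the measurements.
se_avg = error_sd / math.sqrt(len(measurements))

print(f"estimated exact value = {estimate:.3f}")
print(f"estimated SD of the error box = {error_sd:.4f}")
print(f"SE for the average = {se_avg:.4f}")
```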
Problems
Problem 1: A survey organization takes a simple random sample of 625 households from a city of 80,000 households. On the average, there are 2.30 persons per sample household, and the SD is 1.75. Say whether each of the following statements is true or false and explain:

...
This is true since the SE for the sample average is estimated as \( 1.75/\sqrt{625} = 0.07 \).

... 2.44.
This is false, since the sample average is a known quantity.

... is 2.16 to 2.44.
True, the average household size in the city is the population parameter that we are trying to estimate.

... 2.44 persons.
This is false. The confidence interval is a statement about the average household size.

... size follows the normal curve.
This is false. It is the sample average that is approximately normal.
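The numbers behind Problem 1 can be checked with a short Python sketch. It assumes, as elsewhere in these notes, that the SD of the sample stands in for the SD of the box and that a 95% interval spans 2 SEs; the variable names are illustrative.

```python
import math

n_households = 625
sample_avg = 2.30   # persons per sample household
sample_sd = 1.75

se_avg = sample_sd / math.sqrt(n_households)   # SE for the sample average

# Approximate 95% confidence interval for the city-wide average household size.
low = sample_avg - 2 * se_avg
high = sample_avg + 2 * se_avg

print(f"SE = {se_avg:.2f}, interval = {low:.2f} to {high:.2f}")
```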
Problem 2: In a long series of trials, a computer program is found to take on average 58 seconds of CPU time to execute and the SD is 2 seconds. There is no trend or pattern in the data.

... run 100 times?
This is like considering a box with an average of 58 seconds. We take 100 draws from the box. The expected value of the total is
\[ 100 \times 58 = 5{,}800 \text{ seconds} \]

... the program 100 times?
The SD is 2 seconds so the SE is
\[ \sqrt{100} \times 2 = 20 \text{ seconds} \]
So we have that a 95% confidence interval is given by
\[ 5{,}800 \pm 2 \times 20, \quad \text{that is, } 5{,}760 \text{ to } 5{,}840 \text{ seconds} \]
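A sketch of the Problem 2 arithmetic in Python follows. It assumes the questions concern the total CPU time of 100 runs, which is an interpretation of the partly preserved problem text rather than something the notes state outright; the variable names are illustrative.

```python
import math

n_runs = 100
avg_time = 58   # average CPU time per run, in seconds
sd_time = 2     # SD of the CPU time per run, in seconds

# Assumption: the quantity of interest is the total time of 100 runs,
# that is, the sum of 100 draws from the box.
ev_total = n_runs * avg_time              # expected value of the sum
se_total = math.sqrt(n_runs) * sd_time    # SE for the sum

low = ev_total - 2 * se_total
high = ev_total + 2 * se_total

print(f"expected total = {ev_total} s, SE = {se_total} s")
print(f"approximate 95% interval: {low} to {high} s")
```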