










Material Type: Notes; Class: Statistical Methods I >5; Subject: Mathematics; University: University of Oregon; Term: Winter 2005;











Lecture Notes for Math 425/
Qi-Man Shao Department of Mathematics University of Oregon
© 2005 by Qi-Man Shao. All rights reserved.
Statistics: the science of collecting, organizing and interpreting data
Example 1.2 The distribution of marital status for all Americans aged 18 and over
Marital status    Count (millions)    Percent
Never married     43.9                22.
Married           116.7               60.
Widowed           13.4                7.
Divorced          17.6                9.
Graphs for quantitative variables
Distribution: the pattern of variation of a variable. The distribution records the numerical values of the variable and how often each value occurs. The distribution of a variable is best displayed graphically. Below are 3 commonly used graphical tools.
Stemplots:
Step 1. Separate each observation into a stem and a leaf
stem: all but the final digit
leaf: the final digit
Step 2. Write the stems in a vertical column with the smallest at the top, and draw a vertical line to the right of the stems
Step 3. Add the leaves to the right of the line
Example 1.3 Here are the numbers of home runs that Babe Ruth hit in each of his 15 years with the New York Yankees, 1920 to 1934:
54 59 35 41 46 25 47 60 54 46 49 46 41 34 22
Make a stemplot.
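The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original notes: the function name `stemplot` is made up here, and it assumes non-negative whole-number observations so that the final digit can serve as the leaf.

```python
def stemplot(values):
    """Return the rows of a stemplot (stem = all but the final digit, leaf = final digit)."""
    leaves = {}
    # Step 1: split each observation into a stem and a leaf.
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        leaves.setdefault(stem, []).append(leaf)
    rows = []
    # Step 2: list the stems from smallest to largest, keeping stems with no leaves.
    for stem in range(min(leaves), max(leaves) + 1):
        # Step 3: write the leaves to the right of the vertical line.
        row_leaves = "".join(str(d) for d in leaves.get(stem, []))
        rows.append(f"{stem} | {row_leaves}")
    return rows


ruth = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]
ruth_rows = stemplot(ruth)
print("\n".join(ruth_rows))
# 2 | 25
# 3 | 45
# 4 | 1166679
# 5 | 449
# 6 | 0
```

Sorting the data first puts the leaves on each stem in increasing order, which makes the shape of the distribution easier to read.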
Example 1.4 Does increasing the amount of calcium in our diet reduce blood pressure? A randomized comparative experiment gave one group of 10 black men a calcium supplement for 12 weeks. The control group of 11 black men received a placebo (a dummy pill) that appeared identical. The blood pressure distributions in the two groups should have been similar at the beginning of the experiment. Here are the initial blood pressure readings for the two groups:
Calcium group
107 110 123 129 112 111 107 112 136 102
Placebo group
123 109 112 102 98 114 119 112 110 117 130
Make a back-to-back stemplot of these data.
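A back-to-back stemplot shares one column of stems between the two groups. The following is a minimal sketch under the same assumptions as an ordinary stemplot (whole-number data, final digit as leaf); the helper names `back_to_back_rows` and `format_rows` are hypothetical and chosen only for this illustration.

```python
def back_to_back_rows(left, right):
    """Return (left_leaves, stem, right_leaves) for every stem covered by the two groups."""
    combined = left + right
    rows = []
    for stem in range(min(combined) // 10, max(combined) // 10 + 1):
        # Left-hand leaves are written in decreasing order so they grow outward from the stem.
        left_leaves = "".join(str(v % 10) for v in sorted(left, reverse=True) if v // 10 == stem)
        right_leaves = "".join(str(v % 10) for v in sorted(right) if v // 10 == stem)
        rows.append((left_leaves, stem, right_leaves))
    return rows


def format_rows(rows):
    """Right-align the left-hand leaves so the shared stems line up in one column."""
    width = max(len(left_leaves) for left_leaves, _, _ in rows)
    return [f"{l:>{width}} | {stem:>2} | {r}" for l, stem, r in rows]


calcium = [107, 110, 123, 129, 112, 111, 107, 112, 136, 102]
placebo = [123, 109, 112, 102, 98, 114, 119, 112, 110, 117, 130]
bp_rows = back_to_back_rows(calcium, placebo)
print("\n".join(format_rows(bp_rows)))  # calcium leaves on the left, placebo on the right
```

Both groups have most of their readings on the 10 and 11 stems, which is the kind of similarity the example asks the reader to look for.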
Example 1.5 Babe Ruth’s home run record for a single year was broken by another Yankee, Roger Maris, who hit 61 home runs in 1961. Here are Maris’s home run totals for his 10 years in the American League:
13 23 26 16 33 61 28 39 14 8
Is Maris’s record 61 an outlier in his distribution of home runs hit? Who is the better home run hitter, Maris or Ruth?
Examining a distribution:
Time plots: plot each observation against the time at which it was measured
1.2 Describing Distributions with Numbers
Measuring center:
$$\bar{x} = \frac{1}{n}(x_1 + x_2 + \cdots + x_n) \quad \text{or} \quad \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{n}\sum x_i$$
Example 1.7 Find Ruth’s mean number and median number of home runs hit in a year:
54 59 35 41 46 25 47 60 54 46 49 46 41 34 22
Find Maris’s mean number and median number of home runs hit in a year:
13 23 26 16 33 61 28 39 14 8
Solution:
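One way to check the arithmetic for Example 1.7 is with Python's standard `statistics` module. This is a numerical check added for illustration, not the worked solution from the lecture; the variable names are chosen here.

```python
from statistics import mean, median

ruth = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]
maris = [13, 23, 26, 16, 33, 61, 28, 39, 14, 8]

# mean: the sum of the observations divided by n
# median: the middle value of the sorted data (average of the two middle values when n is even)
ruth_mean, ruth_median = mean(ruth), median(ruth)
maris_mean, maris_median = mean(maris), median(maris)

print(f"Ruth:  mean = {ruth_mean:.1f}, median = {ruth_median}")    # mean = 43.9, median = 46
print(f"Maris: mean = {maris_mean:.1f}, median = {maris_median}")  # mean = 26.1, median = 24.5
```

Maris's mean sits above his median because the single large value 61 pulls the mean upward, while the median is unaffected by how extreme that value is.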
Mean or Median
Example 1.8 Some people worry about how many calories they consume. Consumer Reports magazine, in a story on hot dogs, measured the calories in 20 brands of beef hot dogs, 17 brands of meat hot dogs, and 17 brands of poultry hot dogs. Here are the computer outputs:
Hot dogs   Min   Q1    M       Q3      Max
Beef       111   140   152.5   178.5   190
Meat       107   139   153     179     195
Poultry    87    102   129     143     170
Make side-by-side boxplots of the calorie counts for the three types of hot dogs.
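Measuring spread: the standard deviation (s.d.)
The most common measure of the spread about the mean
$$s^2 = \frac{1}{n-1}\left[(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2\right]$$
or
$$s^2 = \frac{1}{n-1}\sum (x_i - \bar{x})^2$$
$$s = \sqrt{\frac{1}{n-1}\sum (x_i - \bar{x})^2}$$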
Properties of the standard deviation:
Example 1.
Data Set I: 12 25 38 8 42
Data Set II: 19 32 45 15 49
Note that each value of the second data set is obtained by adding 7 to the corresponding value of the first data set. Calculate the mean and the standard deviation for each of these two data sets. Comment on the relationship between the two means and the two standard deviations.
Solution:
Example 1.
Data Set I: 2 8 15 9 11
Data Set II: 4 16 30 18 22
Note that each value of the second data set is obtained by multiplying the corresponding value of the first data set by 2. Calculate the mean and the standard deviation for each of these two data sets. Comment on the relationship between the two means and the two standard deviations.
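The two examples above (adding 7, then multiplying by 2) can be checked numerically. The sketch below is an added illustration rather than the lecture's own solution; it relies on `statistics.stdev`, which divides by n − 1 like the formula for s, and the variable names are invented here.

```python
from statistics import mean, stdev  # stdev uses the n - 1 divisor, matching s

# Adding a constant: Data Set II = Data Set I + 7
shift_i = [12, 25, 38, 8, 42]
shift_ii = [x + 7 for x in shift_i]   # 19 32 45 15 49

# Multiplying by a constant: Data Set II = 2 * Data Set I
scale_i = [2, 8, 15, 9, 11]
scale_ii = [2 * x for x in scale_i]   # 4 16 30 18 22

print(mean(shift_i), mean(shift_ii))                    # 25 32: the mean shifts by 7
print(round(stdev(shift_i), 3), round(stdev(shift_ii), 3))  # 15.133 15.133: spread unchanged
print(mean(scale_i), mean(scale_ii))                    # 9 18: the mean doubles
print(round(stdev(scale_i), 3), round(stdev(scale_ii), 3))  # 4.743 9.487: the s.d. doubles
```

Adding a constant moves every observation by the same amount, so the deviations from the mean do not change; multiplying by a constant stretches those deviations by the same factor.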
Changing the unit of measurement
A linear transformation changes the original variable x into the new variable y given by an equation of the form
$$y = a + bx$$
Then
$$\bar{y} = a + b\bar{x}, \qquad s_y = |b|\,s_x$$
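A short derivation of why these two identities hold may help. It uses only the definitions of the mean and of s² given earlier in these notes, with y_i = a + b x_i for each observation; it is supplied here as a supporting sketch and is not taken from the lecture itself.

```latex
\begin{align*}
\bar{y} &= \frac{1}{n}\sum_{i=1}^{n} (a + b x_i)
         = a + b \cdot \frac{1}{n}\sum_{i=1}^{n} x_i
         = a + b\bar{x}, \\
s_y^2   &= \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2
         = \frac{1}{n-1}\sum_{i=1}^{n} \bigl(b x_i - b\bar{x}\bigr)^2
         = b^2 s_x^2, \\
s_y     &= \sqrt{b^2 s_x^2} = |b|\, s_x .
\end{align*}
```

The constant a cancels in every deviation y_i − ȳ, which is why shifting the data leaves the standard deviation unchanged, and the absolute value appears because a standard deviation is never negative.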
1.3 The Normal Distributions
Most useful mathematical model in probability and statistics
Strategy for exploring distributions: Graphical −→ numerical −→ mathematical model
Note:
Density curve: an idealized description of the distribution of data
Example 1.12 Refer to Exercise 1.79 (p.84)
Solution:
Median and mean of a density curve
μ, σ and x̄, s
Normal distributions: symmetric, single-peaked, bell-shaped density curves
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2/(2\sigma^2)}, \qquad -\infty < x < \infty$$
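The 68-95-99.7 Rule
In any normal distribution N(μ, σ), approximately
68% of the observations fall within σ of the mean μ,
95% of the observations fall within 2σ of μ,
99.7% of the observations fall within 3σ of μ.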
P (a < Z ≤ b) = P (Z ≤ b) − P (Z ≤ a)
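The 68-95-99.7 percentages can be recovered from the identity above by taking a = −k and b = k for k = 1, 2, 3. The sketch below uses `statistics.NormalDist` from the Python standard library in place of a printed table; the helper name `within` is made up for this illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution N(0, 1)


def within(k):
    """P(-k < Z <= k), computed as P(Z <= k) - P(Z <= -k)."""
    return Z.cdf(k) - Z.cdf(-k)


rule = {k: within(k) for k in (1, 2, 3)}
for k, p in rule.items():
    print(f"within {k} standard deviation(s): {p:.4f}")
# within 1 standard deviation(s): 0.6827
# within 2 standard deviation(s): 0.9545
# within 3 standard deviation(s): 0.9973
```

The rule's 68, 95 and 99.7 are these values rounded for easy recall.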
Example 1.14 Use Table A to find
a. P(Z < 2.5)
b. P(Z > 2.5)
c. P(Z < −1.6)
d. P(−1.6 < Z < 2.5)
Solution:
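The four probabilities can be cross-checked in software. This sketch substitutes `statistics.NormalDist` for Table A, so the values agree with the table only up to its four-decimal rounding; it is an added check, not the lecture's worked solution.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal N(0, 1); Z.cdf(z) plays the role of a Table A entry

p_a = Z.cdf(2.5)                 # a. P(Z < 2.5)
p_b = 1 - Z.cdf(2.5)             # b. P(Z > 2.5): complement of part a
p_c = Z.cdf(-1.6)                # c. P(Z < -1.6)
p_d = Z.cdf(2.5) - Z.cdf(-1.6)   # d. P(-1.6 < Z < 2.5) = P(Z < 2.5) - P(Z < -1.6)

for label, p in zip("abcd", (p_a, p_b, p_c, p_d)):
    print(f"{label}. {p:.4f}")
# a. 0.9938
# b. 0.0062
# c. 0.0548
# d. 0.9390
```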
Example 1.15 The distribution of heights of young women aged 18 to 24 is approximately N(64.5, 2.5) (in inches). What proportion of all young women
a. are less than 68 inches tall?
b. are between 64.5 and 67 inches tall?
c. are at least 70 inches tall?
Solution:
To compute proportions for N(μ, σ), standardize using
$$z = \frac{x - \mu}{\sigma}$$
to restate the problem in terms of an N(0, 1) variable
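As a rough illustration of standardizing, the heights in Example 1.15 can be converted to z-scores with z = (x − μ)/σ and then looked up in the standard normal distribution. The sketch assumes `statistics.NormalDist` as a stand-in for the table, and the helper `standardize` is a hypothetical name introduced here.

```python
from statistics import NormalDist

MU, SIGMA = 64.5, 2.5   # heights of young women, N(64.5, 2.5), in inches
Z = NormalDist()        # standard normal N(0, 1)


def standardize(x, mu=MU, sigma=SIGMA):
    """Restate x as a z-score: the number of standard deviations x lies from the mean."""
    return (x - mu) / sigma


p_under_68 = Z.cdf(standardize(68))                            # z = 1.4
p_64_5_to_67 = Z.cdf(standardize(67)) - Z.cdf(standardize(64.5))  # z from 0 to 1
p_at_least_70 = 1 - Z.cdf(standardize(70))                     # z = 2.2

print(f"a. {p_under_68:.4f}")     # a. 0.9192
print(f"b. {p_64_5_to_67:.4f}")   # b. 0.3413
print(f"c. {p_at_least_70:.4f}")  # c. 0.0139
```

Working through z-scores by hand, rather than calling `NormalDist(64.5, 2.5).cdf` directly, mirrors how the problem would be solved with a printed table.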
Find a value given a proportion
Example 1.16 Use Table T-11 to find the value z of a standard normal variable that satisfies each of the following conditions.
(a) the point z with 10% of the observations falling below it
(b) the point z with 5% of the observations falling above it
Solution:
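Finding z from a proportion is the reverse of a table lookup. A minimal check, assuming the inverse cumulative distribution function `NormalDist.inv_cdf` from the Python standard library in place of the printed table:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal N(0, 1)

# (a) 10% of the observations fall below z, so the area to the left of z is 0.10
z_a = Z.inv_cdf(0.10)

# (b) 5% of the observations fall above z, so the area to the left of z is 0.95
z_b = Z.inv_cdf(0.95)

print(f"(a) z = {z_a:.3f}")  # (a) z = -1.282
print(f"(b) z = {z_b:.3f}")  # (b) z = 1.645
```

A table lookup gives the same answers to about two decimal places, since it can only locate the nearest tabulated area.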
Example 1.17 The scores of a reference population on the Wechsler Intelligence Scale for Children (WISC) are normally distributed with μ = 100 and σ = 15. What score must a child achieve on the WISC in order to fall in the top 5% of the population? In the top 1%?
Solution:
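For Example 1.17 the z value is converted back to the original scale with x = μ + zσ, the inverse of standardizing. The following is a sketch of that calculation, again assuming `statistics.NormalDist` rather than a printed table; the function name `cutoff_for_top` is invented for this illustration.

```python
from statistics import NormalDist

MU, SIGMA = 100, 15   # WISC scores, N(100, 15)
Z = NormalDist()      # standard normal N(0, 1)


def cutoff_for_top(fraction, mu=MU, sigma=SIGMA):
    """Score x with the given fraction of the population above it: x = mu + z * sigma."""
    z = Z.inv_cdf(1 - fraction)   # area to the left of z is 1 - fraction
    return mu + z * sigma


top_5 = cutoff_for_top(0.05)
top_1 = cutoff_for_top(0.01)
print(f"top 5%: score of about {top_5:.1f}")  # about 124.7
print(f"top 1%: score of about {top_1:.1f}")  # about 134.9
```

With table values z ≈ 1.645 and z ≈ 2.33 the same formula gives roughly 124.7 and 135, so small differences from the software output reflect only rounding in the table.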