






Material Type: Assignment; Professor: Burke; Class: Elementary Statistics; Subject: Mathematics; University: Sierra College; Term: Spring 2009;
Typology: Assignments







Today: Sections 10-1/10-
Assignment: 10-2 {1, 3, 5, 7, 9, 13, 17, 19, 23}; 10-3 {1, 3, 5, 7, 13, 15, 17, 19, 21, 23}
Next: Sections 13-1; 13-6; 15-3; Work on Project
Instructor: John Burke
E-mail: [email protected]
Web Page: http://math.sierracollege.edu/Staff/JohnBurke/
Telephone: 916 337-
Office hours: (V-307) MW 2:35-5:00; M 2:45-3:45 (official)
In Chapter 10, we examine relationships between paired quantitative data.
We use collected data to
Can We Predict the Time of the Next Eruption of Old Faithful?
Is there a relationship between any two variables?
Can we predict how long it will be to the next eruption based upon duration, interval before, or height?
Eruptions of the Old Faithful Geyser
Duration (L1)                   240  120  178  234  235  269  255  220
Interval Before Eruption (L2)    98   90   92   98   93  105   81  108
Interval After Eruption (L3)     92   65   72   94   83   94  101   87
Height (L4)                     140  110  125  120  140  120  125  150
Paired sample data is sometimes called bivariate data.
A correlation exists between two variables when one of them is related to the other in some way.
We can often see if a relationship exists by using a scatterplot (or scatter diagram), a graph in which the paired (x, y) sample data are plotted with each pair represented as a single point.
Assumptions: we will consider only linear relationships, which means that when graphed, the points approximate a straight line. (Recall slope and direction of line.)
[Figure: six scatterplots showing (a) positive, (b) strong positive, (c) perfect positive, (d) negative, (e) strong negative, and (f) perfect negative correlation.]
The value of r does not change if all values of either variable are converted to a different scale.
The value of r is not affected by the choice of x or y. Interchange all x- and y-values and the value of r will not change.
r measures the strength of a linear relationship. It is not designed to measure the strength of a relationship that is not linear.
r^2 is the proportion of the variation in y that is explained by the linear relationship between x and y.
The value of r is always between -1 and +1 inclusive.
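$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}$$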
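As a minimal sketch, the computational formula for r can be applied to the Old Faithful data (Duration as x, Interval After Eruption as y). The function name `pearson_r` and the rescaling factor of 60 are illustrative choices, not part of the slides. The sketch also checks two of the listed properties: interchanging x and y, or converting one variable to a different scale, should leave r unchanged.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Linear correlation coefficient from the computational formula."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

# Old Faithful data from the table: Duration (L1) and Interval After Eruption (L3)
duration = [240, 120, 178, 234, 235, 269, 255, 220]
interval_after = [92, 65, 72, 94, 83, 94, 101, 87]

r = pearson_r(duration, interval_after)

# Property check: interchanging x and y should not change r.
r_swapped = pearson_r(interval_after, duration)

# Property check: rescaling one variable (here dividing by an arbitrary
# factor of 60, chosen only for illustration) should not change r.
r_rescaled = pearson_r([d / 60 for d in duration], interval_after)

print(round(r, 3), round(r_swapped, 3), round(r_rescaled, 3))
```

The three printed values should agree up to floating-point rounding, and r^2 can then be read as the proportion of variation in y explained by the linear relationship.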
Interpreting r using Table A-6:
If the absolute value of the computed value of r exceeds the value in Table A-6, conclude that there is a significant linear correlation.
Otherwise, there is not sufficient evidence to support the conclusion of a significant linear correlation.
Table A-6 (excerpt): critical values of r
n      α = .05   α = .
5      .878      .959
7      .754      .875
10     .632      .765
12     .576      .708
15     .514      .641
17     .482      .606
20     .444      .561
30     .361      .463
45     .294      .378
60     .254      .330
90     .207      .269
Causation: It is wrong to conclude that correlation implies causality (remember eating lobster and its “effect” on pregnancy).
Averages: Averages suppress individual variation and may inflate the correlation coefficient.
Linearity: There may be some relationship between x and y even when there is no significant linear correlation.
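# Pairs    x    y    xy    x^2    y^2
Sum
Mean

$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}$$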
Let H0: ρ = 0; H1: ρ ≠ 0
Select a significance level α
Calculate r
The test statistic is r. Critical values are determined from Table A-
If |r| > the critical value, reject H0; otherwise, fail to reject H0
If H0 is rejected, conclude that there is a significant linear correlation. If you fail to reject H0, then there is not sufficient evidence to conclude that there is a linear correlation.
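The decision rule in the steps above can be sketched as a small function. The name `correlation_decision` and the sample r values 0.9 and 0.3 are hypothetical and used only for illustration; the critical value 0.707 is the one shown on the critical-region diagram in these slides.

```python
def correlation_decision(r, critical_value):
    """Decision for H0: rho = 0 versus H1: rho != 0, using r as the test statistic."""
    if abs(r) > critical_value:
        return "reject H0"          # significant linear correlation
    return "fail to reject H0"      # not sufficient evidence of linear correlation

critical_value = 0.707  # critical value shown on the slides' diagram

# Hypothetical sample correlations, for illustration only.
print(correlation_decision(0.9, critical_value))
print(correlation_decision(-0.9, critical_value))
print(correlation_decision(0.3, critical_value))
```

Because the rule compares |r| with the critical value, a strongly negative r leads to rejection just as a strongly positive one does.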
[Figure: r axis from -1 to 1 with critical values r = -0.707 and r = 0.707; reject ρ = 0 in both tails, fail to reject ρ = 0 between them; the sample value of r is marked.]
We conclude there is a significant positive correlation between the Interval After Eruption and the Duration of Eruption.
Is there a significant linear correlation?
[Figure: r axis from -1 to 1 with critical values r = -0.707 and r = 0.707; reject ρ = 0 in both tails, fail to reject ρ = 0 between them; the sample value of r is marked.]
The test statistic does fall within the critical region, so we conclude that there is a significant linear correlation between the weights of discarded plastic and household size.
Once we have found a linear correlation, our next task is to find the best mathematical model to use for prediction.
Linear correlation means the data approximate a straight line; hence we are looking for the straight line y = mx + b that most closely fits the data.
Eruptions of the Old Faithful Geyser
Duration (L1)                  240  120  178  234  235  269  255  220
Interval After Eruption (L3)    92   65   72   94   83   94  101   87
[Figure: scatterplot of Interval After (L3) vs. Duration (L1)]
Assumptions: We are investigating only linear relationships. The pairs of (x, y) data have a bivariate normal distribution.
Definitions: Given a collection of paired sample data, the regression equation ŷ = b0 + b1x algebraically describes the relationship between the two variables. The graph of the regression equation is called the regression line (or line of best fit, or least-squares line).
Notation: β0 and β1 are the population parameters with regression equation y = β0 + β1x, and b0 and b1 are the sample statistics with regression equation ŷ = b0 + b1x.
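Slope:
$$b_1 = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\sum x^2 - \left(\sum x\right)^2}$$
Y-Intercept:
$$b_0 = \bar{y} - b_1\bar{x}$$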
Round each to three decimal places.
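A minimal sketch of the slope and y-intercept calculation for the Old Faithful data follows. It assumes the usual least-squares formulas, b1 = (nΣxy - ΣxΣy) / (nΣx² - (Σx)²) and b0 = ȳ - b1·x̄; the function name `least_squares_line` is illustrative.

```python
def least_squares_line(xs, ys):
    """Return (b0, b1) for the least-squares line y-hat = b0 + b1*x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Assumed standard least-squares formulas for slope and intercept.
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b0 = sum_y / n - b1 * (sum_x / n)
    return b0, b1

# Old Faithful data: Duration (L1) as x, Interval After Eruption (L3) as y
duration = [240, 120, 178, 234, 235, 269, 255, 220]
interval_after = [92, 65, 72, 94, 83, 94, 101, 87]

b0, b1 = least_squares_line(duration, interval_after)

# Round each coefficient to three decimal places, as the slide asks.
print(f"y-hat = {b0:.3f} + {b1:.3f}x")
```

Computing b0 from the two sample means keeps the sketch short and guarantees the fitted line passes through the point (x̄, ȳ).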
1. Calculate the value of r and test the hypothesis that ρ = 0.
2. Is ρ = 0 rejected?
   Yes: Use the regression equation to make predictions.
   No: Given any value of one variable, the best predicted value of the other variable is its sample mean.
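The prediction procedure can be sketched as a single function. All names and numbers below (`best_predicted_y`, the coefficients 2.0 and 0.5, the mean 10.0, and the r values) are hypothetical and chosen only to show the two branches.

```python
def best_predicted_y(x, r, critical_value, b0, b1, y_mean):
    """Best predicted y for a given x, following the yes/no procedure.

    If rho = 0 is rejected (|r| exceeds the critical value), use the
    regression equation; otherwise fall back to the sample mean of y.
    """
    if abs(r) > critical_value:
        return b0 + b1 * x
    return y_mean

# Hypothetical values, for illustration only.
b0, b1, y_mean = 2.0, 0.5, 10.0

print(best_predicted_y(4, r=0.9, critical_value=0.707, b0=b0, b1=b1, y_mean=y_mean))
print(best_predicted_y(4, r=0.3, critical_value=0.707, b0=b0, b1=b1, y_mean=y_mean))
```

In the second call the correlation is not significant, so the given x value is ignored and the sample mean is returned.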
Data from the Garbage Project
x Plastic (lb)    y Household
0.                2
1.                3
2.                3
2.                6
2.                4
1.                2
0.                1
3.                5
Residual: for a sample of paired (x, y) data, the difference (y - ŷ) between an observed sample y-value and ŷ, the value of y that is predicted by using the regression equation.
Least-squares property: a straight line satisfies this property if the sum of the squares of the residuals is the smallest sum possible.
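To illustrate the least-squares property, the sketch below fits a line to the Old Faithful data (Duration as x, Interval After Eruption as y), computes the residuals y - ŷ, and compares the sum of squared residuals with that of two slightly different lines. The helper names and the perturbation sizes (1.0 and 0.01) are arbitrary illustrative choices, and the fit assumes the usual least-squares formulas.

```python
def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 (standard formulas assumed)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b0 = sum_y / n - b1 * (sum_x / n)
    return b0, b1

def residuals(xs, ys, b0, b1):
    """Observed y minus predicted y-hat for each data pair."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

def sum_squared_residuals(xs, ys, b0, b1):
    return sum(e * e for e in residuals(xs, ys, b0, b1))

duration = [240, 120, 178, 234, 235, 269, 255, 220]
interval_after = [92, 65, 72, 94, 83, 94, 101, 87]

b0, b1 = fit_line(duration, interval_after)
res = residuals(duration, interval_after, b0, b1)

sse_best = sum_squared_residuals(duration, interval_after, b0, b1)
# Two nearby lines (arbitrary perturbations) for comparison.
sse_shifted = sum_squared_residuals(duration, interval_after, b0 + 1.0, b1)
sse_tilted = sum_squared_residuals(duration, interval_after, b0, b1 + 0.01)

print([round(e, 1) for e in res])
print(sse_best < sse_shifted, sse_best < sse_tilted)
```

Any other straight line, not only these two, should give a sum of squared residuals at least as large as the fitted line's.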
[Figure: scatterplot with a fitted line; x-axis from 1 to 5, y-axis from 0 to 32; labeled residuals include 7, -5, and 11.]