Describing the Relationship between Two Variables, Exercises of Statistics

Typology: Exercises

2022/2023

Uploaded on 02/28/2023

anasooya
Property of Regent University Math Tutoring Lab, Adapted from Fundamentals of Statistics: Informed Decisions Using Data 5th
Edition by Michael Sullivan III, edited 8/17/2018
Describing the Relationship between
Two Variables
Key Definitions
Scatter Diagram: A graph that shows the relationship between two quantitative variables
(each point is one pair of x and y values) measured on the same individual.
Linear Relationship: A linear relationship will have all the points close together and no
curves, dips, etc. in the graph. It will be an almost straight line up or down.
Nonlinear Relationship: A nonlinear relationship will still have the points close together, but
will have a curve or dip.
No Relationship: Having no relationship in a scatter diagram means that the data doesn’t
have a particular pattern to it. The data is dispersed.
Positively Associated: When one variable increases, the other variable tends to increase as
well. The points on your diagram will start low and rise from left to right. This applies only
to data that have a linear relationship.
Negatively Associated: When one variable increases, the other variable tends to decrease.
The points on your diagram will start high and fall from left to right. This applies only to
data that have a linear relationship.
Linear Correlation Coefficient: A calculation that shows us the strength of the linear
relation and the direction of the linear relation.
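Lurking Variable: A variable that is not included in the study but is related to both the
explanatory and response variables. It can make two variables appear related even when one
does not cause the other.
Residual: A residual is the difference between the observed y and the predicted y (observed
minus predicted), which is also known as the error.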
Slope: The rate that the line is increasing or decreasing at.
Y-Intercept: The point on the line where 𝑥 = 0.
Coefficient of Determination: The percentage of total variation in the response variable
that is explained by the least-squares regression line.
Total Variation: The deviation between the observed and mean values.
Explained Variation: The deviation between the predicted and mean values.
Unexplained Variation: The deviation between the observed and predicted values.
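The three variations above fit together as a decomposition. The following is a sketch in LaTeX, assuming the usual notation: y is an observed value, ŷ is the value predicted by the least-squares regression line, and ȳ is the mean of the observed values. The sum-of-squares identity in the second line holds for the least-squares regression line specifically.

```latex
% Deviation of one observation, split into explained and unexplained parts
\[
\underbrace{y - \bar{y}}_{\text{total}}
  = \underbrace{\hat{y} - \bar{y}}_{\text{explained}}
  + \underbrace{y - \hat{y}}_{\text{unexplained}}
\]
% Summed over all observations (valid for the least-squares regression line)
\[
\sum (y_i - \bar{y})^2
  = \sum (\hat{y}_i - \bar{y})^2
  + \sum (y_i - \hat{y}_i)^2
\]
```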
Scatter Diagram, Relations, and Association
How a Scatter Diagram is Set Up: Your scatter diagram is like a usual graph from algebra where we
graph x and y values, but there is no line connecting the points. The explanatory variable goes on
the x-axis and the response variable goes on the y-axis. Plot each (x, y) pair as a point to
create the graph.

Determining Meaning from the Scatter Diagram: After each point is plotted on the graph, you can determine whether the data have a linear relation, a nonlinear relation, or no relation. You can also determine whether the variables are positively associated or negatively associated.

Examples of Different Scatter Diagrams:

Linear Correlation Coefficient

How to Find the Correlation Coefficient: The following is the formula for the correlation coefficient:

$$r = \frac{\sum \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right)}{n - 1}$$

This formula takes a very long time to do by hand, so we use technology to find the answer. Here we use Excel. The following are step-by-step instructions:

  1. First, input your data. Make sure your x-values and y-values are in separate columns.
  2. In a blank cell, type =CORREL( then click and drag down to highlight the cells of the x-values, type a comma, click and drag down to highlight the cells of the y-values, and type the closing parenthesis. Press Enter and the correlation coefficient will replace what you typed.
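For readers who want to see what the calculation involves, the formula can also be sketched in Python. This is a minimal illustration, not part of the handout: the function name `correlation` and the data values are made up for the example, and only the standard library is used.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Linear correlation coefficient r: the sum of the products of the
    z-scores of each (x, y) pair, divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (len(xs) - 1)

# Hypothetical data, made up for illustration (not the handout's data set).
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

r = correlation(x_values, y_values)
print(round(r, 4))  # 0.7746
```

A value near 1 indicates a strong positive linear relation, a value near -1 a strong negative linear relation, and a value near 0 little or no linear relation.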

[Figures: four example scatter diagrams, each with Explanatory on the horizontal axis and Response on the vertical axis: Positive, Linear Scatter Diagram; Negative, Linear Scatter Diagram; Nonlinear Scatter Diagram; No Relation Scatter Diagram.]

Using the data points given above in the correlation coefficient example, we can find the linear equation. First, using the points (74, 100) and (68, 98), we find the slope:

$$m = \frac{100 - 98}{74 - 68} = \frac{2}{6} \approx 0.33$$

Next, we plug the slope and one of the data points into the point-slope formula to find the equation:

$$y - 98 = 0.33(x - 68)$$
$$y - 98 = 0.33x - 22.44$$
$$y = 0.33x - 22.44 + 98$$
$$y = 0.33x + 75.56$$

Least Squares Method: A residual is the difference between the observed y and the predicted y. The least squares method chooses the line that makes the sum of the squared residuals as small as possible. To do this, we need the observed y (the data values) and the predicted y (from the least-squares regression line).

How to Find the Least-Squares Regression Line: To find the least-squares regression line, you need to find the slope and y-intercept first. We do not find them the same way we found the slope and y-intercept of the linear equation. We first find the slope (we use the symbol $b_1$). To find the slope, you need the correlation coefficient, the standard deviation of the y-values, and the standard deviation of the x-values. The following is the formula for the slope:

$$b_1 = r \cdot \frac{s_y}{s_x}$$

After finding the slope, you can find the y-intercept. To find the y-intercept, you need the slope, the mean of the x-values, and the mean of the y-values. The following is the formula:

$$b_0 = \bar{y} - b_1 \bar{x}$$

From there, you have everything needed for the least-squares regression line formula:

$$\hat{y} = b_1 x + b_0$$

Example of Finding the Least-Squares Regression Line: Using the data values from the example on the correlation coefficient, we know $r = 0.90$, $\bar{x} = 79$, $\bar{y} = 106.6$, $s_x = 10.35$, and $s_y = 9.55$. First, we find the slope:

$$b_1 = r \cdot \frac{s_y}{s_x} = 0.90 \cdot \frac{9.55}{10.35} \approx 0.83$$

Now that we know the slope, we can find the y-intercept:

$$b_0 = \bar{y} - b_1 \bar{x} = 106.6 - 0.83(79) = 106.6 - 65.57 = 41.03$$

Now, we can put the information into our least-squares regression line formula:

$$\hat{y} = b_1 x + b_0$$
$$\hat{y} = 0.83x + 41.03$$
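The arithmetic in this example can be checked with a short Python sketch. It uses the summary statistics quoted in the example and, like the hand calculation, rounds the slope to two decimal places before computing the y-intercept. The function names `least_squares_line` and `predict` are made up for illustration.

```python
# Summary statistics quoted in the worked example.
r = 0.90                   # linear correlation coefficient
x_bar, y_bar = 79, 106.6   # means of the x-values and y-values
s_x, s_y = 10.35, 9.55     # sample standard deviations

def least_squares_line(r, x_bar, y_bar, s_x, s_y, digits=2):
    """Return (b1, b0) for the line y-hat = b1 * x + b0.

    The slope is rounded before the intercept is computed, mirroring
    the hand calculation; pass a larger `digits` for more precision.
    """
    b1 = round(r * s_y / s_x, digits)       # slope: r times s_y over s_x
    b0 = round(y_bar - b1 * x_bar, digits)  # intercept: y-bar minus b1 times x-bar
    return b1, b0

b1, b0 = least_squares_line(r, x_bar, y_bar, s_x, s_y)

def predict(x):
    """Predicted y (y-hat) for a given x on the fitted line."""
    return b1 * x + b0

print(b1, b0)  # 0.83 41.03
```

Rounding the slope early changes the intercept slightly; carrying more decimal places through the calculation gives a slightly different value for the intercept.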

The Coefficient of Determination

How to Find the Coefficient of Determination: To find the coefficient of determination for a least-squares regression line, you take your linear correlation coefficient and square it. Therefore, the formula is the following:

$$R^2 = r^2$$

Example of How to Find the Coefficient of Determination: Using the data from the linear correlation coefficient example, we know that $r = 0.90$. Now we just plug it into the formula:

$$R^2 = (0.90)^2 = 0.81$$

Since the coefficient of determination is a percentage, we just turn our decimal into a percentage by multiplying it by 100%. Therefore, our answer is 81%.
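To connect this shortcut back to the definitions of total and explained variation, here is a minimal Python sketch that computes the coefficient of determination as explained variation divided by total variation. The data and the function names `fit_line` and `r_squared` are hypothetical, chosen only for illustration; the slope is computed in a form that is algebraically equivalent to r times s_y over s_x.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares slope b1 and intercept b0."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx          # equivalent to r * s_y / s_x
    b0 = y_bar - b1 * x_bar
    return b1, b0

def r_squared(xs, ys):
    """Explained variation divided by total variation."""
    b1, b0 = fit_line(xs, ys)
    y_bar = mean(ys)
    predicted = [b1 * x + b0 for x in xs]
    total = sum((y - y_bar) ** 2 for y in ys)             # observed vs. mean
    explained = sum((p - y_bar) ** 2 for p in predicted)  # predicted vs. mean
    return explained / total

# Hypothetical data, made up for illustration (not the handout's data set).
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

R2 = r_squared(x_values, y_values)
print(f"{R2:.0%}")  # 60%
```

For this made-up data the linear correlation coefficient is about 0.7746, and squaring it gives the same 0.60.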

Symbol Guide

Chapter Title Symbols

Term | Symbol | Use
Sample Mean | x̅ | To identify the sample mean
Sample Standard Deviation | s | To identify the sample standard deviation
Sample Size | n | To identify the sample size
Sum | ∑ | To identify when we add up everything
Linear Correlation Coefficient | r | To identify the correlation coefficient
Linear Line Slope | m | To identify the slope of a linear line
Y-Intercept | b | To identify the y-intercept of a linear line
Least-Squares Regression Line Slope | b₁ | To identify the least-squares regression line slope
Least-Squares Regression Line Y-Intercept | b₀ | To identify the least-squares regression line y-intercept
Predicted y | ŷ | To identify the predicted y of a least-squares regression line
The Coefficient of Determination | R^2 | To identify the coefficient of determination