Linear Regression with Excel, Exams of Mathematics

Examples of linear regression problems and how to solve them using Excel. It explains the concept of correlation coefficient and how to calculate it. It also shows how to find the best fit linear regression equation and make predictions using it. The document emphasizes the importance of making reasonable predictions and the difference between interpolation and extrapolation. The examples include data on stars, sales revenue, city temperatures, life expectancy, and test scores.

Typology: Exams

2023/2024

Available from 09/22/2023

Week 8 questions and answers
Performing Linear Regressions with Technology
An amateur astronomer is researching statistical properties of known stars using a
variety of databases. They collect the absolute magnitude, or M_V, and the stellar mass,
or M⊙, for 30 stars. The absolute magnitude of a star measures the intensity of light that
would be observed from the star at a distance of 10 parsecs. This is measured
in terms of a particular band of the light spectrum, indicated by the subscript letter,
which in this case is V for the visual light spectrum. The scale is logarithmic: an M_V
that is 1 less than another comes from a star that is about 2.5 times more luminous
than the other. The stellar mass of a star is how many times the sun's mass it has. The
data is provided below. Use Excel to calculate the correlation coefficient r between the
two data sets, rounding to two decimal places.
Answer Explanation
The correlation coefficient, rounded to two decimal places, is r≈−0.93.
A market researcher looked at the quarterly sales revenue for a large e-commerce store
and for a large brick-and-mortar retailer over the same period. The researcher recorded
the revenue in millions of dollars for 30 quarters. The data are provided below. Use
Excel to calculate the correlation coefficient r between the two data sets. Round your
answer to two decimal places.
Answer Explanation
The correlation coefficient, rounded to two decimal places, is r≈−0.81.
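
The correlation coefficient asked for in the two problems above is the Pearson r, which Excel's CORREL function returns for two ranges of equal length. As a rough sketch of the same calculation outside Excel, the Python below computes r from its definition. The function name `pearson_r` and the sample values are assumptions made for illustration; they are not the star or revenue data, which are not reproduced here.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross products.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one list is constant")
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative values (hypothetical, not data from the problems above).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [9.8, 8.1, 6.2, 3.9, 2.0]
print(round(pearson_r(x, y), 2))  # rounded to two decimal places, as the problems ask
```

A value of r near −1, as in both answers above, indicates a strong negative linear relationship: one variable tends to fall as the other rises.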

The table below contains the geographic latitudes, x, and average January temperatures, y, of 20 cities. Use Excel to find the best fit linear regression equation. Round the slope and intercept to two decimal places.

Thus, the equation of the line of best fit, with slope and intercept rounded to two decimal places, is ŷ = −2.68x + 147.24.

An organization collects information on the life expectancy (in years) of a person in certain countries and the fertility rate per woman in those countries. The data for 21 randomly selected countries for the year 2011 is given below. Use Excel to find the best fit linear regression equation, where fertility rate is the explanatory variable. Round the slope and intercept to two decimal places.

Answer Explanation

ŷ = −4.21x + 83.68.

Thus, the equation of the line of best fit, with slope and intercept rounded to two decimal places, is ŷ = 2.86x + 4.69.

In the following table, the age (in years) of the respondents is given as the x value, and the earnings (in thousands of dollars) of the respondents are given as the y value. Use Excel to find the best fit linear regression equation in thousands of dollars. Round the slope and intercept to three decimal places.

Answer Explanation

Thus, the equation of the line of best fit, with slope and intercept rounded to three decimal places, is ŷ = 0.433x + 24.493.
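
For reference, the best fit lines in the problems above are ordinary least-squares lines. The standard formulas are sketched below, writing a for the slope and b for the intercept; the symbols x̄ and ȳ (the sample means) and the index i are notation introduced here, not taken from the problems.

```latex
\hat{y} = a x + b,
\qquad
a = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
b = \bar{y} - a\,\bar{x}
```

In Excel, the SLOPE and INTERCEPT functions (or a linear trendline on a scatter chart) should return these same two values.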

PREDICTIONS USING LINEAR REGRESSION

Question

The table shows data collected on the relationship between the time spent studying per day and the time spent reading per day. The line of best fit for the data is ŷ = 0.16x + 36.2. Assume the line of best fit is significant and there is a strong linear relationship between the variables.

Studying (Minutes)   50   70   90   110
Reading (Minutes)    44   48   50   54

(a) According to the line of best fit, what would be the predicted number of minutes spent reading for someone who spent 67 minutes studying? Round your answer to two decimal places.

Answer Explanation

The predicted number of minutes spent reading is 46.92.

Substitute 67 for x into the line of best fit to estimate the number of minutes spent reading for someone who spent 67 minutes studying: ŷ = 0.16(67) + 36.2 = 46.92.
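
The substitution step can be checked with a few lines of Python. This is only a sketch: the helper name `predict` is an assumption, while the slope 0.16, the intercept 36.2, and x = 67 come from the question.

```python
def predict(slope, intercept, x):
    """Evaluate the line of best fit y-hat = slope * x + intercept at x."""
    return slope * x + intercept

# Line of best fit from the question: y-hat = 0.16x + 36.2, evaluated at x = 67.
y_hat = predict(0.16, 36.2, 67)
print(round(y_hat, 2))  # 46.92
```

Rounding at the end, rather than rounding the slope or intercept first, keeps the result consistent with the two-decimal answer the question asks for.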

Question

The table shows data collected on the relationship between the time spent studying per day and the time spent reading per day. The line of best fit for the data is ŷ = 0.16x + 36.2.

Studying (Minutes)   50   70   90   110
Reading (Minutes)    44   48   50   54

(a) According to the line of best fit, the predicted number of minutes spent reading for someone who spent 67 minutes studying is 46.92.

(b) Is it reasonable to use this line of best fit to make the above prediction?

  • The estimate, a predicted time of 46.92 minutes, is both reliable and reasonable.
  • The estimate, a predicted time of 46.92 minutes, is both unreliable and unreasonable.
  • The estimate, a predicted time of 46.92 minutes, is reliable but unreasonable.
  • The estimate, a predicted time of 46.92 minutes, is unreliable but reasonable.

Answer Explanation

The data in the table only includes exercise times between 15 and 30 minutes, so the line of best fit gives reliable and reasonable predictions for values of x between 15 and 30. Since 23 is between these values, the estimate is reasonable.

Your answer:

The estimate, a predicted test score of 76.98, is unreliable and unreasonable.

This estimate is reliable, because 23 is inside the range 15 to 30 given in the table. And, it is a realistic score, so it is reasonable.

Nomenclature

  • When using regression lines to make predictions, if the x-value is within the range of observed x-values, one can conclude the prediction is both reliable and reasonable. That is, the prediction is accurate and possible. For example, if a prediction were made using x=1995 in the video above, one could conclude the predicted y-value is both reliable (quite accurate) and reasonable (possible). This is an example of interpolation.

  • When using regression lines to make predictions, if the x-value is outside the range of observed x-values, one cannot conclude the prediction is both reliable and reasonable. That is, the prediction will be much less accurate, and the prediction may or may not be possible. For example, x=2020 is not within the range of 1950 to 2000. Therefore, the prediction is much less reliable (not as accurate) even though it is reasonable (it is possible that a person will live to be 79.72 years old). This is an example of extrapolation.

Reasonable Predictions

Note that not all predictions made using a line of best fit are reasonable. Typically, it is considered reasonable to make predictions for x-values that are between the smallest and largest observed x-values. These are known as interpolated values. Typically, it is considered unreasonable to make predictions for x-values that are not between the smallest and largest observed x-values. These are known as extrapolated values.

[Figure: a scatterplot with a horizontal axis labeled x from 0 to 20 and a vertical axis labeled y from 0 to 28. Fifteen plotted points follow a line that rises from left to right, including points near (6, 10), (8, 13), (10, 15), and (13, 19). The regions of the horizontal axis from 1 to 6 and from 14 to 20 are shaded as unreasonable; the region from 6 to 14 is shaded as reasonable. All coordinates are approximate.]

In the figure above, the observed values have x-values ranging from 6 to 14. So it would be reasonable to use the line of best fit to make a prediction for the x-value of 9 (because it is between 6 and 14), but it would be unreasonable to make a prediction for the x-value of 20 (because that is outside of the range).
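
The range check described here is mechanical, so it can be sketched in code. The function name `classify_prediction` and the list of x-values are hypothetical; only the endpoints 6 and 14 and the test values 9 and 20 come from the text.

```python
def classify_prediction(observed_x, x_new):
    """Return 'interpolation' if x_new lies within the observed x-range,
    otherwise 'extrapolation'."""
    if not observed_x:
        raise ValueError("need at least one observed x-value")
    lowest, highest = min(observed_x), max(observed_x)
    return "interpolation" if lowest <= x_new <= highest else "extrapolation"

# Hypothetical observed x-values spanning 6 to 14, matching the range in the text.
observed_x = [6, 8, 10, 13, 14]
print(classify_prediction(observed_x, 9))   # interpolation
print(classify_prediction(observed_x, 20))  # extrapolation
```

This check covers only the x-range. Whether a predicted y-value is possible at all (for example, a test score above 100) still has to be judged separately.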


  • The predicted test score is 95.2, and the estimate is not reasonable.
  • The predicted test score is 95.2, and the estimate is reasonable.
  • The predicted test score is 107.2, and the estimate is not reasonable.
  • The predicted test score is 107.2, and the estimate is reasonable.

Answer Explanation

Correct answer:

The predicted test score is 107.2, and the estimate is not reasonable.

Substitute 70 for x in the line of best fit to estimate the test score for someone who spent 70 minutes reading: ŷ = 0.8(70) + 51.2 = 107.2. The data in the table only includes reading times between 30 and 50 minutes, so the line of best fit only gives reasonable predictions for values of x between 30 and 50. Since 70 is far outside of this range of values, the estimate is not reasonable.

Another thing to notice is that the line predicts a test score greater than 100, which is typically impossible.

Your answer:

The predicted test score is 107.2, and the estimate is reasonable.

The predicted value is not reasonable because the value of 70 minutes is not between 30 and 50 minutes.

Question

Data is collected on the relationship between the average number of minutes spent exercising per day and math test scores. The data is shown in the table and the line of best fit for the data is ŷ = 0.42x + 64.6. Assume the line of best fit is significant and there is a strong linear relationship between the variables.

Minutes      25   30   35   40
Test Score   75   77   80   81

(a) According to the line of best fit, what would be the predicted test score for someone who spent 38 minutes exercising? Round your answer to two decimal places.

The predicted test score is 80.56.

Answer Explanation

Substitute 38 for x into the line of best fit to estimate the test score for someone who spent 38 minutes exercising: ŷ = 0.42(38) + 64.6 = 80.56.

Question

Data is collected on the relationship between the average number of minutes spent exercising per day and math test scores. The data is shown in the table and the line of best fit for the data is ŷ = 0.42x + 64.6.

Minutes      25   30   35   40
Test Score   75   77   80   81

(a) According to the line of best fit, the predicted test score for someone who spent 38 minutes exercising is 80.56.

(b) Is it reasonable to use this line of best fit to make the above prediction?

The estimate, a predicted test score of 80.56, is both reliable and reasonable.

Answer Explanation

The predicted number of minutes spent watching television is 60.25.

Substitute 45 for x into the line of best fit to estimate the number of minutes spent watching television for an average daily temperature of 45 degrees: ŷ = −0.81(45) + 96.7 = 60.25.

Question

Data is collected on the relationship between the average daily temperature and time spent watching television. The data is shown in the table and the line of best fit for the data is ŷ = −0.81x + 96.7.

Temperature (Degrees)         30   40   50   60
Minutes Watching Television   73   63   57   48

(a) According to the line of best fit, the predicted number of minutes spent watching television for an average daily temperature of 45 degrees is 60.25.

(b) Is it reasonable to use this line of best fit to make the above prediction?

  • The estimate, a predicted time of 60.25 minutes, is unreliable but reasonable.
  • The estimate, a predicted time of 60.25 minutes, is both reliable and reasonable.
  • The estimate, a predicted time of 60.25 minutes, is both unreliable and unreasonable.
  • The estimate, a predicted time of 60.25 minutes, is reliable but unreasonable.

Answer Explanation

Correct answer:

The estimate, a predicted time of 60.25 minutes, is both reliable and reasonable.

The data in the table only includes temperatures between 30 and 60 degrees, so the line of best fit only gives reliable and reasonable predictions for values of x between 30 and 60. Since 45 is between these values, the estimate is both reliable and reasonable.

Question

Homer is studying the relationship between the average daily temperature and time spent watching television and has collected the data shown in the table. The line of best fit for the data is ŷ = −0.6x + 94.5. Assume the line of best fit is significant and there is a strong linear relationship between the variables.

Temperature (Degrees)         40   50   60   70
Minutes Watching Television   70   65   59   52

(a) According to the line of best fit, what would be the predicted number of minutes spent watching television for an average daily temperature of 39 degrees? Round your answer to two decimal places, as needed.

The predicted number of minutes spent watching television is 71.1.

Answer Explanation

Substitute 39 for x into the line of best fit: ŷ = −0.6(39) + 94.5 = 71.1.

The data in the table only includes temperatures between 40 and 70 degrees, so the line of best fit gives reliable and reasonable predictions for values of x between 40 and 70. Since 39 is not between these values, the estimate is not reliable. However, 71.1 minutes is a reasonable time.

Your answer:

The estimate, a predicted time of 71.1 minutes, is both reliable and reasonable.

This estimate is not reliable, because 39 is outside of the range 40 to 70 given in the table.

Question

Daniel owns a business consulting service. For each consultation, he charges $95 plus $70 per hour of work. A linear equation that expresses the total amount of money Daniel earns per consultation is y=70x+95. What are the independent and dependent variables? What are the y-intercept and the slope?

  • The independent variable (x) is the amount, in dollars, Daniel earns for a consultation. The dependent variable (y) is the amount of time Daniel consults. Daniel charges a one-time fee of $95 (this is when x=0), so the y-intercept is 95. Daniel earns $70 for each hour he works, so the slope is 70.

  • The independent variable (x) is the amount of time Daniel consults. The dependent variable (y) is the amount, in dollars, Daniel earns for a consultation. Daniel charges a one-time fee of $95 (this is when x=0), so the y-intercept is 95. Daniel earns $70 for each hour he works, so the slope is 70.

  • The independent variable (x) is the amount, in dollars, Daniel earns for a consultation. The dependent variable (y) is the amount of time Daniel consults. Daniel charges a one-time fee of $70 (this is when x=0), so the y-intercept is 70. Daniel earns $95 for each hour he works, so the slope is 95.

  • The independent variable (x) is the amount of time Daniel consults. The dependent variable (y) is the amount, in dollars, Daniel earns for a consultation. Daniel charges a one-time fee of $70 (this is when x=0), so the y-intercept is 70. Daniel earns $95 for each hour he works, so the slope is 95.

Answer Explanation

Correct answer:

The independent variable (x) is the amount of time Daniel consults because it is the value that changes. He may work different amounts per consultation, and his earnings are dependent on how many hours he works. This is why the amount, in dollars, Daniel earns for a consultation is the dependent variable (y).

The y-intercept is 95 (b=95). This is his one-time fee. The slope is 70 (a=70). This is the increase for each hour he works.
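
One way to see the roles of the intercept and slope is to evaluate the equation y=70x+95 directly. The sketch below does that in Python; the names `HOURLY_RATE`, `FLAT_FEE`, and `consultation_earnings` are illustrative choices, not part of the question.

```python
HOURLY_RATE = 70  # slope a: dollars earned per additional hour of work
FLAT_FEE = 95     # y-intercept b: one-time fee, the value of y when x = 0

def consultation_earnings(hours):
    """Total earnings y (in dollars) for a consultation lasting `hours` hours (x)."""
    return HOURLY_RATE * hours + FLAT_FEE

print(consultation_earnings(0))                             # 95, the y-intercept
print(consultation_earnings(3) - consultation_earnings(2))  # 70, the slope
```

The input (hours) is the independent variable and the returned amount is the dependent variable, which mirrors the correct answer above.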


Question

Given the following line, find the value of y when x=2.

y=−4x−

Answer Explanation

  • The independent variable (x) is the amount of time Evan works each house visit. The dependent variable (y) is the amount, in dollars, Evan earns for each session. At the start of the repairs, Evan charges a one-time fee of $55 (this is when x=0), so the y-intercept is 55. Evan earns $30 for each hour he works, so the slope is 30.

  • The independent variable (x) is the amount, in dollars, Evan earns for each session. The dependent variable (y) is the amount of time Evan works each house visit. At the start of the repairs, Evan charges a one-time fee of $30 (this is when x=0), so the y-intercept is 30. Evan earns $55 for each hour he works, so the slope is 55.

  • The independent variable (x) is the amount of time Evan works each house visit. The dependent variable (y) is the amount, in dollars, Evan earns for each session. At the start of the repairs, Evan charges a one-time fee of $30 (this is when x=0), so the y-intercept is 30. Evan earns $55 for each hour he works, so the slope is 55.

Answer Explanation

Correct answer:

The independent variable (x) is the amount of time Evan works each house visit because it is the value that changes. He may work different amounts per day, and his earnings are dependent on how many hours he works. This is why the amount, in dollars, Evan earns for each session is the dependent variable (y).

The y-intercept is 55 (b=55). This is his one-time fee. The slope is 30 (a=30). This is the increase for each hour he works.

Question

Using a calculator or statistical software, find the linear regression line for the data in the table below. Enter your answer in the form y=mx+b, with m and b both rounded to two decimal places.

x   0      1      2      3      4      5
y   2.12   2.19   1.92   2.79   3.81   4.72

Answer Explanation

Correct answer:

  • y = 0.54x + 1.59

If you use a TI-83 or TI-84 calculator, press STAT and then ENTER, which brings you to the edit menu where you can enter values. In the L1 list, enter the values of x from the table above: 0, 1, 2, 3, 4, 5. Then, in the L2 list, enter the values of y from the table above: 2.12, 2.19, 1.92, 2.79, 3.81, 4.72.

Now press STAT again and arrow to the right, to CALC. Arrow down to the LinReg
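
For readers without a TI calculator, the same regression can be reproduced with a short script. The sketch below is an assumption-laden stand-in for the calculator steps, not the method the source prescribes: the function name `least_squares_line` is made up, and it applies the usual least-squares formulas to the x and y values quoted in the explanation.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x-values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# The values entered into lists L1 and L2 in the explanation.
xs = [0, 1, 2, 3, 4, 5]
ys = [2.12, 2.19, 1.92, 2.79, 3.81, 4.72]
m, b = least_squares_line(xs, ys)
print(f"y = {m:.2f}x + {b:.2f}")  # y = 0.54x + 1.59
```

The unrounded slope is about 0.5351, so an answer of y = 0.53x + ... comes from truncating instead of rounding to two decimal places.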
