





























































An introduction to linear regression and its application in Excel. It explains the correlation coefficient and the line of best fit, and discusses the reliability and reasonableness of predictions made using regression lines, with examples and explanations for each concept.
Typology: Exams






























































Week 8 questions and answers

Performing Linear Regressions with Technology

Question
An amateur astronomer is researching statistical properties of known stars using a variety of databases. They collect the absolute magnitude (MV) and stellar mass (M⊙) for 30 stars. The absolute magnitude of a star is the intensity of light that would be observed from the star at a distance of 10 parsecs. It is measured in a particular band of the light spectrum, indicated by the subscript letter, which in this case is V for the visual light spectrum. The scale is logarithmic: an MV that is 1 less than another comes from a star that is about 2.5 times more luminous than the other (a difference of 5 corresponds to a factor of 100). The stellar mass of a star is how many times the sun's mass it has. The data is provided below. Use Excel to calculate the correlation coefficient r between the two data sets, rounding to two decimal places.

Answer Explanation
The correlation coefficient, rounded to two decimal places, is r≈−0.93.

Question
A market researcher looked at the quarterly sales revenue for a large e-commerce store and for a large brick-and-mortar retailer over the same period. The researcher recorded the revenue in millions of dollars for 30 quarters. The data are provided below. Use Excel to calculate the correlation coefficient r between the two data sets. Round your answer to two decimal places.

Answer Explanation
The correlation coefficient, rounded to two decimal places, is r≈−0.81.
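Excel's CORREL function returns the Pearson correlation coefficient. As a rough sketch of the same calculation outside Excel, the Python function below computes r from two equal-length lists. The name `pearson_r` and the sample values are hypothetical and chosen only for illustration; the 30-point data sets referenced in the questions are not reproduced here.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of squared deviations and cross-products about the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative values, NOT data from the questions above
x = [1, 2, 3, 4, 5]
y = [10, 8, 7, 4, 1]
r = round(pearson_r(x, y), 2)
print(r)  # -0.98
```

A value of r close to −1 or 1 indicates a strong linear relationship, while a value close to 0 (such as the −0.17 in the CEO pay ratio question) indicates a weak one.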
Question
An economist is trying to understand whether there is a strong link between CEO pay ratio and corporate revenue. The economist gathered data including the CEO pay ratio and corporate revenue for 30 companies for a particular year. The pay ratio data is reported by the companies and represents the ratio of CEO compensation to the median employee salary. The data are provided below. Use Excel to calculate the correlation coefficient r between the two data sets. Round your answer to two decimal places.

Answer Explanation
The correlation coefficient, rounded to two decimal places, is r≈−0.17.

Question
A researcher is interested in whether the variation in the size of human beings is proportional throughout each part of the human. To partly answer this question they looked at the correlation between the foot length (in millimeters) and height (in centimeters) of 30 randomly selected adult males. The data is provided below. Use Excel to calculate the correlation coefficient r between the two data sets. Round your answer to two decimal places.

Answer Explanation
The correlation coefficient, rounded to two decimal places, is r≈0.50.

Question
The table below gives the average weight (in kilograms) of certain people ages 1 –
Use Excel to find the best fit linear regression equation, where age is the explanatory variable. Round the slope and intercept to two decimal places.

Answer Explanation
Thus, the equation of the line of best fit, with slope and intercept rounded to two decimal places, is ŷ=2.86x+4.69.

Question
In the following table, the age (in years) of the respondents is given as the x value, and the earnings (in thousands of dollars) of the respondents are given as the y value. Use Excel to find the best fit linear regression equation in thousands of dollars. Round the slope and intercept to three decimal places.
Answer Explanation
Thus, the equation of the line of best fit, with slope and intercept rounded to three decimal places, is ŷ=0.433x+24.493.

PREDICTIONS USING LINEAR REGRESSION

Question
The table shows data collected on the relationship between the time spent studying per day and the time spent reading per day. The line of best fit for the data is ŷ=0.16x+36.2. Assume the line of best fit is significant and there is a strong linear relationship between the variables.

Studying (Minutes)   50   70   90   110
Reading (Minutes)    44   48   50   54

(a) According to the line of best fit, what would be the predicted number of minutes spent reading for someone who spent 67 minutes studying? Round your answer to two decimal places.

Answer Explanation
Substitute 67 for x into the line of best fit to estimate the number of minutes spent reading for someone who spent 67 minutes studying: ŷ=0.16(67)+36.2=46.92.

(b) Is it reasonable to use this line of best fit to make the above prediction?
The estimate, a predicted time of 46.92 minutes, is both reliable and reasonable.
The estimate, a predicted time of 46.92 minutes, is both unreliable and unreasonable.
The estimate, a predicted time of 46.92 minutes, is reliable but unreasonable.
The estimate, a predicted time of 46.92 minutes, is unreliable but reasonable.

Answer Explanation
Correct answer: The estimate, a predicted time of 46.92 minutes, is both reliable and reasonable.
The data in the table only includes studying times between 50 and 110 minutes, so the line of best fit gives reliable and reasonable predictions for values of x between 50 and 110. Since 67 is between these values, the estimate is both reliable and reasonable.

Question
Janet is studying the relationship between the average number of minutes spent exercising per day and math test scores and has collected the data shown in the table. The line of best fit for the data is ŷ=0.46x+66.4.

Minutes      15   20   25   30
Test Score   73   76   78   80

(a) According to the line of best fit, the predicted test score for someone who spent 23 minutes exercising is 76.98.
(b) Is it reasonable to use this line of best fit to make the above prediction?
The estimate, a predicted test score of 76.98, is reliable and reasonable.
The estimate, a predicted test score of 76.98, is unreliable but reasonable.
The estimate, a predicted test score of 76.98, is unreliable and unreasonable.
The estimate, a predicted test score of 76.98, is reliable but unreasonable.

Answer Explanation
The data in the table only includes exercise times between 15 and 30 minutes, so the line of best fit gives reliable and reasonable predictions for values of x between 15 and 30. Since 23 is inside this range, the estimate is reliable. It is also a realistic score, so it is reasonable. The estimate is therefore both reliable and reasonable.

Nomenclature
Note that not all predictions are reasonable using a line of best fit. Typically, it is considered reasonable to make predictions for x-values which are between the smallest and largest observed x-values. These are known as interpolated values. Typically, it is considered unreasonable to make predictions for x-values which are not between the smallest and largest observed x-values. These are known as extrapolated values.

[Figure: A scatterplot with a horizontal axis labeled x from 0 to 20 in increments of 1 and a vertical axis labeled y from 0 to 28 in increments of 2. 15 plotted points follow the pattern of a line that rises from left to right and passes through the points (6, 10), (8, 13), and (14, 2). There are other plotted points at (10, 15) and (13, 19). The regions of the horizontal axis from 1 to 6 and from 14 to 20 are shaded as unreasonable. The region from 6 to 14 is shaded as reasonable. All coordinates are approximate.]

In the figure above, we see that the observed values have x-values ranging from 6 to 14. So it would be reasonable to use the line of best fit to make a prediction for the x-value of 9 (because it is between 6 and 14), but it would be unreasonable to make a prediction for the x-value of 20 (because that is outside of the range).
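The interpolation rule described above amounts to a simple range check. The sketch below is one way to express it in Python; the function name `is_interpolated` is hypothetical, and treating the endpoints as inside the range is an assumption, since the text only says "between".

```python
def is_interpolated(x, observed_xs):
    """True when x lies between the smallest and largest observed x-values.

    Assumption: the endpoints themselves count as inside the range.
    """
    return min(observed_xs) <= x <= max(observed_xs)

# x-values named in the figure description: observations span 6 to 14
observed = [6, 8, 10, 13, 14]
print(is_interpolated(9, observed))   # True: 9 is between 6 and 14
print(is_interpolated(20, observed))  # False: 20 is outside the observed range
```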
Question
Erin is studying the relationship between the average number of minutes spent reading per day and math test scores and has collected the data shown in the table. The line of best fit for the data is ŷ=0.8x+51.2. According to the line of best fit, what would be the predicted test score for someone who spent 70 minutes reading? Is it reasonable to use this line of best fit to make this prediction?

Minutes      30   35   40   45   50
Test Score   75   78   85   88   90

The predicted test score is 95.2, and the estimate is not reasonable.
The predicted test score is 95.2, and the estimate is reasonable.
The predicted test score is 107.2, and the estimate is not reasonable.
The predicted test score is 107.2, and the estimate is reasonable.

Answer Explanation
Correct answer: The predicted test score is 107.2, and the estimate is not reasonable.
Substitute 70 for x in the line of best fit to estimate the test score for someone who spent 70 minutes reading: ŷ=0.8(70)+51.2=107.2. The data in the table only includes reading times between 30 and 50 minutes, so the line of best fit only gives reasonable predictions for values of x between 30 and 50. Since 70 is far outside of this range of values, the estimate is not reasonable. Another thing to notice is that it predicts a test score greater than 100, which is typically impossible.

Question
Data is collected on the relationship between the average number of minutes spent exercising per day and math test scores. The data is shown in the table and the line of best fit for the data is ŷ=0.42x+64.6. Assume the line of best fit is significant and there is a strong linear relationship between the variables.

Minutes      25   30   35   40
Test Score   75   77   80   81
The estimate, a predicted test score of 80.56, is reliable but unreasonable.
The estimate, a predicted test score of 80.56, is both unreliable and unreasonable.
The estimate, a predicted test score of 80.56, is unreliable but reasonable.

Answer Explanation
Correct answer: The estimate, a predicted test score of 80.56, is both reliable and reasonable.
The predicted score comes from substituting x=38 into the line of best fit: ŷ=0.42(38)+64.6=80.56. The data in the table only includes exercise times between 25 and 40 minutes, so the line of best fit gives reasonable predictions for values of x between 25 and 40. Since 38 is between these values, the estimate is both reliable and reasonable.

Question
Data is collected on the relationship between the average daily temperature and time spent watching television. The data is shown in the table and the line of best fit for the data is ŷ=−0.81x+96.7. Assume the line of best fit is significant and there is a strong linear relationship between the variables.

Temperature (Degrees)          30   40   50   60
Minutes Watching Television    73   63   57   48

(a) According to the line of best fit, what would be the predicted number of minutes spent watching television for an average daily temperature of 45 degrees? Round your answer to two decimal places.

Answer Explanation
Substitute 45 for x into the line of best fit to estimate the number of minutes spent watching television for an average daily temperature of 45 degrees: ŷ=−0.81(45)+96.7=60.25.

(b) Is it reasonable to use this line of best fit to make the above prediction?
The estimate, a predicted time of 60.25 minutes, is unreliable but reasonable.
The estimate, a predicted time of 60.25 minutes, is both reliable and reasonable.
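The line of best fit quoted for the temperature and television data can be checked against the four table values. The sketch below applies the standard least-squares formulas (slope = Sxy / Sxx, intercept = ȳ − slope·x̄) in plain Python; the function name `least_squares_line` is hypothetical. Excel's SLOPE and INTERCEPT worksheet functions compute the same two quantities, though the source does not say which Excel feature was used.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared x-deviations and sum of cross-products about the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Values from the temperature / television table
temperature = [30, 40, 50, 60]
minutes_tv = [73, 63, 57, 48]

slope, intercept = least_squares_line(temperature, minutes_tv)
prediction = slope * 45 + intercept  # predicted minutes at 45 degrees
print(round(slope, 2), round(intercept, 2), round(prediction, 2))  # -0.81 96.7 60.25
```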
Question
Homer is studying the relationship between the average daily temperature and time spent watching television and has collected the data shown in the table. The line of best fit for the data is ŷ=−0.6x+94.5.

Temperature (Degrees)          40   50   60   70
Minutes Watching Television    70   65   59   52

(a) According to the line of best fit, the predicted number of minutes spent watching television for an average daily temperature of 39 degrees is 71.1.

Answer Explanation
Substitute 39 for x into the line of best fit to estimate the number of minutes spent watching television for an average daily temperature of 39 degrees: ŷ=−0.6(39)+94.5=71.1.

(b) Is it reasonable to use this line of best fit to make the above prediction?
The estimate, a predicted time of 71.1 minutes, is both unreliable and unreasonable.
The estimate, a predicted time of 71.1 minutes, is both reliable and reasonable.
The estimate, a predicted time of 71.1 minutes, is unreliable but reasonable.
The estimate, a predicted time of 71.1 minutes, is reliable but unreasonable.

Answer Explanation
Correct answer: The estimate, a predicted time of 71.1 minutes, is unreliable but reasonable.
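The answer choices treat "reliable" and "reasonable" as two separate checks: a prediction is reliable when x falls inside the observed range, and reasonable when the predicted value is realistic. The sketch below encodes that reading for Homer's example. The function name `assess_prediction` and the plausibility bounds (0 to 1440 minutes in a day) are assumptions made for illustration and do not come from the source.

```python
def assess_prediction(x, y_hat, observed_xs, y_min, y_max):
    """Classify a prediction as (reliable, reasonable).

    reliable:   x lies within the observed range of x-values (interpolation)
    reasonable: the predicted value lies within plausible bounds [y_min, y_max]
    """
    reliable = min(observed_xs) <= x <= max(observed_xs)
    reasonable = y_min <= y_hat <= y_max
    return reliable, reasonable

# Homer's data: temperatures observed between 40 and 70 degrees
temps = [40, 50, 60, 70]
x = 39
y_hat = -0.6 * x + 94.5  # line of best fit from the question

# Assumed plausibility bounds: 0 minutes up to the 1440 minutes in a day
print(round(y_hat, 1), assess_prediction(x, y_hat, temps, 0, 1440))  # 71.1 (False, True)
```

With the same function, Erin's prediction of 107.2 at x = 70 (observed range 30 to 50, scores bounded by 0 and 100) comes out as neither reliable nor reasonable, matching the explanation given for that question.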