Anscombe's Quartet: Comparing Four Data Sets, Summaries of Descriptive statistics

Information about Anscombe's Quartet, a famous example in statistics that demonstrates the importance of visualizing data before drawing conclusions from descriptive statistics. The document contains the four data sets, their descriptive statistics, and teacher's notes that explain the similarities and differences between them.

Typology: Summaries

2021/2022

Uploaded on 09/12/2022

jackie4
NV 07/06/21
Version 1.0
1 of 2
Comparing data sets: Anscombe's Quartet
Set 1          Set 2          Set 3          Set 4
x     y        x     y        x     y        x     y
10    8.04     10    9.14     10    7.46     8     6.58
8     6.95     8     8.14     8     6.77     8     5.76
13    7.58     13    8.74     13    12.74    8     7.71
9     8.81     9     8.77     9     7.11     8     8.84
11    8.33     11    9.26     11    7.81     8     8.47
14    9.96     14    8.1      14    8.84     8     7.04
6     7.24     6     6.13     6     6.08     8     5.25
4     4.26     4     3.1      4     5.39     19    12.5
12    10.84    12    9.13     12    8.15     8     5.56
7     4.82     7     7.26     7     6.42     8     7.91
5     5.68     5     4.74     5     5.73     8     6.89

1. Compare these data sets by calculating appropriate descriptive statistics.
2. What’s the same & what’s different?
3. How would you describe the correlation of each data set?
4. How else could you investigate these data sets?

Teacher’s notes:

Desmos file of data sets: https://www.desmos.com/calculator/6tdarsx61s

These data are Anscombe’s Quartet. All 4 data sets have the same descriptive statistics to 2dp:

  • mean x value is 9 for each dataset
  • mean y value is 7.50 for each dataset
  • Assuming each data set is a population: the variance for x is 10.0 & the variance for y is 3.75 to 3sf
  • If students assume each data set is a sample: the variance for x is 11.0 and the variance for y is about 4.13 to 3sf (4.12 for Sets 3 and 4)
  • The correlation between x and y is 0.816 for each dataset
  • A linear regression (line of best fit) for each dataset follows the equation y = 0.5x + 3

Without plotting these data sets as scatter graphs, students may conclude that the PMCC value of 0.816 implies strong positive linear correlation and that the data sets are indistinguishable from each other. These data sets are a great way to reinforce the importance of looking at data before drawing conclusions from descriptive statistics.
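
The statistics listed above can be checked with a short script. The following is a minimal sketch in Python, not part of the original worksheet: the names `DATA`, `summarise` and `SUMMARY` are illustrative choices, and the data values are transcribed from the worksheet table. It uses only the standard library and computes the PMCC and the least-squares line directly from the sums of squares.

```python
from math import sqrt
from statistics import mean, pvariance, variance

# Anscombe's Quartet, transcribed from the worksheet table.
X_123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # Sets 1-3 share the same x values
DATA = {
    "Set 1": (X_123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                      7.24, 4.26, 10.84, 4.82, 5.68]),
    "Set 2": (X_123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.1,
                      6.13, 3.1, 9.13, 7.26, 4.74]),
    "Set 3": (X_123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                      6.08, 5.39, 8.15, 6.42, 5.73]),
    "Set 4": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
              [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
               5.25, 12.5, 5.56, 7.91, 6.89]),
}


def summarise(x, y):
    """Return the summary statistics quoted in the notes for one data set."""
    mx, my = mean(x), mean(y)
    # Sums of squares and cross-products about the means.
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx  # gradient of the least-squares line of y on x
    return {
        "mean_x": mx,
        "mean_y": my,
        "pvar_x": pvariance(x),  # population variance (divide by n)
        "pvar_y": pvariance(y),
        "svar_x": variance(x),   # sample variance (divide by n - 1)
        "svar_y": variance(y),
        "pmcc": sxy / sqrt(sxx * syy),
        "slope": slope,
        "intercept": my - slope * mx,
    }


SUMMARY = {name: summarise(x, y) for name, (x, y) in DATA.items()}

if __name__ == "__main__":
    for name, stats in SUMMARY.items():
        print(name, {key: round(value, 3) for key, value in stats.items()})
```

Running the script prints one line of statistics per set. The values agree closely rather than exactly (for example, the regression intercepts differ in the third decimal place), which is worth pointing out to students who compare their own calculations.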