9 Data Analysis
9.1 Mean, Median, Mode and Range
In Unit 8, you were looking at ways of collecting and representing data. In this unit, you
will go one step further and find out how to calculate statistical quantities which
summarise the important characteristics of the data.
The mean, median and mode are three different ways of describing the average.
- To find the mean, add up all the numbers and divide by the number of numbers.
- To find the median, place all the numbers in order and select the middle number.
  If there are two middle numbers, the median is halfway between them.
- The mode is the number which appears most often.
- The range gives an idea of how the data are spread out and is the difference between
  the smallest and largest values.
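The four definitions above can be checked with a short Python sketch. The data list is hypothetical, chosen only for illustration, and the variable names are assumptions rather than anything defined in this unit.

```python
from statistics import mean, median, multimode

# Hypothetical data, chosen only for illustration (not taken from the text).
data = [4, 7, 3, 7, 9, 6]

data_mean = mean(data)              # add up all the numbers, divide by how many there are
data_median = median(data)          # middle value, or halfway between the two middle values
data_modes = multimode(data)        # list of the value(s) that appear most often
data_range = max(data) - min(data)  # largest value minus smallest value

print(data_mean, data_median, data_modes, data_range)
```

Here `multimode` returns a list because a set of data can have more than one mode.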
Worked Example 1
Find
(a) the mean (b) the median (c) the mode (d) the range
of this set of data.
Solution
(a) The mean is
(b) To find the median, place all the numbers in order.
As there are two middle numbers in this example, 5 and 6,
median = (5 + 6) / 2 = 5.5
(c) From the list above it is easy to see that 6 appears more than any other number, so
mode = 6.
(d) The range is the difference between the smallest and largest numbers, in this case
2 and 8. So the range is 8 − 2 = 6.
Worked Example 2
Five people play golf and at one hole their scores are
For these scores, find
(a) the mean (b) the median (c) the mode (d) the range.
Solution
(a) The mean is
(b) The numbers are already in order and the middle number is 4. So
median = 4.
(c) The score 4 occurs most often, so,
mode = 4.
(d) The range is the difference between the smallest and largest numbers, in this case
3 and 7, so
range = 7 − 3 = 4.
Exercises
1. Find the mean, median, mode and range of each set of numbers below.
(a) 3, 4, 7, 3, 5, 2, 6, 10
(b) 8, 10, 12, 14, 7, 16, 5, 7, 9, 11
(c) 17, 18, 16, 17, 17, 14, 22, 15, 16, 17, 14, 12
(d) 108, 99, 112, 111, 108
(e) 64, 66, 65, 61, 67, 61, 57
(f) 21, 30, 22, 16, 24, 28, 16, 17
2. Twenty children were asked their shoe sizes. The results are given below.
For this data, find
(a) the mean (b) the median (c) the mode (d) the range.
(b) What are the mean and range of the data?
(c) Richard's friend, Najir, also goes fishing. The mode of the number of fish
he has caught is also 0 and his range is 15.
What is the largest number of fish that Najir has caught?
8. A garage owner records the number of cars which visit his garage on 10 days.
The numbers are:
(a) Find the mean number of cars per day.
(b) The owner hopes that the mean will increase if he includes the number of
cars on the next day. If 252 cars use the garage on the next day, will the
mean increase or decrease?
9. The children in a class state how many children there are in their family.
The numbers they state are given below.
(a) Find the mean, median and mode for this data.
(b) Which is the most sensible average to use in this case?
10. The mean number of people visiting Jane each day over a five-day period is 8.
If 10 people visit Jane the next day, what happens to the mean?
11. The table shows the maximum and minimum temperatures recorded in six cities
one day last year.
City          Maximum   Minimum
Los Angeles   22 °C     12 °C
Boston        22 °C     −3 °C
Moscow        18 °C     −9 °C
Atlanta       27 °C     8 °C
Archangel     13 °C     −15 °C
Cairo         28 °C     13 °C
(a) Work out the range of temperature for Atlanta.
(b) Which city in the table had the lowest temperature?
(c) Work out the difference between the maximum temperature and the
minimum temperature for Moscow.
(LON)
12. The weights, in grams, of seven potatoes are
What is the median weight?
(SEG)
13. Here are the number of goals scored by a school football team in their matches this
term.
(a) Work out the mean number of goals.
(b) Work out the range of the number of goals scored.
(LON)
14. (a) The weights, in kilograms, of the 8 members of Hereward House tug of war
team at a school sports are:
Calculate the mean weight of the team.
(b) The 8 members of Nelson House tug of war team have a mean weight of
64 kilograms.
Which team do you think will win a tug of war between Hereward House
and Nelson House? Give a reason for your answer.
(MEG)
15. Pupils in Year 8 are arranged in eleven classes. The class sizes are
(a) What is the modal class size?
(b) Calculate the mean class size.
The range of the class sizes for Year 9 is 3.
(c) What does this tell you about the class sizes in Year 9 compared with those
in Year 8?
(SEG)
16. A school has to select one pupil to take part in a General Knowledge Quiz.
Kim and Pat took part in six trial quizzes. The following lists show their scores.
Kim   28   24   21   27   24   26
Pat   33   19   16   32   34   18
Kim had a mean score of 25 with a range of 7.
(a) Calculate Pat's mean score and range.
(b) Which pupil would you choose to represent the school? Explain the reason
for your choice, referring to the mean scores and ranges.
(MEG)
Information
The study of statistics was begun by an English mathematician, John Graunt (1620–1674).
He collected and studied the death records in various cities in Britain and, despite the fact
that people die randomly, he was fascinated by the patterns he found.
Worked Example 2
The bar chart shows how many cars were sold by a salesman over a period of time.
[Bar chart: Frequency against Cars sold per day]
Find the mean number of cars sold per day.
Solution
The data can be transferred to a table and a third column included as shown.
Cars sold daily   Frequency      Cars sold × Frequency
0                 2              0 × 2 = 0
1                 4              1 × 4 = 4
2                 3              2 × 3 = 6
3                 6              3 × 6 = 18
4                 3              4 × 3 = 12
5                 2              5 × 2 = 10
TOTALS            20             50
                  (Total days)   (Total number of cars sold)
Mean = 50 / 20 = 2.5 cars per day
Worked Example 3
A police station kept records of the number of road traffic accidents in their area each day
for 100 days. The figures below give the number of accidents per day.
Find the mean number of accidents per day.
Solution
The first step is to draw out and complete a tally chart. The final column shown below
can then be added and completed.
Number of Accidents   Tally                      Frequency   No. of Accidents × Frequency
0                     ||||                       4           0 × 4 = 0
1                     |||| ||||                  10          1 × 10 = 10
2                     |||| |||| |||| |||| ||     22          2 × 22 = 44
3                     |||| |||| |||| |||| |||    23          3 × 23 = 69
4                     |||| |||| |||| |           16          4 × 16 = 64
5                     |||| |||| |||| ||          17          5 × 17 = 85
6                     |||| |                     6           6 × 6 = 36
7                     |                          1           7 × 1 = 7
8                     |                          1           8 × 1 = 8
TOTALS                                           100         323
Mean number of accidents per day = 323 / 100 = 3.23
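The frequency-table method used in these worked examples (multiply each value by its frequency, add the products, then divide by the total frequency) can be sketched in Python. The function name and the dictionary layout are assumptions made for this illustration; the figures are those in the two tables above.

```python
def mean_from_frequency_table(table):
    """Return the mean of data given as {value: frequency}."""
    total_frequency = sum(table.values())
    total_of_values = sum(value * freq for value, freq in table.items())
    return total_of_values / total_frequency

# Cars sold per day and the number of days on which that happened.
cars_sold = {0: 2, 1: 4, 2: 3, 3: 6, 4: 3, 5: 2}
# Accidents per day and the number of days on which that happened.
accidents = {0: 4, 1: 10, 2: 22, 3: 23, 4: 16, 5: 17, 6: 6, 7: 1, 8: 1}

print(mean_from_frequency_table(cars_sold))   # 50 / 20
print(mean_from_frequency_table(accidents))   # 323 / 100
```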
Exercises
1. A survey of 100 households asked how many cars there were in each household.
The results are given below.
No. of Cars   Frequency
0             5
1             70
2             21
3             3
4             1
Calculate the mean number of cars per household.
2. The survey of question 1 also asked how many TV sets there were in each
household. The results are given below.
No. of TV Sets   Frequency
0                2
1                30
2                52
3                8
4                5
5                3
Calculate the mean number of TV sets per household.
7. Hannah drew this bar chart to show the number of repeated cards she got when she
opened packets of football stickers.
[Bar chart: Frequency against Number of repeats]
Calculate the mean number of repeats per packet.
8. In a season a football team scored a total of 55 goals. The table below gives a
summary of the number of goals per match.
Goals per Match   Frequency
0                 4
1                 6
2
3                 8
4                 2
5                 1
(a) In how many matches did they score 2 goals?
(b) Calculate the mean number of goals per match.
9. A traffic warden is trying to work out the mean number of parking tickets he has
issued per day. He produced the table below, but has accidentally rubbed out some
of the numbers.
Fill in the missing numbers and calculate the mean.
Tickets per day Frequency No. of Tickets × Frequency 0 1 1 1 2 10 3 7 4 20 5 2 6 TOTALS 26 72
10. Here are the weights, in kg, of 30 students.
(a) Copy and complete the frequency table below using a class interval of 10
and starting at 30.
Weight Range (w)   Tally   Frequency
30 ≤ w < 40
(b) Which class interval has the highest frequency?
(LON)
11. The number of children per family in a recent survey of 21 families is shown.
(a) What is the range in the number of children per family?
(b) Calculate the mean number of children per family. Show your working.
A similar survey was taken in 1960.
In 1960 the range in the number of children per family was 7 and the mean was 2.7.
(c) Describe two changes that have occurred in the number of children per
family since 1960.
(SEG)
9.3 Calculations with the Mean
This section considers calculations concerned with the mean, which is usually taken to
be the most important measure of the average of a set of data.
Worked Example 1
The mean of a sample of 6 numbers is 3.2. An extra value of 3.9 is included in the
sample. What is the new mean?
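One way to approach this kind of question is to recover the total from the mean (total = mean × number of values), adjust the total, and divide by the new number of values. The sketch below applies that idea to the figures in this example; the function names are assumptions made for illustration, and the second call uses hypothetical numbers rather than any exercise from the text.

```python
def mean_after_adding(old_mean, old_count, extra_value):
    """New mean once one extra value joins the sample."""
    old_total = old_mean * old_count
    return (old_total + extra_value) / (old_count + 1)

def extra_value_needed(old_mean, old_count, new_mean):
    """Extra value that changes the mean from old_mean to new_mean."""
    return new_mean * (old_count + 1) - old_mean * old_count

# Figures from the worked example: 6 numbers with mean 3.2, extra value 3.9.
print(round(mean_after_adding(3.2, 6, 3.9), 2))

# Hypothetical figures: 4 numbers with mean 5, and a new mean of 6.
print(extra_value_needed(5, 4, 6))
```

With the worked example's figures the total rises from 19.2 to 23.1, so the new mean is 23.1 / 7 = 3.3.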
8. The mean of 6 numbers is 12.3. When an extra number is added, the mean changes
to 11.9. What is the extra number?
9. When 5 is added to a set of 3 numbers the mean increases to 4.6. What was the
mean of the original 3 numbers?
10. Three numbers have a mean of 64. When a fourth number is included the mean is
doubled. What is the fourth number?
9.4 Mean, Median and Mode for Grouped Data
The mean and median can be estimated from tables of grouped data.
The class interval which contains the most values is known as the modal class.
Worked Example 1
The table below gives data on the heights, in cm, of 51 children.
Class Interval   140 ≤ h < 150   150 ≤ h < 160   160 ≤ h < 170   170 ≤ h < 180
Frequency        6               16              21              8
(a) Estimate the mean height. (b) Estimate the median height.
(c) Find the modal class.
Solution
(a) To estimate the mean, the mid-point of each interval should be used.
Class Interval    Mid-point   Frequency   Mid-point × Frequency
140 ≤ h < 150     145         6           145 × 6 = 870
150 ≤ h < 160     155         16          155 × 16 = 2480
160 ≤ h < 170     165         21          165 × 21 = 3465
170 ≤ h < 180     175         8           175 × 8 = 1400
Totals                        51          8215
Mean = 8215 / 51
= 161 (to the nearest cm)
(b) The median is the 26th value. In this case it lies in the 160 ≤ h < 170 class interval.
The first two intervals contain 6 + 16 = 22 values, so the 4th of the 21 values in this
interval is needed. It is estimated as
160 + (4/21) × 10 = 162 (to the nearest cm)
(c) The modal class is 160 ≤ h < 170 as it contains the most values.
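The mid-point method for the mean, and the choice of modal class, can be sketched in Python using the heights table from this example. The function names and the (lower bound, upper bound, frequency) layout are assumptions made for illustration.

```python
# Each class is (lower bound, upper bound, frequency), taken from the heights table.
heights = [(140, 150, 6), (150, 160, 16), (160, 170, 21), (170, 180, 8)]

def estimate_grouped_mean(classes):
    """Estimate the mean by treating every value as its class mid-point."""
    total_frequency = sum(freq for _, _, freq in classes)
    weighted_total = sum((low + high) / 2 * freq for low, high, freq in classes)
    return weighted_total / total_frequency

def modal_class(classes):
    """Return the bounds of the class that contains the most values."""
    low, high, _ = max(classes, key=lambda c: c[2])
    return (low, high)

print(round(estimate_grouped_mean(heights)))  # 8215 / 51, to the nearest cm
print(modal_class(heights))
```

The result is only an estimate because the individual heights inside each class are not known.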
Also note that when we speak of someone by age, say 8, then the person could be any age
from 8 years 0 days up to 8 years 364 days (365 in a leap year!). You will see how this is
tackled in the following example.
Worked Example 2
The ages of children in a primary school were recorded in the table below.
Age         5 – 6   7 – 8   9 – 10
Frequency   29      40      38
(a) Estimate the mean. (b) Estimate the median. (c) Find the modal age.
Solution
(a) To estimate the mean, we must use the mid-point of each interval. For example,
'5 – 6' really means
5 ≤ age < 7,
so the mid-point is taken as 6.
Class Interval Mid-point Frequency Mid-point × Frequency
5 – 6 6 29 6 × 29 = 174
7 – 8 8 40 8 × 40 = 320
9 – 10 10 38 10 × 38 = 380
Totals 107 874
Mean = 874 / 107
= 8.2 (to 1 decimal place)
(b) The median is given by the 54th value, which we have to estimate. There are 29
values in the first interval, so we need to estimate the 25th value in the second
interval. As there are 40 values in the second interval, the median is estimated as
being 25/40 of the way along the second interval. This has width 9 − 7 = 2 years, so the
median is estimated by
(25/40) × 2 = 1.25
from the start of the interval. Therefore the median is estimated as
7 + 1.25 = 8.25 years.
(c) The modal age is the 7 – 8 age group.
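The median estimate in part (b) can be sketched as a small Python function. It follows the same steps as the solution: find the position of the median, find the interval containing it, then move the matching fraction of the way along that interval. The function name is an assumption, and taking the position as (n + 1) / 2 is an assumption that matches both worked examples, where n is odd.

```python
def estimate_grouped_median(classes):
    """classes: list of (lower bound, upper bound, frequency) in ascending order."""
    total = sum(freq for _, _, freq in classes)
    position = (total + 1) / 2          # e.g. the 54th value when there are 107
    seen = 0                            # values counted in earlier intervals
    for low, high, freq in classes:
        if seen + freq >= position:
            fraction = (position - seen) / freq
            return low + fraction * (high - low)
        seen += freq
    raise ValueError("the table contains no data")

# Ages written with their true boundaries: '5 - 6' means 5 <= age < 7, and so on.
ages = [(5, 7, 29), (7, 9, 40), (9, 11, 38)]
# Heights table from Worked Example 1.
heights = [(140, 150, 6), (150, 160, 16), (160, 170, 21), (170, 180, 8)]

print(estimate_grouped_median(ages))
print(round(estimate_grouped_median(heights)))
```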
But the length of the second interval is 10.5 − 5.5 = 5, so the median is estimated by
(fraction of the way along the interval) × 5 = 3.86
from the start of this interval. Therefore the median is estimated as
5.5 + 3.86 = 9.36.
(c) The modal class is 1–5, as this class contains the most entries.
Exercises
1. A door to door salesman keeps a record of the number of homes he visits each day.
Homes visited   0 – 9   10 – 19   20 – 29   30 – 39   40 – 49
Frequency       3       8         24        60        21
(a) Estimate the mean number of homes visited.
(b) Estimate the median.
(c) What is the modal class?
2. The weights of a number of students were recorded in kg.
Weight (kg)   30 ≤ w < 35   35 ≤ w < 40   40 ≤ w < 45   45 ≤ w < 50   50 ≤ w < 55
Frequency     10            11            15            7             4
(a) Estimate the mean weight. (b) Estimate the median.
(c) What is the modal class?
3. A stopwatch was used to find the time that it took a group of children to run 100 m.
Time (seconds)   10 ≤ t < 15   15 ≤ t < 20   20 ≤ t < 25   25 ≤ t < 30
Frequency        6             16            21            8
(a) Is the median in the modal class? (b) Estimate the mean.
(c) Estimate the median.
(d) Is the median greater or less than the mean?
4. The distances that children in a year group travelled to school are recorded.
Distance (km)   0 ≤ d < 0.5   0.5 ≤ d < 1.0   1.0 ≤ d < 1.5   1.5 ≤ d < 2.0
Frequency       30            22              19              8
(a) Does the modal class contain the median?
(b) Estimate the median and the mean.
(c) Which is the larger, the median or the mean?
5. The ages of the children at a youth camp are summarised in the table below.
Age (years)   6 – 8   9 – 11   12 – 14   15 – 17
Frequency     8       22       29        5
Estimate the mean age of the children.
6. The lengths of a number of leaves collected for a project are recorded.
Length (cm)   2 – 5   6 – 10   11 – 15   16 – 25
Frequency     8       20       42        12
Estimate (a) the mean (b) the median length of a leaf.
7. The table shows how many nights people spend at a campsite.
Number of nights   1 – 5   6 – 10   11 – 15   16 – 20   21 – 25
Frequency          20      26       32        5         2
(a) Estimate the mean. (b) Estimate the median.
(c) What is the modal class?
8. (a) A teacher notes the number of correct answers given by a class on
a multiple-choice test.
Correct answers   1 – 10   11 – 20   21 – 30   31 – 40   41 – 50
Frequency         2        8         15        11        3
(i) Estimate the mean. (ii) Estimate the median.
(iii) What is the modal class?
(b) Another class took the same test. Their results are given below.
Correct answers   1 – 10   11 – 20   21 – 30   31 – 40   41 – 50
Frequency         3        14        20        2         1
(i) Estimate the mean. (ii) Estimate the median.
(iii) What is the modal class?
(c) How do the results for the two classes compare?
Information
A quartile is one of 3 values (lower quartile, median and upper quartile) which divide
data into 4 equal groups.
A percentile is one of 99 values which divide data into 100 equal groups.
The lower quartile corresponds to the 25th percentile. The median corresponds to the
50th percentile. The upper quartile corresponds to the 75th percentile.
Viewing time (h hours)   Mid-point   Frequency   Mid-point × Frequency
0 ≤ h < 10               5           13          65
10 ≤ h < 20              15          27          405
20 ≤ h < 30              25          33          825
30 ≤ h < 40              35
40 ≤ h < 50              45
50 ≤ h < 60              55
(SEG)
11. In an experiment, 50 people were asked to estimate the length of a rod to the
nearest centimetre. The results were recorded.
Length (cm)   20   21   22   23   24   25   26   27   28   29
Frequency     0    4    6    7    9    10   7    5    2    0
(a) Find the value of the median. (b) Calculate the mean length.
(c) In a second experiment another 50 people were asked to estimate the length
of the same rod. The most common estimate was 23 cm. The range of the
estimates was 13 cm.
Make two comparisons between the results of the two experiments.
(SEG)
12. The following list shows the maximum daily temperature, in °F, throughout the
month of April.
(a) Copy and complete the grouped frequency table below.
Temperature, T   Frequency
40 < T ≤ 50
50 < T ≤ 54
54 < T ≤ 58
58 < T ≤ 62
62 < T ≤ 70
(b) Use the table of values in part (a) to calculate an estimate of the mean of this
distribution. You must show your working clearly.
(c) Draw a histogram to represent your distribution in part (a).
(MEG)
9.5 Cumulative Frequency
Cumulative frequencies are useful if more detailed information is required about a set of
data. In particular, they can be used to find the median and inter-quartile range.
The inter-quartile range contains the middle 50% of the sample and describes how
spread out the data are. This is illustrated in Example 2.
Worked Example 1
For the data given in the table, draw up a
cumulative frequency table and then draw
a cumulative frequency graph.
Solution
The table below shows how to calculate
the cumulative frequencies.
Height (cm)       Frequency   Cumulative Frequency
90 < h ≤ 100      5           5
100 < h ≤ 110     22          5 + 22 = 27
110 < h ≤ 120     30          27 + 30 = 57
120 < h ≤ 130     31          57 + 31 = 88
130 < h ≤ 140     18          88 + 18 = 106
140 < h ≤ 150     6           106 + 6 = 112
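The running totals in the cumulative frequency table can be reproduced with a short Python sketch. The variable names are assumptions made for illustration. Each cumulative frequency is paired with the upper end of its class interval, and the starting point (90, 0) is included because no heights are 90 cm or less.

```python
from itertools import accumulate

# Upper end of each class interval and the frequency in that interval.
upper_bounds = [100, 110, 120, 130, 140, 150]
frequencies = [5, 22, 30, 31, 18, 6]

# Running total of the frequencies: 5, 5 + 22, 5 + 22 + 30, ...
cumulative = list(accumulate(frequencies))

# Points for the cumulative frequency graph, starting from (90, 0).
points = [(90, 0)] + list(zip(upper_bounds, cumulative))

print(cumulative)
print(points)
```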
A graph can then be plotted using points as shown below.
[Cumulative frequency graph: Cumulative Frequency (vertical axis, 0 to 120) against Height (cm) (horizontal axis, 90 to 150), through the points (90, 0), (100, 5), (110, 27), (120, 57), (130, 88), (140, 106) and (150, 112)]
Height (cm)       Frequency
90 < h ≤ 100      5
100 < h ≤ 110     22
110 < h ≤ 120     30
120 < h ≤ 130     31
130 < h ≤ 140     18
140 < h ≤ 150     6