



















































238 M ATHEMATICS
Every day we come across a lot of information in the form of facts, numerical figures, tables, graphs, etc. These are provided by newspapers, television, magazines and other means of communication. These may relate to cricket batting or bowling averages, profits of a company, temperatures of cities, expenditures in various sectors of a five year plan, polling results, and so on. These facts or figures, which are numerical or otherwise, collected with a definite purpose are called data. Data is the plural form of the Latin word datum. Of course, the word ‘data’ is not new for you. You have studied about data and data handling in earlier classes.
Our world is becoming more and more information oriented. Every part of our lives utilises data in one form or the other. So, it becomes essential for us to know how to extract meaningful information from such data. This extraction of meaningful information is studied in a branch of mathematics called Statistics.
The word ‘statistics’ appears to have been derived from the Latin word ‘status’ meaning ‘a (political) state’. In its origin, statistics was simply the collection of data on different aspects of the life of people, useful to the State. Over time, however, its scope broadened and statistics began to concern itself not only with the collection and presentation of data but also with the interpretation and drawing of inferences from the data. Statistics deals with collection, organisation, analysis and interpretation of data. The word ‘statistics’ has different meanings in different contexts. Let us observe the following sentences:
In the first sentence, statistics is used in a plural sense, meaning numerical data. These may include a number of educational institutions of India, literacy rates of various states, etc. In the second sentence, the word ‘statistics’ is used as a singular noun, meaning the subject which deals with the collection, presentation, analysis of data as well as drawing of meaningful conclusions from the data.
In this chapter, we shall briefly discuss all these aspects regarding data.
Let us begin with an exercise on gathering data by performing the following activity.
Activity 1 : Divide the students of your class into four groups. Allot each group the work of collecting one of the following kinds of data:
(i) Heights of 20 students of your class.
(ii) Number of absentees in each day in your class for a month.
(iii) Number of members in the families of your classmates.
(iv) Heights of 15 plants in or around your school.
Let us move to the results students have gathered. How did they collect their data in each group?
(i) Did they collect the information from each and every student, house or person concerned for obtaining the information?
(ii) Did they get the information from some source like available school records?
In the first case, when the information was collected by the investigator herself or himself with a definite objective in her or his mind, the data obtained is called primary data.
In the second case, when the information was gathered from a source which already had the information stored, the data obtained is called secondary data. Such data, which has been collected by someone else in another context, needs to be used with great care ensuring that the source is reliable.
By now, you must have understood how to collect data and distinguish between primary and secondary data.
1. Give five examples of data that you can collect from your day-to-day life.
2. Classify the data in Q.1 above as primary or secondary data.
in a table, as given below:
Table 14.1
Marks    Number of students (i.e., the frequency)
10       1
20       1
36       3
40       4
50       3
56       2
60       4
70       4
72       1
80       1
88       2
92       3
95       1
Total    30
Table 14.1 is called an ungrouped frequency distribution table, or simply a frequency distribution table. Note that you can also use tally marks in preparing these tables, as in the next example.
Example 3 : 100 plants each were planted in 100 schools during Van Mahotsava. After one month, the number of plants that survived were recorded as :
95 67 28 32 65 65 69 33 98 96
76 42 32 38 42 40 40 69 95 92
75 83 76 83 85 62 37 65 63 42
89 65 73 81 49 52 64 76 83 92
93 68 52 79 81 83 59 82 75 82
86 90 44 62 31 36 38 42 39 83
87 56 58 23 35 76 83 85 30 68
69 83 86 43 45 39 83 75 66 83
92 75 89 66 91 27 88 89 93 42
53 69 90 55 66 49 52 83 34 36
To present such a large amount of data so that a reader can make sense of it easily, we condense it into groups like 20-29, 30-39, ..., 90-99 (since our data is from 23 to 98). These groupings are called ‘classes’ or ‘class-intervals’, and their size is called the class-size or class width, which is 10 in this case. In each of these classes, the least number is called the lower class limit and the greatest number is called the upper class limit, e.g., in 20-29, 20 is the ‘lower class limit’ and 29 is the ‘upper class limit’.
Also, recall that using tally marks, the data above can be condensed in tabular form as follows:
Table 14.
Number of plants survived    Tally Marks                 Number of schools (frequency)
20 - 29                      |||                         3
30 - 39                      |||| |||| ||||              14
40 - 49                      |||| |||| ||                12
50 - 59                      |||| |||                    8
60 - 69                      |||| |||| |||| |||          18
70 - 79                      |||| ||||                   10
80 - 89                      |||| |||| |||| |||| |||     23
90 - 99                      |||| |||| ||                12
Total                                                    100
Presenting data in this form simplifies and condenses data and enables us to observe certain important features at a glance. This is called a grouped frequency distribution table. Here we can easily observe that 50% or more plants survived in 8 + 18 + 10 + 23 + 12 = 71 schools.
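The condensing step described above can be checked mechanically. The following is a minimal sketch in Python, assuming classes of size 10 that start at multiples of 10, as in the table above; the names `survived`, `CLASS_SIZE` and `class_interval` are illustrative choices, not part of the text.

```python
from collections import Counter

# Number of surviving plants in each of the 100 schools (data of Example 3).
survived = [
    95, 67, 28, 32, 65, 65, 69, 33, 98, 96,
    76, 42, 32, 38, 42, 40, 40, 69, 95, 92,
    75, 83, 76, 83, 85, 62, 37, 65, 63, 42,
    89, 65, 73, 81, 49, 52, 64, 76, 83, 92,
    93, 68, 52, 79, 81, 83, 59, 82, 75, 82,
    86, 90, 44, 62, 31, 36, 38, 42, 39, 83,
    87, 56, 58, 23, 35, 76, 83, 85, 30, 68,
    69, 83, 86, 43, 45, 39, 83, 75, 66, 83,
    92, 75, 89, 66, 91, 27, 88, 89, 93, 42,
    53, 69, 90, 55, 66, 49, 52, 83, 34, 36,
]

CLASS_SIZE = 10  # class width used in the text


def class_interval(value, class_size=CLASS_SIZE):
    """Return (lower class limit, upper class limit) of the class containing value."""
    lower = (value // class_size) * class_size
    return lower, lower + class_size - 1


# Tally every observation into its class.
frequency = Counter(class_interval(v) for v in survived)

for (lower, upper), count in sorted(frequency.items()):
    print(f"{lower} - {upper}: {count}")

# Schools in which 50% or more of the 100 plants survived.
schools_half_or_more = sum(1 for v in survived if v >= 50)
print("Schools with 50% or more survival:", schools_half_or_more)
```

Changing `CLASS_SIZE` gives more classes of shorter size or fewer classes of larger size; because every value falls into exactly one interval, the classes never overlap.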
We observe that the classes in the table above are non-overlapping. Note that we could have made more classes of shorter size, or fewer classes of larger size also. For instance, the intervals could have been 22-26, 27-31, and so on. So, there is no hard and fast rule about this except that the classes should not overlap.
Example 4 : Let us now consider the following frequency distribution table which gives the weights of 38 students of a class:
Now it is possible for us to include the weights of the new students in these classes. But, another problem crops up because 35.5 appears in both the classes 30.5 - 35.5 and 35.5 - 40.5. In which class do you think this weight should be considered?
If it is considered in both classes, it will be counted twice.
By convention, we consider 35.5 in the class 35.5 - 40.5 and not in 30.5 - 35.5. Similarly, 40.5 is considered in 40.5 - 45.5 and not in 35.5 - 40.5.
So, the new weights 35.5 kg and 40.5 kg would be included in 35.5 - 40.5 and 40.5 - 45.5, respectively. Now, with these assumptions, the new frequency distribution table will be as shown below:
Table 14.
Weights (in kg)    Number of students
30.5 - 35.5        9
35.5 - 40.5        6
40.5 - 45.5        15
45.5 - 50.5        3
50.5 - 55.5        1
55.5 - 60.5        2
60.5 - 65.5        2
65.5 - 70.5        1
70.5 - 75.5        1
Total              40
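The convention used here (a weight equal to a common class limit is counted in the higher class) amounts to treating each class as including its lower limit and excluding its upper limit. A minimal sketch in Python, assuming the first class starts at 30.5 kg and every class has size 5 kg; the names `FIRST_LOWER_LIMIT`, `CLASS_SIZE` and `class_of` are illustrative only.

```python
import math

FIRST_LOWER_LIMIT = 30.5  # lower limit of the first class, in kg
CLASS_SIZE = 5            # width of every class, in kg


def class_of(weight):
    """Return (lower limit, upper limit) of the class containing weight.

    The lower limit is included in the class and the upper limit is excluded,
    so a weight lying on a shared limit goes to the higher class.
    """
    index = math.floor((weight - FIRST_LOWER_LIMIT) / CLASS_SIZE)
    lower = FIRST_LOWER_LIMIT + index * CLASS_SIZE
    return lower, lower + CLASS_SIZE


print(class_of(35.5))  # (35.5, 40.5)
print(class_of(40.5))  # (40.5, 45.5)
```

Because each weight is mapped to exactly one class, no observation is counted twice.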
Now, let us move to the data collected by you in Activity 1. This time we ask you to present these as frequency distribution tables.
Activity 2 : Continuing with the same four groups, change your data to frequency distribution tables. Choose convenient classes with suitable class-sizes, keeping in mind the range of the data and the type of data.
1. The blood groups of 30 students of Class VIII are recorded as follows:
A, B, O, O, AB, O, A, O, B, A, O, B, A, O, O,
A, AB, O, A, A, O, O, AB, B, A, O, B, A, B, O.
Represent this data in the form of a frequency distribution table. Which is the most common, and which is the rarest, blood group among these students?
2. The distances (in km) of 40 engineers from their residence to their place of work were found as follows:
5 3 10 20 25 11 13 7 12 31
19 10 12 17 18 11 32 17 16 2
7 9 7 8 3 5 12 15 18 3
12 14 2 9 6 15 15 7 6 12
Construct a grouped frequency distribution table with class size 5 for the data given above taking the first interval as 0-5 (5 not included). What main features do you observe from this tabular representation?
3. The relative humidity (in %) of a certain city for a month of 30 days was as follows:
98.1 98.6 99.2 90.3 86.5 95.3 92.9 96.3 94.2 95.
89.2 92.3 97.1 93.5 92.7 95.1 97.2 93.3 95.2 97.
96.2 92.1 84.9 90.2 95.7 98.3 97.3 96.1 92.1 89
(i) Construct a grouped frequency distribution table with classes 84 - 86, 86 - 88, etc.
(ii) Which month or season do you think this data is about?
(iii) What is the range of this data?
4. The heights of 50 students, measured to the nearest centimetre, have been found to be as follows:
161 150 154 165 168 161 154 162 150 151
162 164 171 165 158 154 156 172 160 170
153 159 161 170 162 165 166 168 165 164
154 152 153 156 158 162 160 161 173 166
161 159 162 167 168 159 158 153 154 159
(i) Represent the data given above by a grouped frequency distribution table, taking the class intervals as 160 - 165, 165 - 170, etc.
(ii) What can you conclude about their heights from the table?
5. A study was conducted to find out the concentration of sulphur dioxide in the air in
The representation of data by tables has already been discussed. Now let us turn our attention to another representation of data, i.e., the graphical representation. It is well said that one picture is better than a thousand words. Usually comparisons among the individual items are best shown by means of graphs. The representation then becomes easier to understand than the actual data. We shall study the following graphical representations in this section.
(A) Bar graphs
(B) Histograms of uniform width, and of varying widths
(C) Frequency polygons
(A) Bar Graphs
In earlier classes, you have already studied and constructed bar graphs. Here we shall discuss them through a more formal approach. Recall that a bar graph is a pictorial representation of data in which usually bars of uniform width are drawn with equal spacing between them on one axis (say, the x-axis), depicting the variable. The values of the variable are shown on the other axis (say, the y-axis) and the heights of the bars depend on the values of the variable.
Example 5 : In a particular section of Class IX, 40 students were asked about the months of their birth and the following graph was prepared for the data so obtained:
Fig. 14.
Observe the bar graph given above and answer the following questions:
(i) How many students were born in the month of November?
(ii) In which month were the maximum number of students born?
Solution : Note that the variable here is the ‘month of birth’, and the value of the variable is the ‘Number of students born’.
(i) 4 students were born in the month of November.
(ii) The maximum number of students were born in the month of August.
Let us now recall how a bar graph is constructed by considering the following example.
Example 6 : A family with a monthly income of Rs 20,000 had planned the following expenditures per month under various heads:
Table 14.
Heads                    Expenditure (in thousand rupees)
Grocery                  4
Rent                     5
Education of children    5
Medicine                 2
Fuel                     2
Entertainment            1
Miscellaneous            1
Draw a bar graph for the data above.
Solution : We draw the bar graph of this data in the following steps. Note that the unit in the second column is thousand rupees. So, ‘4’ against ‘grocery’ means Rs 4000.
Let us represent the data given above graphically as follows:
(i) We represent the weights on the horizontal axis on a suitable scale. We can choose the scale as 1 cm = 5 kg. Also, since the first class interval is starting from 30.5 and not zero, we show it on the graph by marking a kink or a break on the axis.
(ii) We represent the number of students (frequency) on the vertical axis on a suitable scale. Since the maximum frequency is 15, we need to choose the scale to accommodate this maximum frequency.
(iii) We now draw rectangles (or rectangular bars) of width equal to the class-size and lengths according to the frequencies of the corresponding class intervals. For example, the rectangle for the class interval 30.5 - 35.5 will be of width 1 cm and length 4.5 cm.
(iv) In this way, we obtain the graph as shown in Fig. 14.3:
Fig. 14.3
Observe that since there are no gaps in between consecutive rectangles, the resultant graph appears like a solid figure. This is called a histogram, which is a graphical representation of a grouped frequency distribution with continuous classes. Also, unlike a bar graph, the width of the bar plays a significant role in its construction.
Here, in fact, areas of the rectangles erected are proportional to the corresponding frequencies. However, since the widths of the rectangles are all equal, the lengths of the rectangles are proportional to the frequencies. That is why, we draw the lengths according to (iii) above.
Now, consider a situation different from the one above.
Example 7 : A teacher wanted to analyse the performance of two sections of students in a mathematics test of 100 marks. Looking at their performances, she found that a few students got under 20 marks and a few got 70 marks or above. So she decided to group them into intervals of varying sizes as follows: 0 - 20, 20 - 30, ..., 60 - 70, 70 - 100. Then she formed the following table:
Table 14.
Marks         Number of students
0 - 20        7
20 - 30       10
30 - 40       10
40 - 50       20
50 - 60       20
60 - 70       15
70 - above    8
Total         90
A histogram for this table was prepared by a student as shown in Fig. 14.4.
Fig. 14.4
Since we have calculated these lengths for an interval of 10 marks in each case, we may call these lengths as “proportion of students per 10 marks interval”.
So, the correct histogram with varying width is given in Fig. 14.5.
Fig. 14.5
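The lengths of the rectangles in a histogram of varying widths can be worked out as in the following minimal sketch in Python. It assumes the rule stated earlier that areas must be proportional to frequencies, so each length is taken as (frequency / class width) × 10, where 10 is the smallest class size in the table; the last class is taken as 70 - 100, as in the statement of Example 7. The names `marks_table` and `adjusted_length` are illustrative only.

```python
from fractions import Fraction

# (lower limit, upper limit, number of students); last class taken as 70 - 100.
marks_table = [
    (0, 20, 7),
    (20, 30, 10),
    (30, 40, 10),
    (40, 50, 20),
    (50, 60, 20),
    (60, 70, 15),
    (70, 100, 8),
]

# Smallest class size; lengths are expressed per interval of this size.
min_class_size = min(upper - lower for lower, upper, _ in marks_table)


def adjusted_length(lower, upper, frequency, base=min_class_size):
    """Length of the rectangle: frequency per `base` marks of class width."""
    return Fraction(frequency, upper - lower) * base


for lower, upper, frequency in marks_table:
    length = adjusted_length(lower, upper, frequency)
    print(f"{lower} - {upper}: width {upper - lower}, length {float(length):.2f}")
```

Classes of width 10 keep their frequency as the length, while the wider classes 0 - 20 and 70 - 100 get shorter rectangles, so that width × length stays proportional to the number of students.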
There is yet another visual way of representing quantitative data and its frequencies. This is a polygon. To see what we mean, consider the histogram represented by Fig. 14.3. Let us join the mid-points of the upper sides of the adjacent rectangles of this histogram by means of line segments. Let us call these mid-points B, C, D, E, F and G. When joined by line segments, we obtain the figure BCDEFG (see Fig. 14.6). To complete the polygon, we assume that there is a class interval with frequency zero before 30.5 - 35.5, and one after 55.5 - 60.5, and their mid-points are A and H, respectively. ABCDEFGH is the frequency polygon corresponding to the data shown in Fig. 14.3. We have shown this in Fig. 14.6.
Fig. 14.6
Although there exists no class preceding the lowest class and no class succeeding the highest class, the addition of the two class intervals with zero frequency enables us to make the area of the frequency polygon the same as the area of the histogram. Why is this so? (Hint: Use the properties of congruent triangles.)
Now, the question arises: how do we complete the polygon when there is no class preceding the first class? Let us consider such a situation.
Example 8 : Consider the marks, out of 100, obtained by 51 students of a class in a test, given in Table 14.9.
Frequency polygons can also be drawn independently without drawing histograms. For this, we require the mid-points of the class-intervals used in the data. These mid-points of the class-intervals are called class-marks.
To find the class-mark of a class interval, we find the sum of the upper limit and lower limit of a class and divide it by 2. Thus,
$$\text{Class-mark} = \frac{\text{Upper limit} + \text{Lower limit}}{2}$$
Let us consider an example.
Example 9 : In a city, the weekly observations made in a study on the cost of living index are given in the following table:
Table 14.
Cost of living index    Number of weeks
140 - 150               5
150 - 160               10
160 - 170               20
170 - 180               9
180 - 190               6
190 - 200               2
Total                   52
Draw a frequency polygon for the data above (without constructing a histogram).
Solution : Since we want to draw a frequency polygon without a histogram, let us find the class-marks of the classes given above, that is of 140 - 150, 150 - 160, ....
For 140 - 150, the upper limit = 150, and the lower limit = 140
So, the class-mark = $\frac{150 + 140}{2} = \frac{290}{2} = 145$
Continuing in the same manner, we find the class-marks of the other classes as well.
So, the new table obtained is as shown in the following table:
Table 14.
Classes      Class-marks    Frequency
140 - 150    145            5
150 - 160    155            10
160 - 170    165            20
170 - 180    175            9
180 - 190    185            6
190 - 200    195            2
Total                       52
We can now draw a frequency polygon by plotting the class-marks along the horizontal axis, the frequencies along the vertical axis, and then plotting and joining the points B(145, 5), C(155, 10), D(165, 20), E(175, 9), F(185, 6) and G(195, 2) by line segments. We should not forget to plot the point corresponding to the class-mark of the class 130 - 140 (just before the lowest class 140 - 150) with zero frequency, that is, A(135, 0), and the point H(205, 0), which occurs immediately after G(195, 2). So, the resultant frequency polygon will be ABCDEFGH (see Fig. 14.8).
Fig. 14.8
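The construction in Example 9 can be sketched in Python as follows. This is only an illustration, assuming all classes have the same size; the names `classes`, `class_mark` and `polygon_vertices` are hypothetical. It computes the class-marks, adds the two zero-frequency end points, and checks that the area under the polygon equals the area of the corresponding histogram.

```python
# (lower limit, upper limit, number of weeks) for the cost of living index.
classes = [
    (140, 150, 5),
    (150, 160, 10),
    (160, 170, 20),
    (170, 180, 9),
    (180, 190, 6),
    (190, 200, 2),
]


def class_mark(lower, upper):
    """Mid-point of a class: (upper limit + lower limit) / 2."""
    return (upper + lower) / 2


def polygon_vertices(table):
    """Vertices of the frequency polygon, including the zero-frequency ends.

    Assumes every class in the table has the same class size.
    """
    class_size = table[0][1] - table[0][0]
    points = [(class_mark(lower, upper), freq) for lower, upper, freq in table]
    start = (points[0][0] - class_size, 0)   # class just before the lowest class
    end = (points[-1][0] + class_size, 0)    # class just after the highest class
    return [start] + points + [end]


vertices = polygon_vertices(classes)
for label, (x, y) in zip("ABCDEFGH", vertices):
    print(f"{label}({x:g}, {y})")

# Area under the polygon (sum of trapezoids) and area of the histogram.
polygon_area = sum(
    (x2 - x1) * (y1 + y2) / 2
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:])
)
histogram_area = sum((upper - lower) * freq for lower, upper, freq in classes)
print(polygon_area, histogram_area)
```

The two areas agree because the end points A and H sit on the horizontal axis, which is why the zero-frequency classes are added before joining the points.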