









Mathematics in the modern world
Typology: Study notes










Data are a set of facts and provide a partial picture of reality. They are usually collected in a raw format and thus the inherent information is difficult to understand. Therefore, raw data need to be summarized, processed, and analyzed. However, no matter how well manipulated, the information derived from the raw data should be presented in an effective format. Otherwise, it would be a great loss for both authors and readers. Text, tables, and graphs for data and information presentation are very powerful communication tools. They can make a data presentation easy to understand, attract and sustain the interest of readers, and efficiently present large amounts of complex information where logical conclusions can be derived.
As journal editors and reviewers will scan through these presentations before reading the entire text, their importance cannot be disregarded. For this reason, authors must pay as close attention to selecting appropriate methods of data presentation as when they were collecting data of good quality and analyzing them. In addition, having a well established understanding of different methods of data presentation and their appropriate use will enable one to develop the ability to recognize and interpret inappropriately presented data or data presented in such a way that it deceives readers' eyes.
Types of Data Presentation
There are three major types of data presentation: textual, tabular, and graphical.
Now the next question is: How to select the best method of presentation?
Methods of presentation must be determined according to the data format, the method of analysis to be used, and the information to be emphasized. Inappropriately presented data may fail to clearly convey information to readers and clear interpretations may not be derived. Even when the same information is being conveyed, different methods of presentation must be employed depending on what specific information is going to be emphasized. A method of presentation must be chosen after carefully weighing the advantages and disadvantages of different methods of presentation.
For easy comparison of different methods of presentation, let us look briefly at their strengths. If one wishes to compare or introduce two values at a certain time point, it is appropriate to use text or the written language. However, a table is the most appropriate when all information requires equal attention, and it allows readers to selectively look at information of their own interest. Graphs allow readers to understand the overall trend in data, and intuitively understand the comparison results between two groups. One thing to always bear in mind regardless of what method is used, however, is the simplicity of presentation.
Let us now discuss the three methods of presentation (text, table, and graph) in detail.
Text is the main method of conveying information as it is used to explain results and trends, and provide contextual information. Data are fundamentally presented in paragraphs or sentences. Text can be used to provide interpretation or emphasize certain data. If quantitative information to be conveyed consists of one or two numbers, it is more appropriate to use written language than tables or graphs. For instance, information about the incidence rates of delirium following anesthesia in 2016–2017 can be presented with the use of a few numbers: “The incidence rate of delirium following anesthesia was 11% in 2016 and 15% in 2017; no significant difference of incidence rates was found between the two years.” If this information were to be presented in a graph or a table, it would occupy an unnecessarily large space on the page, without enhancing the readers' understanding of the data. If more data are to be presented, or other information such as that regarding data trends are to be conveyed, a table or a graph would be more appropriate. By nature, data take longer to read when presented as texts and when the main text includes a long list of information, readers and reviewers may have difficulties in understanding the information.
Tables convey information that has been converted into words or numbers in rows and columns. Anyone with a sufficient level of literacy can easily understand the information presented in a table. Tables are the most appropriate for presenting individual information and can present both quantitative and qualitative information.
The strength of tables is that they can accurately present information that cannot be presented with a graph. A number such as “132.145852” can be accurately expressed in a table.
Another strength is that information with different units can be presented together. For instance, blood pressure, heart rate, number of drugs administered, and anesthesia time can be presented together in one table. Finally, tables are useful for summarizing and comparing quantitative information of different variables. However, the interpretation of information takes longer in tables than in graphs, and tables are not appropriate for studying data trends. Furthermore, since all data are of equal importance in a table, it is not easy to identify and selectively choose the information required.
Whereas tables can be used for presenting all the information, graphs simplify complex information by using images and emphasizing data patterns or trends, and are useful for summarizing, explaining, or exploring quantitative data. While graphs are effective for presenting large amounts of data, they can be used in place of tables to present small sets of data. A graph format that best presents information must be chosen so that readers and reviewers can easily understand the information.
In the following, we describe frequently used graph formats and the types of data that are appropriately presented with each format with examples.
Scatter plots present data on the x- and y-axes and are used to investigate an association between two variables. A point represents each individual or object, and an association between two variables can be studied by analyzing patterns across multiple points. A regression line is added to a graph to determine whether the association between two variables can be explained or not. If multiple points exist at an identical location, the correlation level may not be clear. In this case, a correlation coefficient or regression line can be added to further elucidate the correlation.
A bar graph is used to indicate and compare values in a discrete category or group, and the frequency or other measurement parameters (e.g., the mean). Depending on the number of categories, and the size or complexity of each
patterns and trends across data that include climatic influence, large changes or turning points, and are also appropriate for representing not only time-series data, but also data measured over the progression of a continuous variable such as distance. As can be seen in the figure, mean and standard deviation of systolic blood pressure are
indicated for each time point, which enables readers to easily understand changes of systolic pressure over time. If data are collected at a regular interval, values in between the measurements can be estimated. In a line graph, the x axis represents the continuous variable, while the y-axis represents the scale and measurement values. It is also useful to represent multiple data sets on a single line graph to compare and analyze patterns across different data sets.
A box and whisker chart does not make any assumptions about the underlying statistical distribution and represents variations in samples of a population; therefore, it is appropriate for representing nonparametric data. A box and whisker chart consists of boxes that represent the interquartile range (from the first to the third quartile), the median and the mean of the data, and whiskers presented as lines outside of the boxes.
Whiskers can be used to present the largest and smallest values in a set of data or only a part of the data (e.g., 95% of all the data). Data points that fall outside the range covered by the whiskers are presented as individual points and are called outliers. The spacing at both ends of the box indicates dispersion in the data. The relative location of the median within the box indicates skewness. The box and whisker chart provided as an example represents calculated volumes of an anesthetic, desflurane, consumed over the course of the observation period.
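The parts of a box and whisker chart described above can be computed directly. The following is only a minimal sketch: the sample values are hypothetical (they are not the desflurane data), and the whisker rule used here, 1.5 times the interquartile range, is one common convention assumed for illustration rather than the one used in the example chart.

```python
import statistics

# Hypothetical sample values (not from the source); one value lies far from the rest.
volumes = [12, 15, 17, 18, 20, 21, 23, 24, 26, 48]

# Quartiles: the box spans q1 to q3, with the median marked inside it.
q1, median, q3 = statistics.quantiles(volumes, n=4, method="inclusive")
iqr = q3 - q1

# Assumed whisker rule (1.5 * IQR fences); other conventions exist.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr

# Whiskers reach the most extreme values still inside the fences.
inside = [v for v in volumes if low_fence <= v <= high_fence]
outliers = [v for v in volumes if v < low_fence or v > high_fence]
whisker_low, whisker_high = min(inside), max(inside)

print("box:", q1, "to", q3, "median:", median)
print("whiskers:", whisker_low, "to", whisker_high)
print("outliers:", outliers)
```

A median that sits closer to one end of the box than the other suggests skewed data, which is the visual cue mentioned above.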
[Figure: general sketch of a statistical table with its parts labeled: Table Number; (1) Title (Table Heading); (2) Box Head (Master Cap and Column Captions); (3) Stub (Stub Head and Row Captions); (4) Field, Text or Body, including the Total columns; (5) Prefatory Notes or Headnote; (6) Footnotes; (7) Source Notes.]
Now, let us discuss the different parts of a statistical table.
A statistical table has at least four major parts and some minor parts. The major parts are (1) the Title; (2) the Box Head (column captions); (3) the Stub (row captions); and (4) the Body. The minor parts are (5) Prefatory Notes; (6) Foot Notes; and (7) Source Notes. The general sketch of a table indicating its parts is in the slide.
(1) The Title
The title is the main heading, written in capitals and shown at the top of the table. It must explain the contents of the table and throw light on the table as a whole. Different parts of the heading can be separated by commas. There are no full stops in the title.
(2) The Box Head (column captions)
The vertical headings and subheadings of the columns are called column captions. The space where these column headings are written is called the box head. Only the first letter of the box head is capitalized; the remaining words must be written in lowercase.
(3) The Stub (row captions)
The horizontal headings and subheadings of the rows are called row captions, and the space where these row headings are written is called the stub.
(4) The Body
This is the main part of the table, which contains the numerical information classified with respect to the row and column captions.
(5) Prefatory Notes
A statement given below the title and enclosed in brackets, usually describing the units of measurement, is called the prefatory note.
(6) Foot Notes
These appear immediately below the body of the table and provide additional explanation.
(7) Source Notes
The source notes are given at the end of the table and indicate the source from which the information has been taken. They include information about the compiling agency, publication, etc.
General Rules of Tabulation
Four Basic Parts of a Table
Other Parts of the Table
Important Terminologies
The following terms need to be defined, as they will make the succeeding discussions easier to follow. We have, first and foremost,
We are now about to discuss the four-step construction of a frequency distribution table.
The tabular method of presentation makes use of rows and columns, as in, for example, a frequency distribution table, better known by its acronym FDT. The frequency distribution can present the patterns of variation and other characteristics of the data. In the case of a grouped frequency distribution, which is what we consider in this discussion, information on exact values such as the highest or lowest value may be sacrificed, which necessitates reference to the original data.
It is now time to focus our attention on how to construct a frequency distribution table or FDT in four steps.
The following table shows the scores of 40 students in a 50-item test.
RAW DATA
25 42 14 38 25 31 45 27
16  8 44 36 30 36 37 39
30 35 32 43 31 48 13 28
29 39 32 19 32 23 24 34
20 27 17 31 20 26 28 44
With the given raw data, construct a frequency distribution table.
Now, on the slide, we can see that the scores of 40 students in a 50-item test are in the form of raw data. This means that they are neither organized nor arranged in ascending or descending order. Our task here is to construct a frequency distribution table.
The following table shows the scores of 40 students in a 50-item test.
ARRAY
 8 19 25 28 31 32 37 43
13 20 25 28 31 34 38 44
14 20 26 29 31 35 39 44
16 23 27 30 32 36 39 45
17 24 27 30 32 36 42 48
The first step is to arrange the raw data either in an array or in a stem-and-leaf diagram. In the slide, the given raw data have been transformed into an array: they are listed from the lowest value of 8 up to the highest value of 48.
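As a small illustration of this first step, the sketch below sorts the 40 raw scores into an array. The variable names are chosen here for illustration only.

```python
# Scores of 40 students in a 50-item test, as listed in the raw data.
raw_scores = [
    25, 42, 14, 38, 25, 31, 45, 27,
    16, 8, 44, 36, 30, 36, 37, 39,
    30, 35, 32, 43, 31, 48, 13, 28,
    29, 39, 32, 19, 32, 23, 24, 34,
    20, 27, 17, 31, 20, 26, 28, 44,
]

# Step 1: arrange the raw data in ascending order (an array).
array = sorted(raw_scores)

lowest, highest = array[0], array[-1]
print(array)
print("lowest:", lowest, "highest:", highest, "n:", len(array))
```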
STEM-AND-LEAF DIAGRAM (Temporary)
Stem | Leaf
0 |
1 | 6
2 | 5 9 0
3 | 0
4 | 2
Stem – Tens Digit; Leaf – Units Digit
The stem-and-leaf diagram is another visual illustration of the distribution of data. This form, however, is practical only for small data sets of about 15 to 150 data points.
To start the procedure, let us consider entering, one by one, the first six scores: 25, which is colored orange; 16, green; 30, blue; 29, red; 20, violet; and 42, yellow. By inspection, the scores are at most two-digit numbers and the lowest is a one-digit number. With the highest score of 48 and the lowest of 8, we can construct the stem of the diagram from 0 to 4, which are the tens digits of the scores. As we enter the scores into the leaf of the diagram, we write only their units digits. Starting with the score 25, its units digit 5 (orange) is placed along the tens digit 2 in the stem; for the score 16, its units digit 6 (green) is entered along the tens digit 1; for the score 30, its units digit 0 (blue) along the tens digit 3; for the score 29, its units digit 9 (red) along the tens digit 2; for the score 20, its units digit 0 (violet) along the tens digit 2; and for the score 42, its units digit 2 (yellow) along the tens digit 4 in the stem.
STEM-AND-LEAF DIAGRAM (Temporary)
Stem | Leaf
0 | 8
1 | 6 4 7 9 3
2 | 5 9 0 7 5 0 3 6 4 8 7 8
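The entry procedure described above can be sketched in code. This is only an illustration: it assumes the raw data were laid out in five rows of eight scores and entered column by column, which is consistent with the entry order given in the text (25, 16, 30, 29, 20, 42). The function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

# Raw scores, assumed to be laid out in five rows of eight.
rows = [
    [25, 42, 14, 38, 25, 31, 45, 27],
    [16, 8, 44, 36, 30, 36, 37, 39],
    [30, 35, 32, 43, 31, 48, 13, 28],
    [29, 39, 32, 19, 32, 23, 24, 34],
    [20, 27, 17, 31, 20, 26, 28, 44],
]

# Read the scores column by column: 25, 16, 30, 29, 20, 42, ...
entry_order = [row[col] for col in range(8) for row in rows]


def stem_and_leaf(scores, sort_leaves=False):
    """Group each score's units digit (leaf) under its tens digit (stem)."""
    leaves = defaultdict(list)
    for score in scores:
        stem, leaf = divmod(score, 10)
        leaves[stem].append(leaf)
    # Include every stem from the lowest to the highest, even if it is empty.
    stems = range(min(scores) // 10, max(scores) // 10 + 1)
    return {s: (sorted(leaves[s]) if sort_leaves else leaves[s]) for s in stems}


temporary = stem_and_leaf(entry_order)                 # leaves in order of entry
final = stem_and_leaf(entry_order, sort_leaves=True)   # leaves sorted per stem

for stem, leaf_digits in final.items():
    print(stem, "|", " ".join(str(d) for d in leaf_digits))
```

The temporary diagram keeps the leaves in the order they were entered; sorting the leaves within each stem gives the final diagram.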
Using the final stem-and-leaf diagram constructed earlier, we can now do step 4, that is, construct the frequency distribution table by completing the entries in its Frequency column. For the first class, we count only scores with values ranging from 6 to 11. The score 8 is the only number within this range, so the first class has a frequency of 1. The second class considers only scores with values from 12 to 17. Looking at the stem-and-leaf diagram, those numbers are colored yellow, namely 13, 14, 16, and 17. Thus, the second class has a frequency of 4. The third class, with class limits 18 – 23, contains the blue-colored numbers 19, 20, 20, and 23, and thus has a frequency of 4. The same procedure holds in determining the frequencies of the fourth up to the eighth class. The complete FDT can now be seen in the slide, with a total frequency of 40 corresponding to the 40 scores of students in a 50-item test.
(Final)
Stem | Leaf
0 | 8
1 | 3 4 6 7 9
2 | 0 0 3 4 5 5 6 7 7 8 8 9
3 | 0 0 1 1 1 2 2 2 4 5 6 6 7 8 9 9
4 | 2 3 4 4 5 8

Class Limits    Frequency
 6 – 11          1
12 – 17          4
18 – 23          4
24 – 29          9
30 – 35         10
36 – 41          6
42 – 47          5
48 – 53          1
Total           40
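The counting in step 4 can be checked with a short sketch. The class limits are taken as given from the table above (eight classes of width 6, starting at 6); how those classes were chosen is not covered by this sketch.

```python
scores = [
    25, 42, 14, 38, 25, 31, 45, 27, 16, 8,
    44, 36, 30, 36, 37, 39, 30, 35, 32, 43,
    31, 48, 13, 28, 29, 39, 32, 19, 32, 23,
    24, 34, 20, 27, 17, 31, 20, 26, 28, 44,
]

# Class limits as given in the table: 6-11, 12-17, ..., 48-53.
class_limits = [(lower, lower + 5) for lower in range(6, 49, 6)]

# Step 4: count the scores that fall within each pair of class limits.
frequencies = [
    sum(lower <= score <= upper for score in scores)
    for lower, upper in class_limits
]

for (lower, upper), freq in zip(class_limits, frequencies):
    print(f"{lower:>2} - {upper:<2}  {freq:>2}")
print("Total ", sum(frequencies))
```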
Derived distributions are those distributions whose construction is based mainly on the absolute frequency and relative frequency distributions. These include the “less than” and “greater than” cumulative frequency distributions, the relative (or percent) frequency distribution, and the “less than or equal to” and “greater than or equal to” relative cumulative frequency distributions.
                                                           Cumulative Frequency Distribution
Class Boundaries   Class Limits   Frequency   Midpoint X   ≤CF   ≥CF
 5.5 – 11.5         6 – 11         1           8.5          1    40
11.5 – 17.5        12 – 17         4          14.5          5    39
17.5 – 23.5        18 – 23         4          20.5          9    35
23.5 – 29.5        24 – 29         9          26.5         18    31
29.5 – 35.5        30 – 35        10          32.5         28    22
35.5 – 41.5        36 – 41         6          38.5         34    12
41.5 – 47.5        42 – 47         5          44.5         39     6
47.5 – 53.5        48 – 53         1          50.5         40     1
Total                             40
To start building the full FDT with derived distributions, we first fill in the entries in the “Midpoint X” column. Class midpoints, also known as class marks, midmarks or midvalues, are points halfway between the class limits or class boundaries. Therefore, to compute the midpoint of the first class, we add the lower limit 6 and the upper limit 11 and divide by 2, which gives 8.5. For the midpoint of the second class, 12 plus 17 divided by 2 equals 14.5; for the third class, 18 plus 23 divided by 2 equals 20.5; the same computation holds for the midpoints of the fourth up to the eighth class. As can be seen in the slide, the 8th class has a midpoint value of 50.5.
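The midpoint computation can be sketched as follows, using the class limits from the table. The check at the end illustrates that the same midpoints result from the class boundaries, assuming boundaries that lie half a unit beyond each limit.

```python
# Class limits from the frequency distribution table.
class_limits = [(6, 11), (12, 17), (18, 23), (24, 29),
                (30, 35), (36, 41), (42, 47), (48, 53)]

# Midpoint = (lower limit + upper limit) / 2
midpoints = [(lower + upper) / 2 for lower, upper in class_limits]

# Class boundaries lie half a unit beyond each limit and give the same midpoints.
boundaries = [(lower - 0.5, upper + 0.5) for lower, upper in class_limits]
midpoints_from_boundaries = [(lb + ub) / 2 for lb, ub in boundaries]

print(midpoints)
```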
Next, we consider the so-called cumulative frequency distributions. They are series of subtotals obtained from successive addition of frequencies.
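As an illustration of "successive addition of frequencies", the sketch below builds both cumulative columns with running totals. It assumes the classes are ordered from lowest to highest, as in the table.

```python
from itertools import accumulate

# Frequencies of the eight classes, ordered from the lowest class to the highest.
frequencies = [1, 4, 4, 9, 10, 6, 5, 1]

# "Less than" cumulative frequencies: running totals from the lowest class upward.
less_than_cf = list(accumulate(frequencies))

# "Greater than" cumulative frequencies: running totals from the highest class downward.
greater_than_cf = list(accumulate(reversed(frequencies)))[::-1]

print(less_than_cf)
print(greater_than_cf)
```

If the table is oriented with the highest class at the top, the two lists simply swap roles, which is why the orientation of the FDT matters.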
Let us compute the entries under the “less than” cumulative frequency distribution. This distribution always starts with the first or lowest class, that is, the class with the lowest lower and upper limits. In the slide, the lowest class is at the top. Sometimes the orientation is reversed, with the lowest class at the bottom and the highest class at the top, so be sure that you are aware of the orientation of the FDT.
The “less than” cumulative frequency for a class is the number of observations less than or equal to a given upper class
                                                           Cumulative Frequency   Relative        Relative Cum Frequency
                                                           Distribution           Frequency (%)   Distribution
Class Boundaries   Class Limits   Frequency   Midpoint X   ≤CF   ≥CF                              ≤RCF      ≥RCF
 5.5 – 11.5         6 – 11         1           8.5          1    40                 2.50            2.50    100.00
11.5 – 17.5        12 – 17         4          14.5          5    39                10.00           12.50     97.50
17.5 – 23.5        18 – 23         4          20.5          9    35                10.00           22.50     87.50
23.5 – 29.5        24 – 29         9          26.5         18    31                22.50           45.00     77.50
29.5 – 35.5        30 – 35        10          32.5         28    22                25.00           70.00     55.00
35.5 – 41.5        36 – 41         6          38.5         34    12                15.00           85.00     30.00
41.5 – 47.5        42 – 47         5          44.5         39     6                12.50           97.50     15.00
47.5 – 53.5        48 – 53         1          50.5         40     1                 2.50          100.00      2.50
Total                             40                                              100.00%
We now have the complete frequency distribution table of the scores of 40 students in a 50-item test, with derived distributions.
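The relative columns of the table can be reproduced with a short sketch: each relative frequency is the class frequency divided by the total, expressed as a percent, and the relative cumulative frequencies are the cumulative frequencies treated the same way. The variable names are illustrative.

```python
from itertools import accumulate

frequencies = [1, 4, 4, 9, 10, 6, 5, 1]
total = sum(frequencies)  # 40 scores

# Relative frequency (%) = 100 * class frequency / total frequency
relative = [100 * f / total for f in frequencies]

# Cumulative frequencies, then the same percent conversion.
less_than_cf = list(accumulate(frequencies))
greater_than_cf = list(accumulate(reversed(frequencies)))[::-1]
less_than_rcf = [100 * cf / total for cf in less_than_cf]
greater_than_rcf = [100 * cf / total for cf in greater_than_cf]

for rf, lo_rcf, hi_rcf in zip(relative, less_than_rcf, greater_than_rcf):
    print(f"{rf:6.2f} {lo_rcf:7.2f} {hi_rcf:7.2f}")
```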
Let us now discuss the last topic for this video presentation – the graphs of a frequency distribution.
The numerical data provided in a frequency distribution can be made more interesting and easier to understand when depicted in graphical form. The purpose of sketching the graph or chart of a frequency distribution is to bring out its important characteristics visually and clearly. The graph may take the form of a histogram, a frequency polygon, or the “less than” and “greater than” ogives.
The tabular method makes use of rows and columns, as in, for example, a frequency distribution table that we will discuss in detail later on. The frequency distribution can present the patterns of variation and other characteristics of the data. However, in the case of a grouped frequency distribution, information on exact values such as the highest or lowest value may be sacrificed, which necessitates reference to the original data. On the other hand,
The numerical data provided in a frequency distribution can be made more interesting and easier to understand when depicted in graphical form. These may be in the form of:
a. The frequency polygon is sometimes also called the line graph because the graph is just a line connecting the points that represent the data in the xy-plane.
b. The bar graph is an illustration of the data using bars in the xy-plane.
c. The stem-and-leaf diagram or display is another visual illustration of the distribution of data. This form, however, is feasible only for a small number of observations with at least two-digit numbers.
d. The pie graph is also known as the circle graph. The presentation uses a circle to represent data that make up a whole.
e. With the pictograph or pictogram, picture symbols are used to illustrate or represent the data under consideration. For example, figures of persons are usually used in depicting population data, and figures of cars in depicting data on car sales.
Number one, the histogram. It is the bar chart of the frequency distribution. A histogram consists essentially of adjoining bars whose widths represent the class intervals and whose heights stand for the frequencies. The horizontal axis indicates the class boundaries, and the vertical axis measures the absolute frequencies.
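To make the structure of a histogram concrete, here is a minimal text-only sketch using the frequency distribution of the 40 test scores. Each bar is described by its left and right class boundaries and its height; the asterisk rendering is only a stand-in for an actual chart.

```python
class_limits = [(6, 11), (12, 17), (18, 23), (24, 29),
                (30, 35), (36, 41), (42, 47), (48, 53)]
frequencies = [1, 4, 4, 9, 10, 6, 5, 1]

# Class boundaries extend each limit by half a unit so adjacent bars touch.
bars = [(lower - 0.5, upper + 0.5, freq)
        for (lower, upper), freq in zip(class_limits, frequencies)]

# Text rendering: one row per bar, one asterisk per unit of frequency.
for left, right, height in bars:
    print(f"{left:>4} - {right:<4} | {'*' * height}")
```

Because each bar ends exactly where the next one begins, the bars adjoin with no gaps, which is what distinguishes a histogram from an ordinary bar graph.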
Number two, the frequency polygon. A frequency polygon is a line graph of a frequency distribution. The class midpoints are plotted along the horizontal axis and the frequencies along the vertical axis. To close the polygon, the lines joining the points are extended to the midpoints of the class preceding the first class and the class following the last class of the frequency distribution, both of which have zero frequency. A frequency polygon is used when classes are numerous. It is also preferred when several graphs are to be superimposed on one another, since two or more histograms superimposed on the same graph result in a cluttered graph.
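The points of a closed frequency polygon can be listed with a short sketch, again using the distribution of the 40 test scores. It assumes equal class widths, so the extra closing points sit one class width before the first midpoint and one after the last, each with zero frequency.

```python
midpoints = [8.5, 14.5, 20.5, 26.5, 32.5, 38.5, 44.5, 50.5]
frequencies = [1, 4, 4, 9, 10, 6, 5, 1]

# Assumes equal class widths, so neighboring midpoints are one width apart.
class_width = midpoints[1] - midpoints[0]

# Close the polygon with zero-frequency points at the midpoints of the
# class before the first class and the class after the last class.
points = (
    [(midpoints[0] - class_width, 0)]
    + list(zip(midpoints, frequencies))
    + [(midpoints[-1] + class_width, 0)]
)

for x, y in points:
    print(x, y)
```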
A frequency curve is the smooth curve that the frequency polygon approaches as the number of classes and the sample size increase indefinitely. It is just a smoothed and refined frequency polygon, in which the points are connected by smooth curves instead of straight lines.