





An introduction to descriptive statistics in the context of research, focusing on the importance of identifying the population, variables, and sampling methods. It covers the distinction between qualitative and quantitative variables, the concept of a sample, and the significance of previous research. The document also explains random sampling and its representation of the population, as well as the importance of sample size and the use of visual displays for data presentation.






Any good research question should contain at least two important elements:
To get started, we need to define a few important statistical terms.
Population : The entire group of individuals or objects that we are interested in studying. One or more characteristics may be required to describe a population, depending on how specific the population is. Examples:
A. All adults in the U.S.
B. All registered voters in the U.S.
C. All female registered voters in the U.S.
D. All female registered voters who live in New Mexico.
E. All female registered voters who live in New Mexico who are aged 65 and over.
F. All female registered voters who live in Socorro, New Mexico who are aged 65 and over.
Variable : A characteristic of the population that we wish to investigate. Examples of variables include the following: height, weight, opinion on a political issue, gender, etc. There are two types of variables: Qualitative and Quantitative.
A. A qualitative variable is any variable that involves a non-numerical characteristic of the individual or object. Examples: gender, political affiliation, hair color, opinion (agree/disagree).
B. A quantitative variable is a numerically meaningful characteristic of the individual or object. It typically involves a measurement or a count taken on the individual. Examples: height, weight, test score, number of children, etc.
Sample : A subset of individuals selected from the population we are interested in studying. Examples of samples from each of the above populations:
A. 2000 adults in the U.S.
B. 1000 registered voters in the U.S.
C. 1200 female registered voters in the U.S.
D. 200 female registered voters who live in New Mexico.
E. 500 female registered voters who live in New Mexico who are aged 65 and over.
F. 100 female registered voters who live in Socorro, New Mexico who are aged 65 and over.
Where Do We Begin when Determining a Research Question?
Note that a good research question asks us to be very specific about who and what we are studying. But how do we get to the point where we can be this specific? Researchers typically start by conducting an initial exploration.
For example:
Random Sampling
A simple random sample of size n from a population is a subset of n individuals or items chosen so that every individual or item in the population has an equal chance of being selected. More precisely, every possible subset of size n is equally likely to be the one chosen.
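As an illustration of the definition above, here is a minimal Python sketch of drawing a simple random sample. The population of ID numbers, the sample size, and the seed are all hypothetical, chosen only for the example. The standard library's `random.sample` draws without replacement, which is one common way to give every individual the same chance of selection.

```python
import random

# Hypothetical population: ID numbers 1..500 standing in for individuals.
population = list(range(1, 501))
n = 20  # hypothetical sample size

# A fixed seed is used only so that the example is repeatable.
rng = random.Random(42)

# Draw n distinct individuals; no individual can be chosen twice.
sample = rng.sample(population, n)

print(sorted(sample))
```

In practice the population list would be a sampling frame, such as a list of registered voters, rather than a range of integers.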
How can we select a simple random sample?
Why is a random sample a representative sample?
The data collected from a qualitative variable is called qualitative data. The data collected from a quantitative variable is called quantitative data. We organize qualitative and quantitative data somewhat differently, because qualitative data comes to us in categories (with no specific order, like male and female) and quantitative data comes to us as measurements (numbers whose order has meaning).
Organizing Qualitative Data
Suppose John has conducted a survey of the faculty at a local junior college. He selected a random sample of 81 faculty members from this college and asked them about their department, degree, rank when hired, master's degree when hired, age, age when hired, gender, and salary.
Since qualitative variables involve "natural" categories or groups, the organization of this data is usually straightforward. Simply list each category and then count the number of individuals in each category. This table is called a frequency table. (Note: you can report the frequency (count), the relative frequency (count / total number sampled), or the percent frequency.)
For example, in John's data set the variable Degree is qualitative: the categories are bachelors, masters, masters+30hr, and Ph.D.
The frequency table for the variable "Degree" is shown below for John's data:
Degree          Count
bachelors       17
masters         48
masters+30hr    10
Ph.D            6
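The relative and percent frequencies mentioned earlier can be computed directly from these counts. The following is a small sketch in Python; the counts come from the table above, while the variable names and the column layout are only illustrative choices.

```python
# Counts taken from the frequency table for "Degree".
counts = {"bachelors": 17, "masters": 48, "masters+30hr": 10, "Ph.D": 6}

total = sum(counts.values())  # total number of faculty sampled

# Relative frequency = count / total number sampled.
relative = {degree: count / total for degree, count in counts.items()}

print(f"{'Degree':<14}{'Count':>6}{'Rel. freq.':>12}{'Percent':>9}")
for degree, count in counts.items():
    rel = relative[degree]
    # Percent frequency = 100 * relative frequency.
    print(f"{degree:<14}{count:>6}{rel:>12.3f}{100 * rel:>8.1f}%")
print(f"{'Total':<14}{total:>6}")
```

The relative frequencies always sum to 1, which is a quick check that no category was missed.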
A frequency or relative frequency table is one way to present a visual display in table form. There are a number of other visual displays that can be used. Two of the most common visual displays in graphical form are pie charts and bar charts.
John created a pie chart of faculty degrees from the above table (see below). Note that the pie represents the entire group; it is divided into sections proportionate to the number of individuals in each group.
[Figure: pie chart of Degree. bachelors 21%, masters 60%, masters+30hr 12%, Ph.D 7%]
John also created a bar chart to present a visual display of faculty degrees (see below).
[Figure: bar chart of Degree. Horizontal axis: Degree (bachelors, masters, masters+30hr, Ph.D). Vertical axis: Count, from 0 to 60.]
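For readers without the original figure, a rough text version of a bar chart can be produced from the Degree counts. This is only a sketch: the helper name `bar_rows` is made up for this example, and the bars run horizontally instead of vertically, with one `#` per faculty member.

```python
# Counts taken from the frequency table for "Degree".
counts = {"bachelors": 17, "masters": 48, "masters+30hr": 10, "Ph.D": 6}

def bar_rows(counts):
    """Return one text row per category; each '#' stands for one individual."""
    rows = []
    for degree, count in counts.items():
        rows.append(f"{degree:<14}{'#' * count} {count}")
    return rows

for row in bar_rows(counts):
    print(row)
```

As in any bar chart, the length of each bar is proportional to the count in that category, so the masters group stands out immediately.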
Organizing Quantitative Data
Quantitative data involves measurements (or quantities). To organize quantitative data, the first step is to create a list of categories or groups and count the number of individuals in each group. While quantitative data can be ordered from smallest to largest, the cut-off values used to determine the groups are up to us. The frequency table for the variable "Salary" is shown below for John's data:
John made a histogram of the salary data:
[Figure: Histogram for Salary. Horizontal axis: Salary, with cut-offs 19000, 24000, 29000, 34000, 39000, 44000, More. Vertical axis: Count, from 0 to 25.]
The salary groups appear along one axis (in this case the horizontal axis) and the frequencies/counts appear along the other axis (in this case the vertical axis). The tick marks along the salary axis correspond to the cut-offs (boundaries) for each salary group. The height of each bar reflects the number of faculty members in each salary group.
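The grouping step behind a histogram can be sketched in a few lines of Python. The cut-offs below are the ones on the salary axis; the salaries themselves are hypothetical values invented for illustration and are not John's data. The sketch also assumes that a salary equal to a cut-off belongs to the group ending at that cut-off, which is one possible convention.

```python
from bisect import bisect_left

# Cut-offs read from the salary axis of the histogram.
cutoffs = [19000, 24000, 29000, 34000, 39000, 44000]
labels = [f"<= {c}" for c in cutoffs] + ["More"]

# Hypothetical salaries, invented for illustration (not John's data).
salaries = [18500, 21000, 24000, 26750, 31200, 33900, 38000, 45500]

counts = [0] * len(labels)
for salary in salaries:
    # bisect_left gives the index of the first cut-off >= salary,
    # or len(cutoffs) (the "More" group) if the salary exceeds them all.
    counts[bisect_left(cutoffs, salary)] += 1

for label, count in zip(labels, counts):
    print(f"{label:<10}{count}")
```

Changing the cut-offs changes the counts, and therefore the shape of the histogram, which is why the choice of groups deserves some thought.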
The overall pattern or shape of the histogram is one characteristic that is important to include when summarizing quantitative data.
How do we describe the pattern of a histogram?
If you cut the histogram in half and the halves are approximate mirror images of each other, we call the shape symmetric.
Note: Bell-shaped data is also symmetric. Skewed data is not symmetric.
When evaluating a visual data display, there are three main issues to consider:
Before creating your own visual display, be sure to think about these three questions.
There should be a purpose for creating the visual display: either a point is being made or a question is being answered. If the purpose is to help answer a research question, be sure that the display matches the research question.
What constitutes an effective visual display? The most important element is clarity. The reader should quickly and easily see the point that is being made.
A fair visual display does not misrepresent the information to the reader. There are many examples of visual displays that mislead the reader, either unknowingly or purposefully.
Descriptive Statistics are numbers that summarize important characteristics of a sample. They give us a quick and easy overview of the data. They are used to give estimates of the population parameters: the mean μ, the standard deviation σ, proportion p, etc.
For quantitative data, some measure of center is an important characteristic to report.
There are two measures of center to consider:
The mean, $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$, is the average of all the values in the data set.
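As a quick check of the arithmetic, the average can be computed by hand (add up the values and divide by how many there are) and compared with the standard library. The data values below are hypothetical, invented for illustration.

```python
from statistics import fmean

# Hypothetical data values x_1, ..., x_n, invented for illustration.
x = [4, 8, 15, 16, 23, 42]
n = len(x)

mean_by_formula = sum(x) / n  # sum of the x_i divided by n
mean_by_library = fmean(x)    # the same quantity from the standard library

print(mean_by_formula, mean_by_library)
```

Both approaches give the same number; the library call is simply more convenient for longer data sets.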
The amount of variability is another characteristic that is important to include when summarizing quantitative data. It is a descriptive statistic that represents the amount of variety or spread that we see among the data.
Two measures of variability we might consider are:
The standard deviation, $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$, is the "average distance from the mean." This is by far the most common measure of variability.
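The calculation can be sketched in Python as follows, assuming the usual sample standard deviation, which divides the sum of squared distances from the mean by n - 1. The data values are hypothetical, invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical data values, invented for illustration.
x = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(x)
x_bar = mean(x)

# Sum the squared distances from the mean, divide by n - 1, take the square root.
squared_distances = [(xi - x_bar) ** 2 for xi in x]
s_by_formula = sqrt(sum(squared_distances) / (n - 1))

# statistics.stdev uses the same n - 1 divisor.
s_by_library = stdev(x)

print(s_by_formula, s_by_library)
```

A data set whose values sit close to the mean produces a small s, while a widely spread data set produces a large one.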
Other descriptive statistics that we might include in our summary would be the minimum and maximum values in the data set, the first quartile, the third quartile, etc.
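These additional summaries are also easy to compute. The sketch below uses hypothetical data, invented for illustration. Note that several conventions exist for computing quartiles, so other software may report slightly different values for the first and third quartiles than `statistics.quantiles` does with its default settings.

```python
from statistics import quantiles

# Hypothetical data values, invented for illustration.
x = [2, 4, 4, 4, 5, 5, 7, 9]

minimum = min(x)
maximum = max(x)

# With n=4, quantiles returns the three cut points that split the data
# into four groups: first quartile, median, third quartile.
q1, q2, q3 = quantiles(x, n=4)

print(minimum, q1, q2, q3, maximum)
```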