








A scatterplot shows the relationship between two quantitative variables measured for the same individuals. The values of one variable appear on the horizontal ...









Diana Mindrila, Ph.D.
Phoebe Balentyne, M.Ed.
Based on Chapter 4 of The Basic Practice of Statistics (6th ed.)
Concepts: Displaying Relationships: Scatterplots; Interpreting Scatterplots; Adding Categorical Variables to Scatterplots; Measuring Linear Association: Correlation; Facts About Correlation
Objectives: Construct and interpret scatterplots; add categorical variables to scatterplots; calculate and interpret correlation; describe facts about correlation.
References: Moore, D. S., Notz, W. I., & Fligner, M. A. (2013). The basic practice of statistics (6th ed.). New York, NY: W. H. Freeman and Company.
Scatterplot
The most useful graph for displaying the relationship between two quantitative variables is a scatterplot.
Many research projects are correlational studies because they investigate the relationships that may exist between variables. Prior to investigating the relationship between two quantitative variables, it is always helpful to create a graphical representation that includes both of these variables. Such a graphical representation is called a scatterplot.
The purpose of a scatterplot is to provide a general illustration of the relationship between the two variables. In this example, in general, as GPA increases so does an individual’s motivation score. One of the students in this example does not seem to follow the general pattern: Mary. She is one of the students with the lowest GPA, but she has the maximum score on the motivation scale. This makes her an exception or an outlier.
Interpreting Scatterplots
How to Examine a Scatterplot
Interpreting Scatterplots: Form
Another important component of a scatterplot is the form of the relationship between the two variables.
This example illustrates a linear relationship. This means that the points on the scatterplot closely resemble a straight line. A relationship is linear if one variable changes by approximately the same amount each time the other variable changes by one unit.
This example illustrates a relationship that has the form of a curve, rather than a straight line. The curve arises because one variable does not increase at a constant rate and may even start decreasing after a certain point. This example describes a curvilinear relationship between the variable “age” and the variable “working memory.” In this example, working memory increases throughout childhood, remains steady in adulthood, and begins decreasing around age 50.
Interpreting Scatterplots: Strength
Another important component of a scatterplot is the strength of the relationship between the two variables. Strength is shown by how closely the points follow the form of the relationship: the more tightly they cluster around a line, the stronger the linear relationship. The steepness of the line in the original units does not by itself indicate strength.
The strongest linear relationship occurs when all points fall exactly on a straight line. When both variables are expressed in standardized units, such a line has a slope of 1 (or -1): when one variable increases by one standard deviation, the other changes by the same amount, and the line is at a 45 degree angle. The strength of the relationship between two variables is a crucial piece of information. Relying on the interpretation of a scatterplot alone is too subjective. More precise evidence is needed, and this evidence is obtained by computing a coefficient that measures the strength of the relationship under investigation.
Correlations
Example: There is a moderate, positive, linear relationship between GPA and achievement motivation.
r = 0.62
Based on the criteria listed on the previous page, the value of r in this case (r = 0.62) indicates that there is a positive, linear relationship of moderate strength between achievement motivation and GPA.
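As a rough illustration of how such a coefficient can be computed, the sketch below calculates Pearson's r as the sum of the products of standardized scores divided by n - 1. The function name `pearson_r` and the data values are hypothetical; they are not the GPA and motivation scores used in these notes.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation computed from standardized (z) scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample standard deviations (divide by n - 1).
    sd_x = sqrt(sum((v - mean_x) ** 2 for v in x) / (n - 1))
    sd_y = sqrt(sum((v - mean_y) ** 2 for v in y) / (n - 1))
    # Sum the products of the paired z-scores, then divide by n - 1.
    total = sum(((a - mean_x) / sd_x) * ((b - mean_y) / sd_y)
                for a, b in zip(x, y))
    return total / (n - 1)

# Hypothetical values, chosen only to show the calculation.
gpa = [2.0, 2.5, 3.0, 3.5, 4.0]
motivation = [50, 60, 55, 75, 80]
r = pearson_r(gpa, motivation)
print(round(r, 2))
```

Because each score is standardized before the products are formed, the result always falls between -1 and 1.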
Correlation
The images below illustrate what the relationships might look like at different degrees of strength (for different values of r).
For a correlation coefficient of zero, the points have no direction, the shape is almost round, and a line does not fit the points on the graph. As the absolute value of the correlation coefficient increases, the observations group closer together in a linear shape. The line is difficult to detect when the relationship is weak (e.g., r = -0.3), but becomes clearer as relationships become stronger (e.g., r = -0.99).
Facts About Correlation
The order of variables in a correlation is not important.
Correlations provide evidence of association, not causation.
r has no units and does not change when the units of measure of x, y, or both are changed.
Positive r values indicate positive associations; negative r values indicate negative associations.
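Several of these facts can be checked numerically. The sketch below uses a hypothetical helper, `pearson_r`, and made-up height and weight values to show that swapping the variables or changing their units leaves r unchanged, and that reversing the direction of one variable flips the sign of r.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation from sums of squares and cross-products."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical measurements, for illustration only.
height_in = [60, 62, 65, 68, 72]
weight_lb = [110, 120, 140, 155, 180]

# Same individuals, different units of measure.
height_cm = [h * 2.54 for h in height_in]
weight_kg = [w * 0.45359237 for w in weight_lb]

r_xy = pearson_r(height_in, weight_lb)      # x with y
r_yx = pearson_r(weight_lb, height_in)      # order swapped
r_metric = pearson_r(height_cm, weight_kg)  # units changed
r_flipped = pearson_r(height_in, [-w for w in weight_lb])  # direction reversed

print(r_xy, r_yx, r_metric, r_flipped)
```

The first three values agree up to floating-point rounding, and the last has the same magnitude with a negative sign.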
Pearson r: Assumptions
Assumptions: Correlation requires that both variables be quantitative. Correlation describes linear relationships. Correlation does not describe curved relationships between variables, no matter how strong the relationship is.
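The point about curved relationships can be made concrete with a small constructed case. In the sketch below, y is determined exactly by x (y is x squared), yet r is zero because the relationship is not linear. The helper `pearson_r` and the data are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation from sums of squares and cross-products."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# A perfect but curved relationship: every y is exactly x squared.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]

r_curve = pearson_r(x, y)
print(r_curve)
```

A scatterplot of these points would show a clear U shape, which is why the plot should be examined before r is interpreted.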
Cautions: Correlation is not resistant. r is strongly affected by outliers. Correlation is not a complete summary of two-variable data. For example:
The correlation coefficient is based on means and standard deviations, so it is not robust to outliers; it is strongly affected by extreme observations. These individuals are sometimes referred to as influential observations because they have a strong impact on the correlation coefficient.
For instance, in the above example the correlation coefficient is 0.62 on the left, where the outlier is included in the analysis. When this outlier is removed, the correlation coefficient increases substantially, to 0.89. This one case, when included in the analysis, reduces a strong relationship to a moderate relationship. It makes such a big difference in this example because the data set contains a very small number of individuals. As a general rule, as the size of the sample increases, the influence of extreme observations decreases.
When describing the relationship between two variables, correlations are just one piece of the puzzle. This information is necessary, but not sufficient. Other analyses should also be conducted to provide more information.
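The effect of a single extreme observation can be sketched with a small made-up data set. The values below are hypothetical and will not reproduce the 0.62 and 0.89 reported above; they only show the same pattern, where one case with a low x value and a high y value pulls r down. The helper `pearson_r` is also an assumption of this sketch.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation from sums of squares and cross-products."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical students who follow a roughly linear pattern.
gpa = [2.0, 2.4, 2.8, 3.0, 3.3, 3.6, 3.9]
motivation = [40, 48, 52, 60, 63, 70, 78]

# One hypothetical outlier: low GPA, very high motivation score.
outlier_gpa, outlier_motivation = 2.1, 80

r_without = pearson_r(gpa, motivation)
r_with = pearson_r(gpa + [outlier_gpa], motivation + [outlier_motivation])

print(round(r_without, 2), round(r_with, 2))
```

With only eight individuals, the single added case lowers r considerably, which matches the caution that correlation is not resistant.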