







Study notes for Pearson correlation and Regression Analysis








📘 PEARSON CORRELATION COEFFICIENT (r) – COMPLETE REVIEW NOTES
🧠 1. Overview of Inferential Statistics
🔹 Definition
Inferential statistics is the branch of statistics used to draw conclusions about a population based on a sample. It allows researchers to analyze data, test hypotheses, and predict outcomes.
🔹 Purpose
● To draw inferences about a population from a sample.
● To measure the reliability of conclusions.
● To analyze relationships and make predictions through tests and models.
⚙ 2. Two Major Types of Inferential Statistics
Type | Description | Examples
Regression Analysis | Determines how one variable changes with respect to another. | Simple, Multiple, Logistic Regression
Hypothesis Testing | Tests assumptions about a population using sample data. | z-Test, t-Test, ANOVA
🧩 Hypothesis Testing Components
● Null Hypothesis (H₀): No effect or difference exists.
● Alternative Hypothesis (H₁): There is an effect or difference.
● Decision Basis: Compare the test statistic with the critical value at the chosen confidence level (usually 95%).
● Types: Left-tailed, right-tailed, or two-tailed.
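The decision rule above can be sketched in Python for the simplest case, a z-test. This is a minimal illustration, not a full testing procedure: the function name `z_test_decision`, its parameters, and the test statistic of 2.10 are all assumptions made up for the example.

```python
from statistics import NormalDist


def z_test_decision(z_stat, confidence=0.95, tail="two"):
    """Return (critical_value, reject_h0) for a z-test.

    tail is "two", "left", or "right" (hypothetical parameter names).
    """
    alpha = 1.0 - confidence
    std_normal = NormalDist()  # standard normal: mean 0, std dev 1
    if tail == "two":
        # Split alpha across both tails and compare |z| with the cutoff.
        critical = std_normal.inv_cdf(1.0 - alpha / 2.0)
        reject = abs(z_stat) > critical
    elif tail == "right":
        critical = std_normal.inv_cdf(1.0 - alpha)
        reject = z_stat > critical
    elif tail == "left":
        critical = std_normal.inv_cdf(alpha)
        reject = z_stat < critical
    else:
        raise ValueError("tail must be 'two', 'left', or 'right'")
    return critical, reject


# Hypothetical test statistic, chosen only for illustration.
critical, reject = z_test_decision(2.10)
print(f"critical z = {critical:.3f}, reject H0: {reject}")
```

With a two-tailed test at 95% confidence the critical value is about 1.96, so a statistic of 2.10 falls in the rejection region.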
📉 3. Regression Analysis (Brief Review)
🔹 Definition
A statistical method that examines the relationship between dependent (Y) and independent (X) variables.
🔹 Purpose
● To measure how changes in X affect Y
● To predict outcomes
● To test relationships and influences
🔹 Regression Coefficient
Represents how much Y changes for every one-unit change in X.
📊 4. PEARSON CORRELATION COEFFICIENT (r)
🔹 Definition
A statistical measure that indicates the strength and direction of a linear relationship between two numerical variables.
Developed by:
● Sir Francis Galton (late 1800s): Introduced the idea of “co-relation” when studying heredity.
● Karl Pearson (1896): Formalized the product-moment correlation coefficient, now known as Pearson’s r.
✏ 5. Formula
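\( r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} \)
Where:
● n = number of data pairs
● x, y = individual data values
● Σ = summation symbol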
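As a rough sketch, the raw-sum form of Pearson's r, with numerator nΣxy − (Σx)(Σy) and denominator √([nΣx² − (Σx)²][nΣy² − (Σy)²]), translates almost line by line into Python. The function name `pearson_r` and the sample pairs are assumptions chosen for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r computed from raw sums over n data pairs."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two data pairs")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when x or y has zero variance")
    return numerator / denominator


# Made-up pairs that lie exactly on a rising straight line.
print(pearson_r([1, 2, 3, 4], [10, 20, 30, 40]))  # prints 1.0
```

Because the made-up points fall exactly on a rising line, the result is a perfect positive correlation.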
💡 7. Key Concepts and Characteristics
Concept | Description
Range | -1 ≤ r ≤ +1
Direction | Positive (+) means the variables move together; negative (–) means they move in opposite directions
Strength | The closer r is to ±1, the stronger the linear relationship; the closer to 0, the weaker
Unit Independence | r is unit-free, unaffected by the scale of measurement
Symmetry | r(X,Y) = r(Y,X)
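These characteristics can be checked numerically. The sketch below uses invented height and weight figures (not data from these notes) and a helper named `pearson_r`, which is an assumption for the example, to confirm that r stays within range, ignores units, is symmetric, and flips sign when one variable is reversed.

```python
from math import isclose, sqrt


def pearson_r(xs, ys):
    """Pearson's r via the raw-sum formula."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))


# Hypothetical measurements, invented purely for this illustration.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 77, 85]
r_metric = pearson_r(heights_cm, weights_kg)

# Unit independence: convert to inches and pounds; r should not change.
heights_in = [h / 2.54 for h in heights_cm]
weights_lb = [w * 2.20462 for w in weights_kg]
r_imperial = pearson_r(heights_in, weights_lb)

# Direction: negating one variable should flip the sign of r.
r_flipped = pearson_r(heights_cm, [-w for w in weights_kg])

print("within range:", -1.0 <= r_metric <= 1.0)
print("unit-free:   ", isclose(r_metric, r_imperial, rel_tol=1e-9))
print("symmetric:   ", isclose(r_metric, pearson_r(weights_kg, heights_cm)))
print("sign flips:  ", isclose(r_flipped, -r_metric))
```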
📏 8. Importance of Pearson’s r
⚖ 9. Assumptions of Pearson’s Correlation
Assumption | Explanation
Independence | Each observation must be independent.
Linearity | Relationship between variables must be linear (check via scatterplot).
Homoscedasticity | Spread of points around the line should be uniform (rectangular scatter).
📚 10. Example Problem
Situation:
A teacher wants to find if hours studied affect exam scores of 5 students.
Student | X = Study Hours | Y = Exam Score
● If regression analysis is appropriate
📍 When to Use Pearson’s r
● Both variables are quantitative (numerical)
● Relationship is linear
● No extreme outliers are present
📍 When NOT to Use It
● One or both variables are categorical (use Chi-square instead)
● Relationship is non-linear
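The non-linear case is easy to demonstrate. In the sketch below, y is completely determined by x (y = x²), yet Pearson's r comes out as zero because the relationship is curved rather than linear. The data and the helper name `pearson_r` are assumptions made up for the illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r via the raw-sum formula."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))


# Made-up data: a perfect but curved (quadratic) relationship.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

print(pearson_r(xs, ys))  # prints 0.0
```

An r of zero here does not mean the variables are unrelated, only that there is no linear trend, which is why a scatterplot check comes first.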
🧾 13. Formula Summary
Concept | Formula
Pearson’s r | \( r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} \)
Regression Equation | \( Y = a + bX \)
Slope (b) | \( b = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2} \)
Intercept (a) | \( a = \frac{\sum y - b\sum x}{n} \)
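The slope and intercept formulas can be sketched as a small Python function. The name `linear_fit` and the sample points are assumptions for illustration; the points were chosen to lie exactly on Y = 1 + 2X so the result is easy to check by eye.

```python
def linear_fit(xs, ys):
    """Least-squares intercept a and slope b for Y = a + bX."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("slope is undefined when all x values are equal")
    b = (n * sum_xy - sum_x * sum_y) / denom  # slope
    a = (sum_y - b * sum_x) / n               # intercept
    return a, b


# Made-up points lying exactly on Y = 1 + 2X.
a, b = linear_fit([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)  # prints 1.0 2.0
```

The slope is computed first because the intercept formula depends on it.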
🧭 14. Real-World Applications
● Predicting academic performance from study habits
● Measuring link between income and spending
● Studying relation between advertising and sales
● Health studies: correlation between age and blood pressure
Key Point | Essence
Pearson’s r | Measures strength & direction of linear relationship
Range | -1 ≤ r ≤ +1
Significance | High
Interpretation | Positive = direct relation; Negative = inverse
Purpose | Foundation for regression, data analysis, and prediction
Regression Analysis – a statistical method used to study the relationship between dependent (Y) and independent (X) variables. It shows how Y changes when X changes.
Purpose:
🔹 Example
Predict exam score if a student studies 7 hours.
X (Hours) | Y (Score)
2 | 65
3 | 70
5 | 75
6 | 85
8 | 90
a = 56.5789, b = 4.2544
Ŷ = 56.5789 + 4.2544X
For X = 7: Ŷ = 56.5789 + 4.2544(7) ≈ 86.36
✅ Interpretation: A student studying 7 hours is predicted to score 86.36.
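For reference, the arithmetic behind this example can be written out step by step, using the least-squares slope and intercept formulas and the five (X, Y) pairs listed in the table. The intercept is computed with the unrounded slope, so the last digit can shift slightly if rounded values are used instead.

```latex
\begin{align*}
n &= 5, \quad \sum x = 24, \quad \sum y = 385, \quad \sum xy = 1945, \quad \sum x^2 = 138 \\[4pt]
b &= \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2}
   = \frac{5(1945) - (24)(385)}{5(138) - 24^2}
   = \frac{9725 - 9240}{690 - 576}
   = \frac{485}{114} \approx 4.2544 \\[4pt]
a &= \frac{\sum y - b\sum x}{n}
   = \frac{385 - (4.2544)(24)}{5}
   \approx \frac{282.8947}{5} \approx 56.5789 \\[4pt]
\hat{Y}(7) &= 56.5789 + 4.2544(7) \approx 86.36
\end{align*}
```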
📉 PEARSON CORRELATION (r)
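Formula:
\( r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} \)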
Interpretation:
Value of r | Interpretation
+0.90 to +1.00 | Very strong positive
+0.70 to +0.89 | Strong positive
+0.50 to +0.69 | Moderate positive
+0.30 to +0.49 | Weak positive
0 | No correlation
-0.30 to -0.49 | Weak negative
-0.50 to -0.69 | Moderate negative
-0.70 to -0.89 | Strong negative
-0.90 to -1.00 | Very strong negative
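The table can be applied to the study-hours example with a short Python sketch. The function names `pearson_r` and `interpret_r` are assumptions for illustration. The sketch also makes two labeled assumptions about the bands: it treats them as continuous (so 0.895 counts as "Strong") and as symmetric for negative values, and it reports anything between 0 and ±0.30 as outside the table.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r via the raw-sum formula."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))


def interpret_r(r):
    """Map r to a verbal label (assumes continuous, symmetric bands)."""
    if r == 0:
        return "No correlation"
    size = abs(r)
    sign = "positive" if r > 0 else "negative"
    if size >= 0.90:
        strength = "Very strong"
    elif size >= 0.70:
        strength = "Strong"
    elif size >= 0.50:
        strength = "Moderate"
    elif size >= 0.30:
        strength = "Weak"
    else:
        # The table lists no band between 0 and 0.30.
        return "Below the weakest band in the table"
    return f"{strength} {sign}"


# Study hours and exam scores from the regression example in these notes.
hours = [2, 3, 5, 6, 8]
scores = [65, 70, 75, 85, 90]
r = pearson_r(hours, scores)
print(f"r = {r:.4f}: {interpret_r(r)}")
```

For the example data r is about 0.98, which falls in the "Very strong positive" band.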
Purpose:
● Tests whether there is a linear relationship between X and Y
● Checks whether a linear regression model is appropriate
1️⃣ Understanding Regression
● Explains relationship between dependent and independent variables
● Used for prediction and inference
2️⃣ Types
Type | When to Use
Simple Linear | One predictor
Multiple Linear | Two or more predictors
Logistic | Binary/categorical dependent variable
3️⃣ Computation Methods
● Manual – least squares formulas
● Calculator – direct computation
● Excel – fast analysis with graphs
4️⃣ Applications
● Predicting sales or grades
● Estimating costs or productivity
● Forecasting trends or risks