Correlation and Linear Regression
Regression and Correlation
- Many medical investigations are concerned with:
- Establishment of relationship between two variables.
- The strength of a relationship.
- Predicting one variable on the basis of another.
- Controlling the effect of unwanted variables.
- Such intentions can be addressed either by using correlation
or regression analysis.
Correlation Analysis
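- Correlation measures the strength of the linear association between two variables; it does not imply a cause-and-effect relationship.
- The value of the correlation coefficient, r, ranges from -1 to +1.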
- If the correlation coefficient is greater than 0, the variables are said to be positively correlated (i.e. as X increases, Y tends to increase).
- If the correlation coefficient is less than 0, the variables are said to be negatively correlated (i.e. as X increases, Y tends to decrease).
- If the correlation coefficient is 0 then the variables are said to be uncorrelated.
Correlation Analysis Cont…
- The formula for computing the sample correlation coefficient (r) for two variables X and Y is given as:
$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\left[\sum (x - \bar{x})^2\right]\left[\sum (y - \bar{y})^2\right]}}$$
- Or
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$
- Before computing r, a scatter plot of the two variables should be drawn. Why?
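As a rough illustration, the two forms of the formula for r can be sketched in Python and checked against each other. The function names and the five data points are made up for this sketch and do not come from the lecture.

```python
import math

def pearson_r_deviation(x, y):
    """r from deviations about the means (first form of the formula)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

def pearson_r_computational(x, y):
    """r from raw sums (second, computational form of the formula)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical illustrative data (not from the lecture).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_dev = pearson_r_deviation(x, y)
r_comp = pearson_r_computational(x, y)
print(round(r_dev, 4), round(r_comp, 4))
```

Both functions should agree up to floating-point rounding, since the second form is an algebraic rearrangement of the first.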
Correlation Analysis Cont…
[Figure: scatter plots of y against x illustrating strong relationships and weak relationships]
Correlation Analysis Cont…
[Figure: scatter plots of y against x illustrating no relationship]
Correlation Analysis Cont…
Example 8.1:
- The data from a random sample of 20 countries are shown in the following table. X represents the percentage of children immunized by age one year and Y represents the under-five mortality rate.
- Determine the strength of association between the two variables.
Correlation Analysis Cont…
- Country, % Immunized (X), CMR per 1,000 live births (Y), XY, Y^2, X^2
- Bolivia
- Brazil
- Cambodia
- Canada
- China
- Czech
- Egypt
- Ethiopia
- Finland
- France
- Greece
- India
- Italy
- Japan
- Mexico
- Poland
- Russia
- Senegal
- Turkey
- UK
- Total
Correlation Analysis Cont…
- Interpretation option:
- Rule of thumb (applied to the absolute size of the coefficient):
Size of Coefficient    General Interpretation
0.8-1.0                Very strong relationship
0.6-0.8                Strong relationship
0.4-0.6                Moderate relationship
0.2-0.4                Weak relationship
0.0-0.2                Very weak or no relationship
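The rule of thumb above can be written as a small lookup. This is only a sketch: the function name `interpret_r` is hypothetical, and because the table's ranges share their endpoints (0.8 appears in two rows), the code assumes each boundary value belongs to the stronger category.

```python
def interpret_r(r):
    """Map a correlation coefficient to the rule-of-thumb label.

    Assumption: boundary values (0.2, 0.4, 0.6, 0.8) are assigned to the
    stronger category, since the table does not say which row they fall in.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # the sign gives direction; strength uses the magnitude
    if size >= 0.8:
        return "Very strong relationship"
    if size >= 0.6:
        return "Strong relationship"
    if size >= 0.4:
        return "Moderate relationship"
    if size >= 0.2:
        return "Weak relationship"
    return "Very weak or no relationship"

print(interpret_r(-0.79))
```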
Correlation Analysis Cont…
- Hypothesis Testing for a Correlation Coefficient
- As with means and proportions, it is also possible to test the significance of the population correlation coefficient (ρ).
- For a two-tailed test:
- H0: ρ = 0
- H1: ρ ≠ 0
- The t test statistic is given as (with n-2 df):
$$t = r\sqrt{\frac{n - 2}{1 - r^2}}$$
Correlation Analysis Cont..
- The critical t value for a one-tailed test at the 0.05 level of significance with 18 (n-2) degrees of freedom is -1.734 (for a two-tailed test it would be ±2.101). Then we calculate the test statistic:
$$t = r\sqrt{\frac{n - 2}{1 - r^2}} = -0.79\sqrt{\frac{20 - 2}{1 - (-0.79)^2}} = -0.79\sqrt{\frac{18}{0.3759}} \approx -5.47$$
- Since -5.47 is less than -1.734, we reject H0 and accept H1: r indicates a significant negative relationship between immunization coverage and child mortality.
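The arithmetic of this test can be reproduced with a few lines of Python. This is a minimal sketch using r = -0.79 and n = 20 from the example; the function name `t_statistic` is illustrative, and the critical value is simply the -1.734 quoted in the text rather than something looked up by the code.

```python
import math

def t_statistic(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)), with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

r = -0.79   # sample correlation from the immunization example
n = 20      # number of countries

t = t_statistic(r, n)

# Critical value as quoted in the text (0.05 level, 18 df); not computed here.
critical_t = -1.734
reject_h0 = t < critical_t

print(round(t, 2), reject_h0)
```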
Correlation Analysis Cont..
Limitations:
- Applies only to linear relationships.
- One must not extrapolate an observed correlation beyond the observed ranges of the x and y values.
- Does not differentiate dependent and independent variables.
- Confounding by a third variable.
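The first limitation can be made concrete with a small sketch: a perfect but non-linear (quadratic) relationship can still give r = 0. The data below are made up for illustration, and `pearson_r` is a hypothetical helper implementing the deviation form of the formula.

```python
import math

def pearson_r(x, y):
    """Sample correlation coefficient from deviations about the means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data: y is completely determined by x, but not linearly.
x = [-2, -1, 0, 1, 2]
y = [value ** 2 for value in x]

r = pearson_r(x, y)
print(r)
```

Here the positive and negative cross-products cancel, so r is zero even though y depends entirely on x. This is one reason to look at a scatter plot before computing r.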
Correlation Analysis Cont..
- The formula for the Spearman correlation coefficient is (given that there are no tied ranks):
$$r_s = 1 - \frac{6\sum D^2}{n(n^2 - 1)}$$
- Where:
- 6 is a constant,
- D is the difference between a subject's ranks on the two variables,
- n is the number of subjects.
- Consider the following example.
Correlation Analysis Cont..
The following table presents the MMR level and delivery service coverage in 10 developing countries.
Country   MMR (per 100,000 LB)   MMR Rank   Delivery Service Coverage (%)   Coverage Rank   D    D^2
1         315                    4          55                              6               -2   4
2         450                    6          40                              5               1    1
3         200                    1          70                              8               -7   49
4         250                    3          79                              10              -7   49
5         243                    2          75                              9               -7   49
6         830                    9          25                              3               6    36
7         850                    10         21                              2               8    64
8         656                    7          20                              1               6    36
9         701                    8          30                              4               4    16
10        410                    5          60                              7               -2   4
Total                                                                                            308
$$r_s = 1 - \frac{6\sum D^2}{n(n^2 - 1)} = 1 - \frac{6 \times 308}{10(100 - 1)} = 1 - \frac{1848}{990} = 1 - 1.867 = -0.867$$
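The Spearman calculation for this table can be checked with a short Python sketch. The MMR and coverage values are those in the table; the helper names `ranks` and `spearman_rs` are illustrative, and the ranking helper assumes there are no tied values, as the formula requires.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rs(x, y):
    """Return (r_s, sum of D^2) using r_s = 1 - 6*sum(D^2) / (n(n^2 - 1))."""
    n = len(x)
    rank_x, rank_y = ranks(x), ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1)), sum_d2

# Values from the table: MMR per 100,000 LB and delivery service coverage (%).
mmr = [315, 450, 200, 250, 243, 830, 850, 656, 701, 410]
coverage = [55, 40, 70, 79, 75, 25, 21, 20, 30, 60]

rs, sum_d2 = spearman_rs(mmr, coverage)
print(sum_d2, round(rs, 3))
```

With tied values this shortcut formula is no longer exact, which is why the helper is restricted to the no-ties case.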