Download Testing for Normality and more Lecture notes Statistics in PDF only on Docsity!
Testing for Normality
For each mean and standard deviation combination a theoretical
normal distribution can be determined. This distribution is based
on the proportions shown below.
There are several methods of assessing whether data are
normally distributed or not. They fall into two broad categories:
graphical and statistical. The some common techniques are:
Graphical
- Q-Q probability plots
- Cumulative frequency (P-P) plots
Statistical
- W/S test
- Jarque-Bera test
- Shapiro-Wilks test
- Kolmogorov-Smirnov test
- D’Agostino test
Q-Q plots display the observed values against normally distributed
data (represented by the line).
Normally distributed data fall along the line.
Tests of Normality
Age .110 1048 .000 .931 1048.
Statistic df Sig. Statistic df Sig.
Kolmogorov-Smirnova^ Shapiro-Wilk
a. Lilliefors Significance Correction
Tests of Normality
TOTAL_VALU .283 149 .000 .463 149.
Statistic df Sig. Statistic df Sig.
Kolmogorov-Smirnova^ Shapiro-Wilk
a. Lilliefors Significance Correction
Tests of Normality
Z100 .071 100 .200* .985 100.
Statistic df Sig. Statistic df Sig.
Kolmogorov-Smirnova^ Shapiro-Wilk
*. This is a lower bound of the true s ignificance. a. Lilliefors Significance Correction
Statistical tests for normality are more precise since actual
probabilities are calculated.
Tests for normality calculate the probability that the sample was
drawn from a normal population.
The hypotheses used are:
Ho: The sample data are not significantly different than a normal
population.
Ha: The sample data are significantly different than a normal
population.
Non-Normally Distributed Data
Average PM10 .142 72 .001 .841 72.
Statistic df Sig. Statistic df Sig.
Kolmogorov-Smirnov a^ Shapiro-Wilk
a. Lilliefors Significance Correction
Normally Distributed Data
As thma Cases .069 72 .200* .988 72.
Statistic df Sig. Statistic df Sig.
Kolmogorov-Smirnova^ Shapiro-Wilk
*. This is a lower bound of the true s ignificance. a. Lilliefors Significance Correction
In SPSS output above the probabilities are greater than 0.05 (the typical alpha
level), so we accept Ho … these data are not different from normal.
In the SPSS output above the probabilities are less than 0.05 (the typical alpha
level), so we reject Ho … these data are significantly different from normal.
Simple Tests for Normality
Range constant,
SD changes
Range changes,
SD constant
The W/S test uses a critical range. If the calculated
value falls within the range, then accept Ho. If the
calculated value falls outside the range then reject Ho.
Since 3.34 < q=4.05 < 4.71, we accept Ho.
Nurio 6.
Since we have a critical range, it is difficult to determine a
probability range for our results. Therefore we simply state our
alpha level.
The sample data set is not significantly different than normal (q 4.05,
p > 0.05).
D’Agostino Test
- A very powerful test for departures from normality.
- Based on the D statistic, which gives an upper and lower critical
value.
where D is the test statistic, SS is the sum of squares of the data
and n is the sample size, and i is the order or rank of observation
x. The df for this test is n (sample size).
- First the data are ordered from smallest to largest or largest to
smallest.
Use the next lower
n on the table if the
sample size is NOT
listed.
This is the ‘middle’ of the data set.
This is the observation’s distance from the middle.
This is the observation, and is used to ‘weight’ the result
based on the size of the observation and its distance.
Breaking down the equations:
This represents which tail is more pronounced (- for left, + for right).
This adjusts for sample size like this:
This is the dataset’s total squared variation.
This transforms the squared values from SS.