Normality Test Using Various Approach, Thesis of Highway Engineering

This write-up covers the normality tests used to check whether sample data are normally distributed, and therefore whether a parametric or non-parametric test should be used.

Typology: Thesis

2016/2017

Uploaded on 05/01/2017

nossyboy 🇳🇬

Testing for Normality

For each combination of mean and standard deviation, a theoretical normal distribution can be determined. This distribution is based on the proportions shown below.
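The proportions of a normal distribution can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `proportion_within` and the mean and standard deviation passed to it are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist


def proportion_within(mean, sd, k):
    """Proportion of a normal(mean, sd) population lying within
    k standard deviations of the mean."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)


# Hypothetical mean and standard deviation, for illustration only.
for k in (1, 2, 3):
    print(k, round(proportion_within(50.0, 10.0, k), 4))
```

Whatever mean and standard deviation are supplied, the proportions are the same: roughly 68%, 95%, and 99.7% of values fall within one, two, and three standard deviations of the mean.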

There are several methods of assessing whether data are normally distributed. They fall into two broad categories: graphical and statistical. Some common techniques are:

Graphical

  • Q-Q probability plots
  • Cumulative frequency (P-P) plots

Statistical

  • W/S test
  • Jarque-Bera test
  • Shapiro-Wilk test
  • Kolmogorov-Smirnov test
  • D’Agostino test

Q-Q plots display the observed values against the values expected from normally distributed data (represented by the line).

Normally distributed data fall along the line.
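The coordinates behind a Q-Q plot can be sketched with the standard library alone. This is a minimal illustration, not the procedure of any particular statistics package: the function name `qq_points`, the plotting position (i - 0.5) / n, and the sample values are all assumptions.

```python
from statistics import NormalDist, mean, stdev


def qq_points(data):
    """Return (expected, observed) pairs for a normal Q-Q plot.

    Expected values are quantiles of a normal distribution fitted with
    the sample mean and sample standard deviation.
    """
    xs = sorted(data)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    # Plotting position (i - 0.5) / n is one common convention (assumed here).
    return [(fitted.inv_cdf((i - 0.5) / n), x) for i, x in enumerate(xs, start=1)]


# Hypothetical sample, for illustration only.
sample = [4.1, 4.9, 5.0, 5.2, 5.8]
for expected, observed in qq_points(sample):
    print(round(expected, 2), observed)
```

Because the expected quantiles come from a normal distribution fitted to the sample itself, the reference line is simply observed = expected. Points that stray far from it suggest a departure from normality.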

Tests of Normality

              Kolmogorov-Smirnov (a)        Shapiro-Wilk
              Statistic   df     Sig.       Statistic   df     Sig.
Age           .110        1048   .000       .931        1048

a. Lilliefors Significance Correction

Tests of Normality

              Kolmogorov-Smirnov (a)        Shapiro-Wilk
              Statistic   df     Sig.       Statistic   df     Sig.
TOTAL_VALU    .283        149    .000       .463        149

a. Lilliefors Significance Correction

Tests of Normality

              Kolmogorov-Smirnov (a)        Shapiro-Wilk
              Statistic   df     Sig.       Statistic   df     Sig.
Z100          .071        100    .200*      .985        100

*. This is a lower bound of the true significance.
a. Lilliefors Significance Correction

Statistical tests for normality are more precise than graphical methods because actual probabilities are calculated.

Tests for normality calculate the probability of obtaining sample data at least this far from normal if the sample was drawn from a normal population.

The hypotheses used are:

Ho: The sample data are not significantly different from a normal population.

Ha: The sample data are significantly different from a normal population.

Non-Normally Distributed Data

               Kolmogorov-Smirnov (a)        Shapiro-Wilk
               Statistic   df     Sig.       Statistic   df     Sig.
Average PM10   .142        72     .001       .841        72

a. Lilliefors Significance Correction

Remember that large probabilities (above the chosen alpha level) indicate data that are consistent with a normal distribution. The examples shown here are taken from SPSS.

Normally Distributed Data

               Kolmogorov-Smirnov (a)        Shapiro-Wilk
               Statistic   df     Sig.       Statistic   df     Sig.
Asthma Cases   .069        72     .200*      .988        72

*. This is a lower bound of the true significance.
a. Lilliefors Significance Correction

In the SPSS output above, the probabilities are greater than 0.05 (the typical alpha level), so we accept Ho: these data are not significantly different from normal.
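The decision rule can be written as a small helper. This is only a sketch: the function name `normality_decision` is hypothetical, and the two probabilities passed to it are the Kolmogorov-Smirnov Sig. values reported in the SPSS tables in this document (.200 for Asthma Cases, .001 for Average PM10).

```python
def normality_decision(p_value, alpha=0.05):
    """Apply the decision rule for a normality test.

    A probability above alpha means Ho (the data do not differ
    significantly from normal) is retained; otherwise Ho is rejected.
    """
    if p_value > alpha:
        return "accept Ho"
    return "reject Ho"


print("Asthma Cases:", normality_decision(0.200))
print("Average PM10:", normality_decision(0.001))
```

Strictly speaking, a large probability means only that the test found no evidence against normality, so "accept Ho" is shorthand for "fail to reject Ho".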


Three Simple Tests for Normality

W/S Test for Normality

  • A fairly simple test that requires only the sample standard deviation and the data range.

  • Based on the q statistic, which is the ‘studentized’ (meaning t distribution) range, or the range expressed in standard deviation units. Tests kurtosis.

  • Should not be confused with the Shapiro-Wilk test.

$$q = \frac{w}{s}$$

where q is the test statistic, w is the range of the data, and s is the standard deviation.

Village            Population Density
Aranza             4.13
Corupo             4.53
San Lorenzo        4.69
Cheranatzicurin    4.76
Nahuatzen          4.77
Pomacuaran         4.96
Sevina             4.97
Arantepacua        5.00
Cocucho            5.04
Charapan           5.10
Comachuen          5.25
Pichataro          5.36
Quinceo            5.94
Nurio              6.06
Turicuaro          6.19
Urapicho           6.30
Capacuaro          7.73

Standard deviation (s) = 0.866
Range (w) = 3.6
n = 17

$$q = \frac{w}{s} = \frac{3.6}{0.866} = 4.16$$

Critical range = 3.06 to 4.31

The W/S test uses a critical range. If the calculated value falls within the range, accept Ho. If the calculated value falls outside the range, reject Ho.

Since 3.06 < q = 4.16 < 4.31, we accept Ho.
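The W/S calculation can be reproduced in a few lines of Python. This sketch uses the 17 village population densities and the critical range (3.06 to 4.31) given in this document; the function name `ws_statistic` is hypothetical, and the sample (n - 1) standard deviation is assumed.

```python
from statistics import stdev

# Population densities of the 17 villages listed in this document.
densities = [4.13, 4.53, 4.69, 4.76, 4.77, 4.96, 4.97, 5.00, 5.04,
             5.10, 5.25, 5.36, 5.94, 6.06, 6.19, 6.30, 7.73]


def ws_statistic(data):
    """q = w / s: the data range expressed in standard deviation units."""
    w = max(data) - min(data)   # range
    s = stdev(data)             # sample standard deviation (n - 1), assumed
    return w / s


q = ws_statistic(densities)
lower, upper = 3.06, 4.31       # critical range quoted in the text for n = 17

print("q =", round(q, 2))
print("accept Ho" if lower < q < upper else "reject Ho")
```

The critical range depends on the sample size and the chosen alpha level, so the limits used here apply only to this n = 17 example.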

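Jarque–Bera Test

A goodness-of-fit test of whether sample data have the skewness and kurtosis matching a normal distribution.

$$k_3 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^3}{n s^3} \qquad k_4 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4}{n s^4} - 3$$

$$JB = n \left( \frac{k_3^2}{6} + \frac{k_4^2}{24} \right)$$

where x is each observation, n is the sample size, s is the standard deviation, k_3 is skewness, and k_4 is kurtosis (expressed as the excess over the normal value of 3).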

Village            Population   Mean        Mean          Mean
                   Density      Deviates    Deviates^3    Deviates^4
Aranza             4.13         -1.21       -1.771561      2.143589
Corupo             4.53         -0.81       -0.531441      0.430467
San Lorenzo        4.69         -0.65       -0.274625      0.178506
Cheranatzicurin    4.76         -0.58       -0.195112      0.113165
Nahuatzen          4.77         -0.57       -0.185193      0.105560
Pomacuaran         4.96         -0.38       -0.054872      0.020851
Sevina             4.97         -0.37       -0.050653      0.018742
Arantepacua        5.00         -0.34       -0.039304      0.013363
Cocucho            5.04         -0.30       -0.027000      0.008100
Charapan           5.10         -0.24       -0.013824      0.003318
Comachuen          5.25         -0.09       -0.000729      0.000066
Pichataro          5.36          0.02        0.000008      0.000000
Quinceo            5.94          0.60        0.216000      0.129600
Nurio              6.06          0.72        0.373248      0.268739
Turicuaro          6.19          0.85        0.614125      0.522006
Urapicho           6.30          0.96        0.884736      0.849347
Capacuaro          7.73          2.39       13.651919     32.628086
Sum                                         12.595722     37.433505
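The Jarque–Bera calculation for the village data can be sketched as follows. Several choices here are assumptions rather than details confirmed by the text: the function name `jarque_bera` is hypothetical, the sample (n - 1) standard deviation is used, kurtosis is taken as excess kurtosis (3 subtracted), and the statistic is referred to a chi-square distribution with 2 degrees of freedom, which is the usual large-sample reference for this test.

```python
import math
from statistics import mean, stdev

# Population densities of the 17 villages listed in this document.
densities = [4.13, 4.53, 4.69, 4.76, 4.77, 4.96, 4.97, 5.00, 5.04,
             5.10, 5.25, 5.36, 5.94, 6.06, 6.19, 6.30, 7.73]


def jarque_bera(data):
    """Return (k3, k4, JB, p) for a sample."""
    n = len(data)
    m = mean(data)
    s = stdev(data)  # sample standard deviation (n - 1), assumed
    k3 = sum((x - m) ** 3 for x in data) / (n * s ** 3)      # skewness
    k4 = sum((x - m) ** 4 for x in data) / (n * s ** 4) - 3  # excess kurtosis
    jb = n * (k3 ** 2 / 6 + k4 ** 2 / 24)
    # Upper-tail probability of a chi-square variable with 2 degrees of
    # freedom, which has the closed form exp(-x / 2).
    p = math.exp(-jb / 2)
    return k3, k4, jb, p


k3, k4, jb, p = jarque_bera(densities)
print("k3 =", round(k3, 3), " k4 =", round(k4, 3))
print("JB =", round(jb, 3), " p =", round(p, 3))
print("accept Ho" if p > 0.05 else "reject Ho")
```

With only 17 observations the chi-square approximation is rough, so the probability should be read as indicative rather than exact.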

s

x