PPLOT: Probability Plots for Statistical Analysis, Study notes of Mathematical Statistics

PPLOT is a statistical procedure that creates probability plots for investigating the distribution of one or more sequence or time series variables. It can be used to assess whether data follow a specified distribution: normal, lognormal, logistic, exponential, Weibull, gamma, beta, uniform, Pareto, Laplace, half normal, chi-square, or Student’s t. These notes cover the concept of PPLOT, the notation used, and the methods for calculating fractional ranks and scores for the various distributions.

Typology: Study notes

2011/2012

Uploaded on 10/31/2012

sangawar


PPLOT

PPLOT produces probability plots of one or more sequence or time series variables. The variables can be standardized, differenced, and/or transformed before plotting. Expected normal values or deviations from expected normal values can be plotted. PPLOT can be used to investigate whether the data are from a specified distribution: normal, lognormal, logistic, exponential, Weibull, gamma, beta, uniform, Pareto, Laplace, half normal, chi-square and Student’s t.

Notation

The following notation is used throughout this chapter unless otherwise stated:

$\bar{X}$  Sample mean

$S$  Sample standard deviation

$\overline{LX}$  Sample mean for $\ln(x_i)$

$LS$  Sample standard deviation for $\ln(x_i)$

$x_i$  Value of the $i$th observation

$x_{(i)}$  The $i$th smallest observation

$R_i$  Corresponding rank for $x_i$

$n$  Sample size

$fr_{dist}(x_i)$  Fractional rank of $x_i$ for the specified distribution function

$a_{dist}(x_i)$  Score for the specified distribution function

$\alpha$  Location parameter

$\beta$  Scale parameter

$\gamma$  Shape parameter

$\nu$  Degrees of freedom

Note: This procedure was introduced in SPSS 7.0 and replaces the NPPLOT procedure of earlier releases.

Fractional Ranks

Based on the rank $R_i$ for the observation $x_i$, the fractional rank $fr_{dist}(x_i)$ is computed and used to estimate the expected cumulative distribution function of $X$. One of four methods can be selected to calculate the fractional rank $fr_{dist}(x_i)$:

$$
fr_{dist}(x_i) =
\begin{cases}
(R_i - 3/8)/(n + 1/4) & \text{Blom} \\
(R_i - 1/2)/n & \text{Rankit} \\
(R_i - 1/3)/(n + 1/3) & \text{Tukey} \\
R_i/(n + 1) & \text{Van der Waerden}
\end{cases}
$$
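The four fractional-rank methods can be sketched in a few lines of Python. This is only an illustration, assuming the standard definitions of the Blom, Rankit, Tukey and Van der Waerden formulas; the names `fractional_rank` and `simple_ranks` are hypothetical helpers, not part of PPLOT, and tied values are not handled.

```python
from fractions import Fraction

# Offsets (a, b) such that fr = (R - a) / (n + b) for each method.
# These constants are the standard textbook definitions (an assumption here).
_OFFSETS = {
    "blom": (Fraction(3, 8), Fraction(1, 4)),
    "rankit": (Fraction(1, 2), Fraction(0)),
    "tukey": (Fraction(1, 3), Fraction(1, 3)),
    "vw": (Fraction(0), Fraction(1)),  # Van der Waerden
}


def fractional_rank(rank, n, method="blom"):
    """Return the fractional rank of a case with rank `rank` among n cases."""
    a, b = _OFFSETS[method]
    return float((rank - a) / (n + b))


def simple_ranks(values):
    """Assign ranks 1..n to distinct values (ties are not handled)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


data = [2.3, 0.7, 1.9, 3.4, 1.1]
ranks = simple_ranks(data)
for method in _OFFSETS:
    print(method, [round(fractional_rank(r, len(data), method), 4) for r in ranks])
```

All four methods keep the fractional rank strictly between 0 and 1, which matters later because the inverse distribution function is undefined or infinite at the endpoints.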

Scores

The score of the specified distribution for case i is defined as

$$a_{dist}(x_i) = F_{dist}^{-1}\left(fr_{dist}(x_i)\right), \quad i = 1, \dots, n$$

where $F_{dist}^{-1}$ is the inverse of the specified cumulative distribution function.
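As a rough illustration of the score definition, the sketch below computes scores for the normal case using Python's standard library. It assumes a standard normal as the specified distribution and Blom fractional ranks, both chosen only for the example; `normal_scores` is a hypothetical name.

```python
from statistics import NormalDist


def normal_scores(ranks, n):
    """Scores a_i = F^{-1}(fr_i), assuming a standard normal F and Blom ranks."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    fractional = [(r - 3 / 8) / (n + 1 / 4) for r in ranks]  # Blom (assumed)
    return [std_normal.inv_cdf(p) for p in fractional]


scores = normal_scores([1, 2, 3, 4, 5], n=5)
print([round(s, 4) for s in scores])
```

With an odd sample size the middle rank maps to a fractional rank of 0.5, so its score is the median of the specified distribution, and the remaining scores are symmetric around it.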

P-P Plot

For a P-P plot, the fractional rank and the specified cumulative distribution function $F_{dist}$ are plotted:

$$\left(fr_{dist}(x_i),\ F_{dist}(x_i)\right), \quad i = 1, \dots, n$$
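The plotted pairs can be generated as follows. This is a minimal sketch for the normal case only, assuming Rankit fractional ranks, distinct data values, and a normal distribution fitted with the sample mean and the sample standard deviation; `pp_points_normal` is a hypothetical helper.

```python
from statistics import NormalDist, mean, stdev


def pp_points_normal(values):
    """Return (fractional rank, fitted normal CDF) pairs, one per observation."""
    n = len(values)
    # Location and scale are estimated by the sample mean and sample SD.
    fitted = NormalDist(mu=mean(values), sigma=stdev(values))
    ordered = sorted(values)  # x_(1) <= ... <= x_(n)
    points = []
    for rank, x in enumerate(ordered, start=1):
        fr = (rank - 0.5) / n  # Rankit fractional rank (assumed choice)
        points.append((fr, fitted.cdf(x)))
    return points


sample = [4.1, 5.3, 4.8, 6.0, 5.1, 4.6, 5.7]
for fr, cdf in pp_points_normal(sample):
    print(f"{fr:.3f}  {cdf:.3f}")
```

If the data are consistent with the specified distribution, the pairs should fall close to the diagonal from (0, 0) to (1, 1).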

Student’s t($\nu$)  $\nu$ (> 0) is the degrees of freedom specified by the user.

Uniform($a$, $b$)  $a$ is a lower bound and $b$ is an upper bound.

Weibull($\beta$, $\gamma$)  $\beta$ (> 0) is a scale parameter and $\gamma$ (> 0) is a shape parameter.

Estimates of the Parameters

The estimates for parameters for each distribution are defined below.

Beta($\beta_1$, $\beta_2$)

$\hat{\beta}_1 = \bar{X}\left(\dfrac{\bar{X}(1 - \bar{X})}{S^2} - 1\right)$  (scale parameter)

$\hat{\beta}_2 = (1 - \bar{X})\left(\dfrac{\bar{X}(1 - \bar{X})}{S^2} - 1\right)$  (scale parameter)
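The beta estimates are moment-based: they choose the two parameters so that the fitted mean and variance match the sample values. A minimal sketch, assuming the estimates take the form $\bar{X}(\bar{X}(1-\bar{X})/S^2 - 1)$ and $(1-\bar{X})(\bar{X}(1-\bar{X})/S^2 - 1)$, that $S^2$ uses the $n - 1$ denominator, and that all data lie in (0, 1); the function name is hypothetical.

```python
from statistics import mean, variance


def beta_moment_estimates(values):
    """Moment-based estimates of the two beta parameters."""
    xbar = mean(values)
    s2 = variance(values)  # sample variance S^2 (n - 1 denominator, assumed)
    common = xbar * (1 - xbar) / s2 - 1
    return xbar * common, (1 - xbar) * common


b1, b2 = beta_moment_estimates([0.12, 0.35, 0.48, 0.22, 0.30, 0.41])
print(round(b1, 3), round(b2, 3))
```

A quick sanity check is that the ratio `b1 / (b1 + b2)` reproduces the sample mean, since that ratio is the mean of a beta distribution.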

Chi-square( ν ) ν is the degrees of freedom specified by the user.

Exponential($\beta$)

$\hat{\beta} = \dfrac{1}{\bar{X}}$  (scale parameter)

Gamma($\gamma$, $\beta$)

$\hat{\gamma} = \dfrac{\bar{X}^2}{S^2}$  (shape parameter)

$\hat{\beta} = \dfrac{\bar{X}}{S^2}$  (scale parameter)
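The gamma estimates can be checked numerically. The sketch below assumes they take the form $\bar{X}^2/S^2$ and $\bar{X}/S^2$ with $S^2$ computed using the $n - 1$ denominator; `gamma_moment_estimates` is a hypothetical name.

```python
from statistics import mean, variance


def gamma_moment_estimates(values):
    """Return (gamma_hat, beta_hat) computed from the sample mean and variance."""
    xbar = mean(values)
    s2 = variance(values)  # sample variance S^2 (n - 1 denominator, assumed)
    return xbar ** 2 / s2, xbar / s2


gamma_hat, beta_hat = gamma_moment_estimates([1.2, 0.8, 2.5, 1.9, 3.1, 0.6])
print(round(gamma_hat, 3), round(beta_hat, 3))
```

By construction, `gamma_hat / beta_hat` equals the sample mean and `gamma_hat / beta_hat ** 2` equals the sample variance, which is what a moment-matching fit should give.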

Half Normal($\beta$)

$\hat{\beta} = \sqrt{\dfrac{x_1^2 + \dots + x_n^2}{n}}$  (scale parameter)

Laplace($\alpha$, $\beta$)

$\hat{\alpha} = \bar{X}$  (location parameter)

$\hat{\beta} = \sqrt{\dfrac{S^2}{2}}$  (scale parameter)

Logistic($\alpha$, $\beta$)

$\hat{\alpha} = \bar{X}$  (location parameter)

$\hat{\beta} = \dfrac{\sqrt{3}}{\pi}\, S$, with $\pi = 3.1415927$  (scale parameter)
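The Laplace and logistic scale estimates both rescale the sample standard deviation so that the fitted variance matches $S^2$. A small sketch, assuming the forms $\sqrt{S^2/2}$ for Laplace and $\sqrt{3}\,S/\pi$ for logistic; both function names are hypothetical.

```python
import math
from statistics import mean, stdev


def laplace_estimates(values):
    """Return (alpha_hat, beta_hat) for the Laplace distribution."""
    s = stdev(values)  # sample standard deviation S
    return mean(values), math.sqrt(s ** 2 / 2)


def logistic_estimates(values):
    """Return (alpha_hat, beta_hat) for the logistic distribution."""
    s = stdev(values)
    return mean(values), math.sqrt(3) / math.pi * s


sample = [1.0, 2.0, 4.0, 7.0, 3.5]
print(laplace_estimates(sample))
print(logistic_estimates(sample))
```

The variance of a Laplace distribution with scale $\beta$ is $2\beta^2$, and that of a logistic distribution is $\pi^2\beta^2/3$; substituting the estimates returns $S^2$ in both cases.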

Lognormal

$\hat{\beta} = \exp(\overline{LX})$  (scale parameter)

$\hat{\gamma} = LS$  (shape parameter)

Normal($\alpha$, $\beta$)

$\hat{\alpha} = \bar{X}$  (location parameter)

$\hat{\beta} = S$  (scale parameter)

Pareto($\beta$, $b$)

$\hat{\beta} = \min\{x_1, \dots, x_n\}$  (scale parameter)

$\hat{b} = \dfrac{1}{\overline{LX} - \ln \hat{\beta}}$  (index of inequality)
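For the Pareto case, the two estimates can be computed directly from the data and their logarithms. The sketch assumes the index of inequality is estimated as the reciprocal of the mean log value minus the log of the sample minimum, and that the data are positive and not all equal; `pareto_estimates` is a hypothetical name.

```python
import math


def pareto_estimates(values):
    """Return (beta_hat, b_hat): sample minimum and index of inequality."""
    beta_hat = min(values)
    lx_bar = sum(math.log(x) for x in values) / len(values)  # mean of ln(x_i)
    b_hat = 1 / (lx_bar - math.log(beta_hat))
    return beta_hat, b_hat


beta_hat, b_hat = pareto_estimates([1.5, 2.0, 3.2, 1.8, 5.0])
print(beta_hat, round(b_hat, 4))
```

The denominator is the average of $\ln(x_i / \hat{\beta})$, so it is positive whenever at least one observation exceeds the minimum.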

Student’s t($\nu$)  $\nu$ is the degrees of freedom specified by the user.

Uniform($a$, $b$)

$\hat{a} = \min\{x_1, \dots, x_n\}$  (lower bound)

$\hat{b} = \max\{x_1, \dots, x_n\}$  (upper bound)

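Weibull($\beta$, $\gamma$)

$\hat{\beta} = \dfrac{\sum_{i=1}^{n} U_i Y_i - n\,\bar{U}\,\bar{Y}}{\sum_{i=1}^{n} (U_i - \bar{U})^2}$  (scale parameter)

$\hat{\gamma} = \exp\left(-\dfrac{\bar{Y} - \hat{\beta}\,\bar{U}}{\hat{\beta}}\right)$  (shape parameter)

where $Y_i = \ln\left(-\ln\left(1 - fr_{dist}(x_i)\right)\right)$ and $U_i = \ln(x_i)$, and $\bar{U}$ and $\bar{Y}$ are the sample means of the $U_i$ and $Y_i$.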
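The Weibull estimates amount to a least-squares line of $Y_i$ on $U_i$: the first estimate is the slope, and the second is obtained from the intercept $\bar{Y} - \hat{\beta}\bar{U}$. The sketch below assumes that reading of the formulas, Rankit fractional ranks, and distinct positive data; `weibull_estimates` is a hypothetical name.

```python
import math


def weibull_estimates(values):
    """Return (beta_hat, gamma_hat) from the regression of Y_i on U_i."""
    n = len(values)
    ordered = sorted(values)
    u = [math.log(x) for x in ordered]  # U_i = ln(x_i)
    # Y_i = ln(-ln(1 - fr_i)), using Rankit fractional ranks (assumed choice)
    y = [math.log(-math.log(1 - (r - 0.5) / n)) for r in range(1, n + 1)]
    u_bar, y_bar = sum(u) / n, sum(y) / n
    numerator = sum(ui * yi for ui, yi in zip(u, y)) - n * u_bar * y_bar
    denominator = sum((ui - u_bar) ** 2 for ui in u)
    beta_hat = numerator / denominator  # slope of the fitted line
    gamma_hat = math.exp(-(y_bar - beta_hat * u_bar) / beta_hat)
    return beta_hat, gamma_hat


# Synthetic data placed exactly on the quantiles of F(x) = 1 - exp(-(x / 3)^2).
n = 10
data = [3.0 * (-math.log(1 - (r - 0.5) / n)) ** 0.5 for r in range(1, n + 1)]
print(weibull_estimates(data))
```

Because the synthetic points lie exactly on a straight line in the $(U, Y)$ plane, the fit recovers a slope of about 2 and an exponential term of about 3, up to floating-point rounding.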
