Basic Concepts - GIS and Mapping - Lecture Slides
Introduction
- Statistical options tend to be limited in most GIS
applications.
- This is likely to be redressed in the future.
- We will look at spatial statistics in general terms, and
conclude with a review of the software available.
Basic Concepts
- Spatial statistics differ from ‘ordinary’ statistics by the
inclusion of locational properties.
- This makes spatial statistics more complex.
- The book by Bailey and Gatrell (1995) provides an
accessible introduction. They identify four categories:
  - Point pattern data;
  - Spatially continuous data;
  - Areal data; and
  - Interaction data.
- These categories correspond in an obvious way to the conceptual models.
Random Variables
- Statistical models deal with phenomena that are
stochastic (i.e. are subject to uncertainty).
- A random variable Y has values that are subject to
uncertainty (but may not necessarily be random).
- The distribution of possible values is referred to as the
probability distribution.
- It is represented by a function $f_Y(y)$.
- Random variables may be discrete or continuous.
Probabilities
- Probability that Y is between a and b is given by:
  $P(a \le Y \le b) = \sum_{y=a}^{b} f_Y(y)$ if Y is discrete
  $P(a \le Y \le b) = \int_{a}^{b} f_Y(y)\,dy$ if Y is continuous (probability density)
- Cumulative probability (or distribution function) $F_Y$ is
given by:
  $F_Y(y) = \sum_{u=-\infty}^{y} f_Y(u)$ if Y is discrete
  $F_Y(y) = \int_{-\infty}^{y} f_Y(u)\,du$ if Y is continuous
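The interval probability and the cumulative probability can be sketched numerically. The two distributions used here (a fair die for the discrete case, a standard normal for the continuous case) are illustrative assumptions, not part of the slides, and the midpoint rule is only one simple way to approximate the integral.

```python
import math

# Discrete example (assumed): a fair six-sided die, f_Y(y) = 1/6 for y = 1..6.
def f_discrete(y):
    return 1.0 / 6.0 if y in range(1, 7) else 0.0

def prob_discrete(a, b):
    """P(a <= Y <= b): sum the probability function over the integers a..b."""
    return sum(f_discrete(y) for y in range(a, b + 1))

def cdf_discrete(y):
    """F_Y(y): sum of f_Y(u) over all u <= y (the support starts at 1)."""
    return sum(f_discrete(u) for u in range(1, y + 1))

# Continuous example (assumed): the standard normal density.
def f_continuous(y):
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

def prob_continuous(a, b, n=10_000):
    """P(a <= Y <= b): integrate the density over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f_continuous(a + (i + 0.5) * h) for i in range(n))

print(prob_discrete(2, 4))
print(cdf_discrete(6))
print(prob_continuous(-1.96, 1.96))
```

In the discrete case the probabilities are added directly; in the continuous case the density has to be integrated, because the probability of any single exact value is zero.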
Joint Probability
- Can generalise to situations where there is more than one
random variable.
- Joint probability distribution (or density): $f_{XY}(x, y)$
- Covariance: $\mathrm{COV}(X, Y) = E[(X - E(X))(Y - E(Y))]$
- Correlation: $\rho_{X,Y} = \mathrm{COV}(X, Y) / (\sigma_X \sigma_Y)$
- Independence: Neither variable affects the other. The joint
probability is the product of the individual probabilities:
$f_{XY}(x, y) = f_X(x)\, f_Y(y)$
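As a small worked sketch of these definitions, the covariance, correlation and independence check can be computed from a joint probability table. The 2x2 table below is a made-up example (an assumption for illustration), and the variable names are hypothetical.

```python
import numpy as np

# Hypothetical joint probability table f_XY(x, y): rows index x, columns index y.
x_vals = np.array([0.0, 1.0])
y_vals = np.array([0.0, 1.0])
f_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])

# Marginal distributions f_X(x) and f_Y(y).
f_x = f_xy.sum(axis=1)
f_y = f_xy.sum(axis=0)

# Expected values E(X) and E(Y).
e_x = (x_vals * f_x).sum()
e_y = (y_vals * f_y).sum()

# COV(X, Y) = E[(X - E(X)) (Y - E(Y))], weighting each cell by its joint probability.
deviations = np.outer(x_vals - e_x, y_vals - e_y)
cov_xy = (deviations * f_xy).sum()

# Standard deviations, then the correlation COV(X, Y) / (sigma_X * sigma_Y).
sd_x = np.sqrt((((x_vals - e_x) ** 2) * f_x).sum())
sd_y = np.sqrt((((y_vals - e_y) ** 2) * f_y).sum())
rho = cov_xy / (sd_x * sd_y)

# Independence holds only if the joint table equals the product of the marginals.
independent = np.allclose(f_xy, np.outer(f_x, f_y))

print(cov_xy, rho, independent)
```

For this table the joint probabilities differ from the product of the marginals (0.25 in every cell), so X and Y are not independent, which agrees with the non-zero covariance.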
Statistical Models
- A statistical model specifies the probability distribution for the phenomenon being modelled.
- If modelling ozone levels in a region R, we would have a probability distribution for each location s (where s is a 2×1 vector holding the x and y coordinates of the location). Individual locations can be referred to as $s_1$, $s_2$, etc.
- The complete set of random variables may be referred to as a spatial stochastic process.
- The probability distribution for near points will probably be more similar than for distant points, so our random variables will probably not be independent.
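The idea that nearby locations are more alike than distant ones can be sketched with a covariance that decays with distance. Everything specific here is an assumption made for illustration and does not come from the slides: the three locations, the exponential form C(d) = sigma2 * exp(-d / phi), the parameter values, and the choice of a multivariate normal distribution.

```python
import numpy as np

# Three hypothetical locations in the region R, given as (x, y) coordinates.
sites = np.array([[0.0, 0.0],
                  [1.0, 0.0],
                  [10.0, 0.0]])

# Pairwise Euclidean distances between the locations.
diff = sites[:, None, :] - sites[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))

# Assumed covariance model: C(d) = sigma2 * exp(-d / phi).
sigma2, phi = 1.0, 2.0
cov = sigma2 * np.exp(-dist / phi)

# Simulate many realisations of the random variables at the three locations.
rng = np.random.default_rng(0)
draws = rng.multivariate_normal(mean=np.zeros(3), cov=cov, size=5000)

# Sample correlations: the close pair should be far more correlated than the distant pair.
sample_corr = np.corrcoef(draws, rowvar=False)
print(sample_corr[0, 1], sample_corr[0, 2])
```

Under this assumed model the first two locations (distance 1) are strongly correlated, while the first and third (distance 10) are almost uncorrelated, so the random variables are clearly not independent.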
Specifying Models (2)
- Assumptions may be expressed in general terms (e.g. a
Normal distribution, a regression model) with unspecified parameters.
- The model can be fitted using observed data to estimate
the parameters.
- After evaluating the model we may decide to change its
general form.
A Regression Model
- To illustrate, to model our ozone data we might make the following assumptions:
  - The random variables {Y(s), s ∈ R} are independent;
  - They have the same distribution, but different means;
  - Their means are a simple linear function of location, say $E(Y(s)) = \beta_0 + \beta_1 s_1 + \beta_2 s_2$, where $s_1$ and $s_2$ here denote the two coordinates of s;
  - Each Y(s) has a normal distribution about this mean with the same variance $\sigma^2$.
- These assumptions would enable us to estimate the
parameters from the available data.
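A minimal sketch of fitting such a model is given below. It uses synthetic data in place of real ozone observations, so the "true" parameter values, the sample size and the coordinate range are all invented for illustration. Ordinary least squares is used for the fit; under the independence and equal-variance normal assumptions listed above, it gives the same estimates of the β parameters as maximum likelihood.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical "true" parameters (beta_0, beta_1, beta_2) and error standard deviation.
beta_true = np.array([50.0, 2.0, -1.0])
sigma = 0.5
n = 200

# Synthetic sample locations: columns are the coordinates s1 and s2.
coords = rng.uniform(0.0, 10.0, size=(n, 2))

# Design matrix with an intercept column, then simulated observations
# Y(s) = beta_0 + beta_1 * s1 + beta_2 * s2 + independent normal error.
X = np.column_stack([np.ones(n), coords])
y = X @ beta_true + rng.normal(0.0, sigma, size=n)

# Least-squares estimates of the regression coefficients.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Unbiased estimate of the common variance sigma^2 from the residuals.
resid = y - X @ beta_hat
sigma2_hat = (resid @ resid) / (n - X.shape[1])

print(beta_hat, sigma2_hat)
```

With real data, the residuals from a fit like this would be the natural place to start evaluating the model, for example by checking whether they still show spatial pattern.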