















This document covers the following main points: the TSMODEL procedure, notation, models, simple exponential smoothing, ARIMA and transfer function models, outlier detection in time series analysis, and definitions of outliers.
The TSMODEL procedure builds univariate exponential smoothing, ARIMA
(Autoregressive Integrated Moving Average), and transfer function (TF) models for
time series, and produces forecasts. The procedure includes an Expert Modeler that
identifies and estimates an appropriate model for each dependent variable series.
Alternatively, you can specify a custom model.
This algorithm was designed with help from Professor Ruey Tsay at the University
of Chicago.
The following notation is used throughout this chapter unless otherwise stated:
$Y_t$ ($t = 1, 2, \ldots, n$)  Univariate time series under investigation
$n$  Total number of observations
$\hat{Y}_t(k)$  Model-estimated $k$-step ahead forecast at time $t$ for series $Y$
$s$  The seasonal length
TSMODEL estimates exponential smoothing models and ARIMA/TF models.
Exponential smoothing models
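The following notation is specific to exponential smoothing models:
$\alpha$  Level smoothing weight
$\gamma$  Trend smoothing weight
$\delta$  Season smoothing weight
$L(t)$  Level state at time $t$
$T(t)$  Trend state at time $t$
$S(t)$  Seasonal state at time $t$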
Simple Exponential Smoothing
Simple exponential smoothing has a single level parameter and can be described by
the following equations:
$$L(t) = \alpha Y(t) + (1 - \alpha) L(t-1)$$
$$\hat{Y}_t(k) = L(t)$$
It is functionally equivalent to an ARIMA(0,1,1) process.
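As a minimal sketch of simple exponential smoothing, the following Python function assumes the standard level recursion L(t) = αY(t) + (1 − α)L(t − 1), which is Holt's level equation with the trend set to zero. The function name, the argument names, and the starting level `level0` are illustrative assumptions and are not part of the TSMODEL procedure.

```python
def simple_exponential_smoothing(y, alpha, level0):
    """Return the smoothed levels L(1), ..., L(n) for the series y.

    Assumed recursion: L(t) = alpha * Y(t) + (1 - alpha) * L(t - 1),
    started from the hypothetical initial level level0.
    """
    levels = []
    level = level0
    for obs in y:
        level = alpha * obs + (1 - alpha) * level
        levels.append(level)
    return levels


series = [10.0, 12.0, 11.0, 13.0]
levels = simple_exponential_smoothing(series, alpha=0.5, level0=10.0)
# The k-step ahead forecast is flat: it equals the last level for every k.
forecast = levels[-1]
```

Because the forecast does not depend on k, every future horizon receives the same value, which is what the equation for the k-step ahead forecast states.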
Brown’s Exponential Smoothing
Brown’s exponential smoothing has level and trend parameters and can be
described by the following equations:
$$L(t) = \alpha Y(t) + (1 - \alpha) L(t-1)$$
$$T(t) = \alpha (L(t) - L(t-1)) + (1 - \alpha) T(t-1)$$
$$\hat{Y}_t(k) = L(t) + \left((k - 1) + \alpha^{-1}\right) T(t)$$
It is functionally equivalent to an ARIMA(0,2,2) with restriction among MA
parameters.
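The sketch below illustrates Brown's recursions in Python. It assumes the level is updated as L(t) = αY(t) + (1 − α)L(t − 1) and that the k-step forecast is L(t) + ((k − 1) + 1/α)T(t). The function name and the initial states `level0` and `trend0` are hypothetical; the back-casting initialization used by the procedure is not reproduced here.

```python
def brown_forecast(y, alpha, level0, trend0, k):
    """k-step ahead forecast from Brown's exponential smoothing.

    A single weight alpha drives both the level and the trend update.
    level0 and trend0 are assumed starting states, not back-cast values.
    """
    level, trend = level0, trend0
    for obs in y:
        prev_level = level
        level = alpha * obs + (1 - alpha) * prev_level
        trend = alpha * (level - prev_level) + (1 - alpha) * trend
    return level + ((k - 1) + 1.0 / alpha) * trend


one_step = brown_forecast([2.0, 4.0], alpha=0.5, level0=0.0, trend0=0.0, k=1)
three_step = brown_forecast([2.0, 4.0], alpha=0.5, level0=0.0, trend0=0.0, k=3)
```

Using one weight for both states is what imposes the restriction among the MA parameters of the equivalent ARIMA(0,2,2) model.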
Holt’s Exponential Smoothing
Holt’s exponential smoothing has level and trend parameters and can be described
by the following equations:
$$L(t) = \alpha Y(t) + (1 - \alpha)(L(t-1) + T(t-1))$$
$$T(t) = \gamma (L(t) - L(t-1)) + (1 - \gamma) T(t-1)$$
$$\hat{Y}_t(k) = L(t) + k T(t)$$
It is functionally equivalent to an ARIMA(0,2,2).
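Holt's equations translate directly into a short loop. The following Python sketch is illustrative only: the function name and the starting states `level0` and `trend0` are assumptions, and the procedure's own initialization by back-casting is not shown.

```python
def holt_forecast(y, alpha, gamma, level0, trend0, k):
    """k-step ahead forecast L(n) + k * T(n) from Holt's exponential smoothing."""
    level, trend = level0, trend0
    for obs in y:
        prev_level = level
        # Level update uses the previous level plus the previous trend.
        level = alpha * obs + (1 - alpha) * (prev_level + trend)
        # Trend update uses its own weight gamma.
        trend = gamma * (level - prev_level) + (1 - gamma) * trend
    return level + k * trend


# On an exactly linear series with matching initial states the forecast
# continues the line.
two_step = holt_forecast([1.0, 2.0, 3.0, 4.0], 0.5, 0.5, level0=0.0, trend0=1.0, k=2)
```

Unlike Brown's method, the trend has a separate weight γ, so the equivalent ARIMA(0,2,2) model carries no restriction among its MA parameters.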
Winters' Additive Exponential Smoothing
$$\hat{Y}_t(k) = L(t) + k T(t) + S(t + k - s)$$
It is functionally equivalent to an ARIMA(0,1,s+1)(0,1,0) with restrictions among
MA parameters.
Winters' Multiplicative Exponential Smoothing
Winters' multiplicative exponential smoothing has level, trend and season
parameters and can be described by the following equations:
$$L(t) = \alpha (Y(t) / S(t-s)) + (1 - \alpha)(L(t-1) + T(t-1))$$
$$T(t) = \gamma (L(t) - L(t-1)) + (1 - \gamma) T(t-1)$$
$$S(t) = \delta (Y(t) / L(t)) + (1 - \delta) S(t-s)$$
$$\hat{Y}_t(k) = (L(t) + k T(t))\, S(t + k - s)$$
There is no equivalent ARIMA model.
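A minimal Python sketch of the multiplicative Winters recursions follows. The function name, the argument names, and the initial states (`level0`, `trend0`, `season0`) are illustrative assumptions. The sketch also assumes the first observation falls in season 1 and that the horizon satisfies 1 ≤ k ≤ s, so that S(t + k − s) is already available.

```python
def winters_multiplicative_forecast(y, s, alpha, gamma, delta,
                                    level0, trend0, season0, k):
    """k-step ahead forecast (L(n) + k*T(n)) * S(n + k - s), for 1 <= k <= s.

    season0 is a list of s assumed initial seasonal factors; seasons[i]
    always holds the most recent factor for season i (0-based).
    """
    level, trend = level0, trend0
    seasons = list(season0)
    for t, obs in enumerate(y):
        idx = t % s                      # season of this observation
        prev_level = level
        level = alpha * (obs / seasons[idx]) + (1 - alpha) * (prev_level + trend)
        trend = gamma * (level - prev_level) + (1 - gamma) * trend
        seasons[idx] = delta * (obs / level) + (1 - delta) * seasons[idx]
    n = len(y)
    return (level + k * trend) * seasons[(n + k - 1) % s]


# Series built as level t times seasonal factors (0.5, 2.0) with s = 2.
data = [0.5, 4.0, 1.5, 8.0]
next_value = winters_multiplicative_forecast(
    data, 2, 0.5, 0.5, 0.5, level0=0.0, trend0=1.0, season0=[0.5, 2.0], k=1)
```

The seasonal factor multiplies the trended level instead of being added to it, which is why no ARIMA model reproduces these forecasts.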
Estimation and Forecasting of Exponential Smoothing
The sum of squares of the one-step ahead prediction errors,
$\sum_t (Y_t - \hat{Y}_{t-1}(1))^2$, is
minimized to optimize the smoothing weights.
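The criterion above can be sketched for simple exponential smoothing as follows. The text does not say which optimizer the procedure uses, so a plain grid search is swapped in here purely for illustration. The function names, the grid, and the starting level `level0` are assumptions.

```python
def one_step_sse(y, alpha, level0):
    """Sum of squared one-step ahead errors for simple exponential smoothing."""
    level = level0
    sse = 0.0
    for obs in y:
        error = obs - level              # forecast of Y(t) is the previous level
        sse += error * error
        level = alpha * obs + (1 - alpha) * level
    return sse


def best_alpha(y, level0, grid_size=100):
    """Grid search over alpha in (0, 1]; an illustrative stand-in optimizer."""
    candidates = [i / grid_size for i in range(1, grid_size + 1)]
    return min(candidates, key=lambda a: one_step_sse(y, a, level0))


trending = [1.0, 2.0, 3.0, 4.0, 5.0]
alpha_hat = best_alpha(trending, level0=1.0)
```

For a steadily trending series the search selects the largest weight, because a model without a trend state can only follow the series by reacting fully to each new observation.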
Initialization of Exponential Smoothing
Let $L$ denote the level, $T$ the trend, and $S$, a vector of length $s$, the
seasonal states. The initial smoothing states are obtained by back-casting from t=n to
t=0. Initialization for back-casting is described here.
For all the models, $L = y_n$.
For all non-seasonal models with trend, T is the slope of the line (with intercept)
fitted to the data with time as a regressor.
For the simple seasonal model, the elements of S are seasonal averages minus the
sample mean; for example, for monthly data the element corresponding to January
is the average of all January values in the sample minus the sample mean.
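The seasonal initialization for the simple seasonal model can be sketched in a few lines of Python. The function name is hypothetical, and the sketch assumes that the first observation belongs to season 1 and that there are no missing values.

```python
def initial_seasonal_states(y, s):
    """Seasonal averages minus the overall sample mean, one value per season."""
    overall_mean = sum(y) / len(y)
    states = []
    for i in range(s):
        season_values = y[i::s]          # every s-th observation from season i
        states.append(sum(season_values) / len(season_values) - overall_mean)
    return states


# Two seasons alternating between 1 and 3: the sample mean is 2.
states = initial_seasonal_states([1.0, 3.0, 1.0, 3.0], s=2)
```

When every season is observed equally often, the resulting states sum to zero, so they act as deviations around the level.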
For the additive Winters model, fit $y = \alpha t + \sum_{i=1}^{s} \beta_i I_i(t)$ to the data, where
$t$ is time and $I_i(t)$ are seasonal dummies. Note that the model does not have an
intercept. Then $T = \alpha$ and $S = \beta - \mathrm{mean}(\beta)$.
For the multiplicative Winters model, fit a separate line (with intercept) for each
season with time as a regressor. Suppose μ is the vector of intercepts and β is the
vector of slopes (these vectors will be of length s ). Then T = mean ( β)and
ARIMA and Transfer Function Models
The following notation is specific to ARIMA/TF models:
$a_t$ ($t = 1, 2, \ldots, n$)  White noise series normally distributed with mean zero and variance $\sigma^2$
$p$  Order of the non-seasonal autoregressive part of the model
$q$  Order of the non-seasonal moving average part of the model
$d$  Order of the non-seasonal differencing
$P$  Order of the seasonal autoregressive part of the model
$Q$  Order of the seasonal moving-average part of the model
$D$  Order of the seasonal differencing
$s$  Seasonality or period of the model
$\phi_p(B)$  AR polynomial of $B$ of order $p$, $\phi_p(B) = 1 - \varphi_1 B - \varphi_2 B^2 - \cdots - \varphi_p B^p$
$\theta_q(B)$  MA polynomial of $B$ of order $q$, $\theta_q(B) = 1 - \vartheta_1 B - \vartheta_2 B^2 - \cdots - \vartheta_q B^q$
$\Phi_P(B^s)$  Seasonal AR polynomial of $B^s$ of order $P$, $\Phi_P(B^s) = 1 - \Phi_1 B^s - \Phi_2 B^{2s} - \cdots - \Phi_P B^{Ps}$
$\Theta_Q(B^s)$  Seasonal MA polynomial of $B^s$ of order $Q$, $\Theta_Q(B^s) = 1 - \Theta_1 B^s - \Theta_2 B^{2s} - \cdots - \Theta_Q B^{Qs}$
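The moving-average lag polynomial is $MA = \theta_q(B)\,\Theta_Q(B^s)$ and the
autoregressive lag polynomial is $AR = \phi_p(B)\,\Phi_P(B^s)$.
The numerator and denominator polynomials of the transfer function for predictor $i$ are of the form:
$$\mathrm{Num}_i = (\omega_{i0} - \omega_{i1} B - \cdots - \omega_{iu} B^u)(1 - \Omega_{i1} B^s - \cdots - \Omega_{iv} B^{vs})\, B^b$$
and
$$\mathrm{Den}_i = (1 - \delta_{i1} B - \cdots - \delta_{ir} B^r)(1 - \Delta_{i1} B^s - \cdots)$$
The noise series
$$N_t = \Delta\left(Z_t - \mu - \sum_{i=1}^{k} \frac{\mathrm{Num}_i}{\mathrm{Den}_i}\, \Delta_i X_{it}\right)$$
is assumed to be a mean zero, stationary ARMA process.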
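To make the autoregressive lag polynomial AR concrete, the Python sketch below expands the product of a non-seasonal and a seasonal polynomial into a single list of coefficients indexed by the power of B. The helper names are hypothetical. The sign convention, 1 minus the coefficients times powers of B, follows the notation above.

```python
def lag_polynomial(coeffs, step=1):
    """Coefficients by power of B of 1 - c1*B^step - c2*B^(2*step) - ..."""
    poly = [0.0] * (len(coeffs) * step + 1)
    poly[0] = 1.0
    for j, c in enumerate(coeffs, start=1):
        poly[j * step] = -c
    return poly


def multiply(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


# (1 - 0.5 B)(1 - 0.25 B^4): one non-seasonal AR term, one seasonal AR term, s = 4.
ar = multiply(lag_polynomial([0.5]), lag_polynomial([0.25], step=4))
```

The cross term at lag s + 1 appears only in the expanded product, which is how a multiplicative seasonal model differs from an additive one with the same two lags.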
Estimation and Forecasting of ARIMA/TF
There are two forecasting algorithms available: Conditional Least Squares (CLS)
and Exact Least Squares (ELS), also called Unconditional Least Squares (ULS), forecasting.
These two algorithms differ in only one aspect: they forecast the noise process
differently. The general steps in the forecasting computations are as follows:
Forecasting the noise process: this is one-step ahead forecasting during the
historical period and multi-step ahead forecasting after that. The differences
between the CLS and ELS forecasting methodologies surface in this step. The
prediction variances of noise forecasts are also computed in this step.
Obtaining the final forecasts: the contributions of the constant term and the
transfer function inputs are added back to the noise forecasts, and the result is
then integrated and back-transformed. The prediction variances of
noise forecasts also may have to be processed to obtain the final
prediction variances.
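Let $\hat{N}_t(k)$ and $\sigma_t^2(k)$ be the $k$-step ahead forecast of the noise series and its forecast variance, respectively.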
Conditional least squares (CLS) method
$\hat{N}_t(k) = E(N_{t+k} \mid N_t, N_{t-1}, \ldots)$, assuming $N_t = 0$ for $t < 0$.
$$\sigma_t^2(k) = \sigma^2 \sum_{j=0}^{k-1} \psi_j^2$$
Minimize $S = \sum_t (N(t) - \hat{N}(t))^2$, where $\hat{N}(t)$ is the one-step ahead forecast.
Missing values are imputed with forecast values of $N_t$.
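The k-step forecast variance can be illustrated with a short Python sketch. The text here does not define the weights ψ_j, so the sketch assumes they are the coefficients of the power series expansion of the MA polynomial divided by the AR polynomial of a stationary ARMA noise model, using the sign convention of the notation section. The function names are hypothetical.

```python
def psi_weights(phi, theta, count):
    """First `count` psi weights of an ARMA model (assumed definition).

    phi and theta are AR and MA coefficients under the convention
    phi(B) = 1 - phi_1 B - ... and theta(B) = 1 - theta_1 B - ...
    """
    psi = [1.0]
    for j in range(1, count):
        value = -theta[j - 1] if j <= len(theta) else 0.0
        for i in range(1, min(j, len(phi)) + 1):
            value += phi[i - 1] * psi[j - i]
        psi.append(value)
    return psi


def forecast_variance(phi, theta, sigma2, k):
    """sigma^2 times the sum of squared psi weights for j = 0, ..., k - 1."""
    return sigma2 * sum(w * w for w in psi_weights(phi, theta, k))


# AR(1) with coefficient 0.5: psi weights are 1, 0.5, 0.25, ...
variance_3 = forecast_variance([0.5], [], sigma2=1.0, k=3)
```

The variance grows with the horizon k because each additional step adds one more squared weight to the sum.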
Maximum likelihood (ML) method (Brockwell and Davis, 1991)
$$\hat{N}_t(k) = E(N_{t+k} \mid N_t, N_{t-1}, \ldots, N_1)$$
Maximize the likelihood of $\{N(t) - \hat{N}(t)\}_{t=1}^{n}$; that is,
$$L = -\ln(S/n) - \frac{1}{n} \sum_{j=1}^{n} \ln(\eta_j)$$
where $S = \sum_t (N(t) - \hat{N}(t))^2 / \eta_t$, and $\sigma_t^2 = \sigma^2 \eta_t$ is the one-step ahead
forecast variance.
When missing values are present, a Kalman filter is used to calculate $\hat{N}_t(k)$.
Error Variance
$$\sigma^2 = S / (n - k)$$
in both methods. Here $n$ is the number of non-zero residuals and $k$ is the number
of parameters (excluding error variance).
Initialization of ARIMA/TF
Notations
The following notation is specific to outlier detection:
$U(t)$ or $U_t$  The uncontaminated series, outlier free. It is assumed to be a univariate ARIMA or transfer function model.
Definitions of outliers
Types of outliers are defined separately here. In practice any combination of these
types can occur in the series under study.
AO (Additive Outliers)
Assuming that an AO outlier occurs at time t=T, the observed series can be
represented as
$$Y(t) = U(t) + w I_T(t)$$
where $I_T(t) = \begin{cases} 1 & t = T \\ 0 & t \neq T \end{cases}$ is a pulse function and $w$ is the deviation from the true
$U(T)$ caused by the outlier.
IO (Innovational Outliers)
Assuming that an IO outlier occurs at time t=T, then
$$Y(t) = \mu(t) + \frac{\theta(B)}{\Delta \phi(B)} \left(a(t) + w I_T(t)\right)$$
LS (Level Shift)
Assuming that a LS outlier occurs at time t=T, then
$$Y(t) = U(t) + w S_T(t)$$
where $S_T(t) = \frac{1}{1 - B} I_T(t) = \begin{cases} 1 & t \geq T \\ 0 & \text{otherwise} \end{cases}$ is a step function.
TC (Temporary/Transient Change)
Assuming that a TC outlier occurs at time t=T, then
$$Y(t) = U(t) + w D_T(t)$$
where $D_T(t) = \frac{1}{1 - \delta B} I_T(t)$, $0 < \delta < 1$, is a damping function.
SA (Seasonal Additive)
Assuming that a SA outlier occurs at time t=T, then
$$Y(t) = U(t) + w\, SS_T(t)$$
where $SS_T(t) = \frac{1}{1 - B^s} I_T(t) = \begin{cases} 1 & t = T + ks,\ k \geq 0 \\ 0 & \text{otherwise} \end{cases}$ is a step seasonal
pulse function.
LT (Local Trend)
Assuming that a LT outlier occurs at time t=T, then
$$Y(t) = U(t) + w\, T_T(t)$$
where $T_T(t) = \frac{1}{(1 - B)^2} I_T(t) = \begin{cases} t + 1 - T & t \geq T \\ 0 & \text{otherwise} \end{cases}$ is a local trend function.
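The deterministic outlier shapes defined above can be generated with a small Python helper, sketched here for illustration. The function name and the default damping value are assumptions. IO is left out because its shape depends on the ARIMA polynomials of the fitted model and not only on T.

```python
def outlier_regressor(kind, n, T, s=1, delta=0.7):
    """Values of the outlier shape function at t = 1, ..., n for an outlier at T.

    kind is one of "AO", "LS", "TC", "SA", "LT"; delta (TC only) is an
    assumed damping factor with 0 < delta < 1.
    """
    if kind not in ("AO", "LS", "TC", "SA", "LT"):
        raise ValueError("unsupported outlier type: " + kind)
    values = []
    for t in range(1, n + 1):
        if t < T:
            v = 0.0                                  # no effect before the outlier
        elif kind == "AO":
            v = 1.0 if t == T else 0.0               # pulse
        elif kind == "LS":
            v = 1.0                                  # step
        elif kind == "TC":
            v = delta ** (t - T)                     # geometrically damped pulse
        elif kind == "SA":
            v = 1.0 if (t - T) % s == 0 else 0.0     # pulse every s periods
        else:
            v = float(t + 1 - T)                     # local trend ramp
        values.append(v)
    return values


ramp = outlier_regressor("LT", n=5, T=3)
```

Each shape, multiplied by the magnitude w and added to the uncontaminated series, gives the observed series for that outlier type.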
AO patch
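A general model incorporating outliers of types $O_1, \ldots, O_M$ at times $T_1, \ldots, T_M$ can be written as
$$Y(t) = \mu(t) + \sum_{k=1}^{M} w_k L_{O_k}(B)\, I_{T_k}(t) + \frac{\theta(B)}{\Delta \phi(B)}\, a(t)$$
where $M$ is the number of outliers.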
Estimating the effects of an outlier
Suppose that the model and the model parameters are known. Also suppose that the
type and location of an outlier are known. Estimation of the magnitude of the
outlier and test statistics are as follows.
The results in this section are only used in the intermediate steps of the outlier
detection procedure. The final estimates of outliers are from the model
incorporating all the outliers, in which all parameters are jointly estimated.
Non-AO patch deterministic outliers
For a deterministic outlier of any type at time $T$ (except AO patch), let $e(t)$ be the
residual and $x(t) = \pi(B) L(B) \Delta I_T(t)$, so:
$$e(t) = w\, x(t) + a(t)$$
From the residuals $e(t)$, the parameters for outliers at time $T$ are estimated by simple
linear regression of $e(t)$ on $x(t)$.
For $j$ = 1 (AO), 2 (IO), 3 (LS), 4 (TC), 5 (SA), 6 (LT), define the test statistics:
$$\lambda_j(T) = \frac{w_j(T)}{\sqrt{\mathrm{Var}(w_j(T))}}$$
Under the null hypothesis of no outlier, $\lambda_j(T)$ is distributed as N(0,1), assuming
the model and model parameters are known.
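The regression step above can be sketched as follows. This is a minimal illustration that assumes a regression without intercept, which matches the model e(t) = w x(t) + a(t), and a known error variance σ². The function name is hypothetical.

```python
import math


def outlier_effect(residuals, x, sigma2):
    """Estimate the outlier magnitude w and its test statistic lambda.

    Least squares without intercept: w = sum(e * x) / sum(x^2), with
    Var(w) = sigma2 / sum(x^2) when sigma2 is treated as known.
    """
    sxx = sum(v * v for v in x)
    w = sum(e * v for e, v in zip(residuals, x)) / sxx
    lam = w / math.sqrt(sigma2 / sxx)
    return w, lam


# A single large residual at the outlier time, with a pulse regressor.
w_hat, lam = outlier_effect([0.0, 0.0, 3.0, 0.0], [0.0, 0.0, 1.0, 0.0], sigma2=1.0)
```

A large absolute value of the statistic at some time T suggests an outlier of the corresponding type at that time.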
AO patch outliers
For an AO patch of length $k$ starting at time $T$, let $x_i(t; T) = \pi(B) \Delta I_{T+i-1}(t)$ for $i$ =
1 to $k$, then
$$e(t) = \sum_{i=1}^{k} w_i(T)\, x_i(t; T) + a(t)$$
Multiple linear regression is used to fit this model. Test statistics are defined as:
$$\chi^2(T) = \frac{\mathbf{w}'(T)\, X'(T) X(T)\, \mathbf{w}(T)}{\sigma^2}$$
Assuming the model and model parameters are known, $\chi^2(T)$ has a Chi-square
distribution with $k$ degrees of freedom under the null hypothesis
$w_1(T) = \cdots = w_k(T) = 0$.
Detection of outliers
The following flow chart demonstrates how automatic outlier detection works. Let
M be the total number of outliers and Nadj be the number of times the series is
adjusted for outliers. At the beginning of the procedure, M = 0 and Nadj = 0.
Goodness-of-fit statistics are based on the original series Y(t). Let k be the number of
parameters in the model and n the number of non-missing residuals.
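Mean Squared Error
$$MSE = \frac{\sum_t (Y(t) - \hat{Y}(t))^2}{n - k}$$
Mean Absolute Percent Error
$$MAPE = \frac{100}{n} \sum_t \left| \frac{Y(t) - \hat{Y}(t)}{Y(t)} \right|$$
Maximum Absolute Percent Error
$$MaxAPE = 100 \max_t \left| \frac{Y(t) - \hat{Y}(t)}{Y(t)} \right|$$
Mean Absolute Error
$$MAE = \frac{1}{n} \sum_t \left| Y(t) - \hat{Y}(t) \right|$$
Maximum Absolute Error
$$MaxAE = \max_t \left| Y(t) - \hat{Y}(t) \right|$$
Normalized Bayesian Information Criterion
$$\text{Normalized BIC} = \ln(MSE) + k\, \frac{\ln(n)}{n}$$
R-Squared
$$R^2 = 1 - \frac{\sum_t (Y(t) - \hat{Y}(t))^2}{\sum_t (Y(t) - \bar{Y})^2}$$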
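The goodness-of-fit statistics above can be computed together, as in the following Python sketch. It assumes that there are no missing values, that no observation equals zero (needed for the percent errors), and that n is simply the length of the series. The function and key names are illustrative.

```python
import math


def fit_statistics(y, yhat, k):
    """Goodness-of-fit statistics for observations y and fitted values yhat.

    k is the number of model parameters; n is taken as len(y).
    """
    n = len(y)
    errors = [a - f for a, f in zip(y, yhat)]
    sse = sum(e * e for e in errors)
    mse = sse / (n - k)
    mean_y = sum(y) / n
    sst = sum((a - mean_y) ** 2 for a in y)
    abs_pct = [abs(e / a) for e, a in zip(errors, y)]
    return {
        "MSE": mse,
        "MAPE": 100.0 / n * sum(abs_pct),
        "MaxAPE": 100.0 * max(abs_pct),
        "MAE": sum(abs(e) for e in errors) / n,
        "MaxAE": max(abs(e) for e in errors),
        "NormalizedBIC": math.log(mse) + k * math.log(n) / n,
        "R2": 1.0 - sse / sst,
    }


stats = fit_statistics([2.0, 4.0, 8.0, 10.0], [1.0, 5.0, 8.0, 12.0], k=1)
```

The normalized BIC penalizes each additional parameter by ln(n)/n, so among models with similar MSE the one with fewer parameters scores lower.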
Stationary R-Squared
A similar statistic was used by Harvey (1989).
$$R_S^2 = 1 - \frac{\sum_t (Z(t) - \hat{Z}(t))^2}{\sum_t (\Delta Z(t) - \overline{\Delta Z})^2}$$
where
The sum is over the terms in which both $Z(t) - \hat{Z}(t)$ and $\Delta Z(t) - \overline{\Delta Z}$ are not
missing.
$\overline{\Delta Z}$ is the simple mean model for the differenced transformed series, which is
equivalent to the univariate baseline model ARIMA(0,d,0)(0,D,0).
For the seven exponential smoothing models currently under consideration (simple,
double or Brown, Holt, damped trend, simple seasonal, additive Winters,
multiplicative Winters), use the differencing orders (corresponding to their
equivalent ARIMA models if there is one):
$$d = \begin{cases} 2 & \text{Brown, Holt} \\ 1 & \text{other} \end{cases}, \qquad D = \begin{cases} 0 & s = 1 \\ 1 & s > 1 \end{cases}$$
Note: Both the stationary and usual R-squared can be negative with range (−∞, 1 ].
A negative R-squared value means that the model under consideration is worse than
the baseline model. Zero R-squared means that the model under consideration is as
good or bad as the baseline model. Positive R-squared means that the model under
consideration is better than the baseline model.
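The interpretation in the note can be checked numerically with a small Python sketch of the stationary R-squared. It assumes no missing values, so the sum simply runs over the time points left after differencing. The function names are hypothetical.

```python
def difference(z, lag):
    """Difference a series at the given lag, dropping the first `lag` values."""
    return [z[t] - z[t - lag] for t in range(lag, len(z))]


def stationary_r_squared(z, zhat, d, D, s):
    """1 - sum (Z - Zhat)^2 / sum (dZ - mean(dZ))^2 over the differenced span."""
    dz = list(z)
    for _ in range(d):
        dz = difference(dz, 1)           # non-seasonal differencing
    for _ in range(D):
        dz = difference(dz, s)           # seasonal differencing
    lost = len(z) - len(dz)              # time points lost to differencing
    mean_dz = sum(dz) / len(dz)
    num = sum((a - f) ** 2 for a, f in zip(z[lost:], zhat[lost:]))
    den = sum((v - mean_dz) ** 2 for v in dz)
    return 1.0 - num / den


z = [1.0, 2.0, 4.0, 7.0, 11.0]
# Baseline ARIMA(0,1,0) with constant: previous value plus the mean difference 2.5.
baseline_fit = [0.0, 3.5, 4.5, 6.5, 9.5]
r2_baseline = stationary_r_squared(z, baseline_fit, d=1, D=0, s=1)
```

A fit identical to the baseline model gives exactly zero, a perfect fit gives one, and a fit worse than the baseline gives a negative value.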
ARIMA Expert Model
Note: for short series, do the following:
In the multivariate situation, users can let the Expert Modeler select a model for
them from:
Note: If the multivariate expert ARIMA model drops all the predictors and
ends up with a univariate expert ARIMA model, this univariate expert ARIMA
model will be compared with expert exponential smoothing models as before
and the Expert Modeler will decide which is the best overall model.