

















Data science notes on time series forecasting
Typology: Study notes


















Disclaimer: This material is protected under copyright act AnalytixLabs ©, 2011-2016. Unauthorized use and/or duplication of this material or any part of this material, including data, in any form without explicit and written permission from AnalytixLabs is strictly prohibited. Any violation of this copyright will attract legal action.
Introduction to Time Series Forecasting
The success of any analysis ultimately depends on the availability of the appropriate data. It is therefore essential that we spend some time discussing the types of data that one may encounter in empirical analysis. Three types of data may be available for empirical analysis:
▪ Cross-sectional data
▪ Time series data
Classification of the widely used forecasting techniques
▪ Causal models
▪ Time series models
▪ Smoothing techniques
▪ Regression analysis
▪ Box-Jenkins processes
▪ Exponential smoothing and its extensions
Time Series Analysis – Time Series Components
Components of a time series are:
▪ Secular Trend: This refers to a gradual, long-term movement in the data.
▪ Seasonal Trend: These are short-term fluctuations in the data where the period of fluctuation is less than one year.
▪ Cyclical Movements: These are oscillatory movements in the data where the period of oscillation is typically more than a year. So, these are medium-term movements.
▪ Irregular Components: These are disturbances or residual variation that remain after all the other behaviors have been accounted for.
Forecasting Techniques
▪ Judgmental (Qualitative): Executive Committee, Delphi Technique, Survey of Sales Force, Survey of Customers, Game Theory, Judgmental Bootstrapping
▪ Statistical (Quantitative)
  ▪ Time Series: Smoothing, ARCH/GARCH Models, Moving Averages
  ▪ Causal: Simple Linear Regression, Multiple Linear Regression, Neural Networks
Time Series Techniques (1/2)
▪ Trend (T)
▪ Seasonal index (SI)
▪ Combined trend and seasonal index (Comb)
▪ Simple Averages
▪ Moving Averages (MA)
▪ Exponential smoothing (ES)
▪ Naive
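Three of the simplest techniques in the list above (naive, moving average, and simple exponential smoothing) can be sketched in a few lines. This is a minimal illustration, not taken from the notes: the function names, the `sales` data, and the choice to initialise the smoothed level with the first observation are all assumptions.

```python
def naive_forecast(series):
    """Naive method: the next value is assumed equal to the last observation."""
    return series[-1]


def moving_average_forecast(series, window):
    """Forecast the next value as the mean of the last `window` observations."""
    if not 1 <= window <= len(series):
        raise ValueError("window must be between 1 and len(series)")
    return sum(series[-window:]) / window


def exponential_smoothing_forecast(series, alpha):
    """Simple exponential smoothing; returns the one-step-ahead forecast.

    The level starts at the first observation (an assumed initialisation)
    and is updated as: level = alpha * y + (1 - alpha) * level.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


# Hypothetical data, for illustration only.
sales = [10.0, 12.0, 11.0, 13.0, 14.0]
naive = naive_forecast(sales)
ma3 = moving_average_forecast(sales, window=3)
ses = exponential_smoothing_forecast(sales, alpha=0.5)
```

A larger `alpha` makes the smoothed level react faster to recent observations; with `alpha=1` simple exponential smoothing reduces to the naive forecast.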
Time Series Techniques (2/2)
▪ Simultaneous equation model or structural equation model
▪ VAR or VECM
▪ State space
▪ Panel data model, etc.
▪ Univariate time series model
  ▪ ARIMA(p,d,q), ARCH, GARCH, etc.
▪ Single equation multivariate model
  ▪ ARIMAX(p,d,q) with ARCH, GARCH, etc.
Types of Time Series Techniques
Pattern-less Techniques
Exponential Smoothing
Pattern Based Technique - Decomposition Method
Seasonality
Seasonality: Many time series follow recurring seasonal patterns. For example, sales may peak around Christmas year after year, and movie ticket sales may increase noticeably on weekends. Thus, it may be useful to smooth the seasonal component independently with an extra parameter.
Seasonality can be additive or multiplicative in nature:
▪ Fig 1 (Passengers) displays multiplicative seasonality with an exponentially rising trend
▪ Fig 2 (Retail Sales) displays additive seasonality with a constant mean (no linear trend)
Detection: Seasonality can be detected by plotting and visually inspecting the series, by the method of indexing, and by analyzing the autocorrelogram.
Decomposition Models
▪ There are two types of decomposition models (additive and multiplicative), both following the classical decomposition of a time series into trend, seasonal, cyclical and irregular components.
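Written out, the two classical decomposition forms are usually stated as follows. The symbols are notation assumed here rather than taken from the notes: Y_t is the observed value at time t, and T_t, S_t, C_t, I_t are the trend, seasonal, cyclical and irregular components.

```latex
\begin{align*}
\text{Additive:} \quad & Y_t = T_t + S_t + C_t + I_t \\
\text{Multiplicative:} \quad & Y_t = T_t \times S_t \times C_t \times I_t
\end{align*}
```

For a strictly positive series, taking logarithms turns the multiplicative form into an additive one in the logged components, which is one reason log transformations are common when the seasonal swings grow with the level of the series.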
Classical Multiplicative Decomposition (3/3)
Models for Time Series Analysis
Stationary
Stationarity
How to make a series stationary:
▪ Differencing: involves taking the difference between successive values
▪ Log transformations: stabilize a variance that grows with the level of the series and turn an exponential trend into a linear one, which differencing can then remove
Why do we need to take care of stationarity?
▪ The reason I took up this section first is that unless your time series is stationary, you cannot build a classical time series model such as ARMA on it.
▪ Where the stationarity criteria are violated, the first requisite is to stationarize the time series and then try stochastic models to predict it.
▪ There are multiple ways of bringing about this stationarity. Some of them are detrending, differencing, etc.
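The two transformations named above, differencing and the log transformation, can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names and the `growth` series (which doubles every period) are hypothetical.

```python
import math


def difference(series, lag=1):
    """Return the differenced series y[t] - y[t - lag]."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def log_transform(series):
    """Natural log of each value; only defined for strictly positive data."""
    if any(y <= 0 for y in series):
        raise ValueError("log transform requires strictly positive values")
    return [math.log(y) for y in series]


# Hypothetical series with an exponential trend (doubles every period).
growth = [100.0, 200.0, 400.0, 800.0, 1600.0]

raw_diff = difference(growth)                 # still trending: 100, 200, 400, 800
log_diff = difference(log_transform(growth))  # roughly constant at log(2)
```

Differencing the raw series leaves a trend in place, while differencing the logged series gives a flat sequence, which is why the two steps are often applied together.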
ACF
While examining correlograms, one should keep in mind that autocorrelations for consecutive lags are formally dependent. Consider the following example: if the first element is closely related to the second, and the second to the third, then the first element must also be somewhat related to the third one, and so on. This implies that the pattern of serial dependencies can change considerably after removing the first-order autocorrelation (i.e., after differencing the series with a lag of 1).
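To make the correlogram concrete, here is a minimal sketch of the sample autocorrelation function. It uses the common estimator that divides each lagged sum of products by the lag-0 sum of squares; the function name and the two example series are assumptions for illustration.

```python
def acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag.

    r_k = sum_{t=k}^{n-1} (y_t - mean)(y_{t-k} - mean) / sum_t (y_t - mean)^2
    """
    n = len(series)
    mean = sum(series) / n
    dev = [y - mean for y in series]
    c0 = sum(d * d for d in dev)
    if c0 == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


# Hypothetical examples: a steadily trending series and an alternating one.
trend_acf = acf([float(v) for v in range(1, 11)], max_lag=3)
alternating_acf = acf([1.0, -1.0] * 4, max_lag=1)
```

The trending series shows a large positive lag-1 autocorrelation, whereas the alternating series shows a strongly negative one, which matches the over-differencing warning sign discussed later in these notes.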
PACF
Another useful method to examine serial dependencies is the partial autocorrelation function (PACF), an extension of autocorrelation in which the dependence on the intermediate elements (those within the lag) is removed. In other words, the partial autocorrelation is similar to autocorrelation, except that when calculating it, the (auto)correlations with all the elements within the lag are partialled out. If a lag of 1 is specified (i.e., there are no intermediate elements within the lag), then the partial autocorrelation is equivalent to the autocorrelation. In a sense, the partial autocorrelation provides a "cleaner" picture of serial dependencies for individual lags (not confounded by other serial dependencies).
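One standard way to obtain partial autocorrelations from autocorrelations is the Durbin-Levinson recursion. The sketch below is an illustration under assumptions: the function name is hypothetical, and the input is the theoretical ACF of an AR(1) process with coefficient 0.6 (rho_k = 0.6^k), chosen because its PACF should be 0.6 at lag 1 and zero afterwards.

```python
def pacf_from_acf(rho):
    """Partial autocorrelations via the Durbin-Levinson recursion.

    `rho` holds autocorrelations for lags 0..K with rho[0] == 1.
    Returns a list of the same length; entry k is the lag-k PACF.
    """
    max_lag = len(rho) - 1
    pacf = [1.0]
    prev = []  # prev[j] holds phi_{k-1, j+1} from the previous step
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append phi_{k,k}.
        cur = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
        cur.append(phi_kk)
        pacf.append(phi_kk)
        prev = cur
    return pacf


# Theoretical ACF of an AR(1) process with coefficient 0.6 (assumed example).
ar1_rho = [0.6 ** k for k in range(5)]
ar1_pacf = pacf_from_acf(ar1_rho)
```

At lag 1 the recursion simply returns the lag-1 autocorrelation, which is the equivalence stated in the paragraph above.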
Augmented Dickey-Fuller Test (Unit-Root Test)
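The idea behind the unit-root test can be sketched with the basic (non-augmented) Dickey-Fuller regression: regress the first difference on a constant and the lagged level, and look at the t-statistic of the lagged-level coefficient. The code below is a simplified illustration, not the full augmented test (which adds lagged differences as extra regressors); the function name, the simulated data, and the seed are assumptions. The reference value of roughly -2.86 is the large-sample 5% critical value for the constant-only case.

```python
import math
import random


def dickey_fuller_stat(series):
    """t-statistic of gamma in: dy_t = alpha + gamma * y_{t-1} + e_t.

    Strongly negative values are evidence against a unit root.
    """
    x = series[:-1]
    dy = [series[t] - series[t - 1] for t in range(1, len(series))]
    n = len(x)
    mean_x = sum(x) / n
    mean_dy = sum(dy) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (di - mean_dy) for xi, di in zip(x, dy))
    gamma = sxy / sxx
    alpha = mean_dy - gamma * mean_x
    rss = sum((di - alpha - gamma * xi) ** 2 for xi, di in zip(x, dy))
    se_gamma = math.sqrt(rss / (n - 2) / sxx)
    return gamma / se_gamma


# Simulated data (assumption): white noise and the random walk built from it.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
walk = []
level = 0.0
for e in noise:
    level += e
    walk.append(level)

wn_stat = dickey_fuller_stat(noise)  # stationary: far below -2.86
rw_stat = dickey_fuller_stat(walk)   # unit root: typically much closer to zero
```

In practice one would usually rely on a library implementation, for example `adfuller` in `statsmodels.tsa.stattools`, which handles lag selection and reports p-values.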
ARIMA - Three Stages
Identification, Estimation, Forecasting
Identification
Here we need to check:
(a) Stationarity/non-stationarity
(b) Seasonality
(c) Order of AR and MA processes
(d) White noise
Estimation
Specify an ARIMA model to fit to the variable specified in the previous IDENTIFY statement, and estimate the parameters of the specified model.
Forecasting
Forecast the future values using the model.
Box-Jenkins Procedure
The steps are
Now the million-dollar question is: how does one know whether a series follows a purely AR process, a purely MA process, an ARMA process, or an ARIMA process, in which case we must know p, d, and q? The Box-Jenkins (BJ) methodology comes in handy in answering this question. The method consists of the following steps:
▪ If the series has positive autocorrelations out to a high number of lags, then it probably needs a higher order of differencing.
▪ If the lag-1 autocorrelation is zero or negative, or the autocorrelations are all small and patternless, then the series does not need a higher order of differencing. If the lag-1 autocorrelation is -0.5 or more negative, the series may be overdifferenced.
▪ The optimal order of differencing is often the order of differencing at which the standard deviation is lowest.
▪ A model with no orders of differencing assumes that the original series is stationary (among other things, mean-reverting). A model with one order of differencing assumes that the original series has a constant average trend. A model with two orders of total differencing assumes that the original series has a time-varying trend.
▪ A model with no orders of differencing normally includes a constant term (which represents the mean of the series). A model with two orders of total differencing normally does not include a constant term. In a model with one order of total differencing, a constant term should be included if the series has a non-zero average trend.
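The "lowest standard deviation" rule of thumb above can be sketched as a small search over differencing orders. This is only an illustration of that one heuristic, under assumptions: the function names and both example series are hypothetical, and the other rules (for instance the lag-1 autocorrelation checks) would still need to be applied alongside it.

```python
import statistics


def difference(series):
    """First difference: y[t] - y[t-1]."""
    return [series[i] - series[i - 1] for i in range(1, len(series))]


def suggest_differencing_order(series, max_d=2):
    """Return the order d in 0..max_d whose differenced series has the
    lowest (population) standard deviation."""
    best_d = 0
    best_sd = statistics.pstdev(series)
    current = list(series)
    for d in range(1, max_d + 1):
        current = difference(current)
        sd = statistics.pstdev(current)
        if sd < best_sd:
            best_d, best_sd = d, sd
    return best_d


# Hypothetical series: a linear trend plus an alternating +1/-1 component.
trending = [2.0 * t + (-1.0) ** t for t in range(20)]
# Hypothetical series with no trend at all.
flat = [(-1.0) ** t for t in range(20)]

d_trending = suggest_differencing_order(trending)
d_flat = suggest_differencing_order(flat)
```

For the trending series one difference removes the trend and lowers the spread, while a second difference only amplifies the alternating component; for the trendless series any differencing increases the spread.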
▪ If the partial autocorrelation function (PACF) of the differenced series displays a sharp cutoff and/or the lag-1 autocorrelation is positive (i.e., if the series appears slightly "underdifferenced"), then consider adding one or more AR terms to the model. The lag beyond which the PACF cuts off is the indicated number of AR terms.
▪ If the autocorrelation function (ACF) of the differenced series displays a sharp cutoff and/or the lag-1 autocorrelation is negative (i.e., if the series appears slightly "overdifferenced"), then consider adding an MA term to the model. The lag beyond which the ACF cuts off is the indicated number of MA terms.
▪ It is possible for an AR term and an MA term to cancel each other's effects, so if a mixed AR-MA model seems to fit the data, also try a model with one fewer AR term and one fewer MA term, particularly if the parameter estimates in the original model require more than 10 iterations to converge.
▪ If there is a unit root in the AR part of the model (i.e., if the sum of the AR coefficients is almost exactly 1), you should reduce the number of AR terms by one and increase the order of differencing by one.
▪ If there is a unit root in the MA part of the model (i.e., if the sum of the MA coefficients is almost exactly 1), you should reduce the number of MA terms by one and reduce the order of differencing by one.
▪ If the long-term forecasts appear erratic or unstable, there may be a unit root in the AR or MA coefficients.
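Reading a cutoff lag off a correlogram can be approximated in code. The sketch below assumes the common approximate 95% band of ±1.96/√n for judging whether a sample (partial) autocorrelation differs from zero; that band, the function name, and the example values are assumptions for illustration, not part of the notes.

```python
import math


def cutoff_lag(correlations, n_obs):
    """Largest lag whose correlation lies outside +/- 1.96 / sqrt(n_obs).

    `correlations[k]` is the ACF or PACF value at lag k (index 0 is lag 0
    and is ignored). Returns 0 if no lag is outside the band.
    """
    bound = 1.96 / math.sqrt(n_obs)
    significant = [
        k for k in range(1, len(correlations)) if abs(correlations[k]) > bound
    ]
    return max(significant) if significant else 0


# Hypothetical PACF values for a series of 100 observations.
example_pacf = [1.0, 0.62, 0.31, 0.04, -0.05, 0.02]
suggested_ar_terms = cutoff_lag(example_pacf, n_obs=100)
```

Applied to a PACF this gives a candidate number of AR terms, and applied to an ACF a candidate number of MA terms. An isolated spike at a high lag can mislead this simple rule, so the plot should still be inspected by eye.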