






RM - Statistics







Financial econometrics Financial econometrics is the application of statistical and mathematical techniques to solve problems in finance.
Time series data Time series data are data that have been collected over a period of time on one or more variables.
Cross-sectional data Cross-sectional data are data on one or more variables collected at a single point in time.
Panel data Panel data have the dimensions of both time series and cross-sectional data, e.g. daily prices of a number of blue chip stocks over two years.
Continuous data Continuous data can take on any value and are not confined to any specific set of numbers.
Discrete data Discrete data can only take on certain values, which are usually integers.
Cardinal Cardinal numbers are those where the actual numerical values that a particular variable takes have meaning, and where there is equal distance between the numerical values. Examples: price of a building, number of houses in a street
Ordinal Ordinal numbers can only be interpreted as providing a position or ordering. Example: position of a runner in a race
Nominal numbers Nominal numbers occur when there is no natural ordering of the values at all. Example: telephone numbers
Simple returns $R_t = \frac{p_t}{p_{t-1}} - 1$
Log returns $R_t = \ln\left(\frac{p_t}{p_{t-1}}\right)$
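The two return definitions can be compared with a minimal sketch. The price list is hypothetical, and the function names are illustrative only. The last line checks a well-known property of log returns: they add up across periods to the log return over the whole interval.

```python
import math

def simple_returns(prices):
    """R_t = p_t / p_{t-1} - 1 for each consecutive pair of prices."""
    return [p1 / p0 - 1 for p0, p1 in zip(prices, prices[1:])]

def log_returns(prices):
    """R_t = ln(p_t / p_{t-1}) for each consecutive pair of prices."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

prices = [100.0, 110.0, 99.0]  # hypothetical prices, not real data
simple = simple_returns(prices)
logs = log_returns(prices)

# Summing the log returns gives the log return over the full period.
total_log_return = sum(logs)
whole_period_log_return = math.log(prices[-1] / prices[0])
```

Note that one observation is lost: three prices give only two returns.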
Why log returns?
Negative consequences of log returns
What are the key features of financial data?
Steps in formulating an econometric model
Kurtosis Kurtosis measures the fatness of the tails of the distribution and how peaked the series is at its mean. The kurtosis of a normal distribution is 3, so excess kurtosis can be calculated as kurtosis minus 3.
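As a rough illustration, kurtosis can be computed as the fourth central moment divided by the squared variance. This sketch uses population moments (dividing by n); statistical packages may apply small-sample corrections, so their numbers can differ slightly. Both samples are made up.

```python
def kurtosis(xs):
    """Fourth central moment divided by the squared second central moment."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2

def excess_kurtosis(xs):
    """Kurtosis relative to the normal distribution's value of 3."""
    return kurtosis(xs) - 3.0

# Hypothetical samples: the second has more mass at zero and more extreme tails.
moderate_tails = [-2.0, 0.0, 0.0, 0.0, 0.0, 2.0]
fat_tails = [-3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.0]

k_moderate = kurtosis(moderate_tails)
k_fat = kurtosis(fat_tails)
```

A positive excess kurtosis, as in the second sample, indicates fatter tails than the normal distribution.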
Population and sample The population is the total collection of all subjects to be studied. A sample is a selection of some items from the population of interest. Example: we are interested in the relation between risk and return for UK stocks. The population is all time series observations on all stocks traded on the London Stock Exchange. The sample could be data from 2005-
Maximum Likelihood estimation technique An approach that can be used for parameter estimation based on the construction and maximisation of a likelihood function, which is particularly useful for non-linear models.
Error term An error term is the part of a regression model that sweeps up any influences on the dependent variable that are not captured by the independent variables.
Population regression function The population regression function (PRF) is a description of the model that is thought to be generating the actual data, and it embodies the true relationship between the variables (i.e. the true values of alpha and beta). It is also known as the data generating process (DGP).
Sample regression function
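The sample regression function (SRF) is the relationship that has been estimated using sample observations: $\hat{y}_t = \hat{\alpha} + \hat{\beta} x_t$.
Since the residual is $\hat{u}_t = y_t - \hat{y}_t$, we can also write $y_t = \hat{\alpha} + \hat{\beta} x_t + \hat{u}_t$.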
We use the SRF to infer likely values of the PRF. We want to know how good our estimates of alpha and beta are.
Estimator An equation that is employed together with the data in order to calculate the parameters that describe the regression relationship.
Estimate Estimates are the actual numerical values for the coefficients.
Standard error Standard error measures the precision or reliability of the regression estimates.
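The distinction between estimator, estimate and standard error can be made concrete with a small sketch. It assumes a simple regression of y on a constant and a single x, and uses the textbook OLS formulas; the function `ols_simple` is the estimator, the numbers it returns are the estimates, and the data are hypothetical.

```python
import math

def ols_simple(x, y):
    """OLS for y_t = alpha + beta * x_t + u_t with one regressor.

    Returns (alpha_hat, beta_hat, se_alpha, se_beta).
    """
    T = len(x)
    x_bar = sum(x) / T
    y_bar = sum(y) / T
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    beta_hat = sxy / sxx
    alpha_hat = y_bar - beta_hat * x_bar

    # Residuals and the estimated error variance (T - 2 degrees of freedom).
    residuals = [yi - (alpha_hat + beta_hat * xi) for xi, yi in zip(x, y)]
    s2 = sum(u ** 2 for u in residuals) / (T - 2)

    se_beta = math.sqrt(s2 / sxx)
    se_alpha = math.sqrt(s2 * sum(xi ** 2 for xi in x) / (T * sxx))
    return alpha_hat, beta_hat, se_alpha, se_beta

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
alpha_hat, beta_hat, se_alpha, se_beta = ols_simple(x, y)
```

The smaller the standard errors relative to the estimates, the more precisely the coefficients have been estimated.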
Statistical inference Statistical inference is the process of drawing conclusions about the likely characteristics of the population from the sample estimates. We can do this by using hypothesis tests.
Hypothesis tests A hypothesis test is a framework for considering plausible values of the true population parameters given the sample estimates. We always have a null hypothesis and an alternative hypothesis. We calculate a test statistic, which we compare to a critical value. We then reject or do not reject the null hypothesis depending on the values of the test statistic and the critical value.
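The test procedure described above can be sketched for a two-sided t-test on a single coefficient. The estimate, standard error and degrees of freedom below are invented for illustration; the critical value 2.086 is the 5% two-sided value of the t-distribution with 20 degrees of freedom, taken from standard tables.

```python
def t_test(estimate, hypothesised, std_error, critical_value):
    """Two-sided test of H0: parameter == hypothesised.

    Returns the test statistic and whether H0 is rejected.
    """
    t_stat = (estimate - hypothesised) / std_error
    reject = abs(t_stat) > critical_value
    return t_stat, reject

CRITICAL_5PCT_20DF = 2.086  # from t tables: 5% two-sided, 20 degrees of freedom

# Hypothetical coefficient estimates and standard errors.
t_a, reject_a = t_test(estimate=0.5, hypothesised=0.0, std_error=0.2,
                       critical_value=CRITICAL_5PCT_20DF)
t_b, reject_b = t_test(estimate=0.3, hypothesised=0.0, std_error=0.2,
                       critical_value=CRITICAL_5PCT_20DF)
```

In the first case the test statistic lies beyond the critical value, so the null hypothesis is rejected; in the second it does not, so the null is not rejected.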
T-distribution The t-distribution is symmetric, centred on zero and bell-shaped like the normal distribution, but it has fatter tails, meaning that it is more prone to producing values that
Homoscedasticity If the variance of the error terms is constant and finite over time, the errors are said to be homoscedastic.
Heteroscedasticity If the errors do not have a constant variance, they are said to be heteroscedastic.
Autocorrelation This is where there is a relationship between the $i$th and $j$th residuals. Recall that one of the assumptions of the classical linear regression model (CLRM) was that no such relationship exists. We want our residuals to be random, and if there is evidence of autocorrelation in the residuals, then it implies that we could predict the sign of the next residual and get the right answer more than half the time on average!
A positively autocorrelated series of residuals will not cross the time axis very frequently.
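One informal way to see this is to count how often consecutive residuals share the same sign. The sketch below is not a formal autocorrelation test, only an illustration of the "predict the next sign" idea; both residual series are invented.

```python
def same_sign_fraction(residuals):
    """Fraction of consecutive residual pairs that have the same sign."""
    pairs = list(zip(residuals, residuals[1:]))
    same = sum(1 for a, b in pairs if a * b > 0)
    return same / len(pairs)

# Hypothetical residuals: the first series drifts slowly above and below zero
# (positive autocorrelation), the second flips sign every period.
persistent = [1.0, 1.5, 0.8, 0.3, -0.4, -1.1, -0.9, -0.2, 0.5, 1.2]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

frac_persistent = same_sign_fraction(persistent)
frac_alternating = same_sign_fraction(alternating)
```

A fraction well above one half, as in the first series, is what few crossings of the time axis look like in numbers.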
Lagged value The value that the variable took during a previous period. For example, the value of $y_t$ lagged one period is written $y_{t-1}$.
First difference The first difference of $y$, also known as the change in $y$, is $\Delta y_t = y_t - y_{t-1}$.
When one-period lags or first differences of a variable are constructed, the first observation is lost.
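A short sketch shows why the first observation is lost. The series is hypothetical, and `None` is used here as an assumed placeholder for the missing value.

```python
def lag(series, k=1):
    """Shift the series back by k periods; the first k values are unavailable."""
    return [None] * k + series[:len(series) - k]

def first_difference(series):
    """Delta y_t = y_t - y_{t-1}; undefined for the first observation."""
    lagged = lag(series, 1)
    return [None if prev is None else cur - prev
            for cur, prev in zip(series, lagged)]

y = [10, 12, 11, 15]  # hypothetical observations
y_lag1 = lag(y)
dy = first_difference(y)
```

Both `y_lag1` and `dy` start with a missing value, so any regression using them has one fewer usable observation than the original series.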
Why do we sometimes need to lag in a regression?
Dynamic model A model that allows for lagged values of both the dependent and independent variables. A change in one variable at time t may not cause a change in another variable at time t; the effect may take longer to appear, for example at time t+1.
Equilibrium solution The relevant definition of ’equilibrium’ in this context is that a system has reached equilibrium if the variables have attained some steady state values and are no longer changing.
Consistent standard errors / Robust standard errors Heteroscedasticity-consistent standard errors allow inference on a fitted model whose residuals are heteroscedastic. For example, a possible solution if we find heteroscedasticity is to use White’s heteroscedasticity-consistent standard error estimates. The effect of using White’s correction is that in general the standard errors for the slope coefficients are increased relative to the usual OLS standard errors. This makes us more “conservative” in hypothesis testing, so that we would need more evidence against the null hypothesis before we would reject it.
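A minimal sketch of the idea, assuming a simple regression with one regressor and the basic form of White's estimator with no small-sample adjustment (often labelled HC0). The data are constructed so that the residuals are larger when x is far from its mean. In practice a statistics package would normally be used instead of hand-written formulas.

```python
import math

def slope_standard_errors(x, y):
    """Return (beta_hat, usual OLS s.e., White HC0 s.e.) for the slope."""
    T = len(x)
    x_bar = sum(x) / T
    y_bar = sum(y) / T
    dx = [xi - x_bar for xi in x]
    sxx = sum(d ** 2 for d in dx)

    beta_hat = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    alpha_hat = y_bar - beta_hat * x_bar
    resid = [yi - (alpha_hat + beta_hat * xi) for xi, yi in zip(x, y)]

    # Usual OLS formula: assumes one common error variance s^2.
    s2 = sum(u ** 2 for u in resid) / (T - 2)
    se_ols = math.sqrt(s2 / sxx)

    # White (HC0): each squared residual is weighted by its own (x - x_bar)^2.
    se_white = math.sqrt(sum(d ** 2 * u ** 2 for d, u in zip(dx, resid)) / sxx ** 2)
    return beta_hat, se_ols, se_white

# Hypothetical data with larger errors at the extremes of x.
x = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
y = [-2.0, -5.0, -2.0, 1.0, 2.0, 3.0, 10.0]
beta_hat, se_ols, se_white = slope_standard_errors(x, y)
```

For this data set the White standard error is larger than the usual one, so a t-ratio based on it is smaller and the null hypothesis is harder to reject.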
Outliers Data points that do not fit in with the pattern of the other observations and that are a long way from the fitted model.
Multicollinearity A phenomenon where two or more of the explanatory variables used in a regression model are highly correlated with each other.
Omitted variable bias A relevant variable for explaining the dependent variable has been left out of the estimated regression equation, leading to biased inferences on the remaining parameters.
Irrelevant variable parameter stability Variables that are included in a regression equation but in fact have no impact on the dependent variable. Consequence: Coefficient estimates will still be consistent and unbiased, but the estimators will be inefficient.
Total sum of squares (TSS) = ESS + RSS The total variation across all observations of the dependent variable about its mean value is known as the total sum of squares. The TSS can be split into two parts: the part that has been explained by the model (the ESS) and the part that the model was not able to explain (the RSS).
RSS is also =
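The split of TSS into ESS and RSS can be checked numerically. This is a sketch assuming a simple OLS regression with an intercept (the decomposition relies on the intercept being included); the data are hypothetical.

```python
def sum_of_squares_decomposition(x, y):
    """Fit y = alpha + beta * x by OLS and return (TSS, ESS, RSS)."""
    T = len(x)
    x_bar = sum(x) / T
    y_bar = sum(y) / T
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta_hat = sxy / sxx
    alpha_hat = y_bar - beta_hat * x_bar
    fitted = [alpha_hat + beta_hat * xi for xi in x]

    tss = sum((yi - y_bar) ** 2 for yi in y)                   # total variation
    ess = sum((fi - y_bar) ** 2 for fi in fitted)              # explained by the model
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))     # left unexplained
    return tss, ess, rss

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
tss, ess, rss = sum_of_squares_decomposition(x, y)
```

Here the three quantities come out as approximately 6.0, 3.6 and 2.4, so the explained and unexplained parts add up to the total.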