RM - Statistics, Summaries of Finance

Typology: Summaries

2017/2018

Uploaded on 06/02/2018

Mina_Silsepour

CHAPTER 1
Financial econometrics
Financial econometrics is the application of statistical and mathematical techniques to solve problems in finance.
Time series data
Time series data are data that have been collected over a period of time on one or more variables.
Cross-sectional data
Cross-sectional data are data on one or more variables collected at a single point in time.
Panel data
Panel data have the dimensions of both time series and cross-sectional data, e.g. daily prices of a number of blue chip stocks over two years.
Continuous data
Continuous data can take on any value and are not confined to any specific numbers.
Discrete data
Discrete data can only take on certain values, which are usually integers.
Cardinal numbers
Cardinal numbers are those where the actual numerical values that a particular variable takes have meaning, and where there is an equal distance between the numerical values. Examples: the price of a building, the number of houses in a street.
Ordinal numbers
Ordinal numbers can only be interpreted as providing a position or ordering. Example: the position of a runner in a race.
Nominal numbers
Nominal numbers occur when there is no natural ordering of the values at all. Example: telephone numbers.
By Al. Brit – Tekbir Finance

Simple returns $R_t = \frac{p_t}{p_{t-1}} - 1$

Log returns $R_t = \ln\left(\frac{p_t}{p_{t-1}}\right)$

Why log returns?

  1. Log returns can be interpreted as continuously compounded returns.
  2. They are additive over time. For example, if we have calculated daily log returns and want the weekly return, we can sum the five daily returns, which gives $\ln(p_5/p_0)$.
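The additivity property can be checked numerically. The following is a minimal sketch, assuming a made-up series of six closing prices (p_0 to p_5); the function names are illustrative only.

```python
import math

# Hypothetical closing prices p_0 ... p_5 (one trading week).
prices = [100.0, 101.0, 99.5, 102.0, 103.5, 105.0]

def simple_returns(p):
    """Simple return for each period: p_t / p_{t-1} - 1."""
    return [p[t] / p[t - 1] - 1 for t in range(1, len(p))]

def log_returns(p):
    """Log return for each period: ln(p_t / p_{t-1})."""
    return [math.log(p[t] / p[t - 1]) for t in range(1, len(p))]

# Weekly returns computed directly from the first and last price.
weekly_log = math.log(prices[-1] / prices[0])
weekly_simple = prices[-1] / prices[0] - 1

# Weekly returns obtained by summing the daily returns.
sum_daily_log = sum(log_returns(prices))
sum_daily_simple = sum(simple_returns(prices))

print("weekly log return:", weekly_log, "sum of daily:", sum_daily_log)
print("weekly simple return:", weekly_simple, "sum of daily:", sum_daily_simple)
```

The daily log returns sum to the weekly log return, while the daily simple returns do not sum to the weekly simple return.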

Negative consequences of log returns

  1. The simple return on a portfolio of assets is a weighted average of the simple returns on the individual assets. This does not hold for log returns.
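A small numerical illustration of this point, assuming a hypothetical two-asset portfolio with invented weights and returns:

```python
import math

# Hypothetical portfolio: weights and one-period simple returns per asset.
weights = [0.6, 0.4]
simple = [0.10, -0.05]

# The portfolio simple return is the weighted average of the asset simple returns.
portfolio_simple = sum(w * r for w, r in zip(weights, simple))

# The portfolio log return follows from its simple return.
portfolio_log = math.log(1 + portfolio_simple)

# Naively weighting the individual log returns gives a different number.
asset_log = [math.log(1 + r) for r in simple]
weighted_log = sum(w * r for w, r in zip(weights, asset_log))

print("portfolio log return:", portfolio_log)
print("weighted average of asset log returns:", weighted_log)
```

The two printed values differ, so the weighted average of log returns is not the log return of the portfolio.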

What are the key features of financial data?

  • There is a lot of data available
  • They are noisy and volatile
  • They are leptokurtic and have fatter tails than a normal distribution with the same mean and variance
  • Most such series are negatively skewed, so that large negative returns are more likely than positive returns of the same magnitude
  • They exhibit volatility clustering: periods of high volatility tend to be followed by high volatility, and periods of low volatility by low volatility
  • They can often be characterized as a random walk with drift process

Steps in formulating an econometric model

Kurtosis Kurtosis measures the fatness of the tails of a distribution and how peaked the series is at the mean. The kurtosis of a normal distribution is 3, so excess kurtosis can be calculated as kurtosis minus 3.

  • Positive excess kurtosis (leptokurtic): more peaked at the mean and fatter tails than a normal distribution with the same mean and variance
  • Negative excess kurtosis (platykurtic): less peaked at the mean, thinner tails and more mass on the shoulders than a normal distribution with the same mean and variance
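As a rough sketch, excess kurtosis can be computed from the second and fourth central moments. The version below uses the simple population (uncorrected) moments, and the two data sets are invented purely to show one positive and one negative case.

```python
def excess_kurtosis(xs):
    """Fourth central moment over squared variance, minus 3 (the normal benchmark)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # variance (population form)
    m4 = sum((x - mean) ** 4 for x in xs) / n  # fourth central moment
    return m4 / m2 ** 2 - 3.0

# Hypothetical series: mostly zeros with two extreme values (fat tails).
fat_tailed = [-10.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 10.0]

# Hypothetical series: evenly spread values (thin tails, flat centre).
flat = [-2.0, -1.0, 0.0, 1.0, 2.0]

print("fat-tailed series:", excess_kurtosis(fat_tailed))  # positive: leptokurtic
print("flat series:", excess_kurtosis(flat))              # negative: platykurtic
```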

Population and sample The population is the total collection of all subjects to be studied. A sample is a selection of some items from the population of interest. Example: we are interested in the relation between risk and return for UK stocks. The population is all time series observations on all stocks traded on the London Stock Exchange. The sample could be data from 2005-

Maximum Likelihood estimation technique An approach that can be used for parameter estimation based on the construction and maximisation of a likelihood function, which is particularly useful for non-linear models.

Error term The error term is the part of a regression model that sweeps up any influences on the dependent variable that are not captured by the independent variables.

Population regression function The population regression function (PRF) is a description of the model that is thought to be generating the actual data, and it represents the true relationship between the variables (i.e. the true values of $\alpha$ and $\beta$). It is also known as the data generating process (DGP).

  • $y_t = \alpha + \beta x_t + u_t$

Sample regression function

The sample regression function (SRF) is the relationship that has been estimated using sample observations

  • $\hat{y}_t = \hat{\alpha} + \hat{\beta} x_t$

but since we also know that $\hat{u}_t = y_t - \hat{y}_t$, we can write

  • $y_t = \hat{\alpha} + \hat{\beta} x_t + \hat{u}_t$

We use the SRF to infer likely values of the PRF. We want to know how good our estimates of alpha and beta are.
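The relationship between the PRF and the SRF can be illustrated by simulation. This is a minimal sketch, assuming ordinary least squares as the estimator and invented "true" values for alpha and beta; in real applications the true values are never observed.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical "true" PRF parameters (assumed for illustration only).
TRUE_ALPHA, TRUE_BETA = 1.0, 0.5

# Draw one sample from the PRF: y_t = alpha + beta * x_t + u_t.
x = [t / 10 for t in range(200)]
y = [TRUE_ALPHA + TRUE_BETA * xt + random.gauss(0.0, 1.0) for xt in x]

def ols_fit(x, y):
    """Estimate the SRF intercept and slope by ordinary least squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xt - x_bar) ** 2 for xt in x)
    s_xy = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y))
    beta_hat = s_xy / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat

alpha_hat, beta_hat = ols_fit(x, y)

# Fitted values and residuals: y_t = alpha_hat + beta_hat * x_t + u_hat_t.
fitted = [alpha_hat + beta_hat * xt for xt in x]
residuals = [yt - ft for yt, ft in zip(y, fitted)]

print("true alpha, beta:", TRUE_ALPHA, TRUE_BETA)
print("estimated alpha, beta:", alpha_hat, beta_hat)
```

The estimates differ from the true values because they are based on one sample only, which is why a measure of their precision is needed.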

Estimator An estimator is an equation that is employed together with the data in order to calculate the parameters that describe the regression relationship.

Estimate Estimates are the actual numerical values for the coefficients.

Standard error Standard error measures the precision or reliability of the regression estimates.

Statistical inference Statistical inference is the process of drawing conclusions about the likely characteristics of the population from the sample estimates. We can do this by using hypothesis tests.

Hypothesis tests A hypothesis test is a framework for considering plausible values of the true population parameters given the sample estimates. We always have a null hypothesis and an alternative hypothesis. We calculate a test statistic, which we compare with a critical value. We then reject or do not reject the null hypothesis depending on the values of the test statistic and the critical value.
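The procedure can be sketched for the slope of a simple regression. The example below assumes OLS estimation, a two-sided test at the 5% level, and invented data with 30 observations; the critical value 2.048 is the tabulated t value for 28 degrees of freedom and is passed in by hand rather than computed.

```python
import math

def slope_t_test(x, y, beta_null, critical_value):
    """Return (beta_hat, se_beta, t_stat, reject) for H0: beta = beta_null."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xt - x_bar) ** 2 for xt in x)
    beta_hat = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y)) / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    # Residual variance uses n - 2 because two parameters were estimated.
    rss = sum((yt - alpha_hat - beta_hat * xt) ** 2 for xt, yt in zip(x, y))
    se_beta = math.sqrt(rss / (n - 2) / s_xx)
    t_stat = (beta_hat - beta_null) / se_beta
    return beta_hat, se_beta, t_stat, abs(t_stat) > critical_value

# Hypothetical data: slope 0.8 plus a small alternating disturbance.
x = list(range(30))
y = [2.0 + 0.8 * xt + (0.5 if xt % 2 == 0 else -0.5) for xt in x]

T_CRIT = 2.048  # 5% two-sided critical value, t distribution, 28 degrees of freedom

result_zero = slope_t_test(x, y, 0.0, T_CRIT)  # H0: beta = 0
result_true = slope_t_test(x, y, 0.8, T_CRIT)  # H0: beta = 0.8

print("H0 beta = 0   -> t =", result_zero[2], "reject:", result_zero[3])
print("H0 beta = 0.8 -> t =", result_true[2], "reject:", result_true[3])
```

With these data the null of a zero slope is rejected, while the null that the slope equals 0.8 is not.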

T-distribution The t distribution is symmetric, centred on zero and bell shaped like the normal distribution but has fatter tails meaning that it is more prone to producing values that

Homoscedasticity If the variance of the error terms is constant and finite over time, the errors are said to be homoscedastic.

Heteroscedasticity If the errors do not have a constant variance, they are said to be heteroscedastic.

Autocorrelation This is where there is a relationship between the $i$th and $j$th residuals. Recall that one of the assumptions of the CLRM was that such a relationship did not exist. We want our residuals to be random, and if there is evidence of autocorrelation in the residuals, then it implies that we could predict the sign of the next residual and get the right answer more than half the time on average!

Positively autocorrelated series of residuals will not cross the time-axis very frequently
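A quick way to see this is to compare two invented residual series. The sketch below computes the lag-1 autocorrelation, the number of sign changes, and the Durbin-Watson statistic; the latter is not defined in these notes and is included here only as one common summary measure, where values well below 2 point to positive autocorrelation.

```python
def lag1_autocorrelation(u):
    """Sample correlation between u_t and u_{t-1}."""
    mean = sum(u) / len(u)
    num = sum((u[t] - mean) * (u[t - 1] - mean) for t in range(1, len(u)))
    den = sum((ut - mean) ** 2 for ut in u)
    return num / den

def durbin_watson(u):
    """Sum of squared changes in the residuals over the sum of squared residuals."""
    num = sum((u[t] - u[t - 1]) ** 2 for t in range(1, len(u)))
    return num / sum(ut ** 2 for ut in u)

def sign_changes(u):
    """How many times the series crosses the time axis."""
    return sum(1 for t in range(1, len(u)) if u[t] * u[t - 1] < 0)

# Hypothetical residuals: a slow-moving series and a rapidly alternating one.
smooth = [1.0, 2.0, 3.0, 2.0, 1.0, -1.0, -2.0, -3.0, -2.0, -1.0]
alternating = [1.0, -1.0] * 5

for name, u in [("smooth", smooth), ("alternating", alternating)]:
    print(name, lag1_autocorrelation(u), durbin_watson(u), sign_changes(u))
```

The slow-moving series has positive autocorrelation and crosses the axis once, whereas the alternating series has negative autocorrelation and crosses it at every step.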

Lagged value The value that the variable took during a previous period. For example, the value of $y_t$ lagged one period is written $y_{t-1}$.

The first difference of $y$, also known as the change in $y$, is $\Delta y_t = y_t - y_{t-1}$.

When one-period lags or first difference of a variable are constructed, the first observation is lost.
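The loss of the first observation is easy to see in a short sketch, assuming a made-up series of five values:

```python
# Hypothetical series y_1 ... y_5.
y = [10.0, 12.0, 11.5, 13.0, 14.2]

# One-period lag: the first observation has no predecessor, so it is missing.
lagged = [None] + y[:-1]

# First difference: also undefined for the first observation.
diff = [None] + [y[t] - y[t - 1] for t in range(1, len(y))]

# Rows that can actually be used once lags or differences are included.
usable = [(yt, lag, d) for yt, lag, d in zip(y, lagged, diff) if lag is not None]

print("observations in y:", len(y))
print("usable observations:", len(usable))
```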

Why do we sometimes need to lag in a regression?

Dynamic model A model that allows for lagged values of the dependent and independent variables. A change in one variable at time t may not cause a change in another variable at time t; the effect may only appear later, for example at time t+1.

Equilibrium solution The relevant definition of ’equilibrium’ in this context is that a system has reached equilibrium if the variables have attained some steady state values and are no longer changing.
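To make the idea concrete, consider a hypothetical dynamic model y_t = a + b*y_{t-1} + c*x_t with the explanatory variable held fixed. The model and all coefficient values below are assumptions chosen for illustration. Setting y_t = y_{t-1} = y* gives the steady state y* = (a + c*x*) / (1 - b), which the simulated path approaches when |b| < 1.

```python
# Hypothetical coefficients of y_t = A + B * y_{t-1} + C * x_t, with |B| < 1.
A, B, C = 1.0, 0.5, 0.3
X_STAR = 2.0  # the explanatory variable is held at a constant level

# Steady state: drop the time subscripts and solve for y.
y_star = (A + C * X_STAR) / (1 - B)

# Iterate the model forward from an arbitrary starting value.
y = 0.0
path = []
for _ in range(200):
    y = A + B * y + C * X_STAR
    path.append(y)

print("analytical equilibrium:", y_star)
print("value after 200 periods:", path[-1])
```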

Consistent standard errors/ Robust standard errors Heteroscedasticity-consistent standard errors are used to allow the fitting of a model that does contain heteroscedastic residuals. For example, a possible solution if we find heteroscedasticity is to use White’s heteroscedasticity consistent standard error estimates. The effect of using White’s correction is that in general the standard errors for the slope coefficients are increased relative to the usual OLS standard errors. This makes us more “conservative” in hypothesis testing, so that we would need more evidence against the null hypothesis before we would reject it.
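The comparison can be sketched for the slope of a simple regression. The code below uses the basic form of White's estimator (often labelled HC0), which is one of several variants, and invented data whose disturbances grow with the distance of x from its mean.

```python
import math

def slope_standard_errors(x, y):
    """Return (beta_hat, usual OLS standard error, White HC0 standard error)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dev = [xt - x_bar for xt in x]
    s_xx = sum(d * d for d in dev)
    beta_hat = sum(d * (yt - y_bar) for d, yt in zip(dev, y)) / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    resid = [yt - alpha_hat - beta_hat * xt for xt, yt in zip(x, y)]
    # Usual OLS formula: assumes one common error variance.
    ols_se = math.sqrt(sum(u * u for u in resid) / (n - 2) / s_xx)
    # White (HC0): weights each squared residual by its own squared x deviation.
    white_se = math.sqrt(sum(d * d * u * u for d, u in zip(dev, resid))) / s_xx
    return beta_hat, ols_se, white_se

# Hypothetical heteroscedastic data: error size grows away from the mean of x.
x = [float(t) for t in range(40)]
x_mean = sum(x) / len(x)
errors = [0.1 * (xt - x_mean) * (-1) ** t for t, xt in enumerate(x)]
y = [1.0 + 0.5 * xt + u for xt, u in zip(x, errors)]

beta_hat, ols_se, white_se = slope_standard_errors(x, y)
print("slope:", beta_hat)
print("usual OLS standard error:", ols_se)
print("White standard error:", white_se)
```

In this constructed example the White standard error is larger than the usual OLS one, so the same slope estimate yields a smaller t-ratio.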

Outliers Data points that do not fit in with the pattern of the other observations and that are a long way from the fitted model.

Multicollinearity A phenomenon where two or more of the explanatory variables used in a regression model are highly correlated with each other.
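A simple first check is the pairwise correlation between explanatory variables. The sketch below uses invented data in which x2 is roughly twice x1, and also reports the variance inflation factor 1 / (1 - r^2) for that pair; this measure is not part of these notes and is added only as an illustration.

```python
import math

def correlation(a, b):
    """Pearson correlation coefficient between two equally long series."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    cov = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    var_a = sum((ai - a_bar) ** 2 for ai in a)
    var_b = sum((bi - b_bar) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical explanatory variables.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]  # roughly 2 * x1: nearly collinear
x3 = [5.0, 1.0, 4.0, 2.0, 6.0, 3.0]    # little relation to x1

r12 = correlation(x1, x2)
r13 = correlation(x1, x3)
vif12 = 1 / (1 - r12 ** 2)  # variance inflation factor for the pair (x1, x2)

print("corr(x1, x2):", r12, "VIF:", vif12)
print("corr(x1, x3):", r13)
```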

Omitted variable bias This arises when a relevant variable for explaining the dependent variable has been left out of the estimated regression equation, leading to biased inferences on the remaining parameters.

Irrelevant variable parameter stability Variables that are included in a regression equation but in fact have no impact on the dependent variable. Consequence: coefficient estimates will still be consistent and unbiased, but the estimators will be inefficient.

CHAPTER 3

Total sum of squares (TSS) = ESS + RSS The total variation across all observations of the dependent variable about its mean value is known as the total sum of squares. The TSS can be split into two parts: the part that has been explained by the model (the ESS) and the part that the model was not able to explain (the RSS).
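The decomposition can be verified numerically. This is a minimal sketch, assuming an OLS fit with an intercept and five invented observations:

```python
# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS estimates of the intercept and slope.
s_xx = sum((xt - x_bar) ** 2 for xt in x)
beta_hat = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y)) / s_xx
alpha_hat = y_bar - beta_hat * x_bar
fitted = [alpha_hat + beta_hat * xt for xt in x]

# Total, explained and residual sums of squares.
tss = sum((yt - y_bar) ** 2 for yt in y)
ess = sum((ft - y_bar) ** 2 for ft in fitted)
rss = sum((yt - ft) ** 2 for yt, ft in zip(y, fitted))

print("TSS:", tss)
print("ESS + RSS:", ess + rss)
```

The equality relies on the regression including an intercept, which makes the residuals sum to zero and uncorrelated with the fitted values.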

RSS is also =