Understanding Variance, Covariance, Correlation, and Efficient Frontier in Risk Management


An introduction to financial markets and risk management concepts, including variance, standard deviation, covariance, correlation, distribution functions, and the efficient frontier. It covers the weak form of the Efficient Market Hypothesis, the concept of mean and variance of a portfolio, and the efficient frontier definition. The document also discusses the limitations of the Value-at-Risk and introduces the Autoregressive Model.

What you will learn

  • What is the efficient frontier in finance?
  • What is the correlation between two random variables?
  • What is the difference between variance and standard deviation?
  • What is the variance of a random variable?
  • What is the covariance between two random variables?


Lecture Notes MTH6113: Mathematical Tools for Asset Management

March 12, Dr Kathrin Glau; Dr Linus Wunderlich

Contents

  • 0 Preliminaries
  • 1 Efficient Market Hypothesis (EMH)
    • 1.1 The Weak Form of EMH
    • 1.2 The Semi-strong Form of EMH
    • 1.3 The Strong Form of EMH
    • 1.4 Criticism and Use of the EMH
    • 1.5 Summary
  • 2 Stochastic models of long-term behaviour of security prices
  • 3 Risk and return
    • 3.1 Shortfall probability
    • 3.2 Value at Risk and α -quantiles
    • 3.3 Stress test
  • 4 Mean-variance portfolio theory
    • 4.1 Introduction to portfolios
    • 4.2 Mean & variance of the portfolio
    • 4.3 Attainable sets of portfolios
    • 4.4 Minimal Variance Portfolio (MVP)
    • 4.5 Short selling
    • 4.6 Efficient frontier
    • 4.7 Adding a risk-free security
  • 5 Factor models of asset returns
    • 5.1 Single factor models
  • 6 Pricing
    • 6.0 Mean-variance portfolio theory for several assets
    • 6.1 The Capital Asset Pricing Model (CAPM)
      • 6.1.1 CAPM formula:
      • 6.1.2 The security market line (SML)
      • 6.1.3 Efficient portfolios
      • 6.1.4 How to use CAPM?
      • 6.1.5 Discussion of the validity
    • 6.2 The arbitrage pricing theory (APT)
  • 7 Utility Theory
    • 7.1 Reminder: convex and concave functions
    • 7.2 Expected utility
    • 7.3 Pricing lotteries based on utility theory
  • 8 Behavioural finance

0 Preliminaries

Week 1, Lecture 1. Preliminary remark: This module differs from most other mathematical modules in that we explore mathematics as a tool for financial purposes rather than as an aim in itself. This means that the material cannot be understood from the mathematical parts alone; we also have to understand the financial context. Ultimately, an equation or inequality that we encounter in this lecture is not interesting as such; we have to understand its financial meaning. Chapter 1 gives us a first flavour, as we introduce some basic lines of thought in financial terms. For organizational issues and preliminaries, such as revision of probability theory and basic information on the financial market, see the slides. Here, we give a basic list of notions from probability.

Revision of probability theory. Given random variables X, Y, we consider

  • the expected value E(X):
    - Σ_i x_i P(X = x_i) for discrete variables with possible values x_i;
    - ∫_ℝ x f_X(x) dx for continuous variables with probability density function (pdf) f_X;
  • the variance Var(X) = E((X − E(X))²);
  • the standard deviation σ_X = √Var(X);
  • the covariance Cov(X, Y) = E((X − E(X))(Y − E(Y)));
  • the correlation corr(X, Y) = Cov(X, Y) / (σ_X σ_Y);
  • the distribution function F_X(x) = P(X ≤ x);
    - for continuous variables, this is the integral of the density function: F_X(x) = ∫_{−∞}^x f_X(ξ) dξ.

With random variables X, Y, Z and a deterministic scalar a , we frequently use:

  • linearity of the expected value: E( aX + Y ) = a E( X ) + E( Y );
  • variance as the covariance with itself: Var( X ) = Cov( X, X );
  • symmetry and scaling of the covariance Cov( X, Y ) = Cov( Y, X ) and Cov( aX, Y ) = a Cov( X, Y );
  • as a result also Var(aX) = a² Var(X) and σ_{aX} = |a| σ_X;
  • bilinearity of the covariance: Cov( aX + Y, Z ) = a Cov( X, Z ) + Cov( Y, Z ),
  • which yields Var( X + Y ) = Var( X ) + 2 Cov( X, Y ) + Var( Y ),
  • and if X and Y are independent also Var( X + Y ) = Var( X ) + Var( Y ).
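The rules above can be checked numerically. The following sketch (not part of the original notes; the distributions, sample size, and the scalar a are arbitrary choices) verifies three of the identities with simulated data:

```python
import numpy as np

# Empirical check of the covariance rules with simulated data.
rng = np.random.default_rng(0)
n = 1_000_000
X = rng.normal(1.0, 2.0, n)    # arbitrary distributions for illustration
Y = rng.normal(-0.5, 1.5, n)
a = 3.0

# empirical covariance with the 1/n normalisation (matching np.var)
cov = lambda u, v: np.mean((u - u.mean()) * (v - v.mean()))

# Var(X) = Cov(X, X)
assert np.isclose(np.var(X), cov(X, X))
# Cov(aX, Y) = a Cov(X, Y)
assert np.isclose(cov(a * X, Y), a * cov(X, Y))
# Var(X + Y) = Var(X) + 2 Cov(X, Y) + Var(Y)
assert np.isclose(np.var(X + Y), np.var(X) + 2 * cov(X, Y) + np.var(Y))
print("all covariance identities verified")
```

The identities hold exactly in the algebra; the simulation only confirms that the empirical versions agree up to floating-point rounding.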

1 Efficient Market Hypothesis (EMH)

Week 1, Lecture 2. Efficient markets restrict the possibility to strategically make a profit that is larger than the market's average. The general line of thought is this: if, whenever you spot a possibility to "beat the market", e.g. by being able to predict the price, you believe that...

  1. ... there would have been many others to know the price in advance,
  2. ... these others would have bought the stock,
  3. ... the resulting bids would lead to a rise of today's stock price,
  4. ... this would happen very fast and until the advantage vanishes,

then you believe in the efficiency of the market. In more detail, we discuss three main formulations of market efficiency:

  • Weak form of the EMH: The current stock price reflects all the historical stock prices;
  • Semi-strong form of the EMH: The current stock price reflects all public information;
  • Strong form of the EMH: The current stock price reflects all public and private information.

The weaker formulations are contained in the stronger formulations, i.e.:

strong form holds ⇒ semi-strong form holds ⇒ weak form holds.

1.1 The Weak Form of EMH

Under the weak form of the EMH, investments based on past stock prices do not yield superior returns. "Technical analysis", i.e. predicting price movements based on past prices, is not possible under this hypothesis. A simple example of a strategy that would classify as technical analysis follows the idea "the trend is your friend": here you invest in stocks when there is an upward trend. Advice: be careful with such ideas; scientific studies show that this strategy is less profitable than a "buy and hold" strategy when trading costs are taken into account. See Part 2 of ? for more examples. Roughly, the weak form of the EMH means that an investor cannot "beat the market" based on knowledge of past stock prices. "Beating the market" means to consistently outperform the market. A clear way to "beat the market" is by arbitrage, i.e. by making a profit without risking a loss. There are investment strategies trying to "beat the market" which are consistent with the weak EMH. For instance:

A) Fundamental analysis: Model the intrinsic value of a company and invest in underrated stocks; then wait for the price to approach the intrinsic value.

B) Quickly react to news with your investment strategy, e.g.:

  - announcements about the company / the market, e.g. higher profit than expected, a new CEO, ...
  - rumours, e.g. an expected merger, expected contracts;
  - political events, e.g. taxes and tariffs, strike action, changed regulations.

The general line of thought behind the weak EMH is this: if it is possible to "beat the market" based on knowledge of past prices, then algorithms are produced to do so. Large companies will use these algorithms and trade accordingly. This will rapidly lead to a rise in demand for specific products. This rise in demand will in turn be visible to those who sell these products and will therefore lead to a rise in their prices. This process continues until the price is finally so high that the strategy is no longer superior. Now, this process is assumed to be very fast, and one may assume that it has already taken place once we see the prices.

1.2 The Semi-strong Form of EMH

Under the semi-strong form of the EMH, investments based on any publicly available information do not yield superior returns. The hypothesis assumes that the price adjusts immediately to new information, e.g. the announcement of quarterly earnings, dividends, or new stocks. Public information is anything that is publicly available and relatively easy to acquire (e.g. press releases, newspapers, financial magazines). Non-public information is information that is not publicly available, for instance insider information. Notice that insider trading is usually illegal. However, many cases of insider trading are indeed documented.

1.3 The Strong Form of EMH

Nobody can consistently outperform the market with their investment. The line of thought is very similar to the other two cases.

1.4 Criticism and Use of the EMH

In general, it is difficult to test the hypotheses, as the primary information is not available. With regard to the semi-strong form of the efficient market hypothesis, one can study the influence of information releases on the prices of financial instruments. There is stronger criticism against the validity of the strong hypothesis: if we believe that insider trading is not profitable, that has strong consequences. Many cases of insider trading are documented, so one cannot argue that they do not exist. In order for prices to reflect all insider information, a significant number of insiders would need to trade, such that the price can reflect their private information.

Figure 1: Scatter plot of subsequent daily log-returns (day t vs. day t−1, in %) for GE's returns over 55 years. The empirical correlation of 1.46% is not statistically significant.

There are also arguments based on data that support the EMH: the scatter plot in Figure 1 shows that returns of subsequent days are uncorrelated for a specific time series of prices. This means that the autocorrelation of the stock returns corr(R_{t+1}, R_t) ≈ 0. This empirical observation has been made repeatedly for other asset price time series as well, underpinning the view that the future price cannot be predicted from the past and today's price. At least this is evidence against a very basic form of predicting the price based on the price history, thus supporting the weak EMH. Some studies have investigated the possibility of outperforming the market by comparing the long-term performance of mutual funds with that of the market, the latter here represented by the Wilshire 5000 Total Market Index, see Figure 1.4. In some years mutual funds outperformed the market, but no fund does so consistently, underpinning the strong EMH. The different forms of the EMH follow intuitive rationales. This is highly beneficial for getting a first understanding of trading strategies and for modelling purposes. We discussed the rationales of trading strategies: those based on the belief that one can consistently outperform the market using historical prices, and those based on the belief that one can consistently outperform the market using public or private information. The position of facing a strongly efficient market is that of a market participant who does not believe they can use information to "beat the market". The benefit of the EMH for modelling purposes is the following: the financial market is utterly complex, and simplifications need to be made before one is able to formulate a mathematical model. The different forms of the EMH give a reasonable rationale for formulating such simplifications. In this sense, this chapter sets the ground for mathematical tools, for being able to formulate and justify mathematical models for the behaviour of financial quantities.

Figure 2: Example of a binomial model over three time periods.

The observation of uncorrelated subsequent returns and the reasoning underpinning the EMH support the random walk theory of stock prices. Here, we model stock prices randomly, and in a way that the daily increments are independent of the history of prices. A simple example of a model respecting these features is the binomial model, compare Figure 2.
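A binomial model as in Figure 2 can be simulated in a few lines. The sketch below is illustrative only; the up/down factors, probability, and horizon are made-up values, not parameters from the lecture:

```python
import numpy as np

# Binomial model: each day the price moves up by factor u or down by factor d,
# independently of the past (illustrative parameters).
rng = np.random.default_rng(1)
S0, u, d, p, T = 100.0, 1.01, 0.99, 0.5, 250

moves = rng.choice([u, d], size=T, p=[p, 1 - p])  # iid up/down moves
path = S0 * np.cumprod(moves)                     # price path S_1, ..., S_T

# The move on day t never looks at path[:t], so daily increments are
# independent of the price history by construction.
print(f"final price after {T} steps: {path[-1]:.2f}")
```

This independence of increments is exactly the random-walk feature the text describes.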

1.5 Summary

The implications for potential investments are of great interest to us:

  1. Assuming none of the hypotheses holds, you can find investments which are based on
    • patterns found in historical stock prices, or
    • any information concerning the company/the market,

  and can consistently expect profits that are larger than the market average.

  2. Assuming only the weak form is valid, you cannot find investments which consistently yield superior profit and are based on
    • patterns found in historical stock prices;

  however, they can be based on

    • any further information concerning the company/the market.

  3. Assuming the semi-strong form (hence also the weak form) is valid, you cannot find investments which consistently yield superior profit and are based on
    • any public information;

investment based on                                           | investor may believe in | investor does not believe in
historical stock prices                                       | no EMH                  | weak form
public information                                            | weak form               | semi-strong form
private information                                           | semi-strong form        | strong form
(investor needs to increase risk to increase expected payoff) | strong form             | -

Table 1: Overview of the EMH

  however, they can be based on

    • any private information concerning the company/the market.

  4. Assuming the strong form (hence also the semi-strong and weak forms) is valid, you cannot find
    • any investment that consistently yields superior profit.

  The only way to increase the expected return is to
    • increase the risk.

An overview of what can and cannot be used to design superior investment strategies is given in Table 1. Empirical evidence for the weak formulation of efficient markets can be found, but testing the hypotheses is difficult. The validity of the hypotheses is therefore also criticised. The EMH is useful for getting an orientation among investment strategies and for simplifying the complexity of real markets to set the ground for mathematical modelling.

2 Stochastic models of long-term behaviour of security prices

Week 2. A consequence of the efficient market hypothesis is the random walk theory, stating that the returns on subsequent days are independent of each other. An important model is the lognormal model.

The Log-normal Model. With (S_t)_{t∈ℕ} the daily stock price, we consider (X_t)_{t∈ℕ} the daily log-returns X_t = log(S_{t+1}/S_t). Log-returns over several days are obtained by summing up the daily log-returns:

  log(S_t) − log(S_s) = Σ_{i=s}^{t−1} X_i,

i.e. S_t = S_s exp(Σ_{i=s}^{t−1} X_i) for s < t. The key assumption of the lognormal model is that

  • the daily log-returns Xt are iid (i.e. independent and identically distributed), and that
  • this distribution is a normal distribution N ( μ, σ^2 ).
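Under these two assumptions a price path can be simulated directly. The sketch below uses illustrative parameter values (not fitted to any data set):

```python
import numpy as np

# Lognormal model: iid normal daily log-returns X_t, and
# S_t = S_0 * exp(X_1 + ... + X_t).
rng = np.random.default_rng(2)
mu, sigma = 0.0004, 0.012     # illustrative daily mean and volatility
S0, T = 100.0, 1000

X = rng.normal(mu, sigma, T)          # iid N(mu, sigma^2) log-returns
S = S0 * np.exp(np.cumsum(X))         # prices via summed log-returns

# consistency check: the multi-day log-return is the sum of daily log-returns
assert np.isclose(np.log(S[-1] / S0), X.sum())
print(f"final simulated price: {S[-1]:.2f}")
```

The assertion mirrors the summation identity above: log(S_t) − log(S_s) is exactly the sum of the daily log-returns in between.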

Parameter Estimation in the Log-normal Model. Given N iid random variables X_i with an assumed distribution, e.g. N(μ, σ²), we need to estimate the model parameters, here μ and σ. Here, X_i represents the daily log-return of a stock on day i, and the model parameters are the mean μ and the volatility σ of the log-returns; σ² is the variance. Parameter estimation for a time series of data is a large and deep area of statistics. The estimation will only approximately represent the real time series, and in view of the limited number of observations, the error needs to be well

Figure 3: Stock prices, 2000–2016. Left: empirical data; right: lognormal model.

understood. On the one hand, one may choose or build an estimator that fulfils many desirable statistical properties, which allow one to better judge the quality of the estimation; on the other hand, one would like to keep the estimation process as simple as possible. Since mathematical simplifications in terms of model assumptions meet reality here, there is a large number of sources of additional error. Here, we make a very simple and convenient choice: we use the empirical mean and variance to estimate μ and σ² via

  μ ≈ X̄ = (X_1 + ... + X_N)/N   (empirical mean),

  σ² ≈ (1/(N−1)) Σ_{i=1}^N (X_i − X̄)²   (empirical variance).
In Excel, for instance, these formulas can be conveniently implemented through the commands AVERAGE (computing the mean of a cell range) and STDEV.S (computing the sample standard deviation of a cell range). This simple choice of estimators comes with some crucial statistical properties. In the homework (Coursework 1) you will show the following: if E(X̄) = μ and Var(X̄) = σ²/N, then X̄ converges to the mean μ with probability 1. This is called consistency. Essentially, this means that as we take more and more observations, the empirical mean converges to the true mean. Another basic property of estimators is unbiasedness. A parameter estimate θ̂_N of the true parameter θ is called unbiased iff E(θ̂_N) = θ, in other words, if the estimated parameter equals the true parameter in expectation. The Mean Square Error (MSE) can be represented in terms of the bias and the variance of the estimator,

  MSE = E((θ̂_N − θ)²) = bias(θ̂_N)² + var(θ̂_N);

if the estimator is unbiased, the bias vanishes. One can show that both the empirical mean and the empirical variance are unbiased estimators. In fact, a first guess for a good estimator of the variance might be (1/N) Σ_{i=1}^N (X_i − X̄)². This estimator is consistent, but it is biased. In contrast, the estimator (1/(N−1)) Σ_{i=1}^N (X_i − X̄)² is unbiased and consistent, and is therefore the standard estimator of the variance, called the empirical variance.
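The bias of the 1/N estimator can be made visible by averaging both estimators over many samples. The sketch below is illustrative (sample size, number of trials, and the N(0, 1) distribution are arbitrary choices):

```python
import numpy as np

# Compare the 1/N and 1/(N-1) variance estimators over many samples of
# size N drawn from N(0, 1), whose true variance is 1.
rng = np.random.default_rng(3)
N, trials = 10, 200_000
samples = rng.normal(0.0, 1.0, (trials, N))

# 1/N estimator: mean of squared deviations from the sample mean
biased = ((samples - samples.mean(axis=1, keepdims=True)) ** 2).mean(axis=1)
# 1/(N-1) estimator (the empirical variance)
unbiased = biased * N / (N - 1)

print(biased.mean())    # close to (N-1)/N = 0.9, systematically too low
print(unbiased.mean())  # close to the true variance 1
```

Averaged over many trials, the 1/N estimator undershoots by the factor (N−1)/N, exactly the bias that the 1/(N−1) normalisation removes.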

Comparison of the Log-normal Model to Market Data An example of the stock price is given in Figure 3. When comparing the model with empirical data, we see the limitations, in particular:

  • Volatility clustering is observed (large squared daily returns are likely to follow each other), but it is not present in the lognormal model, see Figure 4.
  • Large losses are underestimated by the lognormal model, see Figures 4 and 5. While the higher likelihood of large gains and losses is visible in the tails of the histogram, it appears in the form of spikes in the time series of log-returns.

Figure 4: Log-returns, 2000–2018. Top: empirical data; bottom: lognormal model.

Figure 5: Histogram for log-returns and normal pdf

The consequences of underestimating large losses can be severe! "[...] Large fluctuations in the stock market are far more common than Brownian motion predicts. The reason is unrealistic assumptions – ignoring potential black swans. [...]" See https://www.theguardian.com/science/2012/feb/12/black-scholes-equation-credit-crunch. If the Black-Scholes model is systemically used to estimate risk and underestimates large losses, this leads to a systemic underestimation of large losses. This can have severe consequences, as financial institutions may then face a lack of risk capital in times of crisis. This in turn can further destabilise the system and advance a crisis. This does not mean that the Black-Scholes model is not a good model. It is a very good model in the sense that it displays some features of the stock market in a very simple way. However, it cannot serve all purposes. It has clear shortcomings, and when it is used systemically in the wrong way this can lead to damage on a global scale. It is therefore highly important that you understand the benefits and shortcomings of the Black-Scholes model. Moreover, it is important to realise that whatever model you use, it has a specific scope, and you need to understand its benefits and shortcomings very well. This is of urgent economic importance, globally.

(In)dependence, (no) autocorrelation and volatility clusters. In Figure 4 we see clusters of large changes in subsequent returns. These are known as volatility clusters. Their presence indicates a dependence of subsequent returns, contradicting one of the basic assumptions of the log-normal model. Next, let us graphically study the autocorrelation of the returns, i.e. the correlation between subsequent returns. To do so, we build pairs (R_t, R_{t−1}) of all subsequent returns observed. We plot the value of R_t on the x-axis and the value of R_{t−1} on the y-axis, thus obtaining the scatter plot in Figure 6. The points are centred around zero and radially symmetric. This indicates that there is no linear dependence between R_t and R_{t−1}. Computing the empirical autocorrelation yields −0.016, confirming that it is very low; thus there is no indication of a linear dependence. Notice that this is a rudimentary approach, only to get a rough idea. To make it mathematically conclusive, one would need to employ statistical techniques that go beyond the scope of this lecture. This observation has been made consistently.

Figure 6: Scatter plot of subsequent returns of the HSBC stock prices.

To summarize our findings, returns of stock prices (and also log-returns as they are very similar) exhibit

  • No autocorrelation: corr(R_t, R_{t+1}) ≈ 0 (this is in line with the weak form of the EMH);
  • Volatility clustering: corr(R_t², R_{t+1}²) > 0; we can observe periods of large volatility and periods of small volatility;
  • Heavy tails / spikes: large losses and gains are much more likely than for normally distributed random variables.

The presence of volatility clusters indicates a dependence of subsequent returns. However, we also observed no autocorrelation. Correlation and dependence of random variables are closely linked: if two random variables are independent, they are uncorrelated. The converse, however, is not always true. For instance, consider X standard normally distributed. Clearly X and X² are dependent. What is their correlation? For returns, we look for exactly such random variables: ones that have no autocorrelation but are dependent.
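The question about X and X² can be explored numerically. The sketch below (sample size is an arbitrary choice) suggests the answer: the empirical correlation is essentially zero, even though X² is a deterministic function of X:

```python
import numpy as np

# For X standard normal, Cov(X, X^2) = E(X^3) - E(X) E(X^2) = 0, so X and
# X^2 are uncorrelated despite being dependent. We check this empirically.
rng = np.random.default_rng(4)
X = rng.normal(0.0, 1.0, 500_000)

corr = np.corrcoef(X, X ** 2)[0, 1]
print(f"corr(X, X^2) = {corr:.4f}")   # close to 0
```

This is exactly the behaviour we want for returns: zero autocorrelation combined with dependence (here through the squares).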

Stylized Facts. Modelling the stock price evolution in an appropriate way means balancing model complexity against realistic features. Researchers have established a list of stylized facts, that is, features that stock prices typically exhibit. This step is helpful in modelling as it establishes the features that a model should reproduce. In practice, the actual goal of the model determines which features are most important and which ones may be ignored. Building a good model is a highly nontrivial task, and no model will be perfect: each model is flawed. But which model is good enough for the actual task at hand? This type of modelling work is done in financial institutions when internal models are built and validated, and it is also an active research area. Here, we have listed three of the most important stylized facts of daily returns that we have observed. For a deeper discussion and more stylized facts, see R. Cont, Empirical properties of asset returns: stylized facts and statistical issues, Quantitative Finance, Volume 1, 2001, https://www.lpsm.paris/pageperso/ramacont/papers/empirical.pdf

Autoregressive Model. A better fit of the data is available with more complex models, e.g. the autoregressive AR(1) process. There the volatility (i.e. the standard deviation of the log-returns) is a stationary autoregressive stochastic process:

  X_t = μ + σ_t Z_t,   Z_t ∼ N(0, 1) iid,
  σ_t = α + β σ_{t−1} + ν ε_t,   ε_t ∼ N(0, 1) iid,   |β| < 1,

with Z_t and ε_t independent of each other and of σ_{t−1}, X_{t−1}. The autoregressive model introduces a positive autocorrelation of the volatility and hence of the magnitude of returns. In this way volatility clusters are introduced. Challenges are fitting the parameters and a more complex evaluation compared to the lognormal model.
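The model can be simulated directly from its recursion. The sketch below uses illustrative parameter values, not fitted to any data; note that in this naive sketch σ_t can occasionally dip below zero, and a refinement would floor it at a small positive value:

```python
import numpy as np

# Simulate the AR(1) volatility model:
#   X_t = mu + sigma_t * Z_t,  sigma_t = alpha + beta * sigma_{t-1} + nu * eps_t
rng = np.random.default_rng(5)
mu, alpha, beta, nu, T = 0.0, 0.002, 0.85, 0.002, 20_000

sigma = np.empty(T)
X = np.empty(T)
sigma_prev = alpha / (1 - beta)          # start at the stationary mean
for t in range(T):
    sigma[t] = alpha + beta * sigma_prev + nu * rng.normal()
    X[t] = mu + sigma[t] * rng.normal()
    sigma_prev = sigma[t]

# Volatility clustering: squared returns are positively autocorrelated,
# while the returns themselves are (almost) uncorrelated.
sq = X ** 2
print(np.corrcoef(X[:-1], X[1:])[0, 1])    # near 0
print(np.corrcoef(sq[:-1], sq[1:])[0, 1])  # positive
```

The two printed numbers illustrate exactly the stylized facts from the previous section: no autocorrelation of returns, but positive autocorrelation of squared returns.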

Comparison of the AR(1) Model to Market Data. In order to obtain a first impression of the behaviour of the AR(1) model compared to market data, we simulate log-returns in the model for an arbitrary choice of the parameters. Note that we did not fit the parameters, so the comparison is at a preliminary stage and we can only get a glimpse of the behaviour with respect to stylized facts. We display the time series of the related stock prices and the time series of the log-returns, in comparison to one empirically observed time series of market data, in Figure 7. From the time

Figure 7: Log-returns Top: AR(1) model; Bottom: Empirical data

series of stock prices itself it is hard to extract stylized facts, similarities or differences. Turning to the time series of log-returns, however, we observe that the AR(1) model reproduces clusters, i.e. periods with many large returns in absolute value and periods with fewer large returns. We also observe some positive and negative spikes. Both features are more extreme in the empirical time series, but note that we did not fit the parameters and have only one example here, so we should not draw conclusions from this single observation. Next, we display the histograms of log-returns from the empirically observed prices, from the AR(1) model, and from a log-normal model in Figure 8. We observe that the shape of the empirical distribution of the log-returns is better reproduced: it is steeper around the mean, and large returns are more likely than in the log-normal model. Both the steeper form in the middle and the slower decay of the tails are visually more similar to the market data than the histogram of the log-normal returns.

Figure 8: Histogram of Log-returns

Estimation of Parameters in the Autoregressive Model. The general approach to derive the model parameters α, β, ν in σ_t = α + β σ_{t−1} + ν ε_t is the following two-stage procedure. First, estimate the expected value E(σ_t), the variance Var(σ_t) and the autocorrelation corr(σ_t, σ_{t−1}). Second, derive the parameters

  β = corr(σ_t, σ_{t−1}),   α = (1 − β) E(σ_t),   ν² = (1 − β²) Var(σ_t).
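The two-stage procedure can be sketched as follows. The σ series here is a short made-up placeholder for locally estimated volatilities; how to obtain such a series from returns is discussed next:

```python
import numpy as np

# Stage 1: estimate the moments of the volatility series.
sigma = np.array([0.012, 0.014, 0.015, 0.013, 0.011, 0.012, 0.016, 0.014])

m = sigma.mean()                                # estimate of E(sigma_t)
v = sigma.var(ddof=1)                           # estimate of Var(sigma_t)
rho = np.corrcoef(sigma[:-1], sigma[1:])[0, 1]  # corr(sigma_t, sigma_{t-1})

# Stage 2: derive the AR(1) parameters from the moments.
beta = rho
alpha = (1 - beta) * m
nu2 = (1 - beta ** 2) * v                       # this is nu^2
print(alpha, beta, nu2)
```

The three formulas follow from the stationary mean, variance, and autocorrelation of an AR(1) process, so matching the empirical moments pins down the parameters.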

The challenging step is to estimate the empirical volatility. Remember how we estimated the empirical variance in the log-normal model: σ² ≈ (1/(N−1)) Σ_{i=1}^N (X_i − X̄)². The subtle point is that this is a good estimator if the sequence X_i is iid. However, the AR(1) model is built precisely to create dependence of the log-returns. The main difficulty thus is that

  • Xi are not independent in the AR(1) model, and
  • σ_t is different for each X_t, and it is impossible to estimate a variance from a single data point.

Figure 9: Time series of locally estimated volatilities.

As a compromise, we use here a naive parameter-fitting approach: a local estimate. We estimate the local variance using the 5 neighbouring values of the log-return:

  σ_t² ≈ (1/4) Σ_{i=t−2}^{t+2} (X_i − X̄)².

The resulting time series of the volatility is shown in Figure 9. We then use this time series to estimate E(σ_t), Var(σ_t) and corr(σ_t, σ_{t−1}). Applying this approach to the time series of HSBC stock prices, we obtain E(σ_t) ≈ 0.0138, Var(σ_t) ≈ 1.05 · 10⁻⁴, corr(σ_t, σ_{t−1}) ≈ 0.9013. A graphical comparison of the empirical log-returns and log-returns simulated from the AR(1) model is shown in Figure 10.
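The local estimate can be sketched in a few lines. Here X is a simulated stand-in for observed log-returns (iid normal purely for illustration, so the true volatility is known to be 0.01):

```python
import numpy as np

# Naive local volatility estimate: a centred 5-value window around each
# log-return, using the formula above with the global mean Xbar.
rng = np.random.default_rng(6)
X = rng.normal(0.0, 0.01, 500)   # stand-in for observed log-returns
Xbar = X.mean()

# sigma_t^2 ~ (1/4) * sum_{i=t-2}^{t+2} (X_i - Xbar)^2
sq = (X - Xbar) ** 2
local_var = np.array([sq[t - 2:t + 3].sum() / 4
                      for t in range(2, len(X) - 2)])
local_vol = np.sqrt(local_var)

print(local_vol.mean())  # roughly the true volatility 0.01 (noisy estimate)
```

With only 5 observations per window the estimate is very noisy, which is exactly why the text calls this a compromise.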

Figure 10: Time series and histogram of log-returns. Left: simulated from the AR(1) model; right: empirical.

Further Alternative Models. There is a large and ever-growing family of stock price models, and each model comes with advantages and disadvantages. Here we list a few approaches and concrete models. The AR(1) model is discrete in time. Many models are continuous in time, which often makes the analysis much more elegant and therefore easier for complex tasks. Time-continuous extensions of the AR(1) model for the volatility have been developed; the most famous ones are

  • Ornstein-Uhlenbeck (OU) process,
  • Cox-Ingersoll-Ross (CIR) model.

The latter is used as volatility process in the famous Heston model for option pricing.

3 Risk and return

Week 3. The assessment of risk is one of the most important parts of mathematical finance. Quantified risk can be used to evaluate investments, as well as to optimize a portfolio of assets. We first discuss the dominance of assets based on mean and variance and then discuss various measures of risk. First, we consider a basic risk quantity. Comparing a savings account with fixed interest rates to a stock, one basic difference is that we know in advance how the investment in the savings account will change over time, while we do not know how the stock price will change over time; the broader the expected deviation, the more risky the asset feels. This brings us to the first basic notion of risk in finance, the volatility, or its square, the variance. More precisely, we consider the variance of returns Var(R) = E((R − E(R))²) as a measure of risk. It

  • measures uncertainty in terms of scatter around the expectation;
  • measures the distance between realised and expected return, R − E(R);
  • through the square, the sign vanishes and larger deviations are weighted more heavily than smaller ones;
  • by taking the outer expectation, the deviations are weighted according to their likelihoods;
  • the variance is 0 if there is no risk!

Consider now two opportunities to invest. The first investment has a return of high variance, say 20%, and low expectation, for instance 0.01. The second possible investment is without risk, i.e. the variance is zero, and has double the return, 0.02. Which investment is more risky? To make it even more extreme, let the expected return of the risky asset be negative. These examples show that for investment decisions, the variance alone cannot measure the risk. A sensible approach is to consider both mean and variance of the returns. For investment evaluation based on mean and variance, we consider each investment as a pair (μ, σ) of the returns' mean μ = E(R) and standard deviation σ = √Var(R).

Definition 1. An investment (μ₁, σ₁) dominates another investment (μ₂, σ₂) iff

  μ₁ ≥ μ₂,   σ₁ ≤ σ₂,

and one of the inequalities is strict (i.e. not equal). We write (μ₁, σ₁) ≻ (μ₂, σ₂).

An investment is dominated when another investment has a higher expected payoff with less risk. Note that not all pairs can be ordered. Investments that are not dominated form the efficient subset.

Definition 2. Given a set of investments A = {(μᵢ, σᵢ), i ∈ I}, an investment (μ̂, σ̂) ∈ A is an element of the efficient subset A_eff iff it is not dominated, i.e. there is no i ∈ I such that (μᵢ, σᵢ) ≻ (μ̂, σ̂).

We can use the efficient subset to determine reasonable investments. If we have a given set of investments and we want to invest according to mean-variance analysis, only elements of the efficient subset are reasonable. We can compute the efficient subset by testing pairwise dominance and discarding all elements that are dominated. Graphically, dominance means that the dominating asset lies towards the top left in the σ-μ plane, see Figure 11. To summarize, the pair of expectation and standard deviation of returns

  • represents both the level of return that we can expect and the risk we take;
  • allows different investments to be compared on the (σ, μ)-plane;
  • and cancelling out the pairs for which we find a better alternative leaves us with the efficient subset.
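The pairwise-dominance test from Definitions 1 and 2 can be sketched directly. The (μ, σ) pairs below are made-up example investments, not the stocks of Figure 11:

```python
# Efficient subset via pairwise dominance testing (Definitions 1 and 2).
investments = [(0.02, 0.10), (0.03, 0.10), (0.01, 0.05), (0.03, 0.20)]

def dominates(a, b):
    """(m1, s1) dominates (m2, s2) iff m1 >= m2, s1 <= s2, one strict."""
    (m1, s1), (m2, s2) = a, b
    return m1 >= m2 and s1 <= s2 and (m1 > m2 or s1 < s2)

# keep every investment that no other investment dominates
efficient = [p for p in investments
             if not any(dominates(q, p) for q in investments if q != p)]
print(efficient)  # [(0.03, 0.1), (0.01, 0.05)]
```

Here (0.02, 0.10) is dominated by (0.03, 0.10) (same risk, higher return) and (0.03, 0.20) by (0.03, 0.10) (same return, lower risk), leaving the two undominated pairs.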

This is the basis of investment theory! We also observe some drawbacks of the variance as risk measure, namely,

  • unexpected large profit contributes same as a loss Remember.
  • we cannot distinguish between frequent small losses and rare huge losses.
  • The variance is computed from historical prices, and gives us no tool to include the impact of events (such as the outbreak of a global pandemic, Brexit, or the storming of the US Capitol) which are not reflected in historical price series.

These are severe shortcomings, and therefore further risk measures have been developed. Building good risk measures is, as building models, a highly complex task. Each attempt to pin down the risk in a single number will ultimately fail to assess the true risk completely. The nature of financial risks is too complex. However, quantifying essential aspects of the risk in a single number is utterly important in order to deal with the risk in a responsible manner. When dealing with

Figure 11: Several stocks in the σ-μ plane and their efficient subset.

risk measures, it is crucial to understand the scope of the measure: what does it reflect, and what does it not reflect? Remember, there is always something that the risk measure does not capture, so always make sure you understand very well what it does and does not reflect. Ultimately, the development and understanding of measures of financial risk is highly relevant for financial institutions, and sound risk measures are required to control the risk of investments. On a systemic level, controlling the risk of the investments of all institutions is required to guarantee the stability of the financial system and of the economy as a whole. Next, we consider the semi-variance as a slight adaptation of the variance as a risk measure, and then we turn to the risk measures most commonly used in practice, the shortfall probability and the value-at-risk. Finally, we briefly discuss the concept of stress testing, which is one of the pillars of financial risk assessment.

Semi-variance The first drawback of the variance as a risk measure listed above is that losses and gains contribute equally to the variance, while investors welcome gains and suffer from losses. To account for this, the semi-variance is defined as E(min{0, X − μ}^2). It measures the downside risk. As a major drawback, we observe that it still depends strongly on the mean μ; the other two criticisms listed above also remain valid.
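A minimal sketch of estimating the semi-variance from a sample of returns (the function name is ours):

```python
from statistics import mean

def semi_variance(returns):
    """Sample version of E(min{0, X - mu}^2): only returns below the
    sample mean contribute, so upside surprises are not penalised."""
    mu = mean(returns)
    return mean(min(0.0, x - mu) ** 2 for x in returns)

# Symmetric sample: the semi-variance is half of the variance 0.005625.
print(semi_variance([0.10, -0.05, 0.10, -0.05]))
```

For a symmetric return distribution the semi-variance is exactly half the variance; the two measures only start to differ for skewed returns.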

3.1 Shortfall probability

The variance is a very simple measure of investment risk. While it enables us to easily compare stocks, for a more detailed investigation more advanced risk measures need to be considered. Shortcomings include:

  • due to its dependency on the expected value, assets with a larger expected value may seem riskier although they are not;
  • unexpected large gains are valued the same as unexpected large losses;
  • the variance does not give any information about the size of possible losses or their probabilities. A likely small loss can have the same variance as a less likely huge loss.

Figure 12: Illustration of the empirical shortfall probability.

To solve the problems, the shortfall probability and the Value at Risk can be considered. They answer the questions

  • how likely are large losses (shortfall probability);
  • how large are likely losses (Value at Risk).

Both are based on the realised loss L = −R (note that we can use either the return R or the log-return X in the definition of the loss, depending on the situation; the results will differ only slightly). The shortfall probability can best be evaluated using the distribution function of the return R, FR(x) = P(R ≤ x):

SF(b, R) = P(L ≥ b) = FR(−b),

see Figure 13 for an illustration. Roughly, the shortfall probability measures how likely large losses are. More precisely, it measures how likely losses larger than a pre-specified threshold are. How do we compute the shortfall probability? If we have a model at hand, we can do so with the help of the density, or of the distribution function directly. If instead we have an observation of a time series of daily returns Xt for days t = 1, ..., N, we need to evaluate the empirical shortfall probability. This is given by

SFe(b) = |{t : 1 ≤ t ≤ N, s.t. −Xt ≥ b}| / N.

Figure 12 illustrates how to obtain the empirical shortfall probability for 20 samples and the threshold b = 0.1. We count 4 samples whose loss exceeds the threshold, which is 20%, therefore SFe(0.1) = 20%.
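The counting illustrated in Figure 12 can be sketched as follows (the function name is ours; whether the inequality is strict only matters if a loss hits the threshold exactly):

```python
def empirical_shortfall(returns, b):
    """Fraction of observed returns whose loss -X_t reaches the threshold b."""
    return sum(1 for x in returns if -x >= b) / len(returns)

# 4 of 20 observations lose at least 10%:
sample = [-0.15] * 4 + [0.05] * 16
print(empirical_shortfall(sample, 0.10))  # 0.2
```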

3.2 Value at Risk and α -quantiles

Week 4 The shortfall probability quantifies how likely losses beyond a given threshold are. Asking differently, we may want to know with which level of loss we probably have to deal. For instance, we would like to be prepared to compensate all likely losses with cash, while we leave open how we move on when a larger loss happens, because that scenario is unlikely. The notion of value-at-risk makes this mathematically precise. First, we have to specify what we mean by likely losses. This is done by specifying a confidence level, for instance 95%. The value-at-risk is the maximum amount to be lost with a specified likelihood, i.e. at a pre-defined confidence level. For example, if the 95% VaR is 1 million, there is 95% confidence that the portfolio will not lose more than 1 million. The Value at Risk is defined as

VaR α = inf{ b : P ( L > b ) < 1 − α }.

If the distribution function of the return FR is continuous and strictly increasing, we can use the inverse function to evaluate the value at risk:

VaRα = −F_R^{−1}(1 − α).

Note: we usually evaluate VaR α for α > 0_._ 5, e.g. 95% or 99%, which yields 1 − α < 0_._ 5. See Figure 14 for the illustration of the evaluation using the density function.
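For a continuous model the formula VaRα = −F_R^{−1}(1 − α) can be applied directly. As an illustrative sketch, assume daily returns follow R ~ N(0.01, 0.05²) (this model and its parameters are our assumption, not from the lecture) and use Python's standard library:

```python
from statistics import NormalDist

# Assumed model: daily return R ~ N(mu = 1%, sigma = 5%).
returns = NormalDist(mu=0.01, sigma=0.05)

alpha = 0.95
var_alpha = -returns.inv_cdf(1 - alpha)  # VaR_alpha = -F_R^{-1}(1 - alpha)
print(round(var_alpha, 4))  # about 0.0722, i.e. a 7.22% loss at 95% confidence
```

The same one-liner works for any distribution object offering an inverse CDF.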

Figure 13: Evaluating the shortfall probability SF(1.5, X) using the distribution function (here X is the log-return)

Reminder: The distribution function FX(x) := P(X ≤ x) is right-continuous.

Definition 3. For α ∈ (0, 1) the number

q^α(X) = inf{x : α < FX(x)}

is called the upper α-quantile of X,

q_α(X) = inf{x : α ≤ FX(x)}

is called the lower α-quantile of X. Any q ∈ [q_α(X), q^α(X)] is called an α-quantile of X.

  • If FX is continuous and strictly increasing,

q^α(X) = q_α(X) = F_X^{−1}(α).

  • VaRα = −q^{1−α}(X).

Note: different notations are used in practice, e.g. VaR95% is sometimes denoted VaR5%. Examples:

  1. uniform distribution → Tutorials
  2. normal distribution → Homework

Figure 14: Evaluating the Value at Risk VaR95% using the inverse of the distribution function (here X log-return)

  3. discrete random variable: Note: This example is not part of the lecture. It is only included for your own interest.

Why do we consider discrete random variables? Two examples are

a) Binary options, e.g.

Payoff =
{
£100, ST < £1,200,
£0,  ST ≥ £1,200.

b) Corporate bond with given probability p for a default. E.g.

Return =
{
1, with probability 1 − p,
−1, with probability p (default).

Let’s work on an example return of

RT =
{
−0.9, with probability 0.1,
−0.1, with probability 0.4,
1, with probability 0.5.

The distribution function is given as

FRT(x) = P(RT ≤ x) =
{
0,  x < −0.9,
0.1, −0.9 ≤ x < −0.1,
0.5, −0.1 ≤ x < 1,
1,  1 ≤ x.

With the help of a sketch we see

VaR80% = −q^{1−0.8}(RT) = −q^{0.2}(RT) = −inf{x : 0.2 < FRT(x)} = −inf{x : x ≥ −0.1} = 0.1,

and also q 0_._ 2 ( RT ) = − 0_._ 1 for the lower quantile. For a threshold at 90% we have

VaR90%( RT ) = − inf{ x : 0_._ 1 < FRT ( x )} = 0_._ 1 ,

while the lower quantile yields

q 0_._ 1 ( RT ) = inf{ x : 0_._ 1 ≤ FRT ( x )} = inf{ x : x ≥ − 0_._ 9 } = − 0_._ 9_._

This means that any q ∈ [− 0_._ 9 , − 0_._ 1] is a 10%-quantile of RT.
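The quantile computations above can be checked numerically. For a discrete distribution both infima in Definition 3 are attained at jump points of F, so it suffices to search over the support (all function names are ours):

```python
def cdf(x):
    """Distribution function F of R_T from the example."""
    if x < -0.9:
        return 0.0
    if x < -0.1:
        return 0.1
    if x < 1.0:
        return 0.5
    return 1.0

SUPPORT = [-0.9, -0.1, 1.0]  # the jump points of F

def upper_quantile(alpha):
    """q^alpha = inf{x : alpha < F(x)}, attained at a jump point."""
    return min(x for x in SUPPORT if alpha < cdf(x))

def lower_quantile(alpha):
    """q_alpha = inf{x : alpha <= F(x)}."""
    return min(x for x in SUPPORT if alpha <= cdf(x))

print(-upper_quantile(0.2))  # VaR_80% = 0.1
print(-upper_quantile(0.1))  # VaR_90% = 0.1
print(lower_quantile(0.1))   # q_0.1 = -0.9
```

The gap between lower and upper 10%-quantile reproduces the interval [−0.9, −0.1] found above.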

Empirical Value-at-risk To compute the empirical value-at-risk, proceed along the following two steps. Let α be the confidence level and let Rt, t = 1, ..., N, be the observed daily returns.

  1. Sort the values of Rt by magnitude.

  2. Among the smallest ⌊(1 − α)N⌋ samples, choose the value of the largest one; the empirical value-at-risk is minus this value.

This process is illustrated in Figure 15. To deepen the understanding of the empirical value-at-risk,

Figure 15: Illustration of the empirical value-at-risk.

remember that sorting the observations by magnitude yields the empirical distribution function. The value-at-risk is minus the (1 − α)-quantile of the empirical distribution, compare Figure 16.
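The two-step procedure above can be sketched as follows (there are several conventions for rounding (1 − α)N; we take the floor, and at least one sample):

```python
def empirical_var(returns, alpha):
    """Minus the largest of the floor((1 - alpha) * N) smallest returns."""
    ordered = sorted(returns)
    k = max(int((1 - alpha) * len(returns)), 1)  # number of tail samples
    return -ordered[k - 1]

# 100 returns: -0.50, -0.49, ..., 0.49; the 5 smallest end at -0.46.
sample = [i / 100 for i in range(-50, 50)]
print(empirical_var(sample, 0.95))  # 0.46
```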

Shortcomings of the Value-at-risk A shortcoming of the Value at Risk is that it does not give us any information about the distribution of the loss in the unlikely case, which occurs with probability 1 − α. Furthermore, it does not enable us to study the influence of possible events without precedent.

Figure 16: Illustration of the empirical value-at-risk as quantile of the empirical distribution.

3.3 Stress test

The idea behind stress testing is to model important possible scenarios and then compute the related risk. A possible implementation is done along the following steps:

  1. Build a factor model for the ingredients of the portfolio.
  2. Specify a set of stress scenarios S ⊂ Ω (for instance high/moderate/low interest rates and high/moderate/low inflation rates).
  3. For all ω ∈ S compute the future portfolio gain G(ω).
  4. For the loss L = −G compute the worst-case loss

Lworst(L) = sup{L(ω) | ω ∈ S},

where we restrict our attention to those elements of the space of possible events that belong to S, our selected scenarios.

Example: Consider one stock S and a risk-free asset with rate r. We assume the stock price is given by the random variable

S1 =
{
S0(1 + μ + σ), p = 1/2,
S0(1 + μ − σ), p = 1/2,

where the current mean and standard deviation parameters are μ = 0.05 and σ = 0.1. The risk-free rate is r = 4% and we have invested £1,000 each in the stock and the risk-free security. The stress test defines certain scenarios and returns the worst-case loss. Then one needs to check whether the result is acceptable (passing the stress test) or not (failing it). In our case these scenarios could be

S = {“μ = −0.5, σ = 0.05, r = 0.03”, “μ = 0, σ = 0.2, r = 0.01”, ...}

In each case the worst-case return is computed, e.g.

R(“μ = −0.5, σ = 0.05, r = 0.03”) = (1000(1 − 0.5 − 0.05) + 1000(1 + 0.03))/2000 − 1 = −0.26,
R(“μ = 0, σ = 0.2, r = 0.01”) = (1000(1 + 0 − 0.2) + 1000(1 + 0.01))/2000 − 1 = −0.095.

In this case the largest loss would be L = 26%.
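The worst-case computation of the example can be sketched as a loop over the scenario set (the data layout and function name are ours):

```python
# Each scenario fixes (mu, sigma, r), as in the example above.
scenarios = [(-0.5, 0.05, 0.03), (0.0, 0.2, 0.01)]

def worst_case_loss(scenarios, in_stock=1000.0, in_riskfree=1000.0):
    """In each scenario take the worst stock outcome (return mu - sigma),
    compute the portfolio return, and return the largest loss L = -R."""
    total = in_stock + in_riskfree
    losses = [1 - (in_stock * (1 + mu - sigma) + in_riskfree * (1 + r)) / total
              for mu, sigma, r in scenarios]
    return max(losses)

print(round(worst_case_loss(scenarios), 4))  # 0.26, i.e. a 26% worst-case loss
```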

4 Mean-variance portfolio theory

Week 5

  • What is a portfolio? - A collection of investments held (here stocks/risk-free securities)
  • Why is portfolio theory so interesting? - A lot more can happen than with single assets.

Illustrative example: Ice-cream sellers & umbrella sellers. We model that the following summer will be either rainy or sunny, both equally likely. If it is rainy, the umbrella sellers make a larger profit, while ice-cream sellers make a loss. In a sunny summer the situation is the reverse.

                                    rainy summer (p = 50%)   sunny summer (p = 50%)
Return ice-cream sellers (R^1)             −5%                      +10%
Return umbrella corporation (R^2)         +10%                       −5%

Both investments have an expectation of 2.5% and a standard deviation of 7.5%. If we buy equal parts of the ice-cream seller and the umbrella corp. we have a return of

(1/2)R^1 + (1/2)R^2 =
{
2.5%, p = 50%,
2.5%, p = 50%,

i.e. a safe return of 2.5% (standard deviation zero). Why does this happen? → Both investments are negatively correlated:

corr(R^1, R^2) = Cov(R^1, R^2) / (σ1 σ2) = E((R^1 − μ1)(R^2 − μ2)) / (σ1 σ2) = −1.

Note that a correlation of −1 is a very extreme case, unlikely to happen in practice. Let us have a look at an example with no correlation:

R^1 =
{
10%, p = 1/2,
−5%, p = 1/2,

R^2 =
{
10%, p = 1/2,
−5%, p = 1/2,

which are independent of each other. Due to their independence, the joint distribution now has four cases:

probability:       25%    25%    25%    25%
R^1                10%    10%    −5%    −5%
R^2                10%    −5%    10%    −5%
(1/2)(R^1 + R^2)   10%    2.5%   2.5%   −5%

For the portfolio this yields E((1/2)(R^1 + R^2)) = 2.5% and a standard deviation of √Var((1/2)(R^1 + R^2)) ≈ 5.3% < 7.5%. In general we see that a portfolio can have a smaller variance than the individual assets, while having an expected return as large as both assets. This is called diversification.
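The four-case table can be reproduced directly in code (probabilities and returns are taken from the example; the variable names are ours):

```python
from itertools import product

# Independent returns: +10% or -5%, each with probability 1/2.
outcomes = [0.10, -0.05]
cases = [(0.5 * r1 + 0.5 * r2, 0.25) for r1, r2 in product(outcomes, repeat=2)]

mean = sum(r * p for r, p in cases)                  # expected return
sd = sum((r - mean) ** 2 * p for r, p in cases) ** 0.5  # standard deviation
print(round(mean, 4), round(sd, 4))  # 0.025 0.053: same mean, smaller risk
```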

4.1 Introduction to portfolios

In the following, we try to optimise our portfolio. We will figure out how to distribute the money and what other choices remain. Let us consider

  • 2 Stocks S^1 , S^2 with expected return μ 1 , μ 2 , variances σ 12 , σ 22 and correlation ρ ;
  • 2 dates t ∈ { 0 , 1 }: - t = 0 is today , i.e. values S^1 (0) , S^2 (0) are deterministic; - t = 1 is some point in the future , i.e. values S^1 (1) , S^2 (1) are random variables.

A portfolio consists of buying/owning x 1 stocks of asset 1 and x 2 of asset 2. The current value is known: P ( x 1 ,x 2 )(0) = x 1 S^1 (0) + x 2 S^2 (0) ,

and the future value is a random variable:

P ( x 1 ,x 2 )(1) = x 1 S^1 (1) + x 2 S^2 (1).

Note that the amount of shares x1, x2 can be quite disproportional to their value, e.g. with x1 = 1 and S^1(0) = £150 the value is higher than for x1 = 10 and S^1(0) = £10. We therefore introduce weights w1, w2, which represent the proportion of our wealth P(x1,x2)(0) invested in the two assets:

w1 = x1 S^1(0) / P(x1,x2)(0),  w2 = x2 S^2(0) / P(x1,x2)(0),

with w1 + w2 = 1. The weights allow us to conveniently express the return R_P(w1,w2) of our portfolio in terms of the individual returns R^1 = S^1(1)/S^1(0) − 1 and R^2 = S^2(1)/S^2(0) − 1:

R_P(w1,w2) = P(x1,x2)(1) / P(x1,x2)(0) − 1
           = (x1 S^1(1) + x2 S^2(1)) / P(x1,x2)(0) − 1
           = x1 S^1(0)/P(x1,x2)(0) · S^1(1)/S^1(0) + x2 S^2(0)/P(x1,x2)(0) · S^2(1)/S^2(0) − 1
           = w1 S^1(1)/S^1(0) + w2 S^2(1)/S^2(0) − w1 − w2
           = w1 R^1 + w2 R^2,

i.e. R P^ = w 1 R^1 + w 2 R^2. If we know the proportions we wish to invest in each asset, i.e., we have the weights w 1 , w 2 given, we can compute the amount of shares as

x1 = w1 P(0) / S^1(0),  x2 = w2 P(0) / S^2(0).

We typically ignore the fact that we cannot buy fractions of a share.

Example 1. We wish to invest £1,000 in equal parts in two assets with S^1(0) = 10, S^2(0) = 100. With the initial wealth P(0) = 1,000 and the weights w1 = w2 = 1/2, we can compute the amount of stocks:

  • Asset 1: x1 = w1 P(0)/S^1(0) = 500/10 = 50, i.e. we buy 50 shares of company 1;
  • Asset 2: x2 = w2 P(0)/S^2(0) = 500/100 = 5, i.e. we buy 5 shares of company 2.

We can compute the return either based on new prices or based on the given returns:

  1. Given new prices S^1(1) = 12, S^2(1) = 110: P(1) = x1 S^1(1) + x2 S^2(1) = 600 + 550 = 1,150, i.e. R_P = 15%.

  2. Given returns R^1 = 20% and R^2 = 10%: R_P = w1 R^1 + w2 R^2 = 15%.
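Example 1 can be reproduced in a few lines (fractional shares are allowed, as noted above; the new price S^2(1) = 110 is the one consistent with the stated 10% return):

```python
def shares_from_weights(wealth, weights, prices):
    """x_i = w_i * P(0) / S_i(0)."""
    return [w * wealth / s for w, s in zip(weights, prices)]

x = shares_from_weights(1000, [0.5, 0.5], [10, 100])
print(x)  # [50.0, 5.0]

# New prices S1(1) = 12, S2(1) = 110 give P(1) = 1150, i.e. a 15% return.
new_value = x[0] * 12 + x[1] * 110
print(new_value / 1000 - 1)
```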

4.2 Mean & variance of the portfolio

Given the formula for the return of the portfolio R_P = w1 R^1 + w2 R^2, we can compute its expected value μ_P = E(R_P) and the variance σ_P^2 = Var(R_P):

μ_P = E(w1 R^1 + w2 R^2) = w1 E(R^1) + w2 E(R^2) = w1 μ1 + w2 μ2;

as well as

σ_P^2 = Var(R_P) = E((R_P − E(R_P))^2)
      = E((w1 R^1 + w2 R^2 − E(w1 R^1 + w2 R^2))^2)
      = E((w1 R^1 − E(w1 R^1) + w2 R^2 − E(w2 R^2))^2)
      = E((w1 R^1 − E(w1 R^1))^2) + E((w2 R^2 − E(w2 R^2))^2) + 2 E((w1 R^1 − E(w1 R^1))(w2 R^2 − E(w2 R^2)))
      = w1^2 Var(R^1) + w2^2 Var(R^2) + 2 w1 w2 Cov(R^1, R^2)
      = w1^2 σ1^2 + w2^2 σ2^2 + 2 w1 w2 σ1 σ2 ρ.

Theorem 1. For a portfolio of two assets (μ1, σ1) and (μ2, σ2) with correlation ρ and the proportion w1 invested in asset 1, the expectation μ_P and variance σ_P^2 of the portfolio's return satisfy

μ_P = w1 μ1 + (1 − w1) μ2,
σ_P^2 = w1^2 σ1^2 + 2 w1 (1 − w1) σ1 σ2 ρ + (1 − w1)^2 σ2^2.
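Theorem 1 translates directly into code; checking it against the ice-cream/umbrella example (ρ = −1, equal weights) recovers the risk-free portfolio:

```python
def portfolio_mean_var(w1, mu1, mu2, s1, s2, rho):
    """Mean and variance of the two-asset portfolio of Theorem 1."""
    w2 = 1 - w1
    mu_p = w1 * mu1 + w2 * mu2
    var_p = w1**2 * s1**2 + 2 * w1 * w2 * s1 * s2 * rho + w2**2 * s2**2
    return mu_p, var_p

# Ice-cream/umbrella example: equal weights and rho = -1 remove all risk.
mu_p, var_p = portfolio_mean_var(0.5, 0.025, 0.025, 0.075, 0.075, -1.0)
print(mu_p, var_p)  # 0.025 0.0
```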

4.3 Attainable sets of portfolios

Which points of the (σ, μ)-plane can we attain:

{(σ_P(w1), μ_P(w1)) : w1 ∈ [0, 1]}?

Let's start with some special cases:

  • ρ = 1: extreme positive correlation (artificial setting)

μ_P = w1 μ1 + (1 − w1) μ2,
σ_P^2 = w1^2 σ1^2 + 2 w1 (1 − w1) σ1 σ2 + (1 − w1)^2 σ2^2 = (w1 σ1 + (1 − w1) σ2)^2
⇒ σ_P = w1 σ1 + (1 − w1) σ2.

Here the attainable set is a straight line connecting both assets.

  • ρ = −1: extreme negative correlation (also artificial)

μ_P = w1 μ1 + (1 − w1) μ2,
σ_P^2 = w1^2 σ1^2 − 2 w1 (1 − w1) σ1 σ2 + (1 − w1)^2 σ2^2 = (w1 σ1 − (1 − w1) σ2)^2
⇒ σ_P = |w1 σ1 − (1 − w1) σ2|.

Again the attainable set is a straight line, but with one kink.

See the lecture slides for the non-trivial examples. Now let’s summarise the general case:

Theorem 2. The attainable set for μ1 ≠ μ2, ρ ∈ (−1, 1) is a hyperbola with its centre on the vertical axis.

Proof e.g. Theorem 2.7 of Capinski, Kopp.

4.4 Minimal Variance Portfolio (MVP)

Week 6 The attainable portfolio with the minimal risk (i.e. smallest σ_P) is called the minimal variance portfolio (MVP).

Example 2. For ρ = 0 we have

σ_P^2 = w1^2 σ1^2 + (1 − w1)^2 σ2^2,  μ_P = w1 μ1 + (1 − w1) μ2.

To find the MVP, we need to find w1 ∈ [0, 1] such that σ_P is minimal (and equivalently σ_P^2). I.e. we solve the optimisation problem

min_{w1 ∈ [0,1]} w1^2 σ1^2 + (1 − w1)^2 σ2^2.

Since the function is convex in w1, we can find the minimum as the root of the first derivative:

d/dw1 (w1^2 σ1^2 + (1 − w1)^2 σ2^2) = 2(σ1^2 + σ2^2) w1 − 2 σ2^2 = 0,

i.e.

w1 = σ2^2 / (σ1^2 + σ2^2),  w2 = σ1^2 / (σ1^2 + σ2^2).

This yields

σ_MVP^2 = σ1^2 σ2^2 / (σ1^2 + σ2^2),  μ_MVP = (μ1 σ2^2 + μ2 σ1^2) / (σ1^2 + σ2^2).

Theorem 3. Let σ1 > σ2 be the standard deviations of the two assets and ρ ∈ (−1, 1) their correlation. Then the MVP is given by the weights

w1 = max( (σ2^2 − ρ σ1 σ2) / (σ1^2 + σ2^2 − 2 ρ σ1 σ2), 0 )

and w2 = 1 − w1.

Proof. See Tutorial 5.

4.5 Short selling

Assuming no restrictions on short-selling, negative weights are possible:

w1 ∈ ℝ, w2 = 1 − w1 ∈ ℝ.

Negative weights imply leverage (borrow the less profitable asset and sell it to buy the more profitable one).

Theorem 4 (MVP, general case). With no restrictions on short-selling and ρ ∈ (−1, 1), the MVP is given by the weights

w1 = (σ2^2 − ρ σ1 σ2) / (σ1^2 + σ2^2 − 2 ρ σ1 σ2),

and w2 = 1 − w1.
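Theorems 3 and 4 differ only in whether w1 is clipped at zero; a combined sketch (the function name, flag, and example parameters are ours):

```python
def mvp_weight(s1, s2, rho, allow_short=False):
    """Weight w1 of asset 1 in the minimal variance portfolio
    (Theorem 4; with Theorem 3's clipping when short-selling is banned)."""
    w1 = (s2**2 - rho * s1 * s2) / (s1**2 + s2**2 - 2 * rho * s1 * s2)
    return w1 if allow_short else max(w1, 0.0)

print(mvp_weight(0.1, 0.2, 0.0))                    # sigma2^2/(sigma1^2 + sigma2^2), close to 0.8
print(mvp_weight(0.2, 0.1, 0.9, allow_short=True))  # negative: short the riskier asset
print(mvp_weight(0.2, 0.1, 0.9))                    # clipped to 0.0 without short-selling
```

For ρ = 0 the formula reduces to the ρ = 0 result of Example 2.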

4.6 Efficient frontier

Definition 4. The efficient subset of the attainable set is called efficient frontier_._

Observe: The efficient frontier is the part of the attainable set that connects the MVP with the asset of highest expectation (continuing beyond it if short-selling is possible). When no short-selling is possible, it is a closed set (including the end points); otherwise it is half-bounded, with the MVP as its end point.