Monte Carlo Methods-Modeling and Simulation-Handouts, Lecture notes of Mathematical Modeling and Simulation

Dr. Gauri Raghu gave these handouts for course of Modeling and Simulation at Birla Institute of Technology and Science to cover these points: Stochastic, Simulations, Monte, Carlo, Methods, Area, Curve, Random, Number, Generation

Typology: Lecture notes

2011/2012

Uploaded on 07/03/2012

raajesh
raajesh 🇮🇳

4

(1)

16 documents

1 / 16

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
S
St
to
oc
ch
ha
as
st
ti
ic
c
S
Si
im
mu
ul
la
at
ti
io
on
ns
s
5.1 Monte Carlo Methods
Stochastic simulations are based on experimenting with models having stochastic
variants from a given probability distribution. It is a statistical sampling experiment
with models. This sampling involves all the problems of statistical design analysis.
Monte Carlo methods and Genetic Algorithms form two major branches of stochastic
simulation. Monte Carlo method is used to find consequences of a series of events each
having its own probability of occurrence. Genetic algorithms utilize the random
probability based solutions that are iterated using fitness functions and operations of
superimposition and mutation as it happen in biological systems to evolve towards
best solutions. We will discuss in detail genetic algorithms in chapter ten, here we
focus on understanding use of Monte Carlo methods as stochastic technique.
Monte Carlo methods have been applied to two types of modeling schemes.
First one is a class of purely stochastic problems where it is necessary to use random
numbers to imitate the natural randomness of the system. The second kind is where
a deterministic model with no random behavior is replaced by a set of stochastic
processes. We then repeat this over several trials and averages give a solution that is
close to suitable answer for the model. Repeated trials help in producing better
averages; reduce the standard deviation in results and it is always instructive to keep
track of errors in stochastic simulations as we repeat the study for number of trails.
The Monte Carlo approach takes advantage of high speed digital computers
to solve complex problems. Historically, the Monte Carlo method was considered a
technique, using random or pseudo-random numbers, to find solution. Random
numbers are essentially independent random variables that are uniformly
distributed over the unit interval [0, 1]. Actually, what are available at computer
systems are arithmetic codes for generating sequences of pseudo-random digits, where
each digit (0 through 9) occurs with almost equal probability. These programs are
called random number generators capable of producing pseudo-random numbers with
any required number of elements.
The Monte Carlo method is a procedure that takes the advantage of high
speed of computer in solving complex problems in science and engineering where
the analytical solution is not available and experimental procedures are not feasible.
Docsity.com
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff

Partial preview of the text

Download Monte Carlo Methods-Modeling and Simulation-Handouts and more Lecture notes Mathematical Modeling and Simulation in PDF only on Docsity!

SSttoocchhaassttiicc SSiimmuullaattiioonnss

5.1 Monte Carlo Methods

Stochastic simulations are based on experimenting with models having stochastic variants from a given probability distribution. It is a statistical sampling experiment with models. This sampling involves all the problems of statistical design analysis. Monte Carlo methods and Genetic Algorithms form two major branches of stochastic simulation. Monte Carlo method is used to find consequences of a series of events each having its own probability of occurrence. Genetic algorithms utilize the random probability based solutions that are iterated using fitness functions and operations of superimposition and mutation as it happen in biological systems to evolve towards best solutions. We will discuss in detail genetic algorithms in chapter ten, here we focus on understanding use of Monte Carlo methods as stochastic technique.

Monte Carlo methods have been applied to two types of modeling schemes. First one is a class of purely stochastic problems where it is necessary to use random numbers to imitate the natural randomness of the system. The second kind is where a deterministic model with no random behavior is replaced by a set of stochastic processes. We then repeat this over several trials and averages give a solution that is close to suitable answer for the model. Repeated trials help in producing better averages; reduce the standard deviation in results and it is always instructive to keep track of errors in stochastic simulations as we repeat the study for number of trails.

The Monte Carlo approach takes advantage of high speed digital computers to solve complex problems. Historically, the Monte Carlo method was considered a technique, using random or pseudo-random numbers, to find solution. Random numbers are essentially independent random variables that are uniformly distributed over the unit interval [0, 1]. Actually, what are available at computer systems are arithmetic codes for generating sequences of pseudo-random digits, where each digit (0 through 9) occurs with almost equal probability. These programs are called random number generators capable of producing pseudo-random numbers with any required number of elements.

The Monte Carlo method is a procedure that takes the advantage of high speed of computer in solving complex problems in science and engineering where the analytical solution is not available and experimental procedures are not feasible.

The Monte Carlo procedure is mainly used to predict the result of a series of events each having its own probability. One of the earliest problems connected with Monte Carlo method is the famous Buffon's needle problem. In this problem a needle of length l units is thrown randomly onto a floor composed of parallel planks of equal width d units, where d > 1_._ What is the probability that the needle, once it comes to rest, will cross (or touch) a crack separating the planks on the floor? It can be shown

that the probability of the needle hitting a crack is 2l/  d, which can be estimated as

the ratio of the number of throws hitting the crack to the total number of throws.

The term Monte Carlo was introduced by Neumann and Ulam during Second World War, as a code word for the secret work at Los Alamos. It was considered a game similar to the gambling at casinos in the city of Monte Carlo in Monaco. The Monte Carlo method was then applied to problems related to the neutron and radiation transport studies there. Major work involved tracking each neutron history as a process of random diffusion in a fissionable material. Shortly after that Monte Carlo methods were employed to evaluate complex multi-dimensional integrals and to solve many integral equations occurring in physics. Its broad use in statistical physics & chemistry, development of polymers, study of percolation, noise generation & filters and solving heat equation became very popular.

The Monte Carlo method is now the most powerful and commonly used technique for analyzing complex problems. Applications can be found in many fields from radiation transport to river basin modeling. Recently, the range of applications has been broadening, and the complexity and computational effort required has been increased. However, we must remember that stochastic simulation of a system and Monte Carlo procedure are two different approaches.

To point out some of the differences in deterministic simulations and Monte Carlo procedure, in the Monte Carlo method time does not play as substantial a role as it does in stochastic simulation. Observations in the Monte Carlo method are independent. However, when we experiment with a deterministic model over time, the observations are serially correlated. Also, in the Monte Carlo simulations, it is possible to express the response as a rather simple function of the stochastic input variants. Whereas in deterministic simulations, the response is usually a very complicated function and is expressed explicitly only by the computer program itself. Let us start introduction to Monte Carlo method by its use for an estimate of area under a curve.

5.2 Area Under the Curve Using Monte Carlo Method

We consider the simple example of obtaining the area under the

curve x^2  y^2  1 using Monte Carlo simulation procedure. The shaded area is

quarter of the circle shown in Figure 5.1.

A script file named monte_area_1.m is given below for this algorithm to be implemented in MATLAB. Here we save the results of a 100 darts throw in the form of a graph with hit as small circle and a miss as a star. Then it is stored as a bitmap frame and graph is allowed to hold on with already drawn data. Then in next iteration, 100 more dots are thrown and plotted as hits or miss on the same graph. This is saved as a second frame. The process is allowed to continue for 20 frames and it is played back as a movie. You may increase the number of frames and decrease the size of points on graph for further enhancement. Running this program in MATLAB will produce a true visualization of emerging area of circle by Monte Carlo selection. The results, in the form of a snapshot of a frame of the movie are given as Fig. 5.2.

In this program, we have to repeat many times the step 1 where we need two random numbers. To accomplish this we require a large supply of random numbers and it is in itself a complex procedure. MATLAB uses a function subroutine rand to generate random numbers uniformly in [0, 1] interval. These can be transformed to generate numbers in some other interval also. For instance, if a subprogram is generating number rand between 0 and 1, then we could simply multiply by a factor d to shift random numbers in interval [0, d ]. The randn function generates normal distribution based random numbers with average zero and variance of one.

% Program name: monte_area_1.m % Program to find area under curve by Monte Carlo method % Consider a quarter of a cirular disk in range of % 0 < x < 1 and 0 < y < 1. We are interested % to find its area using Monte Carlod method. % This generates a movie of 20 frames with % increasing number of dots thrown at the region. N = 100; hit = 0; mis = 0; xs = 1; ys = 0; rand('state', 0) % this sets the generator to an initial state nframes = 10; % number of frames for movie delta = 1.0/N for i = 1: nframes for j = 1: N xs = xs - delta; ys = abs(sqrt(1- (xsxs))); plot(xs, ys, '--') axis([0 1 0 1]) hold on end for k = 1: N x = rand; y = rand; if((xx + yy)<= 1 ) hit = hit + 1; plot(x, y, 'r:o') hold on else mis = mis + 1; plot(x, y, '') end end m(k)= getframe; end movie(M, 30) % play movie 20 times. probability = hit/N area = probability**

5.3 Random Number Generation

Random numbers form an essential part of the Monte Carlo method. The term random number is generally used for uniform random number. Many techniques for generating random numbers have been suggested, tested, and used in recent years. Some of these are based on random phenomena, others on deterministic recurrence procedures.

Initially, manual methods were employed, including coin flipping, dice rolling, card shuffling, and roulette wheels. It was believed that only mechanical (or electronic) devices could yield "truly" random numbers. These methods were too slow for general use and sequences generated by them could not be reproduced. Shortly following the development of fast digital computers it became possible to obtain random numbers with its aid. One method of generating random numbers on a digital computer consists of preparing a table and storing it in the memory of the computer. In 1955 the RAND Corporation published a well known table of a million random digits that may be used in forming such a table. The advantage of this method is reproducibility; its disadvantage is its lack of speed and the risk of exhausting the table.

In view of these difficulties, John von Neumann suggested the mid-square method, using the arithmetic operations of a computer. His idea was to take the square of the preceding random number and extract the middle digits. For instance, if we are generating four-digit numbers and arrive at 5232, we square it, obtain 27,373,824; the next number consists of the middle four digits (namely, 3738 ) and the procedure is repeated. This raises a logical question: how can such sequences, defined in a completely deterministic way, be random? The answer is that they are

Figure 5.3 Effect of number of histories on an estimate of area within a circle using Monte Carlo method.

not really random, but only seem so, and are in fact referred to as pseudo-random or quasi-random; still we call them random, with the appropriate reservation. Von Neumann's method was also slow and awkward for statistical analysis; in addition the sequences tend to cyclic, and once a zero was encountered the sequence stopped.

The random numbers generated by this or any other method are "good" ones if they are uniformly distributed, statistically independent, and reproducible. A good method is, moreover, necessarily fast and requires minimum memory capacity. Since all these properties are not generally found, some compromise is always needed when developing random number generators.

Before we present various methods to generate random numbers, a brief introduction to some notations and concepts in number theory is important. When the difference of two numbers S and T is evenly divisible by an integer M , then s is termed as “ congruent to T modulo M ”. It is written as

STmodM (5.2)

For instance, consider 27 = 7 mod 5. Here, S is 27 ; T is 7 with a difference of 20. It is evenly divisible by 5. We may also say that the quotient (S – T )/M is an integer i.e. (27 – 7)/5 = 4. If S, T and M satisfy Eq. (5.2), then two further relations exist:

  1. S, T and M have same remainder after division by M. For example, 27 = 7 mod 5,

 and 5

  1. Also, the following relation exists: (S – T) = I ×M. (5.3)

Where, I is an integer. For instance, 27 = 7 mod 5 means that 27 – 7 = 4 ×5. When we look carefully into the second relation, we see that the S may have many possible values when T and M are given. These are known as collection. For instance, S = 7 mod 5, which implies S = T + I ×M

Table 5.1 A Collection of S (S = 7 mod 5) I - 1 0 1 2 3 4

S 2 7 12 17 22 27

The smallest positive number in the collection of S is known as the least positive residue. In above table, the least positive residue is equal to 2.

Example 5.

Find the least positive residue if T and M are given.

Where, the multiplier a, the increment c, and the modulus m are nonnegative integers. The modulo notation ( mod m) means that

Xi (^)  1 ( aXicmki ), (5.7)

where, ki =[( a Xi + c ) / m ] denotes the largest positive integer in (aXi+ c)/m. Given an initial starting value Xo (also called the seed), Equation 5.6 yields a congruence relationship (modulo m) for any value i of the sequence { Xi }. Generators that produce random numbers according to above equation are called mixed congruent generators. The random numbers on the unit interval (0, 1) can be obtained by

m

X

U (^) ii (5.8)

It is clear that such a sequence will repeat itself in at most m steps, and will therefore be periodic. For instance, let a = c = Xo = 3 and m = 5 ; then the sequence obtained from the recursive formula Xi+1 = 3XI + 3 ( mod5 ) is Xi = 3, 2, 4 , 0, 3.

It follows from the above equation that Xi < m for all i value. This inequality means that the period of the generator cannot exceed m. Therefore, the sequence Xi can contain at the most m distinct numbers.

Because of the deterministic character of the sequence, the entire sequence recurs as soon as any number is repeated. We say that the sequence gets into a loop, that is, there is a cycle of numbers that is repeated endlessly. It is shown that all sequences having the form Xi = f(Xi). We want, of course, to choose m as large as possible to ensure a sufficiently large sequence of distinct numbers in a cycle.

Let p be the period of the sequence. When p equals its maximum, that is, when p = m , we say that the random number generator has a full period. It can be shown that the generator defined in Eq. (5.6) has a full period, m , if and only if:

c is relatively prime to m, that is, c and m have no common divisor.  A= 1(mod g) for every prime factor g of m.  A=1(mod 4) if m is a multiple of 4. The first condition means that the greatest common divisor of c and m is unity. Condition two means that a = g[a/g] + 1. Let g be a prime factor of m ; then denoting K= [a/g] , we may write a = 1 + gk. Greenberger have already shown that the correlation coefficient between Xi and Xi+1, lies between the values:

(^161) , m

a m

c am

c a

and that its upper bound is achieved when a = ml/2^ irrespective of the value of c. Since most computers utilize either a binary or a decimal digit system, we select m = 2b, or m = 10b^ respectively where, b denotes the word-length of the particular computer. We discuss both cases separately.

The first condition for a binary computer means that m is 2 b. It guarantees a full period. For m = 2b , the parameter c must be odd. It can be achieved by setting a

equal to 2 r^ + 1 , where, r is greater or equal to two. It is noted in the literature that good statistical results can be achieved while choosing m = 2^35 , a = 2^7 + 1, and c = 1.

For a binary computer we again choose m = 2b^ and it is shown that the

maximal period is achieved when a = 8r  3. Here r is any positive integer. The

procedure for generating pseudo-random numbers on a binary computer can be written as:

1. Choose any odd number as a starting value Xo.

2. Choose an integer a = 8r  3, where r is any positive integer.

_3. Choose value of a close to 2b/2 (if b = 35, a = 8 x 217 + 3 is a good selection).

  1. Compute X 1 using fixed-point integer arithmetic. This product will consist of 2b bits_ _from which the high-order b bits are discarded, and the low-order b bits represent X 1.
  2. Calculate U 1 = X 1 /2b to obtain a uniformly distributed variable.
  3. Each successive random number Xi+1 is obtained from the low-order bits of the product_ aXi.

In a decimal computer the value of m is 10 b. In order to generate a sequence with a full period, c must be a positive number not divisible by g = 2 or g = 5. Satisfactory statistical results have been achieved by choosing a = 101, c = 1, r > 4. In this case Xo had little or no effect on the statistical properties of the generated sequences. The second widely used generator is the multiplicative generator of the following form:

X (^) i  1 ( aXi )(mod m ), (5.9)

It is a particular case of the mixed generator with c = 0. It can be shown that, generally, a full period cannot be achieved here, but a maximal period can, provided that Xo is relatively prime to m and a meets certain congruence conditions.

For a decimal computer m = 10b. The maximal period is achieved when a is

200r  p. The r is any positive integer and p is any of the following 16 numbers: (3,

11, 13, 19, 21, 27, 29, 37, 53, 59, 61, 67, 69, 77, 83, 91). The procedure for generating ran- dom numbers on a decimal computer can be based on following procedure:

Choose any odd integer not divisible by 5 as a starting value X 0.Then choose an integer a = 200r + p for a constant multiplier, where r is any integer and p is any of the values 3, 11, 13, 19, 21, 27, 29, 37, 53, 59, 61, 67, 69, 77, 83, 91.

 Choose a close to 10b/2. (If b = 10, a = 100,000  3 is a good selection.)

Compute aXo using fixed point integer arithmetic. This product will consist of 2b digits, from which the high-order b digits are discarded, and the low-order digits are the value of X 1. Integer multiplication instructions automatically discard the high-order b digits.The decimal point must be shifted b digits to the left to convert the random number (which is an integer) into a uniformly distributed variable defined over the unit interval U 1 = X 1 / 10b. Each successive random number Xi+1, is obtained from the low-order digits of the product aXi.

It remains fixed over the range [0, 1] and goes to zero after the range. It is also termed a pulse of fixed height and duration. The v is a standard exponential deviate with probability density function of the following form:

otherwise ,

e , v g(v)

v (^) 0

0

It is a rapidly decreasing function as the value of v is allowed to increase. The function is zero for negative values of v. Similarly, z is a standard normal or a Gaussian deviate with a distribution shown in Fig. 5.4. It has a symmetric behavior around the zero value and is a bell-shaped function. Its probability density function is given by

h( z) e z^ / ,  z .

 ^2

2 2

In such cases we say that z is a normal variable with mean 0 and variance 1. This figure shows the standardized normal curve. Total area under the curve is one. The areas between z = -1 and z = +1, z = -2 and z = 2 and z = -3 and z =+3 are equal to 68.27%, 95.45% and 99.73% of the total area respectively. The area under the curve bounded by the ordinates at z=0 and any value of z is the probability that z lies between 0 and z.

5.4.1 Inverse Transform Method

We present the procedure to a continuous problems, where a random variable x can take any value between a given interval (say a < x < b ). Let the p(x)dx is the probability that x falls in dx interval about x. The function p(x) is called the probability distribution and satisfies the following:

Figure 5.4 The Gaussian probability density distribution function.

-3 -2 -1 0 1 2 3

0.

0.

0.

0.

0.

0.

0.

Deviation

h(z)^ Mean Value

The value of z





p(x)dx  1. (5.15)

Then we define F(x) as a cumulative probability distribution as following:

 

x F(x) P(w) dw. (5.16)

The value of x is determined with given F(x) and a random number  such that

 = F(x) (5.17)

The procedure is illustrated in Fig. 5.5. Here, we are given with a random number and a curve F(x) as a function of x in the domain, then the value of x is decided by

equation for  and F(x). The method has been applied to several frequency

distributions in following examples.

Figure 5.5 The decision of x by a random numberin the range [0, 1] and accumulative probability distribution.

0.0 0.5 1.0 1.5 2.0 2.5 3.

0.

0.

0.

0.

0.

1.

x

F(x)

Example 5.

For a typical exponential distribution, what is procedure to find the random variate?

Solution: An exponential variate X has the following probability distribution function:

  otherwise ,

e , x , f(x)

x / 0 0

0

By inverse transform method, we have the F(X) as

UF(X)  1  ex/

so that X   ln( 1  U).

Since 1 – U is distributed in the same way as U , we have the X as following:

X  ln(U).

For sampling purposes we may assume,  = 1 : if V is sampled from the standard exponential distribution exp(1) then X =V is from exp(). We can write the following algorithm for the process:

Generate U from u(0, 1).Then compute : X   ln(U).Finally, output the value of X.

Although this technique seems very simple, the computation of the natural logarithm on a digital computer includes a power series expansion (or some equivalent approximation technique) for each uniform variate generated. Common applications of Monte Carlo methods in simulation of real systems will be done in detail using above procedures in next sections.