Calculating Leap Years and Actuarial Procedures in Algol, Lecture notes of Printing

Algol code examples for calculating leap years and actuarial procedures such as annuity calculations and difference triangles. It also discusses the importance of numerical analysis and computational methods in actuarial programming.

ACTUARIAL PROGRAMMING METHODS
by
E. S. ROBERTSON, F.F.A.
and
A. D. WILKIE, M.A., F.F.A., F.I.A.
[Submitted to the Faculty on 31st March 1969. A synopsis of the
paper will be found on page 180.]
1. INTRODUCTION
1.1. For some years now actuaries have been using computers for
data-processing work. This mainly involved problems of the manipulation
of files of policies, the selection of the right action to take
for a particular policy, and the printing of appropriate notices or
summaries. Only incidentally did any actuarial calculation enter
into the programs, and what did—valuation of policies or groups,
calculations for surplus, calculations of valuation factors or recosting
rates—was tackled either by conventional methods, or by the first
convenient new method that came to hand. Also, however, computers
began to be used for specifically actuarial problems: the
calculation of tables of commutation columns and policy values, tables
of premium rates, specific rates according to new methods. Again,
either a conventional method was used, or whichever new method
the actuary or programmer could conveniently devise. Each actuary
or programmer had to tackle each problem afresh, since as yet no
set of standard methods, especially suited for computer work, had
been published.
1.2. Early computers required their programs to be written in a
machine language, which was specific to the particular design of
computer used, was not readily intelligible, and required a large
number of individual simple instructions. Refinements of machine
languages produced assembler languages directly related to machine
language, though with mnemonic order codes and named locations;
the current examples of these are ICL 1900 PLAN, and IBM/360
Assembler language. If facilities are available for macro-instructions
(as these two languages have) they can be made considerably more
powerful. We shall mention later (9.3.6) how we see one possible line
TFA 31 (1968-1969) 121-202


of development through the use of macros. Nevertheless, these assembler languages remain specific to one design of computer, still require a large number of simple instructions, are largely incomprehensible to someone who is not a trained programmer, and are not readily understood even by him.

1.3. The next development was the construction of “higher-level” languages. These permitted the programmer to use algebraic expressions to indicate how the value of a variable was to be calculated, and also included certain elementary English language instructions with words like “if”, “go to”, etc. A program written in one of these higher-level languages requires to be interpreted (“compiled” or “translated”) by another program (a “compiler” or “language translator”) and this program constructs a program suitable to be run on the particular computer used. But compilers exist so that all languages can be run on a variety of different machines, with less or more modification. However, the essential part of a program in one of these languages remains constant, whatever the computer used. The higher-level language therefore is not tied to one design of computer, is reasonably intelligible to someone who has taken a small amount of trouble to learn the language, and is in a much more concise form than an assembler-type language.

1.4. The various higher-level languages that are widely used have specific advantages for certain types of problem. COBOL is suitable for the manipulation of files of data, for business processing, character-handling, and printing. It is poor at pure mathematical work; it is also excessively wordy in its style. FORTRAN is very widely used mathematically; it is easily learnt, and has the necessary features for manipulating mathematical data; but it has poor facilities for character-handling and many of its features are matters of arbitrary convention rather than logical necessity. Algol is also a mathematically-oriented language, and has very poor character-handling and file-processing facilities; but it is more general and more powerful than FORTRAN in the mathematical area, and is also a very elegant language for expressing the essentials of a calculation if the calculation is not actually to be run on a computer. Other higher-level languages include PL/I, a new language that contains many of the good points of COBOL and FORTRAN and some of Algol, but in so doing is very elaborate and possibly not so readily learnt as a less powerful language; CPL, an Algol-like mathematical language; and Autocode and its derivatives, Extended Mercury Autocode and Atlas

and quoting possible algorithms for their solution. We cannot claim that our solutions are in any way the best; nor do they attempt to be comprehensive. But we hope that they will pave the way towards the regular publication of actuarial algorithms in the Transactions, so that those actuaries who have solved explicit problems can make their solutions available to those that follow. In due course a sufficient corpus of work may have been accumulated for someone to select the recognized standard methods and produce a text-book of actuarial programming, or incorporate the relevant parts into a revision of the official text-books.

1.8. We have tried to keep the student in mind when giving examples of actuarial methods. We are not proposing any immediate revision of the recently revised examination syllabus. But we use occasional examples from past examination papers in order to show how certain questions could be rephrased to elicit a program solution from the student. We do not think that it is yet possible to integrate programming with the examination syllabus, but we hope these examples give some indications of how this might be done in the future.

1.9. It will become evident on studying our program examples that the conventional actuarial notation does not readily fit into computer notation. For instance a symbol such as $a_{\overline{n}|}^{(m)}$ has to be altered into something like a(i, m, n). This would be the case whatever programming language was used. Further, there is so much freedom in naming variables that some sort of new conventions could prove very useful. Proposals have already been made on a revision of the actuarial notation, and we shall also discuss this point.

1.10. To those who have read this far, and may be put off by the unfamiliar terms and symbols that follow, we can only say “It is not nearly as hard as it looks, no harder than elementary algebra; do not despair yet”.

2. ALGOL

2.1. Since Algol will be used in this paper to illustrate some of the methods of programming it is necessary to describe some of the features of Algol. No attempt is made to teach the reader how to write programs accurately in Algol. There are many other publications (1, 2)* that will do that. Only enough is explained to enable the actuarial reader to follow the program examples and to appreciate some of the features that are present in any higher-level language.

* The references are listed at the end of the Paper.

2.2. Numeric Variables and Constants

Variables in Algol are comparable with algebraic variables; they are given names that may consist of several letters or of letters and digits, but beginning always with a letter. Constants are ordinary numbers with decimal points, and may be expressed with a scaling factor of some power of ten. Thus 250000 may be written as 2·5₁₀5, meaning 2·5 × 10⁵. The subscript 10 is a special Algol symbol and in other languages this might be written 2·5E5, with no spaces between.

2.3. Real and integer forms

Numeric variables are of two types: real and integer. An integer is a positive or negative integer with a size limited by the particular computer. Some languages (but not Algol) handle “fixed-point” numbers, where the program assumes a fixed decimal point somewhere in the integer, in much the same way as a pointer is used in a hand calculating machine. Thus a variable may have to be described as having a fixed point with say five decimal places after it; the value ·12345 would be expressed as the integer +12345, and the value ·00004 would be expressed as the integer +4. A real number is a positive or negative rational number, and is usually expressed in “floating-point” form inside the computer. In this form the location used to store the number is partitioned into an argument and an exponent. The argument is normally arranged so that |argument| lies between ½ and 1 (or 1/10 and 1) and is a fraction given to a fixed number of binary (or decimal) places. The exponent may lie between, say, –128 and +127, and gives the power of 2 (or 10) by which the fraction must be multiplied to give the true value. Thus, if an internal decimal scale is used, 250000 is stored as +·25₁₀6, and –·000195 is stored as –·195₁₀–3. Most computers can perform arithmetic on “floating-point” numbers, either by built-in instructions, or by programmed software.

The advantage of floating-point working is that the programmer does not need to concern himself about the scaling of his results. The disadvantage is that spurious precision may seem to be achieved, for example if differences are taken. Thus, if a table of $S_x$ is stored to six significant figures, the second differences, $D_x$, will be correct only to about four significant figures although they will still appear to the computer to have six.
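The loss of precision under differencing can be illustrated with a short Python sketch (Python is used purely for illustration and is not the language of the paper). The table, the function tabulated, and the helper names round_sig and second_differences are all hypothetical; the sketch only assumes a smooth table stored to six significant figures.

```python
import math

def round_sig(value, figures):
    """Round value to the given number of significant figures."""
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))
    return round(value, figures - 1 - exponent)

def second_differences(table):
    """Return the second differences of a list of equally spaced values."""
    return [table[j + 2] - 2 * table[j + 1] + table[j]
            for j in range(len(table) - 2)]

# Hypothetical smooth table, f(x) = 100000 * 1.04 ** -x, for x = 0..9.
exact = [100000 * 1.04 ** -x for x in range(10)]
# The same table as it would be stored to six significant figures.
stored = [round_sig(v, 6) for v in exact]

exact_d2 = second_differences(exact)
stored_d2 = second_differences(stored)

# Relative error of each second difference computed from the stored table.
worst_relative_error = max(abs(s - e) / abs(e)
                           for s, e in zip(stored_d2, exact_d2))
print(worst_relative_error)
```

The second differences here are of the order of 100 while the stored values carry rounding errors of up to 0·05 each, so the relative error of a second difference is far larger than six-figure storage would suggest, although the machine still prints a full set of digits.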

and expect to get the answer 1·0 or thereabouts. A useful test of the accuracy of the compiler that interprets Algol is to see how correctly it calculates such known results.

2.6. Boolean variables

Another type of variable in Algol is the Boolean variable, which can have one of the two values true or false. These are used in conditional statements “if ... then”. Conditional expressions can be formed from Boolean variables or from logical relations, which are made up of variables and the operators < ≤ = ≥ > ≠ such as:

x < 4

b × b – 4 × a × c ≥ 0

There are also logical operators that can combine conditional relations or Boolean variables:

¬ not: if A is true ¬ A is false;

∧ and: A ∧ B is true only if both A and B are true;

∨ or: A ∨ B is true only if either A or B or both are true;

⊃ implies: A ⊃ B is false only if A is true and B is false;

≡ equivalent: A ≡ B is true only if A and B are either both true or both false.

So if we are finding the roots of a quadratic, given a, b and c we may use at some stage:

if b × b – 4 × a × c ≥ 0 then realroots := true;

and use the Boolean variable realroots at a later stage. Conditional statements are considered further in 2.10 below.
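For readers more used to a present-day language, the logical operators and the quadratic test can be sketched in Python. This is an illustration only: the function names implies, equivalent and has_real_roots are hypothetical, and the comparison ≥ 0 for the discriminant is the usual condition for real roots, assumed here.

```python
def implies(a, b):
    """A implies B: false only when a is true and b is false."""
    return (not a) or b

def equivalent(a, b):
    """A equivalent to B: true when both are true or both are false."""
    return a == b

def has_real_roots(a, b, c):
    """True when a*x*x + b*x + c = 0 has real roots (discriminant >= 0)."""
    return b * b - 4 * a * c >= 0

# The Boolean result can be stored and used at a later stage.
realroots = has_real_roots(1, -3, 2)
```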

2.7. Block statements

Variables must be declared before they are used, and the declaration must show whether they are real, integer or Boolean. A declaration applies only within one block, which is a set of statements enclosed by begin and end. A block may contain subordinate blocks, and declarations made in a superior block are valid in a subordinate one, though not vice versa. So if we write:

begin real a, b;
   begin real x, y;
   end;
end;

the variables a and b can be recognized throughout, but x and y are valid only from their declaration up to the first end.

2.8. Input and output

Input and output instructions are not specified in the Algol report and vary according to the computer installation. For the purpose of this paper we shall define that the statements

a := read; b := read; c := read;

cause three numbers to be read from some unspecified input device and assigned to a, b and c in that order. Also that the statement

print (a, b, c);

causes the values of a, b and c to be put out on some unspecified device to some specific number of significant figures, and a statement such as

writetext (‘answer = ’);

causes the actual characters enclosed in quotation marks to be printed.

2.9. Comments

A very useful facility for annotating programs is the ability to write comment followed by any commentary you like which terminates at the first semicolon. The comment is ignored by the compiler and has no effect on the program.

2.10. Conditional statements

These are introduced by if... then... and may be followed by else... Following the if is a Boolean variable or a conditional expression, and following then and else are statements, i.e. further instructions. Thus if we wish to calculate whether any year is a leap year (in the Gregorian calendar) we can write:

begin integer year; Boolean leapyear;
   year := read;
   if year – (year ÷ 4) × 4 ≠ 0 then leapyear := false
   else if year – (year ÷ 100) × 100 ≠ 0 then leapyear := true
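The Gregorian rule behind this chain of tests can be sketched in Python (an illustration only; the function name is_leap_year is hypothetical). The Algol expression year – (year ÷ 4) × 4 is the remainder on dividing by 4, written below with the % operator. The final test on 400 is the standard Gregorian rule and is supplied here as an assumption, not copied from the listing.

```python
def is_leap_year(year):
    """Return True if year is a leap year in the Gregorian calendar."""
    if year % 4 != 0:        # not divisible by 4: never a leap year
        return False
    if year % 100 != 0:      # divisible by 4 but not by 100: leap year
        return True
    return year % 400 == 0   # century years are leap only if divisible by 400
```

The nesting mirrors the if ... then ... else if ... form: each later test is reached only when the earlier remainder was zero.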

and print $(1 + i)^n$ for a series of values of n starting at 1 and increasing by 1 until n = 100. We could then write:

begin real r, i; integer n;
   i := read;
   for n := 1 step 1 until 100 do
   begin r := (1 + i) ↑ n;
      print (n, r)
   end
end;

We might want to repeat this for a series of values of i, starting at ·001 and increasing by ·001 to ·1 (i.e. from ·1 per cent to 10 per cent). It would also be neater to avoid the exponentiation which is done every time in the program above. So we could now have:

begin real r, i; integer n;
   for i := ·001 step ·001 until ·10001 do
   begin writetext (‘ i = ’); print (i);
      r := 1;
      for n := 1 step 1 until 100 do
      begin r := r × (1 + i);
         print (n, r)
      end of n loop;
   end of i loop;
end;

Note how r is set = 1 at the beginning of each loop and is used to calculate the next value of r. See also how two for-loops are nested one inside the other, so that for each rate of interest 100 values of $(1 + i)^n$ are produced. Finally, note that the upper limit for i is taken as ·10001, since a real variable may not be accurately expressed in decimals, and the margin makes sure that the program stops after ·100 and not just before.

A second form of for... do... statement includes a while condition e.g.

for r := r + 1 while a > tol do...

This will step up the variable r by 1 each time, and come out of the loop when a ≤ tol.
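The nested-loop table of accumulation factors can be sketched in Python as follows (an illustration only; the names accumulation_table and interest_rates are hypothetical). Instead of stepping a real variable and adding a small margin to the upper limit, this sketch drives the outer loop from an integer counter, which is one possible way of making sure the last rate is not missed.

```python
def accumulation_table(i, terms):
    """Return [(1 + i) ** n for n = 1..terms] by repeated multiplication."""
    r = 1.0                      # reset to 1 for each rate of interest
    table = []
    for n in range(1, terms + 1):
        r = r * (1 + i)          # next power from the previous one
        table.append(r)
    return table

def interest_rates(first_thousandths, last_thousandths):
    """Rates k/1000 generated from an integer counter, both ends included."""
    return [k / 1000 for k in range(first_thousandths, last_thousandths + 1)]

rates = interest_rates(1, 100)   # .001, .002, ..., .100
tables = {i: accumulation_table(i, 100) for i in rates}
```

Each inner table costs 100 multiplications and no exponentiation, as in the second Algol program.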

2.13. Arrays

In actuarial work, as in much mathematical work, it is common to use columns of values, rectangular tables, or arrays in three or more dimensions. The variables in the array could be referred to as $a_{ij}$ or as $A_{x:\overline{n}|}$ where i, j, x and n are subscripts. In Algol an array can also be defined, and the items in it can be identified by subscripts thus:

a[i, j] or assurance [x, n]

Before an array is used its type must be declared and the ranges of the subscripts must be given. Thus:

real array A[1 : 100], assurance [20 : 80, 5 : 70];

defines a column with 100 elements numbered from 1 to 100, and a rectangular table where the first subscript runs from 20 to 80 and the second from 5 to 70, a total of 4026 elements. In practice actuarial arrays are often

in array m, and if we follow this by: sumup (m, r, 1, 100) we get the result of the same operation on m left in array r.

2.15. Conclusion

It can be seen from this summary of the main features of Algol that it is an elegant, easily written, and easily understood programming language for mathematical work. On the other hand it is not designed for use in commercial data-processing. It is not possible to have a variable consisting of letters, so names, addresses and alphabetic codes cannot be handled. We cannot put: if month:=january or if name>smith, with any hope of being understood. COBOL (or PL/I) is required for this sort of work. But our purpose in this paper is to illustrate some of the actuarial and mathematical uses of programming, and for this Algol is very well suited.

3. NUMERICAL ANALYSIS

3.1. Since computers, by their very name, deal mainly with the calculation of numerical answers it is natural that attention should be paid to numerical analysis and computational methods. A great deal of computer literature deals with the algorithms, or procedures, that enable one to perform some calculation or operation accurately and conveniently. Algorithms do not all deal with purely numerical work. They may be concerned with sorting information into alphabetical order, analysing sentences grammatically, or using a series of conditional statements to attach the right name and address to a premium notice; but the main interest of this paper is in numerical algorithms.

3.2. Functions

In section 2.5 we referred to the built-in functions of Algol, such as sin (x), exp (x), sqrt (x), etc. The user of Algol may not have to worry about how the computer performs the calculations for these functions, but he may have to concern himself with the calculation of other functions not built-in; or he may be using a language (such as COBOL) with no useful built-in functions; or he may be the writer of an Algol compiler in the first place. He therefore may have occasion to write a function subroutine himself. Fortunately many suitable methods have already been published (3, 4), but even if he can borrow a convenient method, he has to consider the accuracy, range of argument, and special conditions he may meet, such as what he does if a negative x is used as the argument of a square root subroutine. If we ignore these problems for the moment, we can readily see that the usual expansions in powers of x give us one way to calculate, for instance exp (x):

real procedure exp (x); real x;
begin real u; integer r;
   u := 1; r := 1; exp := 0;
loop: exp := exp + u;
   u := (u × x)/r;
   r := r + 1;
   if u > exp × 0·0000001 then go to loop;
end;

This calculates exp (x) correct to six significant figures, which may be satisfactory only for selected purposes. The if statement could be replaced by: if u > ·0000001 then go to loop; thus giving exp (x) correct to six decimal places, which may suit better a certain range of values of x. For many functions a polynomial function y = P(x) or a rational function y = P(x)/Q(x) gives an answer to a satisfactory degree of accuracy over a suitable range of values of x. Thus an alternative method for calculating exp (x) is to use the formula:

[formula not reproduced]

which for 0 ≤ x ≤ 0·1 gives an accuracy of more than 10 significant figures (see reference 5, formula EXPEC 1800). Other methods require the use of continued fractions, e.g. [formula not reproduced], or infinite products, e.g. [formula not reproduced], or other mathematical devices that it is beyond the ability of the authors to explain. A very full description of many such methods, with further references, is found in Computer Approximations (5).
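The power-series method for exp (x) described earlier in this section can be sketched in Python (an illustration only; the name exp_series and the parameter relative_tolerance are hypothetical). The sketch follows the same steps as the Algol procedure: add the current term, form the next term from it, and stop when the next term is small relative to the running sum. Like that procedure it tests the signed term, so it is assumed to be used only for x ≥ 0.

```python
import math

def exp_series(x, relative_tolerance=1e-7):
    """Sum 1 + x + x**2/2! + ... until the next term is negligible.

    Assumes x >= 0, because the stopping test compares the signed term."""
    total = 0.0
    term = 1.0      # current term x**(r-1) / (r-1)!
    r = 1
    while True:
        total += term
        term = term * x / r     # next term from the previous one
        r += 1
        if not term > total * relative_tolerance:
            return total

approx_e = exp_series(1.0)
```

Replacing the relative test by an absolute one (term > 1e-7) would give a fixed number of decimal places instead of a fixed number of significant figures, as the text notes.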

procedure first difference (n, f, d); integer n; real array d, f;
begin integer i;
   for i := 1 step 1 until (n – 1) do
      d[i] := f[i + 1] – f[i]
end;

3.4.3. Of more use, perhaps, is to calculate all available differences up to $\Delta^{n-1} f[1]$, and to place them in a square array d[i, j] of size (n – 1) by (n – 1), so that $d[i, j] = \Delta^{i} f[j]$:

procedure difference square (n, f, d); integer n; real array d, f;
begin integer i, j;
   for j := 1 step 1 until (n – 1) do
      d[1, j] := f[j + 1] – f[j];
   for i := 2 step 1 until (n – 1) do
   begin
      for j := 1 step 1 until (n – i) do
         d[i, j] := d[i – 1, j + 1] – d[i – 1, j]
   end
end;

3.4.4. However it is clear that the square array of d[i, j] has nearly half its values unfilled, so in order to conserve space in the computer it may be convenient to rearrange the differences in a single array d[k] of length ½n(n – 1), representing in effect the triangular pattern of differences. This array contains in sequence:

$\Delta f[1], \Delta f[2], \ldots, \Delta f[n-1], \Delta^{2} f[1], \ldots, \Delta^{2} f[n-2], \ldots, \Delta^{n-1} f[1]$

and any value $\Delta^{i} f[j]$ can be found by d[k] where k = ½(i – 1) × (2n – i) + j. The table of d[k] is given by the procedure:

procedure difference triangle (n, f, d); integer n; real array d, f;
begin integer i, j, k, l;
   for k := 1 step 1 until (n – 1) do
      d[k] := f[k + 1] – f[k];
   k := n; l := 0;
   for i := 2 step 1 until (n – 1) do
   begin l := l + 1;
      for j := 1 step 1 until (n – i) do
      begin d[k] := d[l + 1] – d[l];
         l := l + 1; k := k + 1
      end
   end
end;

It may be worth pointing out that the difficulty in constructing or understanding this algorithm is in getting the subscripts k and l stepping up correctly through the table; the arithmetic is naturally trivial, but the point of constructing the algorithm for general use is to save oneself or others the trouble of working out the correct pattern of subscripts.
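The packing of the triangle into a single array, and the subscript formula k = ½(i – 1)(2n – i) + j, can be checked with a short Python sketch (an illustration only; the names difference_triangle and packed_index are hypothetical). Python lists are numbered from 0, so position k of the text is element d[k - 1] here.

```python
def difference_triangle(f):
    """Return all forward differences of f packed row by row into one list."""
    n = len(f)
    d = [f[k + 1] - f[k] for k in range(n - 1)]   # first differences
    row_start = 0                  # 0-based start of the previous row in d
    for i in range(2, n):          # order of difference: 2, 3, ..., n-1
        row_length = n - i         # number of differences of order i
        for j in range(row_length):
            d.append(d[row_start + j + 1] - d[row_start + j])
        row_start += row_length + 1    # previous row held row_length + 1 items
    return d

def packed_index(n, i, j):
    """1-based position k of the i-th difference at f[j] in the packed list."""
    return (i - 1) * (2 * n - i) // 2 + j

f = [x ** 3 for x in range(5)]     # hypothetical table: 0, 1, 8, 27, 64
d = difference_triangle(f)
third_difference_at_2 = d[packed_index(5, 3, 2) - 1]
```

For a table of n values the packed list has ½n(n – 1) entries, against (n – 1) × (n – 1) for the square array.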

3.4.5. Let us now assume that we have calculated array d by means of the last procedure. We now wish to enter the table with argument x and interpolate to give y = f(x). Note that we assume that f[1] = f(0). If we use Newton’s advancing difference formula we get:

procedure Newton interpolation (n, f, d, x, y); integer n; real x, y; real array d, f;
begin integer i, k; real a;
   a := x; y := f[1]; k := 1;
   for i := 1 step 1 until (n – 1) do
   begin y := y + a × d[k];
      k := k + n – i;
      a := a × (x – i)/(i + 1)
   end
end;

When we have unravelled this we see that we are starting with y = f(0), and adding to it a further (n – 1) terms as we go round the do-loop, the terms being successively $x\,\Delta f(0)$, $\frac{x(x-1)}{2!}\,\Delta^{2} f(0)$, etc. up to $\frac{x(x-1)\cdots(x-n+2)}{(n-1)!}\,\Delta^{n-1} f(0)$. The multiplier a is calculated by a recurrence relation and the subscript k steps up through the triangular-straight-line array by successively n – 1, n – 2, n – 3 etc.
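A Python sketch of the same interpolation (an illustration only; the names packed_differences and newton_interpolate are hypothetical) shows the multiplier recurrence and the stepping of the subscript through the packed array. Lists are numbered from 0, so f[0] plays the part of f(0) and k starts at 0 instead of 1.

```python
def packed_differences(f):
    """All forward differences of f, order by order, in one flat list."""
    packed, row = [], list(f)
    while len(row) > 1:
        row = [b - a for a, b in zip(row, row[1:])]
        packed.extend(row)
    return packed

def newton_interpolate(f, d, x):
    """Newton's advancing difference formula using the packed differences d."""
    n = len(f)
    y = f[0]
    a = x          # multiplier: x, then x(x-1)/2!, x(x-1)(x-2)/3!, ...
    k = 0          # 0-based position of the leading difference of order i
    for i in range(1, n):
        y += a * d[k]
        k += n - i                   # skip the rest of the current row
        a = a * (x - i) / (i + 1)    # recurrence for the next multiplier
    return y

f = [x ** 3 for x in range(5)]       # hypothetical table of cubes
d = packed_differences(f)
y = newton_interpolate(f, d, 2.5)
```

Because the hypothetical table is a cubic and all differences up to the fourth are used, the interpolated value agrees with 2·5³ apart from rounding.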

3.4.6. This procedure is suitable for finding an individual value when the difference table is already constructed; but if the array d is not yet

definite integral of a function over a given range by suitable approximate methods. Indeed this may be the only practical way in which to obtain the value of the integral for various mathematical functions which cannot be integrated by calculus. Different methods are used for each of these cases. The problem of finding the differentials at one point on f(x) or at a series of points is not essentially different from interpolation, with a different function of the values of f(x) or the differences being used.

3.5.2. A function may be tabulated at equal or at unequal intervals. For integration at equal intervals either Simpson’s rule repeated or the three-eighths rule repeated or a combination of both will give reasonable results. A general procedure could provide an array g which gave values of [formula not reproduced]. A simpler method, which can also be used for unequal intervals, is the trapezoidal rule. Yet other methods are available, both for equal and unequal intervals, if the first or second derivatives of the function are also available. A series of algorithms for these are available in the IBM System/360 Scientific Subroutine Package (6), which gives references to the further mathematical background. For normal actuarial purposes Simpson’s rule repeated is probably quite satisfactory.
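The trapezoidal rule for a function tabulated at unequal intervals can be sketched in Python (an illustration only; the function name trapezoidal and the sample table are hypothetical, and this is not the routine from the subroutine package cited above).

```python
def trapezoidal(xs, ys):
    """Integrate tabulated values ys at arguments xs by the trapezoidal rule.

    The arguments xs must be increasing but need not be equally spaced."""
    total = 0.0
    for j in range(len(xs) - 1):
        width = xs[j + 1] - xs[j]
        total += width * (ys[j] + ys[j + 1]) / 2   # area of one trapezium
    return total

# Hypothetical table of f(x) = 2x + 1 at unequal intervals.
xs = [0.0, 1.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]
area = trapezoidal(xs, ys)
```

The rule is exact for a straight line, which makes a linear table a convenient check; for curved functions the error depends on the spacing and the second derivative.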

3.5.3. Of greater mathematical importance is possibly the case where the function can be calculated for any value of the argument, but no analytical method is available to find the integral. Amongst other functions the familiar normal distribution probability integral is one:

$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^{2}/2}\,dt$$

For convenience of calculation it is preferable to find the integral from 0 to x for positive values of x, and a straightforward method is to divide the interval from 0 to x into two parts, giving t = 0, x/2, x, with a first approximation of:

$$\frac{x}{6}\left[f(0) + 4f(x/2) + f(x)\right]$$

A second approximation is given by dividing the interval into four and using Simpson’s rule repeated twice, i.e.

$$\frac{x}{12}\left[f(0) + 4f(x/4) + 2f(x/2) + 4f(3x/4) + f(x)\right]$$

where f(t) denotes the integrand. The process can be repeated in the usual way until two successive values are sufficiently close. The procedure could in general be specified as procedure Simpson (a, b, f, y, tol) where a and b are the lower and upper limits of integration, f is the name of a function subroutine, and y is the result i.e.

$$y = \int_{a}^{b} f(t)\,dt$$

to a specified tolerance tol. Since there are two auxiliary functions that help in the calculations it may be helpful to show a flowchart for the procedure (Figure 1) and the procedure itself.

procedure Simpson (a, b, f, y, tol); real a, b, y, tol; real procedure f;
begin integer n, i; real g, h, k, oldy; Boolean firsttime;
   n := 1; firsttime := true; h := f(a) + f(b);
loop: n := 2 × n; k := (b – a)/n; g := 0; oldy := y;
   for i := 1 step 2 until (n – 1) do
      g := g + f(a + i × k);
   h := h + 2 × g;
   y := (h + 2 × g) × k/3;
   if firsttime then
   begin firsttime := false; go to loop end
   else if abs (oldy – y) > tol then go to loop
end;
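The repeated-halving form of Simpson’s rule can be sketched in Python (an illustration only; the function name simpson and the variable old_y are hypothetical). As in the Algol procedure, h accumulates the two end ordinates plus twice every interior ordinate found so far, so that each halving needs only the new odd-numbered ordinates.

```python
import math

def simpson(a, b, f, tol):
    """Integrate f from a to b, halving the interval until two successive
    Simpson estimates differ by no more than tol."""
    n = 1
    h = f(a) + f(b)     # end ordinates + 2 * (interior ordinates so far)
    old_y = None        # no previous estimate on the first pass
    while True:
        n *= 2
        k = (b - a) / n
        # Sum of the new (odd-numbered) ordinates at this level.
        g = sum(f(a + i * k) for i in range(1, n, 2))
        y = (h + 4 * g) * k / 3
        h += 2 * g      # the new ordinates become interior "even" ones
        if old_y is not None and abs(old_y - y) <= tol:
            return y
        old_y = y

# Area under the normal curve from 0 to 1, with the integrand exp(-t*t/2).
area_0_to_1 = simpson(0.0, 1.0, lambda t: math.exp(-t * t / 2), 1e-6) / math.sqrt(2 * math.pi)
```

The check on old_y plays the part of the Boolean firsttime: no comparison is made until a previous estimate exists.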

3.5.4. We can perhaps use this example to show how to calculate the probability integral for values of x from 0 to 5·0 at intervals of 0·1, giving the ordinates P(x) and the integrals F(x), both to five decimal places.

begin real x, pi, sqtpi, p, bigf, y, tol;
   real procedure f(t); real t; f := exp (– t × t/2);
   tol := 0·000001; pi := 3·141592; sqtpi := 1/(sqrt (2 × pi));
   writetext (‘x P(x) F(x)’);