Data Processing - Research and Teaching Methods - Lab, Study notes of Research Methodology

The goal of this course is to pass on to new graduate students the fundamentals of graduate research, with an emphasis on biological systems engineering and on college instruction. Keywords in this lab manual are: Data Processing, Subprograms and Character Type, Declaration Statements, Internal Subprograms, Program Units, Scientific Programming Applications, Cartesian Coordinate System, Format of Internal Subprograms, Local and Global Variables, Functions and Subroutines

Typology: Study notes

2012/2013

Uploaded on 10/03/2013

abani
Data Processing with Fortran
Subprograms and Character Type
(Compiled from Hahn [1994], Nyhoff and Leestma [1996])
1. Subprograms
1.1. Main Program
Every complete program must have one and only one main program, which has the form
PROGRAM name
[declaration statements]
[executable statements]
[CONTAINS
internal subprograms]
END
If the last statement ahead of the CONTAINS statement does not result in a branch (and it should not),
control passes over the internal subprograms to the END statement, and the program stops. In other words,
the internal subprograms are only executed if properly invoked by a CALL statement in the case of a
subroutine, or by referencing its name, in the case of a function.
Branching alters the normal top-to-bottom sequence of program execution. A branch transfers
control from one statement to a labeled target statement in the same program unit. Statement labels can only
refer to branch target statements, FORMAT statements, and DO terminations.
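
The layout of a main program can be sketched as follows. This is only an illustration: the names Skeleton, Side, Report and Area are hypothetical and do not come from the notes. It shows execution running off the last executable statement, passing over the internal subprograms, and stopping at END.

```fortran
PROGRAM Skeleton
  IMPLICIT NONE
  REAL :: Side                        ! declaration statements

  Side = 3.0                          ! executable statements
  CALL Report(Side)                   ! a subroutine is invoked with CALL
  PRINT *, 'Area = ', Area(Side)      ! a function is invoked by referencing its name
  ! No branch occurs here, so control passes over the
  ! internal subprograms to the END statement.

CONTAINS

  SUBROUTINE Report(S)
    REAL, INTENT(IN) :: S
    PRINT *, 'Side = ', S
  END SUBROUTINE Report

  FUNCTION Area(S)
    REAL :: Area
    REAL, INTENT(IN) :: S
    Area = S * S
  END FUNCTION Area

END PROGRAM Skeleton
```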
1.2. Program Units
In Fortran, a non-trivial problem could be broken down into separate subprograms (procedures), each
carrying out a particular, well-defined task. It often happens that such subprograms can be used by many
different “main” programs, and in fact by different users of the same computer system. Fortran 90 enables
you to implement these subprograms as functions and subroutines which are independent of the main
program. Examples are procedures to perform statistical operations, or to sort items, or to find the best
straight line through a set of points, or to solve a system of differential equations.
Subprograms may be internal or external. Useful procedures may be collected together as libraries. Such
collections are called modules. Main programs, external subprograms, and modules are referred to as
program units. An internal subprogram is contained within another program unit, and therefore compiled with
it, whereas an external subprogram is not and is compiled separately. An important implication is that an
internal subprogram may use names of entities declared by the program unit that contains it, whereas an
external subprogram would in general not.
1.3. Internal Subprograms
There are two types of subprograms: functions and subroutines. We will first look at functions. We have
already seen how to use some of the intrinsic functions supplied by Fortran 90, such as SQRT. Functions are
particularly useful when arithmetic expressions, which can become long and cumbersome, need to be
evaluated repeatedly. You can write your own functions, to be used in the same way in a program.
Subroutines are similar to functions in many aspects, except that they generally offer greater flexibility and
are therefore used more frequently in scientific programming applications.

1.3.1. An Example

To illustrate the use of internal subprograms, consider the rotation of a Cartesian coordinate system. If such a system is rotated counterclockwise through an angle of a radians, the new coordinates (xN, yN) of a point referred to the rotated axes are given by

xN = x cos a + y sin a
yN = -x sin a + y cos a

where (x, y) are its coordinates before rotation of the axes. The following functions could be used to define the new co-ordinates:

FUNCTION Xnew (X, Y, A)
  REAL Xnew, X, Y, A
  Xnew = X * COS(A) + Y * SIN(A)
END FUNCTION Xnew

FUNCTION Ynew (X, Y, A)
  REAL Ynew, X, Y, A
  Ynew = -X * SIN(A) + Y * COS(A)
END FUNCTION Ynew
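
To see the two functions in use, a minimal sketch of a complete program is given here. The program name Rotate, the test point (1, 0) and the quarter-turn angle are assumptions chosen for illustration. With the axes rotated counterclockwise by a quarter turn, the point (1, 0) should come out as approximately (0, -1), up to rounding.

```fortran
PROGRAM Rotate
  IMPLICIT NONE
  REAL :: X, Y, A

  X = 1.0
  Y = 0.0
  A = 2.0 * ATAN(1.0)                 ! pi/2 radians, a quarter turn
  PRINT *, 'New coordinates: ', Xnew(X, Y, A), Ynew(X, Y, A)

CONTAINS

  FUNCTION Xnew(X, Y, A)
    REAL :: Xnew
    REAL, INTENT(IN) :: X, Y, A       ! dummy arguments, local to the function
    Xnew = X * COS(A) + Y * SIN(A)
  END FUNCTION Xnew

  FUNCTION Ynew(X, Y, A)
    REAL :: Ynew
    REAL, INTENT(IN) :: X, Y, A
    Ynew = -X * SIN(A) + Y * COS(A)
  END FUNCTION Ynew

END PROGRAM Rotate
```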

1.3.2. The Format of Internal Subprograms

Since functions and subroutines are very similar, common features are described by referring to them collectively as subprograms. Most of the following rules apply also to external subprograms, except where otherwise stated.

All internal subprograms are placed between a CONTAINS statement and the END statement of the main program. Subprograms look almost like a main program, except for their headers and END statements. Internal subprograms may not contain other subprograms, and so may not themselves have a CONTAINS statement.

The general syntax of an internal function is

FUNCTION Name ([argument list])
[declaration statements]
[executable statements]
END FUNCTION [Name]

The statement FUNCTION Name (argument list) is called the function statement, or header, or declaration. Note that if the main program has an IMPLICIT NONE statement (and sound programming style insists that it should have one) the function name and arguments must be declared with a type. Although this may be done in the main program, you may also declare the function name and arguments in the function body itself, as recommended by some experts.

INTEGER I

DO I = 1, 10
  PRINT*, I, Fact (I)
END DO

CONTAINS
  FUNCTION Fact (N)
    INTEGER Fact, N, Temp
    Temp = 1
    DO I = 2, N
      Temp = I * Temp
    END DO
    Fact = Temp
  END FUNCTION

END

The problem is that I is a global variable, i.e., the name I represents the same variable inside and outside the function. Fact is first called when I = 1, which is the first value written. This value is passed to the function’s dummy argument N. The same I is now given the initial value 2 by the DO loop inside Fact, but since it is greater than N, the DO loop is not executed, so I still has the value 2 when Fact returns to be printed in the main program. However, I is now incremented to 3 in the DO loop in the main program, which is the value it has when the second call to Fact takes place. In this way, Fact is never computed for an even value of I. All this is a consequence of the variable I being global.

The problem is solved by re-declaring I in the function to make it local. You should make it a rule to declare all variables used in subprograms. That way you can never inadvertently make use of a global variable in the wrong context. If you need to get information into a subprogram from its host, the safest way to do it is through the dummy arguments. When there is a large amount of such information to be shared by many subprograms, the best solution is to declare global variables in a module, and for subprograms needing access to those variables to use the module.

The use of IMPLICIT NONE in a main program is particularly important when there are internal subprograms. It forces you to declare all local variables and dummy arguments, which makes for good programming style.
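
A sketch of the repaired factorial program follows. The only substantive change is that I is declared inside Fact, so the loop counter there is local and no longer shares storage with the I of the main program. The program name FactDemo and the INTENT attribute are additions made for illustration.

```fortran
PROGRAM FactDemo
  IMPLICIT NONE
  INTEGER :: I

  DO I = 1, 10
    PRINT *, I, Fact(I)
  END DO

CONTAINS

  FUNCTION Fact(N)
    INTEGER :: Fact
    INTEGER, INTENT(IN) :: N
    INTEGER :: I, Temp                ! this I is local and hides the host's I
    Temp = 1
    DO I = 2, N
      Temp = I * Temp
    END DO
    Fact = Temp
  END FUNCTION Fact

END PROGRAM FactDemo
```

With the local declaration in place, the main loop visits every value from 1 to 10, including the even ones.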

1.3.5. The Return Statement

Normal exit from a subprogram occurs at its END statement. However, it is sometimes convenient to exit from other points. This may be done with the statement RETURN. Excessive use of RETURN should be avoided since it very easily leads to spaghetti (unstructured code).
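
One restrained use of RETURN is an early exit when an argument cannot be processed. The sketch below assumes a hypothetical subroutine PrintRoot, which is not part of the notes; it leaves as soon as it sees a negative argument.

```fortran
PROGRAM ReturnDemo
  IMPLICIT NONE

  CALL PrintRoot(16.0)
  CALL PrintRoot(-4.0)

CONTAINS

  SUBROUTINE PrintRoot(X)
    REAL, INTENT(IN) :: X
    IF (X < 0.0) THEN
      PRINT *, 'No real square root for ', X
      RETURN                          ! early exit; the statements below are skipped
    END IF
    PRINT *, 'Square root is ', SQRT(X)
  END SUBROUTINE PrintRoot

END PROGRAM ReturnDemo
```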

1.3.6. Functions vs. Subroutines

Subroutines are very similar to functions. The differences are:

  • No value is associated with the name of a subroutine, hence it must not be declared.
  • A subroutine is invoked with a CALL statement.
  • The keyword SUBROUTINE is used in the definition and the END statement.
  • A subroutine need not have any arguments, in which case the name is written without parentheses, e.g.

CALL PLONK

A function without arguments must have empty parentheses, e.g., Fung ().

The general syntax of an internal subroutine is

SUBROUTINE Name [(argument list)]
[declaration statements]
[executable statements]
END SUBROUTINE [Name]

The following program prints a line with the subroutine PrettyLine, which has two dummy arguments. Num is the number of characters to be printed; Symbol is the ASCII code of the characters to be printed.

IMPLICIT NONE

CALL PrettyLine (5, 2)

CONTAINS
  SUBROUTINE PrettyLine (Num, Symbol)
    INTEGER I, Num, Symbol
    CHARACTER*80 Line
    DO I = 1, Num
      Line(I:I) = ACHAR(Symbol)
    END DO
    PRINT*, Line(1:Num)    ! print only the Num characters that were set
  END SUBROUTINE
END

Character substrings, such as Line (I:I), are discussed briefly in section 2.

The following example shows that dummy arguments may be used to take information back to the calling program—in this case their values are exchanged:

IMPLICIT NONE
REAL A, B

READ*, A, B
CALL SWAP (A, B)
PRINT*, A, B

CONTAINS
  SUBROUTINE SWAP (X, Y)
    REAL Temp, X, Y
    Temp = X
    X = Y
    Y = Temp
  END SUBROUTINE
END

of a special program called a linker, finally resulting in a .EXE version of the main program. Your compiler manual will have the details of how to do this. Once it is finally debugged, an external subprogram need never be recompiled, only linked. This prevents you from wasting time in compiling it over and over, which would be the case if it was an internal subprogram.

1.4.2. An Example

As an example, let’s rewrite the internal subroutine SWAP of section 1.3.6. as an external subroutine. The main program (in one file) becomes

IMPLICIT NONE
EXTERNAL SWAP
REAL A, B

READ*, A, B
CALL SWAP (A, B)
PRINT*, A, B

END

and the external subroutine (in a separate file) is then

SUBROUTINE SWAP (X, Y)
  REAL Temp, X, Y
  Temp = X
  X = Y
  Y = Temp
END SUBROUTINE

Try compiling, linking and running this example. If you want more than one external subprogram in the same file, you should use a module.
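
As a rough sketch of the module approach, the listing below collects two related subroutines in one module and makes them available to a main program with USE. The names SwapUtilities, SwapReal, SwapInteger and UseSwap are hypothetical. The module must be compiled before, or appear earlier in the file than, the program that uses it.

```fortran
MODULE SwapUtilities
  IMPLICIT NONE
CONTAINS

  SUBROUTINE SwapReal(X, Y)
    REAL, INTENT(INOUT) :: X, Y
    REAL :: Temp
    Temp = X
    X = Y
    Y = Temp
  END SUBROUTINE SwapReal

  SUBROUTINE SwapInteger(M, N)
    INTEGER, INTENT(INOUT) :: M, N
    INTEGER :: Temp
    Temp = M
    M = N
    N = Temp
  END SUBROUTINE SwapInteger

END MODULE SwapUtilities

PROGRAM UseSwap
  USE SwapUtilities                   ! gives access to both subroutines
  IMPLICIT NONE
  REAL :: A, B
  INTEGER :: I, J

  A = 1.0
  B = 2.0
  I = 3
  J = 4
  CALL SwapReal(A, B)
  CALL SwapInteger(I, J)
  PRINT *, A, B                       ! A and B have been exchanged
  PRINT *, I, J                       ! I and J have been exchanged
END PROGRAM UseSwap
```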

1.4.3. The EXTERNAL Statement

If you accidentally used the name of an intrinsic subprogram for an external subprogram, the compiler would by default assume you were referring to the intrinsic subprogram, so your external subprogram would be inaccessible. You might think you know the names of all intrinsic subprograms, and that this problem will not present itself. However, you could have a problem when transporting your code to another installation, because the standard allows compilers to provide additional intrinsic subprograms.

To avoid this problem, the names of all external subprograms should be specified in an EXTERNAL statement, which should come after any USE statement (which allows access to entities in a module) or IMPLICIT statement. Naming an external subprogram like this ensures that it is linked as an external subprogram, and makes any intrinsic subprogram with the same name unavailable. This practice is strongly recommended.
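
The placement of the EXTERNAL statement can be sketched as follows, assuming a hypothetical external function Cube. The two program units are shown in one listing for brevity; in the workflow described above they would normally sit in separate files.

```fortran
PROGRAM UseExternal
  IMPLICIT NONE
  REAL :: Cube                        ! result type of the external function
  EXTERNAL Cube                       ! Cube is an external, not an intrinsic, subprogram

  PRINT *, Cube(2.0)                  ! 2.0 cubed
END PROGRAM UseExternal

FUNCTION Cube(X)
  IMPLICIT NONE
  REAL :: Cube
  REAL, INTENT(IN) :: X
  Cube = X**3
END FUNCTION Cube
```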

2. Characters

2.1. Character Constants

So far we have dealt mainly with two of Fortran 90’s intrinsic types: integer and real. We now come to the intrinsic type character. The basic character literal constant is a string of characters enclosed in a pair of either apostrophes (') or quotes ("). Most characters supported by your computer are permitted, with the exception of the “control characters” (e.g., escape). The apostrophes and quotes serve as delimiters and are generally not part of the constant.

A blank in a character constant is significant, so that

“B Shakespeare”

is not the same as

“BShakespeare”

Fortran 90 is “case sensitive” only in the case of character constants, so

“Charlie Brown”

is not the same as

“CHARLIE BROWN”

There are two ways of representing the delimiter characters themselves in a character constant. Either sort of delimiter may be embedded in a string delimited by the other sort, as in

‘Jesus said, “Follow me”’

A character string may be empty, i.e., '' or "". The number of characters in a string is called its length. An empty string has a length of zero.
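
The rules for character constants can be tried out with a short program such as the sketch below (the program name QuoteDemo is hypothetical). It also shows the second way of representing a delimiter inside a constant, which is to write the delimiter twice.

```fortran
PROGRAM QuoteDemo
  IMPLICIT NONE

  PRINT *, 'He said, "Follow me"'     ! quotes embedded between apostrophes
  PRINT *, "It's late"                ! apostrophe embedded between quotes
  PRINT *, 'It''s late'               ! delimiter doubled inside the constant

  ! Lengths are 0, 13 and 12: the blank counts as a character.
  PRINT *, LEN(''), LEN('B Shakespeare'), LEN('BShakespeare')

  ! Case matters inside constants, so this comparison is false.
  PRINT *, 'Charlie Brown' == 'CHARLIE BROWN'
END PROGRAM QuoteDemo
```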

2.2. Character Variables

The statement CHARACTER LETTER declares LETTER to be a character variable of length 1, i.e., it can hold a single character. Longer character variables may be declared as in the following example:

CHARACTER (Len = 15) Name

This means that the character variable Name can hold a string of up to 15 characters. An alternative form of the declaration is

CHARACTER Name*15

Character constants may be assigned to variables in the obvious way:

Name = 'J. Soap'

On an input record, the quote or apostrophe delimiters are not needed for a character constant if the constant does not contain a blank, comma or slash. Since the names in the example above contain commas and blanks, delimiters are needed in the input file.
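
What assignment does when the lengths differ can be checked with a small sketch. The program name NameDemo is hypothetical. The behaviour shown is standard Fortran 90: a shorter value is padded with blanks on the right, and a longer value is truncated on the right.

```fortran
PROGRAM NameDemo
  IMPLICIT NONE
  CHARACTER (LEN = 15) :: Name
  CHARACTER :: Letter                 ! length 1

  Name = 'J. Soap'                    ! 7 characters, padded with 8 blanks
  Letter = 'Soap'                     ! truncated: Letter holds 'S'

  PRINT *, '[', Name, ']'
  PRINT *, LEN(Name), LEN_TRIM(Name)  ! declared length 15, 7 without trailing blanks
  PRINT *, Letter
END PROGRAM NameDemo
```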

followed by T or F (upper- or lowercase), optionally followed by additional letters. This curious arrangement is simply to allow the strings .TRUE. or .FALSE. to be input.

Character values are controlled by the A edit descriptor in one of two forms, A or Aw. In the form A, the width of the I/O fields is determined by the length of the character variable or expression in the I/O list. So, if NAME is declared

CHARACTER NAME*7

then 7 characters are output, and 7 characters are input.

In the second form (Aw), the w left-most characters are printed on output. If necessary, the output field is blank-filled from the left. The rules for input under the second form are a little strange. Suppose len is the length of the variable being read. If w is less than len, the left-most w characters are read, padded with blanks on the right. For instance, under A5, the input string NAPOLEON is read into NAME (declared above with length 7) as NAPOLbb, where b denotes a blank. However, and this is the strange part, if w is greater than len, the right-most len characters of the field are read. So under A8, for example, the string NAPOLEON is read into NAME as APOLEON. One would have expected the left-most characters to be read.
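
These input rules can be tested without a data file by reading from an internal file, i.e., a character variable used as the unit. The sketch below assumes a variable of length 7 and the field widths 5 and 8. The names AEditDemo and Record are hypothetical.

```fortran
PROGRAM AEditDemo
  IMPLICIT NONE
  CHARACTER (LEN = 7) :: Name
  CHARACTER (LEN = 8) :: Record

  Record = 'NAPOLEON'

  READ (Record, '(A5)') Name          ! w < len: Name holds 'NAPOL' plus 2 blanks
  PRINT '(1X, 3A)', '[', Name, ']'

  READ (Record, '(A8)') Name          ! w > len: Name holds 'APOLEON'
  PRINT '(1X, 3A)', '[', Name, ']'

  PRINT '(1X, A, A10, A)', '[', Name, ']'   ! output, w > len: blank-filled from the left
  PRINT '(1X, A, A3, A)', '[', Name, ']'    ! output, w < len: left-most 3 characters
END PROGRAM AEditDemo
```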

Finally, there are the general Gw.d and Gw.dEe edit descriptors, which may be used for any of the intrinsic data types. These descriptors are useful for printing values whose magnitudes are not well-known in advance. Where possible values are output under the equivalent F descriptor; otherwise the equivalent form with E is used.

3.2. Formatted Read

A more general form of READ allows input from files and can intercept errors and end-of-file conditions gracefully, without causing the program to crash. It is

READ ([UNIT=]u, [FMT=]fmt [,IOSTAT=ios] [,ERR=errorlabel] [,END=endlabel]) [list]

The only obligatory items are the format specifier fmt, as described above, and the unit specifier u. A unit is an I/O device, such as a printer, terminal, or disk drive, for example, which may be connected by the compiler to your program. Such a unit may have a unit number attached to it, which is usually in the range 1–99, for the duration of a program.

We have seen the only two situations where a unit number is not required. The PRINT normally expects to output to the terminal, and the first form of READ above normally expects to read from the terminal. In such cases, the terminal is called the standard I/O unit. Your system may allow you to change the standard unit.

The unit specifier u, when it is required, may take one of three forms: an integer expression, an asterisk (which implies the standard input unit), or a character variable in the case of an internal file. The remaining specifiers are optional, and may be in any order. If IOSTAT is specified, ios must be an integer variable. After execution of READ, ios holds one of two different (system-dependent) negative values, depending on whether an end-of-record or an end-of-file condition occurred, a positive value if an error was detected, or the value zero otherwise. The presence of IOSTAT prevents a crash if an exception occurs.
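
A common use of IOSTAT is a loop that reads until the input runs out. The sketch below reads real values from the standard input unit and adds them up. The names SumInput, Total and NumRead are hypothetical, and list-directed formatting (FMT=*) is assumed for simplicity.

```fortran
PROGRAM SumInput
  IMPLICIT NONE
  REAL :: X, Total
  INTEGER :: Ios, NumRead

  Total = 0.0
  NumRead = 0
  DO
    READ (UNIT=*, FMT=*, IOSTAT=Ios) X
    IF (Ios < 0) EXIT                 ! negative: end of file, stop reading
    IF (Ios > 0) THEN                 ! positive: an error, e.g. non-numeric input
      PRINT *, 'Bad input, stopping. IOSTAT = ', Ios
      EXIT
    END IF
    Total = Total + X                 ! zero: the value was read successfully
    NumRead = NumRead + 1
  END DO

  PRINT *, NumRead, ' values read, sum = ', Total
END PROGRAM SumInput
```

Without the IOSTAT specifier, reaching the end of the input would normally terminate the program with a run-time error.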

3.3. Formatted WRITE

The general form of the WRITE statement for formatted output is

WRITE ([UNIT=]u, [FMT=]fmt [,IOSTAT=ios] [,ERR=errorlabel]) [list]

The specifiers have the same meanings as in the READ statement. The output device may be selected during program execution. You may be developing a large program which will eventually spew vast amounts of data out on the printer. To save time (and paper) while writing the program, you may want to be able to specify while the program is running where the output should go. The following code should help (PRN and CON are the names of the PC printer and terminal respectively):

CHARACTER OutputDevice*3    ! length not shown in the source; 3 is assumed, enough for 'prn' or 'con'
PRINT*, 'Where do you want the output ("prn" or "con")?'
READ*, OutputDevice
OPEN (1, FILE = OutputDevice)
WRITE (1, *) 'Output on designated device'

4. Fortran Intrinsic Functions

4.1. Elemental Numerical Functions

Note that the arguments may be real or complex scalars or arrays, unless otherwise stated.

ABS(A): absolute value of integer, real or complex A
ACOS(X): inverse cosine (arc cosine)
AIMAG(Z): imaginary part
AINT(A [,KIND]): A truncated toward zero to a whole real number, e.g., AINT(3.9) returns 3.0
ANINT(A [,KIND]): nearest whole real number, e.g., ANINT(3.9) returns 4.0
ASIN(X): inverse sine (arc sine)
ATAN(X): inverse tangent (arc tangent), in the range -π/2 to π/2
ATAN2(Y, X): inverse tangent (arc tangent), as principal value of the argument of the complex number (X, Y), in the range -π to π
CEILING(A): smallest integer not less than A
CMPLX(X [,Y] [,KIND]): converts X or (X, Y) to complex type
CONJG(Z): conjugate of complex Z
COS(X): cosine
COSH(X): hyperbolic cosine
DIM(X, Y): max(X-Y, 0)
EXP(X): exponential function
FLOOR(A): largest integer not exceeding its argument, e.g., FLOOR(-3.9) returns -4
INT(A [,KIND]): converts to integer type, truncating towards zero
LOG(X): natural logarithm; for complex X the result is the principal value
LOG10(X): common (base 10) logarithm
MAX(A1, A2 [,A3,...]): maximum of arguments
MIN(A1, A2 [,A3,...]): minimum of arguments
MOD(A, P): remainder of A modulo P, i.e., A-INT(A/P)*P, e.g., MOD(2.2, 2.0) returns 0.2
MODULO(A, P): A modulo P for A and P both real or both integer, i.e., A-FLOOR(A/P)*P in the real case, and A-FLOOR(A÷P)*P in the integer case, where ÷ represents mathematical division, e.g., MODULO(-10, 3) returns 2, MODULO(-2.2, 2.0) returns 1.8
NINT(A [,KIND]): integer nearest to A
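
The differences between the rounding and remainder functions are easiest to see with negative arguments. The following sketch (the program name IntrinsicDemo is hypothetical) prints a few cases. The values noted in the comments follow from the definitions in the list; real results are subject to rounding.

```fortran
PROGRAM IntrinsicDemo
  IMPLICIT NONE

  PRINT *, AINT(3.9), AINT(-3.9)              ! 3.0 and -3.0: truncation toward zero
  PRINT *, ANINT(3.9), NINT(3.9)              ! 4.0 (real) and 4 (integer)
  PRINT *, FLOOR(-3.9), CEILING(-3.9), INT(-3.9)   ! -4, -3 and -3
  PRINT *, MOD(-10, 3), MODULO(-10, 3)        ! -1 and 2
  PRINT *, MOD(2.2, 2.0), MODULO(-2.2, 2.0)   ! about 0.2 and about 1.8
  PRINT *, DIM(5.0, 3.0), DIM(3.0, 5.0)       ! 2.0 and 0.0
  PRINT *, ATAN2(1.0, -1.0)                   ! about 2.356, i.e. 3*pi/4
END PROGRAM IntrinsicDemo
```

MOD takes the sign of its first argument, while MODULO takes the sign of its second, which is why the two differ for -10 and 3.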