Student's Guide to Shared-Memory Parallel Programming | CSC 4310, Papers of Computer Science

Material Type: Paper; Class: PARALLEL & DIST COMPUTING; Subject: COMPUTER SCIENCE; University: Georgia State University; Term: Unknown 1989;

Uploaded on 08/31/2009

CSC 4310/6310 Student’s Guide to Shared-Memory Parallel Programming

Hrishikesh K. Joshi (1), Nikhil A. Junankar (1)

Abstract

This document’s primary objective is to serve as a guide for students taking the Parallel Computing course offered at Georgia State University. The first section provides a basic introduction to UNIX commands. The second section provides details of the parallel machine to be used and the utilities available on it to write, compile and execute a shared-memory parallel program. Suggestions for debugging and performance tuning of parallel programs are also included.

(1) Past Graduate Research Assistants, Department of Computer Science, Georgia State University

Contents

  • 1 Introduction to UNIX
    • 1.1 Logging on to hydra
    • 1.2 Basic IRIX commands
      • 1.2.1 Commands to manipulate a directory
      • 1.2.2 Commands to manipulate a file
      • 1.2.3 Commands to set the access permissions for files and directories
  • 2 Notes on Shared-Memory Model
    • 2.1 Machine Description
    • 2.2 Compiling and Running Parallel Programs
    • 2.3 Debugging tools
    • 2.4 Other Software Tools on SGI (and other UNIX machines)
      • 2.4.1 The make utility:
      • 2.4.2 Improving the efficiency of your programs - pixie and prof
    • 2.5 Examples
      • 2.5.1 Print Hello.
      • 2.5.2 Adding n Numbers in Parallel
      • 2.5.3 Timing the Barrier Calls

1.2.1 Commands to manipulate a directory

  1. Changing directory The command cd directory name changes the present working directory to the new directory, if it exists. The command cd, given without a directory name, takes us to our home directory.
  2. Creating a directory A new directory can be created by the mkdir command. For example, the command mkdir testDir will create a new directory testDir in the directory we are presently in.
  3. Moving a directory The mv command allows the user to change the location of a directory and/or its name.
  4. Deleting a directory A directory can be deleted by the following command: rmdir directory name The directory is deleted only if it is empty. If we want to delete a directory along with all its subdirectories, we use the command rm -i -r directory name. This command recursively deletes all the subdirectories under the given directory and finally deletes the directory itself.
  5. Listing the files in a directory The command ls lists the files and subdirectories in the present directory. The command ls directory name lists the files and subdirectories in the specified directory.

1.2.2 Commands to manipulate a file

  1. Creating a file We are mainly interested in creating C program files, also called C source files. These files contain the C source code as ASCII text and have a .c extension. We create these files using an editor program. The simplest editors are ’pico’ and ’vi’. The command pico myfile.c opens the file myfile.c, which may or may not already exist. If it already exists, we can make changes to it and save it either under the same name or under a different name. Using the C compiler ’cc’, we can compile our source file ’myfile.c’ and generate an executable file. This file is created by the compiler, which takes our C source file as input. The procedure to compile parallel programs is described in a later section. Unlike in DOS, an executable file in IRIX need not have the extension .exe. We can execute a file by simply giving its name at the command prompt. For example, if we have an executable named myfile, we can execute it by typing myfile.
  2. Moving a file The command is mv. It is the same command that was used in moving directories.
  3. Removing a file The command is rm -i file name.
  4. Copying a file The command is cp file name1 file name2.

1.2.3 Commands to set the access permissions for files and directories

IRIX provides a method to define the access privileges for users other than the owner of a file or directory. Normally, the owner of a file or directory is the user who created it while logged into his or her own account. Let us assume that the user account matnajx has created a C source file example.c on hydra. The following operations can be performed on this file:

  1. A file can be modified by changing its contents. This operation is known as write operation. (Deletion is a special write operation in which all the contents of the file are destroyed.)
  2. A file can simply be read without changing any contents. This is known as the read operation.
  3. A file can be executed (if it is indeed an executable file).

The access privileges for a file can be set by the command chmod access bits file name, where access bits are the octal digits LMN (each of L, M, N can be from 0 to 7):

L: specifies access privileges for the owner

M: specifies access privileges for the users of the same group as the owner

N: specifies access privileges for all the other users

Every digit L, M or N can specify following types of permissions:

0 : Neither read nor write nor execute

1 : Only execute

2 : Only write

3 : Only write and execute

4 : Only read

5 : Only read and execute

6 : Only read and write

7 : All the three privileges

For example, the command chmod 775 example.c sets the following privileges for example.c.

  • Read, write and execute by the owner.
  • Read, write and execute by the users of the same group.
  • Read and execute by all the other users.

Note that everything running in IRIX is a process. Even the hydra prompt that waits for your next input is displayed by a process. Some processes are under user control, while others are not. This is to prevent malicious use of processes.

  1. ps command If you type ps at the hydra command prompt and press enter, you get a list of the processes which belong to you and are currently alive. If you type ps -ef at the prompt, you get a complete listing of all the processes currently alive in the system, not all of which belong to you. Some are system processes, while others were started by other users. To isolate the processes which belong to you, filter the output of ps -ef, for example ps -ef | grep your userid, or use ps -u user id. The latter will sometimes have the same effect as the plain ps command. The output of these commands includes the process id (PID), which is what matters most to us. Identify the PIDs of your processes by referring to the corresponding entries in the COMMAND column. Refer to the man pages for an explanation of the other details shown.
  2. kill command The kill command can be used to terminate processes you identified using ps. There can be a variety of reasons for doing so, most commonly that the processes are misbehaving and the program needs debugging. To use kill, first obtain the process ids using ps by matching the PID column with the COMMAND column. Once you have the PID of the process you want to kill, type the following command and press enter: kill -9 PID The -9 option stands for sure kill. Other options are discussed in the man pages. To kill processes which have been spawned by a program, perform the following steps: a) Run the main program in background mode, for example prog & b) Use ps to identify the PIDs of the processes. c) Use the kill command to kill the processes identified.
  3. top command The top command displays information about the processes using the most cpu at any time. To use top, type top at the hydra prompt and press enter. Type q to quit. top displays the top 30 processes and ranks them according to raw cpu percentage usage. Read the man pages to get more details on ps, kill and top.

2.3 Debugging tools

Using the debugger to debug C programs The IRIX system provides a source level debugger called dbx. It can be invoked from the command line as dbx exec program The ’exec program’ is the name of the executable program to be debugged. This program must have been compiled with the ’-g’ option when compiling the C source code. After we run dbx we get a prompt > at which we issue the various commands to monitor the execution of the program.

Starting execution The program is executed by the command run [argument list]. Here argument list is the list of arguments that is normally passed when running the program on the command line. The program execution continues till a break point. If at any point of time we want to reexecute the program from start, we can use the command rerun.

Setting a breakpoint We can set a break point at any line in the source code by giving its line number (line numbers are shown in dbx). The command is stop at NN, where NN is the line number at which you want to set the break point. We can also set a break point just after entering a function. The command is stop in func name, where func name is the name of the function at which the break point is to be set.

Single stepping After the program encounters a break point, the execution is controlled by us, normally by single stepping. The command is next or n. This command executes one statement at a time, so we just keep issuing next to move through the program.

Stepping inside a function The command step allows us to single step inside a function call.

Continuing execution When we are single stepping, we can set new break points or continue execution by the command cont. This command continues execution till the next break point.

Printing the value of a variable The command print variable name prints the present value of variable name. We do not have to specify the type of the variable, as this is taken care of by dbx itself.

Determining the type of the variable The command whatis variable name gives us the type of the variable variable name.

Selecting a source file The command e source file selects a particular source file as the current one.

Listing a source file The command list lists the next ten lines of the source file currently selected. If we say list line number, the next ten lines starting with the specified line number are shown.

Editing a source file The command edit source file allows the editing of source file. It invokes the vi editor for editing the source file.

Debugging core dumps A core dump results from the abnormal termination of a program, typically caused by faulty memory handling due to a bug in the program. When a core dump occurs during the execution of a program, a file named core is created containing information about the state of the program at the time of the failure. dbx uses this file to extract information about the location of the failure. We do not have to do anything to obtain it, because generation of the core file is automatic. To get the name of the function which caused the problem, we issue the command trace or simply t. This gives us the list of functions in their calling sequence with the most recently called function listed first. This function is usually the one in which the core dump occurred. The next logical step is to go inside the function by setting a break point and single stepping to determine the actual location of the problem. To make debugging easier, functions should normally be no more than 30 lines long.

The following are possible causes of a core dump:

  1. An array is accessed beyond its maximum size.
  2. A pointer is accessed without memory having been allocated for it by a call to malloc or calloc.
  3. A pointer is freed twice.
  4. A pointer being freed is null. (ANSI C defines free on a null pointer as doing nothing, but older C libraries may fail.)
  5. Trying to access memory through a pointer that is null.
  6. Trying to access memory through a pointer that is uninitialized.

At this time if you do an ls you will notice main.o in the directory along with the other files. Similarly obtain average.o. Now you need to link them to obtain the final executable, because in .o files external linkages are not resolved. cc -o average main.o average.o Now if there is a change in main.c or average.c you will again have to go through the procedure of obtaining the object file and then linking it with the others. Imagine doing this for 5 files out of the 10 you worked on. The make utility helps in situations like this. First a special file called the Makefile (note the capital M) is created and kept in the same directory as all the other files, .c and .o. Once we create this file by the procedure described next, we type make at the command line. The make program will read the compilation and linking instructions in the Makefile and create the executable, thus saving time for the programmer.

Preparing the Makefile: A make file consists of a series of entries of the following form

target files : dependencies
	[.....commands........]

The first line of each entry is a list of target files separated by spaces, then a colon, and then a list of files called the dependencies. A dependency is a file which, if changed, requires the target file to be rebuilt. Each command line that follows must begin with a tab character. If the list of dependencies is too long for one line, it can be continued on the next line by ending the line with a backslash. Commands can also be placed on the same line as the dependencies by putting a semicolon after the list. A typical Makefile for our running example of main.c and average.c is given below

average: main.o average.o
	cc -o average main.o average.o

main.o: main.c
	cc -c main.c

average.o: average.c
	cc -c average.c

The make program interprets the first entry in the following manner: average is the target file, it depends on main.o and average.o, and it is obtained by executing the command on the next line. The other two entries are interpreted in the same manner. An example of a Makefile is given along with the example programs. The make utility thus suggests the following working method: edit, make, run, and repeat.

2.4.2 Improving the efficiency of your programs - pixie and prof

  • pixie utility The pixie program adds profiling code to a program. It reads an executable program, partitions it into basic blocks, and writes an equivalent program containing additional code that counts the execution of each basic block. (A basic block is a region of the program that can be entered only at the beginning and exited only at the end.) Pixie is invoked on an executable file as pixie prog where prog is an executable file. After the above command is executed you should find an executable file in your directory called prog.pixie and a file called prog.Addrs. When you run this new executable file prog.pixie like any other program, it will generate a file containing the basic block counts. This file has the name prog.Counts. It serves as input to the prof command, which is explained next.
  • prof command Having used pixie to obtain the prog.Addrs and prog.Counts files, which contain the addresses of the blocks and the counts of the blocks respectively, we then use the prof utility to analyze this profiling data and produce a listing. This listing typically gives information on how many times a given procedure is invoked, what its execution time is, which lines or procedures did not execute, and other information used for optimization. The use of the prof command is explained below using an example. For more details do a man prof. Example:

cc -o myprog myprog.c          /* generates executable called myprog */
pixie -o myprog.pixie myprog   /* generates myprog.Addrs */
myprog.pixie                   /* generates myprog.Counts */
prof -pixie myprog myprog.Addrs myprog.Counts > myprog.prof

2.5 Examples

There are examples of C programs implemented on the hydra in the directory ??. Use the UNIX commands in earlier sections to make copies of these programs for yourself. A readme file is provided to help you. The examples and their algorithms are described next.

2.5.1 Print Hello.

Description/ Algorithm : This program spawns processes and makes each one of them print a hello on the screen.

File names:
hello.c /* source code */
hello /* executable */
helloop /* output example file */

2.5.2 Adding n Numbers in Parallel

This program divides an array of integers amongst the available processes, and each process adds up the elements of the data set in its domain. The final result is obtained by each process adding its result to a global final result. Note that in the sequential version we would take the elements of the array one at a time and add each to the global result. Three versions of the same idea are provided.

Parallel add version 1: This program spawns multiple processes which work on an array of size 8388608 and reports the time taken and the speed-up obtained as the number of processes is incremented from 1 to 3.

File names:
plladd1.c /* source code */
plladd1 /* executable */
plladd1op /* output example file */

Parallel add version 2: This program spawns multiple processes which work on an array of variable size, with the number of processes varying from 1 to 3. A table provides information on the summation results and the speedup obtained. To calculate the speedup we need the time taken for a purely sequential run of the slave process. This program obtains that time by adding up the array with a sequential add that does not use parallel computing primitives, as we will see in a later example (Timing the barrier calls).

File names:
plladd2.c /* source code */
plladd2 /* executable */
plladd2op /* output example file */