







This is a project report for a Computer Science degree. The project was supervised by Dr. Niharika Raj at Acharya Nagarjuna University. Its main points are: Table, Pack, ScaLAPACK, Linear, Matrix, Library, Algebra, Factorization, Eigen, Equation, Complex, Real, Value, Problem








he needs to choose between the available parallel libraries, then he may be able to choose the library that gives the best performance for his problem.
Two underlying architectures are to be used for this project. The first will be a cluster, i.e. processors running on separate systems. In this architecture, network delay will be an issue.
The second underlying architecture will be a supercomputer with multiple processors. In this case there will be no issue of LAN capacity or network congestion.
Parallel computers can run some types of programs far faster than traditional single processor computers. That is why there is an emerging trend in the use of parallel computing.
Most people do not want to run every available program related to their problem in order to find the best-performing one. This project will give them a performance comparison of different parallel programs so that they can start their initial task more easily.
When writing programs for parallel computing, programmers explicitly specify how to decompose the computing work among all available nodes. After this project, the performance of certain libraries will be available, and by analyzing how these well-performing libraries decompose their tasks, programmers can get an idea of the decomposition rules that give better performance.
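As an illustration of what explicit decomposition can look like, the following is a minimal sketch of a block partition, one common way to divide work among nodes. The function name `block_partition` and the choice of 10 items on 4 nodes are hypothetical and are not taken from any of the libraries studied in this project.

```python
def block_partition(n_items, n_nodes):
    """Split indices 0..n_items-1 into n_nodes contiguous, near-equal blocks."""
    base, extra = divmod(n_items, n_nodes)
    blocks = []
    start = 0
    for rank in range(n_nodes):
        # The first `extra` nodes each take one additional item,
        # so block sizes differ by at most one.
        size = base + (1 if rank < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


# Hypothetical example: 10 matrix rows distributed over 4 nodes.
blocks = block_partition(10, 4)
for rank, block in enumerate(blocks):
    print(f"node {rank}: rows {list(block)}")
```

Keeping block sizes within one item of each other balances the load, which matters because the slowest node determines the parallel execution time.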
Parallelism finds applications in very diverse domains for different motivating reasons, which range from improved application performance to cost considerations. The use of multi-processor machines or parallel computing has made a great impact in a variety of areas, especially in computational simulations for scientific and engineering applications. Some of the particular fields are earthquakes, economics, fission, fusion, material science and medical imaging.
Some parallel libraries have been selected for this project and are discussed under this heading. Selected programs from these libraries will be considered for performance evaluation.
PBLAS (Parallel Basic Linear Algebra Subprograms) libraries contain serial and parallel versions of basic linear algebra procedures. The routines fall into the following three levels [7]:
Level 1: Vector-Vector Operations (swap, copy, addition, dot product)
Level 2: Matrix-Vector Operations (multiply, rank-updates, outer-product)
Level 3: Matrix-Matrix Operations (multiply, transpose, rank-updates)
All these routines are designed to work with a variety of matrix types such as general, symmetric, complex, Hermitian and triangular matrices.
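To make the three levels concrete, here is a serial, pure-Python sketch of one operation from each level. It only illustrates the mathematics and is not the PBLAS interface; the helper names `dot`, `matvec` and `matmul` are chosen for this example.

```python
def dot(x, y):
    """Level 1 (vector-vector): dot product."""
    return sum(a * b for a, b in zip(x, y))


def matvec(A, x):
    """Level 2 (matrix-vector): multiply matrix A by vector x."""
    return [dot(row, x) for row in A]


def matmul(A, B):
    """Level 3 (matrix-matrix): multiply matrix A by matrix B."""
    cols = list(zip(*B))  # columns of B
    return [[dot(row, col) for col in cols] for row in A]


x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]
A = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 0.0],
     [2.0, 0.0, 1.0]]
I2 = [[2.0, 0.0, 0.0],
      [0.0, 2.0, 0.0],
      [0.0, 0.0, 2.0]]

print(dot(x, y))      # 32.0
print(matvec(A, x))   # [5.0, 2.0, 5.0]
print(matmul(A, I2))  # every entry of A doubled
```

For n-by-n inputs the work grows from about n operations at Level 1 to n² at Level 2 and n³ at Level 3, which is why the matrix-matrix routines usually gain the most from parallel execution.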
scale application codes on parallel (and serial) computers. PETSc uses the MPI standard for all message-passing communication.
PETSc includes an expanding suite of parallel linear and nonlinear equation solvers and time integrators that may be used in application codes written in Fortran, C, and C++. PETSc provides many of the mechanisms needed within parallel application codes, such as parallel matrix and vector assembly routines. The library is organized hierarchically, enabling users to employ the level of abstraction that is most appropriate for a particular problem. By using techniques of object-oriented programming, PETSc provides enormous flexibility for users. Some of the PETSc modules deal with [5]:
Vectors
Matrices (generally sparse)
Distributed arrays (useful for parallelizing regular grid-based problems)
Krylov subspace methods
Pre-conditioners, including multi-grid and sparse direct solvers
Nonlinear solvers
Time-steppers for solving time-dependent (nonlinear) PDEs
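As a small illustration of what a Krylov subspace method does, the sketch below implements the conjugate gradient method for a symmetric positive definite system in serial, pure Python. It is not PETSc code and does not use the PETSc API; the function name `conjugate_gradient` and the 2-by-2 test system are assumptions made for this example.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def matvec(A, x):
    return [dot(row, x) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric positive definite A, starting from x = 0."""
    x = [0.0] * len(b)
    r = list(b)   # residual b - A x, equal to b because x starts at zero
    p = list(r)   # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break  # residual norm is small enough
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)  # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        # New direction: residual plus a multiple of the previous direction.
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Hypothetical test system; its exact solution is x = [1/11, 7/11].
A = [[4.0, 1.0],
     [1.0, 3.0]]
b = [1.0, 2.0]
print(conjugate_gradient(A, b))
```

The only operations on A are matrix-vector products, and the rest are dot products and vector updates, which is what makes methods of this family well suited to large sparse matrices distributed across processors.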
The MATLAB Distributed Computing Toolbox is a powerful tool that enables PC clients to run parallel MATLAB computations. The Toolbox enables the user to execute MATLAB algorithms on a cluster of computers. The user can prototype and develop applications in the MATLAB environment and then use the Distributed Computing Toolbox to divide them into independent tasks. The MATLAB Distributed Computing Engine evaluates these tasks on remote MATLAB sessions. The Distributed Computing Toolbox substantially reduces overall execution time for many types of applications, including those in which algorithms or models are executed repeatedly with different input data [3]. The Distributed Computing Toolbox also offers support for third-party schedulers, and new inter-process communication capabilities for distributing and executing parallel algorithms on a cluster of computers using MATLAB.
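The pattern of running one model repeatedly with different input data as independent tasks can be sketched as follows. This is a Python analogue and not the MATLAB toolbox API: a local thread pool stands in for the remote MATLAB sessions, and `simulate` is a hypothetical placeholder for a real model.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(param):
    """Hypothetical model evaluation: sum of squares from 0 to param."""
    return sum(i * i for i in range(param + 1))


# Each input defines one independent task; no task needs another's result.
inputs = [10, 100, 1000]

# The pool plays the role of the scheduler that hands tasks to workers.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(simulate, inputs))

print(results)  # [385, 338350, 333833500]
```

Because the tasks share no data, they can be evaluated in any order and on any worker, which is the property that makes this kind of workload easy to distribute over a cluster.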
In this project, the underlying architectures are a cluster and an SGI machine (a supercomputer).
Cluster is a widely used term meaning independent computers combined into a unified system through software and networking. At the most fundamental level, when two or more computers are used together to solve a problem, it is considered a cluster.
Figure 1-Cluster Computing
In order to analyze the performance of a system, we need some performance evaluation metrics. By using these metrics, we can compare various types of systems. In parallel computing, the criteria for evaluating performance can include speedup, efficiency and scalability.
Speedup tells us how much faster a parallel algorithm is than the corresponding sequential algorithm. It is defined as
$$S_p = \frac{T_1}{T_p}$$
where $S_p$ is the speedup, $T_1$ is the execution time of the sequential algorithm, $T_p$ is the execution time of the parallel algorithm and $p$ is the number of processors. There are three cases for speedup: linear, sub-linear and super-linear [1], as shown in Figure 2.
Linear: Speedup is equal to the number of processors ($S_p = p$).
Sub-Linear: Speedup is less than the number of processors ($S_p < p$).
Super-Linear: Speedup is greater than the number of processors ($S_p > p$).
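A short sketch of the speedup calculation, taking speedup as the sequential execution time divided by the parallel execution time. The timings used here are hypothetical values chosen for illustration and are not measurements from this project.

```python
import math


def speedup(t_serial, t_parallel):
    """Speedup: sequential execution time divided by parallel execution time."""
    return t_serial / t_parallel


def classify(s, p):
    """Label a speedup s obtained on p processors."""
    if math.isclose(s, p, rel_tol=1e-9):
        return "linear"
    return "super-linear" if s > p else "sub-linear"


# Hypothetical timings: 120 s sequentially, 40 s on 4 processors.
s4 = speedup(120.0, 40.0)
print(s4, classify(s4, 4))  # 3.0 sub-linear
```

In practice most measured speedups are sub-linear, because communication and synchronization add time that the sequential program does not spend.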
Figure 2 Speedup vs Number of Processors
The efficiency of a parallel system is defined as the achieved fraction of the total potential parallel processing gain. It estimates how well the processors are used in solving the problem [1]. Mathematically it is expressed as
$$E_p = \frac{S_p}{p} = \frac{T_1}{p \, T_p}$$
where $E_p$ is the efficiency. When $p$ is fixed, speedup and efficiency are equivalent measures, differing only by the constant factor $p$.
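The efficiency calculation can be sketched in the same way, taking efficiency as speedup divided by the number of processors. The table of timings below is hypothetical and serves only to show how the metric behaves.

```python
def efficiency(t_serial, t_parallel, p):
    """Efficiency: speedup divided by the number of processors p."""
    return (t_serial / t_parallel) / p


# Hypothetical execution times in seconds, keyed by processor count.
timings = {1: 120.0, 2: 64.0, 4: 40.0, 8: 30.0}

t1 = timings[1]
for p, tp in timings.items():
    print(f"p={p}: speedup={t1 / tp:.3f}, efficiency={efficiency(t1, tp, p):.4f}")
```

With these numbers the efficiency falls from 1.0 on one processor to 0.5 on eight, even though the speedup keeps rising, which is why both metrics are usually reported together.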
Scalability is the ability of a system to handle a growing amount of work. In parallel computing, scalability measures the capability to effectively utilize an increasing number of processors [1].
In most cases, efficiency decreases as more processors are involved in the computation. However, if the problem size is increased accordingly, the efficiency of the system can be maintained.
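The effect of problem size on efficiency can be illustrated with a toy cost model. The model below is an assumption made for this sketch and not a measurement: the work n divides perfectly among p processors, and each additional processor adds a fixed overhead c (for example, for communication).

```python
def model_time(n, p, c=1.0):
    """Toy cost model (an assumption): divisible work n plus overhead c per extra processor."""
    return n / p + c * (p - 1)


def model_efficiency(n, p, c=1.0):
    """Efficiency T_1 / (p * T_p) under the toy cost model."""
    return model_time(n, 1, c) / (p * model_time(n, p, c))


procs = [2, 4, 8, 16, 32]

# Fixed problem size: efficiency drops as processors are added.
fixed = [model_efficiency(1000, p) for p in procs]

# Problem size grown in proportion to p * (p - 1): efficiency stays constant.
scaled = [model_efficiency(500 * p * (p - 1), p) for p in procs]

for p, ef, es in zip(procs, fixed, scaled):
    print(f"p={p:2d}: fixed n -> {ef:.3f}, scaled n -> {es:.3f}")
```

Under this model the efficiency equals n / (n + c p (p - 1)), so how fast the problem must grow to hold efficiency constant depends entirely on how the overhead grows with p; a different overhead term would give a different growth rule.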
The entire project work is distributed by semester. The work breakdown structure of the project is given below:
SEMESTER 6 & SUMMER:
Literature Survey
Development of SRS and Project Plan
Understanding Parallel Computing Concepts
SEMESTER 7:
Study of Different Parallel Numerical Libraries
[1] Alexey Borisenko, “Performance Evaluation in Parallel Systems”, School of Information Technology, University of Ottawa, Canada, 2010
[2] URL: http://www.codeproject.com/KB/IP/ClusterComputing.aspx
[3] URL: http://www.mathworks.com/products/parallel-computing
[4] URL: http://www.buyya.com/cluster/v2chap1.pdf
[5] URL: http://www.mcs.anl.gov/petsc/petsc-2/documentation
[6] URL: http://www.netlib.org/scalapack/scalapack_home.html
[7] URL: http://sirius2.umdnj.edu/~rossiar/hpcworkshop/hpc_files/Introduction_to_MPI/resources/mpi/parallel_libs/1229.html