Solving Linear Systems using Parallel Gaussian Elimination - Prof. Elise H. Dedoncker, Study notes of Computer Science

These notes cover the solution of linear systems using parallel Gaussian elimination: the concept of Gaussian elimination, the use of partial pivoting, and the application of the method to a specific system. They also analyze the time complexity and speedup of the parallel algorithm.


Uploaded on 08/18/2009

3/9/2009
Thap Panitanarak
CS6260 Spring 2009
Solving Linear Systems using
Parallel Gaussian Elimination
Linear System

• Linear system, or system of linear equations
• $n$ equations in $n$ variables
• $a_{i,j}$ is the coefficient of $x_j$ in equation $i$

\[
\begin{aligned}
a_{0,0}x_0 + a_{0,1}x_1 + a_{0,2}x_2 + \cdots + a_{0,n-1}x_{n-1} &= b_0 \\
a_{1,0}x_0 + a_{1,1}x_1 + a_{1,2}x_2 + \cdots + a_{1,n-1}x_{n-1} &= b_1 \\
&\;\;\vdots \\
a_{n-1,0}x_0 + a_{n-1,1}x_1 + a_{n-1,2}x_2 + \cdots + a_{n-1,n-1}x_{n-1} &= b_{n-1}
\end{aligned}
\]

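Linear System

• Matrix representation

\[
\underbrace{\begin{bmatrix}
a_{0,0} & a_{0,1} & a_{0,2} & \cdots & a_{0,n-1} \\
a_{1,0} & a_{1,1} & a_{1,2} & \cdots & a_{1,n-1} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
a_{n-1,0} & a_{n-1,1} & a_{n-1,2} & \cdots & a_{n-1,n-1}
\end{bmatrix}}_{A}
\underbrace{\begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{bmatrix}}_{x}
=
\underbrace{\begin{bmatrix} b_0 \\ b_1 \\ \vdots \\ b_{n-1} \end{bmatrix}}_{b}
\]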

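Linear System

• Matrix representation: augmented matrix

\[
[A:b] =
\left[\begin{array}{ccccc|c}
a_{0,0} & a_{0,1} & a_{0,2} & \cdots & a_{0,n-1} & b_0 \\
a_{1,0} & a_{1,1} & a_{1,2} & \cdots & a_{1,n-1} & b_1 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
a_{n-1,0} & a_{n-1,1} & a_{n-1,2} & \cdots & a_{n-1,n-1} & b_{n-1}
\end{array}\right]
\]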
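The augmented matrix can be sketched in code as a list of rows, each row holding the coefficients followed by the right-hand side. The helper name `augment` is an assumption made for this illustration, and the numbers are the 4x4 example system used later in these notes.

```python
def augment(A, b):
    """Return the augmented matrix [A:b] as a new list of rows."""
    if len(A) != len(b):
        raise ValueError("A and b must have the same number of rows")
    # Each row of [A:b] is the row of A followed by the matching entry of b.
    return [list(row) + [rhs] for row, rhs in zip(A, b)]


# Example system from the Gaussian elimination slides.
A = [[4, 6, 2, -2],
     [2, 0, 5, -2],
     [-4, -3, -5, 4],
     [8, 18, -2, 3]]
b = [8, 4, 1, 40]

Ab = augment(A, b)
for row in Ab:
    print(row)
```

With this layout, column `n` of every row holds the right-hand side, which matches the `a[i,n]` indexing used in the sequential pseudocode.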

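Back Substitution

• Solve a linear system $Tx = c$, where $T$ is upper triangular

\[
\begin{aligned}
1x_0 + 1x_1 - 1x_2 + 4x_3 &= \ldots \\
-2x_1 - 3x_2 + 1x_3 &= \ldots \\
2x_2 - 3x_3 &= \ldots \\
2x_3 &= \ldots
\end{aligned}
\]

• Substitute $x_3 = \ldots$

\[
\begin{aligned}
1x_0 + 1x_1 - 1x_2 &= \ldots \\
-2x_1 - 3x_2 &= \ldots \\
2x_2 &= \ldots \\
2x_3 &= \ldots
\end{aligned}
\]

• Next, $x_2$ and $x_1$
• Substitute $x_2 = 3$

\[
\begin{aligned}
1x_0 + 1x_1 &= \ldots \\
-2x_1 &= \ldots \\
2x_2 &= \ldots \\
2x_3 &= \ldots
\end{aligned}
\]

• Substitute $x_1 = -6$

\[
\begin{aligned}
1x_0 &= \ldots \\
-2x_1 &= \ldots \\
2x_2 &= \ldots \\
2x_3 &= \ldots
\end{aligned}
\]

• We get $x_0 = 3$, $x_1 = -6$, $x_2 = 3$ and $x_3 = \ldots$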
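The substitution steps above can be sketched as a short routine. This is a minimal illustration, assuming `T` is upper triangular with a nonzero diagonal; the function name `back_substitute` and the 3x3 system are made up for the example and are not the system from the slide.

```python
def back_substitute(T, c):
    """Solve T x = c for upper-triangular T (nonzero diagonal assumed)."""
    n = len(T)
    c = list(c)          # work on a copy so the caller's data is untouched
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = c[i] / T[i][i]
        # Substitute x[i] into every equation above row i.
        for j in range(i):
            c[j] -= x[i] * T[j][i]
    return x


# Hypothetical system chosen so that the solution is x = (1, 2, 3).
T = [[2.0, 1.0, -1.0],
     [0.0, 3.0, 2.0],
     [0.0, 0.0, 4.0]]
c = [1.0, 12.0, 12.0]

x = back_substitute(T, c)
print(x)
```

The inner loop is the "substitute and move to the right-hand side" step: once `x[i]` is known, its contribution is removed from every remaining equation.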

Gaussian Elimination

• General concept
  • "Well-known" algorithm for solving the linear system $Ax = b$
  • Reduce $Ax = b$ to $Tx = c$, where $T$ is an upper triangular matrix
  • Apply back substitution to $Tx = c$ to solve for $x$
• Row operations used
  • Multiply any row by a nonzero constant
  • Swap two rows
  • Add one row to another, or subtract one row from another

Gaussian Elimination

• Consider the linear system

\[
\begin{aligned}
4x_0 + 6x_1 + 2x_2 - 2x_3 &= 8 \\
2x_0 + 5x_2 - 2x_3 &= 4 \\
-4x_0 - 3x_1 - 5x_2 + 4x_3 &= 1 \\
8x_0 + 18x_1 - 2x_2 + 3x_3 &= 40
\end{aligned}
\]

• $m_{1,0} = a_{1,0}/a_{0,0} = 0.5$, $m_{2,0} = a_{2,0}/a_{0,0} = -1$ and $m_{3,0} = a_{3,0}/a_{0,0} = 2$
• Lower elements of column 0 are eliminated:

\[
\begin{aligned}
4x_0 + 6x_1 + 2x_2 - 2x_3 &= 8 \\
-3x_1 + 4x_2 - 1x_3 &= 0 \\
3x_1 - 3x_2 + 2x_3 &= 9 \\
6x_1 - 6x_2 + 7x_3 &= 24
\end{aligned}
\]
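The elimination of column 0 can be reproduced in a few lines. This is a sketch on the augmented matrix of the example system; the function name `eliminate_column` is an assumption for this illustration, and no pivoting is performed here.

```python
def eliminate_column(aug, i):
    """Zero the entries below aug[i][i]; return the multipliers m[j] used."""
    multipliers = {}
    for j in range(i + 1, len(aug)):
        m = aug[j][i] / aug[i][i]
        multipliers[j] = m
        # row_j <- row_j - m * row_i, applied to the whole augmented row
        aug[j] = [ajk - m * aik for ajk, aik in zip(aug[j], aug[i])]
    return multipliers


aug = [[4.0, 6.0, 2.0, -2.0, 8.0],
       [2.0, 0.0, 5.0, -2.0, 4.0],
       [-4.0, -3.0, -5.0, 4.0, 1.0],
       [8.0, 18.0, -2.0, 3.0, 40.0]]

mult = eliminate_column(aug, 0)
print(mult)   # multipliers for rows 1, 2 and 3
for row in aug:
    print(row)
```

The multipliers come out as 0.5, -1 and 2, and rows 1 to 3 become the three reduced equations listed above.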

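Gaussian Elimination

• Partial pivoting
  • Use the row with the largest absolute value in the pivot column as the pivot row (row swap needed)
  • Also improves the numerical accuracy of the computation

Before the row swap:

\[
\begin{aligned}
4x_0 + 6x_1 + 2x_2 - 2x_3 &= 8 \\
2x_0 + 5x_2 - 2x_3 &= 4 \\
-4x_0 - 3x_1 - 5x_2 + 4x_3 &= 1 \\
8x_0 + 18x_1 - 2x_2 + 3x_3 &= 40
\end{aligned}
\]

After swapping rows 0 and 3, since $|8|$ is the largest entry in column 0:

\[
\begin{aligned}
8x_0 + 18x_1 - 2x_2 + 3x_3 &= 40 \\
2x_0 + 5x_2 - 2x_3 &= 4 \\
-4x_0 - 3x_1 - 5x_2 + 4x_3 &= 1 \\
4x_0 + 6x_1 + 2x_2 - 2x_3 &= 8
\end{aligned}
\]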
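Pivot selection for one column can be sketched as follows. The helper name `find_pivot_row` is an assumption for this illustration; ties are resolved in favor of the first row found, which matches the strict `<` comparison in the sequential pseudocode.

```python
def find_pivot_row(aug, i):
    """Return the index of the row (from i downward) with the largest |a[j][i]|."""
    return max(range(i, len(aug)), key=lambda j: abs(aug[j][i]))


aug = [[4.0, 6.0, 2.0, -2.0, 8.0],
       [2.0, 0.0, 5.0, -2.0, 4.0],
       [-4.0, -3.0, -5.0, 4.0, 1.0],
       [8.0, 18.0, -2.0, 3.0, 40.0]]

prow = find_pivot_row(aug, 0)
aug[0], aug[prow] = aug[prow], aug[0]   # physical row swap
print(prow, aug[0])
```

For column 0 of the example system the candidates are 4, 2, -4 and 8, so row 3 is selected and moved to the top.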

Gaussian Elimination

Sequential Algorithm

for i from 0 to n-1 do
    // Find pivot row
    pmax = 0
    prow = i
    for j from i to n-1 do
        if pmax < |a[j,i]| then
            pmax = |a[j,i]|
            prow = j
        end if
    end do
    rswap(i, prow)
    // Gaussian elimination
    for j from i+1 to n-1 do
        m = a[j,i]/a[i,i]
        for k from i to n do
            a[j,k] = a[j,k] - a[i,k]*m
        end do
    end do
end do

// Back substitution
for i from n-1 down to 0 do
    x[i] = a[i,n]/a[i,i]
    for j from 0 to i-1 do
        a[j,n] = a[j,n] - x[i]*a[j,i]
    end do
end do

Note: To make row swaps more efficient, we can also use an indirect index loc[i] = j, where “physical row” j is indexed by “virtual row” i. A swap then exchanges two entries of loc instead of moving row data.
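A runnable version of the sequential algorithm might look like the sketch below. It follows the pseudocode step by step on an augmented matrix; the function name `gaussian_solve` and the singularity check are additions made for this illustration, not part of the slides.

```python
def gaussian_solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Augmented matrix [A:b]; column n holds the right-hand side.
    a = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]

    for i in range(n):
        # Find the pivot row: largest |a[j][i]| for j >= i.
        prow = max(range(i, n), key=lambda j: abs(a[j][i]))
        if a[prow][i] == 0.0:
            raise ValueError("matrix is singular")
        a[i], a[prow] = a[prow], a[i]
        # Eliminate column i from the rows below the pivot.
        for j in range(i + 1, n):
            m = a[j][i] / a[i][i]
            for k in range(i, n + 1):
                a[j][k] -= a[i][k] * m

    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = a[i][n] / a[i][i]
        for j in range(i):
            a[j][n] -= x[i] * a[j][i]
    return x


A = [[4, 6, 2, -2],
     [2, 0, 5, -2],
     [-4, -3, -5, 4],
     [8, 18, -2, 3]]
b = [8, 4, 1, 40]

x = gaussian_solve(A, b)
print(x)   # approximately x0 = -13.5, x1 = 26/3, x2 = 7, x3 = 2
```

Substituting these values back into the four original equations reproduces the right-hand sides 8, 4, 1 and 40.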

Gaussian Elimination

• Indirect index for row swap

\[
\begin{aligned}
4x_0 + 6x_1 + 2x_2 - 2x_3 &= 8 \\
2x_0 + 5x_2 - 2x_3 &= 4 \\
-4x_0 - 3x_1 - 5x_2 + 4x_3 &= 1 \\
8x_0 + 18x_1 - 2x_2 + 3x_3 &= 40
\end{aligned}
\]
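The indirect index can be illustrated with a small sketch. Rows stay where they are in memory, and only the `loc` table changes. The helper names `row` and `swap_virtual_rows` are assumptions introduced for this example.

```python
def row(aug, loc, i):
    """Return the physical row that virtual row i currently refers to."""
    return aug[loc[i]]


def swap_virtual_rows(loc, i, j):
    """Swap two virtual rows by exchanging index entries only."""
    loc[i], loc[j] = loc[j], loc[i]


aug = [[4.0, 6.0, 2.0, -2.0, 8.0],
       [2.0, 0.0, 5.0, -2.0, 4.0],
       [-4.0, -3.0, -5.0, 4.0, 1.0],
       [8.0, 18.0, -2.0, 3.0, 40.0]]
loc = list(range(len(aug)))   # loc[i] = j: virtual row i is physical row j

# Pivot search for column 0, done through the index.
prow = max(range(len(aug)), key=lambda j: abs(row(aug, loc, j)[0]))
swap_virtual_rows(loc, 0, prow)

print(loc)                 # virtual-to-physical mapping after the swap
print(row(aug, loc, 0))    # the pivot row, accessed through loc
```

The swap costs two index writes regardless of `n`, whereas a physical swap moves `n + 1` values per row. In a row-wise parallel setting this also avoids sending whole rows between processors just to reorder them.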

Gaussian Elimination

Sequential Algorithm - Analysis

• Gaussian elimination with partial pivoting
  • In iteration $k$ (column $k$):
    • Finding the pivot row uses $(n-k)$ steps
    • The elimination step uses $(n-k-1)(n-k)$ steps
  • $n$ iterations ($k = 0$ to $n-1$)
  • Total: $n(n+1)/2 + n(n-1)(n+1)/3$
  • $O(n^3)$
• Back substitution
  • In iteration $k$, it takes $(n-k-1)$ steps
  • $n$ iterations
  • Total: $n(n-1)/2$
  • $O(n^2)$
• All in $O(n^3)$
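The closed forms in the analysis can be checked numerically by summing the per-iteration costs. This is a small sketch; the function names `count_steps` and `closed_forms` are made up for the illustration.

```python
def count_steps(n):
    """Sum the per-iteration step counts from the analysis for k = 0..n-1."""
    pivot = sum(n - k for k in range(n))
    elimination = sum((n - k - 1) * (n - k) for k in range(n))
    back_substitution = sum(n - k - 1 for k in range(n))
    return pivot, elimination, back_substitution


def closed_forms(n):
    """Closed-form totals: n(n+1)/2, n(n-1)(n+1)/3 and n(n-1)/2."""
    return (n * (n + 1) // 2,
            n * (n - 1) * (n + 1) // 3,
            n * (n - 1) // 2)


for n in (4, 16, 100):
    print(n, count_steps(n), closed_forms(n))
```

The integer divisions are exact because the product of three consecutive integers $(n-1)n(n+1)$ is always divisible by 3, and $n(n \pm 1)$ is always even.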

Gaussian Elimination

Parallel Design

• Back substitution: algorithm and analysis

for i from n-1 down to 1 do
    P(i) computes x[i]
    P(i) broadcasts x[i] to the other P's
    for j from 0 to i-1 do in parallel
        P(j) updates b[j]
    end do
end do
P(0) computes x[0]

• With $n$ processors, we can achieve $O(n \log_2 n)$ ($n$ iterations, each with a broadcast costing $O(\log_2 n)$)
• Speedup $S = n^2/(n \log_2 n) = n/\log_2 n$
• Example: $n = 16$ gives $S = 16/4 = 4$
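The structure of the parallel back substitution can be imitated on a single machine with a thread pool, where each task plays the role of one processor P(j) updating its own b[j]. This is only a structural sketch: Python threads do not give a real speedup for this kind of arithmetic, the "broadcast" is just a shared variable, and the names `parallel_back_substitute` and `estimated_speedup` are assumptions for the illustration.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def parallel_back_substitute(T, c, workers=4):
    """Back substitution where the b[j] updates of each step run as parallel tasks."""
    n = len(T)
    b = list(c)
    x = [0.0] * n
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for i in range(n - 1, 0, -1):
            x[i] = b[i] / T[i][i]      # P(i) computes x[i]
            xi = x[i]                  # stands in for the broadcast of x[i]

            def update(j, i=i, xi=xi):
                # P(j) updates its own b[j]; each task writes a distinct entry.
                b[j] -= xi * T[j][i]

            list(pool.map(update, range(i)))   # wait for all updates
        x[0] = b[0] / T[0][0]          # P(0) computes x[0]
    return x


def estimated_speedup(n):
    """Speedup estimate S = n / log2(n) from the analysis above."""
    return n / math.log2(n)


# Hypothetical upper-triangular system with solution (1, 2, 3).
T = [[2.0, 1.0, -1.0],
     [0.0, 3.0, 2.0],
     [0.0, 0.0, 4.0]]
c = [1.0, 12.0, 12.0]

x = parallel_back_substitute(T, c)
print(x, estimated_speedup(16))
```

Because every task in a step touches a different `b[j]`, the updates are independent, which is what makes the inner loop safe to run in parallel.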

Gaussian Elimination

Parallel Design

• Gaussian elimination with partial pivoting
  • Processors' assignment (row-wise)

\[
\begin{array}{ll}
P_0: & 4x_0 + 6x_1 + 2x_2 - 2x_3 = 8 \\
P_1: & 2x_0 + 5x_2 - 2x_3 = 4 \\
P_2: & -4x_0 - 3x_1 - 5x_2 + 4x_3 = 1 \\
P_3: & 8x_0 + 18x_1 - 2x_2 + 3x_3 = 40
\end{array}
\]

  • Communication needed to find a pivot row in each iteration
    • All-reduce (pmax, prow)
  • Communication needed to zero off (elimination): all processors need to know all elements in the pivot row
  • Parallel elimination can be done after all processors have the pivot row

All-Reduce Communication

• Hypercube (Find MAX)
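An all-reduce for the maximum on a hypercube can be simulated sequentially. In each of the $\log_2 p$ steps, every processor exchanges its current candidate with the neighbor whose rank differs in one bit, and both keep the better one. The sketch below assumes the number of processors is a power of two; the function name `hypercube_allreduce_max` and the tie-breaking rule (lower rank wins) are choices made for this illustration.

```python
def better(a, b):
    """Pick the (pmax, prow) pair with larger pmax; on ties, the lower row."""
    return a if (a[0], -a[1]) >= (b[0], -b[1]) else b


def hypercube_allreduce_max(values):
    """Simulate a hypercube all-reduce; returns each processor's final (pmax, prow)."""
    p = len(values)
    if p == 0 or p & (p - 1):
        raise ValueError("number of processors must be a power of two")
    # Processor `rank` starts with its own candidate (|value|, rank).
    state = [(abs(v), rank) for rank, v in enumerate(values)]
    for d in range(p.bit_length() - 1):
        # Exchange along dimension d: the partner differs in bit d of the rank.
        state = [better(state[rank], state[rank ^ (1 << d)])
                 for rank in range(p)]
    return state


# Column 0 of the example system, one entry per processor.
result = hypercube_allreduce_max([4, 2, -4, 8])
print(result)
```

After two exchange steps all four simulated processors hold the same pair, so each of them knows both the pivot value and which processor owns the pivot row.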

Gaussian Elimination

Parallel Design

• Gaussian elimination with partial pivoting: algorithm

for i from 0 to n-1 do
    // find pivot row
    find max(pmax, prow) using all-reduce
    P(i) swaps its row (or index) with P(prow)
    // elimination
    P(i) broadcasts its row to the others
    for j from i+1 to n-1 do in parallel
        P(j) computes m[j]
        for k from i to n do
            P(j) computes a[j,k]
        end do
    end do
end do
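The row-wise parallel algorithm can be mimicked with one task per row. In this sketch a thread pool stands in for the processors, a plain `max` stands in for the all-reduce, and a tuple copy of the pivot row stands in for the broadcast; none of this is real message passing, and the name `parallel_eliminate` is an assumption for the illustration. A nonsingular matrix is assumed.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_eliminate(aug, workers=4):
    """Reduce the augmented matrix to upper-triangular form, one task per row."""
    n = len(aug)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for i in range(n):
            # Find the pivot row (stand-in for the all-reduce of (pmax, prow)).
            prow = max(range(i, n), key=lambda j: abs(aug[j][i]))
            aug[i], aug[prow] = aug[prow], aug[i]
            pivot_row = tuple(aug[i])   # stand-in for broadcasting the pivot row

            def eliminate(j, i=i, pivot_row=pivot_row):
                # P(j) computes its multiplier and updates only its own row.
                m = aug[j][i] / pivot_row[i]
                for k in range(i, n + 1):
                    aug[j][k] -= pivot_row[k] * m

            list(pool.map(eliminate, range(i + 1, n)))   # wait for all rows
    return aug


tri = parallel_eliminate([[4.0, 6.0, 2.0, -2.0, 8.0],
                          [2.0, 0.0, 5.0, -2.0, 4.0],
                          [-4.0, -3.0, -5.0, 4.0, 1.0],
                          [8.0, 18.0, -2.0, 3.0, 40.0]])
for r in tri:
    print(r)
```

Each task writes only to its own row and reads only the broadcast copy of the pivot row, so the row updates within one iteration do not interfere with each other.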

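Question

• What is the purpose of all-reduce communication? Give an example.
  • At the end of the communication, all processors have the same reduced value (min, max, etc.). For example, when finding the pivot row, every processor ends up with the same (pmax, prow).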

References

• [1] Seyed H. Roosta, Parallel Processing and Parallel Algorithms: Theory and Computation, Springer-Verlag, 2000.
• [2] Michael J. Quinn, Parallel Programming in C with MPI and OpenMP, Tata McGraw-Hill, 2003.