Multistage Interconnection Connection-Parallel Processing-Assignments, Exercises of Parallel Computing and Programming

This assignment was set by Prof. Rasul Rangarajan at Deenbandhu Chhotu Ram University of Science and Technology for the Parallel Processing course. It includes: NUMA, Parallel, Computer, Processor, Address, Destination, Source, Response, Data

Typology: Exercises

2011/2012

Uploaded on 07/23/2012

parama

Question # 1 (Marks 30)

A 128-node NUMA parallel computer has a 40-bit RISC processor with a clock rate of 200 MHz and 16 MB of local memory. The memory access time for a local load/store is 4 clock cycles, while an overhead of 12 clock cycles is required to initiate transmission of a request to a remote node. The bandwidth of the interconnection network is 75 MB/sec. In a set of programs, 10% of the instructions are loads and 8% are stores. If 200,000 instructions are executed, compute:

a) Load/store time if all accesses are to local nodes.

b) Repeat part (a) if 15% of accesses are to the remote node.

Request and response packet lengths are 5 and 7 bytes respectively. The format for request and response packets is given below.

Request block: Source address (8 bits) | Destination address (8 bits) | Address to memory (3 bytes)

Response block: Source address (8 bits) | Destination address (8 bits) | Data from memory (5 bytes)

Hint:

For accesses to a remote node, a fixed overhead for initiating the request plus the time to transmit or receive the packet is required in each direction.
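One possible way to set up the arithmetic for Question 1 is sketched below. It is not the official solution, and it rests on several assumptions that the question leaves open: 1 MB is taken as 10^6 bytes, every remote access (load or store) is taken to use one 5-byte request and one 7-byte response, the 12-cycle overhead is charged once in each direction as the hint suggests, and the remote node is assumed to spend the usual 4 cycles on the memory access itself (switchable through the hypothetical `include_remote_memory_access` flag). The function name and parameter names are illustrative only.

```python
def load_store_time(instructions=200_000, load_frac=0.10, store_frac=0.08,
                    remote_frac=0.0, clock_hz=200e6, local_cycles=4,
                    overhead_cycles=12, bandwidth_bytes_per_s=75e6,
                    request_bytes=5, response_bytes=7,
                    include_remote_memory_access=True):
    """Total time in seconds spent on loads and stores.

    Assumptions (not stated in the question): 1 MB = 10**6 bytes, one
    request plus one response packet per remote access, and the initiation
    overhead paid once per direction.
    """
    cycle = 1.0 / clock_hz                        # 5 ns at 200 MHz
    accesses = instructions * (load_frac + store_frac)

    local_time = local_cycles * cycle             # time of one local access

    # One remote access: overhead in both directions plus wire time
    # for the request and the response packet.
    remote_time = 2 * overhead_cycles * cycle
    remote_time += (request_bytes + response_bytes) / bandwidth_bytes_per_s
    if include_remote_memory_access:
        # Assumed: the remote node still needs the normal memory access time.
        remote_time += local_cycles * cycle

    return accesses * ((1 - remote_frac) * local_time
                       + remote_frac * remote_time)


part_a = load_store_time(remote_frac=0.0)
part_b = load_store_time(remote_frac=0.15)
print(f"a) {part_a * 1e6:.1f} us, b) {part_b * 1e6:.1f} us")
```

Under these assumptions the 36,000 loads and stores take about 720 microseconds when all are local and about 2,232 microseconds when 15% are remote. Dropping the assumed remote memory access time gives about 2,124 microseconds for part (b) instead.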

Question # 2 (Marks 30)

a) Draw the block diagram (switches and links) of a 16 x 16 Omega multistage interconnection network with a shuffle followed by an exchange at each stage.

b) Give the distributed self-routing algorithm for an n x n Omega network and use it to find a path from source node # 14 (node numbering starts from 0) to node # 3.

c) Find another path that would block (conflict with) the path in part (b).

d) Use the shuffle/exchange functions to find a path from node # 13 to node # 7.

e) Compare the switch and link cost and the delay of an N x N MIN and a crossbar network. Compute these values for a 1024 x 1024 network of each type.
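The routing and cost parts of Question 2 can be explored with a short sketch like the one below. It assumes the common destination-tag formulation of Omega self-routing: at each stage the address is perfect-shuffled (rotated left by one bit) and the 2 x 2 switch then sets the low bit to the next destination bit, most significant bit first. The function names `omega_route`, `conflicts`, and `switch_counts` are hypothetical, two paths are treated as blocking when they need the same switch output at the same stage, and the cost figures count only 2 x 2 switches and crosspoints, since link-counting conventions vary between textbooks.

```python
def omega_route(src, dst, n_bits):
    """Line number occupied after each stage, using destination-tag routing."""
    mask = (1 << n_bits) - 1
    pos, path = src, []
    for stage in range(n_bits):
        # Perfect shuffle: rotate the n-bit address left by one position.
        pos = ((pos << 1) | (pos >> (n_bits - 1))) & mask
        # Exchange: the switch sets the low bit to the destination bit
        # for this stage, most significant bit first.
        dst_bit = (dst >> (n_bits - 1 - stage)) & 1
        pos = (pos & ~1) | dst_bit
        path.append(pos)
    return path


def conflicts(path_a, path_b):
    """Stages at which both paths need the same switch output link."""
    return [s for s, (a, b) in enumerate(zip(path_a, path_b)) if a == b]


def switch_counts(n_ports):
    """Switch and stage counts for an Omega MIN versus crossbar crosspoints.

    Assumes n_ports is a power of two and the MIN is built from 2x2 switches.
    """
    stages = n_ports.bit_length() - 1          # log2(N)
    return {
        "min_switches": (n_ports // 2) * stages,
        "min_stages": stages,
        "crossbar_crosspoints": n_ports ** 2,
    }


print(omega_route(14, 3, 4))    # path for part (b)
print(omega_route(13, 7, 4))    # path for part (d)
print(conflicts(omega_route(14, 3, 4), omega_route(6, 2, 4)))
print(switch_counts(1024))
```

With this model, the route from node 14 to node 3 passes through lines 12, 8, 1, and 3, and a request from node 6 to node 2 is one example of a path that collides with it, because both need output 12 of the same first-stage switch. For 1024 ports the sketch reports 5,120 two-by-two switches across 10 stages for the MIN against 1,048,576 crosspoints for the crossbar.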

Question # 3 (Marks 30)