
Fall 2006 ECE 4100/6100
Assignment 4
Due Date: 9 pm, Friday December 8th, 2006
For this assignment you must pick exactly one of two options. Option 1
below consists of three parts. Option 2 is a programming assignment for
IBM’s Cell processor using a cycle-accurate simulator. Note the requirements for Option
2.
Option 1:
Answer all of the following questions.
1. Compute the bisection bandwidth of an N = 2^n node binary hypercube with single-bit
channels. Assume full duplex links, each of width 1 bit.
a. Assuming full duplex channels of width W bits in each direction, find the
channel width of a k-ary n-cube that will saturate this bisection width.
b. Assuming L bits/message and switching, routing and wire latency of 1 cycle,
and L an integer multiple of W, construct an analytic model of latency, and then
use this expression to find the number of dimensions that will minimize
latency for a fixed bisection bandwidth when using wormhole switching.
Make any assumptions you feel you have to.
2. Read the Cell paper “Cell Multiprocessor Communication Network: Built for Speed” (posted on the
class website). From this paper, provide a detailed description (not a reproduction of
paragraphs from the paper) of the transfer of blocks of data, in both directions,
between the PowerPC and the SPE co-processor. Your description should be
supported by a figure that labels the sequence of steps involved in each transfer.
3. The URL http://www.pcisig.com/home provides information on the PCI Express
communication protocol. Browse the site, find the relevant documents and compare
and contrast this protocol with the HyperTransport protocol (www.hypertransport.org)
according to the following features:
a. The goals of the protocol: What are the anticipated application domains and
what technical needs is the protocol intended to fill?
b. Packet structure: Describe and differentiate the purpose of the various packet
fields. Describe the need for the fields in the context of the preceding bullet.
c. Addressing: Range and type of devices that can be addressed.
d. Physical link and physical link protocol operation.
e. If you had to construct a shared-memory, cache-coherent multiprocessor
packaged over many boards in a rack, and had to layer a coherency protocol
over a physical layer, which of the above two communication standards
(HyperTransport and PCI Express) would you pick, and why? Structure your
answer as a sequence of bullets rather than an essay; be concise.
4. Submit the assignment electronically to the TA.
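
As a sanity check for a hand derivation in question 1, a short script can count the links that cross one particular cut of a binary hypercube. The sketch below is only an illustration and not part of the required submission. The function name hypercube_cut_links and the choice of cut (splitting the nodes on the most significant address bit) are assumptions made here. How the link count converts to bandwidth depends on whether you count one or both directions of each full duplex link.

```python
def hypercube_cut_links(n):
    """Count the undirected links of an n-dimensional binary hypercube that
    cross the cut separating nodes by their most significant address bit.

    Assumption: nodes are numbered 0 .. 2**n - 1 and two nodes are linked
    exactly when their addresses differ in one bit.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    num_nodes = 1 << n
    top_bit = 1 << (n - 1)
    crossing = 0
    for node in range(num_nodes):
        for dim in range(n):
            neighbor = node ^ (1 << dim)  # flip one address bit
            # Count each undirected link once (node < neighbor) and only
            # if its endpoints lie on opposite sides of the cut.
            if node < neighbor and (node & top_bit) != (neighbor & top_bit):
                crossing += 1
    return crossing


# Print the node count and the number of crossing links for small n.
for n in range(1, 6):
    print(n, 1 << n, hypercube_cut_links(n))
```

Compare the printed link counts against your closed-form expression in terms of N, then apply your own convention for channel width and direction to obtain a bandwidth.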