ADVANCED COMPUTER ARCHITECTURE Notes - Interconnection Networks and Clusters - 3, Study notes of Advanced Computer Architecture

Notes on: Interconnection Networks Media, twisted pairs, copper wires, Multimode fiber, full duplex, Commercial Interconnection Networks, Connecting the Network to the Computer, Cross-Company Interoperability, Message Failure Tolerance, Node Failure Tolerance, Clusters, Designing a Cluster, SDRAM, RAID

Typology: Study notes

2010/2011

Uploaded on 09/01/2011

punjabforever

8. Interconnection Networks and Clusters
1. Interconnection Networks Media
There is a hierarchy of media to interconnect computers that varies in cost,
performance, and reliability. Network media have another figure of merit, the maximum
distance between nodes. This section covers three popular examples, and Figure 8.11
illustrates them.
Category 5 Unshielded Twisted Pair ("Cat5"):
The first medium is twisted pairs of copper wires. These are two insulated wires, each about
1 mm thick. They are twisted together to reduce electrical interference, since two parallel lines
form an antenna but a twisted pair does not. As they can transfer a few megabits per second
over several kilometers without amplification, twisted pairs were the mainstay of the telephone
system. Telephone companies bundled together (and sheathed) many pairs coming into a
building. Twisted pairs can also offer tens of megabits per second of bandwidth over shorter
distances, making them plausible for LANs.

The original telephone-line quality was called Level 1. Level 3 was good enough for 10 Mbits/second Ethernet. The desire for even greater bandwidth led to Level 5, or Category 5, which is sufficient for 100 Mbits/second Ethernet. By limiting the length to 100 meters, “Cat5” wiring can be used for 1000 Mbits/second Ethernet links today. It uses the RJ-45 connector, which is similar to the connector found on telephone lines.

Coaxial cable was deployed by cable television companies to deliver a higher rate over a few kilometers. To offer high bandwidth and good noise immunity, insulating material surrounds a single stiff copper wire, and then a cylindrical conductor, often woven as a braided mesh, surrounds the insulator. A 50-ohm baseband coaxial cable delivers 10 megabits per second over a kilometer.

The third transmission medium is fiber optics, which transmits digital data as pulses of light. A fiber optic network has three components: (1) the transmission medium, a fiber optic cable; (2) the light source, an LED or laser diode; and (3) the light detector, a photodiode. First, cladding surrounds the glass fiber core to confine the light. A buffer then surrounds the cladding to protect the core and cladding. Note that unlike twisted pairs or coax, fibers are one-way, or simplex, media. A two-way, or full duplex, connection between two nodes requires two fibers.

Since light bends or refracts at interfaces, it can slowly spread as it travels down the cable unless the diameter of the cable is limited to one wavelength of light; then it transfers in a straight line. Thus, fiber optic cables are of two forms:

  1. Multimode fiber: It uses inexpensive LEDs as a light source. Its core is much larger than the wavelength of light, typically 62.5 microns in diameter vs. the 1.3-micron wavelength of infrared light. Since it is wider, it has more dispersion problems, where some wave frequencies have different propagation velocities. The LEDs and dispersion limit it to a few hundred meters at 1000 Mbits/second or a few kilometers at 100 Mbits/second. It is older and less expensive than single-mode fiber.

goals, all LANs and WANs plug into the I/O bus. The location of the network connection significantly affects the software interface to the network as well as the hardware. A memory bus is more likely to be cache-coherent than an I/O bus and therefore more likely to avoid these extra cache flushes. DMA is the best way to send large messages. Whether to use DMA to send small messages depends on the efficiency of the interface to the DMA. The DMA interface is usually memory-mapped, and so each interaction is typically at the speed of main memory rather than of a cache access.

Standardization: Cross-Company Interoperability

LANs and WANs use standards and interoperate effectively. WANs involve many types of companies and must connect to many brands of computers, so it is difficult to imagine a proprietary WAN ever being successful. The ubiquitous nature of the Ethernet shows the popularity of standards for LANs as well as WANs, and it seems unlikely that many customers would tie the viability of their LAN to the stability of a single company.

Message Failure Tolerance

The communication system must have mechanisms for retransmission of a message in case of failure. Often it is handled in higher layers of the software protocol at the end points, requiring retransmission at the source. Given the long time of flight for WANs, often they can retransmit from hop to hop rather than relying only on retransmission from the source.

Node Failure Tolerance

The second practical issue refers to whether or not the interconnection relies on all the nodes being operational in order for the interconnection to work properly. Since software failures are generally much more frequent than hardware failures, the question is whether a software crash on a single node can prevent the rest of the nodes from communicating. Clearly, WANs would be useless if they demanded that thousands of computers spread across a continent be continuously available, and so they all tolerate the failures of individual nodes. LANs connect dozens to hundreds of computers together, and again it would be impractical to require that no computer ever fail. All successful LANs normally survive node failures.
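The trade-off described under Message Failure Tolerance can be made concrete with a rough model. The sketch below is an illustration only and rests on assumptions that are not in the notes: every hop loses a message independently with the same probability `loss`, acknowledgments and timeouts cost nothing, and the function names are hypothetical. It counts the expected number of link transmissions needed to deliver one message.

```python
def hop_by_hop_transmissions(hops: int, loss: float) -> float:
    """Expected link transmissions when each hop retransmits locally."""
    # Each hop repeats until it succeeds: 1 / (1 - loss) tries per hop.
    return hops / (1.0 - loss)


def end_to_end_transmissions(hops: int, loss: float) -> float:
    """Expected link transmissions when only the source retransmits."""
    if loss == 0.0:
        return float(hops)
    ok = 1.0 - loss
    path_ok = ok ** hops
    # Links used by one attempt: the message crosses link i only if the
    # previous i - 1 links all delivered it.
    links_per_attempt = (1.0 - path_ok) / loss
    # The source repeats whole attempts until one crosses every hop.
    attempts = 1.0 / path_ok
    return links_per_attempt * attempts


if __name__ == "__main__":
    for hops in (1, 5, 20):
        print(hops,
              round(hop_by_hop_transmissions(hops, 0.1), 2),
              round(end_to_end_transmissions(hops, 0.1), 2))
```

Under these assumptions the two schemes cost the same on a single hop, and the gap widens as the path grows, because a source-only retransmission repeats work on links that had already succeeded. Since every link crossing adds to the time of flight, this is in line with the preference for hop-to-hop retransmission on long WAN paths.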

Clusters

There are many mainframe applications (such as databases, file servers, Web servers, simulations, and multiprogramming/batch processing) amenable to running on more loosely coupled machines than the cache-coherent NUMA machines. These applications often need to be highly available, requiring some form of fault tolerance and repairability. Such applications, plus the similarity of the multiprocessor nodes to desktop computers and the emergence of high-bandwidth, switch-based local area networks, led to clusters of off-the-shelf, whole computers for large-scale processing.

Performance Challenges of Clusters

One drawback is that clusters are usually connected using the I/O bus of the computer, whereas multiprocessors are usually connected on the memory bus of the computer. The memory bus has higher bandwidth and much lower latency, allowing multiprocessors to drive the network link at higher speed and to have fewer conflicts with I/O traffic on I/O-intensive applications. This connection point also means that clusters generally use software-based communication while multiprocessors use hardware for communication. However, connecting on the memory bus makes the connection nonstandard and hence more expensive.

A second weakness is the division of memory: a cluster of N machines has N independent memories and N copies of the operating system, but a shared-address multiprocessor allows a single program to use almost all the memory in the computer. Thus, a sequential program in a cluster has 1/N th the memory available compared to a sequential program in a shared-memory multiprocessor. Interestingly, the drop in DRAM prices has made memory costs so low that this multiprocessor advantage is much less important in 2001 than it was in 1995. The primary issue in 2001 is whether the maximum memory per cluster node is

Inktomi, WebTV, and Yahoo rely on clusters of PCs or workstations to provide services used by millions of people every day. Clusters are growing in popularity in the scientific computing market as well. Figure 8.30 shows the mix of architecture styles between 1993 and 2000 for the top 500 fastest scientific computers. One attraction is that individual scientists can afford to construct clusters themselves, allowing them to dedicate their cluster to their problem. CPU time on shared supercomputers is handed out in monthly allocations, so it is plausible for a scientist to get more work done from a private cluster than from a shared supercomputer. It is also relatively easy for scientists to scale their computing over time as they get more money for computing.

Designing a Cluster:

Consider a system with about 32 processors, 32 GB of DRAM, and 32 or 64 disks. Figure 8.33 lists the components we use to construct the cluster, including their prices.

IBM model name                                         xSeries 300  xSeries 330  xSeries 370
Maximum number processors per box                      1            2            8
Pentium III Processor Clock Rate (MHz)                 1000         1000         700
L2 Cache (KB)                                          256          256          1,
Price of base computer with 1 Processor                $1,759       $1,939       $14,
Price per extra Processor                              n.a.         $799         $1,
Price per 256 MB SDRAM DIMM                            $159         $269         $
Price per 512 MB SDRAM DIMM                            $549         $749         $1,
Price per 1024 MB SDRAM DIMM                           n.a.         $1,689       $2,
IBM 36.4 GB 10K RPM Ultra160 SCSI                      $579         $639         $
IBM 73.4 GB 10K RPM Ultra160 SCSI                      n.a.         $1,299       $1,
PCI slots: 32bit,33 MHz / 64bit,33 MHz / 64bit,66 MHz
Rack space (VME Rack Units)                            1            1            8
Power Supply                                           200 W        200 W        3 x 750 W
Emulex cLAN-1000 Host Adapter ( Gbit)
Emulex cLAN5000 8-port switch                          $6,280       $6,280       $6,
Emulex cLAN5000 Rack space (R.U.)                      1            1            1
Emulex cLAN5300 30-port switch                         $15,995      $15,995      $15,
Emulex cLAN5300 Rack space (R.U.)                      2            2            2
Emulex cLAN-1000 10-meter cable                        $135         $135         $
Extra PCI Ultra160 SCSI Adapter                        $299         $299         $
EXP300 Storage Enclosure (up to 14 disks)
EXP300 Rack space (VME Rack Units)                     3            3            3
Ultra2 SCSI 4-meter cable                              $105         $105         $
Standard 19-in Rack (44 VME Rack Units)
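As a rough illustration of how the Figure 8.33 prices combine for the target of 32 processors, 32 GB of DRAM, and 32 disks, the sketch below totals only the per-box hardware. It uses just the prices that appear in full above, so the xSeries 370 is left out. Everything else is an assumption made for the example and not the notes' own configuration: boxes are fully populated with processors, all DRAM and disks are bought as add-ons (whatever the base price already includes is ignored), the 36.4 GB disks fit inside the boxes, and network adapters, switches, cables, enclosures, and racks are excluded. The names `PRICES` and `node_hardware_cost` are hypothetical.

```python
# Prices in US dollars, copied from the complete entries of Figure 8.33.
PRICES = {
    "xSeries 300": {"cpus_per_box": 1, "base": 1759, "extra_cpu": 0,
                    "dimm": (512, 549),     # largest DIMM offered: 512 MB
                    "disk_36gb": 579},
    "xSeries 330": {"cpus_per_box": 2, "base": 1939, "extra_cpu": 799,
                    "dimm": (1024, 1689),   # largest DIMM offered: 1024 MB
                    "disk_36gb": 639},
}


def node_hardware_cost(model, total_cpus=32, total_dram_mb=32 * 1024,
                       total_disks=32):
    """Cost of boxes, processors, DIMMs, and disks only (assumed setup)."""
    p = PRICES[model]
    boxes = total_cpus // p["cpus_per_box"]
    # The base price covers one processor; the rest are extras.
    cpu_cost = (p["cpus_per_box"] - 1) * p["extra_cpu"]
    dimm_mb, dimm_price = p["dimm"]
    dimm_cost = (total_dram_mb // boxes // dimm_mb) * dimm_price
    disk_cost = (total_disks // boxes) * p["disk_36gb"]
    return boxes * (p["base"] + cpu_cost + dimm_cost + disk_cost)


if __name__ == "__main__":
    for model in PRICES:
        print(model, node_hardware_cost(model))
```

Under these assumptions the 32 uniprocessor boxes total $109,952 and the 16 two-way boxes total $118,304, before any networking or rack costs are added.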

Figure 8.33 confirms some of the philosophical points of the prior section. Note the difference in processor cost and speed between the smaller systems and the larger multiprocessor. In addition, the price per DRAM DIMM goes up with the size of the computers. The higher price of the DRAM is harder to explain based on cost. For example, all include ECC. The uniprocessor uses 133 MHz SDRAM and the 2-way and 8-way both use registered

Hence, the system is down on a disk failure until the operator arrives, and there is no separate visibility or access to storage. This second example centralizes the disks behind a RAID controller, in each case using FC-AL as the Storage Area Network.

Third Example: Accounting for Other Costs

The first and second examples only calculated the cost of the hardware. There are two other obvious costs not included: software and the cost of a maintenance agreement for the hardware.

Fourth Example: Cost and Performance of a Cluster for Transaction Processing:

This cluster also has 32 processors, uses the same IBM computers as building blocks, and uses the same switch to connect computers together.

Disk size: Since TPC-C cares more about I/Os per second (IOPS) than disk capacity, this cluster uses many small, fast disks. The use of small disks gives many more IOPS for the same capacity. These disks also rotate at 15000 RPM vs. 10000 RPM, delivering more IOPS per disk.

RAID: Since the TPC-C benchmark does not factor in human costs for running a system, there is little incentive to use a SAN. TPC-C does require RAID protection of disks, however. IBM used a RAID product that plugs into a PCI card and provides four SCSI strings. To get higher availability and performance, each enclosure attaches to two SCSI buses.

Memory: Conventional wisdom for TPC-C is to pack as much DRAM as possible into the servers. Hence, each of the four 8-way SMPs is stuffed with the maximum of 32 GB, yielding a total of 128 GB.

Processor: This benchmark uses the 900 MHz Pentium III with a 2 MB L2 cache. The price is $6599, compared with $1799 for the 700 MHz Pentium III with a 1 MB L2 cache used in the prior 8-way clusters.

PCI slots: This cluster uses 7 of the 12 available PCI bus slots for the RAID controllers, compared to 1 PCI bus slot for an external SCSI or FC-AL controller in the prior 8-way clusters. This greater utilization follows the guideline of trying to use all resources of a large SMP.

Tape Reader, Monitor, Uninterruptible Power Supply: To make the system easier to bring up and to keep running for the benchmark, IBM includes one DLT tape reader, four monitors, and four UPSs.

Maintenance and spares: TPC-C allows the use of spares to reduce maintenance costs; the requirement is a minimum of two spares or 10% of the items. Hence, there are two spare Ethernet switches, host adapters, and cables for TPC-C.
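The disk-size reasoning above (more spindles and faster rotation give more IOPS for the same capacity) can be sketched with a simple service-time model. This is an approximation under stated assumptions, not the benchmark's method: a random I/O is taken to cost one average seek plus half a rotation, transfer time and controller overhead are ignored, and the seek time, disk sizes, and total capacity in the example are hypothetical values, not figures from the notes.

```python
import math


def avg_rotational_latency_ms(rpm: float) -> float:
    """Average wait for the sector: half of one rotation, in milliseconds."""
    return 0.5 * 60_000 / rpm


def iops_per_disk(rpm: float, avg_seek_ms: float) -> float:
    """Random I/Os per second, ignoring transfer and controller time."""
    return 1000 / (avg_seek_ms + avg_rotational_latency_ms(rpm))


def total_iops(capacity_gb: float, disk_gb: float, rpm: float,
               avg_seek_ms: float) -> float:
    """Aggregate IOPS of enough identical disks to reach capacity_gb."""
    disks = math.ceil(capacity_gb / disk_gb)
    return disks * iops_per_disk(rpm, avg_seek_ms)


if __name__ == "__main__":
    # Hypothetical comparison: same 288 GB built from large or small disks.
    print(total_iops(288, 36, 10_000, avg_seek_ms=5.0))
    print(total_iops(288, 9, 15_000, avg_seek_ms=5.0))
```

Half a rotation takes 3 ms at 10000 RPM and 2 ms at 15000 RPM, so each faster disk completes more I/Os per second, and building the same capacity from smaller disks multiplies the number of spindles working in parallel.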