

HPC Notes. Introduces HPC basics
Typology: Study notes


Most high performance systems are based on Reduced Instruction Set Computer (RISC) processors. High performance RISC processors are designed to be easily inserted into a multiple-processor system in which 2 to 64 CPUs access a single memory using symmetric multiprocessing (SMP). Each processor is very powerful, and a small number of processors can be put into a single enclosure.

Often applications need to span multiple enclosures. In such cases, the enclosures are linked with a high-speed network to function as a network of workstations (NOW). The machines in a NOW can be used individually through a batch queueing system, or together as one large multicomputer using a message passing tool such as Parallel Virtual Machine (PVM) or the Message Passing Interface (MPI).

Scalable parallel processing systems with hundreds or thousands of processors come in two flavours. The first type is programmed using message passing. Its processors are connected by a proprietary, scalable, high-bandwidth, low-latency interconnect. Because of this high performance interconnect, these systems can scale to thousands of processors while keeping the time spent on communication overhead to a minimum.

The second type is the scalable non-uniform memory access (NUMA) system. These systems also use a high performance interconnect, in this case to implement a distributed shared memory that can be accessed from any processor using a load/store paradigm. Programming them is similar to programming SMP systems, except that some areas of memory are slower to access than others.

High Performance Microprocessors:

A complex instruction set computer (CISC) instruction set is made of powerful primitives, close in functionality to the primitives of high-level languages like C or FORTRAN. It captures the sense of "don't do in software what you can do in hardware". RISC emphasizes low-level primitives, far below the complexity of a high-level language.
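
The contrast between powerful CISC primitives and low-level RISC primitives can be sketched with a toy model. The instruction names (ADD_MEM, LOAD, ADD, STORE), the register names, and the dictionary used as memory are all hypothetical, chosen only for illustration; they are not taken from any real instruction set. The sketch computes c = a + b once with a single memory-to-memory instruction and once with register-only arithmetic plus explicit loads and stores, and counts the instructions in each program.

```python
# Toy model only: the instruction names and machine layout are hypothetical,
# not taken from any real CISC or RISC instruction set.

def run_cisc(memory):
    """One 'powerful' memory-to-memory instruction: mem[c] = mem[a] + mem[b]."""
    program = [("ADD_MEM", "c", "a", "b")]
    for op, dst, src1, src2 in program:
        if op == "ADD_MEM":
            memory[dst] = memory[src1] + memory[src2]
    return len(program)  # number of instructions fetched


def run_risc(memory):
    """The same work using simple primitives: loads, a register add, a store."""
    program = [
        ("LOAD", "r1", "a"),        # r1 = mem[a]
        ("LOAD", "r2", "b"),        # r2 = mem[b]
        ("ADD", "r3", "r1", "r2"),  # r3 = r1 + r2
        ("STORE", "r3", "c"),       # mem[c] = r3
    ]
    regs = {}
    for instr in program:
        op = instr[0]
        if op == "LOAD":
            regs[instr[1]] = memory[instr[2]]
        elif op == "ADD":
            regs[instr[1]] = regs[instr[2]] + regs[instr[3]]
        elif op == "STORE":
            memory[instr[2]] = regs[instr[1]]
    return len(program)  # number of instructions fetched


cisc_mem = {"a": 2, "b": 3, "c": 0}
risc_mem = {"a": 2, "b": 3, "c": 0}
cisc_count = run_cisc(cisc_mem)
risc_count = run_risc(risc_mem)
print(cisc_count, risc_count, cisc_mem["c"], risc_mem["c"])
```

Both programs leave the same result in memory; only the number of instructions differs. The model says nothing about speed, since it ignores how long each instruction takes to execute.
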
A RISC program therefore takes more machine instructions than its CISC equivalent to perform the same computation.

Why CISC?

In the past, the design variables favoured CISC. Fifty years ago, high-level language compilers didn't generate the fastest code, and they weren't thrifty with memory. Hence, programming was done in assembly language. A good instruction set was both easy to use and powerful. "Powerful" instructions accomplished a lot and saved the programmer from specifying many little steps, which made them easy to use. An instruction that could roll all the steps of a complex operation, such as a do-loop, into a single opcode was a plus, because it saved memory, and memory was precious.

Complex instructions saved time too. When a single instruction can perform several operations, the overall number of instructions retrieved from memory can be reduced. Minimizing the number of instructions was important because, with few exceptions, the machines of the late 1950s were very sequential: not until the current instruction was completed did the computer initiate the process of going out to memory to get the next instruction. Modern machines form a bucket brigade, passing instructions in from memory and figuring out what they do on the way, so that there are fewer gaps in processing. Assembly language programmers used the complicated machine instructions, but compilers generally did not.

Fundamentals of RISC:

The following factors contributed to the growth of RISC:
(i) The number of transistors that could fit on a single chip was increasing. Eventually, one would be able to fit all the components of a processor board onto a single chip.
(ii) Techniques like pipelining were being explored to improve performance. Variable-length instructions and variable instruction execution times made implementing pipelines more difficult.
(iii) As compilers improved, it became clear that well-optimized sequences of streamlined
instructions often outperformed the equivalent complicated multi-cycle instructions.

The RISC designers sought to create a high performance single-chip processor with a fast clock rate, which made it necessary to discard the existing CISC instruction sets and develop a new, minimal instruction set that could fit on a single chip. For the first generation of RISC chips, the restrictions on the number of components that could be manufactured on a single chip were severe, forcing the designers to leave out hardware support for some operations, such as floating-point arithmetic and integer multiply. These operations could instead be implemented as software routines that combined other instructions.

The earliest RISC processors were not successful for the following reasons: