



A concise overview of multiprocessor systems and parallel processing concepts. It covers key definitions and distinctions, including task-level parallelism, multicore microprocessors, shared memory multiprocessors, and hardware multithreading. It also addresses message passing, clusters, network topologies relevant to multiprocessor architectures, and basic I/O schemes. The material is structured as a series of questions and answers, which makes it a useful study aid for computer architecture principles and their practical implications in modern computing systems. It also touches on Amdahl's Law, data-level parallelism, and the role of GPUs.
Typology: Exams




Multiprocessor - correct answer ✔✔A computer system with at least two processors, in contrast to a uniprocessor, which has one and is increasingly hard to find today.
Task-level parallelism or process-level parallelism - correct answer ✔✔Utilizing multiple processors by running independent programs simultaneously.
Parallel processing program - correct answer ✔✔A single program that runs on multiple processors simultaneously.
Cluster - correct answer ✔✔A set of computers connected over a local area network that function as a single large multiprocessor.
Multicore microprocessor - correct answer ✔✔A microprocessor containing multiple processors ("cores") in a single integrated circuit. Virtually all microprocessors today in desktops and servers are multicore.
Shared memory multiprocessor (SMP) - correct answer ✔✔A parallel processor with a single physical address space.
To benefit from a multiprocessor, an application must be concurrent. - correct answer ✔✔False: Task-level parallelism can help sequential applications, and sequential applications can be made to run on parallel hardware, although it is more challenging.
Strong scaling - correct answer ✔✔Speed-up achieved on a multiprocessor without increasing the size of the problem.
Weak scaling - correct answer ✔✔Speed-up achieved on a multiprocessor while increasing the size of the problem proportionally to the increase in the number of processors.
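The difference between the two scaling definitions can be made concrete with a small calculation. The sketch below is illustrative only: it uses Amdahl's Law for the fixed-size (strong scaling) case and, as an assumption not stated in these notes, Gustafson's scaled-speedup formula as one common model of the weak scaling case. The function names and the 10% serial fraction are hypothetical.

```python
def strong_scaling_speedup(serial_fraction, processors):
    """Amdahl's Law: speed-up for a fixed problem size."""
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def weak_scaling_speedup(serial_fraction, processors):
    """Gustafson's scaled speed-up (an assumed model for weak scaling):
    the parallel part of the work grows with the processor count."""
    parallel_fraction = 1.0 - serial_fraction
    return serial_fraction + parallel_fraction * processors


if __name__ == "__main__":
    # Hypothetical program in which 10% of the work is serial.
    for p in (1, 10, 100, 1000):
        print(p,
              round(strong_scaling_speedup(0.1, p), 2),
              round(weak_scaling_speedup(0.1, p), 2))
```

With a 10% serial fraction, the strong scaling speed-up can never exceed 10 no matter how many processors are added, while the scaled speed-up keeps growing because the problem grows with the machine.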
Strong scaling is not bound by Amdahl's Law. - correct answer ✔✔False: Weak scaling can compensate for a serial portion of the program that would otherwise limit scalability, but strong scaling cannot.
SISD or single instruction stream, single data stream - correct answer ✔✔A uniprocessor.
MIMD or multiple instruction streams, multiple data streams - correct answer ✔✔A multiprocessor.
SPMD or single program, multiple data streams - correct answer ✔✔The conventional MIMD programming model, where a single program runs across all processors.
Data-level parallelism - correct answer ✔✔Parallelism achieved by performing the same operation on independent data.
Vector lane - correct answer ✔✔One or more vector functional units and a portion of the vector register file. Inspired by lanes on highways that increase traffic speed, multiple lanes execute vector operations simultaneously.
As exemplified in the x86, multimedia extensions can be thought of as a vector architecture with short vectors that supports only contiguous vector data transfers. - correct answer ✔✔True, but they are missing useful vector features like gather-scatter and vector length registers that improve the efficiency of vector architectures. (As an elaboration in this section mentions, the AVX2 SIMD extensions offer indexed loads via a gather operation but not scatter for indexed stores. The Haswell generation x86 microprocessor is the first to support AVX2.)
Hardware multithreading - correct answer ✔✔Increasing utilization of a processor by switching to another thread when one thread is stalled.
Thread - correct answer ✔✔A thread includes the program counter, the register state, and the stack. It is a lightweight process; whereas threads commonly share a single address space, processes don't.
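The thread definition can be illustrated with ordinary software threads. This is a minimal sketch, not a model of hardware multithreading: it only shows that each thread has its own stack (its local variables) while all threads read and write one shared address space. The names `run_workers` and `shared_totals` are hypothetical.

```python
import threading


def run_workers(count):
    # One list in the single address space that every thread shares.
    shared_totals = [0] * count

    def worker(index):
        # local_total lives on this thread's private stack.
        local_total = sum(range(index + 1))
        # Each thread writes only its own slot, so no lock is needed here.
        shared_totals[index] = local_total

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared_totals


if __name__ == "__main__":
    print(run_workers(4))  # prints [0, 1, 3, 6]
```

Separate processes would each get their own copy of the list, so the parent would not see the workers' writes without explicit communication.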
Shared memory multiprocessors cannot take advantage of task-level parallelism. - correct answer ✔✔False: Since the shared address is a physical address, multiple tasks each in their own virtual address spaces can run well on a shared memory multiprocessor.
GPUs rely on graphics DRAM chips to reduce memory latency and thereby increase performance on graphics applications. - correct answer ✔✔False: Graphics DRAM chips are prized for their higher bandwidth.
Message passing - correct answer ✔✔Communicating between multiple processors by explicitly sending and receiving information.
Send message routine - correct answer ✔✔A routine used by a processor in machines with private memories to pass a message to another processor.
Receive message routine - correct answer ✔✔A routine used by a processor in machines with private memories to accept a message from another processor.
Clusters - correct answer ✔✔Collections of computers connected via I/O over standard network switches to form a message-passing multiprocessor.
Software as a service (SaaS) - correct answer ✔✔Rather than selling software that is installed and run on customers' own computers, software is run at a remote site and made available over the Internet, typically via a Web interface, to customers. SaaS customers are charged based on use rather than on ownership.
Like SMPs, message-passing computers rely on locks for synchronization. - correct answer ✔✔False: Sending and receiving a message is an implicit synchronization, as well as a way to share data.
Clusters have separate memories and thus need many copies of the operating system. - correct answer ✔✔True
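The send and receive routines, and the claim that a message is an implicit synchronization, can be sketched in a few lines. This is a simulation under stated assumptions: threads and a `queue.Queue` stand in for private-memory nodes and the network, and `send`, `receive`, `node`, and `parallel_sum` are hypothetical names rather than a real message-passing API.

```python
import queue
import threading


def send(mailbox, message):
    """Send routine: hand a message to the destination's mailbox."""
    mailbox.put(message)


def receive(mailbox):
    """Receive routine: block until a message arrives, then return it."""
    return mailbox.get()


def node(chunk, root_mailbox):
    # Each simulated node sums only its own chunk, then sends the result.
    send(root_mailbox, sum(chunk))


def parallel_sum(chunks):
    root_mailbox = queue.Queue()
    workers = [threading.Thread(target=node, args=(chunk, root_mailbox))
               for chunk in chunks]
    for w in workers:
        w.start()
    total = 0
    for _ in chunks:
        # The root waits here until a partial sum has been sent:
        # the receive itself is the synchronization, no explicit lock.
        total += receive(root_mailbox)
    for w in workers:
        w.join()
    return total


if __name__ == "__main__":
    print(parallel_sum([[1, 2], [3, 4], [5]]))  # prints 15
```

The root never touches a worker's data directly; it only learns each partial sum when the corresponding message arrives.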
Network bandwidth - correct answer ✔✔Informally, the peak transfer rate of a network; can refer to the speed of a single link or the collective transfer rate of all links in the network.
Bisection bandwidth - correct answer ✔✔The bandwidth between two equal parts of a multiprocessor. This measure is for a worst-case split of the multiprocessor.
Fully connected network - correct answer ✔✔A network that connects processor-memory nodes by supplying a dedicated communication link between every node.
Multistage network - correct answer ✔✔A network that supplies a small switch at each node.
Crossbar network - correct answer ✔✔A network that allows any node to communicate with any other node in one pass through the network.
For a ring with P nodes, the ratio of the total network bandwidth to the bisection bandwidth is P/2. - correct answer ✔✔True
Memory-mapped I/O - correct answer ✔✔An I/O scheme in which portions of the address space are assigned to I/O devices, and reads and writes to those addresses are interpreted as commands to the I/O device.
Direct memory access (DMA) - correct answer ✔✔A mechanism that provides a device controller with the ability to transfer data directly to or from the memory without involving the processor.
Interrupt-driven I/O - correct answer ✔✔An I/O scheme that employs interrupts to indicate to the processor that an I/O device needs attention.
Device driver - correct answer ✔✔A program that controls an I/O device that is attached to the computer.
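The ring question in the network cards can be checked with a short calculation. The sketch below assumes every link has the same bandwidth (counted as 1 unit), an even number of nodes P, and at least three nodes for the ring; the function names are hypothetical. A ring has P links and any split into two halves cuts 2 of them, while a fully connected network has P(P-1)/2 links and a split cuts (P/2)^2 of them.

```python
def ring_bandwidths(p):
    """Return (total, bisection) bandwidth of a ring in units of one link."""
    total = p        # one link per node around the ring
    bisection = 2    # cutting the ring into two halves severs two links
    return total, bisection


def fully_connected_bandwidths(p):
    """Return (total, bisection) bandwidth of a fully connected network."""
    total = p * (p - 1) // 2      # a dedicated link between every pair
    bisection = (p // 2) ** 2     # every node in one half links to every node in the other
    return total, bisection


if __name__ == "__main__":
    for p in (4, 8, 64):
        total, bisection = ring_bandwidths(p)
        # The ratio equals P/2, matching the "True" answer in the card.
        print(p, total / bisection, fully_connected_bandwidths(p))
```

For P = 64 the ring's ratio is 32, which shows how little of the total link bandwidth is available across the worst-case cut.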