








































vtu 7 semester notes wrt cloud computing









































Over the past 60 years, computing technology has undergone a series of platform and environment changes. There have been evolutionary changes in machine architecture, operating system platform, network connectivity, and application workload. Instead of using a centralized computer to solve computational problems, a parallel and distributed computing system uses multiple computers to solve large-scale problems over the Internet. Thus, distributed computing becomes data-intensive and network-centric.

1.1.1 The Age of Internet Computing

Billions of people use the Internet every day. As a result, supercomputer sites and large data centers must provide high-performance computing services to huge numbers of Internet users concurrently. Because of this high demand, the Linpack Benchmark for high-performance computing (HPC) applications is no longer optimal for measuring system performance. The emergence of computing clouds instead demands high-throughput computing (HTC) systems built with parallel and distributed computing technologies. Data centers have to be upgraded using fast servers, storage systems, and high-bandwidth networks. The purpose is to advance network-based computing and web services with the emerging new technologies.

1.1.1.1 The Platform Evolution

Computer technology has gone through five generations of development, with each generation lasting from 10 to 20 years. From 1950 to 1970, a handful of mainframes, including the IBM 360 and CDC 6400, were built to satisfy the demands of large businesses and government organizations.
From 1960 to 1980, lower-cost minicomputers such as the DEC PDP 11 and VAX Series became popular among small businesses and on college campuses. From 1970 to 1990, we saw widespread use of personal computers built with VLSI microprocessors. From 1980 to 2000, massive numbers of portable computers and pervasive devices appeared in both wired and wireless applications. Since 1990, the use of both HPC and HTC systems hidden in clusters, grids, or Internet clouds has proliferated.

Figure 1.1 illustrates the evolution of HPC and HTC systems. On the HPC side, supercomputers (massively parallel processors or MPPs) are gradually being replaced by clusters of cooperative computers out of a desire to share computing resources. The cluster is often a collection of homogeneous compute nodes that
Advances in virtualization make it possible to see the growth of Internet clouds as a new computing paradigm. The maturity of radio-frequency identification (RFID), Global Positioning System (GPS), and sensor technologies has triggered the development of the Internet of Things (IoT).

1.1.1.5 Computing Paradigm Distinctions

Distributed computing is the opposite of centralized computing. The field of parallel computing overlaps with distributed computing to a great extent, and cloud computing overlaps with distributed, centralized, and parallel computing.

Centralized computing: This is a computing paradigm in which all computer resources are centralized in one physical system. All resources (processors, memory, and storage) are fully shared and tightly coupled within one integrated OS. Many data centers and supercomputers are centralized systems, but they are used in parallel, distributed, and cloud computing applications.

Parallel computing: In parallel computing, all processors are either tightly coupled with centralized shared memory or loosely coupled with distributed memory. Interprocessor communication is accomplished through shared memory or via message passing. A computer system capable of parallel computing is commonly known as a parallel computer. Programs running in a parallel computer are called parallel programs. The process of writing parallel programs is often referred to as parallel programming.

Distributed computing: This is a field of computer science/engineering that studies distributed systems. A distributed system consists of multiple autonomous computers, each having its own private memory, communicating through a computer network. Information exchange in a distributed system is accomplished through message passing. A computer program that runs in a distributed system is known as a
distributed program. The process of writing distributed programs is referred to as distributed programming.

Cloud computing: An Internet cloud of resources can be either a centralized or a distributed computing system. The cloud applies parallel or distributed computing, or both. Clouds can be built with physical or virtualized resources over large data centers that are centralized or distributed. Some authors consider cloud computing to be a form of utility computing or service computing.

Ubiquitous computing refers to computing with pervasive devices at any place and time using wired or wireless communication. The Internet of Things (IoT) is a networked connection of everyday objects including computers, sensors, humans, etc. The IoT is supported by Internet clouds to achieve ubiquitous computing with any object at any place and time. Finally, the term Internet computing is even broader and covers all computing paradigms over the Internet.

1.1.1.6 Distributed System Families

Technologies for building P2P networks and networks of clusters have been consolidated into many national projects designed to establish wide area computing infrastructures, known as computational grids or data grids. Internet clouds are the result of moving desktop computing to service-oriented computing using server clusters and huge databases at data centers.

Design objectives of Distributed System Families:

Efficiency measures the utilization rate of resources in an execution model by exploiting massive parallelism in HPC. For HTC, efficiency is more closely related to job throughput, data access, storage, and power efficiency.

Dependability measures the reliability and self-management from the chip to the system and application levels. The purpose is to provide high-throughput service with Quality of Service (QoS) assurance, even under failure conditions.

Adaptation in the programming model measures the ability to support billions of job requests over massive data sets and virtualized cloud resources under various workload and service models.
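The contrast drawn in Section 1.1.1.5 between shared-memory communication (typical of parallel computing) and message passing (the only option in a distributed system) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: threads inside one process stand in for processors or autonomous nodes, and the function names `shared_memory_sum` and `message_passing_sum` are hypothetical, not taken from these notes.

```python
import queue
import threading


def shared_memory_sum(chunks):
    """Shared-memory style: every worker updates one common total under a lock."""
    total = 0
    lock = threading.Lock()

    def worker(chunk):
        nonlocal total
        partial = sum(chunk)      # local work on this worker's chunk
        with lock:                # serialize access to the shared variable
            total += partial

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def message_passing_sum(chunks):
    """Message-passing style: workers share nothing and send results as messages."""
    mailbox = queue.Queue()       # stands in for the network (an assumption)

    def worker(chunk):
        mailbox.put(sum(chunk))   # the only communication is an explicit message

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The coordinator combines exactly one message per worker.
    return sum(mailbox.get() for _ in chunks)


if __name__ == "__main__":
    data = [[1, 2], [3, 4], [5]]
    print(shared_memory_sum(data), message_passing_sum(data))
```

Both functions compute the same result; the difference is that the first needs a lock around shared state, while the second needs no lock because no state is shared.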
Table 1.1 highlights a few key applications that have driven the development of parallel and distributed systems over the years. These applications spread across many important domains in science, engineering, business, education, health care, traffic control, Internet and web services, military, and government applications.

The Trend toward Utility Computing

Figure 1.2 identifies major computing paradigms to facilitate the study of distributed systems and their applications. These paradigms share some common characteristics. First, they are all ubiquitous in daily life. Reliability and scalability are two major design objectives in these computing models. Second, they are aimed at autonomic operations that can be self-organized to support dynamic discovery. Finally, these paradigms are composable with QoS and SLAs (service-level agreements). These paradigms and their attributes realize the computer utility vision. Utility computing focuses on a business model in which customers receive computing resources from a paid service provider. All grid/cloud platforms are regarded as utility service providers.
However, cloud computing offers a broader concept than utility computing. Distributed cloud applications run on any available servers in some edge networks. Major technological challenges include all aspects of computer science and engineering. For example, users demand new network-efficient processors, scalable memory and storage schemes, distributed OSes, middleware for machine virtualization, new programming models, effective resource management, and application program development. These hardware and software supports are necessary to build distributed systems that explore massive parallelism at all processing levels.

The Hype Cycle of New Technologies

Any new and emerging computing and information technology may go through a hype cycle, as illustrated in Figure 1.3. This cycle shows the expectations for the technology at five different stages. The expectations rise sharply from the trigger period to a high peak of inflated expectations. Through a short period of disillusionment, the expectation may drop to a valley and then increase steadily over a long enlightenment period to a plateau of productivity. The number of years for an emerging technology to reach a certain stage is marked by special symbols. The hollow circles indicate technologies that will reach mainstream adoption in two years. The gray circles represent technologies that will reach mainstream adoption in two to five years. The solid circles represent those that require five to 10 years to reach mainstream adoption, and the triangles denote those that require more than 10 years. The crossed circles represent technologies that will become obsolete before they reach the plateau.
The IoT refers to the networked interconnection of everyday objects, tools, devices, or computers. One can view the IoT as a wireless network of sensors that interconnect all things in our daily life. These things can be large or small and they vary with respect to time and place. The idea is to tag every object using RFID or a related sensor or electronic technology such as GPS. With the introduction of the IPv6 protocol, 2^128 IP addresses are available to distinguish all the objects on Earth, including all computers and pervasive devices. IoT researchers have estimated that every human being will be surrounded by 1,000 to 5,000 objects. The IoT needs to be designed to track 100 trillion static or moving objects simultaneously. The IoT demands universal addressability of all of the objects or things. To reduce the complexity of identification, search, and storage, one can set the threshold to filter out fine-grain objects. The IoT obviously extends the Internet and is more heavily developed in Asian and European countries.

In the IoT era, all objects and devices are instrumented, interconnected, and interact with each other intelligently. This communication can be made between people and things or among the things themselves. Three communication patterns co-exist: namely H2H (human-to-human), H2T (human-to-thing), and T2T (thing-to-thing). Here things include machines such as PCs and mobile phones. The idea here is to connect things (including human and machine objects) at any time and any place intelligently with low cost. Any-place connections include at the PC, indoors (away from the PC), outdoors, and on the move. Any-time connections include daytime, night, outdoors and indoors, and on the move as well. The dynamic connections will grow exponentially into a new dynamic network of networks, called the Internet of Things (IoT). The IoT is still in its infancy stage of development.
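To give a sense of scale for the IPv6 figure quoted above, the short sketch below computes the size of the 128-bit address space and divides it by the 100 trillion objects the notes say the IoT must track. The constant and function names are hypothetical, and the even division is only an illustration; real IPv6 allocation is hierarchical and far less dense.

```python
IPV6_ADDRESS_BITS = 128
TRACKED_OBJECTS = 100 * 10**12   # 100 trillion objects, as stated in the text


def ipv6_address_count(bits=IPV6_ADDRESS_BITS):
    """Total number of distinct addresses in a flat address space of `bits` bits."""
    return 2 ** bits


def addresses_per_object(objects=TRACKED_OBJECTS):
    """Whole addresses available per object if the space were divided evenly."""
    return ipv6_address_count() // objects


if __name__ == "__main__":
    print(ipv6_address_count())                # 340282366920938463463374607431768211456
    print(f"{addresses_per_object():.2e}")     # on the order of 10**24 per object
```

Even with 100 trillion tracked objects, the address space is large enough that universal addressability is not limited by the number of addresses.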
Many prototype IoTs with restricted areas of coverage are under experimentation at the time of this writing. Cloud computing researchers expect to use the cloud and future Internet technologies to support fast, efficient, and intelligent interactions among humans, machines, and any objects on Earth. A smart Earth should have intelligent cities, clean water, efficient power, convenient transportation, good food supplies, responsible banks, fast telecommunications, green IT, better schools, good health care, abundant resources, and so on. This dream living environment may take some time to reach fruition in different parts of the world.

Cyber-Physical Systems

A cyber-physical system (CPS) is the result of interaction between computational processes and the physical world. A CPS integrates “cyber” (heterogeneous, asynchronous) with “physical” (concurrent and information-dense) objects. A CPS merges the “3C” technologies of computation, communication, and control into an intelligent closed feedback system between the physical world and the information world. The IoT emphasizes various networking connections among physical objects, while the CPS emphasizes exploration of virtual reality (VR) applications in the physical world. We may transform how we interact with the physical world just like the Internet transformed how we interact with the virtual world.

1.2 TECHNOLOGIES FOR NETWORK-BASED SYSTEMS

1.2.1 Multicore CPUs and Multithreading Technologies

Consider the growth of component and network technologies over the past 30 years. They are crucial to the development of HPC and HTC systems. Processor speed is measured in millions of instructions per second (MIPS) and network bandwidth is measured in megabits per second (Mbps) or gigabits per second (Gbps).

Advances in CPU Processors

Today, advanced CPUs or microprocessor chips assume a multicore architecture with dual, quad, six, or more processing cores. These processors exploit parallelism at ILP and TLP levels.
Multicore CPUs may increase from the tens of cores to hundreds or more in the future. But the CPU has reached its limit in terms of exploiting massive DLP due to the memory wall problem. This has triggered the development of many-core GPUs with hundreds or more thin cores. Both IA-32 and IA-64 instruction set architectures are built into commercial CPUs. Now, x-86 processors have been extended to serve HPC and HTC systems in some high-end server processors. Many RISC processors have been replaced with multicore x-86 processors and many-core GPUs in the Top 500 systems. This trend indicates that x-86 upgrades will dominate in data centers and supercomputers. The GPU also has been applied in large clusters to build supercomputers in MPPs. In the future, the processor industry is also keen to develop asymmetric or heterogeneous chip multiprocessors that can house both fat CPU cores and thin GPU cores on the same chip.

Multithreading Technology

Consider in Figure 1.6 the dispatch of five independent threads of instructions to four pipelined data paths (functional units) in each of the following five processor categories, from left to right: a four-issue superscalar processor, a fine-grain multithreaded processor, a coarse-grain multithreaded processor, a two-core CMP, and a simultaneous multithreaded (SMT) processor. The superscalar processor is single-threaded with four functional units. Each of the three multithreaded processors is four-way multithreaded over four functional data paths. In the dual-core processor, assume two processing cores, each a single-threaded two-way superscalar processor.

Instructions from the five independent threads are distinguished by specific shading patterns. Typical instruction scheduling patterns are shown here. Only instructions from the same thread are executed in a superscalar processor. Fine-grain multithreading switches the execution of instructions from different threads per cycle. Coarse-grain multithreading executes many instructions from the same thread for quite a few cycles before switching to another thread. The multicore CMP executes instructions from different threads on completely separate cores. The SMT allows simultaneous scheduling of instructions from different threads in the same cycle.

These execution patterns closely mimic an ordinary program. The blank squares correspond to no available instructions for an instruction data path at a particular processor cycle. More blank cells imply lower scheduling efficiency. The maximum ILP or maximum TLP is difficult to achieve at each processor cycle. The point here is to demonstrate your understanding of typical instruction scheduling patterns in these five different microarchitectures in modern processors.

1.2.2 GPU Computing to Exascale and Beyond

A GPU is a graphics coprocessor or accelerator mounted on a computer’s graphics card or video card. A GPU relieves the CPU of tedious graphics tasks in video editing applications. The world’s first GPU, the GeForce 256, was marketed by NVIDIA in 1999. These GPU chips can process a minimum of 10 million polygons per second, and are used in nearly every computer on the market today. Some GPU features were also integrated into certain CPUs.
Traditional CPUs are structured with only a few cores. For example, the Xeon X5670 CPU has six cores. However, a modern GPU chip can be built with hundreds of processing cores. Unlike CPUs, GPUs have a throughput architecture that exploits massive parallelism by executing many concurrent threads slowly, instead of executing a single long thread in a conventional microprocessor very quickly.
Figure 1.7 shows the interaction between a CPU and GPU in executing floating-point operations in parallel. The CPU is the conventional multicore processor with limited parallelism to exploit. The GPU has a many-core architecture that has hundreds of simple processing cores organized as multiprocessors. Each core can have one or more threads. Essentially, the CPU’s floating-point kernel computation role is largely offloaded to the many-core GPU. The CPU instructs the GPU to perform massive data processing. The bandwidth must be matched between the on-board main memory and the on-chip GPU memory. This process is carried out in NVIDIA’s CUDA programming using the GeForce 8800 or Tesla and Fermi GPUs.

1.2.3 Memory, Storage, and Wide-Area Networking

Memory Technology

DRAM chip capacity grew from 16 KB in 1976 to 64 GB in 2011. This shows that memory chips have experienced a roughly 4x increase in capacity every three years. Memory access time did not improve much in the past. In fact, the memory wall problem is getting worse as the processor gets faster. For hard drives, capacity increased from 260 MB in 1981 to 250 GB in 2004. The Seagate Barracuda XT hard drive reached 3 TB in 2011. This represents an approximately 10x increase in capacity every eight years. The capacity increase of disk arrays will be even greater in the years to come.
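The two growth rates quoted above can be checked with a little arithmetic. The sketch below assumes steady exponential growth between the endpoint figures given in the text and assumes binary units (1 KB = 2^10 bytes and so on); the helper name `years_per_factor` is hypothetical.

```python
import math

KB, MB, GB, TB = 2**10, 2**20, 2**30, 2**40   # binary units (an assumption)


def years_per_factor(start_value, end_value, start_year, end_year, factor):
    """Years needed for one `factor`-fold increase, assuming steady exponential growth."""
    growth = end_value / start_value           # overall growth ratio
    steps = math.log(growth, factor)           # how many factor-fold steps that ratio is
    return (end_year - start_year) / steps


if __name__ == "__main__":
    # DRAM: 16 KB (1976) to 64 GB (2011) is a 2**22 = 4**11 increase over 35 years.
    dram = years_per_factor(16 * KB, 64 * GB, 1976, 2011, factor=4)
    # Disk: 260 MB (1981) to 3 TB (2011), measured in 10x steps.
    disk = years_per_factor(260 * MB, 3 * TB, 1981, 2011, factor=10)
    print(f"DRAM: 4x every {dram:.1f} years; disk: 10x every {disk:.1f} years")
```

Under these assumptions the DRAM figure works out to eleven 4x steps in 35 years, a little over three years per step, and the disk figure to between seven and eight years per 10x step, both in line with the rounded rates in the text.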
Faster processor speed and larger memory capacity result in a wider gap between processors and memory. The memory wall may become an even worse problem limiting CPU performance in the future.

Disks and Storage Technology

Beyond 2011, disks or disk arrays have exceeded 3 TB in capacity. The rapid growth of flash memory and solid-state drives (SSDs) also impacts the future of HPC and HTC systems. The mortality rate of SSDs is not bad at all. A typical SSD can handle 300,000 to 1 million write cycles per block, so it can last for several years, even under conditions of heavy write usage. Flash and SSD will demonstrate impressive speedups in many applications.

Eventually, power consumption, cooling, and packaging will limit large system development. Power increases linearly with respect to clock frequency and quadratically with respect to the voltage applied on chips. Clock rate cannot be increased indefinitely. Lowered voltage supplies are very much in demand.
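The frequency and voltage dependence stated above matches the standard CMOS dynamic-power model, P = C·V²·f, where C is the switched capacitance. The sketch below compares power relative to a baseline, assuming C stays constant and ignoring static (leakage) power; the function name `relative_dynamic_power` is hypothetical.

```python
def relative_dynamic_power(freq_scale, voltage_scale):
    """Dynamic power relative to a baseline, using P proportional to V**2 * f.

    Both arguments are ratios to the baseline (1.0 means unchanged).
    Switched capacitance is assumed constant; leakage power is ignored.
    """
    return freq_scale * voltage_scale ** 2


if __name__ == "__main__":
    # Doubling the clock at the same voltage doubles dynamic power.
    print(relative_dynamic_power(2.0, 1.0))
    # Dropping the supply voltage to 80% at the same clock cuts power to about 64%.
    print(round(relative_dynamic_power(1.0, 0.8), 2))
```

The quadratic voltage term is why lowered supply voltages are in such demand: a modest voltage reduction saves more power than the same proportional reduction in clock rate.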
The nodes in small clusters are mostly interconnected by an Ethernet switch or a local area network (LAN). As Figure 1.11 shows, a LAN typically is used to connect client hosts to big servers. A storage area network (SAN) connects servers to network storage such as disk arrays. Network-attached storage (NAS) connects client hosts directly to the disk arrays. All three types of networks often appear in a large cluster built with commercial network components. If no large distributed storage is shared, a small cluster could be built with a multiport Gigabit Ethernet switch plus copper cables to link the end machines. All three types of networks are commercially available.
Virtual Machines

In Figure 1.12, the host machine is equipped with the physical hardware, as shown at the bottom of the figure.

Physical machine: An example is an x-86 architecture desktop running its installed Windows OS, as shown in part (a) of the figure. The VM can be provisioned for any hardware system. The VM is built with virtual resources managed by a guest OS to run a specific application. Between the VMs and the host platform, one needs to deploy a middleware layer called a virtual machine monitor (VMM).

Native VM: Figure 1.12(b) shows a native VM installed with the use of a VMM called a hypervisor in privileged mode. For example, the hardware has x-86 architecture running the Windows system. The guest OS could be a Linux system and the hypervisor is the XEN system developed at Cambridge University. This hypervisor approach is also called bare-metal VM, because the hypervisor handles the bare hardware (CPU, memory, and I/O) directly.

Hosted VM: Here the VMM runs in nonprivileged mode. The host OS need not be modified.

Dual-mode VM: The VM can also be implemented with a dual mode, as shown in Figure 1.12(d). Part of the VMM runs at the user level and another part runs at the supervisor level. In this case, the host OS may have to be modified to some extent.

Multiple VMs can be ported to a given hardware system to support the virtualization process. The VM approach offers hardware independence of the OS and applications. The user application running on its dedicated OS could be bundled together as a virtual appliance that can be ported to any hardware platform. The VM could run on an OS different from that of the host computer.

VM Primitive Operations

The VMM provides the VM abstraction to the guest OS. With full virtualization, the VMM exports a VM abstraction identical to the physical machine so that a standard OS such as Windows 2000 or Linux can run just as it would on the physical hardware. Low-level VMM operations are identified by Mendel Rosenblum and illustrated in Figure 1.13. First, the VMs can be multiplexed between hardware machines, as shown in Figure 1.13(a).