




This assignment was assigned by Prof. Rasul Rangarajan at Deenbandhu Chhotu Ram University of Science and Technology for Parallel Processing course. It includes: Processing Speed, Number, Cores, Family, Operating, System, Interconnect, Application, Area, Vendor, UMA, NUMA, Cluster
Typology: Exercises





Question #01 [36 marks] Consult the latest 10 lists of the Top500 fastest computers and tabulate the names and count of the best choice under each of the following categories:
1 - Processing Speed
2 - Number of Cores
3 - Processor Family
4 - Operating System Family
5 - Interconnect Family
6 - Application Area
7 - Vendor
8 - Country
Computer | List | P. speed | # of cores | P. family | OS family | IC family | App. area | Vendor | Country
--- | --- | --- | --- | --- | --- | --- | --- | --- | ---
K-computer | 11/2011 | 2.0 GHz | 705024 | Fujitsu cluster | Linux | Tofu, custom | Research | Fujitsu | Japan
K-computer | 06/2011 | 2.0 GHz | 705024 | Fujitsu cluster | Linux | Tofu, custom | Research | Fujitsu | Japan
Tianhe-1A | 11/2010 | Xeon X5670 6C 2.93 GHz | | | Linux | Proprietary | Research | NUDT | China
Jaguar | 06/2010 | Opteron 6-Core 2. GHz | 224162 | Cray XT5-HE | Linux | Proprietary | Research | Cray Inc. | United States
Jaguar | 11/2009 | Opteron 6-Core 2. GHz | 224162 | Cray XT5-HE | Linux | Proprietary | Research | Cray Inc. | United States
Roadrunner | 06/2009 | PowerXCell 8i 3.2 GHz / Opteron DC 1.8 GHz | 129600 | BladeCenter QS22/LS Cluster | Linux | Infiniband | Research | IBM | United States
Roadrunner | 11/2008 | PowerXCell 8i 3.2 GHz / Opteron DC 1.8 GHz | 129600 | BladeCenter QS22/LS Cluster | Linux | Infiniband | Research | IBM | United States
Roadrunner | 06/2008 | PowerXCell 8i 3.2 GHz / Opteron DC 1.8 GHz | 129600 | BladeCenter QS22/LS Cluster | Linux | Infiniband | Research | IBM | United States
BlueGene/L | 11/2007 | 700 MHz | 212992 | Power | CNK/SLES 9 | Proprietary | Research | IBM | United States
BlueGene/L | 06/2007 | 700 MHz | 212992 | Power | CNK/SLES 9 | Proprietary | Research | IBM | United States

Analysis: The table shows that the United States held the top position on every list through June 2010. China took the lead in November 2010 with NUDT's Tianhe-1A, and Japan followed in 2011 with the K-computer. Fujitsu, which built the K-computer, is now the dominant vendor at the top of the list: the K-computer is four times as powerful as its nearest competitor and is also one of the most energy-efficient systems.
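The name-and-count tally that Question #01 asks for can be sketched in a few lines of Python. This is only a minimal sketch: the rows repeat the computer, list, vendor and country entries from the table above, and the names `ROWS` and `tally` are illustrative, not part of the assignment.

```python
from collections import Counter

# (computer, list, vendor, country) for the ten lists in the table above.
ROWS = [
    ("K-computer", "11/2011", "Fujitsu", "Japan"),
    ("K-computer", "06/2011", "Fujitsu", "Japan"),
    ("Tianhe-1A", "11/2010", "NUDT", "China"),
    ("Jaguar", "06/2010", "Cray Inc.", "United States"),
    ("Jaguar", "11/2009", "Cray Inc.", "United States"),
    ("Roadrunner", "06/2009", "IBM", "United States"),
    ("Roadrunner", "11/2008", "IBM", "United States"),
    ("Roadrunner", "06/2008", "IBM", "United States"),
    ("BlueGene/L", "11/2007", "IBM", "United States"),
    ("BlueGene/L", "06/2007", "IBM", "United States"),
]


def tally(rows, column):
    """Count how often each value appears in the given column index."""
    return Counter(row[column] for row in rows)


computer_counts = tally(ROWS, 0)
vendor_counts = tally(ROWS, 2)
country_counts = tally(ROWS, 3)

for title, counts in [("Computer", computer_counts),
                      ("Vendor", vendor_counts),
                      ("Country", country_counts)]:
    print(title)
    # most_common() lists the values from most to least frequent.
    for name, count in counts.most_common():
        print(f"  {name}: {count}")
```

The same `tally` call works for any other column once it is added to the tuples.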
The block diagram shows a 15-stage pipeline. The pipeline starts with a 5-stage fetch phase, in which instructions are fetched from the L1 cache. The instructions then move to the decode phase, where they are decoded into micro-ops. After the decode phase, registers are assigned and the instructions are dispatched. The decode phase contains a loop cache that stores the instructions making up a loop kernel in decoded form.
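To see why a deep pipeline pays off, the textbook cycle count for an ideal pipeline can be worked out as below. This is a rough sketch under loud assumptions: the depth is taken as 15 stages, one instruction enters per cycle, and there are no stalls or branch mispredictions. The function names are illustrative only, and a real processor will not reach these numbers.

```python
def ideal_cycles(num_instructions: int, num_stages: int = 15) -> int:
    """Cycles to finish a stream of instructions on an ideal pipeline.

    The first instruction needs num_stages cycles to drain through the
    pipeline; each later instruction completes one cycle after the
    previous one.
    """
    if num_instructions <= 0:
        return 0
    return num_stages + num_instructions - 1


def ideal_speedup(num_instructions: int, num_stages: int = 15) -> float:
    """Speedup over an unpipelined design that takes num_stages cycles
    per instruction (same assumptions as ideal_cycles)."""
    unpipelined = num_instructions * num_stages
    return unpipelined / ideal_cycles(num_instructions, num_stages)


for n in (1, 10, 100, 10_000):
    print(n, ideal_cycles(n), round(ideal_speedup(n), 2))
```

As the instruction count grows, the speedup approaches the number of stages, which is the usual argument for deeper pipelines when stalls are rare.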
The block diagram below shows the architecture of the TMS320C6474, which is built on 65 nm process technology and is the most powerful version of these SoCs. The SoC in the diagram can deliver 3.6 GHz of total raw DSP processing power, with performance of up to 28,800 million instructions per second (MIPS).
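The two headline figures are consistent with each other, as the short calculation below suggests. The core count, per-core clock and issue width used here are assumptions that do not appear in the text above (three DSP cores at 1.2 GHz, each issuing up to eight instructions per cycle); check them against the device documentation before relying on them.

```python
# Assumed figures, not stated in the text above.
CORES = 3                    # DSP cores on the SoC (assumption)
CLOCK_MHZ = 1200             # clock of each core in MHz (assumption)
INSTRUCTIONS_PER_CYCLE = 8   # peak issue width per core (assumption)

# "Raw DSP processing power" quoted as the sum of the core clocks.
total_clock_ghz = CORES * CLOCK_MHZ / 1000

# Peak MIPS: every core issuing its maximum on every cycle.
peak_mips = CORES * CLOCK_MHZ * INSTRUCTIONS_PER_CYCLE

print(f"total clock: {total_clock_ghz} GHz")
print(f"peak rate:   {peak_mips} MIPS")
```

Both numbers are peak values; sustained throughput depends on how well the code fills every issue slot.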
Question #03 [40 marks] Choose one of the most popular/powerful parallel processors built by industrial/research groups in each of the following categories, and give the architectural block diagram and main features for each:
1 - UMA
2 - NUMA
3 - MPP
4 - Cluster System
Xeon processors use a UMA (uniform memory access) architecture. The processors are connected to an external memory controller through a bus. The design is UMA in the sense that every socket sees the same latency when accessing any memory location. In the latest chipsets, a dedicated connection is provided to each socket.
Opteron uses an integrated memory controller, and each socket connects to the other sockets through direct HyperTransport links. Memory attached to another socket is reached over these links, so access time depends on where the data resides. This makes it a NUMA design. Other key features:
• Increased HT3 bandwidth
• 20%-50% higher performance than the Quad-Core AMD Opteron processor
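The difference between the two memory models can be illustrated with a toy latency model. This is only a sketch: every latency value below is a hypothetical round number chosen for illustration, not a measurement of any Xeon or Opteron system, and the function names are made up for this example.

```python
# All latencies are hypothetical round numbers, in nanoseconds.
UMA_NS = 100          # every socket goes through the shared controller
NUMA_LOCAL_NS = 70    # memory attached to the requesting socket
NUMA_REMOTE_NS = 120  # memory reached over a socket-to-socket link


def uma_latency(socket: int, memory_node: int) -> int:
    """UMA: latency is the same whichever socket asks for which memory."""
    return UMA_NS


def numa_latency(socket: int, memory_node: int) -> int:
    """NUMA: local memory is fast, memory on another socket is slower."""
    return NUMA_LOCAL_NS if socket == memory_node else NUMA_REMOTE_NS


for socket in range(2):
    for node in range(2):
        print(f"socket {socket} -> memory {node}: "
              f"UMA {uma_latency(socket, node)} ns, "
              f"NUMA {numa_latency(socket, node)} ns")
```

In the NUMA case, placing data on the memory node of the socket that uses it is what keeps accesses on the fast path.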
The diagram below shows a cluster system built from Tile64. The Tile64 card is built around 64 processor cores; each core runs its own Linux OS and communicates through the Tilera API. Tile64 is a system on a chip that can be plugged into a PCI slot and used independently of the host CPU. Each tile is a complete, full-featured processor with integrated L1 and L2 caches and a non-blocking switch that connects the tile to the mesh.
Applications of this card include advanced networking and digital video.
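How far apart two tiles are on the mesh can be estimated with a small sketch. It assumes the 64 tiles are laid out as an 8 x 8 grid, numbered row by row, and that a message travels along one dimension and then the other, so the hop count is the Manhattan distance. The numbering scheme and the function names are assumptions made for this example, not Tilera's API.

```python
MESH_WIDTH = 8  # 64 tiles arranged as an 8 x 8 grid (assumed layout)


def tile_coords(tile_id: int) -> tuple[int, int]:
    """Map a tile number 0..63 to (row, column), numbering row by row."""
    if not 0 <= tile_id < MESH_WIDTH * MESH_WIDTH:
        raise ValueError(f"tile id out of range: {tile_id}")
    return divmod(tile_id, MESH_WIDTH)


def mesh_hops(src: int, dst: int) -> int:
    """Switch-to-switch hops between two tiles, assuming the route
    covers one dimension first and then the other."""
    src_row, src_col = tile_coords(src)
    dst_row, dst_col = tile_coords(dst)
    return abs(src_row - dst_row) + abs(src_col - dst_col)


print(mesh_hops(0, 1))    # neighbouring tiles in the same row
print(mesh_hops(0, 63))   # opposite corners of the grid
```

Under these assumptions, no pair of tiles is more than 14 hops apart, which is why mapping communicating tasks to nearby tiles matters on a mesh.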
• The MPP consists of an array control unit, stage memory, a host processor, and an array unit, as shown in the diagram below.
• The stage memory is 32 MB.
• The array unit has 16,384 processing elements (PEs). The array control unit sends commands to the PEs in the array unit.
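The relationship between the array control unit and the PEs can be sketched as a broadcast: one command is issued, and every PE applies it to its own data. The classes below are a toy model written for this example, with a tiny 4 x 4 grid standing in for the full array unit; they are not taken from any real MPP software.

```python
class ArrayUnit:
    """Toy array unit: a grid of PEs, each holding one integer value."""

    def __init__(self, rows: int, cols: int) -> None:
        self.values = [[0] * cols for _ in range(rows)]

    def apply(self, operation) -> None:
        # Every PE runs the same operation on its own local value.
        self.values = [[operation(v) for v in row] for row in self.values]


class ArrayControlUnit:
    """Toy control unit: broadcasts one command at a time to all PEs."""

    def __init__(self, array_unit: ArrayUnit) -> None:
        self.array_unit = array_unit
        self.commands_issued = 0

    def issue(self, operation) -> None:
        self.array_unit.apply(operation)
        self.commands_issued += 1


array_unit = ArrayUnit(4, 4)          # small stand-in for the real array
control = ArrayControlUnit(array_unit)
control.issue(lambda v: v + 1)        # all PEs add 1
control.issue(lambda v: v * 3)        # all PEs multiply by 3

for row in array_unit.values:
    print(row)
```

Two commands were enough to update all sixteen values, which is the single-instruction, multiple-data style of operation that this organisation is built for.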