High Performance Machine Learning (HPML)

Course Description

During the past decades, the field of High-Performance Computing (HPC) has been about building supercomputers to solve some of the biggest challenges in science. HPC is where cutting-edge technology (GPUs, low-latency interconnects, etc.) is applied to the solution of scientific and data-driven problems.

One of the key ingredients to the current success of ML is the ability to perform computations on very large amounts of training data. Today, the application of HPC techniques to ML algorithms is a fundamental driver for the progress of Artificial Intelligence.

In this course, you will learn HPC techniques that are typically applied to supercomputing software, and how they are applied to obtain the maximum performance out of ML algorithms. You will also learn about techniques for building efficient ML systems. The course is based on PyTorch, CUDA programming, and MPI.

Objectives

At the end of the course, you will be able to:

● Use HPC techniques to find and solve performance bottlenecks
● Do performance measurements and profiling of ML software
● Evaluate the performance of different ML software stacks and hardware systems
● Develop high-performance distributed ML algorithms
● Use fast math libraries, CUDA, and C++ to accelerate high-performance ML algorithms
● Apply model compression techniques
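
As a rough illustration of the performance measurement objective above, the sketch below times a workload after a few discarded warm-up runs and reports the median step time and the resulting throughput. The names `measure_throughput` and `toy_step` are hypothetical and are not taken from the course material; the workload is a plain-Python stand-in for a training or inference step.

```python
import statistics
import time


def measure_throughput(step, batch_size, warmup=3, iters=10):
    """Time `step()` and return (median seconds per call, samples per second)."""
    # Warm-up runs are discarded so that one-time costs (caches, lazy
    # initialization) do not distort the measurement.
    for _ in range(warmup):
        step()

    timings = []
    for _ in range(iters):
        start = time.perf_counter()
        step()
        timings.append(time.perf_counter() - start)

    # The median is less sensitive to outliers than the mean.
    median = statistics.median(timings)
    throughput = batch_size / median if median > 0 else float("inf")
    return median, throughput


def toy_step(n=64):
    """Hypothetical stand-in workload: a naive n x n matrix-vector product."""
    matrix = [[float(i + j) for j in range(n)] for i in range(n)]
    vector = [1.0] * n
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


median_s, samples_per_s = measure_throughput(toy_step, batch_size=64)
print(f"median step time: {median_s * 1e3:.3f} ms, "
      f"throughput: {samples_per_s:.0f} samples/s")
```

On a GPU, kernels are launched asynchronously, so the device would need to be synchronized (for example with `torch.cuda.synchronize()` in PyTorch) before each timer read for the numbers to be meaningful.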

Prerequisites

● Knowledge of computer architecture and operating systems
● C/C++: intermediate programming skills
● Python: intermediate programming skills
● Understanding of Machine Learning concepts and neural network algorithms

The course focuses on system performance rather than on the algorithms themselves, and a basic explanation of the algorithms will be part of the course. However, it is strongly recommended to start the course with a good understanding of the following algorithms: logistic regression, feed-forward (basic) neural networks, convolutional neural networks, and recurrent neural networks.

Course materials

The course does not follow a specific textbook; however, parts of the following books can be used as learning support. Pointers to specific literature and web links will be provided in class.

Introduction to High Performance Computing for Scientists and Engineers
Authors: Georg Hager, Gerhard Wellein
Publisher: CRC Press
ISBN: 9781439811924

Introduction to High Performance Scientific Computing (ONLINE)
Authors: Victor Eijkhout with Edmond Chow, Robert van de Geijn


Computer Architecture: A Quantitative Approach, 5th Edition
Authors: John Hennessy, David Patterson
Publisher: Morgan Kaufmann
ISBN: 9780123838728

Efficient Processing of Deep Neural Networks
Authors: Vivienne Sze, Yu-Hsin Chen, Tien-Ju Yang, Joel Emer
Publisher: Morgan & Claypool Publishers
ISBN-13: 978-

Topics covered

● ML/DL and PyTorch basics
● PyTorch performance
● Performance optimization in PyTorch
● Parallel performance modeling
● Intro to CUDA
● Math libraries for ML (cuDNN)
● DNN architectures (CNN, RNN, LSTM, Attention, Transformers) in PyTorch
● Intro to MPI
● Distributed ML
● Distributed PyTorch algorithms, parallel data loading, and ring reduction
● Hardware acceleration for ML and AI
● Quantization and model compression

Course Information

● Instructors: Dr. Parijat Dube and Dr. Kaoutar El Maghraoui
● Grading: Homework (50%) + Final Project (20%) + Final Exam (20%) + Quizzes (10%)
● Homework: There will be five homework assignments, mostly involving programming and experiments with GPUs. Assignments will be based on C/C++, Python, and PyTorch.
● Course project
○ Project proposals are due by midterm
○ Final presentations of all projects take place towards the end of the course

Weekly Lesson Plan

● Week-1: Introduction to HPC and ML
Course introduction and organization; HPC and ML technology; ML/DL success drivers; HPC for ML; hardware overview: CPUs, accelerators, high-speed networks; software overview: algorithms, math libraries, frameworks
● Week-2: ML performance optimization
Factors affecting ML performance; software performance optimization for ML; performance optimization methodology: measurement, analysis, optimization; measurement: metrics, benchmarking workloads, time/resources, throughput, time to

Determining bit-width; mixed and varying precision; quantization: post-training quantization, static vs. dynamic quantization, quantization-aware training, graph mode quantization; hardware-aware quantization
● Week-13: Sparsity and Model Compression
Activation sparsity, weight sparsity; compression; sparse dataflow; low-rank approximation; knowledge distillation; distilled architectures in convolutional and recurrent networks
● Week-14: Designing Efficient DNNs
Improving efficiency in manual network design; neural architecture search (NAS), hardware-aware NAS; near-memory and in-memory processing; analog AI
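
To give a feel for the quantization topics listed in the plan above, the following is a minimal plain-Python sketch of uniform affine (asymmetric) quantization of a list of floats to unsigned integers, followed by dequantization. It is an illustration under simplifying assumptions, not course material: the function names are hypothetical, the scale and zero point are derived from the min/max of the data, and no framework quantization API is used.

```python
def quantize_affine(values, num_bits=8):
    """Map floats to unsigned integers using a scale and a zero point."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # Extend the range to include 0.0 so that zero is represented exactly.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # Guard against an all-zero input, where hi == lo.
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = round(qmin - lo / scale)
    # Round to the nearest integer level and clamp to the valid range.
    quantized = [
        min(qmax, max(qmin, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point


def dequantize_affine(quantized, scale, zero_point):
    """Recover approximate float values from the integer representation."""
    return [(q - zero_point) * scale for q in quantized]


weights = [-1.0, -0.25, 0.0, 0.5, 1.5]  # hypothetical example values
q, scale, zero_point = quantize_affine(weights)
restored = dequantize_affine(q, scale, zero_point)
max_error = max(abs(w, ) if False else abs(w - r) for w, r in zip(weights, restored))
print(q, scale, zero_point, max_error)
```

With values inside the calibrated range, the rounding error per element is bounded by half a quantization step (`scale / 2`), which is why fewer bits (a larger step) trade accuracy for storage and compute savings.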
