Elemental Technologies: Harnessing GPU Power for Video Processing - Prof. Jingke Li, Study notes of Computer Science

Portland State University (PSU), Computer Science

Prof. Jingke Li

An overview of Elemental Technologies, a company specializing in video processing solutions. The company's background, story, and products are discussed, with a focus on its transition from building ASICs to using CUDA for software-based video processing. The benefits of this approach, such as cost reduction and high performance, are highlighted.

Typology: Study notes

Pre 2010

Uploaded on 08/18/2009

koofers-user-5nc 🇺🇸


Harnessing Stream Processors: massively parallel processing

Jesse Rosenzweig, CTO, [email protected]

April 21st, 2009 Elemental Technologies Incorporated Confidential

Agenda

• Company Background

• Story of a Startup

• The Elemental Video Engine

• Elemental Product Line

• CUDA introduction

• Conclusion

Company Background

Our Mission:
- To create the fastest, highest quality video solutions by harnessing massively parallel, off-the-shelf hardware.
Founded in 2006
Team led display revolution at
Headquartered in beautiful Portland, Oregon
Profitable in first quarter of revenue (Q4 ‘08)

Raised $7.1M Series A in June 2008

Story of a Startup

Founded August 2006
- Focus was to build an ASIC
- Standalone transcoder / encoder
- Estimated cost $20M to revenue
- Funding sources limited

[Diagram: block labels include VCU, 3D Comb, IR, TS Demux, POD Controller, Decryption, Encryption, TS Remux]

Elemental 2.0: April 2007
- NVIDIA G80 had been released
- CUDA had been launched
- Powerful parallel engine available
- Switched to software model!

Disruptive Innovation

Elemental’s video harnesses key GPU trends
1. GPUs have become immensely powerful
2. GPUs have become extremely programmable
3. PCI-e bus allows fast CPU / GPU communication

Video Engine Pipeline

Harnesses both the CPU and GPU strengths
Achieves up to 10x performance of CPU-only
Efficient use of system resources is key

Elemental Video Engine

Currently used by a variety of applications:
- Virtualization / Remote Video Distribution
- United States Intelligence Community
- Professional Video Editing

Elemental’s Product Line

All powered by Elemental core technology

Product                         Target         Features
Elemental Video Engine™ SDK     Developer      Flexible and extensible; supports a variety of codecs
Badaboom™ Media Converter       Consumer       Video on mobile devices; 1 million+ downloads
Elemental Accelerator for CS4   Professional   Premiere Pro plug-in; bundled w/ NVIDIA Quadro CX

[Timeline, Q3 ‘08 onward: Badaboom™ Media Converter available; RapiHD™ Accelerator for Adobe Premiere Pro CS4 available; RapiHD™ SDK available]

CUDA Introduction

• What is CUDA?

Compute Unified Device Architecture
Parallel processing at a very low level
Extensions to C

GPU Hardware Introduction

Arrays of multiprocessors

Each multiprocessor has sets of processors
Each processor executes the same instruction on different data

Each processor has access to shared memory
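
The execution model above can be emulated in plain Python as a rough sketch: every "thread" runs the same function and only its index differs. The names `scale_kernel` and `launch`, and the block and thread counts, are illustrative assumptions, not taken from the slides.

```python
# Plain-Python emulation of "same instruction, different data".
# NUM_BLOCKS and THREADS_PER_BLOCK are illustrative values, not from the slides.
NUM_BLOCKS = 4
THREADS_PER_BLOCK = 8

def scale_kernel(block_id, thread_id, src, dst, factor):
    """The one function every thread runs; only the index differs."""
    i = block_id * THREADS_PER_BLOCK + thread_id  # global element index
    if i < len(src):                              # guard: more threads than data
        dst[i] = factor * src[i]

def launch(src, factor):
    dst = [0] * len(src)
    # On a GPU these iterations run in parallel; here they run one after another.
    for block_id in range(NUM_BLOCKS):
        for thread_id in range(THREADS_PER_BLOCK):
            scale_kernel(block_id, thread_id, src, dst, factor)
    return dst

data = list(range(30))      # 30 elements, fewer than the 32 threads launched
result = launch(data, 3)
```

The bounds check matters because the launch size is a multiple of the block size and can exceed the data size.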

CUDA Introduction

Memory types
- Global/Device → GPU’s DRAM. Slowest of all memory
- Constant → Cached global memory for constant read-only data
- Texture → 2D cache and hardware interpolation for global memory
- Shared → Fast memory (as fast as registers) available to a CUDA block

CUDA Introduction

Typical data flow
- CPU produces/captures data
- Copy data to GPU DRAM
- Kernel loads data from DRAM into shared memory
- Threads execute, in parallel, on data in shared memory
- Once threads are done (syncthreads), move data back into GPU DRAM
- Move results back to CPU
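
A minimal sketch of this data flow, emulated in plain Python rather than run on a GPU. Python lists stand in for host memory, GPU DRAM, and shared memory; `TILE`, `kernel_block`, and `run` are hypothetical names, and squaring is just a placeholder computation.

```python
TILE = 4  # hypothetical number of elements each block stages in shared memory

def kernel_block(block_id, device_in, device_out):
    # Kernel loads this block's slice of GPU DRAM into "shared memory".
    start = block_id * TILE
    shared = device_in[start:start + TILE]
    # Threads execute on data in shared memory (one element per thread).
    shared = [x * x for x in shared]
    # Once all threads are done (the __syncthreads() point in real CUDA),
    # move the results back into GPU DRAM.
    device_out[start:start + TILE] = shared

def run(host_data):
    device_in = list(host_data)              # copy data to GPU DRAM
    device_out = [0] * len(device_in)
    num_blocks = (len(device_in) + TILE - 1) // TILE
    for block_id in range(num_blocks):       # blocks would run concurrently on a GPU
        kernel_block(block_id, device_in, device_out)
    return list(device_out)                  # move results back to CPU

host_data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # CPU produces/captures data
squares = run(host_data)
```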

CUDA Introduction

Occupancy
- The ratio of the number of active warps per multiprocessor to the maximum number of active warps
- Current NVIDIA GPU capability has a max of 32 active warps

Higher occupancy is not necessarily faster for any given algorithm, but is a measure of how much work can be done per clock.
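
As a worked example of the ratio, the sketch below computes occupancy from a block size and a number of resident blocks. The 32-warp limit is the figure quoted above and 32 threads per warp is the standard NVIDIA warp size. The number of blocks resident on a multiprocessor is treated as a given input here; on real hardware it is limited by register and shared memory usage, which this sketch does not model.

```python
MAX_ACTIVE_WARPS = 32   # per multiprocessor, the limit quoted above
WARP_SIZE = 32          # threads per warp on NVIDIA GPUs

def occupancy(threads_per_block, blocks_per_multiprocessor):
    warps_per_block = (threads_per_block + WARP_SIZE - 1) // WARP_SIZE  # round up
    active_warps = min(warps_per_block * blocks_per_multiprocessor, MAX_ACTIVE_WARPS)
    return active_warps / MAX_ACTIVE_WARPS

half = occupancy(threads_per_block=256, blocks_per_multiprocessor=2)  # 16 of 32 warps
full = occupancy(threads_per_block=256, blocks_per_multiprocessor=4)  # 32 of 32 warps
```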

CUDA Introduction

Optimize kernels by
- Minimizing registers => simple algorithms
- Minimizing shared memory usage => resourceful memory management
- Maximizing warps per block => give the device enough work.
- Good memory access
  → Coalesced global reads and writes
  → Reduce bank conflicts on shared memory.
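
To illustrate the bank-conflict point, the sketch below counts how many threads land on the same shared memory bank for a given access stride. It assumes 16 banks of 32-bit words and accesses serviced per half-warp of 16 threads, which matches NVIDIA GPUs of that generation as far as I know; both constants and the function name `conflict_degree` are assumptions, not taken from the slides.

```python
from collections import Counter

NUM_BANKS = 16   # assumed: shared memory banks on GPUs of that generation
HALF_WARP = 16   # assumed: threads whose accesses are serviced together

def conflict_degree(stride):
    """Worst-case number of threads in a half-warp hitting the same bank
    when thread t reads 32-bit word t * stride from shared memory."""
    banks = Counter((t * stride) % NUM_BANKS for t in range(HALF_WARP))
    return max(banks.values())

degrees = {stride: conflict_degree(stride) for stride in (1, 2, 3, 8, 16)}
```

A degree of 1 means no conflict. Odd strides stay conflict-free under these assumptions, while a stride of 16 sends every thread to the same bank and serializes the access.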

CUDA Introduction

Example: Matrix Multiply

Each thread block is responsible for computing one square sub-matrix Csub of C;
Each thread within the block is responsible for computing one element of Csub.
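
The decomposition can be checked with a short sketch. It uses the 128 by 80 result matrix from the performance slide, read here as 128 columns by 80 rows, and assumes a 16 by 16 block (the slides do not state the block size). It confirms that every element of C is assigned to exactly one thread.

```python
TILE = 16                  # assumed block edge; the slides do not state it
C_ROWS, C_COLS = 80, 128   # C[128,80] read as 128 columns by 80 rows

blocks_y = C_ROWS // TILE  # thread blocks along the rows
blocks_x = C_COLS // TILE  # thread blocks along the columns

covered = {}               # (row, col) -> number of threads assigned to it
for by in range(blocks_y):
    for bx in range(blocks_x):
        for ty in range(TILE):
            for tx in range(TILE):
                row = by * TILE + ty   # element of C owned by thread (ty, tx)
                col = bx * TILE + tx   # of block (by, bx)
                covered[(row, col)] = covered.get((row, col), 0) + 1

num_blocks = blocks_x * blocks_y
threads_per_block = TILE * TILE
```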

CUDA Introduction

GPU Side (part 2)
- Load shared memory with data
- Do matrix multiply in parallel
- Write result to global memory
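
The three steps can be sketched as a plain-Python emulation of a tiled multiply. This is not the presenter's kernel: `matmul_tiled`, the tile size, and the test matrices are illustrative assumptions, and the loops that a GPU would run in parallel run sequentially here.

```python
TILE = 4  # assumed tile edge, kept small so the emulation runs quickly

def matmul_tiled(A, B):
    n, k, m = len(A), len(B), len(B[0])        # A is n x k, B is k x m
    assert n % TILE == 0 and k % TILE == 0 and m % TILE == 0
    C = [[0] * m for _ in range(n)]
    for by in range(0, n, TILE):               # one (by, bx) pair per thread block
        for bx in range(0, m, TILE):
            acc = [[0] * TILE for _ in range(TILE)]   # one accumulator per thread
            for kk in range(0, k, TILE):
                # Load shared memory with data: one tile of A and one tile of B.
                As = [row[kk:kk + TILE] for row in A[by:by + TILE]]
                Bs = [row[bx:bx + TILE] for row in B[kk:kk + TILE]]
                # Do the multiply: thread (ty, tx) updates its own element of Csub.
                for ty in range(TILE):
                    for tx in range(TILE):
                        acc[ty][tx] += sum(As[ty][i] * Bs[i][tx] for i in range(TILE))
            # Write result to global memory.
            for ty in range(TILE):
                C[by + ty][bx:bx + TILE] = acc[ty]
    return C

A = [[(r + c) % 5 for c in range(8)] for r in range(4)]    # 4 x 8
B = [[(r * c) % 7 for c in range(12)] for r in range(8)]   # 8 x 12
C = matmul_tiled(A, B)
```

Staging tiles in shared memory means each value read from global memory is reused by a whole row or column of threads in the block.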

CUDA Introduction

Performance for A[48,80] * B[128, 48] = C[128,80]

GPU 10ms (5.4x faster)
CPU 54ms
491k multiplies and 491k adds.
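
The operation count and speedup follow directly from the dimensions and timings above. A short check, assuming one multiply and one add per term of each dot product:

```python
INNER = 48               # shared dimension of A and B
C_ELEMENTS = 128 * 80    # elements in C

multiplies = C_ELEMENTS * INNER   # one multiply per term of each dot product
adds = C_ELEMENTS * INNER         # one add per term, accumulating from zero

cpu_ms, gpu_ms = 54, 10           # timings quoted above
speedup = cpu_ms / gpu_ms
```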

CUDA Introduction

Performance for A[48,8000] * B[12800, 48] = C[12800,8000]

GPU 663ms ( 14.2x faster )
CPU 9,483ms
~5 billion multiplies and adds.
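
The same arithmetic for the larger case gives the operation count and an effective throughput. The slide does not say whether the timings include host-to-device transfers, so the rates below are effective figures derived only from the quoted numbers: roughly 15 GFLOP/s on the GPU against roughly 1 GFLOP/s on the CPU.

```python
INNER = 48
C_ELEMENTS = 12800 * 8000

multiply_adds = C_ELEMENTS * INNER   # multiply-add pairs
flops = 2 * multiply_adds            # count each multiply and each add

gpu_s, cpu_s = 0.663, 9.483          # timings quoted above, in seconds
gpu_gflops = flops / gpu_s / 1e9     # effective GFLOP/s on the GPU
cpu_gflops = flops / cpu_s / 1e9     # effective GFLOP/s on the CPU
```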

Compute competition

CUDA only for NVIDIA, but Mac, Linux and Windows supported

OpenCL (Apple) and DX11 (Microsoft) for all GPU and CPU platforms.

More information

CUDA: www.nvidia.com/CUDA
OpenCL: www.khronos.org/opencl/
DX11 Compute: DirectX March 2009 release
www.elementaltechnologies.com!
