GPU Computing Gems Emerald Edition, Manuals, Projects, Research in Culture

GPU Computing Gems Emerald Edition. Another book on GPU programming.

Type: Manuals, Projects, Research

2011

Shared on 22/09/2011



GPU Computing Gems

Emerald Edition


Wen-mei W. Hwu

AMSTERDAM • BOSTON • HEIDELBERG • LONDON • NEW YORK • OXFORD • PARIS • SAN DIEGO • SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO

Morgan Kaufmann Publishers is an imprint of Elsevier

Acquiring Editor: Todd Green
Assistant Editor: Robyn Day
Project Manager: Paul Gottehrer
Designer: Dennis Schaefer

Morgan Kaufmann is an imprint of Elsevier
30 Corporate Drive, Suite 400, Burlington, MA 01803, USA

© 2011 NVIDIA Corporation and Wen-mei W. Hwu. Published by Elsevier Inc. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

Notices

Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods or professional practices, may become necessary. Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information or methods described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

Library of Congress Cataloging-in-Publication Data

GPU computing gems / editor, Wen-mei W. Hwu.
p. cm.
Includes bibliographical references.
ISBN 978-0-12-384988-
1. Graphics processing units–Programming. 2. Imaging systems. 3. Computer graphics. 4. Image processing–Digital techniques. I. Hwu, Wen-mei.
T385.G6875 2011
006.6–dc 2010047487

British Library Cataloguing-in-Publication Data

A catalogue record for this book is available from the British Library.

For information on all MK publications visit our website at www.mkp.com

Printed in the United States of America 11 12 13 14 15 11 10 9 8 7 6 5 4 3 2 1

Contents

CHAPTER 10 Wavelet-Based Density Functional Theory Calculation on Massively Parallel Hybrid Architectures

Luigi Genovese, Matthieu Ospici, Brice Videau, Thierry Deutsch, Jean-François Méhaut

SECTION 2 LIFE SCIENCES

Bertil Schmidt

CHAPTER 11 Accurate Scanning of Sequence Databases with the Smith-Waterman Algorithm

Łukasz Ligowski, Witold R. Rudnicki, Yongchao Liu, Bertil Schmidt

CHAPTER 12 Massive Parallel Computing to Accelerate Genome-Matching

Ben Weiss, Mike Bailey

CHAPTER 13 GPU-Supercomputer Acceleration of Pattern Matching

Ali Khajeh-Saeed, J. Blair Perot

CHAPTER 14 GPU Accelerated RNA Folding Algorithm

Guillaume Rizk, Dominique Lavenier, Sanjay Rajopadhye

CHAPTER 15 Temporal Data Mining for Neuroscience

Wu-chun Feng, Yong Cao, Debprakash Patnaik, Naren Ramakrishnan

SECTION 3 STATISTICAL MODELING

Mike Giles

CHAPTER 16 Parallelization Techniques for Random Number Generators

Thomas Bradley, Jacques du Toit, Robert Tong, Mike Giles, Paul Woodhams

CHAPTER 17 Monte Carlo Photon Transport on the GPU

László Szirmay-Kalos, Balázs Tóth, Milán Magdics

CHAPTER 18 High-Performance Iterated Function Systems

Christoph Schied, Johannes Hanika, Holger Dammertz, Hendrik P. A. Lensch

SECTION 4 EMERGING DATA-INTENSIVE APPLICATIONS

Volodymyr Kindratenko

CHAPTER 19 Large-Scale Machine Learning

Jerod J. Weinman, Augustus Lidaka, Shitanshu Aggarwal

CHAPTER 20 Multiclass Support Vector Machine

Sergio Herrero-Lopez

CHAPTER 21 Template-Driven Agent-Based Modeling and Simulation with CUDA

Paul Richmond, Daniela Romano

CHAPTER 22 GPU-Accelerated Ant Colony Optimization

Robin M. Weiss

SECTION 5 ELECTRONIC DESIGN AUTOMATION

Sunil P. Khatri

CHAPTER 23 High-Performance Gate-Level Simulation with GP-GPUs

Debapriya Chatterjee, Andrew DeOrio, Valeria Bertacco

CHAPTER 24 GPU-Based Parallel Computing for Fast Circuit Optimization

Yifang Liu, Jiang Hu

SECTION 6 RAY TRACING AND RENDERING

Austin Robison

CHAPTER 25 Lattice Boltzmann Lighting Models

Robert Geist, James Westall

CHAPTER 26 Path Regeneration for Random Walks

Jan Novák, Vlastimil Havran, Carsten Dachsbacher

CHAPTER 27 From Sparse Mocap to Highly Detailed Facial Animation

Bernd Bickel, Manuel Lang

CHAPTER 28 A Programmable Graphics Pipeline in CUDA for Order-Independent Transparency

Mengcheng Huang, Fang Liu, Xuehui Liu, Enhua Wu

SECTION 7 COMPUTER VISION

James Fung

CHAPTER 29 Fast Graph Cuts for Computer Vision

P.J. Narayanan, Vibhav Vineet, Timo Stich

CHAPTER 30 Visual Saliency Model on Multi-GPU

Anis Rahman, Dominique Houzet, Denis Pellerin

CHAPTER 41 Parallelization of Katsevich CT Image Reconstruction Algorithm on Generic Multi-Core Processors and GPGPU

Abderrahim Benquassmi, Eric Fontaine, Hsien-Hsin S. Lee

CHAPTER 42 3-D Tomographic Image Reconstruction from Randomly Ordered Lines with CUDA

Guillem Pratx, Jing-Yu Cui, Sven Prevrhal, Craig S. Levin

CHAPTER 43 Using GPUs to Learn Effective Parameter Settings for GPU-Accelerated Iterative CT Reconstruction Algorithms

Wei Xu, Klaus Mueller

CHAPTER 44 Using GPUs to Accelerate Advanced MRI Reconstruction with Field Inhomogeneity Compensation

Yue Zhuo, Xiao-Long Wu, Justin P. Haldar, Thibault Marin, Wen-mei W. Hwu, Zhi-Pei Liang, Bradley P. Sutton

CHAPTER 45 ℓ1 Minimization in ℓ1-SPIRiT Compressed Sensing MRI Reconstruction

Mark Murphy, Miki Lustig

CHAPTER 46 Medical Image Processing Using GPU-Accelerated ITK Image Filters

Won-Ki Jeong, Hanspeter Pfister, Massimiliano Fatica

CHAPTER 47 Deformable Volumetric Registration Using B-Splines

James Shackleford, Nagarajan Kandasamy, Gregory Sharp

CHAPTER 48 Multiscale Unbiased Diffeomorphic Atlas Construction on Multi-GPUs

Linh Ha, Jens Krüger, Sarang Joshi, Cláudio T. Silva

CHAPTER 49 GPU-Accelerated Brain Connectivity Reconstruction and Visualization in Large-Scale Electron Micrographs

Won-Ki Jeong, Hanspeter Pfister, Johanna Beyer, Markus Hadwiger

CHAPTER 50 Fast Simulation of Radiographic Images Using a Monte Carlo X-Ray Transport Algorithm Implemented in CUDA

Andreu Badal, Aldo Badano

Index


Editors, Reviewers, and Authors

Reza Farivar, University of Illinois at Urbana-Champaign

Vladimir Frolov, NVIDIA

Vladimir Glavtchev, BMW Technology Office

Kanupriya Gulati, Intel Corporation

Trym Vegard Haavardsholm, Norwegian Defense Research Establishment

Ken Hawick, University of Auckland, New Zealand

Jared Hoberock, NVIDIA

Tim Kaldewey, Oracle

Vinay Karkala, Advanced Micro Devices

Christian Linz, Technical University, Braunschweig

Christian Lipski, Technical University, Braunschweig

Weiguo Liu, Nanyang Technological University

Dave Luebke, NVIDIA

W. James MacLean, Google

Corey Manders, A*STAR Institute for Infocomm Research

Morgan McGuire, Williams College, Massachusetts

Derek Nowrouzezahrai, Disney Research Zurich

Ming Ouyang, University of Louisville, Kentucky

Steven Parker, NVIDIA

Kalyan Perumalla, Oak Ridge National Laboratory

Nicolas Pinto, Massachusetts Institute of Technology

Tobias Preis, Johannes Gutenberg University

Ramtin Shams, Australian National University

Craig Steffen, University of Illinois at Urbana-Champaign

Andrei Tatarinov, NVIDIA

Cristina Nader Vasconcelos, Instituto de Computação, Universidade Federal Fluminense, Brazil

Ben Weiss, Shell and Slate Software

Ruediger Westermann, Technical University, Munich

Jan Woetzel, MeVis Medical Solutions, AG

Kesheng Wu, Berkeley Lab, University of California

Ren Wu, HP Labs

Weihang Zhu, Lamar University, Texas


Authors

Shitanshu Aggarwal, Grinnell College, Iowa (Chapter 19)

Mike Bailey, Oregon State University (Chapter 12)

Andreu Badal, US Food and Drug Administration (CDRH/OSEL/DIAM) (Chapter 50)

Aldo Badano, US Food and Drug Administration (CDRH/OSEL/DIAM) (Chapter 50)

Lorena A. Barba, Boston University (Chapter 9)

Bedřich Beneš, Purdue University, Indiana (Chapter 35)

Abderrahim Benquassmi, Georgia Institute of Technology (Chapter 41)

Valeria Bertacco, University of Michigan (Chapter 23)

Johanna Beyer, King Abdullah University of Science and Technology (KAUST) (Chapter 49)

Bernd Bickel, Disney Research, Zurich (Chapter 27)

Thomas Bradley, NVIDIA (Chapter 16)

Benjamin Brown, Northeastern University (Chapter 40)

Martin Burtscher, Texas State University, San Marcos (Chapter 6)

Yong Cao, Virginia Tech (Chapter 15)

Debapriya Chatterjee, University of Michigan (Chapter 23)

Yifeng Chen, Peking University (Chapter 39)

Jike Chong, University of California, Berkeley (Chapter 37)

Jing-Yu Cui, Stanford University (Chapter 42)

Xiang Cui, Peking University (Chapter 39)

Carsten Dachsbacher, Karlsruhe Institute of Technology (Chapter 26)

Holger Dammertz, Ulm University (Chapter 18)

Andrew DeOrio, University of Michigan (Chapter 23)

Thierry Deutsch, Laboratoire de Simulation Atomistique (Chapter 10)

Rodrigo Dominguez, Northeastern University (Chapter 40)

Jacques Du Toit, Numerical Algorithms Group (Chapter 16)

Gabriel Falcao, University of Coimbra (Chapter 38)

Massimiliano Fatica, NVIDIA (Chapter 46)

Wu-chun Feng, Virginia Tech and Wake Forest University (Chapter 15)

Eric Fontaine, Georgia Institute of Technology (Chapter 41)

James Fung, NVIDIA (Chapter 36)

Robert Geist, Clemson University (Chapter 25)


Dominique Lavenier, École Normale Supérieure de Cachan (Chapter 14)

Hsien-Hsin S. Lee, Georgia Institute of Technology (Chapter 41)

Hendrik Lensch, Ulm University (Chapter 18)

Craig S. Levin, Stanford University (Chapter 42)

Zhi-Pei Liang, University of Illinois at Urbana-Champaign (Chapter 44)

Augustus Lidaka, Grinnell College (Chapter 19)

Łukasz Ligowski, University of Warsaw (Chapter 11)

Fang Liu, Chinese Academy of Sciences (Chapter 28)

Xuehui Liu, Chinese Academy of Sciences (Chapter 28)

Yifang Liu, Texas A&M University (Chapter 24)

Yongchao Liu, Nanyang Technological University (Chapter 11)

Berker Logoglu, Middle East Technical University (Chapter 34)

Nathan Luehr, Stanford University and SLAC National Accelerator Laboratory (Chapter 3)

Miki Lustig, University of California, Berkeley (Chapter 45)

Milán Magdics, Budapest University of Technology and Economics (Chapter 17)

Thibault Marin, Illinois Institute of Technology (Chapter 44)

Todd Martinez, Stanford University and SLAC National Accelerator Laboratory (Chapter 3)

Jean-François Méhaut, Université Joseph Fourier (Chapter 10)

Hong Mei, Peking University (Chapter 39)

Perhaad Mistry, Northeastern University (Chapter 40)

Richard Moore, Massachusetts General Hospital (Chapter 40)

Keiji Morokuma, Kyoto University (Chapter 5)

Klaus Mueller, State University of New York, Stony Brook (Chapter 43)

Mark Murphy, University of California, Berkeley (Chapter 45)

Pinar Muyan-Özçelik, University of California, Davis (Chapter 32)

P. J. Narayanan, International Institute of Information Technology Hyderabad (Chapter 29)

Jan Novák, Karlsruhe Institute of Technology (Chapter 26)

Anton Obukhov, NVIDIA (Chapter 33)

Fatih Omruuzun, Middle East Technical University (Chapter 34)

Matthieu Ospici, Laboratoire d’Informatique de Grenoble (Chapter 10)

Jeffery M. Ota, BMW Group Technology Office (Chapter 32)

John D. Owens, University of California, Davis (Chapter 32)


Vijay S. Pande, Stanford University (Chapter 2)

Debprakash Patnaik, Virginia Tech (Chapter 15)

Denis Pellerin, GIPSA-lab (Chapter 30)

J. Blair Perot, University of Massachusetts, Amherst (Chapter 13)

Hanspeter Pfister, Harvard University (Chapters 46 and 49)

Keshav Pingali, Texas State University, San Marcos (Chapter 6)

Guillem Pratx, Stanford University (Chapter 42)

Sven Prevrhal, Philips Healthcare (Chapter 42)

Anis Rahman, GIPSA-lab (Chapter 30)

Sanjay Rajopadhye, Colorado State University (Chapter 14)

Naren Ramakrishnan, Virginia Tech (Chapter 15)

Paul Richmond, University of Sheffield (Chapter 21)

Guillaume Rizk, Institut de Recherche en Informatique et Systèmes Aléatoires, Université de Rennes (Chapter 14)

Christopher Rodrigues, University of Illinois at Urbana-Champaign (Chapter 4)

Daniela Romano, University of Sheffield (Chapter 21)

Witold R. Rudnicki, University of Warsaw (Chapter 11)

Jan Saam, University of Illinois at Urbana-Champaign (Chapter 1)

Karthikeyan Sankaralingam, University of Wisconsin-Madison (Chapter 7)

Dana Schaa, Northeastern University (Chapter 40)

Christoph Schied, Ulm University (Chapter 18)

Bertil Schmidt, Nanyang Technological University (Chapter 11)

Klaus Schulten, University of Illinois at Urbana-Champaign (Chapters 1 and 4)

James Shackleford, Drexel University (Chapter 47)

Gregory Sharp, Massachusetts General Hospital (Chapter 47)

John Silberholz, University of Maryland (Chapter 8)

Claudio Silva, University of Utah (Chapter 48)

Vitor Silva, University of Coimbra (Chapter 38)

Matthew D. Sinclair, University of Wisconsin-Madison (Chapter 7)

Leonel Sousa, Technical University of Lisbon (Chapter 38)

Joe Stam, NVIDIA (Chapter 36)

Ondřej Šťava, Purdue University (Chapter 35)


Introduction

Wen-mei W. Hwu

STATE OF GPU COMPUTING

We are entering the golden age of GPU computing. Since the introduction of CUDA in 2007, more than 100 million computers with CUDA-capable GPUs have been shipped to end users. Unlike the previous GPGPU shader programming models, CUDA supports parallel programming in C. From my own experience in teaching CUDA programming, C programmers can begin to write basic CUDA programs after attending only one lecture and reading one textbook chapter. With such a low barrier to entry, researchers all over the world have been developing new algorithms and applications to take advantage of the extreme floating-point execution throughput of these GPUs.

Today, there is a large community of GPU computing practitioners. Many of them have reported a 10 to 100 times speedup of their applications with GPU computing. To put this into perspective, with the historical 2X performance growth every 2 years, these researchers are experiencing the equivalent of time travel of 8 to 12 years. That is, they are getting today the performance that they would otherwise have had to wait 8 to 12 years for if they had relied on the “free-ride” advancement of performance in microprocessors. Interestingly, such “free-ride” advancement is no longer available. Furthermore, once they develop their application in CUDA, they will likely see continued performance growth of 2X every two years from this day forward.

After discussions with numerous researchers, I have reached the conclusion that many of them are solving similar algorithmic problems in their programming efforts. Although they are working on diverse applications, they often end up developing similar algorithmic strategies. The idea of GPU Computing Gems is to provide a convenient means for application developers in diverse application areas to benefit from each other’s experience. In this volume, we have collected 50 gem articles written by researchers in 10 diverse areas. Each gem article reports a successful application experience in GPU computing. These articles describe the techniques or “secret sauce” that contributed to the success. The authors highlight the potential applicability of their techniques to other application areas. In our editorial process, we have emphasized the accessibility of these gems to researchers in other areas.

When we issued the call for proposals for the first GPU Computing Gems, we received more than 280 submissions, an overwhelming response. After careful review, we accepted 110 proposals that have a high likelihood of making valuable contributions to other application developers. Many high-quality proposals were not accepted because of concerns that they may not be accessible to a large audience. With so many accepted proposals, we were forced to divide these gems into two volumes. This volume covers 50 gems in the application areas of scientific simulation, life sciences, statistical modeling, emerging data-intensive applications, electronic design automation, ray tracing and rendering, computer vision, video and image processing, signal and audio processing, and medical imaging.
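
The “time travel” comparison earlier in this introduction rests on simple arithmetic: if performance doubles every two years, a speedup of S corresponds to roughly 2 × log2(S) years of waiting. The following is a minimal Python sketch of that calculation under this idealized doubling assumption; the function name `equivalent_years` and the sample speedups are illustrative choices, not taken from the text.

```python
import math


def equivalent_years(speedup: float, doubling_period_years: float = 2.0) -> float:
    """Years of steady performance doubling needed to reach `speedup`.

    Assumes an idealized model in which performance doubles once every
    `doubling_period_years` (2 years in the introduction's comparison).
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return doubling_period_years * math.log2(speedup)


# Illustrative speedups: the 10X and 100X endpoints reported by practitioners,
# plus 16X and 64X, which are exact powers of two.
for s in (10, 16, 64, 100):
    print(f"{s:>3}X speedup is about {equivalent_years(s):.1f} years of doubling")
```

Under this model, speedups of 16X to 64X map onto exactly 8 to 12 years, while the 10X and 100X endpoints come out near 6.6 and 13.3 years, so the quoted range is best read as a rough estimate rather than an exact conversion.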
