Program Guide: Sept 20-23, 2010 San Jose Convention Center


Sept 20-23, 2010

San Jose Convention Center

PROGRAM
GUIDE

PRESENTED BY SPONSORED BY
Microsoft and Parallel Nsight – a powerful combination. Use Parallel Nsight to integrate into Microsoft
Visual Studio, the world’s most popular development environment. Run GPGPU accelerated applications
on your desktop, on Windows Server or use Windows HPC Server 2008 for cluster applications.
Learn more at table 87 or the Parallel Nsight Lounge by Microsoft in the concourse area.
INTRODUCTION

The world, these days, isn’t flat. It’s parallel.

The parallel-processing power unleashed by the graphics processing unit, or GPU, is changing the face of computing. And, as it changes, our ability to address some of the world’s most vexing challenges is improving.

Performing safer heart surgery. Making cars safer to drive. Drilling oil wells more accurately. Solutions to these and other complex computational problems have rapidly moved within reach, yielding results that have the potential to change lives and, ultimately, society as a whole.

Facilitating this work is the GPU, one of the most sophisticated processors ever manufactured. With up to three billion transistors in an area the size of a postage stamp, it can accelerate applications by several hundred times, shortening time to discovery from days to minutes.

At the same time, GPUs are far more power efficient than clusters designed exclusively with CPUs. And they are significantly less costly. In June, China’s National Supercomputing Center, in Shenzhen, unveiled the world’s second-most powerful supercomputer, powered by Tesla GPUs, which was developed in a matter of months. Other enormously powerful GPU-powered supercomputers are on the way.

Another indicator of the triumph of parallel computing is the growing relevance of CUDA, the architecture developed by NVIDIA that enables GPUs to understand industry standard computing languages, as well as graphics APIs. Over the past year, NVIDIA has shipped more than 70 million CUDA-enabled GPUs; 10 textbooks about CUDA have been published in Chinese, English, Japanese and Russian; and two more universities each week, on average, are adopting CUDA into their curriculums.

And in the graphics space, NVIDIA’s new Quadro professional graphics are ushering in a new era of computational visualization, bringing significant change to broadcast and film production, medical imaging and seismology, among other fields.

Perhaps the most immediately obvious sign of the importance of parallel computing, though, is the GPU Technology Conference itself. Last year’s event far outstripped expectations in terms of attendance. And interest this year suggests that a revolution is at hand. Consider the following:
> The response to the call for talks at this year’s conference rose more than fourfold from last year, while the number of sessions has more than doubled to some 300 hours.
> Representatives from more than 100 universities are registered.
> Attendees have arrived from some 50 countries.

For all these metrics, the ultimate success of the GPU Technology Conference, though, will be measured by the level of engagement it inspires, by the side conversations that it generates and the collaboration that it leads to.

Brace yourself for immersion in this brave new world!

The wait is over.
Experience breakthrough performance with the all-new
Mercury Playback Engine in Adobe® Premiere® Pro CS5.

www.adobe.com/go/productionpremium
IMPORTANT INFORMATION
If there is anything else we can do to make your conference experience better,
please stop by the info desk and let us know!

REGISTRATION / INFORMATION DESK HOURS

Monday, September 20 8:00 AM to 6:00 PM
Tuesday, September 21 7:00 AM to 7:00 PM
Wednesday, September 22 8:00 AM to 6:00 PM
Thursday, September 23 8:00 AM to 6:00 PM

EXHIBIT HALL AND MEAL HOURS
Tuesday, September 21 12:00 PM to 2:00 PM Lunch / Exhibits Open
6:00 PM to 8:00 PM Reception / Exhibits Open
Wednesday, September 22 12:00 PM to 2:00 PM Lunch / Exhibits Open
6:00 PM to 8:00 PM Reception / Exhibits Open
Thursday, September 23 12:00 PM to 2:00 PM Lunch / Exhibits Open

ENROLL INTO YOUR SESSIONS Go to www.nvidia.com/gtc, click on “view schedule,” and log in to start adding
sessions into your personal schedule. Priority access into each session will be
given to those who enroll. Enrolling into sessions also helps us place the most
popular sessions into the largest rooms.

WIRELESS INTERNET ACCESS Free wireless internet access is available in most session rooms, the keynote
hall, and also in the concourse outside of Ballroom A and the Exhibit Hall, under
“GTC2010.”

FIND OUT THE LATEST HAPPENINGS WITH THE CONFERENCE Log on to www.nvidia.com/gtc to get the latest
coverage on the event, along with any updates on room changes, access to session feedback survey, etc.

BUSINESS CENTER / SHIPPING The Marriott Hotel and the Hilton Hotel both have business centers located
on the first floor, near their front lobby. You can work out shipments with their
respective front desks or bell desks. Alternatively, there is a FedEx Office Print
& Ship Center at 93 E. San Carlos St, near 3rd St (3 blocks from the Convention
Center, call 408-295-4336 for hours).

GO GREEN! Take part in the shared goal of minimizing our collective impact on the
environment. Please take only the conference materials you need and recycle
your badges at the conclusion of the event. Also, we have provided GTC coffee
mugs for those who opted in, so please use those for your hot and cold
beverages to avoid contributing more waste to the environment.

BAG AND COAT CHECK Bag check is available at the bell desk of the Marriott and Hilton hotels,
connected to the Convention Center. It is also available for a small fee on the
ground floor of the San Jose Convention Center.

LOST AND FOUND Please check the info desk should you lose or find an article.

FIRST AID / EMERGENCY Should there be a medical emergency, please dial 911 and alert the nearest
conference personnel.
Unparalleled PowerEdge
flexibility with the M610x
The Dell™ PowerEdge™ M610x blade server allows you to
creatively incorporate a vast array of expansion solutions,
including the NVIDIA® Tesla™ GPGPU card.

Now, a single M610x, equipped with an NVIDIA Tesla GPGPU card and
installed in a PowerEdge M1000e blade enclosure, can perform over 400
Gigaflops of double-precision computations for demanding, floating-point-
intensive workloads. The x16 Gen2 PCIe slots in the PowerEdge M610x
bring a new dimension of flexibility and performance to your data center.

PowerEdge M1000e

PowerEdge M610x

NVIDIA Tesla

Discover more at Dell.com/PowerEdge

©2010 Dell Inc. All rights reserved. Dell, the DELL logo, the DELL badge, and PowerEdge are trademarks of Dell Inc. NVIDIA is a registered
trademark and Tesla is a trademark of NVIDIA Corporation in the U.S. and other countries. The content provided is as is and without express or
implied warranties of any kind. ad# G10000156
TABLE OF CONTENTS

Important Information
Conference Highlights – Don’t Miss These Events!
Recommended for Academics
Emerging Companies
Sessions Listing – Monday
Sessions Listing – Tuesday
Sessions Listing – Wednesday
Sessions Listing – Thursday
Research Posters Listing
Speaker and Panel Listing
Exhibit Hall Map & Directory
Sponsors & Exhibitors
Quadro Professional Graphics
NVIDIA® Quadro® by PNY®
Professional Solutions. Professional Support.

Visit the PNY booth and see how NVIDIA Quadro by PNY and Quadro Plex multi-GPU
solutions are used today to enhance real-world professional graphics and HPC applications.
Meet with our experienced product managers and partners to discuss your development and
application needs. Learn how NVIDIA Quadro by PNY professional solutions enable new
technical and business possibilities.

PNY Professional Graphics Advantage:

• 3 year warranty
• Pre-sales support and system configuration assistance
• Dedicated Quadro Field Application engineers
• Direct tech support hot lines
• Certified software support and bug reporting
• Published product support and training materials
• Support for all workstation brands and complex installations
• Long product life cycles and availability

Partner optimized applications and certified leading ISVs for:

• Large Scale Visualization
• HPC Environments
• CAD/CAE
• Digital Content Creation
• HD Broadcast and Film
• Video Editing and Effects

Find out more at: www.pny.com/quadro

Features and specifications subject to change without notice. The PNY logo are registered trademarks of
PNY Technologies, Inc. All other trademarks are the property of their respective owners. © 2009 PNY Technologies, Inc. All rights reserved.
CONFERENCE
HIGHLIGHTS
DON’T MISS THESE
EVENTS!
ALL WEEK LONG
Parallel Nsight Lounge, by Microsoft
While attending GTC, come learn from the experts at the Parallel Nsight™ Lounge
by Microsoft, a casual environment for hands-on learning and instruction on
Parallel Nsight, the industry’s first development environment for GPU-accelerated
applications. Experts from NVIDIA and Microsoft will be available from 10am to
8pm each day to answer questions and provide instruction on Parallel Nsight,
Visual Studio 2010, Windows HPC Server 2008 and CUDA C/C++ development.

CUDA Certification for GPU Computing Developers

The CUDA Certification Program is a response to the growing demand for
qualified parallel programmers. To become CUDA certified, candidates must
demonstrate an excellent working knowledge of the CUDA architecture and
programming model, the ability to apply CUDA constructs to common algorithmic
frameworks, and a strong understanding of optimization techniques related to
CUDA C-based code.
> During GTC 2010, visit the “Ask the CUDA Experts” table to learn more!
> Additional info available at http://www.nvidia.com/certification

DigitalGuru: Where Smart People Get Smarter

DigitalGuru Technical Bookshop of Cupertino, California is pleased to participate
in GTC 2010. Please visit our table during the conference for a wide and relevant
selection of books on parallel programming, computer science, application tools
and more. Books sold at GTC are available at 20% off list price. For more info, see
www.digitalguru.com
Birds of a Feather Gathering Places
Have a break in your schedule and want to network with some of the other
brilliant minds here this week? Flock to one of the many “Birds of a Feather
Gathering” tables out on the concourse. Each table is labeled with an interest
group ranging from Computational Finance to Computer Vision. Tables are open all
day and evening.

TUESDAY
09:00 - 10:30 Opening Keynote with Jen-Hsun Huang, NVIDIA CEO and Co-Founder
>Keynote Hall
12:00 - 14:00 Exhibits Open / Networking Lunch >Exhibit Hall

18:00 - 20:00 Posters Showcase / Exhibits Open / Networking Reception

>Exhibit Hall and Concourse

WEDNESDAY
09:00 - 09:50 Day 2 Keynote with Dr. Klaus Schulten, University of Illinois at Urbana-
Champaign >Keynote Hall
10:00 - 10:50 Emerging Companies Summit Opening Address and Highlights >Keynote Hall
12:00 - 14:00 Exhibits Open / Networking Lunch >Exhibit Hall

18:00 - 20:00 Exhibits Open / Networking Reception >Exhibit Hall

THURSDAY
09:00 - 09:50 Emerging Companies Summit “Fireside Chat” featuring Quentin Hardy (Forbes
Magazine) and Jen-Hsun Huang (NVIDIA) >Keynote Hall
12:00 - 14:00 Exhibits Open / Networking Lunch >Exhibit Hall
17:00 - 18:30 Closing Keynote with Sebastian Thrun, Professor / Distinguished Engineer,
Stanford University and Google >Keynote Hall

Closing Night Party for Charity

What happens when you throw together live music, raffle prizes, your GTC colleagues
and a charity? A great party for a great cause. You can feel even better about letting
loose as every dollar raised will be matched by the NVIDIA Foundation. For $10, you get
a free drink ticket plus an entry into a raffle to win some stellar prizes. Learn more and
buy your tickets at the NVIDIA Foundation table, the Gear Store booth, or buy at the door.
Time: 20:00 - ?
Location: Voodoo Lounge, 14 S. Second Street, near Santa Clara Street. See your badge
insert for a map to this location.
An initiative sponsored by the NVIDIA Foundation

There’s a role for everyone in the search for a cure.

Stop by the NVIDIA Foundation booth in the exhibit hall
during GTC to learn about our efforts to help scientists
speed the cure for cancer and to see how you can help.
CLOSING NIGHT PARTY FOR CHARITY

What happens when you throw together live music, raffle prizes, your GTC
colleagues and a charity? A great party for a great cause.

You can feel even better about letting loose as every dollar raised will be
matched by the NVIDIA Foundation.

For $10, you get a free drink ticket plus an entry into a raffle to win some stellar
prizes. Learn more and buy your tickets at the NVIDIA Foundation table on the
concourse, the Gear Store booth, or buy at the door.

WHEN: Thursday, September 23 at 8:00 PM
WHERE: Voodoo Lounge, 14 S. Second Street, near Santa Clara Street

[Map of downtown San Jose marking the Voodoo Lounge and the San Jose McEnery Convention Center (“You are here”), with the Marriott Hotel, Hilton Hotel, Fairmont Hotel, Hotel Montgomery, Civic Auditorium, Tech Museum of Innovation and San Jose Museum of Art as landmarks.]

About the Cause

This year’s recipient is Hope Lab, an organization that aims to enhance the physical health and psychological well-being of young people with cancer. Your donations will support the global distribution of Hope Lab’s popular kids video game, Re-Mission, the first video game shown to induce positive behaviors that enhance the effectiveness of medical treatment. In Re-Mission, players pilot a nanobot named Roxxi as she travels through the bodies of fictional cancer patients destroying cancer cells, battling bacterial infections, and managing side effects associated with cancer and cancer treatment. Research shows that patients who played Re-Mission stuck to their prescribed treatments more consistently, a key component of successful cancer treatment, and showed increases in cancer knowledge and self-efficacy. Get your own copy at: www.re-mission.net.

About the NVIDIA Foundation

The NVIDIA Foundation is one of the only employee-led foundations in Silicon Valley. Each year, an employee board of directors receives feedback from our global employee population regarding the issues they care most about addressing, such as education and cancer research, and puts together an exciting set of programs that leverages NVIDIA’s unique strengths as a company: having a transformative impact, engaging our employees, leveraging our ecosystem of customers, suppliers and vendors. The Foundation is developing a program called Compute the Cure, to drive forward a cure for cancer by supporting researchers working with gene sequencing technologies. Stop by the Foundation’s booth at GTC, near the NVIDIA Store and Press Lounge to learn more.
Recommended for Academics

Tuesday / September 21
TIME ID / SESSION TITLE
09:00 - 10:30 1001 – Opening Keynote with Jen-Hsun Huang
11:00 - 12:00 2223 – Academic Welcome Social and Poster Review
11:00 - 11:50 2112 – The Heisenberg Spin Glass Model on GPU: Myth versus Fact
14:00 - 14:50 2262 – CUDA Centers of Excellence Super-Session I
15:00 - 15:50 2263 – CUDA Centers of Excellence Super-Session II
16:00 - 16:50 2264 – CUDA Centers of Excellence Super-Session III
17:00 - 17:50 2265 – CUDA Centers of Excellence Super-Session IV
18:00 - 18:50 1005 – Research Poster Showcase / Exhibits Open / Networking Reception

Wednesday / September 22
TIME ID / SESSION TITLE
09:00 - 09:50 1002 – Keynote with Dr. Klaus Schulten,
University of Illinois at Urbana-Champaign
10:00 - 10:50 2280 – TSUBAME2.0 Experience
10:00 - 10:50 2082 – CU-LSP: GPU-based Spectral Analysis of Unevenly Sampled Data
10:00 - 10:20 2163 – Leveraging GPUs for Evolutionary Game Theory
10:00 - 10:50 2249 – New Programming Tools GPU Computing
10:00 - 10:50 2166 – The Triad of Extreme Computing-Fast Algorithms, Open Software and
Heterogeneous Systems
10:00 - 10:50 2058 – A Practical Introduction to Computational Fluid Dynamics on GPUs
11:00 - 11:50 2078 – Shockingly fast and accurate CFD simulations
11:00 - 11:50 2286 – Towards Peta-Scale Green Computation- Applications of the GPU
Supercomputers in the Chinese Academy of Sciences (CAS)
11:00 - 11:50 2177 – Simplifying Parallel Programming with Domain Specific Languages
14:00 - 14:50 2248 – Parallel Processing on GPUs at the University of Utah
14:00 - 14:50 2137 – CUDA for Real-Time Multigrid Finite Element Simulation of Soft Tissue
Deformations
14:00 - 14:50 2000 – Gravitational N-body Simulations: How Massive Black Holes Interact with
Stellar Systems
14:00 - 14:50 2164 – Analytical Performance Models to Improve the Efficiency of
GPU Computing
14:00 - 14:50 2068 – Parallelizing FPGA Technology Mapping using GPUs
14:00 - 14:50 2204 – Bridging GPU Computing and Neuroscience to Build Large-Scale Face
Recognition on Facebook.
15:00 - 15:50 2050 – Copperhead: Data-Parallel Python for the GPU
15:00 - 15:50 2029 – Computer Vision Algorithms for Automating HD Post-Production
15:00 - 15:50 2044 – GRASSY: Leveraging GPU Texture Units for Asteroseismic Data Analysis
15:00 - 15:50 2122 – Using GPUs for Real-Time Brain-Computer Interfaces
15:00 - 15:50 2281 – Domain-Specific Languages
16:00 - 16:50 2108 – Binary Black Holes Simulations using CUDA
16:00 - 16:50 2118 – Large-scale Gas Turbine Simulations on GPU Clusters
16:00 - 16:50 2135 – Processing Petabytes per Second with the ATLAS experiment at the Large Hadron
Collider at CERN
16:00 - 16:50 2226 – Reverse Time Migration with GMAC
17:00 - 17:50 2005 – Porting Large-Scale Legacy Fortran Codes
17:00 - 17:50 2242 – Swarming Bacteria and Diffusing Particles: High-Throughput Analysis of
Microscopic 3D Motion
17:00 - 17:50 2167 – Designing a Geoscience Accelerator Library Accessible from High Level Languages

Thursday / September 23
TIME ID / SESSION TITLE
09:00 - 09:50 2030 – High-Throughput Cell Signaling Network Learning with GPUs
09:00 - 09:50 2236 – A Work-Efficient GPU Algorithm for Level Set Segmentation
10:00 - 10:50 2001 – Acceleration of the Freesurfer Suite for Neuroimaging Analysis
10:00 - 10:50 2269 – Bringing GPUs to Mainstream Molecular Dynamics Packages
10:00 - 10:50 2176 – Easy GPU Meta-programming: A Case Study in Biologically-Inspired
Computer Vision
10:30 - 10:50 2292 – Implementation of High-Order Adaptive CFD Methods on GPUs
11:00 - 11:50 2007 – Folding@home: Petaflops on the Cheap Today; Exaflops Soon?
14:00 - 14:50 2054 – NAMD, CUDA, and Clusters: Taking GPU Molecular Dynamics Beyond
the Desktop
14:00 - 14:50 2210 – GPU-Ocelot: An Open Source Debugging and Compilation Framework
for CUDA
15:00 - 15:50 2062 – HOOMD-blue: Fast and Flexible Many-Particle Dynamics
17:00 - 18:30 1003 – Closing Keynote with Dr. Sebastian Thrun, Stanford University
20:00 - ??? 1007 – Closing Night Party for Charity
EMERGING COMPANIES SUMMIT

Welcome back to NVIDIA’s annual Emerging Companies summit. I’m delighted to report that 2010 marks the third consecutive and successful year for ECS, and our momentum continues to build!

The Emerging Companies Summit (an integral part of the GPU Technology Conference) is now the premier event for startups to share new applications, based on GPUs (graphics processing units) that are revolutionizing the computing industry. At the same time, these startups will have an opportunity to meet with hundreds of technologists, investors, analysts and executives who add additional “fuel” to the GPU computing ecosystem.

Familiar sectors such as media and entertainment have already been fundamentally altered by the GPU. Movie director James Cameron often says that Avatar could not have been created 10 years ago – it required powerful graphics processors to bring his vision to life. In addition to entertainment, new industries and applications are also being unleashed and significantly enhanced by GPU-based technologies.

At this year’s ECS, I hope you will take full advantage of the opportunity to see and hear from 60 of the most promising companies in these fields. The companies, representing several countries from around the world, will showcase new technology in the fields of computer vision, robotics, video processing, cloud computing and mobile computing. What they all share in common is that they harness the massive power of GPUs to drive amazing performance for their applications.

In the spirit of innovation, this year we decided to introduce a new and exciting format for the startup presentations at ECS. While you will still be able to find booths in the exhibit hall for almost all 60 of the emerging companies, with the help of an advisory committee we have chosen a select group of 24 to participate in action-packed “CEO on Stage” sessions. The CEOs from these 24 emerging companies will have the special opportunity to both present and discuss their business strategies with panels comprised of some of the world’s leading and most impressive venture capitalists, technology executives and industry analysts. I believe this will be one of the major highlights of this year’s GPU Technology Conference, and I urge you to participate in as many of these sessions as possible.

In closing, I am extremely excited and honored to once again be part of the Emerging Companies Summit. The GPU computing ecosystem has now gathered a full head of steam, which will be clearly evident over the next few days. I would also like to add a special note of thanks to our sponsors which include Cooley Godward Kronish, Citi, Sutter Hill Ventures, Silicon Valley Bank, Deloitte, Mandel Communications, Churchill Club, and VentureBeat.

Thank you for attending, and welcome to the GPU computing revolution!

Jeff Herbst
Vice President of Business Development
NVIDIA
Experienced Guides

Cooley is a proud Platinum Sponsor of the 2010 NVIDIA GTC Conference Emerging
Company Summit.

Cooley attorneys have served as counselors, strategists and advocates to
technology entrepreneurs and investment funds since 1959.

Cooley, a national law firm for the converging worlds of high technology, high
finance and high-stakes litigation.
For more information, visit us at www.cooley.com

© 2010 Cooley LLP, 101 California Street, 5th Floor, San Francisco, CA 94111. 415/693-2000.
RECOMMENDED SESSIONS FEATURING EMERGING COMPANIES
Emerging Companies Summit Agenda

Tuesday / September 21
TIME ID / SESSION TITLE
9:00 – 10:30 1001 – Opening Keynote with Jen-Hsun Huang
12:00 – 14:00 1004 – Exhibits Open / Networking Lunch
18:00 – 20:00 1005 – Exhibits Open / Networking Happy Hour/ Research Posters Showcase

Wednesday / September 22
TIME ID / SESSION TITLE
9:00 – 9:50 1002 – Day 2 Keynote with Dr. Klaus Schulten, University of Illinois at
Urbana-Champaign
10:00 – 10:50 4000 – Emerging Companies Summit Opening Address featuring
Jeff Herbst (NVIDIA)
11:00 – 11:50 4001 – Emerging Companies Summit “CEO on Stage” featuring Sam Blackman
(Elemental Technologies, Inc.), Sam Cox (Milabra), Chris Doran
(Geomerics) and panelists Drew Lanza (Partner, Morgenthaler), Dan’l
Lewin (Corporate VP and Strategic & Emerging Business Development,
Microsoft), Jon Peddie (President, JPR), Jeff Herbst (Vice President of
Business Development, NVIDIA)
12:00 – 14:00 1004 – Exhibits Open / Networking Lunch
14:00 – 14:50 4002 – Emerging Companies Summit “CEO on Stage” featuring Christopher
Blewitt (miGenius), Sebastien Deguy (Allegorithmic), Philip Lunn
(Bunkspeed) and panelists Drew Lanza (Partner, Morgenthaler), Dan’l
Lewin (Corporate VP and Strategic & Emerging Business Development,
Microsoft), Jon Peddie (President, JPR), Jeff Herbst (Vice President of
Business Development, NVIDIA)
15:00 – 15:50 4003 – Emerging Companies Summit “GPUs for Computer Vision” moderated by
Jon Peddie (Jon Peddie Research), featuring panelists Sam Cox (CEO,
Milabra), Tom Dean (Research Scientist, Google), Janko Mrsic-Flogel (CTO,
MirriAd), Joe Stam (Sr. Applications Engineer, NVIDIA), Yoram Yaacovi
(CTO & General Manager, Technologies at Microsoft)
16:00 – 16:50 4004 - Emerging Companies Summit “CEO on Stage” featuring Michael
Hummel (empulse GmbH), Natan Peterfreund (Playcast Media Systems),
Austin Shoemaker (Cooliris) and panelists Nathan Brookwood (Research
Fellow, Insight64), Charles Carmel (VP of Corporate Business
Development, Cisco), Flip Gianos (General Partner, InterWest
Partners), Jeff Herbst (Vice President of Business Development, NVIDIA)
17:00 – 17:50 4005 - Emerging Companies Summit “CEO on Stage” featuring Michel Tombroff
(Softkinetic), Uri Tal (Rocketick), Kristian Raue (Jedox Business
Intelligence) and panelists Nathan Brookwood (Research Fellow,
Insight64), Charles Carmel (VP of Corporate Business Development,
Cisco), Flip Gianos (General Partner, InterWest Partners),
Jeff Herbst (Vice President of Business Development, NVIDIA)
18:00 – 20:00 1005 - Exhibits Open / Networking Happy Hour
Thursday / September 23
TIME ID / SESSION TITLE
9:00 – 9:50 4006 – Emerging Companies Summit “Fireside Chat” featuring Quentin Hardy
(Forbes Magazine) and Jen-Hsun Huang (Co-founder & CEO, NVIDIA)
10:00 – 10:50 4007 – Emerging Companies Summit “CEO on Stage” featuring Andrew Jamison
(Scalable Display Technologies), Jeroen Snepvangers (RTT), Michael
Zeitlin (Aqumin) and panelists Rob Enderle (Analyst, Enderle Group),
Jeff Herbst (Vice President of Business Development, NVIDIA), Savitha
Srinivasan (Corporate Venture Partner, IBM), Norman Winarsky (VP of
Ventures, Licensing and Strategic Programs, SRI)
11:00 – 11:50 4008 – Emerging Companies Summit “CEO on Stage” featuring David Peters
(Universal Robotics), David Hayes (ICD) and panelists Rob Enderle
(Analyst, Enderle Group), Jeff Herbst (Vice President of Business
Development, NVIDIA), Savitha Srinivasan (Corporate Venture Partner,
IBM), Norman Winarsky (VP of Ventures, Licensing and Strategic
Programs, SRI)
12:00 – 14:00 1004 – Exhibits Open / Networking Lunch
14:00 – 14:50 4009 – Emerging Companies Summit “The `New Normal’ For Building Emerging
Companies Based On Disruptive Technologies” moderated by Jeff Herbst
(NVIDIA), featuring panelists Gerald Brady (Silicon Valley Bank), Bill
Frauenhofer (Managing Director, Citigroup Global Markets), Garrett
Herbert (Partner, M&A Transaction Services, Deloitte & Touche LLP), Eric
Jensen (Partner, Business Department Chair, Cooley LLP), Andrew T.
Sheehan (Managing Director, Sutter Hill Ventures)
15:00 – 15:50 4010 - Emerging Companies Summit “CEO on Stage” featuring Yoram Burg
(OptiTex), Sylvain Ordureau (Useful Progress), Torsten Reil (NaturalMotion)
and panelists Tim Bajarin (Creative Strategies), Bill Tai (Charles River
Ventures), Paul Weiskopf (Adobe)
16:00 – 16:50 4011 – Emerging Companies Summit “CEO on Stage” featuring Jeff Han
(Perceptive Pixel), Lance Maurer (Cinnafilm, Inc.), Bruno Uzzan (Total
Immersion) and panelists Tim Bajarin (President, Creative Strategies),
Jeff Herbst (Vice President of Business Development, NVIDIA), Bill
Tai (General Partner, CRV), Paul Weiskopf (Sr. VP of Corporate
Development, Adobe)
17:00 – 18:30 1003 – Closing Ceremony / “Ones to Watch” Award Presentation and Closing
Keynote with Dr. Sebastian Thrun, Stanford University
20:00 – ? 1007 – Closing Party for Charity

Emerging Companies in the Exhibit Hall
For full profile, please see “Sponsors and Exhibitors” section.

3DreamTeam
3DTV Solutions
AccelerEyes
Acceleware
Allegorithmic
Binatix, Inc.
Biodigital Systems
Bunkspeed
Cinnafilm
CodeSourcery
Cooliris
Cyberlink Corporation
Discretix Technologies
embodee
EM Photonics
Empulse GmbH
Filter Foundry
Geomerics
HPC Projects / Wild Systems
Israel Economic Mission (IEM)
Jedox Business Intelligence
Mersive Technologies
miGenius
Milabra
Mingleverse
NaturalMotion Ltd
OptiTex
Perceptive Pixel
PhaseSpace, Inc.
Phototour
Playcast Media Systems
Prometech Software, Inc.
RealityFrontier
Reservoir Labs
RTT
Scalable Display Technologies
ScaleForm Corporation
SEACO2
Softkinetic
Stonetrip
Tide Powered Ltd.
Trinity Racing Concepts, LLC
Universal Robotics
Useful Progress
VisiSonics Corporation

CEO ON STAGE LISTING

Allegorithmic

Allegorithmic is the company behind Substance, the first professional
middleware for the authoring and on-the-fly rendering of smart textures.
Substance allows content developers to produce textures twice as fast as usual,
while Substance description files are typically 500-1000 times smaller than
regular bitmaps. Creating and distributing textures for online and mobile 3D
games has never been this efficient.

Speaker Sébastien Deguy, Founder & CEO

Speaker Session
4002 - Emerging Companies: CEO on Stage featuring Allegorithmic, Bunkspeed,
and miGenius (Wednesday, Sept 22, 14:00)

CEO Sébastien Deguy

Investors Undisclosed
Capital Raised Undisclosed

Aqumin

Aqumin LLC invented AlphaVision™ to give financial professionals the ability
to see relationships between data across entire markets at once. More than
just another heat map or static table, AlphaVision™ converts financial data
into interactive three-dimensional financial landscapes that enable real-time,
multi-variate comparative analysis. Coupled with data services from premier
providers such as Bloomberg, Thomson Reuters, ActivFinancial and others,
Aqumin is developing a rich view library that simplifies the process of gathering,
organizing, and presenting relevant information. At the click of a button,
AlphaVision™ facilitates the seamless transition from current table-based views
of global securities markets to an interactive, multi-dimensional workspace.

Speaker Michael Zeitlin, CEO

Speaker Session
4007 - Emerging Companies: CEO on Stage featuring Aqumin, RTT, and Scalable
Display Technologies (Thursday, Sept 23, 10:00)

CEO Michael Zeitlin

Investors Private Individuals
Capital Raised $2 Million
Bunkspeed

Bunkspeed is a 3D rendering and animation software developer based in
Carlsbad, California. Our philosophy has been to create easy-to-use software
for creative people with no prior 3D modeling or rendering experience, thus
expanding the market beyond traditional boundaries established by complex
rendering software. Founded in 2003, Bunkspeed software has become the
standard in the industrial design community and is spreading rapidly to the
engineering design and marketing communities. Recently Bunkspeed has
introduced its new generation of 3D rendering software based on mental
images iray accelerated by NVIDIA CUDA GPUs.

Speaker Philip Lunn, CEO

Speaker Session
4002 - Emerging Companies: CEO on Stage featuring Allegorithmic, Bunkspeed,
and miGenius (Wednesday, Sept 22, 14:00)

CEO Philip Lunn

Investors Undisclosed
Capital Raised Undisclosed

Cinnafilm

Cinnafilm™ was founded to address the absence of quality, software-based tools
to visually optimize, convert and repurpose video images in the post production
market. Recognizing the power of modern Graphics Processing Units (GPUs)
and the accelerating migration to file-based workflows, Cinnafilm disregarded
prevailing industry methods in 2005 and started from scratch to create what has
become the world’s fastest, most accurate GPU-based image processing engine,
Pixel Strings™. Cinnafilm is a growing company that has successfully partnered
with some of the strongest names in the motion picture and television industry:
ARRI, Quantel, Rhozet, and NVIDIA. Our partners have recognized the power of
Pixel Strings and the superlative image quality which can result when this power
is properly harnessed. Cinnafilm is a privately held company, headquartered in
Albuquerque, NM, amongst powerful resources such as the nation’s defense
laboratories and New Mexico’s highly competitive film tax incentives.

Speaker Lance Maurer, CEO

Speaker Session
4011 - Emerging Companies: CEO on Stage featuring Cinnafilm, Perceptive
Pixel and Total Immersion (Thursday, Sept 23, 16:00)

CEO Lance Maurer

Investors Undisclosed
Capital Raised Undisclosed
Financial capital. Intellectual capital.

In more than 100 countries around the world, Citi is helping companies,
governments and institutions overcome business challenges, raise capital,
mitigate risk and extend their reach.

When you partner with Citi, you gain access to an unparalleled global
platform, capital markets, insightful advice and award-winning solutions —
so you can realize your goals today and in the future.

Success requires both financial and intellectual capital, and that’s why
Citi never sleeps.

Please visit us at www.icg.citi.com.

© 2009 Citigroup Inc. All rights reserved. Citi and Arc Design is a trademark and service mark of Citigroup Inc., used
and registered throughout the world. Citi Never Sleeps is a service mark of Citigroup Inc.
Cooliris

Cooliris was founded with a simple mantra: “Think beyond the browser”. The
company creates products that make discovering and enjoying the Web more
exciting, efficient, and personal. Core products include Cooliris (formerly
PicLens), which transforms your browser into an interactive, full-screen
“cinematic” experience for web media, and CoolPreviews, which lets you preview
links instantly. Cooliris has reached over 12 million installs of the product, with
thousands more downloads every day.

Speaker Austin Shoemaker, CTO

Speaker Session
4004 - Emerging Companies: CEO on Stage featuring Cooliris, empulse GmbH,
and Playcast Media Systems (Wednesday, Sept 22, 16:00)

CEO Soujanya Bhumkar

Investors Kleiner Perkins Caufield & Byers, DAG Ventures, The Westly Group,
and T-Venture

Capital Raised $20+ Million

Elemental Technologies

Elemental Technologies is the leading provider of massively parallel video
processing solutions for broadcast and online video customers. Elemental’s
products use off-the-shelf, programmable graphics processing units (GPUs)
for compute-intensive video processing tasks. The product line is ideal for
digital media workflows that require video encoding for Internet and mobile
distribution.

Speaker Sam Blackman, CEO & Co-Founder

Speaker Session
4001 - Emerging Companies: CEO on Stage featuring Elemental Technologies,
Geomerics, and Milabra (Wednesday, Sept 22, 11:00)

CEO Sam Blackman

Investors Elemental’s venture capital investors are Steamboat Ventures,
headquartered in Burbank, California; General Catalyst, headquartered in
Cambridge, Massachusetts; and Voyager Capital, headquartered in Seattle,
Washington. Each venture capital investor has a board seat, alongside other
industry experts on the Elemental board who hail from Adobe and Pixelworks.

Capital Raised $14 Million

empulse GmbH

empulse was founded in 2007 by two ex-Accenture managers to help clients
realize complex interactive web projects. Today, empulse employs 20 IT
professionals and realizes projects for leading international corporations.
To deliver outstanding search and analysis performance, empulse developed
ParStream, an innovative database for handling billions of structured data sets.

Speaker Michael Hummel, Managing Director

Speaker Session
4004 - Emerging Companies: CEO on Stage featuring Cooliris, empulse GmbH,
and Playcast Media Systems (Wednesday, Sept 22, 16:00)

CEO Joerg Bienert

Investors Privately held by three founders
Capital Raised Undisclosed

Geomerics

Geomerics is an innovation-led company built on advanced in-house IP, a
world-class research team, and strong management experience. Geomerics’
first product is Enlighten. Enlighten redefines the way lighting is handled in
computer games. Instead of pre-baking the effects of global illumination into
the scene, they are computed at run time, allowing for fully dynamic lighting that
dramatically enhances quality. Enlighten gives artists total control over lighting,
driving a new generation of games that rival film for their manipulation of mood
and atmosphere. Licensees of Enlighten include AAA titles in production at EA
DICE, CCP and FunCom.

Speaker Chris Doran, Founder & Chief Operating Officer

Speaker Session
4001 - Emerging Companies: CEO on Stage featuring Elemental Technologies,
Geomerics, and Milabra (Wednesday, Sept 22, 11:00)

CEO Chris Doran

Investors Undisclosed
Capital Raised Undisclosed
ICD

ICD designs technology. We create purposeful products driven by user-centric
design principles using the latest hardware and software technologies available.
Our products are optimized for usability, aesthetics, technology and purpose. We
specialize in NVIDIA chipset technologies running Google and Windows OS. We
have basic goals: Simplify. Innovate. Impress.

Speaker David Hayes, CEO

Speaker Session
4008 - Emerging Companies: CEO on Stage featuring ICD and Universal Robotics
(Thursday, Sept 23, 11:00)

CEO David Hayes

Investors Undisclosed
Capital Raised Undisclosed

Jedox Business Intelligence

Jedox developed a centralized, in-memory, GPU-based calculation engine that
controls and stores the spreadsheet-based business intelligence data contained
in every Excel, OpenOffice and Google spreadsheet in an organization. This
technology stops “Spreadsheet Spreadmart Chaos” (hundreds of spreadsheets
with uncoupled, non-verifiable data “running amok” in an organization).

Speaker Kristian Raue, CEO & Founder

Speaker Session
4005 - Emerging Companies: CEO on Stage featuring Jedox Business
Intelligence, Rocketick, and Softkinetic (Wednesday, Sept 22, 17:00)

CEO Kristian Raue

Investors Klaus Wecken, eCapital, KfW
Capital Raised 7 Million Euros
From rule breaker to rising star to game changer.

We’re with you.

Silicon Valley Bank. For nearly three decades, we’ve led the way in recognizing the
vast potential in technology and life science companies. From jump starting start ups to global cash
management to providing debt and asset management for industry leaders, SVB understands what you
do and has the resources to provide what you need. Beyond commercial banking, we offer venture
capital services and funds, valuations and analytics, private banking and more. Silicon Valley Bank.

Named one of Forbes’ Top 5 Best Banks.

We’ll help you find a way, all the way. svb.com

Gerald Brady, Managing Director

2400 Hanover Street Palo Alto, California 94304
Phone 650.855.3072 E-mail [email protected]

©2010 SVB Financial Group.SM All rights reserved. Member Federal Reserve. Silicon Valley Bank.® All rights reserved. Member of FDIC and Federal Reserve. Rev. 08-24-10.
miGenius

Using a powerful combination of mental images Reality Server®, iray®
renderer and NVIDIA’s CUDA-based GPU hardware systems, both businesses
and consumers alike can now rapidly and simply upload any 3D content to
individually customised websites that can be immediately shared and explored
with friends and colleagues, in both accurate photorealistic detail and real-
time. miGenius is developing a toolset to enable predominantly ‘non-technical’
users to easily create customised User Interfaces with a wide range of viewing
and management controls and upload their detailed 3D scenes onto either
a dedicated GPU server or onto the rapidly emerging ‘GPU cloud computing’
networks.

Speaker Christopher Blewitt, CEO

Speaker Session
4002 - Emerging Companies: CEO on Stage featuring Allegorithmic, Bunkspeed,
and miGenius (Wednesday, Sept 22, 14:00)

CEO Chris Blewitt

Investors No current investors; seeking prospective investors
Capital Raised Undisclosed

Milabra

Milabra uses proprietary, patent-pending machine vision software to create
visual data about the web. We bridge the visual relationship between the Ad,
the Page, and the Audience for increased advertiser performance. We analyze
the visual attributes and content of webpages in order to target online display
advertising, optimizing creative choice based on the visual environment within
which it will appear. We analyze page elements as well as the photo and video
content, providing real time data for targeting and optimization as well as
defending advertisers against negative content associations.

Speaker Sam Cox, CEO

Speaker Session
4001 - Emerging Companies: CEO on Stage featuring Elemental Technologies,
Geomerics, and Milabra (Wednesday, Sept 22, 11:00)

CEO Sam Cox

Investors Undisclosed
Capital Raised Undisclosed
NaturalMotion Ltd

NaturalMotion Ltd is a leading entertainment software company with offices in
Oxford (England), San Francisco (California) and Seoul (Korea). The company
produces the widely-adopted animation technologies euphoria, morpheme and
endorphin, used across the game and movie industries by companies such
as Rockstar Games, Ubisoft, LucasArts, Disney and Bioware. NaturalMotion
recently established its own games division, NaturalMotion Games. Its first title,
Backbreaker, has since become one of the most successful sports games on
mobile devices, with more than 2.2 million downloads.

Speaker Torsten Reil, CEO

Speaker Session
4010 - Emerging Companies: CEO on Stage featuring NaturalMotion, OptiTex,
and Useful Progress (Thursday, Sept 23, 15:00)

CEO Torsten Reil

Investors Undisclosed
Capital Raised Undisclosed

OptiTex

OptiTex is the premier 2D and 3D CAD software for virtually all sewn-products
industries. OptiTex technology allows designers to create, correct and
adjust compelling designs before the first piece of fabric is cut, giving a new
dimension to the motto, “Virtual is Real”. The OptiTex system consists of three main
components: a cloth content creation system built on our PDS software; 3D Runway
Designer, a virtual try-on system that includes both cloth simulation and
accurate 3D parametric mannequins; and a motion animation engine that enables
the generation of motion sequences with interactive cloth. OptiTex brings a
wealth of virtual textile experience to the gaming, feature animation and digital
effects industries. OptiTex’s products are second only to real life in depicting
fabric movement and dynamics.

Speaker Yoram Burg, President

Speaker Session
4010 - Emerging Companies: CEO on Stage featuring NaturalMotion, OptiTex,
and Useful Progress (Thursday, Sept 23, 15:00)

CEO Yoram Burg

Investors Undisclosed
Capital Raised Undisclosed
Perceptive Pixel

Perceptive Pixel is dedicated to the research, development and deployment of
multi-touch interfaces for the knowledge worker. The company’s hardware and
software solutions enable both novice and expert users to manipulate complex
datasets through a new class of intuitive yet powerful and visually rich interface
techniques.

Speaker Jeff Han, Founder, Chief Scientist

Speaker Session
4011 - Emerging Companies: CEO on Stage featuring Cinnafilm, Perceptive Pixel
and Total Immersion (Thursday, Sept 23, 16:00)

CEO Jeff Han

Investors Undisclosed
Capital Raised Series B

Playcast Media Systems

Playcast Media Systems brings video games to the world’s largest media
distribution platform – Pay TV networks. The company’s solution includes
a head-end based system, which streams a game’s audiovisual content as
a standard MPEG stream, as well as the provisioning of the content and
programming itself. Playcast’s media streaming systems, located in operators’
headends, host the games and stream them over the existing video network
to an already distributed base of set-top boxes. Playcast is a privately owned,
venture capital backed company, based in Israel and the UK.

Speaker Natan Peterfreund, CTO

Speaker Session
4004 - Emerging Companies: CEO on Stage featuring Cooliris, empulse GmbH,
and Playcast Media Systems (Wednesday, Sept 22, 16:00)

CEO Guy de Beer

Investors Xenia Ventures and Private Investors
Capital Raised Undisclosed
Ellison & Zander 2009
Igniting conversations
to encourage innovation
and economic growth
since 1985.

Hastings & Eisner 2010

www.churchillclub.org
Churchill Club is a member-supported non-profit organization.
Top Ten Tech Trends 2010
For information about membership and upcoming programs,
please visit www.churchillclub.org
or contact us at 408.265.0130.
Join us Sept. 28th
for Churchill Club’s
Annual Dinner 2010
with Juniper Networks’
CEO, Kevin Johnson
Ignite your own conversations
RTT

Realtime Technology (RTT) AG stands for creative and fascinating 3D
visualization solutions, which bring products to life in realtime and portray
them in a natural and realistic environment. The company provides its clients
with assistance during each stage of the life cycle of their products – from the
initial product design stage through to development and subsequent marketing
and sales. The 3D data model from the product development stage serves as
the basis for all the following steps in the product lifecycle. It can be used,
for example, to rapidly create computer generated, photorealistic product
illustrations for the marketing department or to develop a 3D online product
configurator on a website. In this way, RTT doesn’t just speed up decision
making and development processes for its clients, but it also opens up new
opportunities with regard to marketing and sales. The company was founded in
1999 and its head office is in Munich, Germany. RTT AG has over 400 employees
and is represented in 14 locations worldwide. Many leading businesses have
put their trust in RTT and its portfolio of clients includes names such as Adidas,
Audi, BASF, BMW, Bosch, Daimler, EADS, Harley-Davidson, Miele, Porsche,
Samsung, Thyssen-Krupp, Toyota and Volkswagen. RTT AG is a stock market
listed company (Xetra:R1T; WKN: 701220; ISIN: DE0007012205). For more
information visit www.rtt.ag.
Speaker Jeroen Snepvangers, President and CEO
Speaker Session
4007 - Emerging Companies: CEO on Stage featuring Aqumin, RTT, and Scalable
Display Technologies (Thursday, Sept 23, 10:00)
CEOs Ludwig A. Fuchs and Jeroen Snepvangers
Investors Balderton Capital, Siemens VC, Heliad
Capital Raised 15 Million Euros

Rocketick

Rocketick is revolutionizing the chip verification world by bringing acceleration
to the fingertips of every engineer. Rocketsim is the world’s first GPU-based
logic simulation accelerator. Rocketsim is a highly cost-effective solution. It is a
pure software product that runs on commercial, low-cost GPUs that are widely
available from NVIDIA.
Speaker Uri Tal, CEO
Speaker Session
4005 - Emerging Companies: CEO on Stage featuring Jedox Business
Intelligence, Rocketick, and Softkinetic (Wednesday, Sept 22, 17:00)
CEO Uri Tal
Investors Peregrine Ventures
Capital Raised Undisclosed

Scalable Display Technologies

Scalable is a leading provider of auto-calibration software used to create
edge-blended displays. Its patented EasyBlend™ software simplifies the creation of
super-resolution, multi-projector displays of the highest quality and scalable
size. EasyBlend enables widespread use of multi-projector displays for a new
class of simulators, and supports new forms of digital signage/data visualization
tools.

Speaker Andrew Jamison, CEO

Speaker Session
4007 - Emerging Companies: CEO on Stage featuring Aqumin, RTT, and Scalable
Display Technologies (Thursday, Sept 23, 10:00)
CEO Andrew Jamison
Investors Undisclosed
Capital Raised Undisclosed
Softkinetic

Softkinetic is the leading provider of 3D gesture recognition technologies
that transform the way people interact with the digital world. We provide the
most advanced software platform for building immersive, transparent and
intuitive user experiences within the fields of Interactive Digital Entertainment,
Consumer Electronics, Interactive Marketing, Digital Signage and other
markets. Our patented 3D gesture recognition middleware, iisu™, supports all
3D cameras and supports all platforms, from PCs to embedded systems and
set-top-boxes.

Speaker Michel Tombroff, CEO

Speaker Session
4005 - Emerging Companies: CEO on Stage featuring Jedox Business
Intelligence, Rocketick, and Softkinetic (Wednesday, Sept 22, 17:00)

CEO Michel Tombroff

Investors Privately held company
Capital Raised 5 Million Euros

Total Immersion

Total Immersion is the global leader in augmented reality. Through its patented
D’Fusion™ technology, Total Immersion blurs the line between the virtual
world and the real world by integrating real time interactive 3D graphics into
a live video stream. Total Immersion offers consumers a compelling way to
interact with brands in their own environment. With augmented reality, the
brand temporarily “resides” in the viewer’s space. Imagine a favorite animated
character sitting in the next chair, or a static product suddenly “coming to life”:
that’s Total Immersion’s augmented reality.

Speaker Bruno Uzzan, Founder & CEO

Speaker Session
4011 - Emerging Companies: CEO on Stage featuring Cinnafilm, Perceptive Pixel
and Total Immersion (Thursday, Sept 23, 16:00)

CEO Bruno Uzzan

Investors Partech International, I Source Gestion, Elaia Partners
Capital Raised $8 Million
Universal Robotics

Universal Robotics creates software that enables machines to learn from their
experiences, react and adapt to their surroundings, and perform tasks that are
costly, dangerous or difficult for humans to undertake. The company’s signature
technology, Neocortex, which was developed over seven years at NASA and
Vanderbilt University, will increase efficiency and worker safety across industries
in applications including warehousing, mining, handling hazardous waste and
automating vehicles such as forklifts.

Speaker David Peters, Founder & CEO

Speaker Session
4008 - Emerging Companies: CEO on Stage featuring ICD and Universal Robotics
(Thursday, Sept 23, 11:00)

CEO David Peters

Investors Private Equity
Capital Raised Undisclosed

Useful Progress

Useful Progress, founded in 2003, develops new software strategies and
computers for high-definition 3D imaging. The company has held
«R&D Company» accreditation since 2007 and is based
at Paris Descartes University. Its applications serve sectors as diverse
as medicine, pharmaceuticals, industry, mining (diamonds), archeology and higher
education.

Speaker Sylvain Ordureau, CEO

Speaker Session
4010 - Emerging Companies: CEO on Stage featuring NaturalMotion, OptiTex,
and Useful Progress (Thursday, Sept 23, 15:00)

CEO Sylvain Ordureau

Investors Undisclosed
Capital Raised Undisclosed
GTC NETWORK
SPONSORED BY: SYNNEX
Please visit these Tesla Preferred Partner
exhibits and be entered in a drawing to win a free
NVIDIA Tesla C2050!

Ace Computers, Appro, CIARA Technologies, Colfax International,
Creative Consultants, Exxact Corporation, GraphStream Inc., JRTI,
Koi Computers, Microway, PSSC Labs, Penguin Computing, SGI
Monday, Sept 20, 13:00 (80 minutes)
Marriott San Jose Ballroom
2004 Languages, APIs and Development Tools for GPU Computing (Pre-Conference Tutorial)
Get a head start on the conference with this first-day introduction
to key technologies for GPU Computing. This 90-minute tutorial
session will cover the key features and differences between the
major programming languages, APIs and development tools
available today. Attendees will also learn several high-level design
patterns for consumer, professional and HPC applications, with
practical programming considerations for each.
Speaker(s): Phillip Miller (Director, Workstation Software Product
Management, NVIDIA), Holger Kunz (Director,
Workstation Software Development, NVIDIA), Brian
Harrison (NVIDIA), Thomas Ruge (Software
Manager, NVIDIA)
Topic(s): Programming Languages & Techniques,
Tools & Libraries

Monday, Sept 20, 13:00 (80 minutes)
Room B
2024 NVIDIA Acceleration Engines Overview (Pre-Conference Tutorial)
Come learn about the software engines NVIDIA freely provides to
application developers to rapidly leverage new GPU capabilities
and dramatically reduce the time it takes to bring compelling
features to end users.
Speaker(s): Phillip Miller (Director, Workstation Software Product
Management, NVIDIA), Holger Kunz (Director,
Workstation Software Development, NVIDIA), Brian
Harrison (NVIDIA)
Topic(s): Programming Languages & Techniques, Computer
Vision, Ray Tracing

Monday, Sept 20, 13:00 (80 minutes)
Room A5
2157 DirectX 11 Overview (Pre-Conference Tutorial)
This presentation gives an overview of the DirectX 11 pipeline and
how it extends previous DirectX versions to enable stunning visual
effects in real-time graphics applications.
Speaker(s): Cem Cebenoyan (Senior Manager, Developer
Technology, NVIDIA)
Topic(s): Computer Graphics, General Interest

Monday, Sept 20, 13:00 (80 minutes)
Room C
2158 Pipeline with OpenGL (Pre-Conference Tutorial)
This tutorial session teaches attendees how to program the
NVIDIA Quadro Digital Video Pipeline with OpenGL. It will go
in-depth into the techniques and recommended practices.
Speaker(s): Thomas True (Applied Engineer, NVIDIA)
Topic(s): Programming Languages & Techniques,
Video Processing, Computer Graphics

Monday, Sept 20, 14:30 (80 minutes)
Marriott San Jose Ballroom
2131 Introduction to CUDA C (Pre-Conference Tutorial)
Starting with a background in C or C++, learn everything you need
to know in order to start programming in CUDA C. Beginning
with a “Hello, World” CUDA C program, explore parallel
programming with CUDA through a number of hands-on code
examples. Examine more deeply the various APIs available to
CUDA applications and learn the best (and worst) ways in which
to employ them in applications. Master the first half of the book
“CUDA by Example” as taught by the author, pointing you on a
trajectory to complete the second half on your own after course
completion.
Speaker(s): Jason Sanders (Senior Software Engineer, NVIDIA)
Topic(s): Programming Languages & Techniques,

Monday, Sept 20, 14:30 (80 minutes)
Room C
2159 Programming the NVIDIA Digital Video Pipeline with Direct3D (Pre-Conference Tutorial)
Learn how to program the NVIDIA Quadro Digital Video pipeline
using Direct3D. This session will provide an overview of the SDK,
discuss device control, data transfers, performance measuring
and tuning, ancillary data and application design considerations.
Speaker(s): Thomas True (Applied Engineer, NVIDIA)
Topic(s): Programming Languages & Techniques,
Video Processing, Computer Graphics

Monday, Sept 20, 14:30 (80 minutes)
Room A5
2260 DirectCompute (Pre-Conference Tutorial)
Learn how to use the DirectCompute API to solve GPU
computing problems. This tutorial will introduce the
DirectCompute API, cover the recommended best practices for
GPU programming, and go over examples of how to use this API
efficiently and effectively to solve compute-intensive problems.
Speaker(s): Eric Young (Manager of Developer Technology
Professional and Consumer Applications, NVIDIA)
Topic(s): Programming Languages & Techniques

Monday, Sept 20, 14:30 (80 minutes)
Room B
2261 Introduction to GPU Ray Tracing with NVIDIA OptiX (Pre-Conference Tutorial)
Learn how to use NVIDIA OptiX to quickly develop high
performance ray tracing applications for interactive rendering,
offline rendering, or scientific visualization. This session will
explore the latest available OptiX version.
Speaker(s): Dave McAllister (Development Lead, OptiX
Development, NVIDIA), Phillip Miller (Director,
Workstation Software Product Management, NVIDIA)
Topic(s): Ray Tracing, High Performance Computing,
Computer Graphics

Monday, Sept 20, 16:00 (80 minutes)
Room C
2010 Implementing Stereoscopic 3D in Your Applications (Pre-Conference Tutorial)
Let’s dive into the 3rd dimension. This talk presents a
comprehensive technical overview of NVIDIA’s stereo technology
and tools. After a complete introduction to NVIDIA’s stereo
technology, we will then explore in more detail production
techniques for the new artistic space of effects and creativity
offered by 3D stereo. The takeaway of this session will be a solid
understanding of NVIDIA’s stereo technology and how to take best
advantage of it.
Speaker(s): Samuel Gateau (Developer Technology Engineer,
NVIDIA), Steve Nash (Applied Engineer, NVIDIA)
Topic(s): Programming Languages & Techniques,
Stereoscopic 3D

Monday, Sept 20, 16:00 (80 minutes)

Marriott San Jose Ballroom
2018 OpenCL on the GPU (Pre-Conference Tutorial)
OpenCL is Khronos’ new open standard for parallel programming
of heterogeneous systems. This tutorial session will introduce the
main concepts behind the standard and illustrate them with some
a simple code walkthrough. Attendees will also learn how to make
efficient use of the API to achieve good performance on the GPU.
Speaker(s): Cliff Woolley (CUDA Developer Technology
Engineer, NVIDIA)
Topic(s): Tools & Libraries

Monday, Sept 20, 16:00 (80 minutes)

Room A3
2127 OpenGL (Pre-Conference Tutorial)
This session will discuss the latest OpenGL features offered by
NVIDIA for both the Quadro and GeForce product lines. Learn more
about OpenGL 4 as well as NVIDIA specific OpenGL extensions.
Speaker(s): Mark Kilgard (Principal System Software
Engineer, NVIDIA)
Topic(s): Programming Languages & Techniques,

Monday, Sept 20, 16:00 (80 minutes)

Room B
2245 Parallel Nsight for Microsoft Visual Studio (Pre-
Conference Tutorial)
NVIDIA Parallel Nsight provides access to the power of the GPU
from within the familiar environment of Microsoft Visual Studio. In
this session, you will learn how to use Parallel Nsight to develop
GPU computing and graphics applications.

Learn how to use the powerful Parallel Nsight debugger to
identify errors in CUDA C/C++ kernels and HLSL shaders using
GPU breakpoints and direct memory and variable inspection.
See how Parallel Nsight displays system-wide performance
characteristics, allowing you to create efficient GPU algorithms.
Speaker(s): Kumar Iyer (Product Manager, NVIDIA)
Topic(s): Tools & Libraries
Welcome to the decade of smart.
A year ago, we began a global conversation about how our
planet can become smarter.
One year into this new era, the signs of a smarter planet are
all around us. Smarter systems are creating value in every
major industry and across every region of both the developed
and developing worlds.
Intelligence is being infused into the systems and processes
that make the world work, into things no one would recognize
as computers: cars, appliances, roadways, power grids, clothes,
even natural systems such as agriculture and waterways.
Trillions of digital devices, connected through the Internet, are
producing a vast ocean of data. And that information can now
be turned into knowledge because we have the computational
power and advanced analytics to make sense of it all.
In a study of 439 cities, those with transportation congestion
systems reduced average travel delays by more than 700,000
hours annually.
Eight hospitals and 470 primary care clinics improved clinical
results and operational efficiency by up to 10% through
information access at the point of care.
Leading retailers reduced supply chain costs by up to 30%
and increased sales by up to 10%.
With sophisticated mathematical models, we can actually
begin to predict and react to changes in our systems. New
York has smart crime fighting. Galway has smart water. A
smart grid in Copenhagen keeps energy flowing.
We’ve learned a lot over the past year about what it takes to
build a smarter planet. We’ve also learned about the issues
it raises, like protecting personal information and securing
critical infrastructures.
The good news is that business leaders, policymakers and
government officials around the world are stepping up to
these challenges. Above all, they realize that we cannot let
this moment pass. The time to act is now, and the way to
act is together. The decade of smart is under way.
Let’s build a smarter planet. Join us and see what others
are doing at ibm.com/smarterplanet

IBM, the IBM logo, ibm.com, Smarter Planet and the planet icon are trademarks of International Business Machines Corp., registered in
many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM
trademarks is available on the Web at www.ibm.com/legal/copytrade.shtml. © International Business Machines Corporation 2010.
Tuesday, Sept 21, 09:00 (90 minutes)
Keynote Hall
1001 Opening Keynote with Jen-Hsun Huang, NVIDIA
Do not miss this opening keynote, featuring Jen-Hsun Huang,
CEO and Co-Founder of NVIDIA and special guests. Hear about
what’s next in computing and graphics, and preview disruptive
technologies and exciting demonstrations from across industries.
Jen-Hsun Huang co-founded NVIDIA in 1993 and has served since
its inception as president, chief executive officer and a member of
the board of directors.
Speaker(s): Jen-Hsun Huang (CEO & Co-Founder, NVIDIA)
Topic(s): General Interest

Tuesday, Sept 21, 11:00 (50 minutes)
Marriott San Jose Ballroom
2223 Academic Welcome Social and Poster Review
This session is open to academic attendees only. We invite you to join
your fellow academics to preview this year’s NVIDIA Research Summit
Posters and mingle with your colleagues. Included will be a special
presentation from our 2010-2011 Graduate Fellowship recipients
to showcase the research that earned them this prestigious award.
These students were selected from 268 applications in 28 countries.
Their research confronts a variety of challenges of immense technical
and strategic importance, including light-transport simulation,
computer vision, programmability and optimization for heterogeneous
systems, and much more. We believe that these minds lead the future
in our industry. Light snacks and drinks will be served.
Speaker(s): Bill Dally (NVIDIA), David Luebke (NVIDIA), NVIDIA
Graduate Fellows
Topic(s): General Interest

Tuesday, Sept 21, 11:00 (50 minutes)
Room K
2047 Bridging Ray and Raster Processing on GPUs
Explore new techniques in real-time rendering. We will discuss a
system for ray traced global illumination (GI) carefully integrated
with a traditional raster renderer using an incremental irradiance
cache. Covers novel GPU methods for spawning secondary GI
rays on only visible cells, smoothly sampling the visible 3D cache
into 2D, and incrementally ray traced spherical harmonics basis.
Details applying a range of optimizations to achieve real-time
frame rates with the OptiX ray tracing engine.
Speaker(s): Kenny Mitchell (Research Lead, Black Rock Studio)
Topic(s): Ray Tracing

Tuesday, Sept 21, 11:00 (50 minutes)
Room A2
2096 High-Speed CT Reconstruction in Medical Diagnosis & Industrial NDT Applications
We present the software platform CERA developed by Siemens,
which utilizes (multiple) graphics processing units (GPUs) in
order to deliver high-speed CT reconstructions, and describe
its implementation challenges using CUDA and OpenCL. We
further show how GPU acceleration enables the utilization
of reconstruction approaches which provide highly improved
reconstruction quality in NDT applications.
Speaker(s): Holger Scherl (Computer Scientist, Siemens AG)
Topic(s): Medical Imaging & Visualization, Imaging

Tuesday, Sept 21, 11:00 (50 minutes)
Room N
2112 The Heisenberg Spin Glass Model on GPU: Myth versus Fact
Dive into implementations of the 3D Heisenberg spin glass model
for GPUs. We will discuss results showing that fast shared
memory gives better performance with respect to slow global
memory only under certain conditions. Covers careful kernel
tuning to achieve significant speedup with respect to a state-of-the-art
high-end multicore processor.
Speaker(s): Massimo Bernaschi (Professor, Istituto Applicazioni
del Calcolo - C.N.R.)
Topic(s): Physics Simulation

Tuesday, Sept 21, 11:00 (50 minutes)
Room A3
2119 Supercomputing for the Masses: Killer-Apps, Parallel Mappings, Scalability and Application Lifespan
Hear the latest on how supercomputing for the masses is
changing the world. We will look at some of the one- to three-
orders of magnitude faster killer apps and see how they do it. We
will discuss specific mapping to GPGPU hardware and techniques
for high performance and near-linear scalability both within
and across multiple GPGPUs. We will also consider software
investment and the decades long longevity of some successful
massively parallel Investments in multithreaded software,
scalability, balance metrics, lack of consensus on programming
models, and lifecycle considerations.
Speaker(s): Robert Farber (Senior Scientist, PNNL)
Topic(s): High Performance Computing, Algorithms &
Numerical Techniques, Machine Learning & Artificial
Intelligence, Physics Simulation

Tuesday, Sept 21, 11:00 (50 minutes)
Room C
2079 A Fast, Scalable High-Order Unstructured
Compressible Flow Solver Tuesday, Sept 21, 11:00 (50 minutes)
We will describe a scalable and efficient high-order unstructured Room A7
compressible flow solver for GPUs. The solver allows the 2130 GPU Computing and a Revolution in Design
achievement of arbitrary order of accuracy for flows over Engineering
complex geometries. High-order solvers require more operations
Join design enginneering experts for a discussion of the
per degree of freedom, thus making them highly suitable for
technology needs limiting their use of GPUs and how vendors
massively parallel processors. Preliminary results indicate speed-
are addressing those needs. We will cover performance in
ups up to 70x with the Tesla C1060 compared to the Intel i7 CPU.
analysis, simulation, and rendering, as well as how to use GPUs to
Memory access was optimized using shared and texture memory.
accelerate and improve the engineering design process.
Speaker(s): David M. Williams (Ph.D. Candidate, Stanford
Speaker(s): Peter Varhol (HPC Editor, Desktop Engineering
University), Patrice Castonguay (Ph.D. Candidate,
Magazine)
Stanford University)
Topic(s): General Interest
Topic(s): Computational Fluid Dynamics, Algorithms &
Numerical Techniques, Physics Simulation

Tuesday, Sept 21, 11:00 (50 minutes)
Room A8
2132 Accelerating Biologically Inspired Computer Vision
Models
Join us for a discussion on applying commodity-server-based
clusters and GPU-based clusters to simulating computer vision
algorithms at a scale that approaches that of biological vision. We
consider the limitations of each technology, survey approaches
taken thus far, and suggest new hybrid models and programming
frameworks to overcome current limitations and substantially
improve performance.
Speaker(s): Tom Dean (Research Scientist, Google Inc.)
Topic(s): Computer Vision, Machine Learning &
Artificial Intelligence

Tuesday, Sept 21, 11:00 (50 minutes)
Room A1
2165 Rendering Revolution
Learn how GPU technologies are transforming the making of
pixels. This talk will cover GPU-centric rendering techniques
that leverage both the raw computational capabilities of NVIDIA’s
GPUs and advanced pixel-shading techniques for interactive
visualization and rendering.
Speaker(s): Ken Pimentel (Director, Media & Entertainment,
Autodesk)
Topic(s): Computer Graphics, Film

Tuesday, Sept 21, 11:00 (50 minutes)
Room D
2172 Unveiling Cellular & Molecular Events of Cardiac
Arrhythmias
George Mason University is using CUDA technology to get a
20x speed-up in simulations of intracellular calcium dynamics,
thought to play a major role in the generation of cardiac
arrhythmias. We will discuss the novel algorithms we have
developed for Markov Chain Monte Carlo Simulation and their
use in investigating elementary events of calcium release in the
cardiac myocyte. The resulting extremely fast simulation time
has generated new insights into how defects in the control of
intracellular calcium may lead to cardiac arrhythmia.
Speaker(s): Tuan Hoang-Trong (PhD student, George
Mason University)
Topic(s): Life Sciences, Algorithms & Numerical Techniques,
Physics Simulation

Tuesday, Sept 21, 11:00 (50 minutes)
Room L
2214 Faster Simulations of the National Airspace System
Learn about twenty-four hour, fast-time simulations of traffic
in the National Airspace System, which use GPU technology to
help perform key steps in the trajectory prediction of flights.
GPUs enabled us to improve the runtime by up to two orders of
magnitude versus the previously required tens of minutes per
execution. We will present a brief overview of the problem domain
and a description of how the GPU has opened doors to uncharted
research areas.
Speaker(s): Joseph Rios (Research Aerospace Engineer, NASA)
Topic(s): General Interest

Tuesday, Sept 21, 11:00 (50 minutes)
Room A5
2267 GPU Computing with MATLAB®
MATLAB is a widely used tool for scientific, engineering and
financial applications. As the popularity of GPUs has grown,
there is strong interest from engineers and scientists who solve
computationally intensive problems to be able to leverage GPUs
within MATLAB and other products from MathWorks. This talk will
discuss how MathWorks tools can help engineers and scientists
take advantage of GPU resources while continuing to work in
the familiar MATLAB environment. A range of capabilities will be
discussed and demonstrated.
Speaker(s): Loren Dean (Director of Engineering, MATLAB
Products, MathWorks)
Topic(s): Tools & Libraries

Tuesday, Sept 21, 11:30 (20 minutes)
Room B
2149 Overview of Parallel Nsight for Visual Studio
NVIDIA Parallel Nsight provides access to the power of the GPU
from within the familiar environment of Microsoft Visual Studio.
This session is an entry level overview of the GPU computing and
graphics development features of Parallel Nsight as well as a
glimpse into the future of this powerful tool.
Speaker(s): Kumar Iyer (Product Manager, NVIDIA)
Topic(s): Tools & Libraries

Tuesday, Sept 21, 12:00 (120 minutes)
Exhibit Hall
1004 Exhibits Open / Networking Lunch
Join your colleagues in the exhibit hall to preview emerging
technologies and see some of the most innovative solutions
available today. Lunch will be served.
Topic(s): General Interest

Tuesday, Sept 21, 14:00 (50 minutes)
Room A8
2013 iray - GPUs and the Photorealistic Rendering
Revolution
Hear about the ongoing revolution in the production of
photorealistic imagery being powered by GPUs. We will explore
the algorithms and concepts behind iray – a CUDA accelerated
software library from mental images/NVIDIA that provides
an interactive, push-button, fast synthetic digital camera in
software to a variety of OEM applications and platforms. We
will demonstrate iray embedded in commercial CAD and Digital
Content Creation applications as well as in 3D cloud computing
platforms.
Speaker(s): Michael Kaplan (Vice President of Strategic
Development, mental images/NVIDIA)
Topic(s): Digital Content Creation (DCC), Cloud Computing,
Ray Tracing

Tuesday, Sept 21, 14:00 (50 minutes)
Room C
2015 Efficient Tridiagonal Solvers for ADI methods and
Fluid Simulation
Learn about new techniques to efficiently implement the
Alternating Direction Implicit method on GPU for large 2D and 3D
domains with complex boundaries.
A novel tridiagonal solver for systems with variable sizes and a
new hybrid approach will be covered in detail. Comprehensive
performance analysis and key Fermi optimizations will be
explored.

Various applications of tridiagonal solvers such as 3D direct
numerical fluid simulation and a 2D depth-of-field effect for
games will be briefly discussed.
Speaker(s): Nikolai Sakharnykh (Developer Technology Engineer,
NVIDIA)
Topic(s): Algorithms & Numerical Techniques, Computational
Fluid Dynamics

Tuesday, Sept 21, 14:00 (50 minutes)
Room A5
2019 GPU-Accelerated Internet Technologies & Trends
Join us for a whirlwind demo-punctuated tour of up-and-coming
technologies that promise to bring GPU acceleration to
the World Wide Web. We’ll cover 2D graphics, 3D graphics and
video. In addition to summarizing the emerging standards and
technologies, performance test results showing how they scale
on various GPUs will be presented, along with recommendations
for how to design for best performance. Finally, adoption trends
and ecosystem dynamics will be summarized. Attendees should
leave with a richer understanding of the possibilities enabled by
the GPU-Accelerated Web, and new insights into when and how it
will matter.
Speaker(s): Chris Pedersen (Market Development Manager,
NVIDIA)
Topic(s): GPU Accelerated Internet, Stereoscopic 3D,
Video Processing

Tuesday, Sept 21, 14:00 (50 minutes)
Room K
2028 Mathematica for GPU Programming
Mathematica is widely used in scientific, engineering,
mathematical fields and education. In this session, new tools for
general GPU programming in the next release of Mathematica are
presented. These tools build on top of Mathematica’s technology
which provides a simple, yet powerful, interface to the large base
of compiling tools. Applications of CUDA and OpenCL from within
Mathematica will be presented. These examples will provide
a general overview of the powerful development environment
for GPU programming that Mathematica can offer not just for
researchers but for anybody with basic knowledge of Mathematica
and GPU programming.
Speaker(s): Ulises Cervantes-Pimentel (Senior Kernel Developer,
Wolfram Research)
Topic(s): Programming Languages & Techniques, Algorithms
& Numerical Techniques, Imaging, Tools & Libraries

Tuesday, Sept 21, 14:00 (50 minutes)
Room A3
2057 CUDA-Accelerated LINPACK on Clusters
This talk will illustrate the use of GPUs to accelerate the LINPACK
benchmark on clusters with GPUs, where both the CPUs and
the GPUs are used in synergy. The acceleration is obtained by
executing DGEMM (matrix multiply) and DTRSM (for the solution
of triangular systems) calls simultaneously on both GPU and CPU
cores. Details of the implementation will be presented together
with results that show how effective the solution is, both for
performance and power efficiency.
Speaker(s): Everett Phillips (Applied Engineer - GPU Computing,
NVIDIA), Massimiliano Fatica (Manager, NVIDIA)
Topic(s): High Performance Computing, Algorithms &
Numerical Techniques

Tuesday, Sept 21, 14:00 (50 minutes)
Room A2
2094 Nearly Instantaneous Reconstruction for MRIs
GE’s Autocalibrating Reconstruction for Cartesian Imaging (ARC)
is a computationally intensive, widely used algorithm in MRI
Reconstruction using Parallel Imaging. We demonstrate that an
optimized CUDA implementation of ARC on a GPU can enable
nearly instantaneous reconstruction and speedups of up to 10x
over an optimized dual socket QuadCore CPU implementation. We
will discuss challenges both with computational intensity and data
read/write efficiency. We will also compare the Fermi C2050 with
the C1060.
Speaker(s): Babu Narayanan (Lab Manager, GE Global Research)
Topic(s): Medical Imaging & Visualization, High Performance
Computing

Tuesday, Sept 21, 14:00 (50 minutes)
Room B
2150 Parallel Nsight: Debugging Massively Parallel
Applications [Advanced]
Data parallel algorithms that provide real-time financial options
pricing or identification of hidden oil reserves are utilizing the
massively parallel nature of the GPU for industry changing
performance gains. Developers require industry standard
development tools to create the software that accomplishes these
parallel tasks.

NVIDIA Parallel Nsight delivers the power of the GPU within the
familiar environment of Microsoft Visual Studio. In this session,
you will learn advanced techniques for debugging CUDA C/C++
and DirectCompute code using Parallel Nsight, including
conditional and data breakpoints as well as out of bound GPU
memory access detection.
Speaker(s): Sebastien Domine (Sr. Dir. Developer Tools, NVIDIA)
Topic(s): Tools & Libraries, Programming Languages &
Techniques

Tuesday, Sept 21, 14:00 (50 minutes)
Room A1
2152 Using Virtual Texturing to Handle Massive Texture
Data
A virtual texture implementation allows applications the ability
to manage gigantic amounts of texture data for rendering
complex data sets. However, practical utilization involves feeding
it adequate data. The GPU offers a powerful engine capable of
accelerating the transcoding of efficient storage formats into
formats useful for rendering. This session will demonstrate a
virtual texturing implementation and the steps needed to GPU
accelerate the non-rendering portions of managing and loading
the virtual texture data.
Speaker(s): Evan Hart (Software Engineer, NVIDIA)
Topic(s): Computer Graphics

Tuesday, Sept 21, 14:00 (50 minutes)
Room A7
2222 Working Man’s Guide to 3D Video Editing
Video editing is currently at two simultaneous inflection points:
use of GPUs for video processing and the beginning of widespread
adoption of 3D. At this time, however, identifying and navigating
through the necessary tools and equipment to create compelling
3D video content is challenging.

This session is intended to provide a pragmatic guide to creating
prosumer 3D video content and how the GPU greatly
assists and speeds up this process.

The intended audience is anyone interested in how to create
compelling 3D movies at a prosumer level.
Speaker(s): Ian Williams (Director PSG Applied Engineering,
NVIDIA), Kevan O’Brien (NVIDIA)
Topic(s): Digital Content Creation (DCC), Stereoscopic 3D

Tuesday, Sept 21, 14:00 (50 minutes)
Room M
2233 Solving Your GPU Computing Needs (Sponsored by HP)
In this session we will go into detail and you will learn about HP’s
GPU enabled systems, from Workstations to our GPU enabled
servers and clusters. You will get the latest information on
configurations, options, GPU management and use cases.
Speaker(s): Dave Korf (Marketing, HP), Will Wade (Business
Alliance Manager, HP)
Topic(s): High Performance Computing

Tuesday, Sept 21, 14:00 (50 minutes)
Marriott San Jose Ballroom
2262 CUDA Centers of Excellence Super-Session I
Come hear about the groundbreaking research taking place at
the CUDA Centers of Excellence, an elite group of world-renowned
research universities that are pushing the frontier of massively
parallel computing using CUDA. Researchers from these top
institutions will survey cutting-edge research that is advancing
the state of the art in GPU computing and dozens of application
fields across science and engineering.

In this session we will hear from Professor Hanspeter Pfister of
Harvard University and Professor Jeff Vetter of Georgia Tech and
Oak Ridge National Laboratory.
Speaker(s): Hanspeter Pfister (Professor, Harvard University),
Jeffrey Vetter (Professor, Georgia Tech / Oak Ridge
National Laboratory)
Topic(s): General Interest

Tuesday, Sept 21, 14:00 (50 minutes)
Room L
2276 Using GPUs to Run Next-Generation Weather Models
We are using GPUs to run a new weather model being developed
at NOAA’s Earth System Research Laboratory (ESRL) called
the Non-hydrostatic Icosahedral Model (NIM). NIM is slated to
run at high resolution (4km global scale) within two years. This
presentation will highlight work required to parallelize and run
the NIM. We will describe progress running on multiple GPUs,
report on our evaluation of two FORTRAN GPU compilers, and
give performance updates of NIM using Fermi. We will also
discuss special challenges developing and running operational
weather models on GPUs.
Speaker(s): Mark Govett (Chief, Advanced Computing Section,
NOAA Earth System Research Laboratory)
Topic(s): General Interest

Tuesday, Sept 21, 14:00 (20 minutes)
Room N
2299 Integrating CUDA BLAS with IMSL Fortran
As GPU hardware becomes more prevalent in both research and
commercial institutions, software that takes advantage of this
specialized hardware is growing in demand. In many cases, it
is infeasible or impossible to rewrite an existing program to run
entirely on the GPU, so the goal is often to offload as much work
as possible. As the IMSL Library team at Rogue Wave Software
considers how best to tackle the GPU realm with a general
mathematical library, the IMSL Fortran Library takes an initial
step where the CUDA BLAS library is utilized to offload CPU
work to GPU hardware. This presentation will discuss the
approach and architecture of the solution. Benchmark results
will show where success has been found. Plans for future products
will also be covered.
Speaker(s): Chris Gottbrath (Principal Product Manager,
TotalView Technologies, Inc., a Rogue Wave
Software company)
Topic(s): Tools & Libraries

Tuesday, Sept 21, 14:00 (50 minutes)
Room D
2303 Using Tegra to Solve The Electric Car Power Dilemma
Explore how advanced SoC technologies are transforming the
automotive industry. Learn how using NVIDIA Tegra
increased the available range while pushing the envelope on the
next-gen driver experience. We will share the lessons learned in the world
of electric cars and the challenges in constructing a mass production
electric vehicle.
Speaker(s): Theo Valich (President, Bright Side Network Inc.)
Topic(s): Embedded & Automotive, Computer Vision, Video
Processing, Computer Graphics

Tuesday, Sept 21, 15:00 (50 minutes)
Room A3
2017 Lessons Learned Deploying the World’s First GPU-
Based Petaflop System
Learn what to expect when deploying PetaFLOP or larger systems.
The June 2010 list of the Top 500 computer systems featured the
first GPU based cluster to exceed 1 PetaFLOP of floating point
power -- a system that was built in a fraction of the time and the
cost a CPU-only system of that performance would have required.
An overview of how system builders and administrators should
prepare for large-scale HPC deployments.
Speaker(s): Dale Southard (Senior Solution Architect, NVIDIA)
Topic(s): High Performance Computing

Tuesday, Sept 21, 15:00 (50 minutes)
Room A2
2074 Driving a Product from Rasterization to Ray Tracing:
The Developer Experience
Learn from the challenges encountered while using DirectX to
update the Bunkspeed Move rasterization engine to work with
Mental Images’ iRay. This work was part of the creation of
Bunkspeed Shot, which allows the user to leverage both the high
quality image generation of iRay and a highly interactive, good
quality rasterization engine (used for quick setup of a scene).
Covers major differences between a ray tracing based interactive
system, including GPU based ray tracing, and a traditional GPU
rasterization engine.
Speaker(s): Nick Gebbie (Senior Graphics Programmer,
Bunkspeed)
Topic(s): Ray Tracing

Tuesday, Sept 21, 15:00 (50 minutes)
Room C
2085 Tridiagonal Solvers: Auto-Tuning and Optimizations
In this presentation, we will discuss and analyze the performance
of three optimization techniques for tridiagonal solvers. We
first present a hybrid Parallel Cyclic Reduction (PCR)-Gaussian
Elimination (GE) tridiagonal solver, which combines work-efficient
and step-efficient algorithms for high performance. We further
discuss an auto-tuned variant of this technique which selects the
optimal switching point between algorithms on a per-machine
basis. Next, we present a technique to handle large systems,
where shared memory constraints prevent previous approaches from
solving these systems directly. Finally, we will discuss optimizations on
a cyclic reduction technique that avoid bank conflicts on current
hardware.
Speaker(s): Andrew Davidson (Graduate Student, University
of California, Davis), Yao Zhang (Graduate Student,
University of California, Davis)
Topic(s): Algorithms & Numerical Techniques, Computational
Fluid Dynamics

Tuesday, Sept 21, 15:00 (50 minutes)
Room N
2090 Developing Highly Scalable Particle-Mesh Codes for
GPUs: A Generic Approach
Dive deep into a multi-parallel Particle in Cell code that utilizes
MPI, pthreads, and CUDA. Around this specific application a
general C++ framework for transparent data transfers between
GPUs has been developed and will be presented. Further
techniques employed include interleaving of communication
and computation, particle tiling and a study of how well CUDA
performance can be transferred to OpenCL.
Speaker(s): Guido Juckeland (Senior System Engineer (HPC),
Leader Hardware Accelerator Group, TU Dresden
- ZIH), Michael Bussmann (Junior Group Leader
Computational Radiation Physics,
Forschungszentrum Dresden-Rossendorf)
Topic(s): Physics Simulation, Astronomy & Astrophysics, High
Performance Computing

Tuesday, Sept 21, 15:00 (50 minutes)
Room L
2103 Development of an Efficient GPU-Accelerated Model
for Fully Nonlinear Water Waves
This work is concerned with the development of an efficient
high-throughput scalable model for simulation of fully nonlinear water
waves (OceanWave3D) applicable to solve and analyze large-scale
problems in coastal engineering. The goal can be achieved through
algorithm redesign and parallelization of an optimized sequential
single-CPU algorithm based on a flexible-order Finite Difference
Method. High performance is pursued by utilizing many-core
processing in the model focusing on GPUs for acceleration of code
execution. This involves combining analytical methods with an
algorithm redesign of the current numerical model.
Speaker(s): Allan Peter Engsig-Karup (Assistant Professor,
Scientific Computing, Technical University
of Denmark)
Topic(s): Computational Fluid Dynamics, Algorithms &
Numerical Techniques, Physics Simulation

Tuesday, Sept 21, 15:00 (50 minutes)
Room A8
2113 WebGL: Bringing 3D to the Web
WebGL is a newly-emerging standard for 3D graphics and visual
computing on the web. Supported and developed by major web
browser vendors, WebGL enables rich interactive 3D graphics
delivered through a web browser, on both desktop and mobile
platforms. This session will contain an introduction to WebGL,
and will focus on application development issues unique to the web
platform, optimization concerns, and how web technologies
such as offline app support, HTML5 video and audio, File and
WebSockets integrate with WebGL. Experienced OpenGL
developers will learn how to transition their knowledge to WebGL
development.
Speaker(s): Vladimir Vukicevic (Principal Engineer, Mozilla
Corporation)
Topic(s): GPU Accelerated Internet, Tools & Libraries,
Computer Graphics

Tuesday, Sept 21, 15:00 (50 minutes)
Room B
2147 GPGPU Development for Windows HPC Server
Attend this demo-driven session to see how to schedule jobs to
a Windows compute cluster that includes GPUs. We will also
demonstrate GPU-enhanced versions of some commonly used
HPC open-source codes, and show how NVIDIA Parallel Nsight™
can be used to debug GPU applications on a cluster. Provides
a brief introduction to performance profiling tools that allow
developers to analyze system, CPU and GPU events.
Speaker(s): Calvin Clark (Senior Consultant, Microsoft)
Topic(s): High Performance Computing, Tools & Libraries

Tuesday, Sept 21, 15:00 (20 minutes)
Room K
2148 Rapid Prototyping and Visualization with OpenCL
Studio
Learn about OpenCL Studio, an integrated OpenCL and OpenGL
development environment for parallel programming and
visualization. We will discuss building end user applications and
using its integrated visualization capabilities to better understand
the output and internal structure of parallel algorithms. We
will also demonstrate its capabilities using several sample
applications including particle systems, volumetric rendering,
and image processing.
Speaker(s): Jochen Stier (Founder, Geist Software Labs)
Topic(s): Tools & Libraries

Tuesday, Sept 21, 15:00 (50 minutes)
Room A7
2224 GPU Acceleration in Adobe Creative Tools
Hear experts explain how Adobe Creative Suite 5 harnesses
the power of CUDA technology in several of its core software
applications. We will focus on the complete redesign of the
core video playback and rendering engine in Adobe Premiere
Pro CS5 and how it uses the power of GPUs to deliver superior
performance and change the game for Adobe in professional
video production.
Speaker(s): Paul Young (Adobe), Steve Hoeg (Adobe),
Al Mooney (Adobe)
Topic(s): Video Processing, Imaging

Tuesday, Sept 21, 15:00 (50 minutes)
Room A1
2227 OpenGL 4.0 Tessellation for Professional Applications
The new generation of accelerated graphics is elevating
visual computing to new heights. Tessellation, one of its most
anticipated features, is already used in many scenarios to bring
3D graphics to an unprecedented level of realism.

This talk will introduce tessellation using OpenGL 4.0. We will also
describe how an existing application can be adapted to efficiently
take advantage of this new feature and also how to overcome
some of the challenges.
Speaker(s): Philippe Rollin (Applied Engineer, NVIDIA)
Topic(s): Computer Graphics, Tools & Libraries

Tuesday, Sept 21, 15:00 (50 minutes)
Room A5
2235 Advanced Medical Volume Rendering and
Segmentation on the GPU
Learn how to speed up your interactive medical visualization
pipeline by an order of magnitude and dramatically improve
rendering quality at the same time. Leading researchers in
medical imaging informatics describe recent advances in volume
visualization and interactive segmentation.
Emphasis is on the underlying parallel GPU algorithms and
acceleration data structures.
Speaker(s): Mike Roberts (Research Assistant, Hotchkiss
Brain Institute, University of Calgary, Canada), Eric
Penner (Research Associate, Hotchkiss Brain
Institute, University of Calgary, Canada)
Topic(s): Medical Imaging & Visualization, Algorithms &
Numerical Techniques, Computer Vision,
Computer Graphics

Tuesday, Sept 21, 15:00 (50 minutes)
Marriott San Jose Ballroom
2263 CUDA Centers of Excellence Super-Session II
Come hear about the groundbreaking research taking place at
the CUDA Centers of Excellence, an elite group of world-renowned
research universities that are pushing the frontier of massively
parallel computing using CUDA. Researchers from these top
institutions will survey cutting-edge research that is advancing
the state of the art in GPU computing and dozens of application
fields across science and engineering.

In this session we will hear from Dr. Wei Ge at the Chinese
Academy of Sciences, Professor Amitabh Varshney at the University
of Maryland, and Adjunct Assistant Professor Stan Tomov at the
University of Tennessee – Knoxville.
Speaker(s): Stan Tomov (University of Tennessee), Amitabh
Varshney (Professor, University of Maryland), Wei
Ge (Professor, Institute of Process Engineering,
Chinese Academy of Sciences)
Topic(s): General Interest

Tuesday, Sept 21, 15:00 (50 minutes)
Room M
2270 Appro’s GPU Computing Solutions
Learn how GPUs are changing the High Performance Computing
landscape to deliver price/performance levels that were previously
considered unachievable. Join Appro (http://www.appro.com),
a leading provider of supercomputing solutions, to discuss the
introduction of the Appro Tetra server, the most powerful GPU
server available today in a 1U form factor, and the availability
of a new modular GPU expansion blade, both based on NVIDIA
Tesla 20-series GPUs. The availability of these two products is
a confirmation of Appro’s commitment to providing the most
innovative and powerful computing platforms at very attractive
prices to the High Performance Computing markets.
Speaker(s): John Lee (Vice President, Appro International, Inc.)
Topic(s): High Performance Computing

Tuesday, Sept 21, 15:30 (20 minutes)
Room K
2268 Think Data-Parallel! Building Data-Parallel Code
with M
Discover and leverage parallelism inherent in pre-existing
codes. Oftentimes, parallelism is hidden in seemingly serial
programs. This is due to obfuscation via indexing or looping wherein
the parallelism is seemingly non-existent. Several real-world
examples of seemingly serial code demonstrate simple, yet
surprisingly effective rules for detecting potential parallelism.

For each example, learn how to express the code at a higher,
more concise level in M by vectorizing computations. We give
several canned techniques of vectorization for many common, and
sometimes very difficult, use cases.

Learn how such vectorization concisely brings the parallelism of
code to the forefront and transforms programs that might have
been originally difficult to run on a SIMT device into programs very
suitable for execution on the GPU. GPU speedups will be shown utilizing
Jacket.
Speaker(s): Gallagher Pryor (VP of Research, AccelerEyes)
Topic(s): General Interest

Tuesday, Sept 21, 16:00 (50 minutes)
Room A1
2022 Solving PDEs on Regular Grids with OpenCurrent
OpenCurrent is an open source library with support for structured
3D grids and various PDE solvers that operate on them, including
a multigrid Poisson solver and an incompressible Navier-Stokes
solver. It also includes extensions for splitting grids across
multiple GPUs. This talk will provide a basic introduction to the
code base and its design principles.
Speaker(s): Jonathan Cohen (Senior Research Scientist,
NVIDIA Research)
Topic(s): Computational Fluid Dynamics

Tuesday, Sept 21, 16:00 (50 minutes)
Room A2
2036 Algorithms for Automated Segmentation of Medical
Imaging Studies Utilizing CUDA
Discover how GPU computing can help doctors make sense of
modern imaging studies. This session is intended for a general
audience as well as medical informatics specialists. The focus
will be on algorithmic approaches to segmentation as it pertains
to CTA (computed tomography angiography) studies. Topics Speaker(s): Charles Loop (Senior Researcher,
covered will include specialized optimization algorithms and novel Microsoft Research)
lumen tracking methodologies. Topic(s): Computer Graphics
Speaker(s): Supratik Moulik (University of Pennsylvania)
Topic(s): Medical Imaging & Visualization, Computer Vision
Tuesday, Sept 21, 16:00 (50 minutes)
Room B
Tuesday, Sept 21, 16:00 (50 minutes) 2151 Parallel Nsight: Analyzing and Optimizing Massively
Room A3 Parallel Applications [Advanced]
2052 Power Management Techniques for Heterogeneous Life altering products that provide early detection of breast cancer
Exascale Computing or simulate molecular behavior, accelerating drug discovery,
are becoming reality thanks to the power of the GPU. As these
Power consumption has become the leading design constraint
technologies become mainstream, mainstream tools are required
for large scale computing systems. In order to achieve exascale
to support these development efforts.
computing, system energy efficiency must be improved
significantly. Our approach will focus on investigating software NVIDIA Parallel Nsight delivers the power of the GPU within
methodologies to achieve energy efficient computing on the familiar environment of Microsoft Visual Studio. In this
heterogeneous systems accelerated with GPUs. session, you will learn advanced techniques for visualizing your
application’s workloads and performance characteristics across

TUESDAY
Speaker(s): Xiaohui Cui (Research Scientist, Oak Ridge
the CPU, GPU, and operating system, and explore the depths of
National Laboratory)
Parallel Nsight profilers, including GPU performance counters
Topic(s): High Performance Computing
and how to use them.

Tuesday, Sept 21, 16:00 (50 minutes) Speaker(s): Sebastien Domine (Sr. Dir. Developer Tools, NVIDIA)
Room C Topic(s): Tools & Libraries, Programming Languages &
Techniques
2056 Next-Generation Rendering with CgFX
Dive into the details of using CgFX – Cg’s effect framework – to
Tuesday, Sept 21, 16:00 (50 minutes)
combine ray-tracing with real-time rendering and enable the next
Room A7
generation of complex high-quality rendering. You will learn how
to use CgFX to create complex rendering effects in a concise and 2161 NVIDIA Quadro Digital Video Pipeline Overview
elegant fashion by: This session will provide an overview of the Quadro Digital Video
Pipeline. It will cover a description of the DVP components,
• Blending material-level and scene-level effects in a consistent way,
application architectures, software architectures, and
• Seamlessly integrating CUDA-based data processing within the programming resources available.
CgFX rendering pipeline,
Speaker(s): Thomas True (Applied Engineer, NVIDIA)
• Mixing OptiX-based rendering with CgFX and OpenGL. Topic(s): Computer Graphics, Video Processing, Programming
Languages & Techniques
Speaker(s): Tristan Lorach (Computer Graphics Engineer,
NVIDIA)
Tuesday, Sept 21, 16:00 (20 minutes)
Topic(s): Computer Graphics
Room K
Tuesday, Sept 21, 16:00 (50 minutes) 2179 GPU - An R Library for Native GPU Objects
Room A5 Come learn about the GPU R package. R is the widely popular
open source statistical programming language. The GPU package
2067 Experiences with Code Optimizations for High
extends R by providing GPU-based types, classes and methods
Performance GPGPU Programs
implementing GPU versions of R vectors, matrices, lists and
Attend this session to learn and share code optimizations to
data frames. Subsequent operations with these are executed
achieve high performance GPU computing. We will cover code
on the GPU. Users are not required to create special bindings or
transformations for memory coalescing, workload management at
implement special syntax, nor do they need to copy objects between
both thread and thread-block levels, and different ways to handle
CPU and GPU. The GPU package allows programmers access
memory partition conflicts. We will also discuss integration of
to the computational power of GPUs with little modification to
code optimizations into a compiler.
existing code.
Speaker(s): Huiyang Zhou (Associate Professor, North Carolina
Speaker(s): Christopher Brown (Partner, Open Data)
State University), Yi Yang (Ph.D. student, North
Topic(s): Tools & Libraries, Algorithms & Numerical
Carolina State University)
Techniques, High Performance Computing
Topic(s): Programming Languages & Techniques

Tuesday, Sept 21, 16:00 (50 minutes)

FULL CONFERENCE GUIDE 2010

Tuesday, Sept 21, 16:00 (50 minutes) Room M

Room N
2247 Reconfiguring a Pool of GPUs on The Fly (Sponsored
2129 Hardware Subdivision and Tessellation of Catmull-
by NextIO)
Clark Surfaces
Today’s HPC applications break down large data set problems
See how the new DirectX 11 Hardware Tessellation and Compute
into smaller, independent elements solved by massively parallel
Shader can be used to implement an adaptive Catmull-Clark
processor systems. GPUs as co-processing devices are optimized
subdivision surface renderer. We use a table driven approach to
for this task and their popularity in technical computing is rapidly
performing Catmull-Clark subdivision in parallel utilizing one
thread per output mesh vertex.
advancing. Like many rapidly advancing technologies, they Tuesday, Sept 21, 16:00 (50 minutes)
leave in their wake new and challenging problems. In the effort Room L
to cut costs while increasing performance, damaging ripple-
2295 Large-Scale CFD Applications and a Full GPU
effects can occur; resources can be over or under provisioned,
Implementation of a Weather Prediction Code on the
inventory difficult to manage, lots of single points of failure mean
TSUBAME Supercomputer
constant job interruptions, manual reconfiguration of resources
is required for each job, servicing and lifecycle management Many CFD applications have been successfully accelerated on
require outages. Most of these problems can be addressed and GPUs, but for large-scale simulations that require memory
overcome by combining GPU resources into managed, structured beyond a single GPU, communication is required between GPUs
pools. NextIO will present and demonstrate a new and innovative over cluster nodes through PCI-Express and interconnects.
approach to consolidating and managing pools of NVIDIA GPU To overcome performance bottlenecks and preserve parallel
resources along with the cost and operational savings benefits scalability, an overlapping technique between computation and
associated with top of rack GPU consolidation appliances. communication is essential. This work presents results of an
LBM for incompressible flow, and a Tsunami simulation solving
Companies that consolidate GPU resources in a top of rack the shallow water equation for simulations on the NVIDIA
appliance can reduce GPU operational costs by $1000 per GPU Tesla-based TSUBAME supercomputer of Tokyo Tech. In addition,
per year, which in most cases pays for the GPU itself. NextIO results will be presented on a complete GPU implementation of
will present the TCO models for a 500 GPU installation using a production-level weather prediction code developed by the JMA
traditional upgrade methodology vs. a top of rack appliance

that achieves 15 TFLOPS for an 80-fold speedup.

solution which reduces installation time to 5 minutes per GPU
Speaker(s): Takayuki Aoki (Professor, Tokyo Institute
without incurring application downtime on server nodes.
of Technology)
Speaker(s): K.C. Murphy (Vice President of Marketing, NextIO) Topic(s): Computational Fluid Dynamics
Topic(s): High Performance Computing, Cloud Computing
Tuesday, Sept 21, 16:30 (20 minutes)
Tuesday, Sept 21, 16:00 (50 minutes) Room K
Marriott San Jose Ballroom
2111 Using R for High-Performance Data Analysis
2264 CUDA Centers of Excellence Super-Session III Data analysis is the art and the science of getting the correct
Come hear about the groundbreaking research taking place at quantitative models and their numerical parameters from the
the CUDA Centers of Excellence, an elite group of world-renowned observed data. In this talk, we report on a project to integrate
research universities that are pushing the frontier of massively CUDA into the open source data analysis environment R. The
parallel computing using CUDA. Researchers from these top combined use of the CPU and GPU resources can efficiently
institutions will survey cutting-edge research that is advancing exploit the significant amount of data parallelism inherent in most
the state of the art in GPU computing and dozens of application data analysis problems and methods. This makes interactive
fields across science and engineering. analysis possible even for large, compute-intensive problems.
In this session we will hear from Dr. Wen-mei Hwu at the The implementation and the achievable performance gains will be
University of Illinois at Urbana – Champaign, Professor Yangdong demonstrated on a concrete example from quantitative finance.
Deng at Tsinghua University and Dr. Charles D. Hansen at the Speaker(s): Domokos Vermes (Associate Professor, Worcester
University of Utah. Polytechnic Institute)
Speaker(s): Wen-mei Hwu (Professor, University of Illinois, Topic(s): Tools & Libraries, Databases & Data Mining,
Urbana-Champaign), Yangdong Deng (Associate Finance, Life Sciences
Professor, Tsinghua University), Charles Hansen
(Professor, University of Utah) Tuesday, Sept 21, 17:00 (50 minutes)
Topic(s): General Interest Room A2
2009 4D Visualization and Analysis of Flow
Tuesday, Sept 21, 16:00 (50 minutes) 4D flow or vector data is now common in CFD simulations as well
Room A8 as acquisition techniques like 4D flow MRI to study abnormal
2274 Harnessing the Power of the GPU in Internet Explorer 9 blood flow patterns. We show how by mixing compute and
Internet Explorer 9 is bringing the power of modern GPUs to graphics combined with stereo we are now able to interactively
the Web. Thanks to hardware accelerated graphics, the websites analyze and visualize the resulting data to understand abnormal
that you use every day become faster and developers can create flow patterns. Topics include flow field rendering, computing
new classes of web applications which were previously not derived quantities, merging volumetric rendering with computed
possible. This session will provide an inside look into how Internet geometry such as particles and surfaces, and integration of 3D
Explorer was redesigned to leverage the GPU. We’ll show detailed Vision stereo.
performance results, discuss our architectural approach, and Speaker(s): Shalini Venkataraman (Applied Engineer, NVIDIA)
look at the impact of the GPU on HTML5. A session by engineers Topic(s): Medical Imaging & Visualization, Computational
for engineers with lots of fun demos. Fluid Dynamics, Stereoscopic 3D
Speaker(s): Jason Weber (Internet Explorer Performance
Lead, Microsoft)
Topic(s): GPU Accelerated Internet
Tuesday, Sept 21, 17:00 (20 minutes) experimental data, supports multi-GPU simulation, and can run
Room M up to 6300x6300 domain sizes at 320 million cells per second on
the GTX 480.
2026 MatCloud: Accelerating Matrix Math GPU Operations
with SaaS Speaker(s): André Rigland Brodtkorb (Research Scientist,
We present MatCloud (www.mat-cloud.com), a cloud SINTEF ICT)
infrastructure and service for scientific computing using state- Topic(s): Physics Simulation, Computational Fluid Dynamics
of-the-art GPU clusters. MatCloud is a service infrastructure
exposed by a simple web terminal interface to run Matlab-like Tuesday, Sept 21, 17:00 (50 minutes)
commands/scripts. Join us to see how GPU technology can not Room A7
only be applied to the cloud computing community, but also boost the
2205 A Highly Reliable RAID System Based on GPUs
adoption of cloud computing for its dramatic performance gains
over traditional cloud infrastructures. MatCloud is an in-progress While RAID is the prevailing method of creating reliable secondary
academic project and is under active development. storage infrastructure, many users desire more flexibility
than offered by current implementations. To attain needed
Speaker(s): Xing Wu (Research Assistant, North Carolina State performance, customers have often sought after hardware-
University), Frank Mueller (Associate Professor, based RAID solutions. This talk describes a RAID system that
North Carolina State University) offloads erasure correction coding calculations to GPUs, allowing
Topic(s): Cloud Computing, Tools & Libraries increased reliability by supporting new RAID levels while

maintaining high performance.
Tuesday, Sept 21, 17:00 (50 minutes)
Speaker(s): Matthew Curry, Sandia National Laboratories and
Room A8
the University of Alabama at Birmingham
2060 GPUs in a Flash: Mapping the Flash Animated Topic(s): High Performance Computing
Software Vector Rendering Model to the GPU
Explore the Flash rendering architecture including the challenges Tuesday, Sept 21, 17:00 (50 minutes)
of mapping from an animated software vector rendering model Room B
to a GPU. We will also discuss how the landscape of mobile,
2212 Parallel Nsight for Accelerated DirectX 11
desktop, devices, drivers, and APIs impacts the design and
Development [Advanced]
deployment of a GPU based Flash Player.
Parallel Nsight is NVIDIA’s new development environment for
Speaker(s): Lee Thomason (Principal Scientist, Adobe Systems) graphics and GPU computing. In this advanced session, you
Topic(s): GPU Accelerated Internet will learn how Parallel Nsight can accelerate debugging and
profiling of Direct3D 11 applications. Attendees will learn how
Tuesday, Sept 21, 17:00 (50 minutes) to debug Direct3D frames and HLSL shaders using Parallel
Room A5 Nsight’s powerful Graphics Inspector and Debugger which
allows developers to inspect Direct3D resources and state, set
2084 State of the Art in GPU Data-Parallel Algorithm
breakpoints in HLSL shaders, examine shader variables, and
Primitives
see which graphics primitives are live on the GPU. Attendees
Learn about the importance of optimized data-parallel algorithm
will also learn how to use the Frame Profiler to capture and
primitives as building blocks for efficient real-world applications.
mine performance information, and easily pinpoint bottlenecked
Fundamental parallel algorithms like sorting, parallel reduction,
GPU units.
and parallel scan are key components in a wide range of
applications from video games to serious science. This session Speaker(s): Simon Barrett (Senior Software Engineer, NVIDIA)
will cover the state of the art in data-parallel primitive algorithms Topic(s): Programming Languages & Techniques,
for GPUs. Starting with an explanation of the purpose and Computer Graphics
applications of the algorithms, we will discuss key algorithm
design principles, demonstrate current open source algorithm Tuesday, Sept 21, 17:00 (50 minutes)
libraries for GPUs (CUDPP and Thrust), describe optimizations Room A3
using new features in the Fermi architecture, and explore future
2225 Tools for Managing Clusters of NVIDIA GPUs
directions.
Learn about the suite of tools NVIDIA provides to manage
Speaker(s): Mark Harris (Senior Developer Technology large installations of GPUs from the NVIDIA Tesla Series. The
Engineer, NVIDIA) presentation will cover cluster management tools and libraries,
Topic(s): Algorithms & Numerical Techniques, High as well as the GPUDirect technology that enables GPUs to
Performance Computing, Tools & Libraries communicate faster across the network.
Speaker(s): Peter Buckingham (Tesla Software Manager,
Tuesday, Sept 21, 17:00 (50 minutes)
NVIDIA), Andrew Iles (Software Engineer, NVIDIA)
Room N
Topic(s): Tools & Libraries

2102 Evacuate Now? Faster-than-real-time Shallow Water

Simulation on GPUs
Learn how to simulate a half-hour dam break in 27 seconds!
We present how shallow water simulation with interactive
visualization is successfully mapped to modern graphics
hardware. Featuring a live demo, we will present interactive
shallow water simulations running on a standard laptop.
The implementation has been verified against analytical and
Tuesday, Sept 21, 17:00 (50 minutes) Performance information can also be displayed in the object
Room L cache. The environment lets the developer
focus on developing high performance functionality,
2239 Fast GPU Preconditioning for Fluid Simulations in
and all intermediate layers of interface are taken care of by the
Film Production
environment.
Explore how a less efficient, but highly parallel algorithm can still
be a superior alternative to a sequential CPU method. This talk Speaker(s): Peter Decrem (Director, Rates Products, Quantifi)
will present a simple CUDA-based Poisson solver to the conjugate Topic(s): Tools & Libraries, Finance
gradient method designed for solving well-conditioned matrices
such as those that arise from the pressure projection stage of Tuesday, Sept 21, 17:00 (50 minutes)
a Navier-Stokes fluid solver. In contrast to other active areas of Room D
research in this field, we show that a more brute force approach 2304 Harnessing the GPU to Accelerate Automotive
can still significantly out-perform the best CPU alternatives by Development
sacrificing a high convergence rate in favor of achieving much
Learn how GPU technologies broke speed limits in automotive
faster iterations.
development. By using GPU-accelerated tools, a small team of
Speaker(s): Dan Bailey (R&D, Double Negative) engineers created a complete certifiable vehicle in only two years,
Topic(s): Computational Fluid Dynamics, Algorithms & using a fraction of the budget used in conventional industry. The talk
Numerical Techniques, Film will cover tools and techniques used in the creation of the XD concept,

as well as how to overcome challenges moving a product from

Tuesday, Sept 21, 17:00 (50 minutes) concept to mass production stage.
Room C Speaker(s): Theo Valich (President, Bright Side Network Inc.)
2250 GPU Ray Tracing Exposed: Under the Hood of the Topic(s): Embedded & Automotive, Computer Vision,
NVIDIA OptiX Ray Tracing Engine Computer Graphics
Take a deep dive into many of the design choices and
implementation details of the NVIDIA OptiX ray tracing engine. Tuesday, Sept 21, 18:00 (120 minutes)
Learn how domain specific compilation, a unique execution model Keynote Hall
and a general object model are combined into a flexible and 1005 Research Posters Showcase / Exhibits Open /
powerful API. Networking Reception
Speaker(s): Austin Robison (Lead Developer, OptiX Integration, Join your colleagues in the exhibit hall to preview emerging
NVIDIA) technologies and see some of the most innovative solutions
Topic(s): Ray Tracing available today. We also encourage you to take the opportunity to
browse the research posters to see “what’s next” out of the world
Tuesday, Sept 21, 17:00 (50 minutes) of research and academia, and meet the poster presenters, who
Marriott San Jose Ballroom will be stationed near their posters during this time. Appetizers
and drinks will be served.
2265 CUDA Centers of Excellence Super-Session IV
Come hear about the groundbreaking research taking place at Topic(s): General Interest
the CUDA Centers of Excellence, an elite group of world-renowned
research universities that are pushing the frontier of massively
parallel computing using CUDA. Researchers from these top
institutions will survey cutting-edge research that is advancing
the state of the art in GPU computing and dozens of application
fields across science and engineering.

In this session we will hear from Professor Ting-wai Chiu at

National Taiwan University, Dr. Satoshi Matsuoka at Tokyo Tech
and Dr. Paul Calleja at the University of Cambridge.
Speaker(s): Paul Calleja (University of Cambridge), Ting-Wai
Chiu (Professor, National Taiwan University), Satoshi
Matsuoka (Professor, Tokyo Institute of Technology)
Topic(s): General Interest

Tuesday, Sept 21, 17:00 (50 minutes)

Room K
2297 Developing CUDA Accelerated .NET Plugins for
Microsoft Excel
Quantifi will demo its xLDevelopment environment, which provides
developers with an easy-to-use development environment that
exposes CUDA functionality in Microsoft Excel. With as little
as four lines, one can also select the position of the function in
the menu bar, XML markup will display in the Excel help
functionality, and objects can be easily added to the object cache.
These objects can then be inspected by the end user or developer.
GE
Intelligent Platforms

1 Rugged Cluster.
290 Processors.
We’ve redefined the size, weight and power
equation for rugged applications.

Thanks to our partnership with NVIDIA®, you can now

take advantage of massively parallel GPGPU processing
in rugged applications such as radar, sonar, signal intelligence,
signal processing, image processing and software
defined radio. Indeed, any rugged application that involves
parallel execution of multiple threads will benefit. Best of
all, these performance improvements are accessible to
almost everyone because the CUDA™ architecture allows
you to program in C.

We offer a number of NVIDIA-based GPGPU products in

various rugged form factors, such as the 6U VPX IPN250,
which is CUDA-capable and supports OpenCL™. And our
AXISLib VSIPL DSP libraries will help you develop high
performance multi-core and multi-processor applications
without needing to understand the complexities of the
underlying hardware. You can also easily migrate AXIS
applications to new platforms. For more information on
our growing family of CUDA-capable GPGPU boards,
please visit www.ge-ip.com/gpgpu

© 2010 GE Intelligent Platforms, Inc. All rights reserved.

All other brands or names are property of their respective holders.
Wednesday, Sept 22, 09:00 (50 minutes) Wednesday, Sept 22, 10:00 (50 minutes)
Keynote Hall Marriott Guadalupe Room
1002 Day 2 Keynote with Dr. Klaus Schulten, University of 2082 CU-LSP: GPU-based Spectral Analysis of Unevenly
Illinois, Urbana-Champaign Sampled Data
How does the H1N1 “Swine Flu” virus avoid drugs while attacking Standard FFT algorithms cannot be applied to spectral analysis of
our cells? What can we learn about solar energy by studying unevenly sampled data. Alternative approaches scale as O(N^2),
biological photosynthesis? How do our cells read the genetic making them an ideal target for harnessing the raw computing
code? What comes next in computational biology? power of GPUs. To this end, I have developed CU-LSP, a CUDA
spectral analysis code based on the Lomb-Scargle periodogram.
Computational biology is approaching a new and exciting
Preliminary benchmarking indicates impressive speed-ups, on
frontier: the ability to simulate structures and processes in
the order of 400 relative to a single core of a modern CPU. An
living cells. Come learn about the “computational microscope,”
initial application of CU-LSP will be the analysis of time-series
a new research instrument that scientists can use to simulate
data from planet-search and asteroseismology satellites.
biomolecules at nearly infinite resolution. The computational
microscope complements the most advanced physical Speaker(s): Richard Townsend (Assistant Professor, University of
microscopes to guide today’s biomedical research. In this keynote Wisconsin-Madison)
address, computational biology pioneer Dr. Klaus Schulten of Topic(s): Astronomy & Astrophysics, Algorithms & Numerical
the University of Illinois, Urbana-Champaign, will introduce the Techniques, Signal processing
computational microscope, showcase the widely used software
underlying it, and highlight major discoveries made with the aid Wednesday, Sept 22, 10:00 (50 minutes)
of the computational microscope ranging from viewing protein Room A1
WEDNESDAY

folding, translating the genetic code in cells, and harvesting solar

2134 Ultra High Resolution Displays and Interactive
energy in photosynthesis. He will also look towards a future when
Eyepoint Using CUDA
cell tomography and computing will establish atom-by-atom
views of entire life forms. We’ll go over the challenges we have overcome in building 100
million pixel seamless displays. One customer requirement
Klaus Schulten received his Ph.D. from Harvard University in involves interactive changes of the eyepoint as a person moves,
1974. He is Swanlund Professor of Physics and is also affiliated relative to the screen, yet the distortions computed are quite non-
with the Department of Chemistry as well as with the Center for linear. We discuss our use of a GPU to implement this procedure.
Biophysics and Computational Biology. Professor Schulten is a
full-time faculty member in the Beckman Institute and directs the Speaker(s): Rajeev Surati (President, Scalable
Theoretical and Computational Biophysics Group at the University Display Technologies)
of Illinois Urbana-Champaign, IL. Honors and awards: Award in Topic(s): Computer Graphics, High Performance Computing,
Computational Biology 2008; Humboldt Award of the German Medical Imaging & Visualization
Humboldt Foundation (2004); University of Illinois Scholar (1996);
Fellow of the American Physical Society (1993); Nernst Prize of Wednesday, Sept 22, 10:00 (50 minutes)
the Physical Chemistry Society of Germany (1981). Room A2
Topic(s): General Interest, Life Sciences 2141 Moving the Frontier of Oil and Gas Exploration and
Speaker(s): Klaus Schulten (Professor, University of Illinois, Production with GPUs
Urbana-Champaign) Learn how the Oil and Gas Industry is embracing GPUs in order
to tackle new and complex oil and gas plays around the world.
Wednesday, Sept 22, 10:00 (50 minutes) The first part of this talk gives an overview of the business and
Room C geopolitical drivers of the industry, followed with the critical
contribution of computation in the quest for secure supply of
2058 A Practical Introduction to Computational Fluid energy.
Dynamics on GPUs
Speaker(s): Maurice Nessim (Schlumberger), Shashi Menon
Learn step-by-step procedures to write an explicit CFD solver
based on finite difference methods with staggered grid allocations (R&D Manager, Schlumberger)
and boundary fitted coordinates. We will discuss the derivation Topic(s): Energy Exploration, High Performance Computing
of the mathematical model, discretization of the model
equations, development of the algorithms, and parallelization and Wednesday, Sept 22, 10:00 (50 minutes)
visualization of the computed data using OpenCL and OpenGL. Room L
We compare case studies of natural convection, driven cavity, 2160 StarPU: a Runtime System for Scheduling Tasks
scaling analysis, and magneto-thermal convection computed
See how StarPU provides task scheduling facilities for a
using CSIRO’s CPU/GPU supercomputer cluster to known
hybrid platform and a powerful data management library that
analytical and experimental solutions.
transparently takes care of data across the entire machine. We
Speaker(s): Tomasz Bednarz (3d Visualisation Software will discuss the significant performance improvements resulting
Engineer, CSIRO) from its flexible scheduler as well as its ability to mix parallel
Topic(s): Computational Fluid Dynamics, Algorithms & CPU kernels (e.g., written in OpenMP or TBB) with CUDA/OpenCL
Numerical Techniques, High Performance and MPI.
Computing, Physics Simulation Speaker(s): Cedric Augonnet (INRIA)
Topic(s): Tools & Libraries, High Performance Computing
Wednesday, Sept 22, 10:00 (20 minutes) Wednesday, Sept 22, 10:00 (50 minutes)
Room D Room E
2163 Leveraging GPUs for Evolutionary Game Theory 2231 Driving on Mars, Redux: System Level Simulation of
Learn how GPUs are being used to accelerate the study of Dynamic Systems
the emergence of cooperative behavior in biology, from the Learn how GPU and HPC computing are used to predict through
interactions of humans to viruses to bacteria. The work presented simulation the dynamics of large complex mechanical systems
here achieves a speedup of 209x on a cluster of 4 Tesla GPUs. such as tracked vehicles including the Mars Rover. The
presentation outlines the physics based approach and numerical
Speaker(s): Amanda Peters (PhD Candidate, Harvard University)
solution methods that enabled the simulation of dynamic systems
Topic(s): Algorithms & Numerical Techniques, Life Sciences
with millions of bodies on the GPU. The presentation will also
explain how an HPC cluster is used to effectively render scenes
Wednesday, Sept 22, 10:00 (50 minutes) with tens of thousands of bodies for generating animations that
Room A3 can be used by engineers in the design process.
2166 The Triad of Extreme Computing-Fast Algorithms, Speaker(s): Dan Negrut (Assistant Professor, University
Open Software and Heterogeneous Systems of Wisconsin)
The first wave of successful GPU accelerations has been Topic(s): Physics Simulation, Algorithms & Numerical
crowded with highly-parallel methods that adapted well to the Techniques, High Performance Computing
hardware. But the easy pickings are now running out. The truly
challenging applications require “going back to the algorithmic Wednesday, Sept 22, 10:00 (50 minutes)
drawing board.” To develop new versions of the most effective Room A5

fast algorithms, such that our science can most benefit, an
ideal environment is created by the open software model, where 2249 New Programming Tools for GPU Computing
efforts can be shared. We will describe one area of application This session will focus on new parallel programming tools
--electrostatics of biomolecules in solution-- where we see at for GPU computing. The type of tools that fit into the session
work the triad of extreme computing: fast algorithms, open include (1) Planning tools for porting legacy applications to use
software, and heterogeneous computing. GPU computing, (2) High-level programming and scripting tools
for GPU computing, (3) Automation of common performance
Speaker(s): Lorena Barba (Assistant Professor, Boston
optimizations for GPU computing, (4) Performance analysis
University)
and diagnosis tools for GPU computing, (5) Tools that simplify
Topic(s): Algorithms & Numerical Techniques, Physics
heterogeneous parallel computing.
Simulation
Speaker(s): Wen-mei Hwu (Professor, University of Illinois,
Wednesday, Sept 22, 10:00 (50 minutes) Urbana-Champaign), Andrew Schuh (Project
Room B Manager, University of Illinois)
Topic(s): Tools & Libraries
2168 Interactive Molecular Dynamics for Nanomechanical
and Nanochemical Experiments
Wednesday, Sept 22, 10:00 (50 minutes)
Hear how the combination of GPU accelerated molecular Marriott San Jose Ballroom
dynamics simulation software, 3D TV displays, affordable haptic
game controllers, and high performance molecular visualization 2280 TSUBAME2.0 Experience
is leading to new ways to study materials and objects on the Tsubame2.0 is the next-generation multi-petaflops
nanoscale. We will present the concept of an appliance for supercomputer that has been designed and built at Tokyo Tech,
integrated virtual nanoscale experiments and challenges related with more than 4000 NVIDIA Fermi GPUs, as a successor to the
to software and hardware. highly successful Tsubame1. Deep design considerations were
made based on experiences on Tsubame1 retrofitted with the
Speaker(s): Axel Kohlmeyer (Associate Director, Institute for
previous generation Tesla to maximize the versatility and the
Computational Molecular Science, Temple University)
competitiveness of the system across a considerable number of
Topic(s): Molecular Dynamics
application domains, as well as accommodating as much strong
scaling as possible. This resulted in a totally new custom system
Wednesday, Sept 22, 10:00 (50 minutes) design in collaboration with HP and NEC, rather than a machine
Room A7 with retrofitted GPUs. The resulting supercomputer hopefully
2169 Real-time Volumetric Medical Ultrasound will become a design template of future large-scale GPU systems
Applications for GPU Computing to come.
Real-time volumetric medical ultrasound requires computationally Speaker(s): Satoshi Matsuoka (Professor, Tokyo Institute
intensive rapid processing of data for visualization of acquired of Technology)
acoustic data. Clinical applications of GPU-based technologies in Topic(s): High Performance Computing
obstetrics and cardiology will be discussed.

Speaker(s): Roee Lazebnik (Director of Product Development, Wednesday, Sept 22, 10:00 (50 minutes)
Siemens Healthcare) Room A8
Topic(s): Medical Imaging & Visualization, Imaging, 2305 PantaRay: Accelerating Out-Of-Core Ray Tracing of
Stereoscopic 3D, Computer Graphics Sparsely Sampled Occlusion
Modern VFX rendering pipelines are faced with major complexity
challenges: a film like Avatar requires rendering hundreds of
thousands of frames, each containing hundreds of millions
or billions of polygons. Furthermore, the process of lighting Wednesday, Sept 22, 10:30 (20 minutes)
requires many rendering iterations across all shots. In this Room D
talk, we present the architecture of an efficient out-of-core ray
2109 Migration of a Complete 3D Poisson Solver from
tracing system designed to make rendering precomputations
Legacy Fortran to CUDA
of gigantic assets practical on GPUs. The system we describe,
dubbed PantaRay, leverages the development of modern ray We describe our journey of migrating a legacy direct solver
tracing algorithms for massively parallel GPU architectures and library for Poisson equations written in Fortran77 to CUDA in
combines them with new out-of-core streaming and level of detail order to harness the computational power provided by the Tesla
rendering techniques. device (“Fermi”). This legacy library is still widely used today as
it is the most complete library that can deal with three different
Speaker(s): Luca Fascione (Senior Research and Development boundary conditions (Dirichlet, Neumann and Cyclic) and two
Engineer, Weta Digital) grid configurations (staggered and centered) independently in
Topic(s): Digital Content Creation (DCC) any of the three dimensions (x, y, z); giving a total of over 200
configurations.
Wednesday, Sept 22, 10:00 (20 minutes)
Speaker(s): Huynh Phung (Research Engineer, A*STAR Institute
Room K
of High Performance Computing)
2306 Gate-Level Simulation with GP-GPUs Topic(s): Tools & Libraries, Computational Fluid Dynamics
Logic simulation is a critical component of the digital design
tool flow. It is used from high-level descriptions down to gate- Wednesday, Sept 22, 10:30 (20 minutes)
level to validate several aspects of the design, particularly Room K
functional correctness. Despite development houses investing
2300 High-Performance Compressive Sensing using Jacket
WEDNESDAY

vast resources in the simulation task it is still far from achieving

the performance demands of validating complex modern designs This talk will present the ongoing work that I am doing in the
at gate-level. We developed a GP-GPU accelerated gate-level L1-optimization group at Rice University. The purpose of the
simulator using NVIDIA CUDA. work is to merge both compressive sensing, for image/signal
reconstructions and GPU computation, using NVIDIA’s GPUs to
We leverage novel algorithms for circuit netlist partitioning enhance the technology of CS.
and found that our experimental prototype could handle large,
industrial scale designs comprised of millions of gates while This talk will cover basic concepts in compressive sensing
delivering 13x speedup on average over a typical commercial and the easy adaptation of operating on the GPU, in particular
simulator. working with Jacket (by AccelerEyes). We will then cover some of
our numerical experiments that encompass the use of different
Speaker(s): Debapriya Chatterjee (Graduate Student, University flavors of algorithms.
of Michigan)
Speaker(s): Nabor Reyna Jr. (Graduate Student, Rice University)
Topic(s): General Interest
Topic(s): Imaging, Tools & Libraries
Wednesday, Sept 22, 10:00 (50 minutes)
Room N Wednesday, Sept 22, 11:00 (20 minutes)
Room B
2308 Building Cutting-Edge Realtime 3D Applications with
NVIDIA SceniX 2034 Reformulating Algorithms for the GPU
Learn how NVIDIA SceniX is a rapid start to building state of the Important applications in signal, data processing and
art, realtime 3D applications, and how raytracing can be combined bioinformatics that use dynamic programming are difficult to
with raster graphics for new levels of interactive realism. parallelize due to intrinsic data dependencies. We demonstrate
a novel technique to extract parallelism out of data dependent
Speaker(s): Brian Harrison (NVIDIA), Michael Morrison (NVIDIA) algorithms and reformulate the same for GPUs.
Topic(s): Computer Graphics, Computer Vision, Ray Tracing,
Stereoscopic 3D This simple technique breaks the dependencies and resolves
them at an optimal point later in time, thus obtaining remarkable
Wednesday, Sept 22, 10:00 (50 minutes) speedup on GPUs. We present a case study from computational
Keynote Hall biology, i.e., protein motif-finding. We also present how the same
technique can be extended and applied to other relevant problems
4000 Emerging Companies Summit Opening Address such as gene-prediction and phylogenetics.
The Emerging Companies Summit is a unique forum for startup
Speaker(s): Narayan Ganesan (Research Scientist, University
companies to showcase innovative applications that leverage
of Delaware), Michela Taufer (Assistant Professor,
the GPU to solve visual and compute-intensive problems. The
University of Delaware)
Opening Address includes an overview of NVIDIA’s GPU ecosystem
Topic(s): Life Sciences, Algorithms & Numerical Techniques,
development activities and an interaction on stage with selected
High Performance Computing
companies building groundbreaking applications on top of the
GPU platform.
Wednesday, Sept 22, 11:00 (20 minutes)
The ECS is a great opportunity to discover new players in the Room K
GPU ecosystem, find great investments, explore partnership and
2039 GPU Debugging with Allinea DDT
customer/vendor opportunities, network/build relationships, and
discuss the future of an industry that is reshaping computing. Discover how a debugger can help you fix those hard to find bugs
in your GPU software, with this introduction to the special CUDA
Speaker(s): Jeff Herbst (Vice President of Business features in Allinea DDT.
Development, NVIDIA)
Topic(s): General Interest
Speaker(s): David Lecomber (CTO, Allinea Software) Topic(s): Computational Fluid Dynamics, Algorithms &
Topic(s): Tools & Libraries Numerical Techniques, High Performance Computing

Wednesday, Sept 22, 11:00 (50 minutes) Wednesday, Sept 22, 11:00 (50 minutes)
Room A2 Room L
2059 Industrial Seismic Imaging on GPUs 2092 Integrating CUDA into a Large-Scale Commercial
At Hess Corporation, we have moved the most computationally Database Management System
intensive parts of our seismic imaging codes from CPUs to GPUs In a large-scale database installation where data tables are
over the past few years. In this talk I will give an overview of distributed across multiple servers, computational throughput
seismic imaging, highlighting the physical and computational can be optimized by using GPUs on each server and integrating
algorithms of these codes. I will discuss our software approach database management with GPU resources.
and the programming effort to port them to GPUs, concluding
In the Department of Physics and Astronomy at The Johns
with a summary of our progress in adopting GPUs in production.
Hopkins University, we are experimenting with a set of software
Speaker(s): Scott Morton (Geophysical Advisor, tools that closely couple SQL statements with GPU functionality.
Hess Corporation) While still under development, the new framework is now
Topic(s): Energy Exploration, High Performance Computing routinely used in our research projects, e.g., to study the spatial
clustering of galaxies as well as genomics.
Wednesday, Sept 22, 11:00 (50 minutes) Speaker(s): Richard Wilton (Research Scientist, The Johns
Room E Hopkins University), Tamas Budavari (Research
2065 Massively Accelerating Iterative Gauss-Newton Scientist, Johns Hopkins University)

Fitting Topic(s): Databases & Data Mining, Astronomy &
To measure three-dimensional shape data of objects, we build Astrophysics, High Performance Computing,
up a measurement system that assigns three-dimensional Tools & Libraries
coordinates to the position of projected measurement labels in
a camera image. To achieve high measurement accuracy across Wednesday, Sept 22, 11:00 (50 minutes)
large numbers of measurement points, we need a very quick Marriott Guadalupe Room
routine to localize measurement labels with high precision. To 2099 Cosmology Powered by GPUs Redux
speed up the computation, we evaluate the fits using the CUDA
Cosmological simulations aim at reproducing the physical
architecture. The final implementation speeds up the fitting of 104
processes which occur on the largest scales of the Universe
two-dimensional Gauss functions by a factor of 90.
since the Big-Bang by means of numerical calculations on
Speaker(s): Daniel Härter (University of Freiburg, IMTEK, supercomputers. Using CUDA, I have implemented standard
Laboratory for Process Technology) cosmological techniques on GPU architecture (PM N-Body solver,
Topic(s): Computer Vision, Stereoscopic 3D Hydrodynamics & moment-based radiative transfer) and designed
them to run on supercomputing facilities by means of MPI+CUDA
Wednesday, Sept 22, 11:00 (50 minutes) mixed programming. These applications are able to run on 100 or
Room A1 more graphics devices with typical scalar x50 accelerations and
with a communication overhead limited to 15%. This allows us to explore
2071 Large Scale Visualization Soup physical regimes which were out of reach of current simulations.
The unprecedented realism that is possible today allows for
visualization at an ever larger scale. This talk will walk through Speaker(s): Dominique Aubert (Lecturer, Strasbourg University)
several case studies from high resolution single displays to Topic(s): Astronomy & Astrophysics
completely immersive environments. Details will be shared
on how to architect and implement these installations, with Wednesday, Sept 22, 11:00 (50 minutes)
attention to the typical issues encountered. It will cover how to Room N
implement stereo 3D in OpenGL, Direct3D, as well as how that 2104 Rapid Prototyping Using Thrust: Saving Lives with
relates to the different display technologies (projectors, multi- High Performance Dosimetry
display, CAVEs, etc.).
Radiation poisoning is an everpresent danger for intervention
Speaker(s): Steve Nash (Applied Engineer, NVIDIA) teams that must visit nuclear sites. Virtual reality can help teams
Topic(s): Computer Graphics, Stereoscopic 3D prepare for intervention, but efficient computation of radiation
dosage is critical to study complex scenarios. Radiation protection
Wednesday, Sept 22, 11:00 (50 minutes) research often uses codes based on the straight line attenuation
Marriott San Jose Ballroom method. As with other approaches, geometrical computations
(finding all intersections between radiation rays and objects)
2078 Shockingly fast and accurate CFD simulations remain the simulation bottleneck. This talk will describe how we
In the last three years we have demonstrated how GPU
have used the Thrust high-level library for CUDA C/C++ to quickly
accelerated discontinuous Galerkin methods have enabled prototype innovative algorithms and achieve a significant speed
simulation of time-dependent, electromagnetic scattering from up.
airplanes and helicopters.
Speaker(s): Guillaume Saupin (CEA)
In this talk we will discuss how we have extended these Topic(s): High Performance Computing, Algorithms &
techniques to enable GPU accelerated simulation of supersonic Numerical Techniques, Physics Simulation,
airflow as well. Ray Tracing
Speaker(s): Timothy Warburton (Associate Professor,
Rice University)
Wednesday, Sept 22, 11:00 (50 minutes) Elif Albuz (NVIDIA), Nathan Whitehead (CUDA
Room A7 Software Engineer, NVIDIA), Frank Jargstorff
(Software Engineer, NVIDIA)
2146 Virtual Surgery
Topic(s): Tools & Libraries
Come see how 3D Vision technology is used in Virtual Surgery
Training for Medical Education. BioDigital Systems, in conjunction
Wednesday, Sept 22, 11:00 (50 minutes)
with University of California San Francisco (UCSF), has developed
Room A5
a dental injection simulator to teach students of dentistry the
mechanics of nerve block injection. 3D Vision Technology has 2275 The Evolution of GPUs for General Purpose Computing
added a new dimension of realism by providing users with a Learn how the GPU evolved from its humble beginning as a “VGA
unique immersive experience. Accelerator” to become a massively parallel general purpose
Speaker(s): Aaron Oliker (Managing Partner/Director 3D
accelerator for heterogeneous computing systems. This talk will
Technology, BioDigital)
focus on significant milestones in GPU hardware architecture and
software programming models, covering several key concepts
Topic(s): Medical Imaging & Visualization, Stereoscopic 3D
that demonstrate why advances in GPU parallel processing
performance and power efficiency will continue to outpace CPUs.
Wednesday, Sept 22, 11:00 (50 minutes)
Room C Speaker(s): Ian Buck (Software Director of GPU Computing,
NVIDIA)
2177 Simplifying Parallel Programming with Domain
Topic(s): General Interest
Specific Languages
Explore a new approach in parallel programming which leverages
Wednesday, Sept 22, 11:00 (50 minutes)
Domain Specific Languages (DSLs) to simplify programming

Room A8
heterogeneous systems (multi-core processors and GPUs).
This approach allows DSL users to take advantage of the 2286 Towards Peta-Scale Green Computation -
power of GPUs without having working knowledge of lower Applications of the GPU Supercomputers in the Chinese
level programming models such as CUDA. Topics will cover the Academy of Sciences (CAS)
advantages of the DSL approach in parallel programming, and China now holds three spots in the June 2010 Top500 list of
the runtime implementation details with optimizations to have the GPU-based supercomputers, and two of them, using NVIDIA
performance benefits of using GPUs. GPUs, are related to CAS. Efficient use of these systems is more
Speaker(s): HyoukJoong Lee (PhD Student, Stanford University),
important than peak or Linpack performance. This session will
Hassan Chafi (PhD Candidate, Stanford University)
cover some of the large-scale multi-GPU applications in CAS,
ranging from molecular dynamics below nano-scale to complex
Topic(s): Tools & Libraries, High Performance Computing
flows on meter-scale and porous media on geological scales, as
well as fundamental linear algebra and data/image analysis. The
Wednesday, Sept 22, 11:00 (50 minutes)
idea of keeping high-efficiency and generality of the computation
Room D
platform by maintaining a consistency among the target physical
2207 Playing Zero-Sum Games on the GPU system, the computational model and algorithm, and the
A Zero-Sum game is a match for which the gain of one results in computer hardware will be explained in detail and demonstrated
loss of the other. Tic-Tac-Toe, Checkers and Chess are Zero- through a number of super-computing applications in the
Sum board game examples. For realizing the best player move, chemical, oil, mining, metallurgical and biological industries.
the game is abstracted as a tree, often quite deep, consisting Speaker(s): Wei Ge (Professor, Institute of Process Engineering,
of all possible configurations. We present an efficient GPU Chinese Academy of Sciences), Xiaowei Wang (Dr.,
implementation of the Mini-Max search algorithm, enhanced with Institute of Process Engineering), Yunquan Zhang
Alpha-Beta pruning. We highlight challenges for deploying non- (Professor, Institute of Software, CAS), Long Wang
tail recursion of a highly irregular algorithm on GPUs, proposing (Associate Professor, Super Computing Center, Institute)
a hybrid of compiler and user managed stack. We demonstrate Topic(s): High Performance Computing
superior performance for running many thousands of 3D Tic-Tac-
Toe matches, simultaneously.
Wednesday, Sept 22, 11:00 (50 minutes)
Speaker(s): Avi Bleiweiss (Principal Architect, NVIDIA Corporation) Room M
Topic(s): Machine Learning & Artificial Intelligence 2293 Scaling Up and Scaling Out GPUs with Supermicro’s
Twin™ Architecture (Sponsored by Supermicro)
Wednesday, Sept 22, 11:00 (50 minutes)
Find out how Supermicro scales up and scales out GPU
Room A3
performance by using Twin™ architecture. In this session, we
2216 CUDA Libraries Open House outline Supermicro’s Twin™ architecture advantages across
Learn about NVIDIA’s CUDA libraries and meet the engineers 1U/2U GPU servers and the design of a personal supercomputer,
that develop them. Lead developers will cover the capabilities, and how we are able to scale and optimize GPU technology for
performance and future directions for NVIDIA’s CUFFT, CUBLAS, datacenter environments and professional workstations.
CURAND, and NPP libraries (other libraries such as CUSPARSE Speaker(s): Don Clegg (Supermicro)
and open source Thrust are covered in other talks). After the Topic(s): High Performance Computing, Computer Vision
presentation, NVIDIA developers will remain in the room to chat
and answer questions during the lunch break.
Speaker(s): Ujval Kapasi (CUDA Platform SW, NVIDIA), Philippe
Vandermersch (Senior Software Engineer, NVIDIA),
Wednesday, Sept 22, 11:00 (50 minutes) Wednesday, Sept 22, 12:00 (120 minutes)
Keynote Hall Exhibit Hall
4001 Emerging Companies: CEO on Stage featuring 1004 Exhibits Open / Networking Lunch
Elemental Technologies, Geomerics, and Milabra Join your colleagues in the exhibit hall to preview emerging
See the hottest new technologies from startups that are technologies and see some of the most innovative solutions
transforming computing. available today. Lunch will be served.

In a lively and fast-paced exchange, the “Emerging Companies Topic(s): General Interest
Summit - CEO on Stage” sessions will feature CEOs from three
startups who will each have 15 minutes to introduce their Wednesday, Sept 22, 14:00 (50 minutes)
companies and interact with a panel of leading venture capitalists, Marriott Guadalupe Room
technology executives, and industry analysts.
2000 Gravitational N-body Simulations: How Massive Black
Panelist(s): Drew Lanza (Partner, Morgenthaler), Dan’l Lewin Holes Interact with Stellar Systems
(Corporate VP and Strategic & Emerging Business Astrophysics is a field where super computing is a must to obtain
Development, Microsoft), Jon Peddie (President, new scientific results. In particular, the study of the interaction
JPR), Jeff Herbst (Vice President of Business among massive black holes and surrounding stars is a hot topic,
Development, NVIDIA) which requires heavy computations to have good representation
Speaker(s): Sam Cox (CEO, Milabra), Sam Blackman (CEO of what happens in the inner regions of galaxies. We present
and Co-Founder, Elemental Technologies, Inc.), the results obtained with our high-precision N-body code,
Chris Doran (Founder and Chief Operating NBSymple, which exploits the joint power of a multi core CPU
Officer, Geomerics) system together with the high performance NVIDIA Tesla C1060

Topic(s): General Interest, Video Processing, GPUs.
Computer Graphics
The code is available at the website:
astrowww.phys.uniroma1.it/dolcetta/nbsymple.html
Wednesday, Sept 22, 11:30 (20 minutes)
Room B Speaker(s): Roberto Capuzzo-Dolcetta (Professor, Sapienza Univ.
of Roma), Alessandra Mastrobuono Battisti (PhD
2035 Simulations of Large Membrane Regions
Student, Sapienza University of Rome)
Learn how to study membrane-bound protein receptors by Topic(s): Astronomy & Astrophysics, Algorithms & Numerical
moving beyond the current state-of-the-art simulations that only Techniques
consider small patches of physiological membranes. Towards this
end, this session presents how to apply large-scale GPU-enabled
Wednesday, Sept 22, 14:00 (50 minutes)
computations of extended phospholipid bilayer membranes using
Room D
a GPU code based on the CHARMM force field for MD simulations.
Our code enables fast simulations of large membrane regions in 2038 The Best of Both Worlds: Flexible Data Structures for
NVT and NVE ensembles and includes different methods for the Heterogeneous Computing
representation of the electrostatic interactions, i.e., reaction force Learn how to switch between array of structs (AoS) and struct of
field and Ewald summation (PME) methods. Performance and arrays (SoA) storage without having to change the data access
scientific results for dimyristoylphosphatidylcholine (PC) based syntax. A few changes to the struct and container definitions will
lipid bilayers are presented. enable you to evaluate the performance of AoS vs. SoA on your
Speaker(s): Michela Taufer (Assistant Professor, University
existing AoS code. We present a simple abstraction that retains
of Delaware), Narayan Ganesan (Research Scientist,
the more intuitive AoS syntax array[index].component, yet allows
University of Delaware), Sandeep Patel (Assistant
you to switch between AoS and SoA storage with a single template
Professor, University of Delaware)
parameter at class definition.
Topic(s): Molecular Dynamics, High Performance Computing, Speaker(s): Robert Strzodka (Senior Researcher, Max Planck
Physics Simulation Institut Informatik)
Topic(s): Algorithms & Numerical Techniques,
Wednesday, Sept 22, 11:30 (20 minutes) Tools & Libraries
Room K
2117 Migration of C and Fortran Apps to GPGPU using HMPP Wednesday, Sept 22, 14:00 (50 minutes)
Room A3
GPGPU is a tremendous opportunity for many application fields.
Migrating legacy software to GPGPU is a complex process that 2041 PyCUDA: Even Simpler GPU Programming with
requires mastering the technological risks (e.g. loss of code Python
portability, extensive code restructuring, debugging complexity) Explore PyCUDA, a robust, open-source toolkit that lets you
as well as costs. In this talk, we present a methodology based on control your GPU from the comfort of Python, a Matlab-like
HMPP (Heterogeneous Multicore Parallel Programming), allowing scripting language. Learn about Fermi tuning with PyCUDA,
incremental processes that reduce the cost and risks of porting the new interfaces for CUBLAS and CUFFT, the ecosystem of
codes to GPGPU. third-party libraries built on PyCUDA, and examples illustrating
Speaker(s): Francois Bodin (CTO, CAPS Entreprise)
PyCUDA’s benefits to large-scale applications.
Topic(s): High Performance Computing, Tools & Libraries Speaker(s): Andreas Kloeckner (Courant Instructor, Courant
Institute, NYU)
Topic(s): Tools & Libraries, Computational Fluid Dynamics,
Physics Simulation
Wednesday, Sept 22, 14:00 (20 minutes) Speaker(s): Bruno Nicoletti (CTO, The Foundry)
Room K Topic(s): Film, Tools & Libraries, Video Processing

2045 Roe-Pike Scheme for 2D Euler Equations

Wednesday, Sept 22, 14:00 (50 minutes)
Hear how we are improving our elsA and CEDRE computational
Room E
fluid dynamics software by working on solving the Euler equations
set on the GPU. We discuss how our implementation considers 2137 CUDA for Real-Time Multigrid Finite Element
the associated Riemann problem and the Roe-Pike differencing Simulation of Soft Tissue Deformations
scheme at several orders in space while also introducing The take-away of this presentation is an efficient CUDA
immersed boundary conditions. Covers the significant speedup implementation of a finite hexahedra multigrid solver for
obtained through algorithmic and computational optimizations. simulating elastic deformable models in real time. Due to the
Speaker(s): Matthieu Lefebvre (PhD student, ONERA)
regular shape of the numerical stencil induced by the hexahedral
regime, computations and data layout can be restructured to avoid
Topic(s): Computational Fluid Dynamics, Algorithms &
execution divergence and to support memory access patterns
Numerical Techniques
enabling the hardware to coalesce multiple memory accesses into
single memory transactions. This makes it possible to effectively exploit
Wednesday, Sept 22, 14:00 (50 minutes)
the GPU’s parallel processing units and high memory bandwidth.
Room B
Performance gains of up to a factor of 12 compared to a highly
2068 Parallelizing FPGA Technology Mapping using GPUs optimized CPU implementation are demonstrated.
FPGA technology mapping is an algorithm that is heavily data Speaker(s): Christian Dick (PostGraduate Fellow, Technische
parallel, but contains many features that make it unattractive Universität München), Joachim Georgii (PostDoc,

for GPU implementation. The algorithm uses data in irregular Technische Universität München)
ways since it is a graph-based algorithm. It also makes heavy Topic(s): Physics Simulation, Algorithms & Numerical
use of constructs like recursion which is not supported by Techniques, High Performance Computing
GPU hardware. In this paper, we take a state-of-the-art FPGA
technology mapping algorithm within Berkeley’s ABC package
Wednesday, Sept 22, 14:00 (50 minutes)
and attempt to parallelize it on a GPU. We show that runtime
Room A7
gains of 3.1x are achievable while maintaining identical quality as
demonstrated by running these netlists through Altera’s Quartus 2139 Interactive Histology of Large-Scale Biomedical
II place-and-route tool. Image Stacks
Speaker(s): Doris Chen (Student, University of Toronto)
Get the latest information on leveraging GPU computing to
process and visualize large-scale biomedical image stacks. We
Topic(s): Algorithms & Numerical Techniques,
will discuss both display-aware processing and GPU-accelerated
texture compression for histology applications on the GPU.
Wednesday, Sept 22, 14:00 (50 minutes)
Room L Speaker(s): Won-Ki Jeong (Research Scientist, Harvard
University), Jens Schneider (Postdoctoral Fellow,
2120 High Performance Complex Event Processing on
King Abdullah University of Science and Technology)
GPGPU
Topic(s): Medical Imaging & Visualization, Imaging,
Complex Event Processing (CEP), a crucial component in Life Sciences
enterprise-scale applications, is the key element in that it allows
applications to process the incoming event streams and apply
Wednesday, Sept 22, 14:00 (50 minutes)
relevant techniques in real-time for quicker decisions, making it
Room A5
easy to identify complex patterns in the events. Much of the time
in this system is consumed by the event matching algorithms. Our 2140 Superfast Nearest Neighbor Searches Using a
work utilizes the highly parallel GPU for event matching algorithm Minimal kd-tree
wherein every incoming event is worked upon by this algorithm Learn how to adapt a kd-tree spatial data structure for efficient
and results in high throughput. nearest neighbor (NN) searches on a GPU. Although the kd-tree
Speaker(s): Murali Krishna (Junior Research Associate, Infosys
is not a natural fit for GPU implementation, it can still be effective
Technologies Limited), Sudeep Mallick (Principal
with the right engineering decisions. By bounding the maximum
Research Scientist, Infosys)
height of the kd-tree, minimizing the memory footprint of data
structures, and optimizing the GPU kernel code, multi-core GPU
Topic(s): Databases & Data Mining, Finance
NN searches with tens of thousands to tens of millions of points
run 10-40 times faster than the equivalent single-core CPU NN
Wednesday, Sept 22, 14:00 (50 minutes)
searches.
Room A1
Speaker(s): Shawn Brown (Graduate Student, UNC, Chapel Hill)
2125 Developing GPU Enabled Visual Effects For Film
Topic(s): Algorithms & Numerical Techniques, Databases &
And Video
Data Mining, Machine Learning & Artificial Intelligence

The arrival of fully programmable GPUs is now changing the visual
effects industry, which traditionally relied on CPU computation
Wednesday, Sept 22, 14:00 (50 minutes)
to create their spectacular imagery. Implementing the complex
Room A2
image processing algorithms used by VFX is a challenge, but the
payoffs in terms of interactivity and throughput can be enormous. 2142 Complex Geophysical Imaging Algorithms Enabled by
Hear how The Foundry’s novel image processing architecture GPU technology
simplifies the implementation of GPU-enabled VFX software and Learn how computationally expensive geophysical methods with
eases the transition from a CPU based infrastructure to a GPU 100s of TB of data become a commercial reality through the
based one.
adoption of GPUs. The first part of the talk will give an overview Speaker(s): Eri Rubin (Senior CUDA R&D developer, OptiTex)
of the computational challenges for imaging facing the oil and Topic(s): Physics Simulation, Tools & Libraries
gas industry. The second part will show how the current most
advanced methods are taking advantage of the GPU technology. Wednesday, Sept 22, 14:00 (50 minutes)
Speaker(s): David Nichols (Research Director, Schlumberger) Room N
Topic(s): Energy Exploration, Algorithms & Numerical 2248 Parallel Processing on GPUs at the University of Utah
Techniques, High Performance Computing The University of Utah is a CUDA Center of Excellence. We have
been doing both basic and applied research using CUDA. In this
Wednesday, Sept 22, 14:00 (50 minutes) session, we plan to give 3-4 talks on ongoing research. Most of
Room C the work that we will be presenting has been peer reviewed at
2164 Analytical Performance Models to Improve the top conferences.
Efficiency of GPU Computing Speaker(s): Claudio Silva (Professor, University of Utah), Huy Vo
Dive deep into a simple analytical model that provides insight (Research Assistant, University of Utah)
into performance bottlenecks of parallel applications on GPU Topic(s): High Performance Computing, Life Sciences,
architectures. We will discuss how the model estimates the Medical Imaging & Visualization, Tools & Libraries
execution time of massively parallel programs. We will also cover
how to optimize applications based on our developed performance Wednesday, Sept 22, 14:00 (50 minutes)
analysis models. Room M
Speaker(s): Hyesoon Kim (Assistant Professor, Georgia Tech) 2287 Internal GPUs on Dedicated x16 Slots - Are They
Topic(s): Tools & Libraries Needed For HPC? (Sponsored by Dell)
We have benchmarked the real performance impact on a series

Wednesday, Sept 22, 14:00 (50 minutes) of GPU accelerated applications to understand the benefits and
Marriott San Jose Ballroom drawbacks of different system level configurations. Come hear
2204 Bridging GPU Computing and Neuroscience to Build about the effects on performance of GPUs in shared slots and of
Large-Scale Face Recognition on Facebook. GPUs that are externally connected.
Biologically-inspired computer vision algorithms – those that Speaker(s): Mark Fernandez (Computer Scientist, Dell)
aim to mirror the computations performed by the brain’s visual Topic(s): High Performance Computing
system – have emerged as exceptionally promising candidates in
object and face recognition research, achieving performance on Wednesday, Sept 22, 14:00 (50 minutes)
a range of object and face recognition tasks. Recently, we have Keynote Hall
begun harnessing the newly-available power of NVIDIA GPUs to
tackle the problem of biologically-inspired model selection within
4002 Emerging Companies: CEO on Stage featuring
a large-scale model search framework, drawing inspiration from
Allegorithmic SAS, Bunkspeed, and miGenius
high-throughput screening approaches in molecular biology and See the hottest new technologies from startups that are
genetics where a large number of organisms are screened in transforming computing.
parallel for a given property of interest. In a lively and fast-paced exchange, the “Emerging Companies
As the available computational power provided by massively Summit - CEO on Stage” sessions will feature CEOs from
parallel technology from NVIDIA continues to expand, we three startups who will each have 15 minutes to introduce their
hope that this research will hold great potential for new social companies and interact with a panel of leading venture capitalists,
networking applications in addition to rapidly accelerating technology executives, and industry analysts.
progress in artificial vision, and for generating new, Panelist(s): Drew Lanza (Partner, Morgenthaler), Dan’l Lewin
experimentally testable hypotheses for the study of biological (Corporate VP and Strategic and Emerging Business
vision. Development, Microsoft), Jon Peddie (President,
Speaker(s): Nicolas Pinto (PhD Student, MIT) JPR), Jeff Herbst (Vice President of Business
Topic(s): Computer Vision, High Performance Computing, Development, NVIDIA)
Machine Learning & Artificial Intelligence, Speaker(s): Philip Lunn (CEO, Bunkspeed), Dr Sébastien Deguy
Neuroscience (Founder and CEO, Allegorithmic), Chris Blewitt
(Director, miGenius Limited)
Wednesday, Sept 22, 14:00 (50 minutes) Topic(s): General Interest, Cloud Computing, Computer
Room A8 Graphics, Mobile & Tablet & Phone

2246 The Challenges of Integrating CUDA Engines Into an Existing Package, Yet Not Sinking the Boat
Based on a true story, come listen to a daring tale about the process of integrating a large CUDA component (physical engine) into an existing product (3D engine), replacing some of its functionality. We cover the architectural difficulties and finer points that needed to be addressed, and the tuning and testing of such a large system without affecting the stability of the original system.

Wednesday, Sept 22, 14:30 (20 minutes)
Room K
2049 Deflated Preconditioned Conjugate Gradient on the GPU
Explore how to use deflation as a second-level preconditioning technique to speed up the Block Incomplete Cholesky Preconditioned Conjugate Gradient method. We use it to solve the pressure correction equation involved in the solution of the Two-Phase Fluid Flow problem. Our implementation reaches speedup factors of 25-30, for more than 260,000 unknowns, when compared to the CPU.
Speaker(s): Rohit Gupta (Researcher/Teacher, Delft University of Technology), Kees Vuik (Professor, Delft University of Technology)
Topic(s): Computational Fluid Dynamics, Algorithms & Numerical Techniques

Wednesday, Sept 22, 15:00 (50 minutes)
Room A1
2029 Computer Vision Algorithms for Automating HD Post-Production
Discover how post-production tasks can be accelerated by taking advantage of GPU-based algorithms. In this talk we present computer vision algorithms for corner detection, feature point tracking, image warping and image inpainting, and their efficient implementation on GPUs using CUDA. We also show how to use these algorithms to do real-time stabilization and temporal re-sampling (re-timing) of high definition video sequences, both common tasks in post-production. Benchmarking of the GPU implementations against optimized CPU algorithms demonstrates a speedup of approximately an order of magnitude.
Speaker(s): Hannes Fassold (Scientist, JOANNEUM RESEARCH)
Topic(s): Computer Vision, Video Processing

Wednesday, Sept 22, 15:00 (50 minutes)
Marriott Guadalupe Room
2044 GRASSY: Leveraging GPU Texture Units for Asteroseismic Data Analysis
Learn how to use the hidden computation capability of GPU texture units for general purpose computation. We describe GRASSY, a system for stellar spectral synthesis where the core problem is interpolation between pre-computed intensity values. We map these pre-computed tables to the GPU’s texture memory. Interpolation then becomes a texture lookup where the hardware automatically performs the interpolation, albeit at very low precision. Our mathematical framework reasons about the impact of this precision and our performance results show 500X speedups. This work generalizes the GPU texture units as computation engines and opens up new problems for GPU acceleration.
Speaker(s): Matt Sinclair (Research Assistant, UW-Madison)
Topic(s): Astronomy & Astrophysics, High Performance Computing

Wednesday, Sept 22, 15:00 (50 minutes)
Room N
2050 Copperhead: Data-Parallel Python for the GPU
Learn how to write Python programs that execute highly efficiently on GPUs using Copperhead, a data-parallel Python runtime. Using standard Python constructs like map and reduce, we will see how to construct data-parallel computations and embed them in Python programs that interoperate with numerical and visualization libraries such as NumPy, SciPy and Matplotlib. We will examine how to express computations using Copperhead, explore the performance of Copperhead programs running on GPUs, and discuss Copperhead’s runtime model, which enables data-parallel execution from within Python.
Speaker(s): Bryan Catanzaro (PhD Candidate, University of California, Berkeley)
Topic(s): Tools & Libraries

Wednesday, Sept 22, 15:00 (50 minutes)
Room M
2080 Tackling Multi-Gigabit Design Challenges with a Practical Virtual EMI/ESD Lab
Learn about efficient methodologies for performant and cost-effective EMI and ESD suppression techniques by means of massive GPU parallel processing for simulations. We will discuss solving ever more complicated EMI and ESD challenges very early in the design process using a so-called ‘Virtual EMI/ESD lab’.
Speaker(s): Davy Pissoort (Professor, KHBO-FMEC), Amolak Badesha (Senior Application Expert & Strategist, Agilent Technologies), Hany Fahmy (Director, SI/EMC Engineering, NVIDIA)
Topic(s): Physics Simulation, Tools & Libraries

Wednesday, Sept 22, 15:00 (50 minutes)
Room C
2122 Using GPUs for Real-Time Brain-Computer Interfaces
Learn how GPU processing can provide researchers with an inexpensive and versatile alternative to dedicated signal processing hardware for real-time neural prosthetics. Topics will include an overview of algorithms, current state-of-the-art hardware, GPU processing in a real-time environment, multi-platform processing, and future directions in BCIs using GPU processing.
Speaker(s): Adam Wilson (Postdoctoral Fellow, University of Cincinnati)
Topic(s): Neuroscience, Algorithms & Numerical Techniques, Signal processing

Wednesday, Sept 22, 15:00 (50 minutes)
Room E
2170 Lattice Boltzmann Multi-Phase Simulations in Porous Media using GPUs
Learn how a very efficient implementation of multiphase lattice Boltzmann methods (LBM) based on CUDA delivers significant benefits for predictions of properties in rocks. This simulator on NVIDIA hardware enables us to perform pore scale multi-phase (oil-water-matrix) simulations in natural porous media and to predict important rock properties like absolute permeability, relative permeabilities, and capillary pressure. We will show videos of these simulations in complex real world porous media and rocks.
Speaker(s): Jonas Toelke (Chief Computational Software Development, Ingrain)
Topic(s): Computational Fluid Dynamics, Energy Exploration

FULL CONFERENCE GUIDE 2010

Wednesday, Sept 22, 15:00 (50 minutes)
Room A2
2174 Reverse Time Migration on GPUs
Learn how GPUs can be used to accelerate subsurface imaging for Oil & Gas exploration. We will discuss results and lessons learned while implementing a Reverse Time Migration algorithm on GPUs, achieving significant performance improvements over a comparable CPU implementation.
Speaker(s): Alex Loddoch (Research Scientist, Chevron)
Topic(s): Energy Exploration, High Performance Computing

Wednesday, Sept 22, 15:00 (50 minutes)
Room A7
2211 Modern Architecture for Massively Parallel Medical Tomographic Image Reconstruction on a GPU Cluster
Learn how to combine GPU and Cluster Programming with a real-world example. Many aspects of medical tomographic image reconstruction are embarrassingly parallel, but require massive compute power. We distribute the load onto a cluster of multi-GPU equipped nodes using Message Passing Interface (MPI) and CUDA. The Thrust library allows for a modern object-oriented approach.
Speaker(s): Sven Prevrhal (Staff Research Scientist, Philips), Jingyu Cui (Graduate Student, Stanford University)
Topic(s): Medical Imaging & Visualization, Algorithms & Numerical Techniques, High Performance Computing, Tools & Libraries

Wednesday, Sept 22, 15:00 (50 minutes)
Room B
2218 Redesigning Molecular Dynamics for GPUs and GPU Clusters
Generalized Born and Particle Mesh Ewald (PME) molecular dynamics are two computationally intensive algorithms for simulating biological molecules. While several adaptations of Generalized Born have attained excellent speedup on GPUs, high performance Particle Mesh Ewald has been more elusive. Here we describe in detail a recent port of PME implemented within AMBER 11 that has achieved performance on par with up to 128 nodes of a top ten supercomputer.
Speaker(s): Scott Le Grand (Principal Engineer, NVIDIA)
Topic(s): Molecular Dynamics, Algorithms & Numerical Techniques, High Performance Computing, Life Sciences

Wednesday, Sept 22, 15:00 (50 minutes)
Room A3
2234 Unstructured Finite Volume Code on a Cluster with Multiple GPUs per Node
Explore how a code written to run in parallel using OpenMP and on a single GPU was modified to run across multiple GPUs and nodes on a multi-CPU, multi-GPU cluster installed at the Naval Research Laboratory. We will discuss the performance of this code running in parallel using MPI/OpenMP and MPI/CUDA.
Speaker(s): Keith Obenschain (Computer Scientist, Naval Research Lab), Andrew Corrigan (Naval Research Laboratory & George Mason University)
Topic(s): Computational Fluid Dynamics, High Performance Computing

Wednesday, Sept 22, 15:00 (50 minutes)
Room L
2237 Accelerating Business Intelligence Applications with Fast Multidimensional Aggregation
In this research session, we present an approach using NVIDIA GPUs as massively parallel coprocessors for in-memory OLAP computations. Early tests have shown speedup factors of more than 40x compared to optimized sequential algorithms on a CPU. In addition to the data structures and algorithms involved, we describe a method to extend the approach to systems with more than one GPU in order to scale it to larger data sets.
Speaker(s): Tobias Lauer (Researcher, University of Freiburg), Christoffer Anselm (Software Developer, Jedox Business Intelligence)
Topic(s): Databases & Data Mining

Wednesday, Sept 22, 15:00 (50 minutes)
Room A5
2238 Better Performance at Lower Occupancy
It is usually advised to optimize CUDA kernels for higher occupancy to hide memory and arithmetic latencies better. In this presentation, I show that increasing occupancy is not the only way and not always the best way to hide latency on GPU. Instead, it may be advantageous to rely on the parallelism within threads, that is, instruction-level parallelism. This insight yields a simple optimization technique that is used in later versions of CUBLAS and CUFFT. I discuss the rationale behind the technique and illustrate it by speeding up matrix multiplication, starting with the basic implementation found in the NVIDIA GPU Computing SDK.
Speaker(s): Vasily Volkov (Student, UC Berkeley)
Topic(s): High Performance Computing

Wednesday, Sept 22, 15:00 (20 minutes)
Room K
2251 TotalView Debugger for CUDA
Hear how the TotalView debugger is being extended to support GPU computation with CUDA. In addition to the basic challenges associated with debugging parallel programming, CUDA programming introduces a number of new concepts for which developers need visibility in debugging: a hierarchical memory, near-SIMD warps, streams, and kernels, among others. How do we create a tool that handles it all? We’ll be discussing the status of our work and the challenges encountered in bringing this all together into a single package, TotalView for CUDA.
Speaker(s): Chris Gottbrath (Principal Product Manager, TotalView Technologies, Inc., a Rogue Wave Software company)
Topic(s): Tools & Libraries

Wednesday, Sept 22, 15:00 (50 minutes)
Room A8
2273 GPUs In the Front Line of our Defenses (Sponsored by GE)
Find out how GPUs are accelerating defense and aerospace applications and providing superior information processing to drive the next generation of capabilities to protect both homelands and soldiers. Learn how rugged VPX hardware and software architectures are able to scale from small power- & weight-constrained vehicles through to large complex processing arrays, on platforms as diverse as unmanned aerial vehicles (UAV), tracked ground vehicles, and ship-borne radar.
Speaker(s): Simon Collins (Product Manager, GE Intelligent Platforms)
Topic(s): High Performance Computing

Wednesday, Sept 22, 15:00 (50 minutes)
Marriott San Jose Ballroom
2281 Domain-Specific Languages
Computer graphics has introduced several domain-specific languages (DSLs) that enable high performance and parallelism for narrow problem domains - RenderMan, Cg, GLSL, and recently OpenRL and OptiX. We think that similar approaches can benefit other areas of GPU computing - visualization, animation, physics simulation, or scientific data analysis. In this talk, we present Shadie, a domain-specific shading language for rapid development of complex custom volume visualizations in radiation oncology. The shaders are written in a high-level Python-like language and translated to CUDA for efficiency. We will explain how you can develop your own DSLs using source-to-source translation and a suitable backend library.
Speaker(s): Hanspeter Pfister (Professor, Harvard University)
Topic(s): Programming Languages & Techniques, General Interest

Wednesday, Sept 22, 15:00 (50 minutes)
Room D
2296 CUDA Optimization for Ninjas: A Case Study of High-Performance Sorting
In this presentation, we use our implementation for high performance radix sorting as a case study for illustrating advanced design patterns and idioms. These techniques have allowed us to demonstrate Fermi sorting rates that exceed 1.0 billion 32-bit keys per second (and over 770 million key-value pairs per second), making it the fastest fully-programmable micro-architecture for this genre of sorting problems.
Although the CUDA programming model is elegantly decoupled from any particular hardware configuration, we present techniques for exploiting knowledge of the NVIDIA GPU machine model in order to produce more efficient implementations. Our design patterns enable the compiler to specialize a single program text for a variety of architectures, resulting in target code that “fits” the underlying hardware significantly better than more general approaches. In particular, we discuss strategies for kernel fusion, warp-synchronous programming, flexible granularity via meta-programming, algorithm serialization, and data-movement tuning.
Speaker(s): Duane Merrill (PhD Candidate, University of Virginia)
Topic(s): Programming Languages & Techniques

Wednesday, Sept 22, 15:00 (50 minutes)
Keynote Hall
4003 Emerging Companies Summit Panel: GPUs for Computer Vision
Moderated by Jon Peddie, President at Jon Peddie Research
The GPU (graphics processing unit) runs advanced applications which are transforming existing industries and creating new ones. Join our panel of leading industry experts as they discuss the latest technology advances in the usage of GPUs for Computer Vision. They will cover facial, gesture, human motion, and biometrics recognition, augmented reality, robotic computing, and more.
Panelists: Sam Cox (CEO, Milabra), Tom Dean (Research Scientist, Google), Janko Mrsic-Flogel (CTO, MirriAd), Joe Stam (Sr. Applications Engineer, NVIDIA), Yoram Yaacovi (CTO & General Manager, Technologies at Microsoft)
Topic(s): Computer Vision

Wednesday, Sept 22, 15:30 (20 minutes)
Room K
2143 CUDA Fortran Programming for NVIDIA GPUs
An introduction to programming NVIDIA GPUs using CUDA Fortran. Suitable for expert Fortran or CUDA C programmers who need to extract maximum performance from GPUs using an explicit GPU Fortran programming model. Introduces the CUDA Fortran language, and through examples, illustrates how to explicitly program GPUs in native Fortran 95/03 through creation of GPU kernel subroutines, management of host and device memory, definition of CUDA grids and thread blocks, launching kernels, and use of the CUDA Fortran runtime API. This talk includes a live component with a Windows laptop containing an NVIDIA GPU and the PGI CUDA Fortran compiler.
Speaker(s): Brent Leback (Engineering Manager, The Portland Group)
Topic(s): Tools & Libraries, High Performance Computing, Programming Languages & Techniques

Wednesday, Sept 22, 16:00 (50 minutes)
Room D
2020 GPU-Accelerated Data Expansion for the Marching Cubes Algorithm
Learn how to accelerate marching cubes on the GPU by taking advantage of the GPU’s high memory bandwidth and fast on-chip shared memory in a data expansion algorithm that can extract the complete iso-surface mesh from (dynamic) volume data without requiring any data transfers back to the CPU.
Speaker(s): Chris Dyken (Research Scientist, SINTEF), Gernot Ziegler (Developer Technology (Compute), NVIDIA)
Topic(s): Algorithms & Numerical Techniques, Imaging, Medical Imaging & Visualization

Wednesday, Sept 22, 16:00 (50 minutes)
Room K
2069 GPU-Accelerated Business Intelligence Analytics
Join us and learn why GPU computing is a game changer for business intelligence (BI). We will discuss how GPUs can be used to accelerate BI analytics at much lower cost, higher performance, and better power efficiency than other alternatives.
Speaker(s): Ren Wu (Senior Scientist, HP Labs)
Topic(s): Databases & Data Mining, Finance, High Performance Computing

Wednesday, Sept 22, 16:00 (50 minutes)
Room A1
2072 GPUs at the Computer Animation Studio
Learn five simple ways in which GPUs have been adopted in the production pipeline at Blue Sky Studios. Covers how we use GPUs to improve animation tools, add real-time anaglyph support, and accelerate noise functions, including code samples from production tools.
Speaker(s): Hugo Ayala (Sr. Research Associate, Blue Sky Studios)
Topic(s): Film, Stereoscopic 3D, Tools & Libraries

Wednesday, Sept 22, 16:00 (50 minutes)
Room B
2073 High Performance Molecular Simulation, Visualization, and Analysis on GPUs
This talk will present recent successes in the use of GPUs to accelerate interactive visualization and analysis tasks on desktop computers, and batch-mode simulation and analysis jobs on GPU-accelerated HPC clusters. We’ll present Fermi-specific algorithms and optimizations and compare with those for other devices. We’ll also present performance and performance/watt results for NAMD molecular dynamics simulations and VMD analysis calculations on GPU clusters, and conclude with a discussion of ongoing work and future opportunities for GPU acceleration, particularly as applied to the analysis of petascale simulations of large biomolecular complexes and long simulation timescales.
Speaker(s): John Stone (Senior Research Programmer, University of Illinois at Urbana-Champaign)
Topic(s): Molecular Dynamics, Algorithms & Numerical Techniques, High Performance Computing, Life Sciences

Wednesday, Sept 22, 16:00 (50 minutes)
Room E
2083 GPU Accelerated Solver for the 3D Two-phase Incompressible Navier-Stokes Equations
This talk demonstrates the potential of GPUs for solving complex free surface flow problems using level set methods. These methods are capable of producing complex surface deformations, and therefore are used widely in computer graphics, as well as engineering applications. This work demonstrates that GPUs can be used to accelerate the most computationally expensive part of free surface flow calculations, and therefore allows much larger problems to be solved on workstation machines than was previously possible. These techniques will be exemplified by our current project to port our in-house fluid solver NaSt3DGPF to the GPU.
Speaker(s): Peter Zaspel (Research Assistant, University of Bonn)
Topic(s): Computational Fluid Dynamics, Algorithms & Numerical Techniques, High Performance Computing, Physics Simulation

Wednesday, Sept 22, 16:00 (50 minutes)
Room C
2093 Computational Photography: Real-Time Plenoptic Rendering
Get the latest information on GPU-based plenoptic rendering including a demonstration of refocusing, novel view generation, polarization, high dynamic range, and stereo 3D. Learn how GPU hardware enables plenoptic rendering tasks with high-resolution imagery to be performed interactively, opening up entirely new possibilities for modern photography.
Speaker(s): Andrew Lumsdaine (Professor, Indiana University), Georgi Chunev (Research Assistant, Indiana University), Todor Georgiev (Senior Research Scientist II, Adobe Systems)
Topic(s): Imaging, Computer Vision, Stereoscopic 3D

Wednesday, Sept 22, 16:00 (50 minutes)
Marriott Guadalupe Room
2108 Binary Black Holes Simulations using CUDA
Get the latest information on how to evolve binary black holes simulations on GPUs.
Speaker(s): Abdul Mroue (Post-Doc Fellow, CITA, University of Toronto)
Topic(s): Astronomy & Astrophysics, Algorithms & Numerical Techniques, Physics Simulation

Wednesday, Sept 22, 16:00 (50 minutes)
Room A8
2118 Large-scale Gas Turbine Simulations on GPU Clusters
This talk describes a strategy for implementing structured grid PDE solvers on GPUs. Techniques covered include the use of source-to-source compilation and the use of sparse matrix vector multiplications for complicated boundary conditions. A new production-quality solver for flows in turbomachines called Turbostream that uses these techniques is presented. The impact of the use of GPUs on the turbomachinery design process is demonstrated by two 64-GPU simulations that have recently been performed on the University of Cambridge’s GPU cluster.
Speaker(s): Tobias Brandvik (PhD Student, University of Cambridge)
Topic(s): Computational Fluid Dynamics,

Wednesday, Sept 22, 16:00 (50 minutes)
Marriott San Jose Ballroom
2135 Processing Petabytes per Second with the ATLAS Experiment at the Large Hadron Collider at CERN
Learn how GPUs could be adopted by the ATLAS detector at the Large Hadron Collider (LHC) at CERN. The detector, located at one of the collision points, must trigger on unprecedented data acquisition rates (PB/s), to decide whether to record the event, or lose it forever. In the beginning, we introduce the ATLAS experiment and the computational challenges it faces. The second part will focus on how GPUs can be used for algorithm acceleration - using two critical algorithms as exemplars. Finally, we will outline how GPGPU acceleration could be exploited and incorporated into the future ATLAS computing framework.
Speaker(s): Philip Clark (Reader (Associate Professor) in Particle Physics, University of Edinburgh), Andy Washbrook (Postdoctoral Research Assistant, University of Edinburgh)
Topic(s): High Performance Computing, Algorithms & Numerical Techniques, Physics Simulation

Wednesday, Sept 22, 16:00 (50 minutes)
Room A7
2144 Large-Scale Visualization Using A GPU Cluster
Learn how to visualize extremely large-scale scientific data using GPGPU techniques on a GPU-accelerated visualization cluster. Recent advances in general-purpose GPU (GPGPU) computing provide a promising solution to compute-intensive scientific visualization. However, the largest scientific simulations produce datasets that are orders of magnitude larger than the memory available on current GPUs. Many distributed GPUs must be used in parallel. We present Longhorn, currently the world’s largest GPU-enhanced cluster dedicated to visualization and data analysis, and describe the distributed memory architecture and GPGPU techniques to interactively visualize massive datasets using distributed GPUs on Longhorn.
Speaker(s): Byungil Jeong (Visualization Scientist, TACC / UT-Austin), Paul Navratil (Visualization Scientist, Texas Advanced Computing Center)
Topic(s): Medical Imaging & Visualization, High Performance Computing

Wednesday, Sept 22, 16:00 (50 minutes)
Room A5
2154 The Impact of Data Movement on GPU Performance
GPU computing has taken the scientific computing landscape by storm, fueled by the massively parallel arithmetic hardware. When coding, researchers rely on best practices that have been developed in the short timespan of GPGPU. This session challenges a widely held belief that transfers to/from the GPU device must be minimized to achieve the best performance by presenting a case study on CULA, our library for dense linear algebra. The topics to be discussed include the relationship between computation and transfer time for synchronous/asynchronous transfers, and the impact that data allocations have on memory performance and overall solution time.
Speaker(s): John Humphrey (Senior Engineer, EM Photonics, Inc.), Daniel Price (Engineer, EM Photonics, Inc.)
Topic(s): High Performance Computing, Algorithms & Numerical Techniques, Tools & Libraries

Wednesday, Sept 22, 16:00 (50 minutes)
Room A3
2201 A Case Study of Accelerating Matlab Based Applications using GPUs
Learn how to accelerate Matlab based applications using GPUs. We cover a popular neuro-imaging software called SPM and show how to use CUDA and Jacket to speed up computationally intensive Matlab applications.
Speaker(s): Aniruddha Dasgupta (Graduate Student, Georgia Institute of Technology)
Topic(s): Medical Imaging & Visualization

Wednesday, Sept 22, 16:00 (50 minutes)
Room N
2217 GPU-Based Conjugate Gradient Solvers for Lattice QCD
Learn how to perform state-of-the-art quantum chromodynamics (QCD) computation using NVIDIA GPUs at 1% of the cost of a conventional supercomputer and 10% of its power consumption. We will discuss how physicists around the world are using GPU clusters to solve QCD. We will focus upon how TWQCD have been using a large GPU cluster (200 GPUs) to simulate QCD, attaining 36 Teraflops (sustained).
Speaker(s): Ting-Wai Chiu (Professor, National Taiwan University)
Topic(s): High Performance Computing, Physics Simulation

Wednesday, Sept 22, 16:00 (50 minutes)
Room A2
2226 Reverse Time Migration with GMAC
Get a close look at implementing Reverse Time Migration (RTM) applications across multiple GPUs. We will focus on how RTM applications can be scaled using the GMAC asymmetric distributed shared memory (ADSM) library to break the problem into manageable chunks. We will provide an introduction to GMAC and discuss handling boundary conditions and using separate kernels to improve efficiency.
Speaker(s): Javier Cabezas (Researcher, Barcelona Supercomputing Center), Mauricio Araya (Senior Researcher, Barcelona Supercomputing Center)
Topic(s): Energy Exploration, Algorithms & Numerical Techniques, High Performance Computing

Wednesday, Sept 22, 16:00 (50 minutes)
Room L
2252 Simulating Housefly Vision Elements Using OpenCL
An OpenCL GPU based computer simulation of a biologically motivated model, based on the anatomy of the housefly’s first optic ganglion, the lamina ganglionaris (the lamina layer), is presented. Specific to GPU technology, the computer model demonstrates: the implementation of a 2nd Order Runge-Kutta method to approximate coupled differential equations using GPU hardware; the mapping of a non-Cartesian coordinate system onto the Cartesian layout of the threads. Testing examined usage and access across device memory spaces to determine the optimal usage/access method for the ANN. This result was generalized for OpenCL GPU devices, using the capabilities of OpenCL.
Speaker(s): Karen Haines (Professor, WASP/The University of Western Australia)
Topic(s): Neuroscience, Algorithms & Numerical Techniques, Signal processing

Wednesday, Sept 22, 16:00 (50 minutes)
Room M
2302 Microsoft Technologies for HPC
NVIDIA Parallel Nsight provides access to the power of the GPU from within the familiar environment of Microsoft Visual Studio. In this session, we will expand on the computational power of Visual Studio 2010, Windows HPC Server and the Technical Computing Libraries and show how to increase your performance.
Speaker(s): Calvin Clark (Senior Consultant, Microsoft)
Topic(s): High Performance Computing

Wednesday, Sept 22, 16:00 (50 minutes)
Keynote Hall
4004 Emerging Companies: CEO on Stage featuring Cooliris, empulse GmbH, and Playcast Media Systems
See the hottest new technologies from startups that could transform computing.
In a lively and fast-paced exchange, the “Emerging Companies Summit - CEO on Stage” sessions will feature CEOs from three startups who will each have 15 minutes to introduce their companies and interact with a panel of leading venture capitalists, technology executives, and industry analysts.
Panelist(s): Nathan Brookwood (Research Fellow, Insight64), Charles Carmel (VP of Corporate Business Development, Cisco), Flip Gianos (General Partner, InterWest Partners), Jeff Herbst (Vice President of Business Development, NVIDIA)
Speaker(s): Austin Shoemaker (Cooliris), Michael Hummel (Managing Director, empulse GmbH), Natan Peterfreund (CTO, Playcast Media Systems)
Topic(s): General Interest, Databases & Data Mining, Video Processing, Computer Graphics

Wednesday, Sept 22, 17:00 (50 minutes)
Room A3
2005 Porting Large-Scale Legacy Fortran Codes
Explore a new automatic Fortran translator which has been developed and used to port the numerical subroutines of FEFLO, a general-purpose legacy Computational Fluid Dynamics code operating on unstructured grids, to run on the GPU. Data transfer to the CPU is minimized throughout the course of a CFD run. Benchmarks of large-scale production runs will be presented.
Speaker(s): Andrew Corrigan (Research Mathematician, Naval Research Laboratory & George Mason University), Rainald Löhner (Professor, George Mason University)
Topic(s): Algorithms & Numerical Techniques, Computational Fluid Dynamics, Machine Learning & AI

Wednesday, Sept 22, 17:00 (50 minutes)
Room B
2006 Short-Range Molecular Dynamics on GPU
Learn how to accelerate short-range molecular dynamics using CUDA C. We will cover building the neighbor list and calculating the forces on the GPU. To handle the case where a few particles have significantly more neighbors than most other particles, we propose a hybrid data structure for the neighbor list that can achieve a good balance between performance and storage efficiency. A CUDA C implementation of the technique for Lennard-Jones forces can be found in the LAMMPS molecular dynamics open source code.
Speaker(s): Peng Wang (Developer Technology Engineer, NVIDIA)
Topic(s): Molecular Dynamics

Wednesday, Sept 22, 17:00 (60 minutes)
Marriott San Jose Ballroom
2011 Fundamental Performance Optimizations for GPUs
This presentation covers the major CUDA optimizations. Topics will include: maximizing memory throughput, kernel launch configuration, using shared memory, and improving GPU/CPU interaction. While C for CUDA is used for illustration, the concepts covered will apply equally to programs written with OpenCL and DirectCompute APIs.
Speaker(s): Paulius Micikevicius (Developer Technology Engineer, NVIDIA)
Topic(s): Programming Languages & Techniques, Tools & Libraries

Wednesday, Sept 22, 17:00 (50 minutes)
Room A2
2014 Scalable Subsurface Data Visualization Framework
Mental Images’ DiCE-based geospatial library is a CUDA and cluster-based visualization framework that enables scalable processing and rendering of huge amounts of subsurface data for interactive seismic interpretation.
Geospatial exploration in the oil and gas industries is concerned with scanning the earth’s subsurface structure for detecting oil and for cost-effective drilling of detected oil reservoirs.
Efficient seismic interpretation requires the interpreters to be able to interactively explore huge amounts of volumetric seismic information with embedded stacked horizons to gain visual insight into the subsurface structure and to determine where oil recovery facilities and drilling infrastructure shall be built.
Speaker(s): Tom-Michael Thamm (VP Products, mental images GmbH), Marc Nienhaus (Sen. Graphics Software Engineer, mental images GmbH)
Topic(s): Energy Exploration, Databases & Data Mining, Imaging, Tools & Libraries

Wednesday, Sept 22, 17:00 (50 minutes)
Room C
2021 Efficient Volume Segmentation on the GPU
Explore a new technique in the detection of common regions in a 2D/3D data array. Connected components along the axes are linked before actual label propagation starts. The algorithm is completely gather-based, which allows for several optimizations in the CUDA C implementation. It enables real-time frame rates for the analysis of typical 2D images and interactive frame rates for the analysis of typical volume data.
Speaker(s): Allan Rasmusson (PhD candidate/NVIDIA Intern, University of Aarhus), Gernot Ziegler (Developer Technology, Compute, NVIDIA)
Topic(s): Algorithms & Numerical Techniques, Computer Vision, Imaging, Medical Imaging & Visualization

Wednesday, Sept 22, 17:00 (50 minutes)
Room A8
2077 Catastrophic Risk Management: Fast and Flexible with GPU Analytics
RMS will describe our experience leveraging GPUs and simple software architectural principles to deliver both spectacular performance gains and enhanced flexibility in next generation portfolio risk management applications.
Speaker(s): Philippe Stephan (CTO, RMS)
Topic(s): Finance

Wednesday, Sept 22, 17:00 (50 minutes)
Room A5
2089 Analyzing CUDA Accelerated Application Performance at 20 PFLOP/s
Learn how applications can be executed over multiple GPUs located in multiple hosts, what the challenges are to scale one application to a 20 PFLOP/s machine and why tool support is a necessity. Receive an overview of the available performance analysis tools that support CUDA developers in generating applications with outstanding speedups.
Speaker(s): Guido Juckeland (Senior System Engineer (HPC), Leader Hardware Accelerator Group, TU Dresden - ZIH), Jeremy Meredith (Computer Scientist, Oak Ridge National Laboratory)
Topic(s): High Performance Computing, Tools & Libraries

Wednesday, Sept 22, 17:00 (50 minutes)
Room E
2128 Hybrid Quantum Mechanics/Electrodynamics (QM/ED) Modeling of Solar Cells on a CUDA Cluster
One of the greatest challenges of the twenty-first century is the utilization of renewable energy. In providing a theoretical explanation and guidelines for computer-aided design of dye-sensitized solar cell (DSSC), we recently developed a hybrid multi-scale quantum mechanics/classical electrodynamics (QM/ED) methodology.
Our numerical simulations were tested on a CUDA enabled Linux Speaker(s): Mark Cheung (Physicist, Lockheed Martin Solar &
cluster using CP2K. We extended its CUDA implementation to Astrophysics Laboratory)
MPI parallel environment. Our preliminary results demonstrated Topic(s): Astronomy & Astrophysics, Computer Vision,
a superior performance advantage of hybrid MPI/GPGPU Computational Fluid Dynamics, Physics Simulation
programming that could potentially shorten the total simulation
wall time by an order of magnitude. Wednesday, Sept 22, 17:00 (50 minutes)
Speaker(s): Hanning Chen (Research Associate, Northwestern Room N
University) 2242 Swarming Bacteria and Diffusing Particles: High-
Topic(s): Quantum Chemistry, Energy Exploration, Molecular Throughput Analysis of Microscopic 3D Motion
Dynamics, Physics Simulation Ever since the 1827 discovery of Brownian motion by observing
pollen grains, quantifying motion under the microscope has led
Wednesday, Sept 22, 17:00 (50 minutes) to breakthroughs in physics, biology and engineering. Here, I
Room A1 present methods we have developed using confocal microscopy
2162 Real-time Reyes: Programmable Rendering on to deduce 3D structure and dynamics from 2D image sequences.
Graphics Processors We analyze the motion of diffusing colloidal particles and swarms
of bacteria free to swim in 3D, which we observe at the single-
We present a discussion of ideas and techniques behind
organism level. We rely heavily on GPU computing to process our
programmable graphics pipelines on modern GPUs, specifically
large data sets, making extensive use of NPP, CuFFT and optical-
the example design of a real-time Reyes renderer. Walking
flow CUDA algorithms originally developed for machine vision in
through this example, we address the philosophy beneath
automobiles.
programmable GPU graphics, the broad strategy for the specific

pipeline, and algorithmic and implementation-level details Speaker(s): Peter Lu (Post-Doctoral Research Fellow, Harvard
for key rendering stages. We cover several issues concerning University)
GPU efficiency, including those involving work scheduling, Topic(s): Computer Vision, Imaging, Life Sciences
parallelization of traditional stages, and balancing of rendering
workloads. We expect the audience to gain an in-depth exposure Wednesday, Sept 22, 17:00 (50 minutes)
of the state of research in programmable graphics, and an insight Room A7
into efficient pipeline design for irregular workloads.
2243 Microsoft RemoteFX - GPU Virtualization for Desktop
Speaker(s): Anjul Patney (Graduate Student, University of Centralization
California, Davis), Stanley Tzeng (Graduate Student, Learn about Microsoft’s upcoming GPU Virtualization feature,
University of California, Davis) RemoteFX, which will ship in Windows Server 2008 R2 SP1.
Topic(s): Computer Graphics, Film Microsoft RemoteFX enables GPUs to be hosted in the datacenter
as a service that can be shared by multiple users for streaming
Wednesday, Sept 22, 17:00 (50 minutes) the real-time and complete Windows 7 desktop experience
Room D to ultra-lightweight client devices anywhere on the corporate
2167 Designing a Geoscience Accelerator Library network. With Microsoft RemoteFX, users will be able to work
Accessible from High Level Languages remotely in a Windows Aero desktop environment, watch
full-motion video, enjoy Silverlight animations, and run 3D
Explore a library for geoscience applications on CUDA and
applications – all with the fidelity of local-like performance.
OpenCL platforms. Target applications span atmosphere, ocean,
geomorphology and porous media flows. These areas are linked Speaker(s): Tad Brockway (Product Unit Manager, Microsoft)
by common numerical techniques encapsulated in our library. Topic(s): Cloud Computing, Computer Graphics
We will review the scope of the library, its meta-programming
approaches, and its key design attributes. We will also Wednesday, Sept 22, 17:00 (50 minutes)
demonstrate its support for multi-GPU parallelism within and Room K
across address spaces and provide examples of its use from high
level languages including C, Fortran, and Python.
2282 GPU-Enabled Biomedical Imaging
The purpose of this presentation is to describe several novel
Speaker(s): Chris Hill (Principal Research Scientist, M.I.T), Alan biomedical imaging applications which make extensive use
Richardson (Graduate Student, M.I.T) of GPUs. In CT iterative reconstructions, for example, high
Topic(s): Programming Languages & Techniques, Algorithms performance computing is allowing us to see details and
& Numerical Techniques, Computational Fluid structures we previously were not able to discern.
Dynamics, Tools & Libraries
Speaker(s): Homer Pien (Director of the Laboratory for Medical
Imaging and Computations, Massachusetts General
Wednesday, Sept 22, 17:00 (50 minutes)
Hospital / Harvard Medical School)
Marriott Guadalupe Room
Topic(s): Medical Imaging & Visualization, High Performance
2178 Using GPUs to Track Changes in the Sun

Computing, Imaging, Life Sciences

Learn how GPU computing is enabling astrophysicists to study
our closest star. NASA’s recently launched Solar Dynamics
Observatory is continuously streaming full-disk images of the
Sun at visible, UV and EUV wavelengths. This presentation will
discuss ways that GPU computing is helping scientists cope with
the analysis of the immense data volumes as well as in numerical
modeling of the Sun.
Wednesday, Sept 22, 17:00 (20 minutes) Speaker(s): Kristian Raue (CEO & Founder, Jedox Business
Room L Intelligence), Uri Tal (CEO, Rocketick),
Michel Tombroff (CEO, Softkinetic)
2285 Walt Disney Animation Studios’ GPU-Accelerated
Topic(s): General Interest, Computer Vision, Databases &
Animatic Lighting Process with Soft Shadows and
Data Mining, High Performance Computing
Depth of Field
See how Walt Disney Animation’s software uses OpenGL and
Wednesday, Sept 22, 17:30 (20 minutes)
GLSL shaders to interactively display depth of field, accurate
Room L
lighting, and soft shadows in the Maya viewport. Learn how
this improved our animatic process and helps us make better 2284 GPU Implementation of Collision-Based Deformation
animated movies. Addressing the production needs for the upcoming Disney
animated movie, we are in the process of developing a new Maya
We’ll show the tools in action and show the progression of a shot
deformer that incorporates state-of-the-art collision-based
from standard Maya to final animatic look, and will compare the
deformations. Our deformer includes both dynamic and quasi-
result with a production Renderman render. We’ll also walk you
static solutions. Our solvers conserve volume and constrain
through the GLSL shader render passes it uses to do deferred
surface area by solving linear systems in a graded volume
lighting and shadowing.
mesh. To achieve realistic deformation in production-ready
Speaker(s): David Adler (Principal Software Engineer, Walt data at interactive rates, we leverage the computational power
Disney Animation Studios) of the NVIDIA GPU architecture using CUDA. Our underlying
Topic(s): Film, Digital Content Creation (DCC) data structure is specifically designed and optimized for CUDA
(i.e. coalescing data access, minimizing CPU-GPU interaction,
Wednesday, Sept 22, 17:00 (50 minutes) utilizing shared memory).

Room M Speaker(s): Dmitriy Pinskiy (Sr Software Engineer, Walt Disney

2294 GPU.NET with TidePowerd Animation Studios)
Join TidePowerd for a demonstration of GPU.NET, our innovative Topic(s): Film
new product which dramatically cuts the time needed to develop
and maintain a GPU-based application by extending Microsoft’s Wednesday, Sept 22, 18:00 (120 minutes)
.NET Framework onto GPUs. With GPU.NET, your device- Keynote Hall
accelerated code can be written in any .NET-supported language 1006 Exhibits Open / Networking Reception
(e.g., C#, F#, IronPython) and called like any other method - so
Join your colleagues in the exhibit hall to preview emerging
it’s easy to create new GPU-based applications without having
technologies and see some of the most innovative solutions
to retrain your developers. You’ll learn how to use GPU.NET to
available today. Appetizers and drinks will be served.
quickly develop a financial calculator in C#, use the built-in Visual
Studio unit-testing tools to ensure the correctness of the code, Topic(s): General Interest
and seamlessly deploy the application into a mixed Windows /
Linux environment. We’ll also discuss how GPU.NET expands
the frontiers of GPU computing into lucrative new markets
such as business intelligence, database processing, and data
visualization.
Speaker(s): Jack Pappas (Co-founder, CEO, TidePowerd)
Topic(s): Programming Languages & Techniques

Wednesday, Sept 22, 17:00 (50 minutes)

Keynote Hall
4005 Emerging Companies: CEO on Stage featuring Jedox
Business Intelligence, Rocketick, and Softkinetic
See the hottest new technologies from startups that are
transforming computing.

In a lively and fast-paced exchange, the “Emerging Companies

Summit - CEO on Stage” sessions will feature CEOs from three
startups who will each have 15 minutes to introduce their
companies and interact with a panel of leading venture capitalists,
technology executives, and industry analysts.
Panelist(s): Flip Gianos (General Partner, Interwest), Charles
Carmel (VP of Corporate Business Development,
Cisco), Nathan Brookwood (Research Fellow,
Insight64), Jeff Herbst (Vice President of Business
Development, NVIDIA)
Thursday, Sept 23, 09:00 (50 minutes) Thursday, Sept 23, 09:00 (50 minutes)
Room A5 Room N
2027 GPU-Based Image Processing in Military Applications 2076 Implementing CUDA Audio Networks
There are more than 6000 Unmanned Aerial Vehicles (UAVs) Learn how to implement a commercial software library that
in use in the US Military. The US Army alone has flown more exploits CUDA for audio applications. We focus on the overall
than 1 million UAV flight hours. Every UAV captures at least threading architecture and the underlying math for implementing
one stream of video; some as many as 9. All this video needs to general purpose audio processing in CUDA devices. Covers
be processed and analyzed both during the mission, and post- the use of inter-process communication to make a plug-in
mission. Traditionally, custom ASICs, and FPGAs were required implementation loadable in 32 bit hosts installed in 64 bit
for even the most rudimentary image processing tasks. Now, systems, distributing the GPU load on remote servers, and
GPUs provide orders of magnitude more compute at a fraction of creating a CUDA network for high-end purposes such as a big
the cost. Hear how MotionDSP uses GPUs to provide previously recording facility.
impossible capabilities to military imaging.
Speaker(s): Giancarlo Del Sordo (Chief Developer and Product
Speaker(s): Sean Varah (CEO, MotionDSP Inc.) Manager, Acustica Audio)
Topic(s): Video Processing, High Performance Computing Topic(s): Audio Processing, Signal processing

Thursday, Sept 23, 09:00 (50 minutes) Thursday, Sept 23, 09:00 (50 minutes)
Room L Room A3
2030 High-Throughput Cell Signaling Network Learning 2138 Faster, Cheaper, Better – Hybridization of Linear
with GPUs Algebra for GPUs
Explore how GPUs are being used to enable high-throughput cell Learn how to develop faster, cheaper and better linear algebra
signaling network discovery and data-intensive computational software for GPUs through a hybridization methodology that is
systems biology more generally. Systems biology is transitioning built on (1) Representing linear algebra algorithms as directed
from a largely reductive discipline to one focused on building acyclic graphs where nodes correspond to tasks and edges to
predictive models of large-scale biological systems. New dependencies among them, and (2) Scheduling the execution
instrumentation will provide the necessary raw data for such an of the tasks over hybrid architectures of GPUs and multicore.
THURSDAY

approach, the key challenge now is building the hardware and Examples will be given using MAGMA, a new generation of
software tools to efficiently and interactively build these models. linear algebra libraries that extends the sequential LAPACK-
This session will describe how GPUs can and will play a key role style algorithms to the highly parallel GPU and multicore
in these efforts. heterogeneous architectures.
Speaker(s): Michael Linderman (Engineering Research Speaker(s): Hatem Ltaief (Sr. Research Associate, University of
Associate, Stanford University) Tennessee), Stan Tomov (Research Scientist,
Topic(s): Life Sciences, Algorithms & Numerical Techniques, University of Tennessee)
Machine Learning & Artificial Intelligence Topic(s): High Performance Computing, Algorithms &
Numerical Techniques, Tools & Libraries
Thursday, Sept 23, 09:00 (50 minutes)
Room A7 Thursday, Sept 23, 09:00 (50 minutes)
Room K
2033 Accelerating Pricing Models with virtual GPUs
Join Citadel to explore our three-year undertaking on the 2145 Photo Editing on the GPU with MuseMage
feasibility of GPGPU computing for option pricing. We will discuss See how MuseMage greatly accelerates image processing and
our 140X performance boost and the hurdles we had to overcome editing while providing real-time feedback by harnessing the
to integrate GPUs into our existing infrastructure. Please note power of GPUs. We will discuss the majority of MuseMage tools
that our talk will not get into the details of the model (that’s which are fully implemented on GPUs.
proprietary information), but we will share our innovative solution
Speaker(s): Kaiyong Zhao (Graduate Student, HKBU), Yubo
to drive a grid of virtual GPUs.
Zhang (PhD student, UC Davis)
Speaker(s): Scott Donovan (System Architect, Citadel Topic(s): Imaging
Investment Group)
Topic(s): Finance, High Performance Computing Thursday, Sept 23, 09:00 (50 minutes)
Marriott San Jose Ballroom
Thursday, Sept 23, 09:00 (50 minutes)
2156 GMAC: Global Memory For Accelerators
Room C
Learn how to use GMAC, a novel run-time for CUDA GPUs.
2048 H.264/AVC Video Encoding with CUDA and OpenCL GMAC unifies the host and device memories into a unified virtual
Join experts from MainConcept, a leading provider of video codecs address space, enabling the host code to directly access the
to the professional market, as they demonstrate the latest version device memory, and removing the need for data transfers between
of their CUDA-based H.264/AVC Encoder. host and device memories. Moreover, GMAC also allows pointers
to be used interchangeably by both the host and device code.
Speaker(s): Thomas Kramer (VP Product Management,
MainConcept) This session will present the GMAC run-time and show how to use
Topic(s): Video Processing, Tools & Libraries it in current applications. This session will cover from the basics
of GMAC to multi-threaded applications using POSIX threads,
OpenMP and MPI.
Speaker(s): Isaac Gelado (Lecturer and Researcher, Universitat Speaker(s): Frank Mueller (Associate Professor, North Carolina
Politecnica de Catalunya) State University), Xing Wu (Research Assistant,
Topic(s): Tools & Libraries North Carolina State University)
Topic(s): Tools & Libraries, High Performance Computing
Thursday, Sept 23, 09:00 (50 minutes)
Room A1 Thursday, Sept 23, 09:00 (20 minutes)
Room M
2202 A Programming Model and Tool for Automatic High-
Performance C to CUDA Mapping 2278 Strategies for Code Encapsulation in GPU
Discover our automatic C-to-CUDA mapper prototype, and how Implementations
it optimizes execution and data movement for a broad class of Code encapsulation is a common technique used to reduce code
loop codes. Coupled with our powerful mapper, C as an input complexity that a given programmer has to understand. It allows
language does not only offer portability but also performance and the use of increasingly complex systems of hardware, software,
performance portability. Learn about our optimizations and some and algorithms to tackle increasingly difficult scientific problems.
of the performance obtained through different uses of the mapper. Unfortunately, code encapsulation is not easily attainable
in current GPU environments. We will share our OpenCL
Speaker(s): Benoit Meister (Senior Engineer, Reservoir Labs)
development experiences for achieving partial encapsulation in
Topic(s): Tools & Libraries
GPU implementations, and discuss best practices in this area.

Thursday, Sept 23, 09:00 (20 minutes) Speaker(s): Brian Cole (Developer, OpenEye Scientific Software)
Room A8 Topic(s): Programming Languages & Techniques, High
Performance Computing, Life Sciences
2206 Accelerated Computational Fluid Dynamics
Employing GPUs
Thursday, Sept 23, 09:00 (50 minutes)
Speaker(s): Daniel Gaudlitz (Project Manager, FluiDyna)
Room A2
Topic(s): Computational Fluid Dynamics,
High Performance Computing 2301 GPU Cluster Computing: Accelerating Scientific
Discovery
Thursday, Sept 23, 09:00 (50 minutes) We propose holding a research roundtable focussed on using

Room B GPU clusters to support scientific research. The roundtable
will bring together researchers that have recently deployed or
2236 A Work-Efficient GPU Algorithm for Level Set
are interested in deploying GPU clusters to enable scientific
Segmentation
research. At the research roundtable they will be able to share
Explore a novel GPU level set segmentation algorithm that their experiences in deploying this new technology and discuss
is both work-efficient and step-efficient. Our algorithm has the future of this technology in supporting research to tackle the
O(log n) step-complexity, in contrast to previous GPU algorithms world’s most challenging scientific problems.
which have O(n) step-complexity. We apply our algorithm to 3D
medical images and we show that in typical clinical scenarios, our To open discussion we will provide a brief presentation about
algorithm reduces the total number of processed level set field deployment of the CSIRO’s latest supercomputer cluster, which
elements by 16x and is 14x faster than previous GPU algorithms is among the world’s first to combine traditional CPUs with
with no reduction in segmentation accuracy. more powerful NVIDIA GPUs, that is providing a world class
computational and simulation science facility to advance priority
Speaker(s): Mike Roberts (Research Assistant, Hotchkiss Brain CSIRO science.
Institute, University of Calgary, Canada)
Topic(s): Medical Imaging & Visualization, Algorithms & Speaker(s): John Taylor (Science and Business Leader, CSIRO),
Numerical Techniques, Computer Vision, Dragan Dimitrovici (XENON Systems Pty Ltd)
Computer Graphics Topic(s): High Performance Computing

Thursday, Sept 23, 09:00 (50 minutes) Thursday, Sept 23, 09:00 (50 minutes)
Room D Keynote Hall
2272 GStream: A General-Purpose Data Streaming 4006 Fireside Chat with Jen-Hsun Huang - Co-founder &
Framework on GPUs CEO, NVIDIA
We present GStream, a general-purpose, scalable and C++ Jen-Hsun Huang will take part in a fireside chat with Quentin Hardy,
template run-time framework amenable to both the streaming National Editor at Forbes Magazine. They will discuss the rise of
problem and GPU architectures. GStream offers transparent GPUs, current trends in visual and parallel computing, and the
streaming data transmissions and automatic memory transformational changes ahead for the industry.
synchronization over a rich collection of computing resources Speaker(s): Quentin Hardy (National Editor, Forbes Magazine),
that are transparently allocated and reused. Various problems Jen-Hsun Huang (CEO & President, NVIDIA)

other than streaming applications, such as scientific computing, Topic(s): General Interest
numerical codes and text processing, can be easily expressed
using GStream and subsequently integrated with our GStream
Thursday, Sept 23, 09:30 (20 minutes)
library. GStream’s ease of use combined with efficient exploitation
Room A8
of GPU resources have the potential to lead to higher coding
productivity and application performance through our data- 2037 Numtech & GPGPU, a SME Point of View
centric specification paradigm. Hear why and how Numtech, a French SME working in the field of
atmospheric dispersion and expertise of meteorological events, is
benchmarking GPGPU for its future applications. A compressible Discusses how the results of this work may lead to better
and an incompressible interactive flow solver are described. diagnostics for detecting leukemia in blood cells.
Speaker(s): Emmanuel Buisson (CEO, Numtech) Speaker(s): Robert Zigon (Sr Staff Development Engineer,
Topic(s): Computational Fluid Dynamics, Physics Simulation Beckman Coulter)
Topic(s): Life Sciences
Thursday, Sept 23, 10:00 (50 minutes)
Room L Thursday, Sept 23, 10:00 (20 minutes)
Room A8
2001 Acceleration of the Freesurfer Suite for
Neuroimaging Analysis 2110 Acceleration of a Novel Rotorcraft Wake Simulation
See how GPU technology has dramatically accelerated the Dive deep as we present the details of a new CUDA-based
Freesurfer suite of tools used by thousands of researchers for the algorithm for accurate rotorcraft wake simulations. We use
analysis of neuroimaging data. a vortex particle method, accelerated with a multipole tree
algorithm, combined with a traditional grid-based CFD code. This
Speaker(s): Richard Edgar (Assistant in Neuroscience,
CUDA algorithm can evaluate the velocity and velocity-gradient
Massachusetts General Hospital, Harvard University)
with an effective throughput approaching 300 billion interactions
Topic(s): Medical Imaging & Visualization, Imaging,
per second on a C1060. This gives 10x speed-up and 2.5x better
Tools & Libraries
accuracy compared to the parallel CPU version.

Thursday, Sept 23, 10:00 (50 minutes) Speaker(s): Christopher Stone (Research Scientist,
Room A3 Intelligent Light)
Topic(s): Computational Fluid Dynamics, Algorithms &
2002 CUDA Debugging on Linux and MacOS with cuda-gdb
Numerical Techniques
Boost your development speed by mastering the CUDA debugging
tools NVIDIA provides. In this session you will learn the basics of Thursday, Sept 23, 10:00 (50 minutes)
cuda-gdb and cuda-memcheck, as well as their more advanced Room N
features with live demonstrations on Linux and MacOS.
2116 Real-time Multichannel Audio Convolution
Speaker(s): Satish Salian (Manager CUDA Debugger Tools,

Learn how a synthesis of 3D sound scenes can be achieved using

NVIDIA)
a peer-to-peer music streaming environment and GPU. We will
Topic(s): Tools & Libraries
discuss the technical and cost benefits to this approach, while
noting that it frees the CPU for other tasks.
Thursday, Sept 23, 10:00 (50 minutes)
Room A7 Speaker(s): Jose Antonio Belloch (MsC, Institute of
Telecommunications and Multimedia Applications,
2032 Practical Methods Beyond Monte Carlo in Finance Universidad Politecnica de Valencia),
Murex will share its practical experience using GPUs to accelerate Alberto Gonzalez (Professor, Universidad
high-performance analytics based on GPU-enabled Monte Politecnica de Valencia), Antonio M. Vidal
Carlo and PDE methods. We will also briefly describe Murex’s (Professor, Universidad Politecnica de Valencia)
experience developing a high-level payoff scripting language Topic(s): Audio Processing, Signal processing
that allows user-definable payoffs for single and cross-asset
instruments. Thursday, Sept 23, 10:00 (50 minutes)
Speaker(s): Pierre Spatz (Head of Quantitative Research, Room A5
Murex SAS) 2124 Operating System Abstractions for GPU Programming
Topic(s): Finance, Algorithms & Numerical Techniques
GPGPU frameworks such as CUDA improve programmability,
but GPU parallelism remains inaccessible in many application
Thursday, Sept 23, 10:00 (50 minutes) domains. This session argues that poor OS support causes this
Room K problem. OSes do not provide the kind of high-level abstractions
2053 Pixel Bender: Building a Domain Specific Language on for GPUs that applications expect for other resources like CPUs
the GPU and file systems. We advocate reorganizing kernel abstractions to
Examine the challenges and advantages of building the Pixel support GPUs as first-class computing resources, with traditional
Bender domain specific language for image processing for the guarantees such as fairness and isolation. We demonstrate
GPU. We will examine how Pixel Bender was made to work within shortcomings in Windows 7 GPU support, and show that better
several Adobe applications across a wide range of hardware OS abstractions can accelerate interactive workloads like gesture
systems and platforms. recognition by a factor of 10X over a CUDA implementation.

Speaker(s): Bob Archer (Senior Computer Scientist, Adobe Speaker(s): Christopher Rossbach (Researcher, Microsoft
Systems Inc) Research), Emmett Witchel (Professor, University
of Texas at Austin)
Topic(s): Tools & Libraries
Topic(s): Programming Languages & Techniques,
Tools & Libraries
Thursday, Sept 23, 10:00 (50 minutes)
Room A2
2055 Application of Fermi GPU to Flow Cytometry and
Cancer Detection
Learn how a Tesla C2050 enabled scientists to explore cancer
data sets 400 times faster than a PC-only implementation.
Thursday, Sept 23, 10:00 (20 minutes) package provides results that are indistinguishable from the CPU
Room B code is extremely tricky and often the desire to take shortcuts to
boost performance can affect accuracy with unpredictable results.
2149 Overview of Parallel Nsight for Visual Studio
We have developed a comprehensive validation suite that can be
NVIDIA Parallel Nsight provides access to the power of the GPU used to perform the detailed testing that is required to ensure the
from within the familiar environment of Microsoft Visual Studio. approximations necessary for GPU performance do not impact the
This session is an entry level overview of the GPU computing and scientific results. Additionally we will discuss how we have made
graphics development features of Parallel Nsight as well as a careful use of mixed single and double precision arithmetic in
glimpse into the future of this powerful tool. the AMBER implementation to achieve equivalence in the results
Speaker(s): Kumar Iyer (Product Manager, NVIDIA) without excessively compromising performance. Finally we
Topic(s): Tools & Libraries provide examples of recent breakthrough simulations conducted
using GPU enabled AMBER 11.
Thursday, Sept 23, 10:00 (50 minutes) Speaker(s): Ross Walker (Research Professor, San Diego
Room A1 Supercomputer Center)
2176 Easy GPU Meta-programming: A Case Study in Topic(s): Molecular Dynamics
Biologically-Inspired Computer Vision
Learn how to let the computer optimize your CUDA and OpenCL Thursday, Sept 23, 10:00 (50 minutes)
code for you with easy GPU Meta-programming and Scripting (e.g. Keynote Hall
PyCUDA). We will present a case study in which we consider the 4007 Emerging Companies: CEO on Stage featuring
step-wise optimization of a 3D filter bank convolution, using a Aqumin, RTT, and Scalable Display Technologies
suite of open-source tools. See the hottest new technologies from startups that are
Speaker(s): Nicolas Pinto (PhD Student, MIT) transforming computing.
Topic(s): Tools & Libraries, Computer Vision, High In a lively and fast-paced exchange, the “Emerging Companies
Performance Computing, Neuroscience Summit - CEO on Stage” sessions will feature CEOs from three
startups who will each have 15 minutes to introduce their
Thursday, Sept 23, 10:00 (50 minutes) companies and interact with a panel of leading venture capitalists,

Room C technology executives, and industry analysts.
2215 Extending OpenCV with GPU Acceleration Panelist(s): Norman Winarsky (VP of Ventures, Licensing and
OpenCV is a widely popular computer vision library, with millions Strategic Programs, SRI), Savitha Srinivasan
of downloads and hundreds of thousands of users. Applications (Corporate Venture Partner, IBM), and Rob Enderle
span many industries including robotics, industrial machine (Analyst, Enderle Group), Jeff Herbst (Vice President
vision, automotive, film & broadcast, medical, and consumer of Business Development, NVIDIA)
applications. NVIDIA and the OpenCV development team are Speaker(s): Andrew Jamison (CEO, Scalable Display
collaborating to provide CUDA implementations of the most Technologies), Jeroen Snepvangers (CEO, RTT),
demanding algorithms, thus enabling a new level of real-time Michael Zeitlin (CEO, Aqumin)
capability and higher quality results. Topic(s): General Interest, Finance, Imaging,
This talk will introduce OpenCV, and summarize the new CUDA Computer Graphics
enabled capabilities, and provide an overview of future plans.
Thursday, Sept 23, 10:30 (20 minutes)
Speaker(s): Joe Stam (Sr. Applications Engineer, NVIDIA) Room A8
Topic(s): Computer Vision, Imaging, Stereoscopic 3D,
Video Processing 2061 Accelerating Explicit FEM Shock & Blast Simulations
Explicit finite element codes are widely used to simulate the
Thursday, Sept 23, 10:00 (50 minutes) response of structures and mechanical equipment subjected to
Marriott San Jose Ballroom shock, blast and wave propagation phenomena. High resolution
models with run times ranging from a few seconds to a few
2269 Bringing GPUs to Mainstream Molecular months are common and hence the payoff from GPU acceleration
Dynamics Packages is tremendous. We describe the acceleration of our commercial
Recent work in close collaboration with NVIDIA has produced a finite element code NLFLEX using CUDA. We developed GPU
GPU accelerated version of the AMBER Molecular Dynamics Code kernels in CUDA based on our production code NLFLEX, for linear
PMEMD that runs between 20 and 130 times the speed of a single elasticity, explosives, elasto-plasticity and large deformation
2.8GHz Intel Nehalem Processor, with even higher performance elasticity. We attained order of magnitude (10X) acceleration in
on multiple GPUs, but which does not make sacrifices in the single precision and approximately (5X) in double precision mode.
accuracy or validity of such calculations to achieve this. The GPU
Speaker(s): Nachiket Gokhale (Senior Research Engineer,
accelerated version supports both explicit solvent particle mesh
Weidlinger Associates Inc)

Ewald (PME) and implicit solvent simulations and is available

as part of the new AMBER 11 package. This talk will provide an Topic(s): Algorithms & Numerical Techniques, Computational
overview of the AMBER software, background behind this GPU Fluid Dynamics, Physics Simulation
work, benchmarks, the impact that GPU accelerated MD can have
on the field, the techniques used to achieve the performance seen
without sacrificing accuracy and finally the validation methods
used to ensure simulations are directly equivalent to CPU based
calculations. Ensuring that a GPU implementation of an MD
Thursday, Sept 23, 10:30 (20 minutes)
Room B
2292 Implementation of High-Order Adaptive CFD Methods
on GPUs
A discontinuous high-order formulation named the Correction
Procedure via Reconstruction (CPR) was recently implemented on
NVIDIA GPUs. The CPR formulation is related to the discontinuous
Galerkin (DG) method, and unifies several methods such as
the DG, spectral volume and spectral difference into a single
framework efficient for hybrid meshes. In preliminary 2D inviscid
flow computations, a single GPU has been able to deliver a
speedup of 44 over a CPU of the same generation. Extension is
being made for viscous flow computation, and results will be
presented at the final presentation.
Speaker(s): Arun Somani (Anson Marston Professor, Iowa State
University), Z.J. Wang (Professor, Iowa State
University), Lizandro Solano (Iowa State University)
Topic(s): Computational Fluid Dynamics

Thursday, Sept 23, 11:00 (50 minutes)
Room B
2007 Folding@home: Petaflops on the Cheap Today;
Exaflops Soon?
Learn how Folding@home has used petascale computing with
GPUs to make fundamental breakthroughs in computational
biology and how this technology can make an impact in your work.
Speaker(s): Vijay Pande (Director, Folding@home Distributed
Computing Project, Stanford University)
Topic(s): Life Sciences, Cloud Computing, High Performance
Computing, Molecular Dynamics

Thursday, Sept 23, 11:00 (50 minutes)
Room A5
2023 Processing Device Arrays with C++ Metaprogramming
I will describe tricks for building APIs using C++
metaprogramming that generate custom kernels for complex
manipulation of device-side arrays in CUDA. Using a variation of
Expression Templates, multiple operations can be fused into a
single kernel that executes with reasonable efficiency.
Speaker(s): Jonathan Cohen (Senior Research Scientist,
NVIDIA Research)
Topic(s): Programming Languages & Techniques,
Tools & Libraries

Thursday, Sept 23, 11:00 (50 minutes)
Room L
2043 Disparity Map Generation
Explore the algorithms and implementation of disparity maps
on the GPU. We will discuss how a disparity map facilitates
stereoscopic content creation, applications and approaches tried,
and final results of real time calculations on GPUs.
Speaker(s): Henry Gu (CTO, GIC)
Topic(s): Stereoscopic 3D, Computer Vision, Imaging

Thursday, Sept 23, 11:00 (50 minutes)
Room K
2051 GPGPU in Commercial Software: Lessons From Three
Cycles of the Adobe Creative Suite
Learn about leveraging GPUs for commercial software. We will
discuss lessons learned creating and using the Adobe Image
Foundation libraries to accelerate image and video processing
using GPUs and multi-core. These libraries are used by most
of Adobe’s applications as well as integrated by hobbyist and
professional applications with different levels of experience with
GPUs and diverse user bases.
Speaker(s): Kevin Goldsmith (Senior Engineering Manager,
Adobe Systems, Incorporated)
Topic(s): Imaging, Video Processing

Thursday, Sept 23, 11:00 (50 minutes)
Room A3
2070 CUSPARSE Library: A Set of Basic Linear Algebra
Subroutines for Sparse Matrices
The CUSPARSE library can impact and enable software solutions
for computational science and engineering problems in the fields
of energy exploration, physical simulations and life sciences
among many others. It provides sparse linear algebra primitives
that can be used to implement iterative linear system and
eigenvalue solvers and can also serve as a building block for
state-of-the-art sparse direct solvers. The CUSPARSE library
is implemented using the CUDA parallel programming model and
provides sparse analogs to BLAS level-1,2,3 operations, such
as matrix-vector multiplication, triangular solve and format
conversion routines.
Speaker(s): Maxim Naumov (Software Engineer, NVIDIA)
Topic(s): Tools & Libraries, Algorithms & Numerical
Techniques, High Performance Computing
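
As an illustration of the sparse matrix-vector multiplication named in the session 2070 abstract, the following is a minimal CPU sketch in Python. It assumes the compressed sparse row (CSR) storage layout, which the abstract does not specify, and it is not the CUSPARSE API; the function name `csr_matvec` and its arguments are hypothetical.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A held in CSR form (assumed layout).

    values  : nonzero entries, stored row by row
    col_idx : column index of each entry in `values`
    row_ptr : row_ptr[r] .. row_ptr[r + 1] delimits the entries of row r
    """
    y = []
    for row in range(len(row_ptr) - 1):
        acc = 0.0
        # Only the stored nonzeros of this row contribute to the dot product.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


# Dense equivalent: [[4, 0, 1], [0, 2, 0], [3, 0, 5]]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
y = csr_matvec(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
```

Each output row depends only on its own slice of `values`, which is why the rows can be processed independently on a parallel device.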

Thursday, Sept 23, 11:00 (50 minutes)
Room N
2042 Interactive 3D Audio Rendering Systems
Learn how to leverage GPUs for interactive audio rendering. This
session will give a short overview of the architecture of current
GPUs, emphasizing some key differences between GPU and CPU
programming models for audio processing. We will illustrate the
benefits of GPU-accelerated audio rendering with results from
3D audio processing and sound scattering simulations. Finally,
we will discuss best practices for GPU implementations as well
as future opportunities for audio rendering on massively parallel
architectures.
Speaker(s): Nicolas Tsingos (Senior Staff Engineer,
Dolby Laboratories)
Topic(s): Audio Processing, Ray Tracing, Signal Processing

Thursday, Sept 23, 11:00 (50 minutes)
Room A1
2075 GPU-Accelerated Video Encoding
Learn how to accelerate video encoding using the GPU. We
will give an overview of the typical video encoding pipeline and
discuss how different parts of the pipeline can be ported to the
GPU using various approaches. We will focus on block-based
Motion Estimation, in particular, as it is the cornerstone of video
encoding algorithms. The efficiency of its implementation on the
GPU is crucial to the speed and quality of the encoder.
Speaker(s): Anton Obukhov (Developer Technology Engineer,
NVIDIA)
Topic(s): Video Processing
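
To make the block-based motion estimation mentioned in the session 2075 abstract concrete, here is a minimal sketch of exhaustive block matching in Python. It assumes a sum-of-absolute-differences (SAD) cost and a full search window, neither of which the abstract specifies; the names `sad` and `best_motion_vector` are hypothetical and this is not the speaker's implementation.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
    return total


def best_motion_vector(cur, ref, bx, by, size, radius):
    """Exhaustively search ref around (bx, by); return (cost, mx, my)."""
    height, width = len(ref), len(ref[0])
    best = None
    for my in range(-radius, radius + 1):
        for mx in range(-radius, radius + 1):
            rx, ry = bx + mx, by + my
            # Skip candidate blocks that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, size)
            if best is None or cost < best[0]:
                best = (cost, mx, my)
    return best


# A bright 2x2 patch sits at (1, 1) in the reference frame
# and at (2, 2) in the current frame.
ref = [[0] * 6 for _ in range(6)]
cur = [[0] * 6 for _ in range(6)]
for dy in range(2):
    for dx in range(2):
        ref[1 + dy][1 + dx] = 9
        cur[2 + dy][2 + dx] = 9
result = best_motion_vector(cur, ref, 2, 2, 2, 2)
```

Every candidate offset is scored independently, so the search maps naturally onto many parallel threads.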
Thursday, Sept 23, 11:00 (50 minutes)
Room A7
2098 Enabling On Demand Value-At-Risk for Financial
Markets
Learn how financial market risk managers can increase their
ability to preempt exposure limit breaching and tighten risk
control to increase investor confidence. Gain insight into the
techniques for obtaining high performance Monte-Carlo based
market value-at-risk (VaR) estimates over a hierarchy of risk
aggregation levels. This session will focus on how the new
Fermi platform can be used by financial institutions to enable
on-demand estimates of the market VaR, and discuss important
software architecture decisions, the benefits of the new
GigaThread Engine and Parallel DataCache, as well as the guiding
principles for constructing efficient algorithms on GPUs.
Speaker(s): Matthew Dixon (Professor, UC Davis), Jike Chong
(Principal Software Architect, Parasians, LLC)
Topic(s): Finance, Algorithms & Numerical Techniques

Thursday, Sept 23, 11:00 (20 minutes)
Room A8
2171 Parallel Algorithms for Interactive Mechanical CAD
The broad objective of our research is to develop mechanical
Computer-Aided Design tools that provide interactive feedback to
the designer. We have developed GPU algorithms for fundamental
CAD operations (NURBS evaluation, surface-surface intersection,
separation distance computation, moment computation, etc.)
that are one to two orders of magnitude faster, and often more
accurate, than current commercial CPU implementations. We will
touch on strategies we have employed to meet GPU programming
challenges, such as the separation of CPU/GPU operations,
imposing artificial structure on computations, and transforming
problem definitions to suit GPU-computation models.
Speaker(s): Sara McMains (Associate Professor, University of
California Berkeley), Adarsh Krishnamurthy
(Student, University of California Berkeley)
Topic(s): Algorithms & Numerical Techniques, Tools &
Libraries, Computer Graphics

Thursday, Sept 23, 11:00 (50 minutes)
Room C
2173 Enabling Large-Scale CCTV Face Recognition
Learn how to use CUDA and GPGPU to perform large scale face
search for both forensics and CCTV face recognition.
Speaker(s): Ben Lever (Senior Research Engineer, NICTA),
Abbas Bigdeli (Senior Researcher and
Technology Manager, NICTA)
Topic(s): Computer Vision, Video Processing

Thursday, Sept 23, 11:00 (50 minutes)
Room A2
2203 Modeling Evolution Computing the Tree of Life
Learn how GPUs are being used to accelerate our understanding
of the tree of life. This session will cover BEAGLE, which is an
open API and library for evaluating phylogenetic likelihoods of
biomolecular sequence evolution. BEAGLE uses novel algorithms
and methods for evaluating phylogenies under arbitrary
molecular evolutionary models on GPUs, making use of the large
number of processing cores to efficiently parallelize calculations.
Speaker(s): Daniel Ayres (PhD Candidate, University of Maryland)
Topic(s): Life Sciences

Thursday, Sept 23, 11:00 (50 minutes)
Marriott San Jose Ballroom
2219 High-Productivity CUDA Development with the Thrust
Template Library
Thrust is a parallel template library for developing CUDA
applications. Modeled after the C++ Standard Template Library
(STL), Thrust brings a familiar abstraction layer to the realm
of GPU computing. Thrust provides host and device variants of
the STL vector container to simplify memory management and
facilitate data transfers. These containers are complemented
with a large collection of generic data-parallel algorithms and a
suite of useful iterator adaptors. Together, these features form
a flexible high-level interface for GPU programming that greatly
enhances developer productivity. In this session we’ll discuss
Thrust’s features and explain the basic design philosophy of the
library.
Speaker(s): Nathan Bell (Research Scientist, NVIDIA Research)
Topic(s): Tools & Libraries

Thursday, Sept 23, 11:00 (50 minutes)
Keynote Hall
4008 Emerging Companies: CEO on Stage featuring ICD
and Universal Robotics
See the hottest new technologies from startups that are
transforming computing.
In a lively and fast-paced exchange, the “Emerging Companies
Summit - CEO on Stage” sessions will feature CEOs from three
startups who will each have 15 minutes to introduce their
companies and interact with a panel of leading venture capitalists,
technology executives, and industry analysts.
Panelist(s): Rob Enderle (Analyst, Enderle Group), Jeff Herbst
(Vice President of Business Development, NVIDIA),
Savitha Srinivasan (Corporate Venture Partner, IBM),
Norman Winarsky (VP of Ventures, Licensing and
Strategic Programs, SRI)
Speaker(s): David Peters (Founder and CEO, Universal
Robotics), David Hayes (CEO, ICD)
Topic(s): General Interest, Machine Learning & Artificial
Intelligence, Mobile Devices

Thursday, Sept 23, 11:30 (20 minutes)
Room A8
2106 Particleworks: Particle-based CAE Software on
Multi-GPU
Prometech Software, Inc. is a university-launched technology
venture in Japan and has been working in the field of particle-
based computational fluid dynamics for several years. Through
collaborations with major automotive and material companies in
Japan, Prometech has implemented our Particle technology on
Multi-GPU and delivered it as CAE software, “Particleworks”. In
this session, we will discuss the theoretical background of our
simulation (MPS; Moving Particle Simulation method), Multi GPU
programming techniques of sparse matrix solver, performance
results of Particleworks and analysis examples from the
automotive and materials sectors.
Speaker(s): Issei Masaie (Chief GPU Engineer, Prometech
Software, Inc.)
Topic(s): Computational Fluid Dynamics, High Performance
Computing

Thursday, Sept 23, 11:30 (20 minutes)
Room D
2298 Accelerated Image Quality Assessment using
Structural Similarity
Explores the GPU porting and performance analysis of the
image quality assessment algorithm based on structural
similarity index (SSI). This index is a powerful tool for image
quality assessment and the algorithm is highly suitable for GPU
architecture, offering a rapid image quality assessment in many
image restoration applications.
Speaker(s): Mahesh Khadtare (Member, Technical Staff,
Computational Research)
Topic(s): Computer Vision, Imaging

Thursday, Sept 23, 12:00 (120 minutes)
Exhibit Hall
1004 Exhibits Open / Networking Lunch
Join your colleagues in the exhibit hall to preview emerging
technologies and see some of the most innovative solutions
available today. Lunch will be served.
Topic(s): General Interest

Thursday, Sept 23, 14:00 (50 minutes)
Room K
2087 Fast High-Quality Panorama Stitching
We present a panorama stitching application implemented with
CUDA C on the GPU. The image processing pipeline consists
of SIFT feature detection and matching and Graphcut image
stitching to achieve high-quality results. We demonstrate live
panorama creation with a Webcam.
Speaker(s): Timo Stich (Developer Technology Engineer, NVIDIA)
Topic(s): Video Processing, Algorithms & Numerical
Techniques, Computer Vision, Imaging

Thursday, Sept 23, 14:00 (50 minutes)
Room A2
2115 Modified Smith-Waterman-Gotoh Algorithm for CUDA
Implementation
It is axiomatic that computational throughput can be increased
by exploiting the parallelism of GPU hardware, but what if the
computational algorithm is not easy to implement in parallel? We
have modified one such algorithm, the Smith-Waterman-Gotoh
dynamic programming algorithm for local sequence alignment,
so as to make it more amenable to data-parallel computation.
The result is a successful CUDA implementation that fully exploits
GPU parallelism.
Speaker(s): Richard Wilton (Research Scientist, The Johns
Hopkins University)
Topic(s): Life Sciences, Algorithms & Numerical Techniques
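
For readers unfamiliar with the algorithm named in session 2115, the following is a minimal serial sketch of the textbook Smith-Waterman-Gotoh recurrence (local alignment with affine gap penalties) in Python. It is not the modified, data-parallel version the abstract describes; the function name `swg_score` and the scoring parameters are assumptions chosen for illustration.

```python
def swg_score(a, b, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Best local alignment score with affine gaps (Gotoh recurrence).

    A gap of length k costs gap_open + (k - 1) * gap_extend.
    """
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # H: best score ending at (i, j); E, F: best score ending in a gap.
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(E[i][j - 1] + gap_extend, H[i][j - 1] + gap_open)
            F[i][j] = max(F[i - 1][j] + gap_extend, H[i - 1][j] + gap_open)
            s = match if a[i - 1] == b[j - 1] else mismatch
            # The floor at 0 is what makes the alignment local.
            H[i][j] = max(0, H[i - 1][j - 1] + s, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


identical = swg_score("ACGT", "ACGT")
unrelated = swg_score("AAAA", "TTTT")
```

Each cell depends on its left, upper and upper-left neighbours, which is the data dependency that makes a straightforward parallel implementation difficult.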
Thursday, Sept 23, 14:00 (50 minutes)
Room A7
2040 Derivatives & Bond Portfolio Valuation in a Hybrid
CPU/GPU Environment
Learn how to compute traditional end of day computations
in real time through the use of a hybrid GPU/CPU computing
environment. We will detail how computing intensive tasks are
delegated to the GPU while interface issues are dealt with by the
CPU. We will discuss our methodology consisting of the following
three components: (1) valuations; (2) by tenor risk measures;
and (3) full distributions allowing for more complex analytics
such as exotic options products valuation and counterparty value
adjustments calculation.
Speaker(s): Peter Decrem (Director, Rates Products, Quantifi)
Topic(s): Finance, Algorithms & Numerical Techniques,
High Performance Computing

Thursday, Sept 23, 14:00 (50 minutes)
Marriott San Jose Ballroom
2054 NAMD, CUDA, and Clusters: Taking GPU Molecular
Dynamics Beyond the Desktop
A supercomputer is only as fast as its weakest link. The highly
parallel molecular dynamics code NAMD was one of the first
codes to run on a GPU cluster when G80 and CUDA were
introduced in 2007. Now, after three short years, the Fermi
architecture opens the possibility of new algorithms, simpler
code, and easier optimization. Come learn the opportunities and
pitfalls of taking GPU computing to the petascale.
Speaker(s): James Phillips (Senior Research Programmer,
University of Illinois)
Topic(s): Molecular Dynamics, High Performance Computing,
Life Sciences, Physics Simulation

Thursday, Sept 23, 14:00 (50 minutes)
Room L
2121 Maximizing Throughput of Barco’s GPU-Enabled Video
Processing Server
Find out how Imec middleware realizes the full potential of
GPU-enabled video processing servers to manage multiple
video processing pipelines. We will discuss how the middleware
monitors GPU and CPU execution to best balance the load.
Covers how we achieved a 30% increase in throughput with
only a minimal 0.05% overhead on Barco’s GPU-enabled video
processing server.
Speaker(s): Maja D’Hondt (Program Manager, imec)
Topic(s): Video Processing, Tools & Libraries

Thursday, Sept 23, 14:00 (20 minutes)
Room A8
2155 GPGPU in the real world. The ABAQUS experience
We describe the ABAQUS experience in integrating GPGPU
acceleration into a complex, high performance commercial
engineering software. In particular we discuss the trade-offs we
had to make and the benefits we obtained from this technology.
Speaker(s): Luis Crivelli (Director Solver Development, Dassault
Systemes Simulia Corporation)
Topic(s): Physics Simulation, Algorithms & Numerical
Techniques, Computational Fluid Dynamics,
High Performance Computing
Thursday, Sept 23, 14:00 (50 minutes)
Room N
2175 Hello GPU: High-Quality, Real-Time Speech
Recognition on Embedded GPUs
In this presentation, we will talk about our experiences of
implementing an end-to-end automatic speech recognition
system that runs faster than real time on embedded GPUs,
targeted towards small form-factor consumer devices. Focusing
specifically on some of the challenges encountered during the
design process, a major portion of our talk will focus on giving
insights into modifications we made to well-established speech
algorithms to fit well within the GPU programming model. We
will show how these changes helped us in realizing a highly
optimized system on platforms with limited memory bandwidth
and compute resources.
Speaker(s): Kshitij Gupta (Graduate Student Researcher,
UC Davis)
Topic(s): Embedded & Automotive, Audio Processing, Signal
Processing, Mobile & Tablet & Phone

Thursday, Sept 23, 14:00 (50 minutes)
Room C
2209 Accelerating Computer Vision on the Fermi
Architecture
GPUs have evolved from fixed function to general purpose,
and continue to evolve with new features being added in
every generation. This talk will discuss how to exploit the
new features introduced by the Fermi architecture (such as
concurrent kernel execution, writes to texture) to accelerate
computer vision algorithms.
Speaker(s): James Fung (Developer Technology, NVIDIA)
Topic(s): Computer Vision, Tools & Libraries

Thursday, Sept 23, 14:00 (50 minutes)
Room A3
2210 GPU-Ocelot: An Open Source Debugging and
Compilation Framework for CUDA
Learn how to debug and profile CUDA applications using GPU-
Ocelot. Ocelot is a compilation and emulation framework for
CUDA that includes debugging and profiling tools as well as
backend compilers for NVIDIA GPUs and x86 CPUs. We will
present examples of applications developed on x86 CPUs
and deployed on NVIDIA GPUs. We will also discuss memory
checking, race detection, and deadlock detection tools available
within Ocelot.
Speaker(s): Gregory Diamos (PhD Student, Georgia Institute
of Technology), Andrew Kerr (PhD Student, Georgia
Institute of Technology), Sudhakar Yalamanchili
(Professor, Georgia Institute of Technology)
Topic(s): Tools & Libraries

Thursday, Sept 23, 14:00 (50 minutes)
Room B
2220 Thrust by Example: Advanced Features and
Techniques
Thrust is a parallel template library for developing CUDA
applications which is modeled after the C++ Standard Template
Library (STL). In this session we’ll show how to
decompose problems into the algorithms provided by Thrust.
We’ll also discuss the performance implications of “kernel
fusion” and “array of structs” vs. “structure of arrays” memory
layouts and how they relate to Thrust. Lastly, we’ll present
evidence that Thrust implementations are fast, while remaining
concise and readable.
Speaker(s): Jared Hoberock (Research Scientist, NVIDIA)
Topic(s): Tools & Libraries

Thursday, Sept 23, 14:00 (50 minutes)
Room A1
2241 Standing Out: Implementing a Great Stereo UI
Learn how to make S3D compatible user interfaces, HUDs, and
in-game menus. The first part of this session will outline the
common problems users encounter when displaying traditional
2D UI in stereoscopic 3D. The second part will focus on the
different techniques, tips/tricks, and best practices developers
can use to create high-quality S3D interfaces. The presentation
will highlight examples from several shipped titles, as well
as showcase a complete 3D UI game demo running in S3D on
multiple devices including PC and mobile.
Speaker(s): Brendan Iribe (President, Scaleform)
Topic(s): Stereoscopic 3D, Tools & Libraries, Computer
Graphics, Mobile & Tablet & Phone

Thursday, Sept 23, 14:00 (50 minutes)
Keynote Hall
4009 Emerging Companies Summit Panel: The “New
Normal” For Building Emerging Companies Based On
Disruptive Technologies
Moderated by Jeff Herbst, Vice President of Business
Development, NVIDIA
Start-ups are facing unique challenges as a result of the current
economic and business environment. Not only is the venture
funding environment very difficult, but small companies are
finding it increasingly difficult to “break out” of the pack through
IPOs and attractive M&A exits. This panel of experts (which
includes VC and corporate investors) will attempt to assess the
current state of both the public and private markets, and will
explore various strategies and options for building successful
companies in this “new” environment. Topics will include
traditional forms of equity and debt, angel financing, as well as
other creative/strategic financing options (e.g., NRE arrangements,
strategic partnerships, etc.). The discussion promises to be both
lively and provocative.
Panelist(s): Gerald Brady (Managing Director, Silicon Valley
Bank), Bill Frauenhofer (Managing Director,
Citigroup Global Markets), Garrett Herbert (Partner,
M&A Transaction Services, Deloitte & Touche
LLP), Eric Jensen (Partner, Business Department
Chair, Cooley LLP), Andrew T. Sheehan (Managing
Director, Sutter Hill Ventures)
Topic(s): Finance, General Interest

Thursday, Sept 23, 14:30 (20 minutes)
Room A8
2240 Accelerating LS-DYNA with MPI, OpenMP, and CUDA
When solving implicit problems, the computational bottleneck
in LS-DYNA is the multifrontal linear solver. These operations
are performed with double precision arithmetic, hence until the
arrival of the Tesla 2050, experiments with GPU acceleration
were only a curiosity. This is no longer the case, and in this
talk we will describe how LS-DYNA’s hybrid (MPI and OpenMP)
solver is further accelerated using GPUs to factor large dense
frontal matrices.
Speaker(s): Bob Lucas (Computational Sciences Division
Director, University of Southern California)
Topic(s): High Performance Computing, Algorithms &
Numerical Techniques

Thursday, Sept 23, 15:00 (50 minutes)
Room K
2003 Using CUDA to Accelerate Radar Image Processing
Come see how current GPU technology provides the means for
the first portable real-time radar image processing algorithm.
This session will outline how the GPU has afforded nearly three
orders of magnitude improvement in performance for Synthetic
Aperture Radar’s (SAR) hallmark image processing algorithm.
We will present algorithm details and further improvements.
Speaker(s): Aaron Rogan (Research Scientist and System
Administrator, Neva Ridge Technologies)
Topic(s): Signal Processing, Algorithms & Numerical
Techniques, Imaging, Video Processing

Thursday, Sept 23, 15:00 (120 minutes)
Marriott San Jose Ballroom
2012 Analysis-Driven Performance Optimization
The goal of this session is to demystify performance optimization
by transforming it into an analysis-driven process. There are
three fundamental limiters to kernel performance: instruction
throughput, memory throughput, and latency. In this session we
will describe:
• how to use profiling tools and source code instrumentation to
assess the significance of each limiter;
• what optimizations to apply for each limiter;
• how to determine when hardware limits are reached.
Concepts will be illustrated with some examples and are equally
applicable to both CUDA and OpenCL development. It is assumed
that attendees are already familiar with the fundamental
optimization techniques.
Speaker(s): Paulius Micikevicius (NVIDIA)
Topic(s): Tools & Libraries

Thursday, Sept 23, 15:00 (50 minutes)
Room A1
2016 VDPAU: PureVideo on Unix
Learn about VDPAU (Video Decode and Presentation API for
Unix). VDPAU provides GPU-accelerated video decoding, post-
processing, UI compositing, and display on Unix. VDPAU also
supports sharing surfaces with OpenGL and CUDA (“interop”).
This allows developers to implement their own post-processing
algorithms or scene analysis, or to use decoded video surfaces as
part of a scene rendered using OpenGL.
Speaker(s): Stephen Warren (Snr Linux Software
Engineer, NVIDIA)
Topic(s): Video Processing, Tools & Libraries

Thursday, Sept 23, 15:00 (50 minutes)
Room N
2046 Efficient Automatic Speech Recognition on the GPU
Learn about how the GPU is able to meet the challenges of
implementing automatic speech recognition (ASR), and gain insights
into the data-parallel implementation techniques that can provide
10x faster performance compared to sequentially processing ASR
on a CPU. The state-of-the-art algorithm for ASR performs a graph
traversal on a large, irregular graph with millions of states and
arcs, guided by speech input only known at runtime. We present
four generalizable techniques including: dynamic data-gather
buffer, find-unique, lock-free data structures using atomics,
and hybrid global/local task queues. When used together, these
techniques can effectively resolve ASR implementation challenges
on a GPU.
Speaker(s): Jike Chong (Principal Software Architect,
Parasians, LLC)
Topic(s): Machine Learning & Artificial Intelligence,
Algorithms & Numerical Techniques,
Audio Processing

Thursday, Sept 23, 15:00 (50 minutes)
Room A5
2062 HOOMD-blue: Fast and Flexible Many-Particle
Dynamics
See the newest capabilities and performance enhancements
in HOOMD-blue, a general-purpose many-particle dynamics
application written for GPUs. Speedups of 80-100x are attained
for a wide range of simulation types. Topics for this presentation
include an overview of HOOMD-blue, design and implementation
details of the underlying algorithms, and a discussion on how
generality is maintained without sacrificing performance.
Speaker(s): Joshua Anderson (Research Area Specialist,
University of Michigan)
Topic(s): Molecular Dynamics, High Performance Computing,
Life Sciences, Physics Simulation

Thursday, Sept 23, 15:00 (20 minutes)
Room A7
2064 Correlated Paths for Monte Carlo Simulations
Learn how the GPU can be deployed to generate correlated
paths for Monte Carlo simulation. Using Asian Basket options as
an example, the session shows the generation of correlated paths
with a local volatility model for each of the underlying assets.
Once the paths have been computed, the payoff in each scenario
is computed and reduced to determine the expected value, all on
the GPU.
Speaker(s): Thomas Bradley (Developer Technology
Engineer, NVIDIA)
Topic(s): Finance

Thursday, Sept 23, 15:00 (50 minutes)
Room B
2081 Morphing a GPU into a Network Processor
Modern Internet routers must meet two conflicting objectives,
high performance and good programmability, to satisfy the
ever-increasing bandwidth requirements under fast changing
network protocols. A few recent works show that GPUs have great
potential to serve as the packet processing engine for software
routers. However, the batched execution model of current GPUs
cannot guarantee quality-of-service (QoS) requirements. In this
work, we show how to convert a GPU into an effective packet
processor through minimal changes in both hardware architecture
and scheduling mechanism. Experimental results showed that the
new GPU architecture could meet stringent QoS requirements while
maintaining a high processing throughput.
Speaker(s): Yangdong Deng (Associate Professor, Tsinghua
University)
Topic(s): General Interest, High Performance Computing

Thursday, Sept 23, 15:00 (50 minutes)
Room A2
2105 CUDA-FRESCO: An Efficient Algorithm for Mapping
Short Reads
Learn about CUDA-FRESCO and how it addresses issues with
MUMmerGPU. We will detail how CUDA-FRESCO overcomes
MUMmerGPU’s problems processing reads with errors or
mismatches and delivers additional performance beyond
MUMmerGPU’s 5-12x speedup with less than 100bp query length.
Speaker(s): Chun-Yuan Lin (Assistant Professor, Department of
CSIE, Chang Gung University)
Topic(s): Life Sciences, Algorithms & Numerical Techniques,
Tools & Libraries

Thursday, Sept 23, 15:00 (50 minutes)
Room L
2107 Accelerating Stereographic and Multi-View Images
Using Layered Rendering
Explore applications of geometry shaders in improving the
performance of stereo pair or multi-viewer image generation.
This session will cover the basic approach of single-pass stereo-
pair creation and provide guidelines for when layered rendering
can be used to increase performance. A particular emphasis
will be placed on virtual reality and scientific visualization, but
the techniques discussed apply to a wide range of rendering
environments. Results will be shown for three GPU architectures,
including the new GF100 GPU.
Speaker(s): Jonathan Marbach (Director of Software
Architecture and Engineering, TerraSpark
Geosciences, LLC)
Topic(s): Stereoscopic 3D

Thursday, Sept 23, 15:00 (50 minutes)
Room C
2123 Enabling Augmented Reality with GPU Computing
This talk will take a detailed look at Sportvision’s “First and 10”
system, perhaps the most widely experienced example of AR ever,
with 106 million viewers during the 2010 Super Bowl alone. We’ll
examine the current implementation and the GPU features that
enable low latency, video-rate performance.
Speaker(s): Ryan Ismert (Director of Engineering, Sportvision, Inc.)
Topic(s): Computer Vision

Thursday, Sept 23, 15:00 (50 minutes)
Room A3
2153 CULA - A Hybrid GPU Linear Algebra Package
Get the latest information on CULA, an implementation of
hybrid GPU/CPU linear algebra solvers for NVIDIA GPUs. CULA
launched at GTC2009 and has since received large speedups and
many new features. We will cover all the features, old and new,
along with performance, inner workings, and how users can
integrate CULA into their applications. Learn how your existing
linear algebra applications can benefit from a high quality library.
Much more information is available at www.culatools.com and at
our presentation and booth.
Speaker(s): John Humphrey (Senior Engineer, EM Photonics, Inc)
Topic(s): High Performance Computing, Algorithms &
Numerical Techniques, Tools & Libraries

Thursday, Sept 23, 15:00 (20 minutes)
Room A8
2213 BCSLIB-GPU: Significant Performance Gains for CAE
Hear product architects and developers describe the algorithmic
depths and high level breadth of the use of GPUs that have been
employed to create BCSLIB-GPU, the GPU enablement of the
industry standard sparse matrix software suite, BCSLIB-EXT.
We provide a range of data comparing Tesla and Fermi
with multi-core CPU-only systems for a wide range
of realistic, demanding real-world test problems.
Speaker(s): Danl Pierce (Partner, Access Analytics Int’l, LLC)
Topic(s): Tools & Libraries, Algorithms & Numerical
Techniques, High Performance Computing,
Embedded & Automotive

Thursday, Sept 23, 15:00 (50 minutes)
Keynote Hall
4010 Emerging Companies: CEO on Stage featuring
NaturalMotion Ltd, OptiTex, and Useful Progress
See the hottest new technologies from startups that are
transforming computing.
In a lively and fast-paced exchange, the “Emerging Companies
Summit - CEO on Stage” sessions will feature CEOs from three
startups who will each have 15 minutes to introduce their
companies and interact with a panel of leading venture capitalists,
technology executives, and industry analysts.
Panelist(s): Tim Bajarin (President, Creative Strategies),
Jeff Herbst (Vice President of Business
Development, NVIDIA), Bill Tai (General Partner,
CRV), Paul Weiskopf (Sr. VP of Corporate
Development, Adobe)
Speaker(s): Yoram Burg (President, OptiTex), Sylvain Ordureau
(CEO, Useful Progress), Torsten Reil (CEO,
NaturalMotion Ltd)
Topic(s): General Interest, Medical Imaging & Visualization,
Physics Simulation, Computer Graphics

Thursday, Sept 23, 15:30 (20 minutes)
Room A7
2063 Banking on Monte Carlo… and Beyond
Last year NAG presented spectacular results for Monte Carlo
techniques on GPUs using NAG’s GPU library. This year we will
talk about new projects in the areas of Monte Carlo and PDE
techniques, delivering additional benefits to the finance industry
for real-world problems, including credit modeling.
Speaker(s): Ian Reid (Chief Commercial Officer, NAG)
Topic(s): Finance

Thursday, Sept 23, 15:30 (20 minutes)
Room A8
2208 Acceleration of SIMULIA’s Abaqus Solver on
NVIDIA GPUs
Learn about Acceleware’s and Dassault Systemes’ integrated
solution that performs an LDL^T factorization on GPUs within
the Abaqus software package. We will discuss efficient GPU
parallelization of the factorization algorithm and enabling the
CPU and GPU to overlap their computations and data transfers.
Includes an end user simulation case study and GPU performance
measurements including 300 GFlops in single precision and 145
GFlops in double precision on NVIDIA Tesla C2050.
Speaker(s): Chris Mason (Product Manager, Acceleware)
Topic(s): High Performance Computing

how GPU-based computation enables visual servoing and box
moving. We also discuss the potential of the GPU to solve more
difficult sensory problems such as multi-robot cooperation,
multimodal sensor binding, attention, sensitization, and
habituation.
Speaker(s): Dr. Alan Peters (CTO, Universal Robotics, Inc.)
Topic(s): Machine Learning & Artificial Intelligence

Thursday, Sept 23, 14:00 (50 minutes) Thursday, Sept 23, 16:00 (50 minutes)
Room A5 Room A1
2008 OpenCL Optimization 2095 Building High Density Real-Time Video
Learn how to optimize your OpenCL application to achieve Processing Systems
maximum performance on NVIDIA GPUs. We will first briefly Learn how GPU Direct can be used to effectively build real time,
discuss how the OpenCL programming model maps onto NVIDIA high performance, cost effective video processing products. We
GPU’s architecture. We will then talk about memory, instruction, will focus especially on how to optimize bus throughput while
and NDRange optimization techniques, illustrating each with keeping CPU load and latency minimal.
small code samples.
Speaker(s): Ronny Dewaele (Director Technology Center, Barco)
Speaker(s): Peng Wang (Developer Technology Engineer, NVIDIA) Topic(s): Video Processing, Imaging
Topic(s): Tools & Libraries, High Performance Computing
Thursday, Sept 23, 16:00 (50 minutes)
Thursday, Sept 23, 16:00 (50 minutes) Room A3
Room B
2100 Hybrid GPU/Multicore Solutions for Large Linear
2086 GPGPU DL_POLY Algebra Problems
Discover DL_POLY. Large linear algebra problems may be solved using recursive
DL_POLY: an MD code ICHEC has ported to CUDA. The block decomposition in which GPUs efficiently compute the sub-
THURSDAY

presentation especially focuses on the auto-tuning of the work blocks and multicore CPUs put the sub-blocks back together
distribution between CPU and GPU within a large shared memory space. This talk will present
Speaker(s): Gilles Civario (Head of Capability Computing and benchmark results for such a hybrid approach, implemented in
Novel Architecture Group, ICHEC) Matlab® and using Jacket® to access the GPU compute power.
Topic(s): Molecular Dynamics, High Performance Computing Speaker(s): Nolan Davis (Research Scientist, SAIC)
Topic(s): High Performance Computing, Algorithms &
Thursday, Sept 23, 16:00 (50 minutes) Numerical Techniques, Signal processing
Room A2
2088 Nucleotide String Matching Using Thursday, Sept 23, 16:00 (50 minutes)
CUDA-Accelerated Agrep Room C
Dive deep into the intelligent utilization of various CUDA 2114 Cascaded HOG on GPU
memory spaces to remarkably speedup approximate DNA/ We propose a real time HOG based object detector implemented
RNA nucleotide sequence matching algorithm in bioinformatics on GPU. To accelerate the detection process, the proposed
by an amazing factor of 67 compared to multi-threaded quad method uses two serially-cascaded HOG detectors. The first low
core CPU counterpart. Our talk provides a very good example dimensional HOG detector discards detection windows obviously
to demonstrate how to use indexable array to save frequently not showing target objects. It reduces the computational cost of
updated variables directly into GPU registers, how to organize the second high dimensional HOG detector. This method tested on
shared memory into a 2D array to avoid bank conflict, and how to 640x480 color image and the same size movie. The computation
shuffle the data structure to satisfy the requirement for coalesced time decreases to 70ms per image. That is 4 times faster
global memory access. Our CUDA implementation employs online than a case of single detector. This method provides real time
approach and can be applied in real time. performance even on middle end GPUs such as GeForce GTS 250.
Speaker(s): Hongjian Li (Graduate Student, The Chinese Speaker(s): Kento Tarui (Researcher, AquaCast Corporation)
University of Hong Kong) Topic(s): Computer Vision, Machine Learning &
Topic(s): Life Sciences, Algorithms & Numerical Techniques Artificial Intelligence

Thursday, Sept 23, 16:00 (50 minutes) Thursday, Sept 23, 16:00 (50 minutes)
Room N Room K
2091 The GPU in the Reactive Control of Industrial Robots 2126 Accelerating Signal Processing: Introduction to
Universal Robotics is using GPUs for real-time visual sensing GPU VSIPL
in the reactive control of industrial robots. For a robot to work Learn how to use the Vector Signal Image Processing Library
in a complex dynamic environment to achieve a more loosely to accelerate signal processing applications without needing to
specified goal, such as moving arbitrary boxes from a pallet to a understand platform-specific programming and optimization
conveyor, requires reactivity. Reactive control requires intensive, techniques. We will discuss how GPU VSIPL implements
concurrent, low-latency computation for motion planning, the VSIPL API and uses CUDA-capable GPUs to maximize
exception handling, and sensing. We describe and demonstrate performance of several example applications.
Speaker(s): Dan Campbell (Research Engineer, Georgia Tech
Research Institute)
Topic(s): Signal processing, Tools & Libraries

Thursday, Sept 23, 16:00 (20 minutes)
Room A8
2133 3D Full Wave EM Simulations Accelerated by GPU Computing
3D full-wave electromagnetic simulations of RF components,
antennas, and printed circuit boards can be quite time consuming.
The Computer Simulation Technology (CST) tool suite includes the
capability to activate GPU computing. Examples will be shown of
using Tesla C1060 and S1070 configurations to provide significant
performance improvement of complex simulations.
Speaker(s): Fabrizio Zanella (Systems Manager, CST of America)
Topic(s): High Performance Computing,

Thursday, Sept 23, 16:00 (20 minutes)
Room A7
2136 Pseudo Random Number Generators for Massively Parallel Apps
Learn how to select the best and fastest pseudo random number
generator for your massively parallel Monte Carlo simulation.
Pseudo random number generators (PRNGs) are a fundamental
building block of these simulations, so it is necessary to
select suitable PRNGs with regard to the specific problem at
hand while considering the parallel hardware architecture.
Recent developments in random number generation provide
a wide variety of choices, each with different properties and
trade-offs. We provide a comprehensive survey of the current
state of the art for massively parallel PRNGs and show a broad
range of applications.
Speaker(s): Holger Dammertz (PhD Student, Ulm University)
Topic(s): Algorithms & Numerical Techniques, Finance

Thursday, Sept 23, 16:00 (50 minutes)
Room A5
2271 Compose CUDA Masterpieces! Write better, Leverage More
Not all CUDA code is created equally. Learn how to step up your
CUDA game. Also, learn how to build large, multi-person CUDA
projects for your organization. In very clear descriptions, learn
the difference between naïve GPU code, intermediate GPU code,
and advanced GPU mastery. We show how careful construction
of CUDA kernels can affect application performance. We also
discuss how Jacket tools greatly facilitate the development of
CUDA-based projects. Finally, we will debut the Jacket runtime’s
new C/C++ library. With this library, the technical computing
functions in Jacket’s MATLAB engine are made available in C/C++.
Speaker(s): James Malcolm (VP of Engineering, AccelerEyes)
Topic(s): Tools & Libraries

Thursday, Sept 23, 16:00 (50 minutes)
Room D
2279 Working Man’s Guide to 3D Video Editing
Video editing is currently at two simultaneous inflection points:
the use of GPUs for video processing and the beginning of
widespread adoption of 3D. At this time, however, identifying and
navigating through the necessary tools and equipment to create
compelling 3D video content is challenging. This session is
intended to provide a pragmatic guide to creating prosumer 3D
video content and to show how the GPU greatly assists and speeds
up this process. The intended audience is anyone interested in how
to create compelling 3D movies at a prosumer level.
Speaker(s): Rudy Sarzo (Principal, SMI), Ian Williams (Director
PSG Applied Engineering, NVIDIA), Kevan O’Brien
(NVIDIA)
Topic(s): Digital Content Creation (DCC)

Thursday, Sept 23, 16:00 (50 minutes)
Room L
2283 500 Teraflops Heterogeneous Cluster
HPC Affiliated Resource Center (ARC) will be host to a very large
interactive HPC. The large cluster (CONDOR) will integrate Cell
Broadband Engine processors, GPGPUs and powerful x86 server
nodes, with a combined capability of 500 Teraflops. Applications
will include neuromorphic computing, video synthetic aperture
radar backprojection, matrix multiplications, and others. This
presentation will discuss progress on performance optimization
using the Heterogeneous Cluster and lessons learned from
this research.
Speaker(s): Mark Barnell (HPC Director, Air Force Research
Lab (AFRL))
Topic(s): High Performance Computing

Thursday, Sept 23, 16:00 (50 minutes)
Keynote Hall
4011 Emerging Companies: CEO on Stage featuring
Cinnafilm Inc., Perceptive Pixel, and Total Immersion
See the hottest new technologies from startups that are
transforming computing.
In a lively and fast-paced exchange, the “Emerging Companies
Summit - CEO on Stage” sessions will feature CEOs from three
startups who will each have 15 minutes to introduce their
companies and interact with a panel of leading venture capitalists,
technology executives, and industry analysts.
Panelist(s): Tim Bajarin (President, Creative Strategies),
Jeff Herbst (Vice President of Business
Development, NVIDIA), Bill Tai (General Partner,
CRV), Paul Weiskopf (Sr. VP of Corporate
Development, Adobe)
Speaker(s): Lance Maurer (CEO, Cinnafilm, Inc.), Bruno Uzzan
(Founder and CEO, Total Immersion), Jeff Han
(Founder and Chief Scientist, Perceptive Pixel)
Topic(s): General Interest, Computer Vision, Film, Imaging

Thursday, Sept 23, 16:30 (20 minutes)
Room A8
2066 Accelerating System Level Signal Integrity Simulation
Discuss how GPU acceleration for key parts of the ANSYS Nexxim
Simulator resulted in significant speedup over multi-core
processors. We will cover time consumption and data parallelism
exposure considerations, and focus on key areas where GPU
acceleration was applied including convolution and Eye rendering.
Speaker(s): Danil Kirsanov (Scientist, ANSYS), Ekanathan
Palamadai (Research & Development Engineer,
ANSYS)
Topic(s): Physics Simulation, Algorithms & Numerical
Techniques, Signal processing

Thursday, Sept 23, 16:30 (20 minutes)
Room A7
2101 Pricing American Options Using GPUs
This presentation focuses on the challenging problem of Pricing
High-Dimensional American Options (PHAO) and how GPUs can
help with this task. We first present a method based on
Malliavin calculus that is well suited to parallel
architectures. We then compare this method with the
Longstaff & Schwartz method, which is better suited to
sequential architectures. We will conclude with some ideas about
parallelizing the former method on a cluster of machines,
and finally we will discuss this method as a
reformulation of a non-linear parabolic problem using BSDEs.
Speaker(s): Lokman A. Abbas-Turki (PhD Student in Applied
Mathematics, Paris-Est University)
Topic(s): Finance, Physics Simulation
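
For readers unfamiliar with the sequential baseline named in abstract 2101, the following is a minimal CPU sketch of the Longstaff & Schwartz least-squares Monte Carlo method for a one-dimensional American put. It is not the speaker's code: the function names, the quadratic regression basis and the parameter choices are illustrative assumptions, and the Malliavin-based method from the talk is not shown.

```python
import math
import random


def solve3(a, b):
    """Solve a 3x3 linear system a x = b (Gaussian elimination, partial pivoting)."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def american_put_lsm(s0, strike, rate, sigma, maturity, steps, paths, seed=0):
    """Hypothetical helper: price an American put by least-squares Monte Carlo."""
    rng = random.Random(seed)
    dt = maturity / steps
    disc = math.exp(-rate * dt)
    drift = (rate - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)

    # Simulate geometric Brownian motion; spots[i][t] is path i at time (t + 1) * dt.
    spots = []
    for _ in range(paths):
        s, path = s0, []
        for _ in range(steps):
            s *= math.exp(drift + vol * rng.gauss(0.0, 1.0))
            path.append(s)
        spots.append(path)

    # cash[i] holds path i's future cash flow, discounted to the current step.
    cash = [max(strike - p[-1], 0.0) for p in spots]

    # Backward induction: regress continuation value on in-the-money paths.
    for t in range(steps - 2, -1, -1):
        cash = [c * disc for c in cash]
        itm = [i for i in range(paths) if spots[i][t] < strike]
        if len(itm) < 3:
            continue
        ata = [[0.0] * 3 for _ in range(3)]
        atb = [0.0] * 3
        for i in itm:
            x = spots[i][t] / strike          # normalised spot
            basis = (1.0, x, x * x)           # assumed quadratic basis
            for r in range(3):
                atb[r] += basis[r] * cash[i]
                for c in range(3):
                    ata[r][c] += basis[r] * basis[c]
        beta = solve3(ata, atb)
        for i in itm:
            x = spots[i][t] / strike
            continuation = beta[0] + beta[1] * x + beta[2] * x * x
            exercise = strike - spots[i][t]
            if exercise > continuation:
                cash[i] = exercise            # exercise now, drop later cash flow

    return disc * sum(cash) / paths
```

The backward loop is what makes this method awkward to parallelize: each regression depends on the exercise decisions already made at all later time steps, which fits the abstract's description of it as better suited to sequential architectures.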

Thursday, Sept 23, 17:00 (60 minutes)
Keynote Hall
1003 Closing Ceremonies and Keynote with Dr. Sebastian
Thrun, Stanford University
What really causes accidents and congestion on our roadways?
How close are we to fully autonomous cars? In his keynote
address, Stanford Professor and Google Distinguished Engineer,
Dr. Sebastian Thrun, will show how his two autonomous vehicles,
Stanley (DARPA Grand Challenge winner), and Junior (2nd Place
in the DARPA Urban Challenge) demonstrate how close, yet
how far away, we are from fully autonomous cars. Using computer
vision combined with lasers, radars, GPS sensors, gyros,
accelerometers, and wheel velocity, the vehicle control systems
are able to perceive and plan the routes to safely navigate Stanley
and Junior through the courses. However, these closed courses
are a far cry from everyday driving. Find out what the team will do
next to get one step closer to the “holy grail” of computer vision,
and a huge leap forward toward the concept of fully autonomous
vehicles.

Sebastian Thrun is a professor of computer science and electrical
engineering at Stanford, where he directs the Stanford AI Lab.
He is also a distinguished engineer at Google. Thrun’s team
won the DARPA Grand Challenge, a US-Government sponsored
autonomous robot race that took place in 2005. Thrun also
pioneered the scientific field of probabilistic robotics, and he co-
invented Google Street View. In recognition of his contributions,
Thrun has been elected into the US National Academy of
Engineering and the German Academy of Sciences. He is an
elected fellow of the AAAI, ECCAI, and WTN. Popular Science
included Thrun in their “Brilliant Ten,” Forbes Magazine in their
“E-Gang” members, Scientific American in their list of 50 world
technology and policy leaders, and Fortune selected him as
one of the 50 smartest people in tech. Wired Magazine awarded
Thrun’s robot Stanley the top spot among the most influential robots
of all time. The robot is now part of a permanent exhibition in the
Smithsonian Museum of American History. Thrun has authored 11
books and over 300 scientific articles.
Topic(s): General Interest, Computer Vision, Machine
Learning & Artificial Intelligence
Speaker(s): Sebastian Thrun (Professor/Distinguished Engineer,
Stanford/Google)

NVIDIA RESEARCH SUMMIT
POSTER LISTING

Algorithms & Numerical Techniques

A01 - Communication-Avoiding QR Decomposition for GPUs
Communication-Avoiding QR is a recent algorithm for computing
a QR decomposition, which is optimal with regard to the amount
of communication performed. We’ve implemented the CAQR
algorithm on the GPU and found that it performs exceptionally
well, especially for the challenging case of tall-skinny matrices.
Author: Michael Anderson (University of California, Berkeley)

A02 - Accelerating Symbolic Computations on NVIDIA Fermi
We present the first implementation of a complete modular
resultant algorithm on the graphics hardware. Our recent
developments, taking advantage of the new NVIDIA Fermi GPU
architecture and instruction set, allowed us to achieve about
150x speed-up over a modular resultant algorithm from Maple 13.
Author: Pavel Emeliyanenko (Max-Planck Institute for Informatics)

A03 - Particle-In-Cell Simulations on the GPU
Particle-In-Cell simulations represent an important technique
in the field of kinetic plasma simulations. 2D particle pushing
and conserved current aggregation have been implemented in
CUDA. On a Tesla C1060 the CUDA code is 4 times faster than
SSE2 optimized code on a quad core Intel Xeon processor.
Author: Hartmut Ruhl (Ludwig-Maximilians-University)

A04 - Parallel Ant Colony Optimization with CUDA
The Ant Colony Optimization (ACO) Algorithm is a metaheuristic
that is used to find shortest paths in graphs. By using CUDA to
implement an ACO algorithm, we achieved significant improvement
in performance over a highly-tuned sequential CPU implementation.
The construction step of the ACO algorithm consists of each ant
creating an independent solution, and this step is where most of
the computation is spent. Since the construction step is the same
for most ACO variations, parallelizing this step will also allow for
easy adaptation to different pheromone updating functions.
Currently, our research tests this hypothesis on the travelling
salesman problem.
Author: Octavian Nitica (University of Delaware)

A05 - High Performance and Scalable Radix Sorting for GPU
Stream Architectures
The need to rank and order data is pervasive, and sorting
operations are fundamental to many algorithms. This poster
presents a very efficient method for sorting large sequences of
fixed-length keys (and values) using GPU stream processors.
Compared to the state-of-the-art, our implementation demonstrates
multiple factors of speedup (up to 3.8x) for all NVIDIA GPGPUs.
For this domain of sorting problems, we believe our sorting
primitive to be the fastest available for any fully-programmable
microarchitecture: our stock NVIDIA GTX480 sorting results exceed
the 1G keys/sec average sorting rate (i.e., one billion 32-bit
keys sorted per second).
Author: Duane Merrill (University of Virginia)

A06 - Task Management for Irregular Workloads on the GPU
We explore software mechanisms for managing irregular tasks on
graphics processing units. Traditional GPU programming guidelines
teach us how to efficiently program the GPU for data parallel
pipelines with regular input and output. We present a strategy
for solving task parallel pipelines which can handle irregular
workloads on the GPU. We demonstrate that dynamic scheduling and
efficient memory management are critical problems in achieving
high efficiency on irregular workloads. We showcase our results
on a real time Reyes rendering pipeline.
Author: Stanley Tzeng (University of California, Davis)

A07 - A Hybrid Method for Solving Tridiagonal Systems on GPU
Tridiagonal linear systems are of importance to many problems in
numerical analysis and computational fluid dynamics, as well as
to computer graphics applications in video games and
computer-animated films. This poster presents our study on the
performance of multiple tridiagonal algorithms on a GPU. We design
a novel hybrid algorithm that combines a work-efficient algorithm
with a step-efficient algorithm in a way well-suited for a GPU
architecture. Our hybrid solver achieves 8x and 2x speedup
respectively in single precision and double precision over a
multi-threaded highly-optimized CPU solver and a 2x speedup over
a basic GPU solver.
Author: Yao Zhang (University of California, Davis)

A08 - Development of Desktop Computing Applications and
Engineering Tools on GPUs
A GPU competence center and laboratory for research and
collaboration within academia and partners in industry was
established in 2008 at the Section for Scientific Computing, DTU
Informatics, Technical University of Denmark. In GPULab we focus
on the utilization of GPUs for high-performance computing
applications and software tools in science and engineering,
inverse problems, visualization, imaging, and dynamic
optimization. This poster illustrates the latest and most
interesting projects that have been developed at our center.
Author: Hans Henrik B. Soerensen (Technical University of Denmark)

A09 - Ballot Counting for Optimal Binary Prefix Sum
This poster describes a new technique for performing binary
prefix sums using Fermi’s new __ballot() and __popc() functions.
These instructions greatly increase intra-warp communication,
allowing for an 80% speedup over standard GPU methods in
applications like Radix Sort. It also points to future research
that will enable suffix array construction, Burrows-Wheeler
Transform, and the BZIP algorithm to take advantage of these
instructions for efficient GPU compression.
Author: David Whittaker (University of Alabama at Birmingham)

A10 - Deriving Parallelism and GPU Acceleration of Algorithms
with Inter-Dependent Data Fields
This poster presents an approach to derive parallelism in
algorithms that involve building a sparse matrix that represents
relationships between inter-dependent data fields, and to enhance
their performance on the GPU. This work compares the algorithm
performance on the GPU to its CPU variant that employs the
traditional sparse matrix-vector multiplication (SpMV) approach.
We have also compared our algorithm performance with CUSP SpMV
on GPU. The software used in this work is MATLAB and Jacket, a
GPU engine for MATLAB.
Author: Jaideep Singh (AccelerEyes)

A11 - Parallelizing the Particle Level Set Method
The particle level set is widely used as an accurate interface
tracking tool in simulation, computer vision and other related
fields. However, high computation cost prevents applying this
method to real-time and interactive scenarios. This work makes
intensive use of parallel design patterns implemented in the
Thrust library, such as compaction, reduction and scattering, to
parallelize the particle level set method in order to attain
real-time performance.
Author: Wen Zheng (Stanford University)

A12 - Accelerating CUDA Graph Algorithms at Maximum Warp
Graphs are powerful data representations favored in many
computational domains. GPUs have shown promising results in this
domain, but their performance degrades when the graph is highly
irregular. In this study, we propose three general schemes to
accelerate graph algorithms on a modern GPU architecture:
(i) deferred processing of outliers, (ii) efficient dynamic
workload balancing and (iii) warp-based execution exploiting
threads in a SIMD-like manner. Our evaluation reveals that our
schemes exhibit up to 9x speedup over previous GPU algorithms and
23x over single CPU execution on irregular graphs. They also yield
up to 30% improvement, even for regular graphs.
Author: Sungpack Hong (Stanford University)

A13 - Implementation of Adaptive Cross Approximation on
NVIDIA GPUs
The Method of Moments is a popular computational method for
solving integral equations in electromagnetics. However, it
suffers from high computational and memory costs since it requires
the solution of a dense linear system. The Adaptive Cross
Approximation (ACA) is an effective technique for compressing the
system matrix, thereby reducing the necessary storage as well as
the number of operations required to solve the system.
Acceleration of the ACA MoM with NVIDIA GPUs can finally enable
the solution of “real world” scattering problems on a personal
workstation in a practical timeframe.
Author: Daniel Faircloth (Georgia Tech Research Institute)

A14 - A GPU Accelerated Continuous-based Discrete Element
Method for Elastodynamics Analysis
The Continuum-based Distinct Element Method (CDEM) is the
combination of Finite Element Method (FEM) and Discrete Element
Method (DEM), which is mainly used in general structural analyses,
as well as landslide stability evaluations and coal and gas
outburst analyses. By means of CUDA and a GTX-285 VGA card, the
GPU version achieves a speedup of hundreds of times.
Author: Zhaosong Ma (Institute of Mechanics,
Chinese Academy of Sciences)

A15 - GPU Algorithms for NURBS Minimum Distance and
Clearance Computations
We present GPU algorithms and strategies for accelerating
distance queries and clearance computations on models made of
trimmed NURBS surfaces. We provide a generalized framework for
using GPUs as co-processors in accelerating CAD operations. The
accuracy of our algorithm is based on the model space precision,
unlike earlier graphics algorithms that were based only on image
space precision. Our algorithms are at least an order of magnitude
faster and about two orders of magnitude more accurate than the
commercial solid modeling kernel ACIS.
Author: Adarsh Krishnamurthy (University of California, Berkeley)

A16 - Gate-Level Simulation with GP-GPUs
This poster describes my research work on how to leverage the
GP-GPU execution parallelism to achieve high performance in the
time consuming problem of gate-level simulation of digital
hardware designs.
Author: Debapriya Chatterjee (University of Michigan)

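
As an illustration of the idea behind poster A09 above, the sketch below emulates on the host how a warp-wide binary prefix sum can be built from a ballot and a population count: the ballot packs each lane's predicate into one 32-bit mask, and each lane counts the set bits below its own position. This is a plain Python emulation under assumed names (`ballot`, `popc`, `warp_binary_exclusive_scan`), not the poster author's kernel and not CUDA code.

```python
WARP_SIZE = 32


def ballot(predicates):
    """Emulate a warp vote: bit i of the result is set iff lane i's predicate is true."""
    mask = 0
    for lane, pred in enumerate(predicates):
        if pred:
            mask |= 1 << lane
    return mask


def popc(value):
    """Emulate a population count: number of set bits in value."""
    return bin(value).count("1")


def warp_binary_exclusive_scan(predicates):
    """Exclusive prefix sum of 0/1 flags across one emulated 32-lane warp."""
    assert len(predicates) == WARP_SIZE
    mask = ballot(predicates)
    # Each lane keeps only the bits of lower-numbered lanes, then counts them.
    return [popc(mask & ((1 << lane) - 1)) for lane in range(WARP_SIZE)]


# Example use: offsets for compacting the lanes whose flag is set,
# as in the split step of a radix sort.
flags = [lane % 2 == 0 for lane in range(WARP_SIZE)]
offsets = warp_binary_exclusive_scan(flags)
```

On real hardware every lane would perform its mask-and-count step concurrently, so the whole scan costs one vote and one population count per lane instead of a multi-step shared-memory scan.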
SGI® Altix® UV
accelerating performance

SGI Altix UV is the largest shared memory system offering the
Tesla-20 series. SGI integrates NVIDIA Tesla solutions to accelerate
the pace of our customers’ work to solve the most
computationally-intensive challenges including structural design,
drug research, oil and gas exploration, and computational finance.

• Massive parallel compute power
• Optimized system performance
• Faster time-to-results

Learn more about Altix UV at www.sgi.com/altixuv

www.sgi.com

© 2010 SGI. SGI and Altix are registered trademarks or trademarks of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries.
All other trademarks are property of their respective holders.

A17 - CUDA Implementation of Barrier Option Valuation using
Jump-Diffusion Model and Brownian Bridge
Impressive speedups up to 100x using GPUs compared to CPUs are
achieved by taking advantage of data parallelism, increased
bandwidth and the ability to hide latency. We have implemented a
Monte Carlo valuation of a barrier option modeled by a standard
diffusion process with a jump diffusion term obeying an underlying
Poisson process to account for rare events. In addition, a
Brownian Bridge is incorporated to account for barrier crossings
in between diffusion trajectories and to reduce bias. This option
is representative of exotic options which lack a closed-form
solution and are amenable to Monte Carlo type methods for
valuation.
Author: Vincent Natoli (Stone Ridge Technology)

Astronomy & Astrophysics

B01 - Black Holes in Galactic Nuclei Simulated with Large GPU
Clusters in CAS
Many, if not all, galaxies harbour supermassive black holes. If
galaxies merge, which is quite common in the process of
hierarchical structure formation in the universe, their black
holes sink to the centre of the merger remnant and form a tight
binary. Depending on initial conditions and time, supermassive
black hole binaries are prominent gravitational wave sources, if
they ultimately come close together and coalesce. We model such
systems as gravitating N-body systems (stars) with two or more
massive bodies (black holes), including if necessary relativistic
corrections to the classical Newtonian gravitational forces (Kupi
et al. 2006, Berentzen et al. 2009).
Author: Rainer Spurzem (National Astronomical
Observatories, Chinese Academy of Sciences)

Audio Processing

C01 - Exploring Recognition Network Representations for
Efficient Speech Inference on the GPU
We explore two contending recognition network representations
for speech inference engines: the linear lexical model (LLM) and
the weighted finite state transducer (WFST) on NVIDIA GTX285 and
GTX480 GPUs. We demonstrate that while an inference engine using
the simpler LLM representation evaluates 22x more transitions per
second than the advanced WFST representation, the simple structure
of the LLM representation allows 4.7-6.4x faster evaluation and
53-65x faster operands gathering for each state transition. We
illustrate that the performance of a speech inference engine based
on the LLM representation is competitive with the WFST
representation on highly parallel GPUs.
Author: Jike Chong (Parasians, LLC)

C02 - Efficient Automatic Speech Recognition on the GPU
Automatic speech recognition (ASR) technology is emerging as a
critical component in data analytics for a wealth of media data
being generated every day. ASR-based applications contain
fine-grained concurrency that has great potential to be exploited
on the GPU. However, the state-of-the-art ASR algorithm involves a
highly parallel graph traversal on an irregular graph with
millions of states and arcs, making efficient parallel
implementations highly challenging. We present four generalizable
techniques including: dynamic data-gather buffer, find-unique,
lock-free data structures using atomics, and hybrid global/local
task queues. When used together, these techniques can effectively
resolve ASR implementation challenges on an NVIDIA GPU.
Author: Jike Chong (Parasians, LLC)

Computational Fluid Dynamics

D01 - High-Order Unstructured Compressible Flow Solver on
the GPU
The objective of this project is to develop a scalable and
efficient high-order unstructured compressible flow solver for
GPUs. The solver allows the achievement of arbitrary order of
accuracy for flows over complex geometries. High-order solvers
require more operations per degree of freedom, thus making them
highly suitable for massively parallel processors. Preliminary
results indicate speed-ups up to 70x with the Tesla C1060 compared
to the Intel i7 CPU. Memory access was optimized using shared and
texture memory.
Author: Patrice Castonguay (Stanford University)

D02 - Parallel 3D Geometric Multigrid Solver on GPU Clusters
An investigation of the performance and scalability of a
multigrid pressure Poisson equation solver running on a GPU
cluster.
Author: Dana Jacobsen (Boise State University)

D03 - Acceleration of mesh-free CFD using CUDA
In this work, the acceleration of a mesh-free Computational
Fluid Dynamics (CFD) code is performed using CUDA. The poster
gives an overview of the CUDA implementation strategy and the
resulting performance increase.
Author: Ruairi Nestor (Irish Centre for High-End Computing)

D04 - Airblast Modelling on Multiple Tesla units
We used NVIDIA Tesla GPUs to accelerate the solution of
hyperbolic partial differential equations, with application to
modelling airblast generated by industrial bench mining
operations. Parallelisation over multiple GPUs was achieved
using MPI.
Author: Sean Lovett (University of Cambridge)

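
To make the Brownian bridge step mentioned in poster A17 above more concrete, here is a small CPU sketch of a Monte Carlo up-and-out call with a bridge correction between monitoring dates. It rests on several assumptions that are not taken from the poster: the jump term is omitted (plain geometric Brownian motion only), the option type and all parameter values are illustrative, and the function name is hypothetical. The correction uses the standard result that a Brownian path with log-space volatility sigma, pinned at x1 and x2 below a barrier b over an interval dt, touches the barrier with probability exp(-2 (b - x1)(b - x2) / (sigma^2 dt)).

```python
import math
import random


def up_and_out_call(s0, strike, barrier, rate, sigma, maturity,
                    steps, paths, use_bridge, seed=0):
    """Hypothetical helper: Monte Carlo up-and-out call under plain GBM."""
    rng = random.Random(seed)
    dt = maturity / steps
    drift = (rate - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    log_b = math.log(barrier)
    total = 0.0
    for _ in range(paths):
        x = math.log(s0)
        survival = 1.0  # probability this path never touched the barrier
        for _ in range(steps):
            x_next = x + drift + vol * rng.gauss(0.0, 1.0)
            if x_next >= log_b:
                survival = 0.0  # knocked out at a monitoring date
                break
            if use_bridge:
                # Chance the continuous path crossed between the two dates.
                p_hit = math.exp(-2.0 * (log_b - x) * (log_b - x_next)
                                 / (sigma * sigma * dt))
                survival *= 1.0 - p_hit
            x = x_next
        if survival > 0.0:
            total += survival * max(math.exp(x) - strike, 0.0)
    return math.exp(-rate * maturity) * total / paths


# With the same seed both runs see identical paths, so the difference is
# purely the crossings that discrete monitoring misses.
naive = up_and_out_call(100.0, 100.0, 120.0, 0.05, 0.2, 1.0, 12, 4000,
                        use_bridge=False, seed=7)
bridged = up_and_out_call(100.0, 100.0, 120.0, 0.05, 0.2, 1.0, 12, 4000,
                          use_bridge=True, seed=7)
```

Because every path is independent, the outer loop maps naturally onto one GPU thread per path, which is the data parallelism the abstract refers to.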
D05 - Implementation of High-Order Adaptive data properties. Therefore, we should dynamically
CFD Methods on GPGPUs adjust the parallelization strategy at runtime to
This poster describes our implementation of optimize key computations.
adaptive high-order CFD methods on GPUs. A Author: Bor-Yiing Su (University of California, Berkeley)
speedup factor of up to 44 has been achieved for
2D flow problems.
F02 - Dense Point Trajectories by GPU-
Author: Z.J. Wang (Iowa State University)
Accelerated Large Displacement Optical Flow
In this poster we discuss a method for
D06 - Computational Fluid Dynamics on GPU computing point trajectories based on a fast
Computational Fluid Dynamics, an important parallel implementation of a recent optical ﬂow
branch in HPC field, has a history of seeking and algorithm that tolerates fast motion. The parallel
requiring higher computational performance. implementation of large displacement optical ﬂow
The traditional way to satisfy this quest is to use runs about 78x faster than the serial C++ version.
faster machines or supercomputers. Yet these We use this implementation is a point tracking
approaches seem inconvenient and costly to many application. Our resulting technique tracks up to
individual researchers. We investigated the use three orders of magnitude more points and is 46%
of GPU to accelerate CFD codes and tested the more accurate than the Kanade-Lucas-Tomasi
performances on CUDA and OpenCL platform. tracker. Compared to the Particle Video tracker,
We have ported 2D cave flow, 2D Riemann, we achieve 66% better accuracy while retaining
and 2D flow over a RAE2882 airfoil to the GPU the ability to handle large displacements while
and explored some GPU-specific optimization running an order of magnitude faster.
strategies. In most cases, approximately 16 to 63 x Author: Narayanan Sundaram (University of
speed up can be achieved. California, Berkeley)
Author: Long Wang (Supercomputing Center,
Chinese Academy of Sciences)

Computer Graphics

E01 - Dynamic and Implicit Trees for Graphics
and Visualization on the GPU
We propose a new way to represent trees that
allows for faster algorithms that are simple
to implement (especially on the GPU) and
have a lower memory overhead than previous
approaches. Using our data structure, we have
seen significant improvements in both volume ray
casting and ray tracing applications over previous
state-of-the-art methods.
Author: Nathan Andrysco (Purdue University)

E02 - Fragment-Parallel Composite and Filter
In this poster, we describe our recent work in
the area of programmable graphics pipelines by
presenting a fragment-parallel formulation of an
A-buffer-style composite and filter equation, and
describe its implementation on a modern GPU.
Author: Anjul Patney (University of California, Davis)

Computer Vision

F01 - Architecture Aware Design for a Parallel
Object Recognition System
We have developed a parallel object
recognition system using CUDA, achieving
70x-80x speedup against the original serial
implementation. In order to optimize our
implementation, we evaluated the performance
of different parallelization strategies on some
key computations in the object recognition
system. Finally we concluded that the parallel
implementation performance is sensitive to input

F03 - Visual Cortex on a Chip: Large-scale, Real-
Time Functional Models of Visual Cortex on a
GPGPU
Los Alamos National Laboratory’s Petascale
Synthetic Visual Cognition project is exploring
full-scale, real-time functional models of human
visual cortex to understand how human vision
achieves its accuracy, robustness and speed.
Commercial-off-the-shelf hardware to support
this modeling is rapidly improving, e.g., a teraflop
GPGPU card costs ~$500 and is roughly the size of
mouse cortex. We present results demonstrating image
classification on UAV aerial video with a visual
cortex model running on a 240-core NVIDIA
GeForce GTX285, and see a speed-up of more than 10x. As
this technology continues to improve, cortical
modeling on GPGPU devices has the potential to
revolutionize computer vision.
Author: Steven Brumby (Los Alamos
National Laboratory)

F04 - Fermi in Action: Robust Background
Subtraction for Real-time Video Analysis
Background subtraction is one of the important
image processing steps for video surveillance and
many computer vision problems such as tracking
and recognition. However, robust background
subtraction that adapts well to variable
environment changes is computationally expensive
and consumes a large amount of memory. Thus,
its practical application is often limited. Here,
we aim to expand its usage and tackle vision
problems that require high-frame-rate cameras,
such as real-time sports analysis and real-time
object detection and recognition. Using recent
advances in accelerator hardware (the NVIDIA
Fermi architecture) and taking advantage of
heterogeneous computing, we are able to gain
performance good enough for use in these
practical applications.
Author: Melvin Wong (Institute for Infocomm Research)
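
The F04 abstract does not say which background model the authors use. As a rough illustration of what background subtraction involves, the sketch below assumes the simplest common choice, an exponential running average with a fixed threshold. The function names, the `alpha` and `threshold` values, and the tiny frames are all hypothetical and are not taken from the poster.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the background estimate (running average)."""
    return [
        [(1.0 - alpha) * b + alpha * f for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]


def foreground_mask(background, frame, threshold=30.0):
    """Mark a pixel as foreground (1) when it differs from the
    background estimate by more than the threshold."""
    return [
        [1 if abs(f - b) > threshold else 0 for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]


# Hypothetical 2x3 grayscale frames: a static scene, then one bright pixel.
background = [[10.0, 10.0, 10.0], [10.0, 10.0, 10.0]]
frame = [[10.0, 200.0, 10.0], [10.0, 10.0, 12.0]]

mask = foreground_mask(background, frame)          # classify first
background = update_background(background, frame)  # then adapt the model
```

Every pixel is processed independently of its neighbors, which is why this kind of computation maps naturally onto one GPU thread per pixel. More robust models keep more state per pixel, which is consistent with the memory cost the abstract mentions.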
F05 - Bridging Neuroscience and GPU Computing
to Build General Purpose Computer Vision
The construction of artificial vision systems
and the study of biological vision are naturally
intertwined as they represent simultaneous
efforts to forward- and reverse-engineer systems
with similar goals. Here, we present a high-
throughput approach to more expansively explore
biologically-inspired models by leveraging GPUs.
We show that this approach can yield significant
gains in performance on object and face
recognition (including “Labeled Faces in the Wild”
challenge and faces from Facebook), consistently
outperforming the state-of-the-art. We highlight
how the application of flexible programming
tools, such as high-level scripting and template
metaprogramming/auto-tuning, can enable large
performance gains, while managing complexity for
the developer.
Author: Nicolas Pinto (Massachusetts Institute
of Technology)

F06 - CUDA for Vision and Imaging Library
CUVI Lib (CUDA for Vision and Imaging Library)
is a software library that provides a set of
GPU accelerated computer vision and image
processing functions. CUVI can be used both as
an add-on library for NVIDIA’s NPP (NVIDIA
Performance Primitives), as it complements the
functionality present in NPP, and as a
standalone library ready to be plugged
into end-user C/C++ applications.
Author: Salman Ul Haq (TunaCode)

F07 - GPU-Friendly Multi-View Stereo
Reconstruction Using Surfel Representation and
Graph Cuts
We present a new surfel (surface element) based
multi-view stereo algorithm which runs entirely
on GPU. We utilize the flexibility of surfel-based 3D
shape representation and global optimization
by graph cuts in the same framework. The
orientation of the constructed surfel candidates
imposes an effective constraint that reduces the
effect of the minimal surface bias. The entire
processing pipeline is implemented on the latest
GPU to speed up the processing significantly.
Experimental results show that the proposed
approach reconstructs the 3D shape of an object
accurately and efficiently, running more than
100 times faster than on the CPU.
Author: In Kyu Park (Inha University)

F08 - CUDA Accelerated Face Recognition
A GPU based implementation of a face recognition
solution using PCA with the Eigenfaces algorithm.
Author: Jayadeep Vijayan (NeST Software)

F09 - GPU Driven Dense Reconstruction for
Community Photo Collections
We present a system to reconstruct dense 3D
models from community photo collections.
First, images are described using GIST and are
clustered using Hamming distances. Each of
these clusters is geometrically verified and
connected using Geotags. Connected clusters
are bundle adjusted and the obtained registration
is used to estimate depthmaps that are finally
fused to obtain dense 3D models. Each of the
above steps, except Bundle Adjustment, is
implemented in CUDA and runs on multiple GPUs.
Our pipeline is two orders of magnitude faster,
on an order of magnitude more images,
than the state-of-the-art method.
Author: Jan-Michael Frahm (University of North
Carolina, Chapel Hill)

F10 - Portable Central Vision Enhancement
System for Macular Degeneration Patients
Vision enhancement systems are an alternative
visual aid that enhances the remaining vision
of visually impaired subjects. Our aim is to
develop a mobile central vision enhancement
system for macular degeneration patients. Three
different types of enhancement algorithms have
been developed and their efficiency was tested on
low vision patients. These three algorithms have
been implemented on a portable low power device.
The Nvidia system-on-a-chip Tegra has been
chosen for this implementation.
Author: Chloe Vaniet (Imperial College London)

F11 - Dense Stereo Vision on GPU
A dense stereo vision system for a material-handling
dual-arm industrial robot has been implemented,
with the rectification, stereo correspondence
and 3D pose-from-depth stages ported to the GPU
using CUDA.
Author: Esubalew Bekele (Universal Robotics Inc.)

F12 - Upsampling Range Data in Dynamic
Environments
We present a flexible, parallelized method for
fusing information from optical and range sensors
based on an accelerated high-dimensional
filtering approach. Our system takes as input a
sequence of monocular camera images as well
as a stream of sparse range measurements as
obtained from a laser or other sensor system.
Our method produces a dense, high-resolution
depth map of the scene, automatically generating
confidence values for every interpolated depth
point. We describe how to integrate priors on
object shape, motion and appearance and how to
achieve an efficient implementation using parallel
processing hardware such as GPUs.
Author: Jennifer Dolson (Stanford University)

F13 - GPU Accelerated Marker-less Motion
Capture
In this work, we derive an efficient filtering
algorithm for tracking human pose at 4-10 frames
per second using a stream of monocular depth
images. The key idea is to combine an accurate
generative model (which is achievable in this
setting using state-of-the-art GPU hardware) with
a discriminative model that feeds data-driven
evidence about body part locations. We describe a
novel algorithm for propagating the noisy evidence
about body part locations up the kinematic chain
using the unscented transform. We provide extensive
experimental results on 28 real-world sequences
using automatic ground-truth annotations from a
commercial motion capture system.
Author: Varun Ganapathi (Stanford University)

F14 - 3D Facial Feature Modeling with Active
Appearance Models
The Active Appearance Model (AAM) is a powerful
tool for modeling and matching objects under
shape deformations and texture variations. It
learns characteristics of objects by building a
compact statistical model by applying Principal
Component Analysis (PCA) to a set of labeled
data. Although AAM has been widely applied in
the field of computer vision, due to its flexible
framework, it still cannot satisfy the requirements
of real-time situations. To alleviate this problem,
we address the computational complexity of the
fitting procedure by running the AAM optimization
algorithm on a GPU using a hybrid CPU / GPU
block processing architecture.
Author: Tim Llewellynn (nViso / EPFL)

F15 - OpenCV on GPU
OpenCV is a free open source library of computer
vision algorithms. Recently a new module
consisting of functions implemented on GPU
was introduced in OpenCV. It consists of several
methods for calculating stereo correspondence
between two images, which is used to reconstruct
a 3D scene. A simple block-matching algorithm
works up to 10x faster compared to a CPU
implementation in OpenCV, providing real-time
processing of HD stereo pairs on Tesla cards.
Belief propagation-based algorithms show 20-50x
speedup compared to a CPU implementation.
Author: Anatoly Baksheev (ITEEZ)

Databases & Data Mining

G02 - Speculative Query Processing
With an increasing amount of data and user
demands for fast query processing, the
optimization of database operations continues to
be a challenging task. A common optimization
method is to leverage parallel hardware
architectures. With the introduction of general-
purpose GPU computing, massively parallel
hardware has become available within commodity
hardware. To efficiently exploit this technology,
we introduce the method of speculative query
processing. This speculative query processing
works on index structures to efficiently support
heavily used database operations. To show the
benefits and opportunities of our approach, we
present fine- and coarse-grained implementations
for multidimensional queries.
Author: Peter Volk (Technische Universität Dresden)

G03 - Virtual Local Stores
We propose a mechanism to provide the benefits
of a software-managed memory hierarchy on
top of a hierarchy of hardware-managed caches.
A virtual local store (VLS) is mapped into the
virtual address space of a process and backed
by physical main memory, but is stored in a
partition of the hardware-managed cache when
active. This reduces context switch cost, and
allows VLSs to migrate with their process thread.
The partition allocated to the VLS can be rapidly
reconfigured without flushing the cache, allowing
programmers to selectively use VLS in a library
routine with low overhead.
Author: Henry Cook (University of California, Berkeley)

Embedded & Automotive

H01 - Driver Assistance: Speed-Limit Sign
Recognition on the GPU
We investigate the use of different GPU-based
implementations for performing real-time
speed limit sign recognition on a resource-
constrained embedded system. The system
recognized US and European Union speed-limits
at over 88% accuracy while running in real-time.
The system is hardware-accelerated using CUDA
and OpenGL. It introduces a novel technique for
detecting speed-limit signs which is only possible
with the aid of GPU processing.
Author: Vladimir Glavtchev (BMW)

H02 - Complex Automotive Applications
The NVIDIA GPU architecture is becoming a very
interesting hardware target for complex
automotive applications. We implemented the
same automotive application on several different
hardware targets and analyzed the maximum
frame rate and the effective CPU load. This
paper shows how real-time applications like
pedestrian detection and driving assistance
benefit from a massively parallel
“central” architecture like GPU/CUDA. Real-
time performance and zero-delay transfers
can be achieved using a fully asynchronous
implementation. The same approach can
multiply the application performance by the
number of GPU devices present on the embedded
system, at a reasonable power consumption.
Author: Marius Vasiliu (University of Paris Sud)

High Performance Computing

I01 - A GPU-based Architecture for Real-Time
Data Assessment at Synchrotron Experiments
Modern X-ray imaging cameras provide millions
of pixels and several thousand frames per second.
To process such an amount of information we
have optimized the reconstruction software
employed at the tomography beamlines of ANKA
and ESRF synchrotrons to use the computational
power of modern graphics cards. Using GPUs as
compute coprocessors we were able to reduce
the reconstruction time by a factor of 30 and
process a typical data set of 20GB in 40 seconds.
The time needed for the first evaluation of the
reconstructed sample is reduced significantly and
quasi real-time visualization is now possible.
Author: Suren Chilingaryan (Karlsruhe Institute
of Technology)

I02 - Automatic High-Performance GPU code
Generation using CUDA-CHiLL
This poster presents a system to automatically
generate high-performance GPU code starting
from an input sequential loop nest computation.
The compiler analyzes input computation in C
and automatically generates a set of equivalent
code variants represented by transformation
recipes. These recipes guide the underlying code
transformation and generation framework to apply
code transformations and ultimately produce
CUDA code.
We use the system to generate high performing
CUDA code for four BLAS functions, matrix
transpose and convolution stencils. The results
mostly outperform CUBLAS2.2/CUDA_SDK2.2 and
naive GPU kernels and achieve performance of up to
435GF(mm) with an average speedup of up to 1.78x.
Author: Malik M Khan (USC/ UoU)

I03 - CSIRO Advances in GPU Computing. What
could you do with 256 GPUs?
The Commonwealth Scientific and Industrial
Research Organisation (CSIRO) is Australia’s
national science agency. CSIRO is currently
applying GPU Computing on a scale ranging from
single GPU workstations through to their 256 GPU
cluster. This poster showcases some of CSIRO’s
work in the areas of GPU accelerated biological
imaging, image deconvolution, synchrotron
science and CT reconstruction, and statistical
inference in complex environmental models.
Speedups of between 8x and 230x have been seen
across these application areas using a broad
range of GPU computing platforms.
Author: Luke Domanski (CSIRO)

I04 - High Performance Agent-Based Simulation
with FLAME for the GPU
The Flexible Large-scale Agent Modelling
Environment for the GPU (FLAME GPU) addresses
the performance and architecture limitations of
previous work by presenting a flexible framework
approach to ABM on the GPU. Most importantly
it addresses the issue of agent heterogeneity
through the use of state machine based agent
representation. This representation allows
agents to be separated into associated state lists
which are processed in batches to allow a very
diverse population of agents whilst avoiding large
divergence in parallel code kernels. The use of
the GPU allows AB models to be visualised in real
time, which further widens the application of ABM
to real-time simulations.
Author: Paul Richmond (University of Sheffield)

I05 - The Scalable HeterOgeneous Computing
(SHOC) Benchmark Suite
SHOC is a benchmark suite for heterogeneous
systems. This poster describes the suite and
presents recent performance measurements.
Author: Kyle Spafford (Oak Ridge National Laboratory)

I06 - HyperFlow: An Efficient Dataflow
Architecture for Multi CPU-GPU Systems
We propose a new pipeline architecture that
can take advantage of the many processing
elements available in modern CPU-GPU systems
to maximize performance in visualization and
computational tasks. Our architecture is very
flexible and allows the construction of classical
parallel algorithms such as data streamers and
map/reduce templates. We also discuss examples
and performance benchmarks that demonstrate
the potential of our system.
Author: Huy Vo (University of Utah)

I07 - MPI-CUDA Applications Checkpointing
We propose a checkpoint/restart tool for multi-
GPU applications such as MPI-CUDA applications.
Author: Nguyen Toan (Tokyo Institute
of Technology)

I08 - Particle Simulations using DEM on GPUs
Particle based numerical methods have been an
emerging field since the GPU/CUDA technique
became widely accepted in recent years. 80%
of all material used in pharmaceutical
technology is powder. Numerical simulation
of such material is possible by using the Discrete
Element Method (DEM). The main restrictions
here are compute power together with the problem
size. Only a few ten-thousand particles lead to
weeks to months of compute time in order to
reflect processes of a few minutes in real time.
DEM scales excellently with the massively-parallel
CUDA environment, enabling us to access the
million particle range in acceptable job runtimes.
Author: Charles Radeke (University Graz)

I09 - Mastering Multi-GPU Computing on a Torus
Network
We describe APEnet+, the new generation of
our 3D torus network which scales up to tens
of thousands of cluster nodes with linear cost.
The basic component is a custom PCIe adapter
with six high-speed links, designed around a
programmable HW component (FPGA), a nice
environment for studying integration techniques
between GPUs and network interfaces. The
high-level programming model is MPI, while a low-
level RDMA API is also available.
Author: Davide Rossetti (National Institute of
Nuclear Physics)
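
The six links per adapter in the I09 abstract match the six nearest neighbors a node has on a 3D torus. As a minimal sketch of that topology only, and not of the APEnet+ hardware or its API, the hypothetical helper below computes the neighbors of a node with wraparound in each dimension. The function name and the coordinate convention are assumptions made for illustration.

```python
def torus_neighbors(node, dims):
    """Return the six neighbors of `node` on a 3D torus of size `dims`.

    Coordinates wrap around in every dimension, so nodes on the edge
    of the grid are linked to the nodes on the opposite edge.
    """
    x, y, z = node
    nx, ny, nz = dims
    return [
        ((x + 1) % nx, y, z), ((x - 1) % nx, y, z),  # +x, -x
        (x, (y + 1) % ny, z), (x, (y - 1) % ny, z),  # +y, -y
        (x, y, (z + 1) % nz), (x, y, (z - 1) % nz),  # +z, -z
    ]


# A corner node of a hypothetical 4x4x4 torus still has six neighbors.
neighbors = torus_neighbors((0, 0, 3), (4, 4, 4))
```

Because every node has the same fixed number of links regardless of the machine size, the number of links grows in proportion to the number of nodes, which is one way to read the abstract's "linear cost".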
I10 - Atmospheric Modelling, Simulation and
Visualization using CUDA
The Laboratory Meteorological Dynamics (LMD)
by CNRS weather model is used extensively for
research and weather forecasting purposes.
Simulation of atmospheric climate is one of the
most challenging computational tasks because
of its numerical complexity and simulation time.
The numerical simulations must obviously
run faster than real time to be useful in
decision support.
Author: Priyanka Sah (Indian Institute of
Technology, Delhi)

I11 - Automatic Program Generation for the
Fermi - DFT Transform
The goal of SPIRAL is to push the limits
of automation in software and hardware
development and optimization of numerical kernels
beyond what is possible with current tools. In this
research, we address the problem of an efficient
high performance computing platform of libraries
automatically generated by a computer for NVIDIA
GPU architectures. Spiral generates code that
automatically bypasses all the architectural
restrictions on GPUs, such as shared memory bank
conflicts and global memory coalescing, and pushes
code to the limits (maximum number of threads,
register pressure, etc.). The procedure of code
generation is fast, platform dependent, easy to
rewrite and problem adaptable.
Author: Christos Angelopoulos (Carnegie
Mellon University)

I12 - Fast N-body Algorithms for Dynamic
Problems on the GPU
We present an extension of the earlier algorithm
by Gumerov & Duraiswami (J. Comput. Phys.,
2008) which adapts the FMM to the GPU, where
the data structures are efficiently generated on
the GPU as well. Details and performance on
current architectures will be presented.
Author: Qi Hu (University of Maryland)

I13 - GPU Acceleration of Cube Calculus
Operations
In our current work, we present the first massively
parallel, GPU accelerated implementation of the
Cube Calculus operations for multivalued and
binary logic, also called Cube Calculus Machine
(CCM). Substantial speedups up to the order of
85x are achieved using the CUDA enabled nVIDIA
Tesla GPU compared to the CPU implementation
on a sequential processor. CC is a very efficient
and convenient mathematical formalism for
representation, processing and synthesis of
binary and multivalued logic which has significant
applications in logic synthesis, image processing
and machine learning. Thus, massive speedups
achieved using GPUs are very encouraging to build
future parallel VLSI EDA systems.
Author: Vamsi Parasa (Portland State University)

I14 - An Atomic Tesla
We examined the possibility of using an Atom-
based host system to control a Tesla S1070.
Our simple benchmarks found that Atom-based
systems should be viable for codes with serial
portions small enough to make Amdahl’s Law
irrelevant. Such systems would have a much lower
power draw than ‘traditional’ GPU clusters.
Author: Richard Edgar (Massachusetts
General Hospital)

I15 - ICHEC’s GPU Research: Porting of Scientific
Application on NVIDIA GPU
ICHEC is the Irish National HPC centre, with
a mission to provide both high performance
computing resources and expertise for the Irish
research community. In addition to its core
mission of research enablement, ICHEC started
in May 2009 an exploratory activity in GPGPU
and CUDA programming. Quantum Espresso
is an increasingly popular molecular dynamics
package, mainly developed by the DEMOCRITOS
group in Trieste (IT). PWscf is part of the Quantum
Espresso suite which performs electronic and
ionic structure calculations. An interesting part of
the porting of PWscf is a high performance [ZD]
gemm which executes in parallel on the CPU
and GPU.
Author: Ivan Girotto (Irish Centre for
High-End Computing)

I16 - Implementation of Smith-Waterman
algorithm in OpenCL for GPUs
This poster presents an implementation
of the Smith-Waterman algorithm in OpenCL.
This implementation is capable of computing
similarity indexes between query sequences and
a reference sequence with or without sequence
alignment paths. In accordance with the
requirement for the target application in cancer
research, the implementation provides processing
of very long reference sequences (on the order of
millions of nucleotides). Performance compares
favorably against CPU, being on the order of
14 to 610 times faster; 4.5 times faster than
Farrar’s implementation. It is also on par with
CUDASW++v2.0.1 performance, but with fewer
constraints on sequence length.
Author: Dzmitry Razmyslovich (Institute of
Computer Engineering, University of Heidelberg)

I17 - Computing Strongly Connected
Components in Parallel on CUDA
The problem of decomposition of a directed
graph into its strongly connected components is
a fundamental graph problem inherently present
in many scientific and commercial applications.
We show how existing parallel algorithms can
be reformulated in order to be accelerated by
NVIDIA CUDA technology. We design a new
CUDA-aware procedure for pivot selection and
we redesign the parallel algorithms in order to
allow for CUDA accelerated computation. We
experimentally demonstrate that with a single
GTX 280 GPU card we can easily outperform
the optimal serial CPU algorithm.
Author: Milan Ceska (Masaryk University)

I18 - A CUDA Runtime Target for the
Sequoia Compiler
We describe an implementation of the Sequoia
Runtime interface in CUDA that enables the
Sequoia compiler to target programs written in
Sequoia for single and multiple GPU systems.
Author: Michael Bauer (Stanford University)

I19 - GPU Computing for Real-Time Optical
Measurement Techniques
Measuring displacement and strains during
deformation of advanced materials which are
too small, big, compliant, soft or hot are typical
scenarios where non-contact techniques are
needed. Using Digital Image Correlation and
Tracking, strain can be calculated from a series
of consecutive images with sub pixel resolution.
However, the image processing is a computation
intensive task and can’t be performed in real
time using general purpose processors. We
implemented a 3-stage pipelined architecture:
images are loaded, preprocessed using CPU, and
correlated on GPUs. Using two GTX295 cards we
were able to reach 35 times speedup compared to
the fastest Core i7 processor.
Author: Suren Chilingaryan (Karlsruhe Institute
of Technology)

I20 - An MPI/CUDA Implementation of
Discontinuous Galerkin Time Domain Method for
Maxwell’s Equations
We describe an MPI/CUDA approach to solve
Maxwell’s equations in time domain by means
of an Interior Penalty Discontinuous Galerkin
Time Domain Method and a local time stepping
algorithm. We show that MPI/CUDA provides 10x
speed up versus MPI/CPU, in double precision.
Moreover, we present scalability results and an
85% parallelization efficiency up to 40 GPUs on
the Glenn cluster of Ohio Supercomputing Center.
Finally, we study an electromagnetic cloaking
example for a broad band signal (8-11 GHz), to
show the potential of our approach to solve real
life examples in short simulation times.
Author: Stylianos Dosopoulos (Ohio State University)

Imaging

J01 - Neurite Detection using CUDA, GPU
Accelerated Biological Imaging for
High-Content Analysis
The analysis of microscopic neurite structures
in images is important for studying the
effects of lead compounds on brain diseases
or the regeneration of brain cells after trauma.
In High-Content Analysis (HCA) 100s to 1000s
of microscopy images are processed during
automated experiments. The speed of the image
processing in these situations greatly affects
the workflow throughput. We report some early
results on GPU acceleration of the Neurite
Detection module in our group’s HCA-Vision.
The most time consuming algorithm steps
are accelerated by up to 13.6x resulting in a
3.3x speedup for the entire algorithm (70% of
theoretical maximum).
Author: Luke Domanski (CSIRO)

J02 - Fast Radon Transform via Fast Non-
uniform FFTs on GPUs
Fast Radon Transform is required in X-ray Phase
Contrast Tomography performed at the Advanced
Light Source, Lawrence Berkeley National Lab.
We describe a fast implementation based on fast
non-uniform FFTs on GPUs.
Author: Chao Yang (Lawrence Berkeley National
Laboratory)

J03 - Projected Conjugate Gradient Solvers on
GPU and its Applications
In this work, the focus is specifically on how to
speed up the projected CG algorithm utilizing the
GPU. It is shown that the projected CG method
can be used within the single precision accuracy
of the current GPU. One benefit gained through
use of the projected CG is that it reduces the total
number of matrix vector multiplications, which is
usually a bottleneck for an efficient GPU-based
Krylov-based algorithm. A modified projection
based CG algorithm in the thesis is further
proposed which shows a better performance.
Numerical results using the GPU are provided to
support the proposed algorithm.
Author: Youzuo Lin (Arizona State University)

J04 - Real-time Direct Georeferencing of Images
from Airborne Line Scan Cameras
The Norwegian Defense Research Establishment
(FFI) is developing a technology demonstrator
for airborne real-time hyperspectral target
detection. The system includes two nadir-
pointing line scan cameras. The line scanned
images are georeferenced in real-time by
intersecting rays cast from the cameras with
a 3D model of the terrain underneath. The
georeferenced images may then easily be
ortho-rectified (e.g., by using texture mapping
in OpenGL) and overlaid on digital maps. This
poster presents the performance of a CUDA
implementation of the georeferencing method.
Author: Trym Vegard Haavardsholm (Norwegian
Defence Research Establishment (FFI))

J05 - CUDA Acceleration of Color
Histogram Matching
Histogram matching techniques are methods for
the adjustment of color in a pair of images. They can
be used as a preliminary stage for several video
applications, for example 3D content creation.
In such applications, two cameras separated by a
known distance acquire video streams that can be
combined in order to compute a depth map.
As both cameras capture slightly different scenes,
they can be lit by different sources, producing
a possible color shift between their streams
and thus penalizing the quality and the user
experience. Our approach considers the use
of an NVIDIA 3D broadcast solution system with
professional HD cameras.
Author: Antonio Sanz (Universidad Rey Juan Carlos)

Life Sciences

K01 - Generalized Linear Model (GLM) Based
Quantitative Trait Locus (QTL) Analysis
Relating Genotype to Phenotype in Complex
Environments has been identified as one of the
grand challenges of plant sciences. Under the
umbrella of the iPlant Collaborative funded
by the Plant Science Cyberinfrastructure
Collaborative program of the NSF, our goal is
to develop a GPU implementation of the General
Linear Model (GLM) to statistically link genotype
to phenotype and dramatically decrease the
execution time for GLM analyses. The GPU based,
highly parallelized Forward Regression stage of
the GLM achieved 177x speedup over the Matlab
based serial version. Results of this study will
enable larger, more intensive genetic mapping
analyses to be conducted.
Author: Ali Akoglu (University of Arizona)

K02 - GPU-REMuSiC: The Implementation of
Constrain Multiple Sequence Alignment on
Graphics Processing Unit
We implement the RE-MuSiC tool on multi-GPUs
(called GPU-REMuSiC) with NVIDIA CUDA.
By a special model implementation, the DP
computation time in GPU-REMuSiC running on
single and two GeForce GTX 260 cards achieves
speedups of more than 75x and 130x, respectively,
compared to that of sequential RE-MuSiC running
on an Intel i7 920 CPU.
Author: Chun-Yuan Lin (Chang Gung University)

Machine Learning & Artificial
Intelligence

L01 - CUDA Creatures
CUDA Creatures applies parallel algorithms to the
iterated Prisoner’s Dilemma, a classic study of the
evolution of cooperation. We bring interactivity to
parameter space exploration by achieving 600x to
800x speedups on GTX 260.
Author: Andrew Hershberger (Stanford University)

Medical Imaging & Visualization

M01 - Real-time Ultrasound Data Processing for
Regional Anesthesia Guidance
Ultrasound imaging techniques such as Doppler
flow imaging and acoustic radiation force impulse
(ARFI) imaging require estimation of velocity or
displacement from the received echoes. Real-
time processing and display of images allows
for real-time guidance of procedures, improving
patient safety and efficacy. Using CUDA, the
processing code has been implemented in pre-
clinical regional anesthesia studies investigating
new methods for localizing where fluid is being
injected. The computation time has been
reduced from 20 minutes to 18 seconds, resulting
in the rapid display of dynamic images of the fluid
being injected.
Author: Stephen Rosenzweig (Duke University)

M02 - GPU-Accelerated Texture Decompression
of Biomedical Image Stacks
Histopathology is the microscopic examination
of tissue in order to study the manifestations
of disease. High-resolution images are vital
for accurate diagnoses and a major obstacle
to the use of digital imaging in histopathology
has been the inability to display these large
images at interactive rates. We have created a
tool for interactive visualization of biomedical
image stacks using GPU-accelerated on-the-fly
texture decompression. The image stacks are
compressed using a novel approach custom
tailored for the data we are dealing with, i.e. data
exhibiting exceptionally high coherence between
the slices of each image stack.
Author: Chirantan Ekbote (Harvard University)

M03 - Accelerated Large Scale Spherical Model
Forward Solutions for the EEG/MEG using CUDA
The study presented in the poster looks at the
utility of a CUDA based approach to improve the
computational speed of the spherical model
EEG and MEG forward solution for large scale
3-D dipole grid (on the order of 1000 and up) and
sensor locations (on the order of 100 and up). Fast
computation of the forward solution is critical
in improving the speed of the inverse solution in
biosource imaging. The inverse solution gives the
location of the epileptogenic foci from the EEG and
MEG measurements.
Author: Nitin Bangera (MIND Research Network)

M04 - CUDA Accelerated Real Time Volumetric
Cardiac Image Enhancement
CUDA enables high data rate real time volumetric
cardiac ultrasound image enhancement.
Substantial improvements in processing data rate
and memory bandwidth demand over a CPU based
approach were found with CUDA.
Author: Ismayil Guracar (Siemens Medical Solutions)
M05 - Efficient Visualization of Salient Manifolds in Scalar, Vector, and Tensor Fields
Our research focuses on harnessing the massively parallel compute power of the GPU to visually explore complex datasets. We propose adaptive GPU-based approaches that intertwine computation and rendering. Alongside, we present novel dynamic data structures for the GPU. Our research includes the visualization of salient structures in vector fields using LCS, extraction of ridge and valley surfaces from volumetric scalar fields with scale analysis, and efficient volume / surface rendering.
Author: Samer Barakat (Purdue University)

M06 - Highly Parallel Image Reconstruction for Positron Emission Tomography (PET)
We present a novel method of computing line projection operations required for list-mode ordered-subsets expectation-maximization (OSEM) for fully 3-D PET image reconstruction on a GPU using the CUDA framework. Our method overcomes challenges such as compute thread divergence and exploits GPU capabilities such as shared memory and atomic operations. This new GPU-CUDA implementation is 120X faster than a reference CPU implementation. The image quality is preserved with root mean squared (RMS) deviation between the images generated using the CPU and the GPU being 0.08%, which has negligible effect in typical clinical applications.
Author: Jingyu Cui (Stanford University)

Molecular Dynamics

N01 - Energy Evaluation of Rosetta Proteins Using CUDA
In this poster, we describe preliminary results using CUDA to accelerate the energy evaluation of proteins folded by the Rosetta software suite.
Author: Will Kohut (University of California, Davis)

N02 - GPU Accelerated Molecular Dynamics Algorithms for Soft Matter Systems using HOOMD-Blue
The rheological, thermodynamic, and self-assembly behavior of liquids, colloids, polymers, foams, gels, granular materials and biological systems is often studied in simulation by using coarse-grained models based on molecular dynamics algorithms. The open source general purpose particle dynamics code HOOMD-Blue has been expanded to include the simulation techniques and pair potentials used to study this class of problems.
Author: Carolyn Phillips (University of Michigan)

Neuroscience

O01 - Distributed Multi-Level Out-of-Core Volume Rendering
In neuroscience, scans of brain tissue are acquired using electron microscopy, resulting in extremely high-resolution volume data with sizes of many terabytes. To support the work of neurobiologists, interactive exploration of such volumes requires new approaches for distributed out-of-core volume rendering. A major goal of our distributed GPU volume rendering system is to sustain a pixel-to-voxel ratio of about 1:1. This display-aware approach effectively bounds the working set size required for ray-casting, which makes it largely independent of the volume resolution. Currently, our system achieves interactive volume rendering of 43GB and 92GB volumes on 1 to 8 Tesla nodes.
Author: Johanna Beyer (King Abdullah University of Science and Technology)

Programming Languages & Techniques

P01 - GPU-to-CPU Callbacks
Our poster outlines GPU-to-CPU callbacks, a method for the GPU to request work from the CPU. We give some motivation, demonstrate the code architecture, and give samples of CPU and GPU code that show callbacks being executed.
Author: Jeff Stuart (University of California, Davis)

Physics Simulation

Q01 - Acceleration of Computational Electromagnetics Physical Optics - Shooting and Bouncing Ray Method
Electromagnetic fields radiated by a 1964 Ford Thunderbird are calculated over 50 times faster than on a standard CPU by using a Quadro FX 5800 GPU.
Author: Huan-Ting Meng (University of Illinois at Urbana-Champaign)

Q02 - Massively Parallel Micromagnetic FEM Calculations with Graphical Processing Units
We adapted our Micromagnetic Simulator “TetraMag” to NVIDIA’s CUDA architecture, resulting in a significant increase in calculation speed and cost efficiency over the most recent PC-based machines. The poster gives an outline of the general challenges and the methods used to adapt the solutions to GPUs as well as benchmark results obtained using standard micromagnetic problems.
Author: Elmar Westphal (Forschungszentrum Juelich)

FULL CONFERENCE GUIDE 2010
Q03 - Multiplying Speedups: GPU-Accelerated Fast Multipole BEM, for Applications in Protein Electrostatics
We have developed a fast multipole boundary element method (BEM) for biomolecular electrostatics. With GPU acceleration of the FMM, there is a multiplicative speed-up resulting from the fast O(N) algorithm and GPU hardware. With this method, we can obtain converged results for multi-million atom systems in less than an hour, using multi-GPU clusters.
Author: Lorena Barba (Boston University)

Q04 - GPU-Powered Control of a Compliant Humanoid Robot
The ECCEROBOT project deals with the construction and control of a robot with a humanoid skeleton and muscle-like compliant, elastic actuators. The nonlinear passive and active coupling between the skeletal elements, combined with the effect of environmental interaction, presents an extremely complex control problem. In our solution, motor programs are found using physics-based simulation of both the robot and its environment to locate candidate movements. For real-time control, multiple copies of the simulation must be run faster than real time, requiring the use of GPU acceleration. Further, in order to capture the environment we use GPU-accelerated dense reconstruction vision.
Author: Alan Diamond (University of Sussex, UK)

Programming Languages & Techniques

R01 - A Speech Recognition Application Framework for Highly Parallel Implementations on the GPU
Data layout, data placement, and synchronization processes are not usually part of a speech application expert’s daily concerns. Yet failure to carefully take these concerns into account in a highly parallel implementation on graphics processing units (GPUs) could mean an order of magnitude of loss in application performance. We present an application framework for parallel programming of automatic speech recognition (ASR) applications that allows a speech application expert to effectively implement speech applications on the GPU, and demonstrate how the ASR application framework has enabled a Matlab/Java programmer to achieve a 20x speedup in application performance on a GPU.
Author: Jike Chong (Parasians, LLC)

R02 - Scalable Computer Vision Applications
We are developing a domain specific language for computer vision algorithms that facilitates rapid implementation of algorithms that are scalable and portable across CPU-GPU architectures. The presented approach significantly lowers the barrier of implementation of computer vision algorithms for heterogeneous CPU-GPU architectures, and enables a single implementation to automatically scale to use additional hardware as it becomes available.
Author: Rami Mukhtar (NICTA)

R03 - Language and Compiler Extensions for Heterogeneous Computing
GPGPU architectures offer large performance gains over their traditional CPU counterparts for many applications. However, current GPU programming models present numerous challenges to the programmer: lower-level languages, explicit data movement, loss of portability, and performance optimization challenges. In this paper, we present novel methods and compiler transformations that increase productivity by enabling users to easily program GPUs using the high productivity programming language Chapel.
Author: Albert Sidelnik (University of Illinois at Urbana-Champaign)

Signal processing

S01 - Achieving 1 TFLOP for the Radio Astronomy Correlator
In this work we apply CUDA, using the Fermi architecture, to the problem of cross-correlation arising in radio astronomy. This accounts for the bulk of computation in radio astronomy, and essentially is described by vector outer-products. Traditionally this task is performed using FPGAs, and the goal of this work was to see how efficiently GPUs could be used for this task. We describe the tiling strategies and optimization techniques employed to maximize performance. We achieve in excess of 1 teraflop per second using a single GeForce GTX 480, which corresponds to 78% of peak performance.
Author: Michael Clark (Harvard University)

S02 - CUDA Implementation of Software for Identifying Post-Translational Modifications
InsPecT is software for identifying post-translational modifications (PTMs) of proteins. With the help of the MS-Alignment algorithm, InsPecT can search for PTMs in unrestrictive mode and even reveal unknown types of modifications. However, MS-Alignment has a very high time complexity and accounts for more than 99% of InsPecT’s computing time. We accelerated MS-Alignment on GPUs. After optimization and parallelization with MPI, the result is cuda-InsPecT, a new, highly efficient open source package based on MPI+CUDA.
Author: Long Wang (Supercomputing Center, Chinese Academy of Sciences)

Tools & Libraries

U01 - Mint: An OpenMP to CUDA Translator
We aim to facilitate GPU programming for finite difference applications. We have developed Mint, a source-to-source compiler that generates CUDA code from OpenMP code. Mint transforms omp parallel for loops into CUDA kernels and applies domain specific optimizations such as shared memory, register and kernel fuse optimizations. Since our translator targets structured grid problems, it optimizes the code better than general purpose compilers. In this poster, we present translation and optimization steps along with our initial performance results.
Author: Didem Unat (University of California, San Diego)

U02 - Real-Time Particle Simulation in the Blender Game Engine with OpenCL
The goal of this project is to produce interactive scientific visualizations that can be used in educational games. We use the computational power of OpenCL to enable features in the Blender Game Engine that would otherwise not be possible in real-time. By adding an interactive particle system to the game engine, we set the stage to demonstrate many interesting scientific phenomena (molecular dynamics, fluid dynamics, statistics) with the added benefit of real-time special effects for games in general.
Author: Ian Johnson (Florida State University)

U03 - GStream: A General-Purpose Data Streaming Framework on GPU Clusters
In this poster, we propose GStream, a general-purpose, scalable data streaming framework on GPUs. The contributions of GStream are as follows: (1) We provide powerful, yet concise language abstractions suitable to describe conventional algorithms as streaming problems. (2) We project these abstractions onto GPUs to fully exploit their inherent massive data-parallelism. (3) We demonstrate the viability of streaming on accelerators. Experiments show that the proposed framework provides flexibility, programmability and performance gains for various benchmarks from a variety of domains, including but not limited to data streaming, data parallel problems, numerical codes and text search.
Author: Yongpeng Zhang (North Carolina State University)

U04 - NukadaFFT: An Auto-Tuning FFT Library for CUDA GPUs
We have released our FFT library for CUDA GPUs. Most of the algorithms and auto-tuning technologies of FFT for CUDA are already published. The library now supports the new Fermi architecture and works with CUDA 3.0 or later.
Author: Akira Nukada (Tokyo Institute of Technology)

Video Processing

V01 - Real-Time Color Space Conversion for High Resolution Video
Color space conversion or color correction is a widely used technique to adapt the color characteristics of video material to the display technology employed (e.g. CRT, LCD, projection) or to create a certain artistic look. As color correction is often an interactive task and colorists need a direct response, state-of-the-art real-time color correction systems for video are so far based on expensive dedicated hardware. This submission shows the feasibility of replacing dedicated color correction systems with general-purpose GPUs. It is shown that a single Tesla C2050 GPU supports real-time color correction up to a resolution of 4096x2048 pixels.
Author: Klaus Gaedke (Technicolor)

V02 - 3D Object Detection in Digital Holographic Microscope Images
Digital Holographic Microscopy (DHM) is based on the classical holographic principle invented by Hungarian physicist Dennis Gabor. The holographic images are acquired by a CCD camera. Depth slices can be reconstructed using the Fourier transform. The numerical reconstruction and further image processing for object detection is done using General Purpose Graphical Processor Units (GPGPU).
Author: Vilmos Szabo (Pazmany Peter Catholic University)
SPEAKERS AND PANELISTS

Abbas-Turki, Lokman A.
PhD Student in Applied Mathematics (Paris-Est University)
SUPELEC Engineer in Signal Processing and MS degree in Applied Mathematics and Mathematical Finance. My research interests are High-Performance Computing, Mathematical Finance, Economics, and Statistics.
Session(s): 2101 - Pricing American Options Using GPUs (Thursday, Sept 23, 16:30)

Adler, David
Principal Software Engineer (Walt Disney Animation Studios)
Mr. Adler is a Principal Software Engineer at Walt Disney Animation Studios, where he develops software such as a GPU-enhanced viewport for Maya for the Layout and Animation production departments. Mr. Adler has 17 years of experience developing software for 2D and 3D animated motion picture production.
Session(s): 2285 - Walt Disney Animation Studios’ GPU-Accelerated Animatic Lighting Process with Soft Shadows and Depth of Field (Wednesday, Sept 22, 17:00)

Albuz, Elif
(NVIDIA)
Elif Albuz joined NVIDIA 6 years ago to design software for video acceleration on GPUs. She is currently leading CUDA FFT Library development. At NVIDIA, she designed video encoder/decoder and post-processing algorithms and architected error resiliency and various parts of the video encoder. Before NVIDIA, she worked at Sony Electronics leading the DVD firmware team. She holds a master's degree from the University of Delaware with a focus on Multimedia Processing on Parallel Architectures. Her expertise is in video codecs, video processing, parallel algorithm design and low-level optimizations.
Session(s): 2216 - CUDA Libraries Open House (Wednesday, Sept 22, 11:00)

Anderson, Joshua
Research Area Specialist (University of Michigan)
Joshua Anderson is a Research Area Specialist in the Laboratory for Computational Nanoscience & Soft Matter Simulation at the University of Michigan. Dr. Anderson holds a Ph.D. degree in Condensed Matter Physics from Iowa State University and is the lead developer of HOOMD-blue, a high performance particle simulation tool. His current research interests include GPU computing, polymer physics, and nanoparticle self-assembly.
Session(s): 2062 - HOOMD-blue: Fast and Flexible Many-Particle Dynamics (Thursday, Sept 23, 15:00)

Anselm, Christoffer
Software Developer (Jedox Business Intelligence)
Christoffer Anselm is a software developer trainee at Jedox Business Intelligence, Germany.
Session(s): 2237 - Accelerating Business Intelligence Applications with Fast Multidimensional Aggregation (Wednesday, Sept 22, 15:00)

Aoki, Takayuki
Professor (Tokyo Institute of Technology)
Professor Aoki is a deputy director of the Global Scientific Information and Computing Center (GSIC), Tokyo Institute of Technology. His research area is Computational Fluid Dynamics (CFD) and he has been developing high-accuracy numerical schemes. He is a leader of GPU computing for large-scale CFD and organizes the GPU Computing Study Group with 500 members and GPU sessions in many conferences. Last year, he published a CUDA programming textbook in Japanese.
Session(s): 2295 - Large-scale CFD Applications and a Full GPU Implementation of a Weather Prediction Code on the TSUBAME Supercomputer (Tuesday, Sept 21, 16:00)

Araya, Mauricio
Senior Researcher (Barcelona Supercomputing Center)
Mauricio Araya Polo received the engineering degree in computer science in 2001 from the University of Chile. He received the master's (2003) and PhD (2006) degrees from the University of Nice - Sophia-Antipolis at the Institut National de Recherche en Informatique et Automatique (INRIA), France. Since 2007 he has been a researcher, and since 2009 a senior researcher, in Computational Geophysics at the Barcelona Supercomputing Center (BSC). His research interests cover the areas of multi-core architectures and programming models, numerical algorithms, and code optimization techniques for HPC.
Session(s): 2226 - Reverse Time Migration with GMAC (Wednesday, Sept 22, 16:00)

Archer, Bob
Senior Computer Scientist (Adobe Systems Inc)
Bob has been at Adobe for 6 years and is the lead for the Pixel Bender language and compiler. Prior to Adobe he worked on 3D simulation, virtual reality systems and computer games.
Session(s): 2053 - Pixel Bender: Building a Domain Specific Language on the GPU (Thursday, Sept 23, 10:00)

Aubert, Dominique
Lecturer (Strasbourg University)
D. Aubert is a lecturer at the University of Strasbourg, France, and is a member of the Astronomical Observatory. His research covers topics such as cosmology, the early Universe and the formation of galaxies. To this aim, he uses and develops applications for the numerical simulation of astrophysical phenomena such as the gravitational N-Body problem, the transfer of light through the large scale structures of the Universe or the dynamics of astrophysical fluids. Lately he ported such applications to the large-scale multi-GPU cluster provided by the Titane supercomputer in collaboration with the CEA, France. In 2010, these numerical investigations on GPUs led to his being chosen as one of the “Young Astrophysicist of the Year” by the French Astrophysical Society.
Session(s): 2099 - Cosmology Powered by GPUs Redux (Wednesday, Sept 22, 11:00)

Augonnet, Cedric
PhD Candidate (INRIA)
Cédric Augonnet received his bachelor's degree from the École Normale Supérieure in Lyon and his master's degree from the Vrije Universiteit of Amsterdam and Bordeaux University. His research activities mainly focus on high performance computing on heterogeneous multicore architectures. He is currently a PhD candidate in the Runtime team at INRIA Bordeaux.
Session(s): 2160 - StarPU: a Runtime System for Scheduling Tasks (Wednesday, Sept 22, 10:00)

Ayala, Hugo
Sr. Research Associate (Blue Sky Studios)
Hugo Ayala is a Senior Research Associate at Blue Sky Studios, where he has worked on the films “Robots”, “Ice Age: The Meltdown”, “Horton Hears a Who”, “Ice Age: Dawn of the Dinosaurs”, and the upcoming film,
“Rio”. His contributions range from the occasional fluid simulation and effect, to the design and implementation of footprint and crowd workflows. He works closely with animators and special effects artists to develop production related tools. Hugo studied Mechanical Engineering at the Massachusetts Institute of Technology, from where he holds bachelor's, master's, and doctorate degrees. He studied Screen Writing and Acting at the Harvard Extension School, and studied Studio Art at the Museum of Fine Arts in Boston. In addition to Blue Sky Studios, Hugo has worked as a software developer at Apple and Boris FX.
Session(s): 2072 - GPUs at the Computer Animation Studio (Wednesday, Sept 22, 16:00)

Ayres, Daniel
PhD Candidate (University of Maryland)
Daniel Ayres is a Ph.D. student researching computational algorithms related to molecular evolution at the Center for Bioinformatics and Computational Biology at the University of Maryland. A major focus of his research is on GPU-computing of likelihood calculations in phylogenetic analysis. He has a Bachelor of Science in Computer Engineering from the University of Illinois at Urbana-Champaign where he studied topics such as scientific computing, algorithms, networks, and computer graphics.
Session(s): 2203 - Modeling Evolution Computing the Tree of Life (Thursday, Sept 23, 11:00)

Badesha, Amolak
Senior Application Expert & Strategist (Agilent Technologies)
Amolak Singh Badesha is a Senior Application Expert and Market Strategist with Agilent Technologies. He has BSEE and MSEE degrees with a focus on High-frequency Wireless and Digital Design. Amolak plays a critical role in defining the roadmap and delivering new technologies for Agilent’s High-Speed Digital Silicon and System customers. He is also a GPU evangelist at Agilent, and strongly believes in the future of highly parallel programming and hardware. Amolak has published many papers on Wireless and High-Speed Digital design. He has received several awards from Agilent and external customers for outstanding performance and strategic vision.
Session(s): 2080 - Tackling Multi-Gigabit Design Challenges with a Practical Virtual EMI/ESD Lab (Wednesday, Sept 22, 15:00)

Bailey, Dan
R&D (Double Negative)
Dan graduated from the University of Bristol in the UK with a First Class Master of Engineering degree in Computer Science in the Summer of 2007. He worked in Research and Development at The Moving Picture Company in London for two years, concentrating predominantly on improving and extending the existing pipeline. He has been working in Research and Development on the proprietary fluid solver at Double Negative in London for the last year, focusing heavily on GPU development.
Session(s): 2239 - Fast GPU Preconditioning for Fluid Simulations in Film Production (Tuesday, Sept 21, 17:00)

Bajarin, Tim
President (Creative Strategies)
Tim Bajarin is recognized as one of the leading industry consultants, analysts and futurists, covering the field of personal computers and consumer technology. Mr. Bajarin has been with Creative Strategies since 1981 and has served as a consultant to most of the leading hardware and software vendors in the industry including IBM, Apple, Xerox, Hewlett Packard/Compaq, Dell, AT&T, Microsoft, Polaroid, Lotus, Epson, Toshiba and numerous others. His articles and/or analysis have appeared in USA Today, Wall Street Journal, The New York Times, Time and Newsweek magazines, BusinessWeek and most of the leading business and trade publications.
Session(s): 4010 - Emerging Companies: CEO on Stage featuring NaturalMotion Ltd, OptiTex, and Useful Progress (Thursday, Sept 23, 15:00)
4011 - Emerging Companies: CEO on Stage featuring Cinnafilm Inc., Perceptive Pixel, and Total Immersion (Thursday, Sept 23, 16:00)

Barba, Lorena
Assistant Professor (Boston University)
Dr. Barba is a computational scientist and a fluid dynamicist. Her research covers particle methods used for fluid simulation, the development of fast and efficient algorithms, the use of novel computer architectures, as well as fundamental and applied fluid dynamics. She obtained her PhD (2004) in Aeronautics from CALTECH, then joined the Dept. of Mathematics at the University of Bristol, UK. Since Fall of 2008, she has been an Asst. Professor of Mechanical Engineering at Boston University.
Session(s): 2166 - The Triad of Extreme Computing-Fast Algorithms, Open Software and Heterogeneous Systems (Wednesday, Sept 22, 10:00)

Barnell, Mark
HPC Director (Air Force Research Lab)
Mr. Mark Barnell has more than 24 years of experience in HPC and is the Air Force HPC Director for Advanced Computing Architectures. He works at the Information Directorate of the Air Force Research Laboratory in Rome, NY. His area includes high performance computers, Urban ISR (SAR), distributed and next generation architectures. Current projects include design, build and integration of the 500 TFLOPS heterogeneous Condor Cluster.
Session(s): 2283 - 500 Teraflops Heterogeneous Cluster (Thursday, Sept 23, 16:00)

Barrett, Simon
Senior Software Engineer (NVIDIA)
Simon Barrett is a Senior Software Engineer in the Developer Tools group at NVIDIA and is part of the Parallel Nsight development team. Previously, he worked on NVIDIA’s PerfHUD, the NVIDIA Display Driver, Sony PlayStation 3, and at Transmeta on Code Morphing Software.
Session(s): 2212 - Parallel Nsight for Accelerated DirectX 11 Development [Advanced] (Tuesday, Sept 21, 17:00)

Bednarz, Tomasz
3D Visualisation Software Engineer (CSIRO)
He received his B.Sc. in 2001 and M.Sc. in 2002 in Technical Physics from the Faculty of Physics and Applied Computer Science, AGH University of Science and Technology, Poland. In 2005 he obtained his Ph.D. in Energy and Environmental Engineering from the Interdisciplinary Graduate School of Engineering Sciences, Kyushu University, Japan. From 2005, he worked as Research Associate and Senior Research Fellow at the School of Engineering & Physical Sciences, James Cook University in Townsville, Australia. Currently, he works as 3D Visualization Software Engineer for CSIRO Earth Science and Resource Engineering in Brisbane, Australia.
Session(s): 2058 - A Practical Introduction to Computational Fluid Dynamics on GPUs (Wednesday, Sept 22, 10:00)

Bell, Nathan
Research Scientist (NVIDIA Research)
Nathan Bell joined NVIDIA Research in August 2008. His current research interests include sparse linear algebra and programming models for parallel computing. Nathan contributes to several open source projects including Thrust, a high-level parallel template library, and PyAMG, a library of algebraic multigrid methods in Python. Nathan received a bachelor’s degree in Computer Science from Georgia Tech and a Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign (UIUC).
Session(s): 2219 - High-Productivity CUDA Development with the Thrust Template Library (Thursday, Sept 23, 11:00)

Belloch, Jose Antonio
PhD Candidate (Institute of Telecommunications and Multimedia Applications, Universidad Politecnica de Valencia)
Jose A. Belloch was born in Requena, Spain (1983). He received the degree in Electrical Engineering from the Universidad Politécnica de Valencia, Spain (2007). In 2008, he worked for the company Getemed (Teltow, Germany) as a software developer. In 2009, he enrolled in a PhD program at the Institute of Telecommunications and Multimedia Applications inside Universidad Politécnica de Valencia. His research interests are focused on Audio Signal Processing on the CUDA environment.
Session(s): 2116 - Real-time Multichannel Audio Convolution (Thursday, Sept 23, 10:00)

Bernaschi, Massimo
Professor (Istituto Applicazioni del Calcolo - C.N.R.)
Massimo Bernaschi has spent ten years with IBM mostly working in the field of parallel and distributed computing. Currently he is Chief Technology Officer at the Istituto per le Applicazioni del Calcolo “M. Picone” of C.N.R. (National Research Council) and Adjunct Professor of Computer Science, “La Sapienza” University, Rome. He is the author of more than 120 papers in international journals and proceedings of international conferences.
Session(s): 2112 - The Heisenberg Spin Glass Model on GPU: Myth versus Fact (Tuesday, Sept 21, 11:00)

Bigdeli, Abbas
Senior Researcher and Technology Manager (NICTA)
Abbas is currently a Senior Researcher and Technology Manager for the Advanced Surveillance Project at the National ICT Australia lab. He has collaborated on industrial projects with various companies in New Zealand, Australia and the USA. He has more than 10 years of experience in consultancy, scientific research and technology leadership in the areas of digital signal and image processing, computer architecture, and information security. He has published over 60 papers in journals, book chapters and refereed international conferences. He is an inventor on 4 patents in the areas of information security and computer vision.
Session(s): 2173 - Enabling Large-Scale CCTV Face Recognition (Thursday, Sept 23, 11:00)

Blackman, Sam
CEO and Co-Founder (Elemental Technologies, Inc.)
Sam brings extensive management experience and video processing expertise to the Elemental team as chief executive officer. In a few short years, Sam has grown Elemental into the leading provider of GPU-accelerated video processing solutions, building innovative products used by leading video content providers. Elemental™ Live, the first ever massively parallel live video processing system, allows for simultaneous encoding of video streams, targeting the comprehensive specifications required for the four-screen experience of TV, PC, tablet and mobile. Prior to co-founding Elemental in 2006, Sam specified and architected next-generation products as an IC design manager for Pixelworks. He spent time in China organizing the company’s Shanghai design center and was responsible for a wide variety of functional blocks on six ImageProcessor ICs. Prior to joining Pixelworks in 2000, Sam held engineering positions at Silicon Graphics and Intel Corporation. Due to a growing reputation for deep industry knowledge, Sam has been tapped for presentations and contributions to leading trade and news organizations.
Session(s): 4001 - Emerging Companies: CEO on Stage featuring Elemental Technologies, Inc., Geomerics, and Milabra (Wednesday, Sept 22, 11:00)

Bleiweiss, Avi
Principal Architect (NVIDIA Corporation)
Avi Bleiweiss joined NVIDIA Corporation in 2007 as a member of the architecture group with his main role of leveraging GPU computing to accelerate game AI workloads. He has 25 years of R&D experience in the development of high end graphics systems. Previously, he worked for AMD/ATI where he led the software development of ASHLI, a GPU shading toolkit, and frameworks evaluating GPU performance for game physics and ray tracing. Before that he was a principal engineer at Silicon Graphics, responsible for simulator and driver implementation of a programmable geometry engine. Formerly, he was a distinguished engineer at Kubota Graphics Computer, overseeing the architecture of Denali’s rasterization subsystem. And earlier, Mr. Bleiweiss was a visiting scientist at Hewlett Packard (HP) Laboratories, where he collaborated with Prof. Don Greenberg on a design of HP Precision Architecture’s coprocessor for realizing radiosity algorithms. He contributed numerous publications to Siggraph and Graphics Hardware conferences.
Session(s): 2207 - Playing Zero-Sum Games on the GPU (Wednesday, Sept 22, 11:00)

Blewitt, Chris
CFO (miGenius)
Combining deep knowledge and experience in the development and implementation of ‘cutting edge’ 3D systems and physically based rendering technologies, Chris has acquired a unique perspective and innate understanding of how these two diverse fields can be brought together most effectively for the diverse range of users needing these capabilities. Chris has over 25 years of experience in developing effective and efficient solutions for the wider needs of the non-technical user.
Session(s): 4002 - Emerging Companies: CEO on Stage featuring Allegorithmic SAS, Bunkspeed, and miGenius (Wednesday, Sept 22, 14:00)

Bodin, Francois
CTO (CAPS entreprise)
Francois Bodin cofounded CAPS (www.caps-entreprise.com) in 2002 while he was a Professor at University of Rennes I, and in January 2008 he joined the company as CTO. His contribution includes new approaches for exploiting high performance processors in scientific computing and in embedded applications. Professor Francois Bodin holds a Master’s in CS and a PhD in CS, both from University of Rennes I.
hh Session(s): 2117 - Migration of C and Fortran Apps semiconductors. Fifteen years ago he moved to Gartner/
to GPGPU using HMPP (Wednesday, Sept 22, 11:30) Dataquest and started tracking the semiconductor
market. In 1998 he established Insight 64, where
Bradley, Thomas he serves as the research fellow, concentrating on
Developer Technology Engineer (NVIDIA) CPUs and GPUs used in general purpose computing
Thomas Bradley MEng(Hons) MIET graduated with applications. Nathan is widely quoted in the general and
a first-class MEng degree in Computer Systems trade press on topics relating to the semiconductor industry.
Engineering from the University of Bristol, UK, He has degrees from MIT and the Harvard Business
in 2000. He also completed the final year of the School, and resides in Saratoga, CA, where he strives
Diplôme d’Ingénieur at l’École Nationale Supérieure to keep his home network with roughly a dozen nodes
de Télécommunications in Brest, France. He has running smoothly.
led architecture development for video encoding hh Session(s): 4004 - Emerging Companies: CEO
processors for general purpose parallel processors at on Stage featuring Cooliris, empulse GmbH,
STMicroelectronics and ClearSpeed. Since then, he has and Playcast (Wednesday, Sept 22, 16:00)
specialized in High Performance Computing software hh 4005 - Emerging Companies: CEO on Stage
development at ClearSpeed and at NVIDIA. featuring Jedox Business Intelligence, Rocketick,
hh Session(s): 2064 - Correlated Paths for Monte and Softkinetic (Wednesday, Sept 22, 17:00)
Carlo Simulations (Thursday, Sept 23, 15:00)
Brown, Christopher
Brandvik, Tobias Partner (Open Data)
PhD Student (University of Cambridge) Christopher Brown is a Partner at Open Data and has
Tobias Brandvik is a PhD student at the Whittle been with the company since January 2005. He has over
Laboratory under the supervision of Dr Graham Pullan. ten years of experience leading development of machine
He obtained his MEng degree from the University of algorithms in the large data space. He received a B.S.
Cambridge in 2007. Tobias’s current PhD research from the University of California at Berkeley, an M.S. from
is focused on how to best use emerging multi-core the University of California at Santa Barbara and has
architectures for scientific computing. This includes both done post graduate study at U.C. Berkeley’s Haas School
creating tools to ease the porting of legacy applications, of Business and Stanford University.
and investigating the possibilities offered by the greater hh Session(s): 2179 - GPU - An R Library for
computational power of multi-core processors in real Native GPU Objects (Tuesday, Sept 21, 16:00)
design settings. His main area of research has been the
development of the Turbostream solver and the software Brown, Shawn
framework that enables the solver to run on many Graduate Student (UNC, Chapel Hill)
different multi-core processors. Now that the majority of
this work has been completed, Tobias’s focus is on novel Shawn Brown received his B.S. in Comp. Sci & Math
uses of the solver for turbomachinery simulations. at BYU in 1992. He worked as a Software Developer at
Microsoft from 1994 to 1998. From 1998-2001, he was
hh Session(s): 2118 - Large-scale Gas a software developer for expedia.com, until he was
Turbine Simulations on GPU Clusters promoted to the Lead Developer in 2001. He then worked
(Wednesday, Sept 22, 16:00) as the Developer Manager from 2002 to 2004. Currently,
he is a grad student at UNC.
Brockway, Tad
Product Unit Manager (Microsoft) hh Session(s): 2140 - Superfast Nearest
Neighbor Searches Using a Minimal kd-
Tad Brockway manages the RemoteFX engineering team tree (Wednesday, Sept 22, 14:00)
for Microsoft. RemoteFX is a set of RDP technologies
SPEAKERS AND

- most prominently graphics virtualization and the use Buck, Ian

of advanced codecs - that are being added to Windows
PANELISTS

Software Director of GPU Computing (NVIDIA)

Server 2008 R2 Service Pack 1. These technologies are
based on the IP that Microsoft acquired and continued Ian Buck joined NVIDIA and started the CUDA team
to develop since acquiring Calista Technologies. Tad six years ago alongside two others, and has had the
has been an engineer and manager in the desktop wonderful pleasure of watching CUDA grow and totally
virtualization space for Microsoft since 1998. change the world of high performance computing.
Before joining NVIDIA, Ian was the development lead
hh Session(s): 2243 - Microsoft RemoteFX - GPU on Brook which was the forerunner to generalized
Virtualization for Desktop Centralization computing on GPUs.
(Wednesday, Sept 22, 17:00)
hh Session(s): 2275 - The Evolution of
Brodtkorb, André Rigland GPUs for General Purpose Computing
Research Scientist (SINTEF ICT) (Wednesday, Sept 22, 11:00)

André R. Brodtkorb recently submitted his Ph.D. Buckingham, Peter

thesis entitled Scientific Computing on Heterogeneous Tesla Software Manager (NVIDIA)
Architectures, where the efficient use of GPUs has
been central. He is now a research scientist at SINTEF, Peter has had a long history of working in System
Scandinavia’s largest independent research organization, Software development. He has worked on everything
and works with different aspects of GPU computing. from low-level embedded RTOSes to large scale
supercomputers and distributed storage systems.
hh Session(s): 2102 - Evacuate Now? Faster- Currently Peter leads Tesla Software at Nvidia which
than-real-time Shallow Water Simulation is responsible for supporting Data Center and Cluster
on GPUs (Tuesday, Sept 21, 17:00) deployments of GPUs and improving the software
ecosystem for large scale Tesla deployments.
Brookwood, Nathan
Research Fellow (Insight 64) hh Session(s): 2225 - Tools for Managing Clusters
of NVIDIA GPUs (Tuesday, Sept 21, 17:00)
Nathan Brookwood has participated in the information
technology industry since the days of the first
transistorized computers, and has worked with
mainframes, minicomputers, personal computers and
Budavari, Tamas the U.K. Following six years in the commercial sector,
Research Scientist (Johns Hopkins University) where he led the market transition from proprietary
Tamas Budavari is a Research Scientist in the SMP systems to commodity cluster-based solutions, Dr.
Department of Physics and Astronomy at The Johns Calleja returned to academia to lead the formation of
Hopkins University, where he focuses on statistical and a new HPC service at Imperial College, London. From
computational challenges in astroinformatics. there, he moved to Cambridge University, to form a new
HPC service and to direct a major reorganization that
hh Session(s): 2092 - Integrating CUDA into a has resulted in University-wide HPC capabilities with a
Large-Scale Commercial Database Management novel pay per use cloud computing model. Cambridge
System (Wednesday, Sept 22, 11:00) University now boasts one of the largest academic
supercomputers in the U.K., occupying 20th position
Buisson, Emmanuel among the top 500 when first installed. Dr. Calleja sits on
CEO (Numtech) numerous national and international HPC committees
hh Session(s): 2037 - Numtech & GPGPU, a SME and advisory boards, as well as being a founding member
Point of View (Thursday, Sept 23, 09:30) of the U.K. HPC Special Interest Group.
hh Session(s): 2265 - CUDA Centers of Excellence
Burg, Yoram Super-Session IV (Tuesday, Sept 21, 17:00)
President (OptiTex)
With over 20 years of entrepreneurial experience that Campbell, Dan
spanned across Asia, Middle East and North America, Research Engineer (Georgia Tech Research Institute)
Mr. Burg brings with him experience in multiple aspects Dan Campbell is a Principal Research Engineer at
of running multi-million dollar organizations. Applying GTRI, and leads its High Performance Computing
experience in Information Technology, Purchasing, Branch. His work focuses on improving the ease of
Business Planning, Project and Change Management use, and increasing the deployment of emerging high
to leading international teams on marketing performance computing platforms, including GPUs.
implementations and support, holding various senior He currently leads the Applications, Benchmarks, and
positions in technology and non-technology businesses. Metrics team on DARPA’s Ubiquitous High Performance
Mr. Burg is a Graduate of the MEI (MBA) program by Computing program, and is co-chair of the VSIPL (pr:
SUT. vee-sip-uhl) Forum. His recent work includes the 2007-
hh Session(s): 4010 - Emerging Companies: CEO 2009 DARPA exascale requirements studies, automated
on Stage featuring NaturalMotion Ltd, OptiTex, compiler evaluation, and the GPU VSIPL software library.
and Useful Progress (Thursday, Sept 23, 15:00) hh Session(s): 2126 - Accelerating Signal
Processing: Introduction to GPU
Bussmann, Michael VSIPL (Thursday, Sept 23, 16:00)
Junior Group Leader Computational Radiation Physics
(Forschungszentrum Dresden-Rossendorf) Capuzzo-Dolcetta, Roberto
Michael Bussmann studied physics at the Ludwig- Professor (Sapienza Univ. of Roma)
Maximilians University and Munich looking for the Roberto Capuzzo Dolcetta took his MS in Mathematics
Higgs boson without finding it, then stayed in Munich and his PhD in Physics in Rome. He has been an
for his PhD on laser cooling of relativistic ion beams, associate professor at the University of Roma La
this time with success. In 2008 he joined the Laser Sapienza in Astronomy and Astrophysics since 2000.
Particle Radiation Group at the FZD in Dresden. Here he He is an expert in theoretical and computational
started a project to develop a particle-in-cell algorithm astrophysics and is a coordinator of the PhD program
for GPUs. Since August 2010 he has been the head of the in Astronomy. He is also a member of the board of

Computational Radiation Physics Group at the FZD. physicists in the Consiglio Universitario Nazionale.
hh Session(s): 2090 - Developing Highly Scalable hh Session(s): 2000 - Gravitational N-body

Particle-Mesh Codes for GPUs: A Generic Simulations: How Massive Black Holes Interact
Approach (Tuesday, Sept 21, 15:00) with Stellar Systems (Wednesday, Sept 22, 14:00)

Cabezas, Javier Carmel, Charles

Researcher (Barcelona Supercomputing Center) Vice President, Corporate Business Development (Cisco)
Javier Cabezas received a bachelor’s degree in Charles Carmel is the Vice President of Corporate
Computer Science and a master’s degree in Computer Business Development at Cisco. In this role, he leads
Architecture from Universitat Politècnica de Catalunya a team responsible for setting and executing Cisco’s
(UPC). Since 2006, he has been a PhD student in worldwide acquisition and venture capital investment
the Computer Architecture Department at UPC. His strategy. Charles has been a driving force in defining
research is focused on operating system support for Cisco’s acclaimed business development activities
heterogeneous massively-parallel computing systems which focus on accelerating the company’s growth by
and massively-parallel accelerators. entering new markets and integrating innovative, new
hh Session(s): 2226 - Reverse Time Migration technologies into Cisco’s core businesses. Charles has
with GMAC (Wednesday, Sept 22, 16:00) led the development and execution of more than 20
acquisitions totaling over $15 billion during his time
Calleja, Paul at Cisco. These include some of Cisco’s largest and
Director of High Performance Computing (University of most successful transactions such as Linksys and Pure
Cambridge) Digital, which propelled Cisco into a leading role for the
Consumer technology market; Scientific Atlanta, which
Dr. Paul Calleja is Director of High Performance positioned Cisco as the leader in the delivery of digital
Computing at Cambridge University, where he video; and Webex and Tandberg, which established
provides research computing services across all Cisco’s leadership in the Collaboration market.
academic disciplines. Dr. Calleja obtained his Ph.D. in Transactions led by Charles have resulted in more than
computational bio-physics at Bath University. After filling $10 billion in incremental revenue to Cisco since 2002.
a post-doctoral research position at Birkbeck College, In addition to his work on acquisitions, Charles has
he moved into private industry, where he spearheaded helped accelerate Cisco’s role as a leading Corporate
early commercialization of HPC cluster solutions within Venture Capitalist with responsibility for managing
Cisco’s broad based $1.5 billion venture portfolio. He His research interests include computer architectures
has led investments across many segments and stages and parallel programming models.
which have created strategic and financial returns for hh Session(s): 2177 - Simplifying Parallel
Cisco and has been an active participant on the board of Programming with Domain Specific
directors for a number of portfolio companies. Prior to Languages (Wednesday, Sept 22, 11:00)
joining Cisco in 2001, Charles was an investment banker
at Goldman Sachs where he was an early member of the Chatterjee, Debapriya
Technology Investment Banking group. During his time Graduate student (University of Michigan)
at Goldman, Charles was involved with over $2.5 billion
of financings and M&A transactions across a diverse set Debapriya Chatterjee is a Ph.D. candidate working with
of industry leading technology companies. Charles holds Prof. Valeria Bertacco in the Electrical Engineering and
an MBA from Stanford’s Graduate School of Business Computer Science Department at University of Michigan,
and a Bachelor’s degree from Tufts University. Ann Arbor. His research focuses on validation and
verification solutions for industry-scale digital designs.
hh Session(s): 4004 - Emerging Companies: CEO His solutions entail both novel and adaptable semi-
on Stage featuring Cooliris, empulse GmbH, formal verification methods and the use of massively
and Playcast (Wednesday, Sept 22, 16:00) parallel multi-core platforms to boost the performance
hh 4005 - Emerging Companies: CEO on Stage of key validation applications. Debapriya holds a B.
featuring Jedox Business Intelligence, Rocketick, Tech. degree from the Indian Institute of Technology,
and Softkinetic (Wednesday, Sept 22, 17:00) Kharagpur, India and an MS degree from the University
of Michigan; both degrees are in Computer Science and
Castonguay, Patrice Engineering.
PhD Candidate (Stanford University) hh Session(s): 2306 - Gate-Level Simulation
Mr. Castonguay is a Ph.D. candidate in the Aeronautics with GP-GPUs (Wednesday, Sept 22, 10:00)
and Astronautics department at Stanford University
working in the Aerospace Computing Laboratory under Chen, Doris
the supervision of Professor Antony Jameson. The Student (University of Toronto)
Aerospace Computing Lab focuses on developing more Doris received her M.A.Sc and B.A.Sc degrees in
efficient and robust algorithms for modeling fluid Computer Engineering from the
dynamics. Mr. Castonguay is interested in developing University of Waterloo in 2007 and 2005 respectively. She
efficient high-order methods for the Navier-Stokes currently works at Altera’s Toronto Technology Center
equations. More specifically, his research interests on advanced algorithms in the fields of device modeling,
include developing stable and efficient high-order CAD optimizations and logic synthesis. She has
methods for mixed grids. authored or co-authored 6 papers in FPGA CAD and
hh Session(s): 2079 - A Fast, Scalable High- device reliability.
Order Unstructured Compressible Flow hh Session(s): 2068 - Parallelizing FPGA Technology
Solver (Tuesday, Sept 21, 11:00) Mapping using GPUs (Wednesday, Sept 22, 14:00)
Catanzaro, Bryan Chen, Hanning
PhD Candidate (University of California, Berkeley) Research Associate (Northwestern University)
Bryan Catanzaro received his BS and MS degrees
from Brigham Young University, and is currently a PhD hh Session(s): 2128 - Hybrid Quantum Mechanics/
candidate at the University of California, Berkeley. His Electrodynamics (QM/ED) Modeling of Solar Cells
interests center on programming models for manycore on a CUDA Cluster (Wednesday, Sept 22, 17:00)
computers, with an applications driven emphasis.
Cheung, Mark
hh Session(s): 2050 - Copperhead: Data-Parallel Physicist (Lockheed Martin Solar & Astrophysics
Python for the GPU (Wednesday, Sept 22, 15:00) Laboratory)

Cebenoyan, Cem
Senior Manager, Developer Technology (NVIDIA) A senior physicist at Lockheed Martin Solar &
Astrophysics Laboratory, Mark Cheung’s research
Cem manages the Developer Technology teams in North focuses on understanding astrophysical processes
America, China, and Japan. When not in meetings, he occurring in the Sun. He specializes in using radiation
spends his time coming up with graphics optimizations magnetohydrodynamics simulations of the solar
and techniques. Before joining NVIDIA, he was a student/ atmosphere to aid interpretation of solar observations.
research assistant in the Graphics, Visualization, and His scientific work supports a number of NASA-funded
Usability Lab at Georgia Tech. solar missions, including the newly launched Solar
hh Session(s): 2157 - DirectX 11 Overview (Pre- Dynamics Observatory (SDO), Hinode, the Transition
Conference Tutorial) (Monday, Sept 20, 13:00) Region and Coronal Explorer (TRACE) and the upcoming
Interface Region Imaging Spectrograph (IRIS).
Cervantes-Pimentel, Ulises hh Session(s): 2178 - Using GPUs to Track Changes
Senior Kernel Developer (Wolfram Research) in the Sun (Wednesday, Sept 22, 17:00)
Ulises Cervantes-Pimentel is the Visualization Senior
Kernel Developer at Wolfram Research, Inc. He received Chiu, Ting-Wai
his Master’s and PhD at UIUC in Applied Mathematics Professor (National Taiwan University)
and has worked at Wolfram Research improving the Ting-Wai Chiu is the group leader of the TWQCD
plotting and scientific visualization capabilities of collaboration in Taiwan. They are using a GPU cluster
Mathematica. of more than 200 NVIDIA GPUs (GTX480/C2050/S1070/
hh Session(s): 2028 - Mathematica for GPU C1060/GTX285) to perform a Monte Carlo simulation of
Programming (Tuesday, Sept 21, 14:00) lattice QCD, with 200 Tflops (peak)/36 Tflops (sustained).
Their aim is to try to understand the nonperturbative
Chafi, Hassan aspects of QCD.
PhD Candidate (Stanford University) hh Session(s): 2265 - CUDA Centers of Excellence
Hassan Chafi is a fourth year PhD candidate in the Super-Session IV (Tuesday, Sept 21, 17:00)
Electrical Engineering dept at Stanford University.
hh 2217 - GPU-Based Conjugate Gradient Solvers hh Session(s): 2135 - Processing Petabytes
for Lattice QCD (Wednesday, Sept 22, 16:00) per Second with the ATLAS experiment
at the Large Hadron Collider at CERN
Chong, Jike (Wednesday, Sept 22, 16:00)
Principal Software Architect (Parasians, LLC)
Jike Chong is the Founder and Chief Software Architect Clegg, Don
of Parasians LLC (Parallel Computing Artisans), which (Supermicro)
specializes in helping clients in compute-intensive hh Session(s): 2293 - Scaling Up and Scaling Out GPUs
industries achieve revolutionary performance in with Supermicro’s Twin™ Architecture (Sponsored
applications directly affecting revenue/cost with GPUs. by Supermicro) (Wednesday, Sept 22, 11:00)
Jike’s prior industry work in parallel computing led to
three first-authored US patents at Sun Microsystems Cohen, Jonathan
and Intel Corporation. He is a Ph.D. researcher at Senior Research Scientist (NVIDIA Research)
University of California, Berkeley, with B.S. and M.S. in
Electrical and Computer Engineering from Carnegie Jonathan Cohen is a Senior Research Scientist with
Mellon University. NVIDIA, where he develops methods for using GPUs for
scientific computing and physical simulation. Prior to
hh Session(s): 2098 - Enabling On Demand joining NVIDIA, he spent several years working in the
Value-At-Risk for Financial Markets Hollywood feature film visual effects industry where he
(Thursday, Sept 23, 11:00) was awarded an Academy Award (Technical Achievement
hh 2046 - Efficient Automatic Speech Recognition Award) in 2007 for his work on fluid simulation and
on the GPU (Thursday, Sept 23, 15:00) volumetric modeling for visual effects. He received an
undergraduate degree from Brown in Mathematics and
Chunev, Georgi Computer Science.
Research Assistant (Indiana University) hh Session(s): 2023 - Processing Device Arrays with
hh Session(s): 2093 - Computational C++ Metaprogramming (Thursday, Sept 23, 11:00)
Photography: Real-Time Plenoptic hh 2022 - Solving PDEs on Regular Grids with
Rendering (Wednesday, Sept 22, 16:00) OpenCurrent (Tuesday, Sept 21, 16:00)

Civario, Gilles Cole, Brian

Head of Capability Computing and Novel Architecture Developer (OpenEye Scientific Software)
Group (ICHEC) Brian was an undergraduate at Temple University while
Gilles is the Head of ICHEC’s Capability Computing doing part-time work at Wyeth. He claimed he did Aikido
and Novel Architecture group, and is also involved in and after seeing him throw people around it was obvious
the broader aspects of the Centre’s mission where his he had a future at OpenEye. Oh, he does do some
expertise is useful in areas such as highly complicated programming and many claim he is gifted in that area.
code installation, debugging or optimisation and At least those who don’t want to be thrown around the
hardware evaluation. After completing two Master’s room. He has a plan to go back for his PhD.
degrees in Scientific Computing and Algorithms, Gilles hh Session(s): 2278 - Strategies for Code
joined the R&D team of EDF and was responsible Encapsulation in GPU Implementations
for developing and maintaining nuclear power plant (Thursday, Sept 23, 09:00)
simulation codes, in collaboration with the CEA. From
there he worked as a support scientist with CEA/CCRT, Collins, Simon
one of the largest HPC centres in Europe. For more Product Manager (GE Intelligent Platforms)

information, please visit: http://www.ichec.ie/about_us/
gilles_civario Simon has been Product Manager for video and graphics
products at GE Intelligent Platforms since 1998,

hh Session(s): 2086 - GPGPU DL_POLY during which time he has maintained the company’s
(Thursday, Sept 23, 16:00) leading position at the forefront of leading-edge, high-
performance commercial technology applied to the
Clark, Calvin rugged defense and aerospace market. Traditional
Senior Consultant (Microsoft) graphics applications have seen the company’s
Calvin Clark has been at Microsoft for 14 years. He is products deployed into diverse applications such as
a Senior Consultant in the Application Development cockpit displays in fighter jets, mission computers in
Consulting Group. Since 2006, his focus has been on helicopters and tanks, and into embedded training
High Performance Computing solutions, delivering systems on naval weapons systems. As the trend
training and consulting services to numerous ISVs, towards GPGPU has grown, Simon has defined new
Solution Integrators, and OEMs in the HPC space. He products suited to the next generation of Intelligence,
lives in Menlo Park, CA with his wife and daughter. Surveillance and Reconnaissance applications, once
hh Session(s): 2147 - GPGPU Development for again taking the performance lead in the rugged
Windows HPC Server (Tuesday, Sept 21, 15:00) marketplace. Prior to taking this role, Simon worked
in a number of engineering roles in the nuclear and
Clark, Philip scientific research industries. He graduated with a B.Sc.
Reader (Associate Professor) in Particle Physics in Microelectronics, and an M.Sc. (Eng) in Advanced
(University of Edinburgh) Manufacturing Technology.
Dr Philip Clark is a reader (associate professor) at the hh Session(s): 2273 - GPUs In the Front
University of Edinburgh. He is the principal investigator Line of our Defenses (Sponsored by
for the Edinburgh ATLAS and GridPP particle physics GE) (Wednesday, Sept 22, 15:00)
research groups. He is the chairman of the ScotGrid
tier-2 compute and data centre. His primary research Corrigan, Andrew
is in elementary particle physics, but he is also interested in Research Mathematician (Naval Research Laboratory &
evolving computer architectures, particularly the George Mason University)
advent of many-core and GPGPU devices. He has 672 Andrew Corrigan is a research mathematician at the
publications (427 in peer reviewed journals). Naval Research Laboratory, where he is working on
the GPU implementation of CFD codes and supersonic Curry, Matthew
jet noise reduction. He received his Ph.D. from George A Highly Reliable RAID System Based on GPUs (Sandia
Mason University in Computational Mathematics in May National Laboratories and the University of Alabama at
2009. He also performed a post-doc through early 2010 Birmingham)
under Prof. Rainald Löhner porting FEFLO to graphics Matthew Curry is a Ph.D. candidate at the University of
hardware, as well as developing specialized numbering Alabama at Birmingham. He is a member of the High
schemes for edge-based unstructured grids on GPUs. Performance Computer Laboratory in the Computer and
hh Session(s): 2005 - Porting Large-Scale Legacy Information Sciences Department under the advisement
Fortran Codes (Wednesday, Sept 22, 17:00) of Dr. Anthony Skjellum. He is interested in GPU
hh 2234 - Unstructured Finite Volume Code computing, operating systems, and high performance
on a Cluster with Multiple GPUs per storage.
Node (Wednesday, Sept 22, 15:00) hh Session(s): 2205 - A Highly Reliable RAID System
Based on GPUs (Tuesday, Sept 21, 17:00)
Cox, Sam
CEO (Milabra) Dammertz, Holger
Sam is a technology entrepreneur and investor, PhD Student (Ulm University)
focusing on technology and media ventures. He founded As a PhD student at Ulm University, Germany, my
a successful design & software firm in Canada and has main focus of research is fast Ray Tracing for Global
consulted for firms in the UK, China, the United States Illumination and related rendering techniques. I am also
and Canada. He graduated with his MBA from Cass researching quasi-Monte Carlo methods and parallel
Business School in London, UK with a degree in strategy algorithms for graphics.
and completed his undergraduate degree in Art History, hh Session(s): 2136 - Pseudo Random
Chinese Language and Economics at Queen’s University Number Generators for Massively Parallel
in Canada. Apps (Thursday, Sept 23, 16:00)
hh Session(s): 4001 - Emerging Companies:
CEO on Stage featuring Elemental Dasgupta, Aniruddha
Technologies, Geomerics, and Milabra Graduate Student (Georgia Institute of Technology)
(Wednesday, Sept 22, 11:00) Aniruddha Dasgupta is currently working towards a
hh 4003 - Emerging Companies Summit Panel: GPUs Master’s Degree from the department of Electrical and
for Computer Vision (Wednesday, Sept 22, 15:00) Computer Engineering at Georgia Tech.
His areas of research interests are GPGPU and GPU
Crivelli, Luis architecture.
Director Solver Development (Dassault Systems Simulia hh Session(s): 2201 - A Case Study of
Corporation) Accelerating Matlab Based Applications
PhD in Aerospace Engineering. Director of Solver using GPUs (Wednesday, Sept 22, 16:00)
Development at Dassault Systems Simulia Corporation.
16+ years experience in High Performance Computing Davidson, Andrew
and Parallel Computing. Graduate Student (University of California, Davis)
hh Session(s): 2155 - GPGPU in the real world. The Andrew Davidson is a graduate student in the Computer
ABAQUS experience (Thursday, Sept 23, 14:00) Engineering Department at the University of California,
Davis. His research interests include data-parallel
Cui, Jingyu algorithms and primitives, numerical methods, and
Graduate Student (Stanford University) auto-tuning. He is also a developer for the CUDA

Jingyu Cui received his B.E. (2005) and M.S. (2008) degrees Parallel Primitives Library (CUDPP).
from Tsinghua University, and M.S. (2010) degree from hh Session(s): 2085 - Tridiagonal Solvers: Auto-

Stanford University. He is currently pursuing his Ph.D. Tuning and Optimizations (Tuesday, Sept 21, 15:00)
degree working on high speed dynamic 4-dimensional
medical imaging. Jingyu has published 8 peer-reviewed Davis, Nolan
papers in conference proceedings as the leading author, Research Scientist (SAIC)
2 journal articles, and a book chapter. He also holds a Nolan R. Davis is a Senior Scientist with SAIC in San
US patent. Jingyu worked with Microsoft and Google, and Diego. He holds a doctorate in physics from the
made important contributions to several products. University of Texas at Dallas, and has spent over 25
hh Session(s): 2211 - Modern Architecture for years working in physics research, signal and image
Massively Parallel Medical Tomographic processing, and high performance computing. He
Image Reconstruction on a GPU Cluster has worked with large corporations and laboratories
(Wednesday, Sept 22, 15:00) including SAIC, Lockheed-Martin, the Johns-Hopkins
Applied Physics Laboratory, the Naval Research
Cui, Xiaohui Laboratory, and Walt Disney Feature Animation.
Research Scientist (Oak Ridge National Laboratory) hh Session(s): 2100 - Hybrid GPU/Multicore
Dr. Xiaohui Cui is a staff scientist in the Computational Solutions for Large Linear Algebra
Sciences & Engineering Division, Oak Ridge National Problems (Thursday, Sept 23, 16:00)
Laboratory of the Department of Energy, and an adjunct
associate professor at the University of Louisville in Dean, Loren
Kentucky. His research interests include swarm Director of Engineering, MATLAB Products (MathWorks)
intelligence, agent based modeling and simulation, GPU Loren Dean is a Director in the MATLAB® development
computing, and information retrieval. His research has organization. He has responsibility for MathWorks
been reported by MSNBC, New Scientist, etc. In 2008 and parallel computing products, the Test & Measurement
2009, he received the Department of Energy Outstanding application area and the eProducts and Services
Mentor Awards. organization. Loren has been with MathWorks since
hh Session(s): 2052 - Power Management 1995. Prior to joining MathWorks, Loren worked for
Techniques for Heterogeneous Exascale AlliedSignal Aerospace, performing systems analysis
Computing (Tuesday, Sept 21, 16:00) and integration for aircraft engines, with extensive use of
MATLAB and Simulink®. Loren has a B.S. and an M.S. in Deng, Yangdong
Aeronautical Engineering from Purdue University and an Associate Professor (Tsinghua University)
M.B.A. from Northeastern University. Yangdong (Steve) Deng received his Ph.D. degree in
hh Session(s): 2267 - GPU Computing with Electrical and Computer Engineering from Carnegie
MATLAB® (Tuesday, Sept 21, 11:00) Mellon University, Pittsburgh, PA, in 2006. He received
his ME and BE degrees in Electronic Department
Dean, Tom from Tsinghua University, Beijing, in 1998 and 1995,
Research Scientist (Google Inc.) respectively.
Tom Dean is a full-time research scientist at Google hh Session(s): 2081 - Morphing a GPU into a
in Mountain View, California. From 1993 to 2007 he Network Processor (Thursday, Sept 23, 15:00)
was Professor of Computer Science and Cognitive and hh 2264 - CUDA Centers of Excellence Super-
Linguistic Sciences at Brown University. He received his Session III (Tuesday, Sept 21, 16:00)
B.A. in mathematics from Virginia Polytechnic Institute
& State University in 1982 and his M.Sc. and Ph.D. in Dewaele, Ronny
computer science from Yale University in 1984 and 1986. Director Technology Center (Barco)
His research interests include automated planning and
control, computational biology, machine learning, neural Ronny Dewaele is responsible for the corporate
modeling, probabilistic inference, robotics and spatial Technology Center for Networked Visualization in Barco.
and temporal reasoning. For more information, please Ronny and his team focus on exploring new technologies
visit: http://www.cs.brown.edu/people/tld/pages/bio.html in the domain of network centric video processing. Ronny
Dewaele has a Master’s degree in Computer Science and
hh Session(s): 2132 - Accelerating Applied Mathematics from KU Leuven, Leuven, Belgium.
Biologically Inspired Computer Vision He lives and works in Belgium.
Models (Tuesday, Sept 21, 11:00)
hh Session(s): 2095 - Building High
hh 4003 - Emerging Companies Summit Panel: GPUs Density Real-Time Video Processing
for Computer Vision (Wednesday, Sept 22, 15:00) Systems (Thursday, Sept 23, 16:00)

Decrem, Peter D’Hondt, Maja

Director, Rates Products (Quantifi) Program Manager (imec)
Peter Decrem heads the Rates Group at Quantifi. As Maja D’Hondt is program manager at imec, Europe’s
Director, Peter is responsible for managing the product largest independent research center in nanoelectronics
development process of all Rates Solutions within the and nano-technology. Her team develops middleware
Quantifi product suite. Peter started in Research and for embedded and high performance systems, such
Technology at Bear Stearns and Deutsche Bank. He as Barco’s GPU-enabled video processing server. Maja
traded fixed income derivatives, government bonds and holds a PhD from the Vrije Universiteit Brussel in
agencies for Lehman Brothers and Salomon Brothers. Belgium (2004), then spent some time in Amsterdam
He was responsible for fixed income derivatives trading on a research project with ASML, and obtained a full
desk for a number of European banks. Most recently he research position at INRIA in France.
refocused on technology and specifically concentrated
on machine learning and high frequency trading on hh Session(s): 2121 - Maximizing Throughput
parallel systems prior to his joining of Quantifi. of Barco’s GPU-Enabled Video Processing
Server (Thursday, Sept 23, 14:00)
hh Session(s): 2040 - Derivatives & Bond
Portfolio Valuation in a Hybrid CPU/GPU Diamos, Gregory
Environment (Thursday, Sept 23, 14:00) PhD Student (Georgia Institute of Technology)
hh 2297 - Developing CUDA Accelerated .NET Plugins Gregory Diamos is a PhD student at the Georgia Institute
for Microsoft Excel (Tuesday, Sept 21, 17:00) of Technology, under the direction of Professor Sudhakar
Yalamanchili. He received his B.S. and M.S. in Electrical

Deguy, Sébastien Engineering from the Georgia Institute of Technology
Founder and CEO (Allegorithmic) in 2006 and 2008, respectively. His current research
Dr. Sébastien Deguy is the CEO of Allegorithmic, the interests follow the industry shift from ILP to many
company behind the Substance procedural textures core architectures, where the ability to tightly integrate
authoring and rendering system. Dr. Deguy has a heterogeneous architectures offers the potential for
computer science background with a specialization in dramatic improvements in efficiency at the cost of
mathematics, random processes, simulation, computer increased design complexity.
vision and image synthesis. He is also an award-winning hh Session(s): 2210 - GPU-Ocelot: An Open
director and producer of traditional and animated short Source Debugging and Compilation Framework
films. for CUDA (Thursday, Sept 23, 14:00)
hh Session(s): 4002 - Emerging Companies: CEO on
Stage featuring Allegorithmic SAS, Bunkspeed, Dick, Christian
and miGenius (Wednesday, Sept 22, 14:00) PostGraduate Fellow (Technische Universität München)
Christian Dick received his Diploma in Computer
Del Sordo, Giancarlo Science from the Technische Universität München
Chief Developer and Product Manager (Acustica Audio) in July 2007 with honors. Since August 2007, he
Giancarlo Del Sordo is the main developer at Acustica has been a postgraduate fellow in the Computer
Audio. He specialized in Computer Engineering at the Graphics and Visualization Group at the Technische
University of Pisa. He also worked in the IT department Universität München under the supervision of Prof. Dr.
of Intesa Sanpaolo (first Italian bank), in NEXTRA R. Westermann. His research interests include highly
Asset Management and in Cap Gemini Ernst&Young responsive, physics-based simulation of deformable
consulting. models on high-resolution hierarchical representations,
hh Session(s): 2076 - Implementing CUDA Audio and the visualization of large-scale scientific data sets
Networks (Thursday, Sept 23, 09:00) such as terrain and volume data.
hh Session(s): 2137 - CUDA for Real-Time Multigrid
Finite Element Simulation of Soft Tissue
Deformations (Wednesday, Sept 22, 14:00)
Dimitrovici, Dragan Dyken, Chris
(XENON Systems Pty Ltd) Research Scientist (SINTEF)
hh Session(s): 2301 - GPU Cluster Computing: Christopher Dyken got his Ph.D. in Computational
Accelerating Scientific Discovery (Thursday, Sept Geometry at the University of Oslo in 2008, where he
23, 09:00) also lectured computer graphics for four years. He
currently holds a position as a research scientist at the
Dixon, Matthew Heterogeneous Computing Group at SINTEF ICT Applied
Professor (UC Davis) Mathematics, focusing on algorithms for heterogeneous
architectures and visualization techniques.
Matthew Dixon is a Krener assistant professor in the
mathematics department at UC Davis. He received hh Session(s): 2020 - GPU-Accelerated
his Ph.D. in applied mathematics from Imperial Data Expansion for the Marching Cubes
College (UK) in 2007 and has since held postdoctoral Algorithm (Wednesday, Sept 22, 16:00)
appointments with the Institute for Computational and
Mathematical Engineering at Stanford University and the Edgar, Richard
Department of Computer Science at UC Davis. He has Assistant in Neuroscience (Massachusetts General
also worked as a quantitative risk analyst for a number Hospital, Harvard University)
of leading investment banks and consulted to the Bank With a background in theoretical astrophysics, Richard
for International Settlements. Edgar is now working at Harvard and MGH on a variety of
hh Session(s): 2098 - Enabling On Demand projects in need of GPU acceleration.
Value-At-Risk for Financial Markets hh Session(s): 2001 - Acceleration of the
(Thursday, Sept 23, 11:00) Freesurfer Suite for Neuroimaging
Analysis (Thursday, Sept 23, 10:00)
Domine, Sebastien
Sr. Dir. Developer Tools (NVIDIA) Enderle, Rob
Sébastien Dominé is the Sr. Director of Developer Analyst (Enderle Group)
Technology Tools at NVIDIA. He runs various software Rob is President and Principal Analyst of the Enderle
engineering teams and oversees the development of Group, a forward looking emerging technology advisory
software products dedicated to ease the developer’s life firm. Recognized as one of the best general Inquiry
and to foster the creation of more applications that can Analysts in the world, Rob specializes in providing rapid
take advantage of the GPU. Prior to NVIDIA, he worked perspectives and suggested tactics and strategies to a
on PC games at GameFX/THQ and 3D digital content large number of clients dealing with rapidly changing
creation tools at Katrix and Nichimen Graphics. He holds global events. Rob lives emerging technology and has
a Diplôme d’Ingénieur in Computer Science from EPITA, a passion for personal technology and market strategy.
Paris, France. Rob trained as a news anchor and co-hosted CNET
hh Session(s): 2151 - Parallel Nsight: Analyzing radio during the 90s, has been widely used by both local
and Optimizing Massively Parallel Applications and national news TV and radio programs, and has
[Advanced] (Tuesday, Sept 21, 16:00) been identified as one of the world’s most influential
technology analysts. Currently Rob appears semi-weekly
hh 2150 - Parallel Nsight: Debugging for a tech segment on WSJ radio and writes for ECT
Massively Parallel Applications (TechNewsWorld, eCommerce Times, Linux World,
[Advanced] (Tuesday, Sept 21, 14:00) MacNewsWorld), Dark Reading, Digital Trends, TGDaily,
ITBusiness Edge and Datamation. Before founding the
Donovan, Scott Enderle Group in 2003 Rob was the Senior Research
System Architect (Citadel Investment Group) Fellow for Forrester Research and the Giga Information

Mr. Donovan has a master’s degree in computer science Group. While there he ran the eCommerce, Security, and
and over 20 years IT experience. Throughout his career Mobile research practices.

he has held positions at exchanges, investment banks, hh Session(s): 4007 - Emerging Companies: CEO
and hedge funds. He is currently a System Architect on Stage featuring Aqumin, RTT, and Scalable
at Citadel where his main area of focus is accelerating Display Technologies (Thursday, Sept 23, 10:00)
financial models with a combination of grid computing, hh 4008 - Emerging Companies: CEO
virtualization, and CUDA / OpenCL. on Stage featuring ICD and Universal
hh Session(s): 2033 - Integrating GPGPU Accelerated Robotics (Thursday, Sept 23, 11:00)
Pricing Models into an Existing Financial Services
Infrastructure (Thursday, Sept 23, 09:00) Engsig-Karup, Allan Peter
Assistant Professor, Scientific Computing (Technical
Doran, Chris University of Denmark)
Founder and Chief Operating Officer (Geomerics) MSc, PhD in Applied math. Learn more: http://www.imm.
Dr. Chris Doran is Founder and Chief Operating Officer dtu.dk/~apek Involved in research related to utilization
at Geomerics. He is a leading research scientist with 20 of GPUs for Scientific Computing. Learn more: http://
years experience in applied mathematics and theoretical gpulab.imm.dtu.dk. Research interest in Computational
physics, and is the author of a major book on geometry Fluid Dynamics, High-Performance Computing, Coastal
and physics and of over 50 papers. Chris is a regular Engineering, Scientific Computing, Numerical analysis.
speaker at major international conferences, including hh Session(s): 2103 - Development of an Efficient
SIGGRAPH, Develop, Nordic Game and Montreal Games
GPU-Accelerated Model for Fully Nonlinear

Summit. Chris is also Director of Studies in Physics for Water Waves (Tuesday, Sept 21, 15:00)
Sidney Sussex College, Cambridge.
hh Session(s): 4001 - Emerging Companies: CEO Fahmy, Hany
on Stage featuring Elemental Technologies, Inc., Director, SI/EMC Engineering (NVIDIA)
Geomerics, and Milabra (Wednesday, Sept 22, 11:00)
hh Session(s): 2080 - Tackling Multi-Gigabit
Design Challenges with a Practical Virtual
EMI/ESD Lab (Wednesday, Sept 22, 15:00)
Farber, Robert Frauenhofer, Bill
Senior Scientist (PNNL) Managing Director (Citigroup Global Markets)
Rob Farber has worked with massively parallel William Frauenhofer is a Managing Director in
computers and algorithms since the early 1980s Citigroup’s Technology Group based in Palo Alto, CA. Bill
as a scientist at prestigious institutions such as Los joined the firm in 1996 and worked in the Real Estate &
Alamos National Laboratory, NERSC and PNNL, as a Lodging practice until 2000 when he relocated to San
consultant, and as a co-founder of two computation- Francisco to join Citigroup’s Technology team. Bill is
based companies that achieved liquidity events. currently the head of Citigroup’s global semiconductor
Recently, Rob has been teaching people how to think team and has worked on a broad base of transactions
about and program in CUDA through his article series including M&A, LBOs, and a number of public and
on the Doctor Dobbs Journal site as well as Scientific private financing transactions, including equity, equity-
Computing and other venues. linked, high yield and investment grade capital. Bill has
hh Session(s): 2119 - Supercomputing for the Masses: recently advised clients including: Amkor Technology,
Killer-Apps, Parallel Mappings, Scalability and Freescale Semiconductor, Atmel, Qimonda, Infineon,
Application Lifespan (Tuesday, Sept 21, 11:00) Nvidia, Wolfson Microelectronics, NXP, etc. Bill received
an M.B.A. with honors from the Stern School of Business
Fascione, Luca at New York University and a B.B.A. in Finance from
Senior Research and Development Engineer (Weta Loyola College in Maryland. Prior to joining Salomon
Digital) Brothers, he was an Analyst at The Warwick Group,
a boutique investment bank specializing in mid-cap
Luca Fascione is a Senior Research and Development mergers and acquisitions.
Engineer at Weta Digital. He is currently focused on
advanced rendering solutions and the development hh Session(s): 4009 - Emerging Companies
of innovative and forward-looking film pipelines that Summit Panel: The “New Normal” For Building
harness and expand on the best technologies available Emerging Companies Based On Disruptive
around the world. In addition to his work at Weta Digital, Technologies (Thursday, Sept 23, 14:00)
Luca spent time as a software engineer at Pixar and
Vanguard Animation Studios. He also taught courses on Fung, James
General Computer Graphics and Rendering Languages Developer Technology (NVIDIA)
as a professor at the University of Rome. James Fung’s work has been in the area of applying
hh Session(s): 2305 - PantaRay: Accelerating GPU Hardware for parallel general purpose computing,
Out-Of-Core Ray Tracing of Sparsely Sampled including implementing Computer Vision on the GPU. He
Occlusion (Wednesday, Sept 22, 10:00) holds a Ph.D. in Electrical and Computer Engineering
from the University of Toronto. He currently works
Fassold, Hannes at NVIDIA examining computer vision and image
Scientist (JOANNEUM RESEARCH) processing on graphics hardware.
Hannes Fassold finished his study of Technical hh Session(s): 2209 - Accelerating Computer
Mathematics in Graz, Austria, in 2004. Since July, 2004, Vision on the Fermi Architecture
he has been working as a scientist at JOANNEUM (Thursday, Sept 23, 14:00)
RESEARCH. His major work fields are the development
of algorithms for digital film restoration and video quality Ganesan, Narayan
analysis, using image processing and computer vision Research Scientist (University of Delaware)
methods. Dr. Ganesan received his Ph.D. from Washington
hh Session(s): 2029 - Computer Vision University in St. Louis. His dissertation was on
Quantum-Information and Decoherence free Quantum-
Algorithms for Automating HD Post-

Production (Wednesday, Sept 22, 15:00) Computation. His current research is on scientific and
High-performance computing on GPUs. His recent work
Fatica, Massimiliano on parallelizing sequentially dependent recurrence

Manager (NVIDIA) computations, technique applied to the popular
protein motif-finding problem, delivers unprecedented
Massimiliano is a manager of the Tesla Performance performance compared to current MPI-GPU-HMMER.
Group at NVIDIA where he works in the area of GPU He is currently the lead scientist behind the development
computing (high-performance computing and clusters). of an optimized Molecular Dynamics Simulation package
He holds a laurea in Aeronautical Engineering and a based on CHARMM force field. The package delivers
PhD in Theoretical and Applied Mechanics from the highly-competitive performance and is currently used to
University of Rome “La Sapienza”. study behavior of large lipid-membranes, protein-ligand
hh Session(s): 2057 - CUDA-Accelerated LINPACK interactions and multiscale modeling at the University of
on Clusters (Tuesday, Sept 21, 14:00) Delaware.
hh Session(s): 2034 - Reformulating Algorithms
Fernandez, Mark for the GPU (Wednesday, Sept 22, 11:00)
Computer Scientist (Dell)
hh 2035 - Simulations of Large Membrane
As a HPC Computer Scientist in Dell’s Advanced Systems Regions (Wednesday, Sept 22, 11:30)
Group, Dr. Fernandez supports HPC customer/end-user
technical efforts at Dell. Dr. Fernandez is responsible for Gateau, Samuel
working with customers to capture user requirements Developer Technology Engineer (NVIDIA)
and incorporating those requirements into future Dell
HPC systems and solutions. He also works closely with Sam is a member of the Content & Technology
Dell Engineering throughout the product development Engineer group at NVIDIA, who spends his energy
cycle. and creativity pushing pixels and teaching high-end,
real-time computer graphics to Games, DCC, and CAD
hh Session(s): 2287 - Internal GPUs on Dedicated developers. Before that, he was enjoying the sun of
x16 Slots - Are They Needed For HPC? (Sponsored Toulouse in France working in the virtual reality industry
by Dell) (Wednesday, Sept 22, 14:00) on extravagant visual simulations, navigation systems,
showrooms and museum applications.
hh Session(s): 2010 - Implementing Stereoscopic the Technische Universität München. His research
3D in Your Applications (Pre-Conference interests are simulation and visualization techniques,
Tutorial) (Monday, Sept 20, 16:00) mainly for deformable bodies, as well as their efficient
implementations based on multigrid methods.
Gaudlitz, Daniel hh Session(s): 2137 - CUDA for Real-Time Multigrid
Project Manager (FluiDyna) Finite Element Simulation of Soft Tissue
Daniel Gaudlitz received his diploma in Aeronautics Deformations (Wednesday, Sept 22, 14:00)
at the Technical University Dresden (equiv. to Master
of Science) in 2003. In 2008, he earned his PhD at Gianos, Flip
the Technical University Munich as a member of the General Partner (InterWest Partners)
research group for numerical simulations in fluid Philip “Flip” Gianos has been part of InterWest’s IT
mechanics led by Prof. N. A. Adams. Since 2009, he has team since 1982. With a background in engineering,
obtained his post-doc at the Institute of Aerodynamics he has invested in multiple areas of information
of Technical University Munich and is working as the technology, including semiconductors, computing
Project Manager and Manager Research & Development and networking equipment, and infrastructure and
at FluiDyna GmbH. applications software. He is chairman of the board of
hh Session(s): 2206 - Accelerated Xilinx (XLNX), a publicly held company, and is also a
Computational Fluid Dynamics Employing board member of several privately held companies,
GPUs (Thursday, Sept 23, 09:00) including: Bivio Networks, Brand.net, Convey Computer,
and SpectraLinear. Gianos also serves on the advisory
Gebbie, Nicholas board of Storm Ventures II, and is a past president of
Senior Graphics Programmer (Bunkspeed) the Western Association of Venture Capitalists. Prior to
joining InterWest, Gianos was with IBM for eight years
hh Session(s): 2074 - Driving a Product from in engineering management. He managed both chip
Rasterization to Ray Tracing: The Developer design and systems integration for several IBM office
Experience (Tuesday, Sept 21, 15:00) automation products. Gianos earned his M.B.A. from
Harvard University and received his M.S. and B.S. in
Ge, Wei electrical engineering from Stanford University. He has
Professor (Institute of Process Engineering, Chinese one international and two U.S. patents.
Academy of Sciences)
hh Session(s): 4004 - Emerging Companies:
Professor of Chemical Engineering and Simulation at CEO on Stage featuring Cooliris, empulse
Institute of Process Engineering, Chinese Academy of GmbH, and Playcast Media Systems
Sciences. Born 1970, B.Sc and Ph.D of Harbin Institute (Wednesday, Sept 22, 16:00)
of Technology, China.
hh 4005 - Emerging Companies: CEO on Stage
hh Session(s): 2263 - CUDA Centers of Excellence featuring Jedox Business Intelligence, Rocketick,
Super-Session II (Tuesday, Sept 21, 15:00) and Softkinetic (Wednesday, Sept 22, 17:00)
hh 2286 - Towards Peta-Scale Green Computation
- Applications of the GPU Supercomputers Gokhale, Nachiket
in the Chinese Academy of Sciences Senior Research Engineer (Weidlinger Associates Inc)
(CAS) (Wednesday, Sept 22, 11:00) Nachiket Gokhale, Ph.D. is actively involved in multi-
disciplinary research and development efforts at
Gelado, Isaac Weidlinger Associates, Inc. (WAI). Under a DARPA SBIR,
Lecturer and Researcher (Universitat Politecnica de he was involved in the development of a GPU-enabled
Catalunya) version of WAI’s explicit time domain, commercial

Isaac Gelado is an Assistant Professor at the Computer finite element software NLFLEX. He is interested
Architecture Department in Universitat Politecnica de in the development of fast GPU enabled algorithms

Catalunya at Barcelona. Isaac Gelado holds a Master’s and computer codes for the high-fidelity solution of
degree in Telecommunications Engineering from challenging problems in computational mechanics with
Universidad de Valladolid, and will get a PhD degree an emphasis on transient phenomena in structural
from Universitat Politecnica de Catalunya in July, 2010. mechanics, such as shock and blast; FEM simulation
hh Session(s): 2156 - GMAC: Global Memory For of ultrasound with an emphasis on biomedical imaging
Accelerators (Thursday, Sept 23, 09:00) and therapy; and the computational design of novel
acoustic metamaterials. His project experience includes
Georgiev, Todor SBIR and STTR research for DARPA, ONR, and various
Senior Research Scientist II (Adobe Systems) protective design and structural engineering efforts
involving large finite element analyses. He earned
Todor Georgiev is a Senior Research Scientist at his Ph.D. and M.S. both in Mechanical Engineering
Adobe Systems, working closely with the Photoshop from Boston University where his work involved the
group. His contributions are often based on transfer finite element solution of linear and non-linear inverse
of mathematical methods from physics to image problems in biomechanical imaging.
processing and vision. Currently he is focusing on
developing cameras for radiance capture and interactive hh Session(s): 2061 - Accelerating Explicit FEM Shock
plenoptic / lightfield rendering. & Blast Simulations (Thursday, Sept 23, 10:30)
hh Session(s): 2093 - Computational Goldsmith, Kevin
Photography: Real-Time Plenoptic
Senior Engineering Manager (Adobe Systems,

Rendering (Wednesday, Sept 22, 16:00) Incorporated)
Georgii, Joachim Kevin Goldsmith is the Manager of the Adobe Image
PostDoc (Technische Universität München) Foundation team. This team created a domain specific
language: Pixel Bender; which allows for highly
Joachim Georgii is a PostDoc at the computer graphics optimized parallel signal-processing computation and
and visualization group headed by Professor Rüdiger a dynamic runtime environment designed to scale
Westermann at the Technische Universität München. from current to future highly-parallel heterogeneous
In 2007, he received a PhD in computer science at hardware. AIF is currently part of many of Adobe’s
flagship applications. Kevin has over 18 years in the Gu, Henry
computer industry at companies such as Silicon CTO (GIC)
Graphics, Microsoft, IBM Research, (Colossal) Pictures Dr. Henry Gu is the founder of Green International
and others. Consulting, a global software development, consulting
hh Session(s): 2051 - GPGPU in Commercial and outsourcing company. Gu was the GM of Thomson
Software: Lessons From Three Cycles of the Corporate Research Center in Burbank before founding
Adobe Creative Suite (Thursday, Sept 23, 11:00) GIC. Before joining Thomson, Gu was CTO and VP of
da Vinci Systems. He led the company’s R&D team
Gonzalez, Alberto in developing generations of color corrector for the
Professor (Universidad Politecnica de Valencia) entertainment industry. In 2001, Gu received the
Alberto Gonzalez was born in Valencia, Spain, in 1968. Primetime Emmy Award for Outstanding Achievement in
He received the Ingeniero de Telecomunicacion degree Engineering Development.
from the Universidad Politecnica de Catalonia, Spain in hh Session(s): 2043 - Disparity Map
1992, and Ph.D. degree from the Universidad Politecnica Generation (Thursday, Sept 23, 11:00)
de Valencia (UPV), Spain in 1997. His dissertation was
on adaptive filtering for active control applications. Gupta, Kshitij
From January 1995, he visited the Institute of Sound Graduate Student Researcher (UC Davis)
and Vibration Research, University of Southampton, Kshitij Gupta is a Ph.D. candidate in the Department of
UK, where he was involved in research on digital signal Electrical & Computer Engineering at UC Davis. He is
processing for active control. He is currently heading interested in a variety of application domains like audio,
the Audio and Communications Signal Processing image, and video. His primary interests are in exploring
Research Group (www.gtac.upv.es) that belongs to novel ways of transforming today’s high-performance
the Institute of Telecommunications and Multimedia algorithms onto emerging low-end, low-power, hybrid
Applications (i-TEAM, www.iteam.es). Dr. Gonzalez (CPU/GPU/DSP/ASIP) processors targeted towards
serves as Professor in digital signal processing mobile and automotive platforms. In his spare time,
and communications at UPV where he heads the he likes procrastinating about novel user-interfaces,
Communications Department (www.dcom.upv.es) since and hopes to work more actively on it some day.
April 2004. He has published more than 80 papers in Kshitij received his Masters in Electrical & Computer
journals and conferences on signal processing and Engineering from University of Pittsburgh (PA, USA),
applied acoustics. His current research interests include and his Bachelors in Electronics & Communication
fast adaptive filtering algorithms and multichannel Engineering from Osmania University (Hyderabad, India).
signal processing for communications and 3D sound
reproduction. hh Session(s): 2175 - Hello GPU: High-Quality,
Real-Time Speech Recognition on Embedded
hh Session(s): 2116 - Real-time Multichannel GPUs (Thursday, Sept 23, 14:00)
Audio Convolution (Thursday, Sept 23, 10:00)
Gupta, Rohit
Gottbrath, Chris Researcher/Teacher (Delft University Of Technology)
Principal Product Manager (TotalView Technologies, Inc.,
a Rogue Wave Software company) Rohit Gupta recently completed his Masters Study at
TUDelft (Aug 2010) in Computer Engineering. Rohit’s
Chris Gottbrath is principal product manager for masters thesis subject was in the domain of CFD
the TotalView Debugger product line at Rogue Wave on GPU computing. Previously, he has worked in
Software. His work is focused on making it easier for the Domain of Embedded Systems for 4 years after
programmers, scientists and engineers to solve even completing his bachelors degree in India.
the most complex bugs and get “back to work.” He
has pursued this goal in a variety of customer-focused hh Session(s): 2049 - Deflated Preconditioned
technical roles with the TotalView team over the last Conjugate Gradient on the GPU
seven years. Prior to that, as a graduate student of (Wednesday, Sept 22, 14:30)
astrophysics at the University of Arizona in Tucson, he
wrote cosmological simulations (with the occasional Haines, Karen
bug) using C and MPI on a small-scale Beowulf cluster. Professor (WASP/The University of Western Australia)
Chris is a regular contributor to HPC and software Dr. Haines completed her PhD in Electrical Engineering
development industry conferences worldwide. at the University of New Mexico. She received her
hh Session(s): 2299 - Integrating CUDA BLAS Masters in Engineering at Carnegie Mellon University
with IMSL Fortran (Tuesday, Sept 21, 14:00) and her Bachelor of Arts in Mathematics at the
University of California, San Diego. Her PhD research
hh 2251 - TotalView Debugger for CUDA efforts have led to the development of a parallel motion
(Wednesday, Sept 22, 15:00) detection algorithm, which is based on the fly’s visual
processing system. The resulting model is suitable for
Govett, Mark robotic or computer vision applications. This work relied
Chief, Advanced Computing Section (NOAA Earth System on distributed parallel programming and advanced
Research Laboratory) scientific visualization methods.
I manage NOAA Earth System Research Laboratory’s hh Session(s): 2252 - Simulating Housefly
Advanced Computing Section, a software group Vision Elements Using OpenCL
that supports weather model development, code (Wednesday, Sept 22, 16:00)
parallelization, and exploring advanced computing
technologies including GPUs. I have a background in Han, Jeff
high performance computing, code parallelization and Founder, Chief Scientist (Perceptive Pixel)
compiler development. Recently, I wrote a Fortran to
CUDA compiler to parallelize and run a next generation Jeff Han is the founder and chief scientist of Perceptive
weather model on GPUs. Pixel. A TED speaker in 2006, and named to the Time
100 most influential persons list in 2008, Jeff continues
hh Session(s): 2276 - Using GPUs to Run to contribute frequently to the research communities.
Next-Generation Weather Models Jeff’s formal training was in electrical engineering and
(Tuesday, Sept 21, 14:00) computer science at Cornell University, where he worked
on the innovative CU-SeeMe videoconferencing system.
hh 4011 - Emerging Companies: CEO on Stage hh Session(s): 2152 - Using Virtual
featuring Cinnafilm, Inc., Perceptive Pixel and Texturing to Handle Massive Texture
Total Immersion (Thursday, Sept 23, 16:00) Data (Tuesday, Sept 21, 14:00)

Hansen, Charles Härter, Daniel

Professor (University of Utah) (University of Freiburg, IMTEK, Laboratory for Process
Charles (Chuck) Hansen is a Professor of Computer Technology)
Science and an Associate Director of the Scientific From 2000 to 2005, Daniel studied Microsystems
Computing and Imaging Institute at the University of Technology at the University of Freiburg, where he wrote
Utah. Chuck Hansen has published over 100 peer his Diploma thesis about miniaturized illumination
reviewed journal and conference papers and has been a concepts at the Laboratory of Process Technology.
co-author on three papers recognized with “Best Paper hh Session(s): 2065 - Massively Accelerating Iterative
Awards” at the IEEE Visualization Conference (1998, Gauss-Newton Fitting (Wednesday, Sept 22, 11:00)
2001, 2002). He was co-author on the Best Paper at
IEEE Pacific Visualization 2010. He was awarded the Hayes, David
IEEE Technical Committee on Visualization and Graphics CEO (ICD)
“Technical Achievement Award” in 2005 in recognition
of seminal work on tools for understanding large-scale David is the Chief Executive Officer of ICD and has
scientific data sets. a wealth of experience in the mobile and consumer
electronics markets. David founded Velocity Mobile - an
hh Session(s): 2264 - CUDA Centers of Excellence ICD collaboration with Inventec in 2007. Prior to ICD,
Super-Session III (Tuesday, Sept 21, 16:00) David was Chief Executive Officer of A Living Picture
(ALP), a company he formed to develop “Momento,” an
Hardy, Quentin advanced, digital picture frame technology. ALP was
National Editor (Forbes Magazine) later acquired by i-mate in December 2006 and David
Quentin Hardy is National Editor for Forbes Media, became the Chief Technology Officer of i-mate following
responsible for cover stories and features for Forbes the acquisition. Prior to ALP and i-mate, David was the
magazine, along with stories, a blog and video interviews founder and Chief Executive Officer of DAT plc, one of
for the Forbes.com website. Mr. Hardy is a regular on the early pioneers of over-the-air device management
“Forbes on Fox,” a weekly business news show on Fox for Windows Mobile phones. Here, David forged strong
News Channel, as well as shows on CNBC, Bloomberg links within the Windows Mobile community, from ODMs
and other television channels. He hosts numerous through to operators and retailers.
panels on technology and business both independently hh Session(s): 4008 - Emerging Companies: CEO
and at Forbes events around the U.S. and overseas. on Stage featuring ICD and Universal Robotics
hh Session(s): 4006 - Fireside Chat with (Thursday, Sept 23, 11:00)
Jen-Hsun Huang - Co-founder & CEO,
NVIDIA (Thursday, Sept 23, 09:00) Herbert, Garrett
Partner, M&A Transaction Services (Deloitte & Touche LLP)
Harris, Mark Garrett Herbert is a partner and leads the Silicon
Senior Developer Technology Engineer (NVIDIA) Valley M&A Transaction Services practice in San Jose,
Mark Harris is a senior developer technology engineer CA and is the national leader for M&A Transaction
at NVIDIA, where he works with developers around the Services for the Telecom, Media, and Technology
world on software for computer graphics and high- Industry group. Garrett has over 18 years of professional
performance computing. His research interests include experience, of which the last 13 years have been dedicated
parallel computing, general-purpose computation on to M&A. He has extensive transaction experience

GPUs, physically based simulation, real-time rendering, in advising financial and strategic buyers on due
and gastronomy. Mark earned his PhD in computer diligence, accounting structuring and financial reporting

science from the University of North Carolina at Chapel aspects of transactions in technology, semiconductors,
Hill in 2003. He founded and maintains GPGPU.org, a and software transactions both domestically and
web site dedicated to general-purpose computation on internationally. In addition to his M&A experience with
GPUs. Deloitte, Garrett has M&A experience as an investment
hh Session(s): 2084 - State of the Art in professional in industry with Mentmore Holdings
GPU Data-Parallel Algorithm Primitives Corporation (a private equity group), Stellex Technologies
(Tuesday, Sept 21, 17:00) (wireless communications equipment), and Register.
com (NASDAQ: RCOM) where he was responsible for
Harrison, Brian target evaluation, due diligence, divestitures, and post-
(NVIDIA) transaction integration.
hh Session(s): 4009 - Emerging Companies
hh Session(s): 2024 - NVIDIA Acceleration Summit Panel: The “New Normal” For Building
Engines Overview (Pre-Conference Emerging Companies Based On Disruptive
Tutorial) (Monday, Sept 20, 13:00) Technologies (Thursday, Sept 23, 14:00)
hh 2308 - Building Cutting-Edge Realtime
3D Applications with NVIDIA SceniX Herbst, Jeff
(Wednesday, Sept 22, 10:00) Vice President of Business Development (NVIDIA)
Jeff is the Vice President of Business Development
Hart, Evan at NVIDIA Corporation, the world leader in visual

Software Engineer (NVIDIA) computing technologies (and inventor of the GPU). In
Evan presently works as a developer technology this role, which he has held since 2001, Jeff leads
engineer for NVIDIA. He has worked for over a NVIDIA’s worldwide business development efforts,
decade to improve the quality and performance of 3D including overall ecosystem development, mergers and
rendering in applications. He has worked with a diverse acquisitions strategy, investments, partnerships and
set of application domains, including CAD, DCC, other strategic business relationships and transactions.
visualization, and games. Evan received his BS from Prior to NVIDIA, Jeff was the worldwide head of
The Ohio State University. corporate and business development at AltaVista, and
also served as general manager for a start-up focused
on content delivery infrastructure for wireless networks. Hoberock, Jared
Earlier in his career, Jeff was a partner with the law Research Scientist (NVIDIA)
firm of Wilson Sonsini where he specialized in corporate Jared Hoberock joined NVIDIA Research in October
finance, joint ventures, mergers and acquisitions and 2008. His current research interests include high
other strategic business and intellectual property- performance ray tracing and parallel programming
related transactions. Jeff holds a B.S. degree in models. Jared has contributed to both OptiX, NVIDIA’s
Computer Science from Brown University, and a law high performance ray tracing API, and Thrust, an open
degree from Stanford Law School. source library of high-level parallel primitives. Jared
hh Session(s): 4000 - Emerging Companies Summit received a bachelor’s degree in computer engineering
Opening Address (Wednesday, Sept 22, 10:00) from the University of Missouri at Columbia and a Ph.D.
hh 4001 - Emerging Companies: CEO on Stage in computer science from the University of Illinois at
featuring Elemental Technologies, Geomerics, Urbana-Champaign. Jared is a two-time recipient of the
and Milabra (Wednesday, Sept 22, 11:00) NVIDIA Graduate Research Fellowship.
hh 4002 - Emerging Companies: CEO on Stage hh Session(s): 2220 - Thrust by Example: Advanced
featuring Allegorithmic SAS, Bunkspeed, Features and Techniques (Thursday, Sept 23, 14:00)
and miGenius (Wednesday, Sept 22, 14:00)
Hoeg, Steve
hh 4004 - Emerging Companies: CEO on Stage (Adobe)
featuring Cooliris, empulse GmbH, and
Playcast (Wednesday, Sept 22, 16:00) hh Session(s): 2224 - GPU Acceleration in Adobe
hh 4005 - Emerging Companies: CEO on Stage Creative Tools (Tuesday, Sept 21, 15:00)
featuring Jedox Business Intelligence, Rocketick,
and Softkinetic (Wednesday, Sept 22, 17:00) Huang, Jen-Hsun
CEO & President (NVIDIA)
hh 4007 - Emerging Companies: CEO on Stage
featuring Aqumin, RTT, and Scalable Display Jen-Hsun Huang co-founded NVIDIA in 1993 and has
Technologies (Thursday, Sept 23, 10:00) served since its inception as president, chief executive
officer, and a member of the board of directors.
hh 4008 - Emerging Companies: CEO on Under his leadership, NVIDIA invented—and led the
Stage featuring ICD and Universal development of—the graphics processing unit (GPU),
Robotics (Thursday, Sept 23, 11:00) pioneering its use in devices as varied as smart
hh 4009 - Emerging Companies Summit phones, PCs, cars, workstations, and supercomputers.
Panel: The “New Normal” For Building NVIDIA GPUs deliver unmatched visual computing with
Emerging Companies Based On Disruptive breathtaking, interactive graphics that delight users, and
Technologies (Thursday, Sept 23, 14:00) massive parallel computing power that accelerates work
hh 4010 - Emerging Companies: CEO on Stage on the world’s most challenging technical problems.
featuring NaturalMotion Ltd, OptiTex, and NVIDIA was named Company of the Year in 2007 by
Useful Progress (Thursday, Sept 23, 15:00) Forbes magazine and has ranked #1 over the past two
years in Innovation in the Semiconductor industry by
hh 4011 - Emerging Companies: CEO on Stage
Fortune.
featuring Cinnafilm, Inc., Perceptive Pixel and
Total Immersion (Thursday, Sept 23, 16:00) hh Session(s): 4006 - Fireside Chat with
Jen-Hsun Huang - Co-founder & CEO,
Hill, Chris NVIDIA (Thursday, Sept 23, 09:00)
Principal Research Scientist (M.I.T.)
Hummel, Michael
Chris Hill is a computational scientist from MIT who
Managing Director (empulse GmbH)

specializes in modeling planetary fluid dynamics. With
collaborators he has been developing fluid models of Michael studied electronics and computer science at
atmosphere and ocean processes for 20 years. He Cranfield University, University of Hertfordshire and FHT-
is a lead developer of the open-source M.I.T. General Esslingen. He received several awards for outstanding
Circulation Model (http://mitgcm.org) and has been achievements. In 1994 he joined Accenture and worked
exploring applications of accelerators for several years. as a process and technology consultant. As the youngest
With colleagues, he is developing a GPU-oriented manager in Germany, he left Accenture in 1999 and
accelerator library for geosciences. continued his career as project and program manager
for complex business and technology projects. In 2007
hh Session(s): 2167 - Designing a Geoscience
he founded “empulse” together with a former colleague
Accelerator Library Accessible from High Level
as a professional service and software development
Languages (Wednesday, Sept 22, 17:00)
company.
Hoang-Trong, Tuan hh Session(s): 4004 - Emerging Companies:
PhD Student (George Mason University) CEO on Stage featuring Cooliris, empulse
GmbH, and Playcast Media Systems
Tuan received his B.Eng. in Computer Science and (Wednesday, Sept 22, 16:00)
Engineering from HoChiMinh City University in Vietnam
in 2005. His M.Eng at Chonnam National University
Humphrey, John
(South Korea) was in Computer Engineering, where he
Senior Engineer (EM Photonics, Inc)
conducted research in artificial neural networks and protein-
spot matching in 2-dimensional gel electrophoresis John Humphrey is a member of the Accelerated
from 2006 to 2008. Since 2008, he has been a PhD student at Computing Solutions group at EM Photonics. He earned
George Mason University, Department of Bioinformatics his MSEE degree from the University of Delaware,
and Computational Biology. His current research studying the acceleration of electromagnetics algorithms
interests are in calcium signalling and building cardiac cell using custom hardware platforms. At EM Photonics, he
models using high-performance computing with GPU launched a GPU research effort in 2005 with an FDTD
technology. solver based on OpenGL methods. Since then, he has
worked on accelerated algorithms in a variety of fields,
hh Session(s): 2172 - Unveiling Cellular &
including linear algebra solvers and computational fluid
Molecular Events of Cardiac Arrhythmias
dynamics engines.
(Tuesday, Sept 21, 11:00)
hh Session(s): 2153 - CULA - A Hybrid GPU Linear was vice president of marketing for software maker
Algebra Package (Thursday, Sept 23, 15:00) Patron Systems. Prior to that, he was vice president
hh 2154 - The Impact of Data Movement on GPU of sales for Entelagent Software Corporation and
Performance (Wednesday, Sept 22, 16:00) ViewTech Corporation. He founded GroupNet, Inc., a
PictureTel Corporation reseller. Jamison spent ten years
Hwu, Wen-mei with PictureTel Corporation where he was the fourth
Professor (University of Illinois, Urbana-Champaign) employee of the company and also a founding member
of the European management team of PictureTel
Wen-mei W. Hwu is a Professor of ECE at the University International LTD. During his years at PictureTel,
of Illinois at Urbana-Champaign. He received the ACM revenue grew to over $200M and market value reached
Maurice Wilkes Award, the ACM Grace Murray Hopper more than $1.0B. He received his undergraduate
Award, and the ISCA Most Influential Paper Award. business education from Northeastern University and
He is a fellow of IEEE and ACM and leads the GSRC conducted post-graduate studies in finance at Fairfield
Concurrent Systems Theme. He directs the UIUC CUDA University.
Center of Excellence. Dr. Hwu also received his Ph.D.
degree in Computer Science from UC Berkeley. hh Session(s): 4007 - Emerging Companies: CEO
on Stage featuring Aqumin, RTT, and Scalable
hh Session(s): 2264 - CUDA Centers of Excellence Display Technologies (Thursday, Sept 23, 10:00)
Super-Session III (Tuesday, Sept 21, 16:00)
hh 2249 - New Programming Tools GPU Jargstorff, Frank
Computing (Wednesday, Sept 22, 10:00) Software Engineer (NVIDIA)
Frank Jargstorff is a software engineer leading NVIDIA’s
Iribe, Brendan Performance Primitives effort (NPP). Frank received his
President (Scaleform) degree in computer science in 1997 from the University
Brendan Iribe co-founded Scaleform and established of Tübingen, Germany.
the company as the #1 video game user interface (UI) hh Session(s): 2216 - CUDA Libraries Open
and video codec provider. Brendan pioneers all aspects House (Wednesday, Sept 22, 11:00)
of product research, development, and promotion at
Scaleform. Under his leadership, Scaleform GFx has Jensen, Eric
been adopted by most commercial 3D engines (UE3, Partner, Business Department Chair (Cooley LLP)
CryEngine, Gamebryo) and licensed for use in over 600
titles in less than 4 years, including hit games from 19 of Eric C. Jensen is a business partner in the Cooley Palo
the top 20 worldwide video game publishers. Alto office. Mr. Jensen is head of the Firm’s Business
department and a member of the Management
hh Session(s): 2241 - Standing Out: Implementing Committee. Mr. Jensen has been with the Firm
a Great Stereo UI (Thursday, Sept 23, 14:00) since 1988 and a partner since 1994. Mr. Jensen
practices securities and general corporate law, with
Iles, Andrew an emphasis on the representation of emerging and
Software Director (NVIDIA) public software, semiconductor, internet, and other
hh Session(s): 2225 - Tools for Managing Clusters of information technology companies. He also has
NVIDIA GPUs (Tuesday, Sept 21, 17:00) extensive experience representing venture capital
funds and underwriters. He has counseled clients in
Ismert, Ryan the areas of corporate formations, venture financings,
Director of Engineering (Sportvision, Inc.) public offerings of equity and debt, mergers and
acquisitions, joint ventures, licensing and related
Ryan has been building augmented reality systems for strategic transactions, employee incentive matters and

broadcast TV at Sportvision for 7 years. He currently SEC reporting and compliance. Mr. Jensen has been
leads a team focused on disrupting the current state of included as one of The Best Lawyers in America in 2006
camera tracking and broadcast rendering by leveraging

- 2011 and named as one of Northern California’s “Super
the power of multiple GPUs. Lawyers” in 2007 - 2010. Mr. Jensen has also been
hh Session(s): 2123 - Enabling Augmented Reality ranked as a leading lawyer in Investment Funds: Venture
with GPU Computing (Thursday, Sept 23, 15:00) Capital in Chambers USA 2010 edition.
hh Session(s): 4009 - Emerging Companies
Iyer, Kumar Summit Panel: The “New Normal” For Building
Product Manager (NVIDIA) Emerging Companies Based On Disruptive
Kumar Iyer is a Product Manager of Developer Tools Technologies (Thursday, Sept 23, 14:00)
at NVIDIA, where he works on the most advanced GPU
development tools in the world. Prior to his work at Jeong, Byungil
NVIDIA, Kumar worked on PC and console games at Visualization Scientist (TACC / UT-Austin)
Electronic Arts, and in research at the USC Institute for Byungil Jeong is a visualization scientist with the
Creative Technologies. Kumar holds a B.S. in Computer Texas Advanced Computing Center at the University
Science from UCLA, and an MBA from the UCLA of Texas at Austin and a primary Scalable Adaptive
Anderson School of Management. Graphics Environment (SAGE) architect. His research
hh Session(s): 2245 - Parallel Nsight for interests include scalable parallel graphics architecture,
Microsoft Visual Studio (Pre-Conference collaborative remote visualization, large-scale data
Tutorial) (Monday, Sept 20, 16:00) visualization, and high-resolution display systems. Jeong
hh 2149 - Overview of Parallel Nsight for has a PhD in computer science from the University of
Visual Studio (Thursday, Sept 23, 10:00) Illinois at Chicago.
hh 2149 - Overview of Parallel Nsight for hh Session(s): 2144 - Large-Scale Visualization
Visual Studio (Tuesday, Sept 21, 11:30) Using A GPU Cluster (Wednesday, Sept 22, 16:00)

Jamison, Andrew Jeong, Won-Ki

CEO (Scalable Display Technologies) Research Scientist (Harvard University)
Andrew Jamison is an experienced executive with four Dr. Jeong is a research scientist at the Harvard Center
successful start-ups to date. Most recently Andrew for Brain Science (CBS). His research interests include
image processing, scientific visualization, and GPGPU high-performance energy-efficient heterogeneous
in the field of biomedical image analysis. He received a architectures, programmer-compiler-microarchitecture
Ph.D. degree in Computer Science from the University interaction, especially for CPU/GPU systems. She
of Utah in 2008, and was a member of the Scientific received an MS and a Ph.D. in computer engineering at
Computing and Imaging (SCI) institute at Utah. He The University of Texas at Austin.
received an NVIDIA Fellowship in 2007. He is currently a hh Session(s): 2164 - Analytical Performance
professional member of ACM. Models to Improve the Efficiency of GPU
hh Session(s): 2139 - Interactive Histology Computing (Wednesday, Sept 22, 14:00)
of Large-Scale Biomedical Image Stacks
(Wednesday, Sept 22, 14:00) Kirsanov, Danil
Scientist (ANSYS)
Juckeland, Guido
Senior System Engineer (HPC), Leader Hardware hh Session(s): 2066 - Accelerating System
Accelerator Group (TU Dresden - ZIH) Level Signal Integrity Simulation
(Thursday, Sept 23, 16:30)
Guido Juckeland received his M.Sc. from TU Dresden in
Information System Technology. He is responsible for the Kloeckner, Andreas
operation and design of HPC resources of TU Dresden. Courant Instructor (Courant Institute, NYU)
Currently, he is working on a Ph.D. thesis “Performance
Analysis for Hardware Accelerators”. Andreas recently completed his PhD in applied
mathematics with Jan Hesthaven at Brown University,
hh Session(s): 2090 - Developing Highly Scalable working on various aspects of high-order finite element
Particle-Mesh Codes for GPUs: A Generic methods. In September 2010, he will be joining the
Approach (Tuesday, Sept 21, 15:00) Courant Institute of Mathematical Sciences at New York
hh 2089 - Analyzing CUDA Accelerated University to work on problems in
Application Performance at 20 PFLOP/s computational electromagnetics with Leslie Greengard.
(Wednesday, Sept 22, 17:00) He is the main author of the PyCUDA and PyOpenCL
GPU computation packages.
Kapasi, Ujval hh Session(s): 2041 - PyCUDA: Even
CUDA Platform SW (NVIDIA) Simpler GPU Programming with Python
hh Session(s): 2216 - CUDA Libraries Open (Wednesday, Sept 22, 14:00)
House (Wednesday, Sept 22, 11:00)
Kohlmeyer, Axel
Kaplan, Michael Associate Director (Institute for Computational Molecular
Vice President of Strategic Development (mental images/ Science, Temple University)
NVIDIA) Axel Kohlmeyer is the Associate Director of the Institute
30 years of experience and contributions in the for Computational Molecular Science and the Associate
3D graphics industry; master’s degree from the Professor of Chemistry and Computer Science at Temple
Cornell University Program of Computer Graphics University. He earned his PhD in Theoretical Chemistry
in 1980; inventor of spatial-subdivision raytracing, at the University of Ulm. His main interests are making
the first object-oriented 3D scene graph, and other creative use of scientific computing tools to advance
technologies; Michael Kaplan is currently the VP understanding of processes at the molecular and
Strategic Development at mental images/NVIDIA. He is atomistic level, and making these tools more capable
responsible for the iray CUDA accelerated photorealistic and accessible.
rendering technology project. hh Session(s): 2168 - Interactive Molecular
hh Session(s): 2013 - iray - GPUs and the Dynamics for Nanomechanical and Nanochemical
Photorealistic Rendering Revolution Experiments (Wednesday, Sept 22, 10:00)
(Tuesday, Sept 21, 14:00)

Korf, Dave
Kerr, Andrew Scalable Computing & Infrastructure Organization,
PhD Student (Georgia Institute of Technology) Marketing (HP)
Mr. Korf has over 15 years of engineering experience
hh Session(s): 2210 - GPU-Ocelot: An Open with the last 20 plus years in various senior marketing,
Source Debugging and Compilation Framework product management and partner management
for CUDA (Thursday, Sept 23, 14:00) positions. Accelerators, partners and competitive
analysis are currently some of his focus areas.
Khadtare, Mahesh
Member, Technical Staff (Computational Research hh Session(s): 2233 - Solving Your GPU Computing
Laboratories, Pune, INDIA.) Needs (Sponsored by HP) (Tuesday, Sept 21, 14:00)

hh Session(s): 2298 - Accelerated Image Kramer, Thomas

Quality Assessment using Structural VP Product Management (MainConcept)
Similarity (Thursday, Sept 23, 11:30) Thomas is VP Product Management at MainConcept. He
has more than 8 years of experience in Video and Audio
Kilgard, Mark Technologies ranging from actual Codec Programming
Principal System Software Engineer (NVIDIA) and Pre-Sales activities to Codec Product Management.
hh Session(s): 2127 - OpenGL (Pre-Conference His current responsibility is for SDK business as well as
Tutorial) (Monday, Sept 20, 16:00) end-user Applications. In his previous businesses he
has worked for archiving, bank and webservice oriented
Kim, Hyesoon clients. He has skills in education and training for
Assistant Professor (Georgia Tech) various PC based tools and applications. Thomas has
studied Business Informatics in Cologne, Germany.
Hyesoon Kim is an Assistant Professor in the School of
Computer Science at the Georgia Institute hh Session(s): 2048 - H.264/AVC Video Encoding with
of Technology. Her research interests include CUDA and OpenCL (Thursday, Sept 23, 09:00)
Krishna, Murali research focus is on efficient algorithms and data
Junior Research Associate (Infosys Technologies Limited) structures for multidimensional aggregation for Online
Murali Krishna graduated with a master’s degree Analytical Processing (OLAP). His work is funded by
from IIIT Bangalore and has been working as a Junior the German Research Foundation (DFG) and is carried
Research Associate with SETLabs, the R&D arm of out in collaboration with Jedox Business Intelligence, a
Infosys Technologies. Murali has worked in areas of company specializing in Business Intelligence software.
parallelism in workflows and automatic data parallelism hh Session(s): 2237 - Accelerating
extraction in legacy applications. Now, Murali’s research Business Intelligence Applications with
is focused on porting data parallel applications to GPUs. Fast Multidimensional Aggregation
hh Session(s): 2120 - High Performance (Wednesday, Sept 22, 15:00)
Complex Event Processing on GPGPU
(Wednesday, Sept 22, 14:00) Lazebnik, Roee
Director of Product Development (Siemens Healthcare)
Krishnamurthy, Adarsh Dr. Roee Lazebnik’s professional experience and training
Student (University of California Berkeley) consist of clinical radiology, biomedical engineering,
Adarsh Krishnamurthy is a PhD Candidate in the information technology, software development, and
department of Mechanical Engineering at U.C. Berkeley. healthcare business. He is the author of numerous
His research interests include Computer Aided Design published manuscripts, textbook chapters, and
(CAD), solid modeling, GPU algorithms, computational conference proceedings on many topics in medical
geometry, and ultrasonic non-destructive testing. He imaging.
received his Bachelor’s and Master’s degrees in Mechanical hh Session(s): 2169 - Real-time Volumetric
Engineering from Indian Institute of Technology, Madras. Medical Ultrasound Applications for GPU
hh Session(s): 2171 - Parallel Algorithms Computing (Wednesday, Sept 22, 10:00)
for Interactive Mechanical CAD
(Thursday, Sept 23, 11:00) Le Grand, Scott
Principal Engineer (NVIDIA)
Kunz, Holger Scott is a principal engineer on the CUDA platform team
Director, Workstation Software Development (NVIDIA) at NVIDIA with a B.S. in biology from Siena College and
Holger Kunz’s work has been in the area of professional a Ph.D. in biochemistry from the Pennsylvania State
visualization covering real-time visualization, photo- University. Scott developed Genesis, the first molecular
realistic visualization, multi-GPU rendering, shading modeling system for home computers, Folderol,
languages and computer vision. He received his Dipl. the first distributed computing project targeting the
Inform. at the University of Erlangen-Nürnberg. protein folding problem, and BattleSphere, a 3D space
shooter for the Atari Jaguar. More recently, he ported
hh Session(s): 2024 - NVIDIA Acceleration the Folding@Home and AMBER molecular modeling
Engines Overview (Pre-Conference codebases to CUDA.
Tutorial) (Monday, Sept 20, 13:00)
hh Session(s): 2218 - Redesigning Molecular
Lanza, Drew Dynamics for GPUs and GPU Clusters
Partner (Morgenthaler) (Wednesday, Sept 22, 15:00)
Drew, based in Menlo Park, CA, joined Morgenthaler in Leback, Brent
2000 and became a Partner in 2001. Drew focuses on Engineering Manager (The Portland Group)
cleantech, semiconductors and systems. He is currently
a Director of Cortina Systems, Overture Networks, Brent Leback is an Engineering Manager for PGI. He

OmniPV, Unity Semiconductor, Autonet Mobile, SiPort, has worked in various positions over the last 26 years
ZeroG, and R2 Semiconductor. Drew spent 15 years in in HPC customer support, math library development,

senior operating positions in the telecommunications applications engineering and consulting at QTC, Axian,
industry starting companies in both the components PGI and STMicroelectronics.
and the systems sectors of that industry. Drew was hh Session(s): 2143 - CUDA Fortran Programming
a founder and VP of Engineering at E/O Networks for NVIDIA GPUs (Wednesday, Sept 22, 15:30)
where he helped to design and produce a long reach
rural fiber optic telephony system. Drew started his Lecomber, David
optical telecommunications career in 1986 at Raynet, CTO (Allinea Software)
a pioneering company in the development of fiber to David Lecomber is one of the founders of Allinea and
the home technologies. Drew’s many roles at Raynet leads the research and development team behind Allinea
included VP of Marketing and VP of International DDT, the world’s most scalable parallel debugger.
Development. Drew was the founding CEO of Lightwave
Microsystems, a leader in the design and manufacture of hh Session(s): 2039 - GPU Debugging with
high volume optical integrated circuits. Drew graduated Allinea DDT (Wednesday, Sept 22, 11:00)
magna cum laude from Harvard with an MBA in 1987.
He received his BSEE & MSEE degrees from Stanford in Lee, HyoukJoong
1979. PhD Student (Stanford University)
hh Session(s): 4001 - Emerging Companies: HyoukJoong Lee is a PhD student in electrical
CEO on Stage featuring Elemental engineering at Stanford University. His research
Technologies, Inc., Geomerics, and interests include parallel systems architecture and
Milabra (Wednesday, Sept 22, 11:00) programming models. He has a BS degree from Seoul
National University.
hh 4002 - Emerging Companies: CEO on Stage
featuring Allegorithmic SAS, Bunkspeed, hh Session(s): 2177 - Simplifying Parallel
and miGenius (Wednesday, Sept 22, 14:00) Programming with Domain Specific
Languages (Wednesday, Sept 22, 11:00)
Lauer, Tobias
Researcher (University of Freiburg) Lee, John
(Appro)
Tobias Lauer received his PhD in computer science
from the University of Freiburg in 2007. His current John K. Lee joined Appro in 2001, and is responsible
for leading Appro’s hardware product development hh Session(s): 2088 - Nucleotide String
engineering team. In addition, Mr. Lee leads the Matching Using CUDA-Accelerated
company’s Project Management team that is responsible Agrep (Thursday, Sept 23, 16:00)
for deploying Appro’s complex cluster solutions. He has
served as the Program Executive for some of Appro’s Lichtenbelt, Barthold
most important cluster projects such as the 2006 Peloton Sr. OpenGL Manager (NVIDIA)
Project as well as the 2007 TLCC Cluster Project. Prior to his Barthold Lichtenbelt is a Sr. Manager of the OpenGL
role at Appro, Mr. Lee served in both Sales and Service core driver team at NVIDIA. He is also the Chair of the
Management capacities at multiple storage and telecom OpenGL ARB Khronos Working Group.
companies.
hh Session(s): 2127 - OpenGL (Pre-Conference
hh Session(s): 2270 - Appro’s GPU Computing Tutorial) (Monday, Sept 20, 16:00)
Solutions (Tuesday, Sept 21, 15:00)
Lin, Chun-Yuan
Lefebvre, Matthieu Assistant Professor (Department of CSIE, Chang Gung
PhD Student (ONERA) University)
Matthieu Lefebvre is a PhD student at ONERA, the Chun-Yuan Lin joined the Department of Computer
French aerospace lab, and at Université Paris 13. He is core driver team at NVIDIA. He is also the Chair of the
working on accelerating CFD simulations on GPU. University as an assistant professor. His research
hh Session(s): 2045 - Roe-Pike Scheme for 2D interests are in the areas of parallel and distributed
Euler Equations (Wednesday, Sept 22, 14:00) computing, parallel algorithms, algorithm analysis,
information retrieval, proteomics, and bioinformatics.
Lever, Ben hh Session(s): 2105 - CUDA-FRESCO: An
Senior Research Engineer (NICTA) Efficient Algorithm for Mapping Short
Ben Lever is a senior research engineer at NICTA Reads (Thursday, Sept 23, 15:00)
currently developing new methodologies and
frameworks for describing computer vision algorithms Linderman, Michael
that can target heterogeneous, highly-parallel platforms. Engineering Research Associate (Stanford University)
Prior to NICTA, Ben was a hardware design engineer at Michael Linderman is an Engineering Research
Canon Research before joining Synopsys as a software Associate in the Computer Systems Laboratory
engineer for developing real-time simulation models of at Stanford University. His research focuses on
embedded processors. using graphics processing units (GPUs) and other
hh Session(s): 2173 - Enabling Large-Scale CCTV heterogeneous computer systems to accelerate
Face Recognition (Thursday, Sept 23, 11:00) computational systems biology and other data- and
compute-intensive applications. Michael earned a
Lewin, Dan’l Ph.D. and MS in Electrical Engineering from Stanford
Corporate Vice President, Strategic and Emerging University in 2009 and 2006 respectively, and a B.S. from
Business Development (Microsoft) Harvey Mudd College in 2003.
Dan’l Lewin is responsible for leading Microsoft’s global hh Session(s): 2030 - High-Throughput
engagement with startups and venture capitalists. Cell Signaling Network Learning with
In addition, Lewin has executive, site, and citizenship GPUs (Thursday, Sept 23, 09:00)
responsibility for the company’s operations in the
Silicon Valley, based in Mountain View, California, Loddoch, Alex
which currently employ 2,500 people and support Research Scientist (Chevron)
business relationships with industry partners in Silicon Alex received an MSc in Physics in 2001 and a PhD
Valley. Lewin’s business development teams focus in Geophysics in 2007 from the University of Muenster,
on supporting the software startup and entrepreneur Germany, working on topics including geophysical fluid
ecosystem developing on the Microsoft platform while dynamics, parallel computing and data compression.
helping foster and grow local software economies In 2007 he joined Chevron as a Research Scientist,
worldwide. Through the Microsoft BizSpark Program, working in the area of High Performance Computing and
and the Microsoft Innovation Center Program, the particularly GPU computing.
groups help accelerate startup success in more than 100
countries. hh Session(s): 2174 - Reverse Time Migration
on GPUs (Wednesday, Sept 22, 15:00)
hh Session(s): 4001 - Emerging Companies:
CEO on Stage featuring Elemental Löhner, Rainald
Technologies, Inc., Geomerics, and Professor (George Mason University)
Milabra (Wednesday, Sept 22, 11:00)
Rainald Löhner is the head of the CFD center at the
hh 4002 - Emerging Companies: CEO on Stage department of computational and data sciences of
featuring Allegorithmic SAS, Bunkspeed, George Mason University in Fairfax, VA, in the outskirts
and miGenius (Wednesday, Sept 22, 14:00) of Washington, D.C. He received a MSc in Mechanical
Engineering from the Technische Universitaet
Li, Hongjian Braunschweig, Germany, as well as a PhD and DSc
Graduate Student (The Chinese University of Hong Kong) in Civil Engineering from the University College of
Hongjian Li is currently working on a M.Phil. degree Swansea, Wales, where he studied under Profs. Ken
under the supervision of Prof. Kwong Sak Leung and Morgan and Olgierd Zienkiewicz. His areas of interest
Prof. Man Hon Wong. His major research interests include numerical methods, solvers, grid generation,
are GPU applications in bioinformatics, particularly parallel computing, visualization, pre-processing,
computer-aided drug design by means of high fluid-structure interaction as well as shape and process
throughput in silico virtual screening via structure-based optimization. His codes and methods have been applied
ligand-protein docking. He is developing new software in many fields, including aerodynamics of airplanes,
that utilizes the computational horsepower of graphics cars and trains, hydrodynamics of ships, submarines
processors with the purpose of accelerating the pipeline of and UAVs, shock-structure interaction, dispersion
drug discovery. analysis in urban areas and haemodynamics of vascular
diseases. He is the author of more than 600 articles Bob Lucas is the Computational Sciences Division
covering the fields enumerated above, as well as a Director at the University of Southern California’s
textbook on Applied CFD Techniques. Information Sciences Institute. He has been developing
hh Session(s): 2005 - Porting Large-Scale Legacy parallel linear solvers since 1985.
Fortran Codes (Wednesday, Sept 22, 17:00) hh Session(s): 2240 - Accelerating LS-DYNA with MPI,
OpenMP, and CUDA (Thursday, Sept 23, 14:30)
Loop, Charles
Senior Researcher (Microsoft Research) Lumsdaine, Andrew
Charles Loop is a Senior Researcher in the Microsoft Professor (Indiana University)
Research Graphics Group. He has worked extensively in Andrew Lumsdaine is a professor in the School of
the areas of curve and surface modeling and rendering, Informatics & Computing at Indiana University, and an
including work on n-sided patches, smooth patch Associate Director of the Digital Science Center and
complexes, as well as GPU algorithms for rendering Director of the Open Systems Lab at the Pervasive
vector art. Charles is best known for the triangle mesh Technology Institute. Lumsdaine received his Ph.D.
subdivision algorithm that bears his name. He is from MIT in 1992, and from 1992 through 2001, he
currently working on data parallel algorithms for REYES was a faculty member in the Department of Computer
style rendering and accelerating raytracing of surface Science and Engineering at the University of Notre
primitives. Dame. His research interests include computational
hh Session(s): 2129 - Hardware Subdivision science and engineering, parallel and distributed
and Tessellation of Catmull-Clark computing, software engineering, generic programming,
Surfaces (Tuesday, Sept 21, 16:00) mathematical software, and numerical analysis.
Lumsdaine is a member of ACM, IEEE, and SIAM, as well
Lorach, Tristan as the MPI Forum, the BLAS technical forum and the
Computergraphics Engineer (NVIDIA) ISO C++ standards committee. In 1995, he received the
Career Development Award from the National Science
Tristan Lorach has worked on many realtime interactive Foundation.
events all over the world. Tristan is now working at
NVIDIA in the developer technical relations department hh Session(s): 2093 - Computational
(a.k.a. Devtech), participating in a variety of projects in
relation with NVIDIA partners. At the same time, he is Rendering (Wednesday, Sept 22, 16:00)
also contributing to R&D, writing demos for new GPU
chips. Lunn, Philip
CEO (Bunkspeed)
hh Session(s): 2056 - Next-Generation Rendering
with CgFX (Tuesday, Sept 21, 16:00) Philip Lunn is the visionary founder of Bunkspeed.
He brings his passion for computer graphics to
Ltaief, Hatem democratize the creation of photographic quality 3D
Sr. Research Associate (University of Tennessee) imagery and animation to every Bunkspeed product.
Simplicity without compromise enables explosive growth
Hatem Ltaief received the MSc degree from the school of new users and new untapped markets. Mr. Lunn
of engineering at the University of Claude Bernard has over 20 years of technology and entrepreneurial
Lyon I, France, the MSc in applied mathematics at the experience and holds a Bachelor of Science degree in
University of Houston and the PhD degree in computer Mechanical Engineering from the University of Arizona.
science from the University of Houston. He is a research
associate in the Innovative Computing Laboratory in the hh Session(s): 4002 - Emerging Companies: CEO on
Department of Electrical Engineering and Computer Stage featuring Allegorithmic SAS, Bunkspeed,
and miGenius (Wednesday, Sept 22, 14:00)
SPEAKERS AND

Science at the University of Tennessee, Knoxville.

His research interests include parallel algorithms,
Mallick, Dr. Sudeep
PANELISTS

specifically in the area of numerical linear algebra, and

also parallel programming models and performance Principal Research Scientist (Infosys)
optimization for parallel architectures spanning hh Session(s): 2120 - High Performance Complex
distributed and shared memory systems, as well as next Event Processing on GPGPU (Wednesday, Sept 22,
generation multi-core and many-core processors. 14:00)
hh Session(s): 2138 - Faster, Cheaper,
Better – Hybridization of Linear Algebra Malcolm, James
for GPUs (Thursday, Sept 23, 09:00) VP of Engineering (AccelerEyes)
James Malcolm is VP of Engineering at AccelerEyes and
Lu, Peter a co-founder. He holds degrees in Mathematics (BS),
Post-Doctoral Research Fellow (Harvard University) Computer Science (BS, MS), and Electrical Engineering
Peter J. Lu is a post-doctoral research fellow in physics (MS) from Georgia Institute of Technology.
at Harvard University; he focuses on the physics of hh Session(s): 2271 - Compose CUDA
attractive colloids, integrating high-performance Masterpieces! Write better, Leverage
imaging analysis. He conducts experiments aboard the More (Thursday, Sept 23, 16:00)
International Space Station, and has also published
his discoveries of modern quasicrystal geometry in Marbach, Jonathan
medieval Islamic architectural tilings; the first precision Director of Software Architecture and Engineering
compound machines, from ancient China; the first use (TerraSpark Geosciences, LLC)
of diamond, in prehistoric China; and the first natural
quasicrystalline mineral. For more information, see Jon is currently working as Director of Software
http://www.peterlu.org Architecture and Engineering at TerraSpark Geosciences,
makers of the breakthrough 3D Seismic Interpretation
hh Session(s): 2242 - Swarming Bacteria and package Insight Earth. He specializes in OpenGL and
Diffusing Particles: High-Throughput Analysis of Virtual Reality, and received his PhD in Computer Science
Microscopic 3D Motion (Wednesday, Sept 22, 17:00) from the University of Colorado at Boulder in 2009. His
PhD dissertation investigates techniques for supporting
Lucas, Bob stereoscopic 3D for simultaneous participants in a
Computational Sciences Division Director (University of virtual reality environment.
Southern California)
hh Session(s): 2107 - Accelerating Stereographic passion of storytelling and filmmaking on nights and
and Multi-View Images Using Layered weekends, finding the time to write, direct and produce
Rendering (Thursday, Sept 23, 15:00) numerous independent films (one feature film, two
shorts and a music video). It was during these parallel
Masaie, Issei journeys that his “a-ha” moment occurred and Lance’s
Chief GPU Engineer (Prometech Software, Inc.) entrepreneurial spirit ignited a new path for him to
He received his master’s degree in Quantum follow in the field of image processing.
Engineering and System Science from the University of hh Session(s): 4011 - Emerging Companies: CEO
Tokyo. He joined Prometech Software, Inc. in 2007. He on Stage featuring Cinnafilm, Perceptive Pixel,
currently heads the development of the GPU accelerated and Total Immersion (Thursday, Sept 23, 16:00)
version of Particleworks.
hh Session(s): 2106 - Particleworks: McAllister, Dave
Particle-based CAE Software on Multi- Development Lead, OptiX Development (NVIDIA)
GPU (Thursday, Sept 23, 11:30) David is the lead developer for the next release of OptiX.
Prior to joining the OptiX group two years ago, David was
Mason, Chris a GPU architect at NVIDIA for eight years, working on
Product Manager (Acceleware) the design of all GPU families from GeForce 3 through
Fermi. David holds a Ph.D. in Computer Science from
hh Session(s): 2208 - Acceleration of the University of North Carolina at Chapel Hill.
SIMULIA’s Abaqus Solver on NVIDIA
GPUs (Thursday, Sept 23, 15:30) hh Session(s): 2261 - Introduction to GPU Ray
Tracing with NVIDIA OptiX (Pre-Conference
Mastrobuono Battisti, Alessandra Tutorial) (Monday, Sept 20, 14:30)
PhD Student (Sapienza University of Rome)
McMains, Sara
Alessandra Mastrobuono Battisti is a second-year PhD Associate Professor (University of California Berkeley)
student in Astronomy at the University of Rome La
Sapienza. She received her Bachelor’s Degree in Physics Sara McMains is an Associate Professor of Mechanical
and Astrophysics from the University La Sapienza in Engineering at UC Berkeley. Her research interests
2006. She also has a Master’s Degree in Astronomy and include geometric design for manufacturing feedback,
Astrophysics from the same University. solid modeling, CAD/CAM, GPU algorithms, computer
Her PhD program concerns the study of the dynamical aided process planning, layered manufacturing,
evolution of N-Body systems. computer graphics, visualization, virtual prototyping, and
To this end, she developed a high-performance code, virtual reality. She received her A.B. from Harvard and
NBSymple, that runs on composite architectures. her M.S. and Ph.D. from UC Berkeley, all in Computer
Science. She is the recipient of Best Paper Awards from
hh Session(s): 2000 - Gravitational N-body Usenix (1995) and ASME DETC (2000), a Best Poster and
Simulations: How Massive Black Holes Interact a Best Paper Award from the ACM Solid and Physical
with Stellar Systems (Wednesday, Sept 22, 14:00) Modeling Symposium (2007, 2008 -- 2nd place), and the
NSF CAREER Award (2005).
Matsuoka, Satoshi
Professor (Tokyo Institute of Technology) hh Session(s): 2171 - Parallel Algorithms
for Interactive Mechanical CAD
Satoshi Matsuoka received his Ph. D. from the University (Thursday, Sept 23, 11:00)
of Tokyo in 1993. He became a full Professor at the
Global Scientific Information and Computing Center Meister, Benoit
(GSIC) of Tokyo Institute of Technology (Tokyo Tech / Senior Engineer (Reservoir Labs)

Titech) in April 2001, leading the Research Infrastructure
Division Solving Environment Group of the Titech Benoit Meister received his BSc in Physics and his

campus. He has won several awards including the Sakai PhD in computer science from Strasbourg University,
award for research excellence from the Information in automatic program parallelization and optimization
Processing Society of Japan in 1999, and recently using a polyhedral model of computation loops. After
received the JSPS Prize from the Japan Society for a post-doc at Verimag Grenoble, Benoit joined
the Promotion of Science in 2006 from His Royal Highness Reservoir Labs where he contributed to the development
Prince Akishinomiya. and management of R-Stream, an advanced auto-
parallelizing compiler based on extensions of the
hh Session(s): 2265 - CUDA Centers of Excellence polyhedral optimization techniques to date. At the
Super-Session IV (Tuesday, Sept 21, 17:00) moment, R-Stream successfully targets 6 radically
hh 2280 - TSUBAME2.0 Experience different architectures and parallelizes a broad range of
(Wednesday, Sept 22, 10:00) applications.
hh Session(s): 2202 - A Programming Model and
Maurer, Lance Tool for Automatic High-Performance C to
CEO (Cinnafilm, Inc.) CUDA Mapping (Thursday, Sept 23, 09:00)
Lance Maurer is the founder, president and CEO of
Cinnafilm, Inc. – an American engineering company Menon, Shashi
dedicated to global leadership in image optimization R&D Manager (Schlumberger)
using innovative and affordable parallel-processing
methods. Prior to launching Cinnafilm, Maurer hh Session(s): 2141 - Moving the Frontier of Oil
FULL CONFERENCE GUIDE 2010

spent ten years, primarily with Goodrich Aerospace, and Gas Exploration and Production with GPUs
designing, analyzing and testing technology used in (Wednesday, Sept 22, 10:00)
the world’s most advanced spacecraft and launch
vehicles; clients include: NASA, Boeing, Ball, Kodak Meredith, Jeremy
and Lockheed-Martin. He would eventually become Computer Scientist (Oak Ridge National Laboratory)
an expert in thermal, structural and materials design hh Session(s): 2089 - Analyzing CUDA Accelerated
for extreme environmental applications, honing Application Performance at 20 PFLOP/s
regimented engineering philosophy in the true “failure- (Wednesday, Sept 22, 17:00)
is-not-an-option” defense industry. During his tenure
as an aerospace engineer he also pursued his life
Merrill, Duane Morrison, Michael
Ph.D. Candidate (University of Virginia) (NVIDIA)
Duane Merrill is a Ph.D. candidate at the University of hh Session(s): 2308 - Building Cutting-Edge Realtime
Virginia, Department of Computer Science. His advisor 3D Applications with NVIDIA SceniX (Wednesday,
is Professor Andrew Grimshaw. Before graduate school, Sept 22, 10:00)
he was a software developer for Avaki Corporation,
specializing in grid computing middleware. His Mooney, Al (Adobe)
current research interests lie in parallel and high-
performance computing, specifically in regard to hh Session(s): 2224 - GPU Acceleration in Adobe
programming models and algorithmic primitives for Creative Tools (Tuesday, Sept 21, 15:00)
GPGPU, stream, and many-core architectures. Much
of his prior academic work has involved concurrent Morton, Scott
systems in one form or another, including grid and Geophysical Advisor (Hess Corporation)
distributed computing; virtual machines and hypervisor Scott Allyn Morton received his B.A. in physics &
technologies; operating systems and meta-systems; and math from Gustavus Adolphus College and his
security architecture and protocols. Ph.D. in astrophysics from University of Illinois at
hh Session(s): 2296 - CUDA Optimization for Urbana-Champaign. He has 25 years of experience
Ninjas: A Case Study of High-Performance in computational and theoretical physics distributed
Sorting (Wednesday, Sept 22, 15:00) between academia, the computer industry and the
petroleum industry. Scott has worked at NCSA (National
Micikevicius, Paulius Center for Supercomputing Applications), Shell, Thinking
Developer Technology Engineer (NVIDIA) Machines, Cray Research and SGI (Silicon Graphics Inc).
Scott manages the geophysical technology development
Paulius Micikevicius is a Developer Technology group for Hess Corporation and is responsible for
Engineer at NVIDIA with a focus on parallel computing monitoring, testing, adopting and developing new
and performance. Prior to joining NVIDIA, he was an geophysical and computational technologies.
assistant professor of Computer Science at Armstrong
Atlantic State University as well as a research associate hh Session(s): 2059 - Industrial Seismic Imaging
at the Media Convergence Laboratory at UCF. Paulius on GPUs (Wednesday, Sept 22, 11:00)
holds a PhD in Computer Science from the University
of Central Florida and a B.S. in Computer Science from Moulik, Supratik
Midwestern State University. (University of Pennsylvania)
hh Session(s): 2012 - Analysis-Driven Performance Dr. Moulik is a cardiovascular imaging fellow at the
Optimization (Thursday, Sept 23, 15:00) University of Pennsylvania. Combining an engineering
degree from Carnegie Mellon University with 10 years of
hh 2011 - Fundamental Performance Optimizations graduate medical education, Dr. Moulik is a unique blend
for GPUs (Wednesday, Sept 22, 17:00) of physician and programmer. The breadth of training
has allowed him to develop GPU computing algorithms
Miller, Phillip for the medical imaging community which are both
Director, Workstation Software Product Management intuitive and robust.
(NVIDIA)
hh Session(s): 2036 - Algorithms for Automated
Phillip Miller is an accomplished software product Segmentation of Medical Imaging Studies
manager with 16 years of experience guiding industry Utilizing CUDA (Tuesday, Sept 21, 16:00)
leading solutions from companies such as Autodesk and
Adobe. He is also a registered architect, bringing real- Mroue, Abdul

world experience in using tools and directing projects Post-Doc Fellow (CITA, Univ. of Toronto)
to software creation. At NVIDIA, Mr. Miller directs much
Abdul Mroue is a PostDoc Fellow at the Canadian

of the professional middleware produced to enable

software developers to quickly leverage the potential of Institute for Theoretical Astrophysics, University of
the GPU within the applications they produce. Toronto. He received his PhD degree in Physics from
Cornell University in 2009. His research work focuses on
hh Session(s): 2024 - NVIDIA Acceleration solving general relativity numerically.
Engines Overview (Pre-Conference
Tutorial) (Monday, Sept 20, 13:00) hh Session(s): 2108 - Binary Black Holes Simulations
using CUDA (Wednesday, Sept 22, 16:00)
hh 2261 - Introduction to GPU Ray Tracing
with NVIDIA OptiX (Pre-Conference Mrsic-Flogel, Janko
Tutorial) (Monday, Sept 20, 14:30) CTO (Mirriad)
Mitchell, Kenny Janko has over 15 years of experience in Internet and
Research Lead (Black Rock Studio) Mobile Internet and was a pioneer of interactive
television. He has achieved high-yielding returns to
Kenny is the research lead at Black Rock, Disney shareholders in technology company sales and is the
Interactive Studios. His Ph.D. introduced the use of founder of several successful IT and Telecoms ventures.
real-time 3D for information visualization on consumer Prior to joining MirriAd he was Managing Director of
hardware, including a novel recursive perspective Dynamical Systems Research, Technical Director at
projection technique. Over the past ten years, he has Digital Mobility, and Director of Applied Technology
shipped games using high-end graphics technologies at BBC Vecta, the BBC’s Venturing arm. Janko is an
including voxels, PN patches, displacement mapping, and acknowledged speaker, lecturer, author of scientific
clipmaps. In between shipping games for the Harry Potter publications and inventor of patent applications in areas
franchise and now racing games with Split/Second, he is of Computing, Neural Networks, Telecommunications,
also involved in developing new intellectual properties. Interactive Television and Electrical Engineering. Janko
hh Session(s): 2047 - Bridging Ray and Raster holds a PhD in Electrical Engineering from Imperial
Processing on GPUs (Tuesday, Sept 21, 11:00) College.
hh Session(s): 4003 - Emerging Companies
Summit Panel: GPU for Computer
Vision (Wednesday, Sept 22, 15:00)
Mueller, Frank data analysis (VDA) and innovative design for large-scale
Associate Professor (North Carolina State University) VDA systems. His recent work includes algorithms for
ultrascale distributed-memory ray tracing, work that
hh Session(s): 2272 - GStream: A General- enables photo-realistic rendering of the largest datasets
Purpose Data Streaming Framework produced on supercomputers today, such as cosmologic
on GPUs (Thursday, Sept 23, 09:00) simulations of the Universe and computational fluid
hh 2026 - MatCloud: Accelerating Matrix Math GPU dynamics (CFD) simulations at unprecedented levels
Operations with SaaS (Tuesday, Sept 21, 17:00) of detail. Paul also helps manage TACC’s visualization
systems, including Longhorn, the largest supercomputer
Murphy, K.C. in the world dedicated to VDA, and Stallion, the highest-
Vice President of Marketing (NextIO) resolution tiled display in the world.
K.C. is a semiconductor veteran with more than 25 hh Session(s): 2144 - Large-Scale Visualization
years of experience, and time spent with some of the Using A GPU Cluster (Wednesday, Sept 22, 16:00)
industry’s best known companies: AMD, Cadence and
Broadcom. Prior to NextIO, K.C. was an Executive Officer Negrut, Dan
of Broadcom and the VP/GM of Broadcom’s RF and Assistant Professor (University of Wisconsin)
Advanced Mixed Signal Group. K.C. joined Broadcom Dan received his Mechanical Engineering Ph.D. from
through Broadcom’s acquisition of Pivotal Technologies, the University of Iowa in 1998. He spent six years in
where he was President and CEO. Before Pivotal, K.C. the software industry. In 2004, he served as an Adjunct
was Executive Vice President, Strategic Business Group Professor in the Department of Mathematics at the
and Corporate Strategy, at Cadence Design Systems. University of Michigan. He spent 2005 as a Mathematics
K.C. joined Cadence after many years at AMD where his and Computer Science Visiting Scientist at Argonne
last position was Vice President, Corporate Strategy and National Laboratory. Dan joined the University of
PC Systems. Wisconsin in 2006. His interests are in Computational
hh Session(s): 2247 - Reconfiguring a Science and he leads the Simulation-Based Engineering
Pool of GPUs on The Fly (Sponsored by Lab.
NextIO) (Tuesday, Sept 21, 16:00) hh Session(s): 2231 - Driving on Mars, Redux:
System Level Simulation of Dynamic
Narayanan, Babu Systems (Wednesday, Sept 22, 10:00)
Lab Manager (GE Global Research)
Lab Manager, Medical Image Analysis Lab, GE Research, Nessim, Maurice
JFWTC, Bangalore (Schlumberger)
hh Session(s): 2094 - Nearly Instantaneous hh Session(s): 2141 - Moving the Frontier of Oil
Reconstruction for MRIs (Tuesday, Sept 21, 14:00) and Gas Exploration and Production with
GPUs (Wednesday, Sept 22, 10:00)
Nash, Steve
Applied Engineer (NVIDIA) Nichols, David
Steve is an Applied Engineer in the Professional Research Director (Schlumberger)
Solutions Group at Nvidia, focussing on scalable Dave Nichols has worked in the seismic imaging
visualization solutions. industry for 25 years. Much of his work has focused
hh Session(s): 2010 - Implementing Stereoscopic on the intersection of new HPC technologies and new
3D in Your Applications (Pre-Conference imaging algorithms. After working in seismic data
Tutorial) (Monday, Sept 20, 16:00) processing and software development for 5 years, he

hh 2071 - Large Scale Visualization Soup returned to academia to study for a PhD at Stanford
(Wednesday, Sept 22, 11:00) University. He has since worked for Schlumberger in

a number of roles ranging from managing seismic
Naumov, Maxim research at the long term research labs to managing the
Software Engineer (NVIDIA) process for introducing new technologies.
Maxim Naumov’s expertise is in the area of parallel hh Session(s): 2142 - Complex Geophysical
numerical linear algebra. In particular, he has worked Imaging Algorithms Enabled by GPU
on parallel iterative linear systems and eigenvalue technology (Wednesday, Sept 22, 14:00)
solvers. He received his Ph.D. in Computer Science
(with specialization in Computational Science and Nicoletti, Bruno
Engineering) in 2009 and his B.Sc. in Computer Science CTO (The Foundry)
and Mathematics in 2003, all from Purdue University Bruno Nicoletti has worked in visual effects since
– West Lafayette. He currently works in NVIDIA CUDA graduating with a degree in Computer Science and
Platform team developing parallel numerical algorithms Mathematics from Sydney University in 1987. He
for Graphics Processing Units (GPUs). He has previously has worked in TV and Film production at companies
worked in the Intel Corporation Microprocessor such as Rushes, The Computer Film Company (now
Technology Lab and Computational Software Lab, and Framestore) and Animal Logic, as well as developing
received a 2008-09 Intel Foundation Ph.D. Fellowship. software commercially at Animal Logic, Softimage and
hh Session(s): 2070 - CUSPARSE Library: A Discreet Logic (now both part of Autodesk). He has
Set of Basic Linear Algebra Subroutines for developed 2D image processing software as well as 3D
animation, rendering and modelling tools. In 1996 he

Sparse Matrices (Thursday, Sept 23, 11:00)

founded The Foundry to develop visual effects plug-ins
Navratil, Paul and oversaw its initial growth. The Foundry has since
Visualization Scientist (Texas Advanced Computing grown and develops a range of applications for visual
Center) effects. Now the company’s CTO, he acts as senior
engineer for the company and is overseeing the effort to
Paul is a Visualization Scientist in the Data and move The Foundry’s software to a new image processing
Information Analysis division of the Texas Advanced framework that can exploit CPUs and GPUs to yield
Computing Center (TACC) at the University of Texas dramatic speed improvements.
at Austin. His research interests include efficient
algorithms for large-scale parallel visualization and
hh Session(s): 2125 - Developing GPU hh Session(s): 2075 - GPU-Accelerated Video
Enabled Visual Effects For Film And Encoding (Thursday, Sept 23, 11:00)
Video (Wednesday, Sept 22, 14:00)
Oliker, Aaron
Nienhaus, Marc Managing Partner/Director 3D Technology (BioDigital)
Sen. Graphics Software Engineer (mental images GmbH) Aaron is the 3D Technology Director and Managing
Marc Nienhaus is the lead engineer of mental images’ Partner at BioDigital Systems. His efforts have led the
subsurface data visualization solution. He received development of new simulation products that are currently
a master’s in mathematics with a minor in computer implemented at virtual surgery centers for companies
science from the University of Muenster and a PhD in and institutions around the country. Aaron’s past work
computer science from the Hasso Plattner Institute at with the Smile Train organization led to revolutionary
the University of Potsdam. In 2006, Marc worked new ways to empower physicians in developing nations
as a post-doc at Northwestern University and through 3D virtual training. Aaron is a research assistant
led a research project at the University of Potsdam professor of Educational Informatics at New York
before joining mental images’ R&D department in University School of Medicine.
2007. His research interests include scalable rendering hh Session(s): 2146 - Virtual Surgery
and visualization techniques, GPU-based rendering (Wednesday, Sept 22, 11:00)
and computing, and photorealistic graphics. He has
published various papers on GPU-based real-time Ordureau, Sylvain
rendering, non-photorealistic rendering, and depiction CEO (UsefulProgress)
techniques for symbolizing dynamics.
Sylvain Ordureau’s academic and professional career
hh Session(s): 2014 - Scalable Subsurface has enabled him to acquire skills in four key areas:
Data Visualization Framework scientific research, computing, communications and
(Wednesday, Sept 22, 17:00) finance. His business consists of creating models
that test scientific hypotheses for living or inert matter.
Obenschain, Keith With his experience as a partner in a communications
Computer Scientist (Naval Research Lab Code 6440) group, he has mastered the management and
Mr. Keith Obenschain graduated with a BS in Computer financing of projects, primarily in the service of outside
Science from the University of Illinois at Urbana- companies and, since February 2003, at UsefulProgress.
Champaign and has been employed since graduation hh Session(s): 4010 - Emerging Companies: CEO
at the Laboratory for Computational Physics and Fluid on Stage featuring NaturalMotion Ltd, OptiTex,
Dynamics at NRL. Since 2002, Mr. Obenschain has been and Useful Progress (Thursday, Sept 23, 15:00)
the lead software engineer for NRL’s CT-Analyst project,
responsible for the overall software engineering effort Palamadai, Ekanathan
on CT-Analyst, including the architecture, performance Research & Development Engineer (ANSYS)
enhancements, and the actual implementation.
Working on R&D in Nexxim Circuit Simulator
hh Session(s): 2234 - Unstructured Finite
Volume Code on a Cluster with Multiple GPUs hh Session(s): 2066 - Accelerating System
per Node (Wednesday, Sept 22, 15:00) Level Signal Integrity Simulation
(Thursday, Sept 23, 16:30)
O’Brien, Kevan
(NVIDIA) Pande, Vijay
Director, Folding@home Distributed Computing Project
Digital filmmaking is on the cusp of another revelation, (Stanford University)
revolutions where Kevan has either helped be a catalyst

or a torch bearer burning the book’s old rules. In his Prof. Pande is currently an Associate Professor of
23-year history in filmmaking, computer technology Chemistry and (by courtesy) of Structural Biology and of

has evolved at an alarming rate enabling consumers Computer Science at Stanford University. Prof. Pande
and professionals to harness the power of desktop received a BA in Physics from Princeton University
workstations. With the advent of CUDA’s mark in the in 1992 and PhD in physics from MIT in 1995. Prof.
HD and film making, the next logical step is Stereo 3D Pande’s current research centers on the development
on the desktop and Kevan’s unique vision and insight and application of novel grid computing simulation
will bring relevance to any audience as he is still a techniques to address problems in chemical biology. In
commissioned filmmaker in his own right. Kevan particular, he has pioneered novel distributed computing
currently works in the Quadro group at NVIDIA hoping methodology to break fundamental barriers in the
the IRS won’t catch up with him too quickly as he’s spent simulation of kinetics and thermodynamics of proteins
most of his salary on large cups of Starbucks coffee. and nucleic acids.
hh Session(s): 2279 - Working Man’s Guide to 3D hh Session(s): 2007 - Folding@home:
Video Editing (Thursday, Sept 23, 16:00) Petaflops on the Cheap Today; Exaflops
Soon? (Thursday, Sept 23, 11:00)
hh 2222 - Working Man’s Guide to 3D Video
Editing (Tuesday, Sept 21, 14:00)
Obukhov, Anton
Developer Technology Engineer (NVIDIA)
Anton Obukhov has been a Developer Technology Engineer at
NVIDIA Corporation since 2008. His fields of interest
include computer vision, video encoding, and multimedia
processing. He graduated from Moscow State University
in 2008 with a Master’s Degree in Computer Science
from the Computational Mathematics and Cybernetics
Department in Russia. Before joining NVIDIA, he
conducted research and development in the Graphics
and Multimedia Lab at Moscow State University while
also working at YUVsoft Corporation.

Pappas, Jack
Co-founder, CEO (TidePowerd)
Mathematician, entrepreneur, “yankee”. Put another
way: I grew up in Princeton, NJ, then attended
the University of Alabama to study Mathematics.
Coincidentally, this is also where I met Nick,
TidePowerd’s co-founder, CFO, and resident
Frenchman. Born from my love of math and hatred
of C programming, TidePowerd was accepted as
the first-ever participant in Red Gate Software’s
“Springboard” startup incubator. Since then, we’ve
built our first tool, GPU.NET, which makes GPU
computing easier than ever!
hh Session(s): 2294 - GPU.NET with has worked on both a creative R&D team as well as
TidePowerd (Wednesday, Sept 22, 17:00) shipped several game titles at EA. Before that he worked
as a consultant while pursuing his M.Sc. in real-time
Patel, Sandeep GPU volume rendering. Eric enjoys developing real-
Assistant Professor (University of Delaware) time algorithms in traditional rendering pipelines and
volume renderers. Some of his latest work includes
hh Session(s): 2035 - Simulations of Large two chapters in the upcoming book GPU Pro 2, on skin
Membrane Regions (Wednesday, Sept 22, 11:30) shading and pixel shader amortization.
Patney, Anjul hh Session(s): 2235 - Advanced Medical
Graduate Student (University of California, Davis) Volume Rendering and Segmentation on
the GPU (Tuesday, Sept 21, 15:00)
Anjul is a third year PhD student in the Department
of Electrical and Computer Engineering at University Peterfreund, Natan
of California, Davis. He works under the guidance of CTO (Playcast Media Systems)
Prof. John Owens in the area of graphics and computer
architecture. Anjul is interested in pursuing hardware Dr. Natan Peterfreund is a world-renowned expert with
and software challenges in the design of programmable more than 20 years of research experience in video
rendering architectures. Before UC Davis, he received and image processing technologies. Prior to founding
his B.Tech. in Electrical Engineering from Indian Playcast, Dr. Peterfreund was Chief Scientist, Video
Institute of Technology, Delhi in 2007. Technologies in the DSP Group [NASDAQ: DSPG].
While serving as a principal scientist in Harmonic
hh Session(s): 2162 - Real-time Reyes: [NASDAQ: HLIT], he was one of the authors of the
Programmable Rendering on Graphics H.264 compression standard. Dr. Peterfreund holds a
Processors (Wednesday, Sept 22, 17:00) D.Sc degree in EE from the Technion Israel Institute of
Technology.
Peddie, Jon
President (Jon Peddie Research) hh Session(s): 4004 - Emerging Companies:
CEO on Stage featuring Cooliris, empulse
Dr. Jon Peddie is one of the pioneers of the graphics GmbH, and Playcast Media Systems
industry. After the successful launch of several (Wednesday, Sept 22, 16:00)
graphics manufacturing companies, Peddie began JPA
in 1984 to provide comprehensive data, information Peters, Alan
and management expertise to the computer graphics CTO (Universal Robotics, Inc.)
industry. In 2001 Peddie left JPA and formed Jon Peddie
Research (JPR) to provide customer intimate consulting Richard Alan Peters II is an Associate Professor of
and market forecasting services. Peddie lectures at Electrical Engineering at Vanderbilt University and the
numerous conferences on topics pertaining to graphics Chief Technology Officer of Universal Robotics, Inc (UR).
technology and the emerging trends in digital media During the years 2000-2008 he was a visiting researcher
technology, is the author of several books on computer on the NASA-JSC Robonaut Project. For Robonaut, he
graphics, and was named one of the most influential developed a multimodal short-term memory system
analysts. He is frequently quoted in trade and business and a sensory-motor coordination (SMC) control system
publications, and contributes articles to numerous that learns tasks from teleoperation. This technology
publications. led to the development of UR’s adaptive control
systems that learn SMC to enable industrial robots to
hh Session(s): 4001 - Emerging Companies: work in uncertain, dynamic environments. He is a Phi
CEO on Stage featuring Elemental Beta Kappa graduate of Oberlin College (Ohio) where
Technologies, Inc., Geomerics, and he received an A. B. in Mathematics (May 1979). He

Milabra (Wednesday, Sept 22, 11:00) received an M.S. (1985) and a Ph.D. (1988) in Electrical
hh 4002 - Emerging Companies: CEO on Stage Engineering from the University of Arizona, where he

featuring Allegorithmic SAS, Bunkspeed, was a fellow of the American Electronics Association. He
and miGenius (Wednesday, Sept 22, 14:00) is the author of more than 70 scientific papers and holds
hh 4003 - Emerging Companies Summit Panel: GPUs four US patents. His research interests include sensory-
for Computer Vision (Wednesday, Sept 22, 15:00) guided robotics, computer vision, image processing,
embedded systems, and mathematics.
Pedersen, Chris hh Session(s): 2091 - The GPU in the Reactive Control
Market Development Manager (NVIDIA) of Industrial Robots (Thursday, Sept 23, 16:00)
Chris studied electrical engineering at BYU, but has
spent most of his career as an intrapreneur, helping Peters, Amanda
start new businesses in large companies. He began PhD Candidate (Harvard University)
his career at Hewlett Packard where he helped Amanda received a M.S. degree in Computer Science
start businesses in communications testing, video from Harvard University and a bachelor’s degree from
servers, consumer PCs, handheld devices and digital Duke University in both physics and computer science.
entertainment products and services. He’s also created She spent three years working at IBM on the Blue Gene
startup businesses and worked as an industry analyst supercomputers where her job responsibilities included
/ consultant. Today he works with ISVs to develop applications porting and optimizing as well as system
compelling mobile applications that run on NVIDIA performance analysis. She is currently pursuing a PhD
Tegra-powered devices. in Applied Physics at Harvard University with a primary
hh Session(s): 2019 - GPU-Accelerated Internet research focus on computational fluid dynamics.

Technologies & Trends (Tuesday, Sept 21, 14:00) hh Session(s): 2163 - Leveraging
GPUs for Evolutionary Game Theory
Penner, Eric (Wednesday, Sept 22, 10:00)
Research Associate (Hotchkiss Brain Institute, University
of Calgary, Canada) Peters, David
Eric Penner is a rendering engineer at Electronic Arts Founder and CEO (Universal Robotics)
and research associate at the Imaging Informatics lab David Peters has overseen operations at Universal
at the University of Calgary. For the last few years Eric Robotics since its inception, raising capital, supervising
engineering and leading marketing initiatives. Under Pien, Homer
David’s leadership, Universal has developed innovative Director of the Laboratory for Medical Imaging and
products that surpass current technologies while Computations (MGH / HMS)
building a business ecosystem, which includes channel Homer Pien is the Director of the Laboratory for
partner, Yaskawa, the leading producer of industrial Medical Imaging and Computations, in the Department
robots. In the entertainment industry as a producer of Radiology at Massachusetts General Hospital and
for 17 years, David facilitated the development and Harvard Medical School. Dr. Pien’s research focuses on
production of feature films, managing complex financing the use of computations to enable new clinical imaging
and intellectual property affairs. David has a history of applications, for cardiovascular disease, trauma, and
completing complex projects within budget, resulting cancer.
in profits to investors, and garnering status as a
Completion Bond approved executive. hh Session(s): 2282 - GPU-Enabled Biomedical
Imaging (Wednesday, Sept 22, 17:00)
hh Session(s): 4008 - Emerging Companies:
CEO on Stage featuring ICD and Universal Pierce, Dan’l
Robotics (Thursday, Sept 23, 11:00) Partner (Access Analytics Int’l, LLC)
Pfister, Hanspeter Dan’l Pierce is a principal partner of Access Analytics
Professor (Harvard University) Int’l, LLC. Dan’l has also served as Director of Business
and Market Development for Cray, WW Lead HPC
Hanspeter Pfister is Gordon McKay Professor of the Scientist for Intel, Senior Vice President at Washington
Practice in the School of Engineering and Applied Mutual, and in technical and management roles at The
Sciences at Harvard University. His research lies at the Boeing Company. Dan’l received his PhD in Applied
intersection of visualization, computer graphics, and Mathematics from North Carolina State University, and
computer vision. Dr. Pfister has a Ph.D. in Computer his MS and BS degrees in Mathematics and Computer
Science from the State University of New York at Stony Science from Northern Illinois University.
Brook and an M.S. in Electrical Engineering from
the Swiss Federal Institute of Technology, Zurich, hh Session(s): 2213 - BCSLIB-GPU:
Switzerland. Significant Performance Gains for
CAE (Thursday, Sept 23, 15:00)
hh Session(s): 2262 - CUDA Centers of Excellence
Super-Session I (Tuesday, Sept 21, 14:00) Pimentel, Ken
hh 2281 - Domain-Specific Languages Director, Media & Entertainment (Autodesk)
(Wednesday, Sept 22, 15:00) Ken Pimentel is responsible for product strategy for
3ds Max, Showcase and other solutions in Autodesk’s
Phillips, Everett Media and Entertainment Division. Prior to Autodesk,
Applied Engineer - GPU Computing (NVIDIA) Ken worked at Engineering Animation and helped major
Everett Phillips is an Engineer in the Tesla Performance Group at manufacturers incorporate advanced visualization
NVIDIA where he works in the area of GPU Computing, techniques to reduce the need for physical prototypes.
HPC, and Clusters. He holds an M.S. in Mechanical Before EAI, Ken helped start Sense8, one of the first VR
and Aeronautical Engineering from the University of companies. Ken holds a BS EE degree from UC Davis
California, Davis. and is coauthor of “Virtual Reality: Through the New
hh Session(s): 2057 - CUDA-Accelerated LINPACK Looking Glass”.
on Clusters (Tuesday, Sept 21, 14:00) hh Session(s): 2165 - Rendering Revolution
(Tuesday, Sept 21, 11:00)
Phillips, James

Senior Research Programmer (University of Illinois) Pinskiy, Dmitriy
James Phillips is a Senior Research Programmer in the Sr Software Engineer (Walt Disney Animation Studios)

Theoretical and Computational Biophysics Group at the Senior Software engineer at Walt Disney Animation
Beckman Institute for Advanced Science and Technology Studios; designed and developed CG animation tools
at the University of Illinois at Urbana-Champaign. He for feature animation movies Chicken Little, Meet the
has a Ph.D. in Physics from the University of Illinois. Robinsons, Bolt, Tangled. Prior to that, working at
Since 1999, James has been the lead developer of the Alias|Wavefront (now Autodesk) on Maya, an award
highly scalable parallel molecular dynamics program winning software, implemented various animation
NAMD, for which he received a Gordon Bell Award in deformers as well as modeling / sculpting tools.
2002. His research interests include improving the Published a number of computer graphics research
performance and accuracy of biomolecular simulations papers (as recent as Eurographics 2010) and presented
through parallelization, optimization, hardware SIGGRAPH talks.
acceleration, better algorithms, and new methods. hh Session(s): 2284 - GPU implementation
hh Session(s): 2054 - NAMD, CUDA, and Clusters: of Collision-Based Deformation
Taking GPU Molecular Dynamics Beyond (Wednesday, Sept 22, 17:30)
the Desktop (Thursday, Sept 23, 14:00)
Pinto, Nicolas
Phung, Huynh PhD Student (MIT)
Research Engineer (A*STAR Institute of High Nicolas Pinto is a PhD Student in Computational
Performance Computing) Neuroscience at MIT. He is currently a member of the
Huynh Phung received a Ph.D. degree in Computer DiCarlo Lab at MIT, the Sinha Lab for Vision Research at
Science from School of Computing, National University MIT, and the Cox Visual Neuroscience Group at Harvard/
of Singapore. Currently, Huynh is working as a research Rowland. His research interests lie at the intersection of
engineer at A*STAR Institute of High Performance Brain and Computer Sciences.
Computing, Singapore. hh Session(s): 2204 - Bridging GPU Computing
hh Session(s): 2109 - Migration of a Complete and Neuroscience to Build Large-
3D Poisson Solver from Legacy Fortran Scale Face Recognition on Facebook.
to CUDA (Wednesday, Sept 22, 10:30) (Wednesday, Sept 22, 14:00)
hh 2176 - Easy GPU Meta-programming: A Case product manager and software engineer. He holds a
Study in Biologically-Inspired Computer BA in Computer Science from Willamette University
Vision (Thursday, Sept 23, 10:00) and completed the Japan Studies Program at Tokyo
International University. Outside of work, Will learns
Pissoort, Davy something new every day, usually from his two kids. He
Professor (KHBO-FMEC) enjoys hiking, camping, swimming, spending time with
Davy Pissoort was born in 1978. He received the M.S. his wonderful wife, and playing The Game.
and Ph.D. degrees in electrical engineering from hh Session(s): 2004 - Languages, APIs and
Ghent University, Ghent, Belgium, in 2001 and 2005, Development Tools for GPU Computing (Pre-
respectively. From October 2005 to October 2006, he was Conference Tutorial) (Monday, Sept 20, 13:00)
a Postdoctoral Researcher with the Fund for Scientific
Research-Flanders (Belgium) (FWO-Vlaanderen) at Rasmusson, Allan
the Department of Information Technology, Ghent PhD Candidate (University of Aarhus (NVIDIA intern))
University. In November 2006, he joined the Eesof-EDA Works within the field of Quantitative Tissue Analysis,
Department, Agilent Technologies, Ghent, Belgium, as primarily with optimizing the process of counting and
a Research Engineer. Since August 2009, he has been with the measuring cells in microscopic images of histological
KHBO, Belgium where he is also head of the Flanders’ tissue sections using 3D visualization and medical
Mechatronics Engineering Centre. His current research imaging computations.
interests include the development of fast and efficient
electromagnetic simulation methods, electromagnetic hh Session(s): 2021 - Efficient Volume Segmentation
compatibility, as well as the analysis and testing of on the GPU (Wednesday, Sept 22, 17:00)
the mechanical and thermal reliability of electronic
modules. Raue, Kristian
CEO and Founder (Jedox Business Intelligence)
hh Session(s): 2080 - Tackling Multi-Gigabit
Design Challenges with a Practical Virtual 2002 - today: CEO & Founder, Jedox Business
EMI/ESD Lab (Wednesday, Sept 22, 15:00) Intelligence
1991 - 2000: CEO & Founder, IntelliCube AG
Prevrhal, Sven 1989 - 1991: Management Consultant
Staff Research Scientist (Philips) 1982 - 1989: Technical University of Darmstadt (double
major in engineering and business administration)
PhD in Physics from Technical University Vienna, Austria
1997: work on medical imaging of Osteoporosis hh Session(s): 4005 - Emerging Companies:
Faculty Member University of California, San Francisco, CEO on Stage featuring Jedox Business
until 2009: work on Computed Tomography image Intelligence, Rocketick, and Softkinetic
reconstruction. Now with Philips Medical Imaging, San (Wednesday, Sept 22, 17:00)
Jose: work on Positron Emission Tomography image
reconstruction. Reid, Ian
Chief Commercial Officer (NAG)
hh Session(s): 2211 - Modern Architecture for
Massively Parallel Medical Tomographic As Chief Commercial Officer for the NAG Group, Ian
Image Reconstruction on a GPU Cluster has responsibility for driving all aspects of commercial
(Wednesday, Sept 22, 15:00) strategy. Ian has been with NAG for over 20 years and
has held various technical and commercial positions
Price, Daniel within the company during this time. Most recently he
Engineer (EM Photonics, Inc.) was Vice President for Business Development and he
continues to lead a worldwide team with responsibility
Dan Price is a member of the Accelerated Computing
for developing strategic partnerships with software and

Solutions group at EM Photonics. After receiving an hardware organisations.
MSEE from the University of Delaware, he joined EM
Photonics to work on accelerating computationally hh Session(s): 2063 - Banking on Monte Carlo…

intense problems using commodity hardware platforms. and Beyond (Thursday, Sept 23, 15:30)
His research has included the implementation
of computational electromagnetic algorithms on Reil, Torsten
GPUs, image processing algorithms for atmospheric CEO (NaturalMotion Ltd)
compensation, and most recently dense linear algebra Torsten Reil is co-founder and CEO of NaturalMotion.
solvers using CUDA. He holds a BA in Biology from Oxford University and an
hh Session(s): 2154 - The Impact of Data Movement MSc in Evolutionary and Adaptive Systems from Sussex
on GPU Performance (Wednesday, Sept 22, 16:00) University. Prior to founding NaturalMotion, Torsten was
researching for a PhD in Complex Systems at Oxford
Pryor, Gallagher University’s Zoology department, from which he spun off
VP of Research (AccelerEyes) NaturalMotion. Torsten has been named amongst MIT’s
TR100 global top innovators, Next-Gen’s 25 People in
Gallagher Pryor is VP of Research at AccelerEyes and the Games Industry, and Develop magazine’s 25 Game
a cofounder. He is the inventor of GPU computing in Changers.
MATLAB. He holds degrees in Computer Science from
Georgia Institute of Technology. hh Session(s): 4010 - Emerging Companies: CEO on
Stage featuring NaturalMotion Ltd, OptiTex, and
hh Session(s): 2268 - Think Data-Parallel! Building Useful Progress (Thursday, Sept 23, 15:00)
Data-Parallel Code with M (Tuesday, Sept 21, 15:30)
Reyna, Nabor
Ramey, Will Graduate Student (Rice University)
Sr. Product Manager, GPU Computing (NVIDIA)
Nabor Reyna Jr. is a third year graduate student in the
As NVIDIA’s Senior Product Manager for GPU Computational and Applied Mathematics Department
Computing, Will helps define and promote platforms, at Rice University. Reyna is currently finishing up
libraries and developer tools for the CUDA architecture. his Masters work in Compressive Sensing (CS). His
Prior to joining NVIDIA in 2003, he managed an interests consist of high performance computing,
independent game studio and developed advanced image processing as well as face recognition. Reyna is
technology for the entertainment industry as a also a member in organizations such as ACM, SIAM and
SACNAS.
hh Session(s): 2300 - High-Performance Compressive Rollin, Philippe
Sensing using Jacket (Wednesday, Sept 22, 10:30) Applied Engineer (NVIDIA)
Philippe Rollin is currently a Graphics Application
Richardson, Alan Engineer in the Professional Solution Group at NVIDIA,
Graduate Student (M.I.T) where he is investigating new tools and technologies to
BA in Theoretical Physics from Trinity College Dublin, bring the best out of their products. In the past, Philippe
MSc in High Performance Computing from EPCC at the was the Technical Lead in the Developer Tools group
University of Edinburgh. Currently pursuing a PhD in at NVIDIA, working on a full featured realtime shader
Geophysics at MIT. authoring and debugging environment FX Composer
2.x. Philippe graduated with an MS in Information
hh Session(s): 2167 - Designing a Geoscience Technology from EPITECH, Paris.
Accelerator Library Accessible from High Level
Languages (Wednesday, Sept 22, 17:00) hh Session(s): 2227 - OpenGL 4.0 Tessellation for
Professional Applications (Tuesday, Sept 21, 15:00)
Rios, Joseph
Research Aerospace Engineer (NASA) Rossbach, Christopher
Researcher (Microsoft Research)
Joseph is a researcher at NASA Ames Research Center.
His work focuses on computational issues associated Chris Rossbach is a researcher at Microsoft Research
with Traffic Flow Management in the National Airspace Silicon Valley. He received his Ph.D. in Computer
System. He completed his B.A. in Mathematics from Science from the University of Texas at Austin, and his
UC Santa Cruz before teaching high school in the Peace B.S. in Computer Systems Engineering from Stanford
Corps and in California. Upon returning to school, he University. His research focuses on operating systems
received his M.S. in computer science from Cal State and architecture, and emphasizes the development
Hayward and his Ph.D. in Computer Engineering from of better tools for managing and taking advantage of
UC Santa Cruz. concurrency.
hh Session(s): 2214 - Faster Simulations of the hh Session(s): 2124 - Operating System Abstractions
National Airspace System (Tuesday, Sept 21, 11:00) for GPU Programming (Thursday, Sept 23, 10:00)

Roberts, Mike Rubin, Eri

Research Assistant (Hotchkiss Brain Institute, University Senior CUDA R&D developer (OptiTex)
of Calgary, Canada) Eri Rubin is a Senior CUDA R&D Developer. From
Mike Roberts is currently pursuing his MSc in Computer 2006 to present, he is the senior R&D Developer and
Science at the University of Calgary in Canada. Mike is lead developer on 3 products: OptiTex 2D/3D, CAD/
also researching interactive level set segmentation of CAM, Fashion & Textile. He is also the head of GPGPU
large medical data sets. Recently, he published a paper projects, porting an implicit cloth solver engine to CUDA;
on his research at the ACM SIGGRAPH/Eurographics 3D engine architecture work; developing CG Tools for
Conference on High Performance Graphics 2010. He also Clothes Design within a 3D interface; and developing
did an internship at NVIDIA during the summer of 2009 Maya, MentalRay projects and plugins.
and helped to develop the world’s first IDE-integrated hh Session(s): 2246 - The challenges of integrating
native GPU debugger entitled NVIDIA Parallel Nsight. CUDA engines into an existing package, yet not
hh Session(s): 2235 - Advanced Medical sinking the boat (Wednesday, Sept 22, 14:00)
Volume Rendering and Segmentation on
the GPU (Tuesday, Sept 21, 15:00) Ruge, Thomas

Software Manager (NVIDIA)
hh 2236 - A Work-Efficient GPU Algorithm for Level
Set Segmentation (Thursday, Sept 23, 09:00) hh Session(s): 2024 - NVIDIA Acceleration

Engines Overview (Pre-Conference
Robison, Austin Tutorial) (Monday, Sept 20, 13:00)
Lead Developer, OptiX integration (NVIDIA)
Austin Robison is a Research Scientist at NVIDIA Sakharnykh, Nikolai
working as part of the OptiX team on GPU ray tracing. Developer Technology Engineer (NVIDIA)
His research interests include high performance Nikolai Sakharnykh is a developer technology engineer
ray tracing, physically-based rendering and hybrid at NVIDIA. He has worked with game developers
rendering. Austin holds a B.S. in Computer Science providing support for graphics technology content.
from the University of Chicago and an M.S. in Computer Recently he focused on GPU compute and CUDA.
Science from the University of Utah. Currently he is working on CFD-related projects and
hh Session(s): 2250 - GPU Ray Tracing Exposed: supporting CUDA customers. His interests include
Under the Hood of the NVIDIA OptiX Ray computational fluid dynamics, sparse matrix solvers
Tracing Engine (Tuesday, Sept 21, 17:00) and visualization techniques. Nikolai graduated with
honours from Moscow State University, the department
Rogan, Aaron of Computational Mathematics and Cybernetics as a
Research Scientist and System Administrator (Neva Ridge specialist in applied mathematics and informatics.
Technologies) hh Session(s): 2015 - Efficient Tridiagonal
I currently work for Neva Ridge Technologies where Solvers for ADI methods and Fluid
I specialize in radar image processing. Over the past Simulation (Tuesday, Sept 21, 14:00)
year and a half I have transitioned legacy code to run on a
GPU which has resulted in performance improvements Salian, Satish
anywhere from 20x to 400x depending on the algorithm. Manager CUDA Debugger Tools (NVIDIA)
My interests are in GPGPU programming, image Satish Salian is the Manager for CUDA debugger tools
processing, algorithm development and numerical at NVIDIA, where he is responsible for the strategy,
simulations. direction and development of CUDA tools and support for
hh Session(s): 2003 - Using CUDA to Accelerate Radar CUDA developers. He joined NVIDIA in 2001 and before
Image Processing (Thursday, Sept 23, 15:00) moving to CUDA was responsible for the development of
NVIDIA’s Graphics tools and NVAPI SDK. Satish received University in 1974. He is Swanlund Professor of
his Bachelors degree in Computer Engineering from Physics and is also affiliated with the Department of
University of Pune, India. Chemistry as well as with the Center for Biophysics and
hh Session(s): 2002 - CUDA Debugging on Linux and Computational Biology. Professor Schulten is a full-time
MacOS with cuda-gdb (Thursday, Sept 23, 10:00) faculty member in the Beckman Institute and directs
the Theoretical and Computational Biophysics Group at
Sanders, Jason the University of Illinois Urbana-Champaign, IL. Honors
Senior Software Engineer (NVIDIA) and awards: Award in Computational Biology 2008;
Humboldt Award of the German Humboldt Foundation
Jason Sanders is a senior software engineer in the (2004); University of Illinois Scholar (1996); Fellow of the
CUDA Platform Group at NVIDIA. While at NVIDIA, he American Physical Society (1993); Nernst Prize of the
helped develop early releases of CUDA system software Physical Chemistry Society of Germany (1981).
and co-wrote the book CUDA by Example. Jason received
his M.S. in computer science from the University of hh Session(s): 1002 – Day 2 Keynote
California Berkeley where he published research in with Dr. Klaus Schulten, University of
GPU computing, and he holds a B.S.E. in electrical Illinois at Urbana-Champaign
engineering from Princeton University. Prior to joining
NVIDIA, he held positions at ATI Technologies, Sheehan, Andrew T.
Apple, and Novell. When he’s not coding or writing Managing Director (Sutter Hill Ventures)
books, Jason is typically in the gym, playing soccer, or Andy Sheehan focuses his investments on internet
shooting photos. software, services and digital media companies.
hh Session(s): 2131 - Introduction to CUDA C (Pre- Andy currently is a director of Buzz Media, Inc., Global
Conference Tutorial) (Monday, Sept 20, 14:30) Liquid Markets, LLC, Grain Communications Group, Inc.,
and Yext, Inc. His prior directorships have included, @
Sarzo, Rudy Road (acquired by Trimble), AllBusiness.com, BakBone
Principal (SMI) Software, Datran Media, Intermix Media and Myspace
(acquired by News Corp.) and ReachLocal. Andy joined
Rudy has been a professional recording and performing the firm in 2007 from VantagePoint Venture Partners,
artist worldwide for over 20 years. As a member of Ozzy where he was a managing director. Previously, he worked
Osbourne’s band, from March 1981 to September 1982, at Alex. Brown & Sons and ABS Capital Partners. Andy
Rudy toured the world in support of the “Blizzard of Ozz” received his BA from Dartmouth College with a degree
and “Diary Of a Madman” records. His bass playing can in English. He earned his MBA in 1985 from the Wharton
be heard on Ozzy’s multimillion selling CD “Tribute” and School.
“Speak of the Devil” CD and DVD.
hh Session(s): 4009 - Emerging Companies
hh Session(s): 2279 - Working Man’s Guide to 3D Summit Panel: The “New Normal” For Building
Video Editing (Thursday, Sept 23, 16:00) Emerging Companies Based On Disruptive
Technologies (Thursday, Sept 23, 14:00)
Scherl, Holger
Computer Scientist (Siemens AG) Shoemaker, Austin
Holger is a computer scientist in the field of medical CTO and Co-Founder (Cooliris)
image processing. He pursues his PhD studies at Austin is CTO and co-founder of Cooliris. Austin was
the University of Erlangen-Nuremberg, Germany, a master’s degree student in Computer Science at
specializing in hardware-accelerated cone-beam CT Stanford University specializing in artificial intelligence,
reconstruction. Since 2007 he has been a system architect and stopped out to lead technology and product
in R&D at Siemens Healthcare and focuses on the development for the Cooliris platform. Austin is fluent in
development and implementation of cone-beam CT Spanish and Mandarin Chinese. Prior to his involvement
reconstruction algorithms on graphics hardware. with Cooliris, Austin worked at Apple Computer for
hh Session(s): 2096 - High-Speed CT Reconstruction seven years, contributing to product development efforts
in Medical Diagnosis & Industrial NDT in several divisions.
Applications (Tuesday, Sept 21, 11:00) hh Session(s): 4004 - Emerging Companies:
CEO on Stage featuring Cooliris, empulse
Schneider, Jens GmbH, and Playcast Media Systems
Postdoctoral Fellow (King Abdullah University of Science (Wednesday, Sept 22, 16:00)
and Technology)
Dr. Jens Schneider received his MA from RWTH Aachen, Silva, Claudio
Germany in 2003 and his doctorate from Technische Professor (University of Utah)
Universitaet Muenchen in 2009. He is currently a Claudio T. Silva is a professor of computer science
postdoctorate fellow at the King Abdullah University of and a faculty member of the Scientific Computing
Science and Technology (KAUST) where he works in the and Imaging (SCI) Institute at the University of Utah.
Geometric Modeling and Scientific Visualization Center. He coauthored more than 150 technical papers and
His research interests include GPU-based algorithms, eight U.S. patents, primarily in visualization, geometric
Computer Graphics, Scientific Visualization, and GPU- computing, and related areas. He received IBM Faculty
friendly Data Compression. Awards in 2005, 2006, and 2007, and best paper awards
hh Session(s): 2139 - Interactive Histology at IEEE Visualization 2007 and IEEE Shape Modeling
of Large-Scale Biomedical Image Stacks International 2008. He is a member of the ACM,
(Wednesday, Sept 22, 14:00) Eurographics, and IEEE.
hh Session(s): 2248 - Parallel Processing on GPUs at
Schuh, Andrew the University of Utah (Wednesday, Sept 22, 14:00)
Project Manager (University of Illinois)
hh Session(s): 2249 - New Programming Tools Sinclair, Matt
GPU Computing (Wednesday, Sept 22, 10:00) Research Assistant (UW-Madison)
Matt Sinclair is a Ph.D. student at the University of
Schulten, Klaus Wisconsin-Madison in the Electrical and Computer
Professor (University of Illinois, Urbana-Champaign) Engineering Department. He received his B.S. degrees
Klaus Schulten received his Ph.D. from Harvard
in Computer Engineering and Computer Science with HPC machines.
Honors from the University of Wisconsin-Madison in hh Session(s): 2017 - Lessons Learned
2009. His interests lie in processor microarchitecture Deploying the World’s First GPU-Based
and high performance computing. Petaflop System (Tuesday, Sept 21, 15:00)
hh Session(s): 2044 - GRASSY: Leveraging
GPU Texture Units for Asteroseismic Data Spatz, Pierre
Analysis (Wednesday, Sept 22, 15:00) Head of Quantitative Research (Murex SAS)
Pierre joined Murex in 1989 and has a master’s degree
Snepvangers, Jeroen in computer science and applied mathematics from
President and CEO (RTT USA) ENSIMAG. After various leading positions in the Murex
Jeroen Snepvangers is President and CEO of RTT software development team, Pierre launched the Murex
USA, Inc., a subsidiary of RTT AG, a public company Analytics initiative in 2002.
trading at the Frankfurt Stock Exchange. RTT is the hh Session(s): 2032 - Practical Methods Beyond
largest real-time 3D computer graphics software and Monte Carlo in Finance (Thursday, Sept 23, 10:00)
CGI animation services provider to the Automotive and
Aircraft industries. The company serves its customers in Srinivasan, Savitha
industrial design and digital (3D CGI) marketing. Prior Corporate Venture Partner (IBM)
to joining RTT, Jeroen was a Management Consultant
at Urban Science Applications Inc., a technology- Savitha Srinivasan is a Corporate Venture Partner in
based management consultancy for the automotive IBM’s Venture Capital Group in Corporate Strategy
and financial industries, focusing on optimizing retail where she develops strategic relationships with venture
networks. He obtained his MBA at IMD, Switzerland capitalists and their portfolio companies to leverage
in 2002. He has a Bachelors Degree in Applied external innovation for mutual strategic advantage.
Mathematics from Warwick University, UK. He has an She has nearly 20 years of experience at IBM in
IB, International Baccalaureate from St. Clare’s College, leadership roles addressing the strategic priorities of
Oxford, UK IBM’s Services businesses. She leads the strategic
development of IBM’s Services venture ecosystem,
hh Session(s): 4007 - Emerging Companies: CEO with each of the Global Technology Services business
on Stage featuring Aqumin, RTT, and Scalable units – Strategic Outsourcing, Integrated Technology
Display Technologies (Thursday, Sept 23, 10:00) Services and Managed Business Process Services
by early identification of companies, fostering pilots,
Solano, Lizandro partnerships and contributing to M&A pipeline.
(Iowa State University)
hh Session(s): 4007 - Emerging Companies: CEO
Lizandro Solano-Quinde received his MSc. in Electrical on Stage featuring Aqumin, RTT, and Scalable
Engineering in 2006 at Iowa State University; currently Display Technologies (Thursday, Sept 23, 10:00)
he is a Ph.D. candidate in Computer Engineering at
Iowa State University under the advisement of Dr. hh 4008 - Emerging Companies: CEO on
Brett Bode and Dr. Arun Somani. He is affiliated with the Stage featuring ICD and Universal
Scalable Computing Laboratory at the Ames Laboratory, Robotics (Thursday, Sept 23, 11:00)
Department of Energy. His research interests are in
the fields of High Performance Computing, Computer Stam, Joe
Architecture and Fault Tolerance. Sr. Applications Engineer (NVIDIA)
hh Session(s): 2292 - Implementation of JOSEPH STAM is a Senior Applications Engineer for
High-Order Adaptive CFD Methods on NVIDIA Corporation. He has a focus on computer vision,
GPUs (Thursday, Sept 23, 10:30) video, and image processing applications of Graphics

Processors for both professional and embedded
Somani, Arun products. Prior to joining NVIDIA in 2007, he worked

Anson Marston Professor (Iowa State University) in the automotive industry for 12 years on research
and development of imaging hardware and computer
Arun K. Somani is currently Anson Marston vision algorithms for vehicle based vision products.
Distinguished Professor and Jerry R. Junkins Endowed Joe received a B.S. degree in Engineering Physics
Chair Professor of Electrical and Computer Engineering & Computer Science from Hope College in Holland,
at Iowa State University where he first served as David Michigan and an M.S. degree in Electrical Engineering
C. Nicholas Professor during 1997-2002. He earned his from Michigan State University. He is an inventor on
MSEE and PhD degrees in electrical engineering from 82 U.S. patents and several foreign patents, many of
McGill University, Montreal, Canada, in 1983 and which relate to computer vision software and imaging
1985, respectively. He worked as Scientific Officer for hardware technologies. Joe resides in Holland, Michigan
Govt. of India, New Delhi from 1974 to 1982 and as a with his wife and four children.
faculty member at the University of Washington, Seattle,
WA from 1985 to 1997 in electrical engineering and hh Session(s): 2215 - Extending OpenCV with
computer science and engineering departments where GPU Acceleration (Thursday, Sept 23, 10:00)
he was promoted to the level of Professor in September hh 4003 - Emerging Companies Summit Panel: GPUs
1995. for Computer Vision (Wednesday, Sept 22, 15:00)
hh Session(s): 2292 - Implementation of
High-Order Adaptive CFD Methods on Stephan, Philippe
GPUs (Thursday, Sept 23, 10:30) CTO (RMS)

Philippe Stephan is the Chief Technology Officer of

Southard, Dale RMS. Prior to RMS, he directed product development
Senior Solution Architect (NVIDIA) as the CTO of San Francisco based Moody’s KMV, the
As a Senior Solution Architect with NVIDIA, Dale award winning credit risk analytics vendor. Philippe
assists in the integration of GPUs in HPC and Cloud has also held senior management positions at Internet
computing environments. He was previously with startups and built derivatives risk systems for SocGen
Lawrence Livermore National Lab as the architect of and CA Lazard Financial Product Bank in London.
their visualization hardware solutions and part of the Philippe started his career as a key contributor to the
systems team that deployed and managed their large
development of the Eiffel programming language in the high performance computing. Previously, Robert was a
early 1990s. visiting assistant professor at Stanford University
hh Session(s): 2077 - Catastrophic Risk and until 2005, a postdoc at the caesar research
Management: Fast and Flexible with GPU center in Bonn. He received his doctorate in numerical
Analytics (Wednesday, Sept 22, 17:00) mathematics from the University of Duisburg in 2004.
hh Session(s): 2038 - The Best of Both Worlds:
Stich, Timo Flexible Data Structures for Heterogeneous
Developer Technology Engineer (NVIDIA) Computing (Wednesday, Sept 22, 14:00)
Timo Stich is a Developer Technology Engineer for
NVIDIA Corporation. His focus is on image processing Surati, Rajeev
and compute applications of Graphics Processors. Prior President (Scalable Display Technologies)
to joining NVIDIA, he was a member of the research Cofounder and President Scalable Display Technologies,
staff in Computer Graphics and Image Processing at founder and CTO Flash Communications sold to
the Max-Planck-Institute for Computer Science and Microsoft as a basis for their MSN Messenger services,
the Computer Graphics Lab of Brunswick University, founder and president photo.net world’s largest
Germany. He received a diploma amateur photographer community, sold to NameMedia.
degree in Computer Science from Mannheim University, Education: SB, SM, Ph.D. MIT in Electrical Engineering
Germany and a Ph.D. degree from Brunswick and Computer Science. DOE Computational Fellow 1995.
University, Germany. hh Session(s): 2134 - Ultra High Resolution
hh Session(s): 2087 - Fast High-Quality Panorama Displays and Interactive Eyepoint Using
Stitching (Thursday, Sept 23, 14:00) CUDA (Wednesday, Sept 22, 10:00)

Stier, Jochen Tai, Bill

Founder (Geist Software Labs) General Partner (Charles River Ventures)
Jochen Stier holds a Ph.D. in Computer Science from Bill Tai joined CRV in 2002 to lead the firm’s West
the University of Victoria. His research interests are real coast practice. He has funded companies as a venture
time 3D visualization and high performance computing. capitalist since 1991. 16 startups he has backed became
Prior to his Ph.D. studies, Jochen spent several years publicly listed companies, among them 8x8 Inc., Award
as a software engineer and architect in the industrial Software, Network Peripherals, Microtune, Terayon, and
automation and telecommunication industry. In 2007, Transmeta. His focus areas include disruptive enabling
Jochen founded Geist Software Labs to market OpenCL technology, digital media and web based services. His
Studio and provide solutions for high performance current portfolio includes Glyde - a next generation
computing and 3D visualization. commerce platform company, Maxthon - a web browser
hh Session(s): 2148 - Rapid Prototyping that has been downloaded over 430 million times,
and Visualization with OpenCL Studio Nantero – a carbon nanotube memory company, and
(Tuesday, Sept 21, 15:00) Scribd – a social publishing company with over 50M
monthly users, among others. He is a ‘hands-on’ early
Stone, Christopher stage VC, having served as Chairman of IPInfusion
Research Scientist (Intelligent Light) (acquired by Access Corp), founded as CEO iAsiaWorks
(IPO in 2000) and founded as President TRADEBEAM,
Dr. Stone received his PhD from the Georgia Institute of a leader in SaaS-based Global Trade Management
Technology’s School of Aerospace Engineering in 2003. (acquired by CDC Software). Prior to his VC career, he
His research areas include computational fluid dynamics positioned public offerings of Adaptec, Atmel, Cirrus
and high performance parallel computing. Logic, Dallas Semiconductor, Exar, and Zilog while

hh Session(s): 2110 - Acceleration of a establishing the semiconductor research practice at
Novel Rotorcraft Wake Simulation Alex. Brown and Sons.

(Thursday, Sept 23, 10:00) hh Session(s): 4010 - Emerging Companies: CEO
on Stage featuring NaturalMotion Ltd, OptiTex,
Stone, John and Useful Progress (Thursday, Sept 23, 15:00)
Senior Research Programmer (University of Illinois at
Urbana-Champaign) hh 4011 - Emerging Companies: CEO on Stage
featuring Cinnafilm, Inc., Perceptive Pixel, and
John Stone is a Senior Research Programmer in the Total Immersion (Thursday, Sept 23, 16:00)
Theoretical and Computational Biophysics Group
at the Beckman Institute for Advanced Science and Tal, Uri
Technology, and Associate Director of the NVIDIA CUDA CEO (Rocketick)
Center of Excellence at the University of Illinois. Mr.
Stone is the lead developer of VMD, a high performance Uri Tal has 12 years of experience in management,
molecular visualization tool used by researchers all over design and implementation of hardware acceleration
the world. His research interests include molecular technologies. Before founding Rocketick, Uri managed
visualization, GPU computing, parallel processing, ray a large R&D team that developed FPGA-based
tracing, haptics, and virtual environments. acceleration solutions in the intelligence corps of the
Israeli Defense Forces, and was a system architect at
hh Session(s): 2073 - High Performance Molecular Siliquent / Broadcom. B.Sc. (Summa Cum Laude) and
Simulation, Visualization, and Analysis M.Sc. in Electrical Engineering, Technion.
on GPUs (Wednesday, Sept 22, 16:00)
hh Session(s): 4005 - Emerging Companies:

Strzodka, Robert CEO on Stage featuring Jedox Business

Senior Researcher (Max Planck Institut Informatik) Intelligence, Rocketick, and Softkinetic
(Wednesday, Sept 22, 17:00)
Robert Strzodka has been the head of the research
group, Integrative Scientific Computing, at the Max Tarui, Kento
Planck Institut Informatik since 2007. His research Researcher (AquaCast Corporation)
focuses on efficient interactions of mathematical,
algorithmic and architectural aspects in heterogeneous Kento Tarui received his Masters of Engineering from
Tokyo Institute of Technology in 2005. He then earned his
Ph.D. in Engineering from Tokyo Institute of Technology
in 2008. Currently, he is a researcher at the AquaCast
Corporation. Challenge, a US-Government sponsored autonomous
hh Session(s): 2114 - Cascaded HOG on robot race that took place in 2005. Thrun also pioneered
GPU (Thursday, Sept 23, 16:00) the scientific field of probabilistic robotics, and he
co-invented Google Street View. In recognition of his
Taufer, Michela contributions, Thrun has been elected into the US
Assistant Professor (University of Delaware) National Academy of Engineering and the German
Academy of Sciences. He is an elected fellow of the
My research interests include new algorithms and AAAI, ECCAI, and WTN. Popular Science included
architectures for resource- and time-expensive Thrun in their “Brilliant Ten,” Forbes Magazine in their
applications in computational chemistry, physics, and “E-Gang” members, Scientific American in their list of
biology; effective migration of large scale simulations to 50 world technology and policy leaders, and Fortune
global computing systems based on public resources; selected him as one of the 50 smartest people in tech.
and performance analysis, -modeling and -optimization Wired Magazine awarded Thrun’s robot Stanley the top
of large scale simulations on heterogeneous, distributed spot in the most influential robots of all time. The robot
systems. is now part of a permanent exhibition in the Smithsonian
hh Session(s): 2034 - Reformulating Algorithms Museum of American History. Thrun has authored 11
for the GPU (Wednesday, Sept 22, 11:00) books and over 300 scientific articles.
hh 2035 - Simulations of Large Membrane hh Session(s): 1003 – Closing Keynote
Regions (Wednesday, Sept 22, 11:30) with Dr. Sebastian Thrun, Stanford

Taylor, John Toelke, Jonas

Science and Business Leader (CSIRO) Chief Computational Software Development (Ingrain)
Dr John Taylor is the Science and Business leader for Dr. Tölke’s work is focused on the design,
Computational and Simulation Sciences at CSIRO, implementation and verification of simulation engines
Australia. Dr Taylor has written more than 140 articles for multi-phase flow on Supercomputers and other
and books on computational and simulation science and high performance computing hardware. He holds a PhD
its application to such areas as climate change, global in engineering from Technische Universität München
biogeochemical cycles, air quality, water resources and and a “Venia Legendi” (Habilitation) in computational
environmental policy. His research has addressed local engineering in fluid mechanics from Technische
and global scientific and environmental policy issues. He Universität Braunschweig. Before joining Ingrain,
has extensive experience developing applications on high he provided consulting services in the field of multi-
performance computers. At CSIRO, Dr Taylor has led the phase simulations and worked as director for research
development and deployment of Australia’s and one of software engineering at Exa Corp.
the world’s first CPU+GPU clusters. hh Session(s): 2170 - Lattice Boltzmann Multi-
hh Session(s): 2301 - GPU Cluster Phase Simulations in Porous Media using
Computing: Accelerating Scientific GPUs (Wednesday, Sept 22, 15:00)
Discovery (Thursday, Sept 23, 09:00)
Tombroff, Michel
Thamm, Tom-Michael CEO (Softkinetic)
VP Products (mental images GmbH) Prior to his CEO role at Softkinetic Michel spent 17
Mr. Thamm is the Vice President for Products at mental years in the software industry in the start-up, pre-IPO
images and is responsible for the product management and public stages. He spent 8 years at TIBCO Software,
of all products, in particular of mental ray, mental mill, where his last position was VP Sales, and 7 years at
and RealityServer. In addition, he is managing and Chorus Systems (acquired by Sun Microsystems), where

coordinating the customer support in cooperation with his last role was Head of Engineering. He received a B.S.
Engineering. Mr. Thamm joined mental images in 1989. EE from University of Brussels and an MSc in Computer

He has led several key projects of mental images, such Science from University of California, Santa Barbara.
as the definition of the extended OBJ format and the hh Session(s): 4005 - Emerging Companies:
integration of mental ray into many of the major CAD CEO on Stage featuring Jedox Business
systems. He has studied Mathematics and subsequently Intelligence, Rocketick, and Softkinetic
developed free-form surface algorithms and various 3D (Wednesday, Sept 22, 17:00)
file formats.
hh Session(s): 2014 - Scalable Subsurface Tomov, Stan
Data Visualization Framework (University of Tennessee)
(Wednesday, Sept 22, 17:00) Stanimire (Stan) Tomov is a Research Scientist at ICL
and Adjunct Assistant Professor in EECS at UTK. He
Thomason, Lee received a Ph.D. in Mathematics from TAMU in 2002
Principal Scientist (Adobe Systems) and held positions at LLNL and BNL. Stan’s research
Lee Thomason is a principal scientist and Flash Player interests are in parallel algorithms, numerical analysis,
architect for Adobe Systems. He prototypes GPU and high-performance scientific computing. He co-leads
technology and leads the GPU development for the Flash the UTK’s CCOE on the development of MAGMA, a new
Player. generation of linear algebra libraries, extending the
hh Session(s): 2060 - GPUs in a Flash: Mapping sequential LAPACK-style algorithms, for highly parallel,
the Flash Animated Software Vector Rendering GPU and multicore heterogeneous architectures.
Model to the GPU (Tuesday, Sept 21, 17:00) hh Session(s): 2138 - Faster, Cheaper,
Better – Hybridization of Linear Algebra
Thrun, Sebastian for GPUs (Thursday, Sept 23, 09:00)
Professor / Distinguished Engineer (Stanford University hh 2263 - CUDA Centers of Excellence Super-
/ Google) Session II (Tuesday, Sept 21, 15:00)
Sebastian Thrun is a professor of computer science
and electrical engineering at Stanford, where he Townsend, Richard
directs the Stanford AI Lab. He is also a distinguished Assistant Professor (University of Wisconsin-Madison)
engineer at Google. Thrun’s team won the DARPA Grand Richard Townsend is a computational astrophysicist
at the University of Wisconsin-Madison, interested in Valich, Theo
the rotation, oscillations, magnetic fields and outflows President (Bright Side Network Inc)
of massive, luminous stars. His recent research has Theo Valich founded Bright Side Network and its
focused on investigating how GPU computing can be subsidiaries, developing products such as next-
brought to bear on the steep data analysis challenges generation automotive user interface, CPU/GPGPU
faced in Asteroseismology. Initial projects showing computational benchmark, proprietary web engine and
particular promise include GPU-accelerated period providing high-end 4K resolution videos in fully digital
searching in non-uniformly sampled data, and fast video production studio, utilizing latest GPU technology
spectrum interpolation by leveraging the untapped developments. Prior to founding Bright Side Network,
capabilities of GPU texturing units. Valich served as GPGPU technology analyst at JPR, CTO
hh Session(s): 2082 - CU-LSP: GPU-based at Provox, Senior Editor at TG Daily and Tom’s Hardware,
Spectral Analysis of Unevenly Sampled as well as Contributing Editor on The Inquirer.
Data (Wednesday, Sept 22, 10:00) hh Session(s): 2303 - Using Tegra to
Solve The Electric Car Power Dilemma
True, Thomas (Tuesday, Sept 21, 14:00)
Applied Engineer (NVIDIA)
hh 2304 - Harnessing the GPU to Accelerate
Tom is an Applied Engineer in NVIDIA’s Professional Automotive Development (Tuesday, Sept 21, 17:00)
Solutions Group where he focuses on the use of GPUs
in broadcast, video and film applications ranging from Vandermersch, Philippe
pre-visualization to post production and live to air. Prior Senior Software Engineer (NVIDIA)
to joining NVIDIA, Tom was an Applications Engineer at
SGI. Thomas has an M.S. degree in Computer Science Philippe Vandermersch joined the CUDA Platform
from the Graphic Lab at Brown University and a B.S. group in 2009, leading the development of the CUBLAS
Degree from the Rochester Institute of Technology. and CUSPARSE Libraries. Before that, Philippe was a
Senior Video architect, working on the NVIDIA Multi-
hh Session(s): 2159 - Programming the NVIDIA Standard Decoder solution (MSDEC). Prior to joining
Digital Video Pipeline with Direct3D (Pre- NVIDIA, Philippe has worked as an Embedded engineer
Conference Tutorial) (Monday, Sept 20, 14:30) at Equator Technologies and as a DSP engineer at
hh 2158 - Programming the NVIDIA Digital Siemens ICN. He is the inventor of two patents, and
Video Pipeline with OpenGL (Pre-Conference holds a Master’s degree from the Institut National des
Tutorial) (Monday, Sept 20, 13:00) Telecommunications in France.
hh 2161 - NVIDIA Quadro Digital Video Pipeline hh Session(s): 2216 - CUDA Libraries Open
Overview (Tuesday, Sept 21, 16:00) House (Wednesday, Sept 22, 11:00)

Tsingos, Nicolas Varah, Sean

Senior Staff Engineer (Dolby Laboratories) CEO (MotionDSP Inc.)
Nicolas Tsingos leads interactive audio research at Dolby Dr. Sean Varah is the CEO of MotionDSP, having founded
Laboratories. He is actively involved in the development the company in 2005. Previous to MotionDSP, he co-
of next-generation features for Axon, Dolby’s in-game founded Q Media Partners, and served as Director of
voice solution as well as spatial audio solutions for Consumer Technology Investments at Sony Music’s 550
consumer and cinema applications. Digital Media Ventures. He also sourced and led the
hh Session(s): 2042 - Interactive 3D Audio Series A investment in Keyhole Inc., which was acquired
Rendering Systems (Thursday, Sept 23, 11:00) by Google in 2004 and is now Google Earth. Dr. Varah
received a bachelor’s degree from Stanford University

Tzeng, Stanley and a doctorate from Columbia University.
Graduate Student (University of California, Davis) hh Session(s): 2027 - GPU-Based Image Processing

Stanley is a 3rd year PhD student at the University of in Military Applications (Thursday, Sept 23, 09:00)
California, Davis working with Professor John Owens.
Over the summer he is working at Microsoft Research Varhol, Peter
in Redmond with the Extreme Computing Group. His HPC Editor (Desktop Engineering Magazine)
research involves GPGPU algorithms, task parallel Peter Varhol is an industry veteran with over twenty
scheduling and alternative rendering pipelines on the years experience as a technology journalist, software
GPU. In his free time he loves to go explore different developer, product manager, and university professor.
places to eat and go on puzzle hunts. Currently he is HPC editor for Desktop Engineering
hh Session(s): 2162 - Real-time Reyes: magazine, and also principal at industry consulting
Programmable Rendering on Graphics firm Technology Strategy Research. He has graduate
Processors (Wednesday, Sept 22, 17:00) degrees in computer science, applied mathematics, and
psychology.
Uzzan, Bruno hh Session(s): 2130 - GPU Computing and a Revolution
CEO & Founder (Total Immersion) in Design Engineering (Tuesday, Sept 21, 11:00)
Bruno oversees operations and business development
for Total Immersion. He is principally responsible Varshney, Amitabh
for building the company’s client roster, including Professor (University of Maryland)
Renault, Peugeot, BMW, Disney, EADS, CBS, Thomson Amitabh Varshney is a Professor of Computer Science

and SGI Japan. Before establishing Total Immersion, at the University of Maryland at College Park where
Uzzan served as a consultant for Pierre Henri Scacchi he directs the NVIDIA CUDA Center of Excellence. His
and Associates (Price Waterhouse Group). He holds a interests include GPU-based heterogeneous parallel
master’s degree in management from the University of computing for computational biology, nano assembly,
Paris Dauphine. plasma physics, climate modeling and several other
hh Session(s): 4011 - Emerging Companies: CEO on applications. Varshney received a NSF CAREER Award in
Stage featuring Cinnafilm, Inc., Perceptive Pixel, 1995 and the IEEE Visualization Technical Achievement
and Total Immersion (Thursday, Sept 23, 16:00) Award in 2004. He is a Fellow of IEEE.
hh Session(s): 2263 - CUDA Centers of Excellence
Super-Session II (Tuesday, Sept 21, 15:00)
Vo, Huy
Venkataraman, Shalini Research Assistant (University of Utah)
Applied Engineer (NVIDIA) Huy T. Vo is a PhD student at the SCI Institute, University
Shalini Venkataraman is an applied engineer of Utah, under the supervision of Professor Claudio
with NVIDIA’s professional solutions group where T. Silva. His main research interests are in High
she focuses on using GPUs to solve graphics and Performance Computing and Visualization Systems. Huy
visualization problems in the medical and oil & gas is currently working on HyperFlow, a parallel streaming
communities. Prior to joining NVIDIA, she was a framework for large-scale visualization where data-
research staff member in scientific visualization at several flows can be executed efficiently on clusters of machines
institutions including the Center for Computation and with multiple CPUs and GPUs.
Technology at LSU and in Singapore, at the Institute hh Session(s): 2248 - Parallel Processing on GPUs at
of High-Performance Computing and the Center for the University of Utah (Wednesday, Sept 22, 14:00)
Information-Enhanced Medicine. Her interests include
scalable graphics and display environments, large Volkov, Vasily
volume visualization and higher bit depth rendering. PhD Student (UC Berkeley)
She earned her Master’s degree from the Electronic
Visualization Lab at the University of Illinois-Chicago and Vasily Volkov has contributed to substantial performance
B.Sc from the National University of Singapore. improvements in CUBLAS and CUFFT and received an
NVIDIA Graduate Fellowship in 2008. He is currently a
hh Session(s): 2009 - 4D Visualization and Ph.D. candidate at UC Berkeley.
Analysis of Flow (Tuesday, Sept 21, 17:00)
hh Session(s): 2238 - Better Performance at Lower
Vermes, Domokos Occupancy (Wednesday, Sept 22, 15:00)
Associate Professor (Worcester Polytechnic Institute)
Vuik, Kees
Domokos Vermes received his MS in Electrical Professor (Delft University Of Technology)
Engineering at the Technische Universität Dresden.
He then earned his Ph.D. in Mathematics at the hh Session(s): 2049 - Deflated Preconditioned
University of Szeged and Doctorate in Mathematics at Conjugate Gradient on the GPU
the Hungarian Academy of Sciences. Currently, he is the (Wednesday, Sept 22, 14:30)
Associate Professor of Mathematics and the Founding
Director of Financial Mathematics Graduate Program
and Laboratory at the Worcester Polytechnic Institute. In Vukicevic, Vladimir
the past, he attended the University of Washington and Principal Engineer (Mozilla Corporation)
Brown University. He specializes in: optimization under Vladimir is a principal engineer at Mozilla, where he
uncertainty, optimal control of stochastic processes, computer works on core browser technology. He is involved in
assisted medical decision making, quantitative finance, adding new capabilities to the web platform for use by
portfolio optimization and risk management, and high- both web content and the Firefox browser, and focuses
performance data analysis. on improving the rich media and graphics capabilities of
hh Session(s): 2111 - Using R for High-Performance the web. His early experiments with 3D in HTML canvas
Data Analysis (Tuesday, Sept 21, 16:30) led to the WebGL standard.
hh Session(s): 2113 - WebGL: Bringing 3D
Vetter, Jeffrey to the Web (Tuesday, Sept 21, 15:00)
Professor / Distinguished R&D Staff Member (Georgia
Tech / Oak Ridge National Lab) Wade, Will
Jeffrey Vetter, Ph.D., has a joint appointment between Business Alliance Manager (HP)

Oak Ridge National Laboratory (ORNL) and the Georgia Will Wade is the Business Alliance Manager for Hewlett-
Institute of Technology (GT). At ORNL, Vetter is a Packard Company’s Workstation Global Business Unit.

Distinguished R&D Staff Member, and the founding His responsibilities include working with Intel, AMD, and
group leader of the Future Technologies Group in NVIDIA to develop programs and solutions that benefit
the Computer Science and Mathematics Division. At workstation users. Will joined Hewlett-Packard in 1997
GT, Vetter is a Joint Professor in the Computational as an engineer in Test & Measurement. In 1999 he
Science and Engineering School, where he serves as the transitioned to the Environmental Test Center for the
Principal Investigator of the NSF Track 2D Experimental workstation business and later moved in to a technical
Computing Facility, named Keeneland, for large scale marketing role working with software partners for
heterogeneous computing using graphics processors, workstation applications. He became a Workstation
and of the NVIDIA CUDA Center of Excellence. Product Manager in 2003 where he led the workstation
hh Session(s): 2262 - CUDA Centers of Excellence graphics strategy, and later the entry workstation
Super-Session I (Tuesday, Sept 21, 14:00) business.
hh Session(s): 2233 - Solving Your GPU Computing
Vidal, Antonio M. Needs (Sponsored by HP) (Tuesday, Sept 21, 14:00)
Professor (Universidad Politecnica de Valencia)
Antonio M. Vidal received his Ph.D. degree in Computer Walker, Ross
Science in 1990, from the Universidad Politecnica de Research Professor (San Diego Supercomputer Center)
Valencia, Spain, where he is currently a full professor. He Ross Walker is a Research Professor at the San Diego
coordinates the project “High Performance Computing Supercomputer Center, an Adjunct Professor in the
on Current Architectures for Problems of Multiple Signal Department of Chemistry and Biochemistry at UCSD and an NVIDIA
Processing”, financed by the Generalitat Valenciana in CUDA Fellow. He runs the Walker Molecular Dynamics
the frame of PROMETEO Program for research groups (MD) Lab leading a team that develops advanced
of excellence. His main areas of interest include parallel techniques for MD Simulations supporting simulations
computing with applications in numerical linear algebra improving drug and biocatalyst design. His work includes
and signal processing. improved Quantum Mechanical, Molecular Mechanical
hh Session(s): 2116 - Real-time Multichannel models and the development of a GPU accelerated
Audio Convolution (Thursday, Sept 23, 10:00) version of the AMBER Molecular Dynamics engine
PMEMD. for electromagnetics simulations. He co-authored the
hh Session(s): 2269 - Bringing GPUs first major text on discontinuous Galerkin methods,
to Mainstream Molecular Dynamics published by Springer in 2008. He is currently an
Packages (Thursday, Sept 23, 10:00) associate professor in the department of Computational
and Applied Mathematics at Rice University.
Wang, Long hh Session(s): 2078 - Shockingly fast and accurate
Associate Professor (Super Computing Center, Institute of CFD simulations (Wednesday, Sept 22, 11:00)
Computer Network Information of CAS)
Dr. Long Wang received his Ph.D. in Computational Warren, Stephen
Mathematics from AMSS, CAS (Chinese Academy of Snr Linux Software Engineer (NVIDIA)
Sciences), in 2004. He was then a postdoc in the department of Stephen Warren is a Software Engineer at NVIDIA,
scientific & engineering computing of Peking University working on VDPAU (Video Decode and Presentation
from 2004 to 2006. He is now an associate API for Unix) and related portions of the Linux graphics
professor in the Super Computing Center, Computer driver.
Network Information Center of CAS. His research hh Session(s): 2016 - VDPAU: PureVideo
interests include AMR algorithm, large scale GPU on Unix (Thursday, Sept 23, 15:00)
computing and high performance computing software.
He implemented a parallel galactic wind code using 8192 Washbrook, Andy
cores on DeepComp 7000 supercomputing machine in Postdoctoral Research Assistant (University of Edinburgh)
2008. This summer, he held the first international GPU
workshop in Harbin, China (See: gpu-smp.sccas.cn). Dr. Andrew Washbrook is a physicist programmer based
at the University of Edinburgh working for the GridPP
hh Session(s): 2286 - Towards Peta-Scale collaboration. His previous physics research investigated
Green Computation - Applications of the GPU evidence for supersymmetric particle production
Supercomputers in the Chinese Academy of at CERN and he has also been a technical account
Sciences (CAS) (Wednesday, Sept 22, 11:00) manager for a leading enterprise open source software
company. His current research interests include
Wang, Peng investigating emerging computing methods that can be
Developer Technology Engineer (NVIDIA) used to improve the future software framework of the
Peng Wang is a member of the Developer Technology ATLAS experiment.
group at NVIDIA, where he develops algorithms for GPU hh Session(s): 2135 - Processing Petabytes
computing. Dr. Wang received his Ph.D. in Computational per Second with the ATLAS experiment
Physics from Stanford University, where his primary at the Large Hadron Collider at CERN
research was the development of multi-physics codes (Wednesday, Sept 22, 16:00)
for computational fluid dynamical simulations of
astrophysical turbulence. Dr. Wang’s education also Weber, Jason
includes a M.S. in Physics and a B.S. in Scientific Internet Explorer Performance Lead (Microsoft)
Computing from Nankai University, Tianjin, P.R. China.
Jason Weber is an engineering lead on the Microsoft
hh Session(s): 2008 - OpenCL Optimization Internet Explorer team. Jason is focused on ensuring
(Thursday, Sept 23, 14:00) that Internet Explorer 9 is ready for the performance
hh 2006 - Short-Range Molecular Dynamics demands of HTML5 applications, including hardware
on GPU (Wednesday, Sept 22, 14:00) accelerated graphics and compiled JavaScript. Jason has
been with Microsoft for thirteen years. Before joining
Wang, Xiaowei the Internet Explorer team in 2008, Jason worked on

(Institute of Process Engineering) projects ranging from Microsoft Office to Visual Studio,
and was a member of Chairman Bill Gates’s technical
hh Session(s): 2286 - Towards Peta-Scale

staff.
Green Computation - Applications of the GPU
Supercomputers in the Chinese Academy of hh Session(s): 2274 - Harnessing the Power of the
Sciences (CAS) (Wednesday, Sept 22, 11:00) GPU in Internet Explorer 9 (Tuesday, Sept 21, 16:00)

Wang, Z.J. Weiskopf, Paul

Professor (Iowa State University) Sr. Vice President, Corporate Development (Adobe)
Z.J. Wang, Professor of Aerospace Engineering and As senior vice president of corporate development, Paul
Director of Computational Fluid Dynamics (CFD) Weiskopf leads activities related to Adobe’s strategic
Center at Iowa State University (ISU), received his planning, alliances, mergers and acquisitions, venture
Ph.D. in Aerospace Engineering from the University of investments and new business initiatives. Weiskopf has
Glasgow in 1990. His research areas include adaptive held this role since May 2008, when he was promoted
high-order methods for the Navier-Stokes equations, from his former role as Adobe’s vice president of
algorithm and flow solver development for structured corporate development. Since joining Adobe in 2005,
and unstructured, overset and adaptive Cartesian grids, Weiskopf has managed Adobe’s activities in the areas
computational aeroacoustics and electromagnetics, of corporate strategy, mergers and acquisitions, and
parallel computing on CPUs and GPUs, geometry venture investing, helping Adobe’s product businesses
modeling and grid generation. identify market expansion opportunities and evaluate
options for building, buying and partnering to deliver
hh Session(s): 2292 - Implementation of the most complete solutions to customers. He played
FULL CONFERENCE GUIDE 2010

High-Order Adaptive CFD Methods on an instrumental role in the $1.8 billion acquisition
GPUs (Thursday, Sept 23, 10:30) of Omniture, Inc., the $3.4 billion acquisition of
Macromedia, Inc., and a number of smaller acquisitions
Warburton, Timothy and venture investments.
Associate Professor (Rice University)
hh Session(s): 4010 - Emerging Companies: CEO
Tim Warburton specializes in devising new algorithms on Stage featuring NaturalMotion Ltd, OptiTex,
for solving partial differential equations. He is a leader
in the development of discontinuous Galerkin methods
133
and Useful Progress (Thursday, Sept 23, 15:00)
hh 4011 - Emerging Companies: CEO on Stage
featuring Cinnafilm, Inc., Perceptive Pixel, and
Total Immersion (Thursday, Sept 23, 16:00)

Whitehead, Nathan
CUDA Software Engineer (NVIDIA)
Nathan Whitehead works on the CUDA Platform team.
He holds a PhD in Computer Science from the University
of California, Santa Cruz.
hh Session(s): 2216 - CUDA Libraries Open
House (Wednesday, Sept 22, 11:00)

Williams, David M.
PhD Candidate (Stanford University)
Mr. Williams is a Ph.D. candidate in the Aero/Astro
department at Stanford University under the advisement
of Professor Antony Jameson. In particular, Mr. Williams
is interested in developing efficient higher-order solvers
capable of handling real world applications. He focuses
on characterizing fluid flow over complex geometries
under viscous, unsteady, and compressible conditions.
hh Session(s): 2079 - A Fast, Scalable
High-Order Unstructured Compressible Flow
Solver (Tuesday, Sept 21, 11:00)

Williams, Ian
Director PSG Applied Engineering (NVIDIA)
Ian Williams is currently the Director of Applied
Engineering within NVIDIA’s Professional Solutions
Group, where he has worked since 2001. He holds
a BSc in Engineering Science and Technology from
Loughborough University (UK) as well as an MBA
from Pepperdine University (USA). He is a Chartered
Mechanical Engineer with the Institute of Mechanical
Engineers (UK) and has been awarded 7 patents. He is
also Chairman of the SPEC/GPC committee, which is part of
the Standard Performance Evaluation Corporation.
hh Session(s): 2279 - Working Man’s Guide to 3D
Video Editing (Thursday, Sept 23, 16:00)
hh 2222 - Working Man’s Guide to 3D Video
Editing (Tuesday, Sept 21, 14:00)

Wilson, Adam
Postdoctoral Fellow (University of Cincinnati)
Adam Wilson received his Ph.D. in Biomedical
Engineering at the University of Wisconsin in the field
of Neuroprosthetics. His research has focused on
developing brain-computer interfaces using implantable
electrodes in humans, and using GPU processing to
process high-bandwidth, high-channel-count data
in real time. His work has been featured internationally,
including NPR, CNN, and Time magazine (for being
named one of the top 10 inventions of 2009), and was
named to Popular Science’s Brilliant 10 Class of 2009.
hh Session(s): 2122 - Using GPUs for
Real-Time Brain-Computer Interfaces
(Wednesday, Sept 22, 15:00)

Wilton, Richard
Research Scientist (The Johns Hopkins University)
Richard Wilton obtained his BS and MD degrees from
UCLA. He is interested in astronomical and genomics
research computation at the multi-terabyte and petabyte
scale.
hh Session(s): 2115 - Modified
Smith-Waterman-Gotoh Algorithm for CUDA
Implementation (Thursday, Sept 23, 14:00)
hh 2092 - Integrating CUDA into a
Large-Scale Commercial Database Management
System (Wednesday, Sept 22, 11:00)

Winarsky, Norman
VP Ventures, Licensing, and Strategic Programs (SRI
International)
Norman Winarsky is SRI’s Vice President of Ventures,
Licensing, and Strategic Programs. As such he is
responsible for creating SRI’s highest value venture and
license opportunities. He is the creator and founder of
SRI’s venture process, including venture and license
incubation, seed funding, the EIR program, and Venture
Capital engagement. He chairs SRI’s Commercialization
Board and the nVention Board, a partnership with the
venture capital community that develops early-stage
investment opportunities.
hh Session(s): 4007 - Emerging Companies: CEO
on Stage featuring Aqumin, RTT, and Scalable
Display Technologies (Thursday, Sept 23, 10:00)
hh 4008 - Emerging Companies: CEO on
Stage featuring ICD and Universal
Robotics (Thursday, Sept 23, 11:00)

Witchel, Emmett
Professor (University of Texas at Austin)
hh Session(s): 2124 - Operating System Abstractions
for GPU Programming (Thursday, Sept 23, 10:00)

Woolley, Cliff
CUDA Developer Technology Engineer (NVIDIA)
Cliff Woolley is a developer technology engineer
at NVIDIA focused on enabling high-performance
computing on GPUs. He completed his Master of
Computer Science degree at the University of Virginia in
2003, where his research group was among the earliest
in academia to investigate the use of GPUs for general
purpose computing.
hh Session(s): 2018 - OpenCL on the GPU
(Pre-Conference Tutorial) (Monday, Sept 20, 16:00)

Wu, Ren
Senior Scientist (HP Labs)
Dr. Ren Wu is a Senior Research Scientist at HP Labs,
Palo Alto. His research interests include data-intensive
high-performance computing, massively parallel
algorithms and computational intelligence. In recent
years he has been focusing on GPU acceleration of large
scale analytics, and is well known for his work on
GPU-accelerated clustering algorithms.
hh Session(s): 2069 - GPU-Accelerated Business
Intelligence Analytics (Wednesday, Sept 22, 16:00)

Wu, Xing
Research Assistant (North Carolina State University)
Yongpeng Zhang is a graduate student in Computer
Science at the North Carolina State University. His
research interests are data-intensive programming
models, cloud computing and GPGPU. He received his
Bachelor’s and Master’s degrees from Beihang University
and Drexel University, both in Electrical Engineering.
hh Session(s): 2272 - GStream: A
General-Purpose Data Streaming Framework
on GPUs (Thursday, Sept 23, 09:00)
hh 2026 - MatCloud: Accelerating Matrix Math GPU
Operations with SaaS (Tuesday, Sept 21, 17:00)

Yaacovi, Yoram
CTO and General Manager, Technologies (Microsoft Israel,
R&D Center)
Yoram is the CTO and General Manager of Technologies
of the Microsoft Israel Development Center (ILDC). His
responsibilities include the ILDC Innovation labs – a
greenhouse for breakthrough research and development
as well as new technologies and business groups,
academia and industry outreach and the technology
connection with the Microsoft corporate headquarters.
Yoram started his career in 1983 at Elbit Computers
where he programmed an innovative real time data
entry and display system for the Air Force. In 1984 he
joined Intel where he worked for 9 years in engineering,
consulting and training roles including work with leading
companies in Israel and leading a Unix and X-Windows
development team at Intel in Santa Clara.
hh Session(s): 4003 - Emerging Companies Summit
Panel: GPU for Computer Vision (Wednesday, Sept
22, 15:00)

Yalamanchili, Sudhakar
Professor (Georgia Institute of Technology)
Sudhakar Yalamanchili earned his PhD degree in
Electrical and Computer Engineering in 1984 from
the University of Texas at Austin. After several years
in industry, he joined the faculty of the School of
Electrical and Computer Engineering at the Georgia
Institute of Technology where he is a Joseph M. Pettit
Professor of Computer Engineering. His research
interests are in productivity tools for heterogeneous
architectures/systems and modeling and simulation of
high performance many core architectures focusing on
performance, energy, and thermal characterization.
hh Session(s): 2210 - GPU-Ocelot: An Open
Source Debugging and Compilation Framework
for CUDA (Thursday, Sept 23, 14:00)

Yang, Yi
PhD Student (North Carolina State University)
Yi Yang is a third-year Ph.D. student in the Department of
Electrical and Computer Engineering at North Carolina
State University. He received his master’s degree from
the Chinese Academy of Sciences in 2005 and his bachelor’s
degree from the University of Science and Technology of
China in 2002. His research interests include
high-performance computing, general-purpose computation
on graphics processors, code generation and
optimization.
hh Session(s): 2067 - Experiences with Code
Optimizations for High Performance GPGPU
Programs (Tuesday, Sept 21, 16:00)

Young, Eric
Manager of Developer Technology Professional and
Consumer Applications (NVIDIA)
Eric Young manages the developer technology group
responsible for professional and consumer developers.
He graduated from Cornell University in 1995 with
a Master of Computer Engineering and from the University
of Michigan in 1994 with a Bachelor’s in Electrical
Engineering.
hh Session(s): 2260 - DirectCompute
(Pre-Conference Tutorial) (Monday, Sept 20, 14:30)

Young, Paul
(Adobe)
hh Session(s): 2224 - GPU Acceleration in Adobe
Creative Tools (Tuesday, Sept 21, 15:00)

Zanella, Fabrizio
Systems Manager (CST of America)
Fabrizio Zanella has over 15 years of experience working
on Signal Integrity characterization of high speed digital
systems. Mr. Zanella has worked for several companies,
including Teradyne and EMC Corporation. Currently, he
is the Systems Manager at CST of America, a worldwide
provider of full wave electromagnetic software. In this
role, he leads the high performance computing effort,
advising customers on improving overall performance of
CST tools.
hh Session(s): 2133 - 3D Full Wave EM
Simulations Accelerated by GPU
Computing (Thursday, Sept 23, 16:00)

Zaspel, Peter
(University of Bonn)
Peter Zaspel is a research assistant at the Institute
for Numerical Simulation of the University of Bonn,
Germany. He studied Computer Science and is
now working on his PhD. His research topics are
computational fluid dynamics, general-purpose
computations on graphics hardware, and visualization.
hh Session(s): 2083 - GPU Accelerated Solver
for the 3D Two-phase Incompressible
Navier-Stokes Equations (Wednesday, Sept 22, 16:00)

Zeitlin, Michael
CEO (Aqumin)
Michael began his career at Texaco as a Senior Scientist
in 1980 and became a Texaco Fellow in 1997. He received
the Carnegie Mellon and American Management
Institute Award for Innovation in Information Technology
in 1998. In 1999 his work was honored with a permanent
position in the Archives of the Smithsonian Institution.
He founded Magic Earth, LLC in 2000 and grew
operations to profitability in three months. Magic Earth
was acquired by Halliburton that same year for $100
million.
hh Session(s): 4007 - Emerging Companies: CEO
on Stage featuring Aqumin, RTT, and Scalable
Display Technologies (Thursday, Sept 23, 10:00)

Zhang, Yao
Graduate Student (University of California, Davis)
Yao Zhang is a PhD student in the Department of
Electrical and Computer Engineering at the University of
California, Davis. Zhang received his BS in electrical
engineering from the Beijing Institute of Technology. His
research interests are in the area of GPU computing,
especially in parallel algorithms for numerical linear
algebra, and the GPU architecture/software
optimization.
hh Session(s): 2085 - Tridiagonal Solvers:
Auto-Tuning and Optimizations (Tuesday, Sept 21, 15:00)

Zhang, Yubo
PhD Student (UC Davis)
Yubo Zhang is a PhD student supervised by Prof.
Kwan-Liu Ma at the department of Computer Science,
UC Davis. His research interests include numerical
methods, computer graphics and visualization.
hh Session(s): 2145 - Photo Editing on the GPU
with MuseMage (Thursday, Sept 23, 09:00)

Zhang, Yunquan
Professor (Institute of Software, CAS)
Prof. Yun-quan Zhang is the Associate Director of the
Parallel Computing Laboratory, Institute of Software,
Chinese Academy of Sciences in Beijing, China. He
received his PhD degree in computer software and
theory from the same institute in 2000 and has worked
at the Institute as a research scientist since then.
His major research interests are in the areas of high
performance parallel computing, with particular
emphasis on the design of large scale parallel
computation modes and numerical libraries, and large
system performance modeling and evaluation. He has
published over 90 papers and trained over 20 master’s and
Ph.D. students.
hh Session(s): 2286 - Towards Peta-Scale
Green Computation - Applications of the GPU
Supercomputers in the Chinese Academy of
Sciences (CAS) (Wednesday, Sept 22, 11:00)

Zhao, Kaiyong
Graduate Student (HKBU)
I received my B.Eng. degree in Aircraft Design and
Technology from Beijing Institute of Technology (BIT),
Beijing, P. R. China, in 2005, and then worked at CCUR
for two years. I am currently an MPhil student in the
Department of Computer Science, Hong Kong Baptist
University.
hh Session(s): 2145 - Photo Editing on the GPU
with MuseMage (Thursday, Sept 23, 09:00)

Zhou, Huiyang
Associate Professor (North Carolina State University)
Huiyang Zhou is currently an associate professor in the
Department of Electrical and Computer Engineering at
North Carolina State University. His research focuses on
high performance microarchitecture, low-power design,
architecture support for system dependability, and GPU
Computing. He is a recipient of an NSF CAREER award and
a senior member of the IEEE.
hh Session(s): 2067 - Experiences with Code
Optimizations for High Performance GPGPU
Programs (Tuesday, Sept 21, 16:00)

Ziegler, Gernot
Developer Technology (Compute) (NVIDIA)
Gernot Ziegler (MSc/civ.ing.) is an Austrian engineer with
an MSc degree in Computer Science and Engineering
from Linköping University, Sweden. He pursued his PhD
studies at the Max-Planck-Institute for Informatics
in Saarbrücken, Germany, where he specialized in
GPU algorithms for computer vision and data-parallel
algorithms for spatial data structures. As a member of
NVIDIA’s DevTech-Compute team, Gernot now consults
in high performance computing on graphics hardware.
hh Session(s): 2020 - GPU-Accelerated
Data Expansion for the Marching Cubes
Algorithm (Wednesday, Sept 22, 16:00)
hh 2021 - Efficient Volume Segmentation on
the GPU (Wednesday, Sept 22, 17:00)

Zigon, Robert
Sr Staff Development Engineer (Beckman Coulter)
Bob is the Software Technical Lead for Flow Cytometry
analysis products within Beckman Coulter. His interests
include high performance computing, numerical
analysis and information retrieval theory.
hh Session(s): 2055 - Application of Fermi
GPU to Flow Cytometry and Cancer
Detection (Thursday, Sept 23, 10:00)
A Powerful Platform for Amazing Performance

Performance. To get it right, you need a foundry with an Open Innovation Platform™ and process technologies that
provide the flexibility to expertly choreograph your success. To get it right, you need TSMC.

Whether your designs are built on mainstream or highly advanced processes, TSMC ensures your products achieve
maximum value and performance.

Product Differentiation. Increased functionality and better system performance drive product value. So you need
a foundry partner who keeps your products at their innovative best. TSMC’s robust platform provides the options you
need to increase functionality, maximize system performance and ultimately differentiate your products.

Faster Time-to-Market. Early market entry means more product revenue. TSMC’s DFM-driven design initiatives,
libraries and IP programs, together with leading EDA suppliers and manufacturing data-driven PDKs, shorten your yield
ramp. That gets you to market in a fraction of the time it takes your competition.

Investment Optimization. Every design is an investment. Function integration and die size reduction help drive your
margins. It’s simple, but not easy. We continuously improve our process technologies so you get your designs produced
right the first time. Because that’s what it takes to choreograph a technical and business success.

Find out how TSMC can drive your most important innovations with a powerful platform to create amazing performance.
Visit www.tsmc.com

[Exhibit floor map: HALL 1 (numbered exhibitor booths); Concourse and Research Posters]

EXHIBITORS                      BOOTH #
3DreamTeam                      8
3DTV Solutions                  1
AccelerEyes                     56
Acceleware                      52
Ace Computers                   76
Allegorithmic                   6
AMAX Information Technologies   89/91
Appro International, Inc.       75
Aspen Systems                   68
Binatix Inc                     42
BioDigital                      98
BOXX Technologies               110
Bright Computing                10
Bunkspeed                       104
CIARA Technologies              55
Cinnafilm Inc                   109
Cirrascale Corporation          99
CodeSourcery                    38
Colfax International            51
Cooliris                        37
Creative Consultants LLC        29/30
Cubix Corporation               125
CyberLink Corporation           70
Dassault Systemes               93
Dell                            106/108
Discretix                       123
EM Photonics                    54
embodee                         77
Empulse                         39
Exxact Corporation              95
Filter Foundry                  83
GE Fanuc Intelligent Platforms  101
Geomerics                       33
Giada                           65
GraphStream Incorporated        118
HP                              60/62/64
HPC Project / Wild Systems      5
Hynix                           79/81
IBM                             127
Israel Economic Mission         121
James River Technical           57
Jedox Business Intelligence     40
JMR Electronics, Inc            111
Koi Computers                   84
Los Alamos                      2
Mathworks                       66
Mazda Technologies              41
Mellanox Technologies           49
mental images                   74
Micoy                           Concourse
Microsoft                       87
Microway Inc                    105/107
miGenius Limited                32
Mingleverse                     82
Morgan Kaufmann                 44
NaturalMotion LTD               94
NextComputing                   119
NextIO                          61/63
Nvidia                          Backwall
Nvidia Foundation               126
OptiTex USA Inc                 120
PEER 1 Hosting                  100
Penguin Computing               43
Perceptive Pixel                92
PhaseSpace Inc                  69
Phototour                       45
Platform Computing              72
Playcast                        122
PNY                             110-119
Polywell                        67
Prometech Software, Inc.        4
PSSC Labs                       59
RealityFrontier                 3
Reservoir Labs                  86
RTT USA, Inc                    9
Scalable Display Technologies   47
ScaleForm Corporation           35
SGI                             85
Stonetrip                       7
Softkinetic                     48
Supermicro                      71/73
Synnex Corp                     97
Tech-X Corporation              34
The Portland Group              36
Tide Powerd Ltd.                78
T-Platform                      46
Trinity Racing Concepts, LLC    11/12
Universal Robotics              88/90
Useful Progress                 58
VisiSonic Corporation           80
Wolfram Research                31
PLATINUM SPONSORS
Adobe
Adobe revolutionizes how the world engages with ideas and information.
Our award-winning software and technologies have set the standard for
communication and collaboration for more than 25 years, bringing vital and
engaging experiences to people across media and to every screen in their
lives, at work and at play. The impact of Adobe® software is evident almost
everywhere you look. Whether people are collaborating at work, transacting
online, or socializing with friends, businesses use Adobe software and
technologies to turn digital interactions into richer, high-value experiences
that reach across computing platforms and devices to engage people
anywhere, anytime. With a reputation for excellence and a portfolio of many
of the most respected and recognizable software brands, Adobe is one of the
world’s largest and most diversified software companies.

HP
HP is a technology company that operates in more than 170 countries
around the world. We explore how technology and services can help people
and companies address their problems and challenges, and realize their
possibilities, aspirations and dreams. We apply new thinking and ideas to
create more simple, valuable and trusted experiences with technology,
continuously improving the way our customers live and work.

Microsoft
Microsoft Corporation is passionate about software innovation and is
making a huge investment into developing products and applications for
the Technical Computing world. Modeling and simulation as well as raw
computational performance are critical for scientists, engineers and analysts
seeking solutions in an ever larger sea of data. Professionals seek to model
an ever increasing scale of scenarios using the most advanced mathematical
functions in the quickest manner possible. However, the path from math
to model to results often takes too long. Microsoft has formed a Technical
Computing organization to enable solutions to this challenge using the full
range of compute platforms: client, cluster and cloud.

Dell
Dell’s research and development (R&D) efforts span the globe, driven by
some of the industry’s foremost product designers and engineers. At
the core of Dell’s innovation approach, however, remains an unwavering
commitment to deliver new and better solutions that directly address
customer needs. Many innovations begin in-house, led by a global team of
top engineers, product designers and technical experts. Others begin as
a team effort with Dell’s strategic partners like Nvidia. The mission is to
deliver innovative and cost-effective solutions that meet today’s real-life
customer challenges and work seamlessly in existing environments and with
other products. For more information on Dell’s high performance solutions
please visit www.dell.com/precision or www.dell.com/solutions.

Supermicro
Supermicro, the leader in server technology innovation and green
computing, provides customers around the world with application-optimized
server, workstation, blade, storage and GPU systems. Based
on its advanced Server Building Block Solutions, Supermicro offers the
most optimized selection for IT, datacenter and HPC deployments. The
company’s system architecture innovations include Twin server, double-
sided storage and SuperBlade® product families. Offering the most
comprehensive product lines in the industry, Supermicro delivers energy-
efficient solutions with unmatched performance and value. Founded in 1993,
Supermicro is headquartered in Silicon Valley with worldwide operations and
manufacturing centers in Europe and Asia. For more information, visit
www.supermicro.com.

PNY
Established in 1985, PNY Technologies®, Inc. is the authorized NVIDIA®
Quadro® channel partner for North America and Europe. PNY provides
unsurpassed service and commitment to its professional graphics
customers, offering a 3 year warranty, pre- and post-sales support, dedicated
Quadro Field Application engineers and direct tech support hot lines. The
company also offers a full line of consumer graphics cards, computer
memory upgrade modules, solid state drives, flash memory cards, USB flash
drives, and HDMI cables. Headquartered in Parsippany, NJ, PNY maintains
facilities in North America, Europe, Asia, and Latin America. For more
information, please visit http://www.pny.com.

Cooley
Cooley LLP is a national law firm for the converging worlds of high
technology, high finance and high-stakes litigation. We are counselors,
strategists and advocates for the foremost private and public companies and
investors in all major technology fields. Our Emerging Companies practice
has a long tradition of representing emerging and high-growth companies
worldwide. The GPU space is an exciting growth area in the technology arena,
and Cooley has been at the forefront, advising both established and start-up
companies on the issues facing businesses in this industry. Our attorneys’
extensive experience in intellectual property protection and business
counseling along with the Firm’s deep roots in the technology sector give
us a unique perspective on the issues facing our clients. Cooley’s team
consists of experienced counselors and litigators that are equally skilled at
representing and advising clients on the protection and commercialization
of their intellectual property in a wide range of areas, including copyright,
trademark, patent, technology licensing, privacy, electronic security and
electronic commerce. We are dedicated to offering comprehensive and
creative legal support, utilizing the full resources of the Firm.

Synnex Corp
SYNNEX Corporation, a Fortune 500 corporation, is a leading business
process services company, servicing resellers and original equipment
manufacturers in multiple regions around the world. The Company provides
services in IT distribution, supply chain management, contract assembly
and global business services. Founded in 1980, SYNNEX employs over 7,000
associates worldwide and operates in the United States, Canada, China,
Japan, Mexico, the Philippines and the United Kingdom. Our value-added
service model streamlines business processes to help customers across the
globe lower their costs and create greater efficiencies. We provide a variety
of professional and marketing services, including: demand generation,
education and training, pre- and post-sale technical support, end-user
enablement, server assessment, design and integration, recycling and
trade-in, contract design and assembly, and IT resource planning.

Acer
Established in 1976, Acer ranks No. 2 globally for total PCs and notebooks.
A profitable and sustainable Channel Business Model is instrumental to the
company’s continuing growth, while its multi-brand approach effectively
integrates Acer, Gateway, Packard Bell, and eMachines brands in worldwide
markets. Overcoming the barriers between people and technology: this
is Acer’s long-term mission, to allow anyone to use and benefit from
technology. Acer is renowned for the development and manufacture of
sophisticated, environmentally friendly, intuitively designed and easy-to-use
products. For further information, please visit the website acer-group.com.

Gateway
The Californian company Gateway is a historical brand in the IT market, and
has been a leading company in the field of computers and notebooks for
nearly 25 years. Now, thanks to the creation of a new structure exclusively
focused on businesses, Gateway enters Europe as a player that is fully
committed to the professional market and that can develop complete and
reliable business solutions for partners dedicated to medium-sized and large
companies. Gateway products are available in Europe only through indirect
distribution channels and, in particular, through its network of selected
Partners. For further information, please visit the website gateway.com.

TSMC
TSMC is the world’s largest dedicated semiconductor foundry, providing
the industry’s leading process technology and the foundry’s largest
portfolio of process-proven libraries, IP, design tools and reference flows.
The Company’s total managed capacity in 2008 exceeded 9 million 8-inch
equivalent wafers, including capacity from two advanced 12-inch GigaFabs™,
four eight-inch fabs, one six-inch fab, as well as TSMC’s wholly owned
subsidiaries, WaferTech and TSMC (China), and its joint venture fab, SSMC.
TSMC is the first foundry to provide 40nm production capabilities. Its
corporate headquarters are in Hsinchu, Taiwan. For more information about
TSMC please visit http://www.tsmc.com.

Samsung
Samsung Electronics Co., Ltd. is a global leader in semiconductor,
telecommunication, and digital convergence technologies. The company
consists of seven independently operated business units: Visual Display,
Mobile Communications, Telecommunication Systems, Digital Appliances,
IT Solutions, Semiconductor and LCD. The Samsung Semiconductor
Business provides the most extensive range of Flash, DRAM, SRAM and
specialty memories in the world and paces the industry in advancing
memory technology for home, mobile and office environments, for consumer
electronics and networked applications. For more information, please visit
www.samsung.com/GreenMemory.

IBM
IBM delivers powerful, innovative solutions to customers’ most challenging
and complex problems, enabling businesses and researchers to innovate,
achieve breakthrough results and establish sustainable competitive
advantage. IBM takes a holistic approach to high performance computing
that involves designing and delivering complete, robust technical solutions
that are easy to acquire and access, environmentally responsible,
competitively priced, and backed by IBM’s world-class support.

GOLD SPONSORS
Next IO
NextIO, Inc. is the leader in next-generation network consolidation
solutions for today’s dynamic data center in a variety of industries including
enterprise, oil and gas, High Performance Computing, digital media
and financial services. Leveraging PCI Express, NextIO offers true I/O
consolidation for any end-point technology.

Citi
Citi is today’s pre-eminent financial services company, with some 200 million
customer accounts in more than 100 countries. Our history dates back to the
founding of Citibank in 1812, Bank Handlowy in 1870, Smith Barney in 1873,
Banamex in 1884, and Salomon Brothers in 1910.

SILVER SPONSORS
GE Intelligent Platforms
GE Intelligent Platforms is a leading manufacturer of rugged COTS
computer boards and systems for military programs. As a partner to NVIDIA
for Embedded Applications, GE-IP brings GPGPU technology into a wide
range of defense-related programs; it can now be used in ground tanks,
fighter aircraft, military helicopters, and UAVs for Radar, ISR, DSP, Sensor
Processing, Imaging and many other military applications.

AMAX Information Technologies
With 31 years as a leading systems manufacturer, AMAX delivers the
most cutting-edge GPU and GPGPU computing solutions to solve data-intensive
computing challenges in today’s leading industries. AMAX uses an
open-architecture approach that optimizes best-of-breed components for
solutions designed to fit precise needs with maximum functionality, superior
performance and power efficiency. This is the AMAX advantage.

SGI
SGI is focused on helping customers solve their most demanding technology
challenges by delivering: high performance computing, servers, storage, data
center and cloud computing solutions and professional services. We develop,
market and sell a broad line of low-cost, mid-range and high-end scale-out and
scale-up servers and data storage solutions as well as differentiating software.

Appro
Appro is a leading developer of supercomputing solutions for the high
performance computing markets. Appro accelerates technical applications
and business results through balanced architecture, open standards and
engineering expertise. Appro is headquartered in Milpitas, CA, with sales and
service offices in Houston, TX, and a global sales and service presence with
strategic international partners.

Sutter Hill Ventures
Sutter Hill Ventures has financed technology-based start-ups and
assisted entrepreneurs in building market-leading companies since 1964.
Through our decades of experience, we have developed strong industry
networks, considerable operating and venture capital experience, and
an understanding of the challenges that early-stage and high-growth
companies face.

Hynix
Hynix is a leading producer of DRAM, NAND Flash memory and Image
Sensors. Besides computing memory solutions, Hynix is a leader in Graphics
memory with a portfolio of high performance products. The newly introduced,
eco-friendly 1.35V, 44nm 2Gb GDDR5 offers 7Gbps speed targeting high end
graphics and high performance computing.

Dassault Systèmes
Dassault Systèmes is a world leader in 3D and Product Lifecycle
Management (PLM) solutions, bringing value to over 100,000 customers in
80 countries. Dassault Systèmes develops and markets PLM software and
services that support industrial processes and provide a 3D vision of the
entire product lifecycle from conception to maintenance.

Silicon Valley Bank
Silicon Valley Bank is the premier commercial bank for companies in the
technology, life science, venture capital, private equity and premium wine
industries. SVB provides a comprehensive suite of financing solutions,
treasury management, corporate investment and international banking
services to its clients worldwide. Through its focus on specialized markets
and extensive knowledge of the people and business issues driving
them, Silicon Valley Bank provides a level of service and partnership that
measurably impacts its clients’ success. Founded in 1983 and headquartered
in Santa Clara, Calif., the company serves clients around the world through
26 U.S. offices and international operations in China, India, Israel and the
United Kingdom. Silicon Valley Bank is a member of global financial services
firm SVB Financial Group (Nasdaq: SIVB), with SVB Analytics, SVB Capital,
SVB Global and SVB Private Client Services. More information on the
company can be found at www.svb.com. Silicon Valley Bank is the California
bank subsidiary and the commercial banking operation of SVB Financial
Group. Banking services are provided by Silicon Valley Bank, a member of
the FDIC and the Federal Reserve System. SVB Private Client Services is a
division of Silicon Valley Bank. SVB Financial Group is also a member of the
Federal Reserve System.

Deloitte Deloitte LLP’s Technology, Media and Telecommunications (TMT) industry

group helps TMT companies evaluate complex issues, develop fresh
approaches to problems, and implement practical solutions. Clients include
some of the world’s leading and fastest growing semiconductor, software
and computer manufacturing companies, wireless operators, satellite
broadcasters, advertising agencies—as well as leaders in digital media,
gaming, clean tech, publishing and telecommunications.
Barco With more than 25 years of experience in 3D, Barco designs and develops
display visualization solutions for a variety of markets, including simulation,
virtual reality, medical, defense, media and entertainment, and traffic,
surveillance and monitoring. A global technology company, Barco (NYSE,
Euronext Brussels: BAR) is active in more than 90 countries and employs
3,300 staff worldwide.

Mandel Communications Inc. Mandel is a provider of communication skills coaching and training to
leading companies worldwide. Since 1993, Mandel has been dedicated
to helping executives, sales, and technical professionals achieve
improved business results through effective business conversations and
presentations, particularly when the stakes are high. Mandel’s proprietary
approach to strengthening presentation, conversation, and facilitation skills
turns spoken communications into a competitive advantage. With a global
presence, Mandel delivers its solutions both face-to-face and virtually in
more than 14 languages.

Emerging Companies
Exhibitors
3DreamTeam 3DreamTeam LLC have created a unique library of photo realistic
3D worlds and images as well as the Vizerra software platform and
ecosystem which allows their in-house team of thirty designers,
software engineers, technical artists or any independent studio
to assemble photo realistically rendered, 3D environments and
marketing content. The Vizerra Platform and Ecosystem will drive
the content boom creating demand for enterprise and consumer
hardware. The unique benefits of the Vizerra Platform and
Ecosystem are speed to create, cost of production and size of file vs.
quality of rendering.

3DTV Solutions 3DTVSolutions™ conceives and develops cameras and software

suites to create and manipulate real and virtual images in
FullDepth™3D to be viewed without glasses in real time. Our
unique expertise in the complete chain of 3D imaging: Production-
Transmission and Display opens up huge market potential
for industry, healthcare, audio-visual production, and gaming.
3DTVSolutions™ will demonstrate its capabilities in terms of
capture, editing and display of real and virtual 3D images to be
viewed on auto-stereoscopic screens without glasses. These
images come from our unique camera system that films directly in
3D and/or computer-generated images to be mixed and edited to
create films or augmented reality in FullDepth™ 3D.

AccelerEyes AccelerEyes delivers simple software for powerful visual

computing. AccelerEyes’ product, Jacket, is the GPU engine for
MATLAB®, enabling MATLAB code to run faster. Jacket is used by
customers across all major technical industries. Jacket’s focus is
to make GPU programming easy. Within minutes of downloading
Jacket, the scientist, engineer, or financial analyst is able to
accelerate their MATLAB code.

Acceleware Acceleware provides High Performance Computing software

solutions to the Oil & Gas and Computer Engineering
markets for multi-core CPUs and compute GPUs. Additionally,
our HPC consulting services are utilized by enterprises
needing to harness the power of parallel computing. At
Acceleware the goal is always the same – Go Faster!
Allegorithmic Allegorithmic is the first company to propose a professional toolset
for compression, authoring and on-the-fly rendering of smart
textures. Allegorithmic’s new product range Substance is poised to
redefine the development and distribution of rich content for the next
generation of online and mobile games: Substances are dynamic
and are typically 500-1000 times smaller than regular textures.

Binatix Binatix develops pattern recognition software based on machine

learning algorithms that mimic the human brain.

Biodigital BioDigital is dedicated to using cutting edge biomedical

visualization systems to improve training, communication and
the interpretation of medical information. From 3D animation, to
virtual training environments, to systems that intuitively store and
visualize scientific data, BioDigital’s products and services promise
to revolutionize the way we understand medical subjects.

Bunkspeed Bunkspeed is a leading global provider of 3D rendering and

animation software for the design and creative industry.
Bunkspeed is a private company founded in November of 2002
with the philosophy that 3D rendering software should be easy to
learn, simple to use and produce stunning photographic results.
Headquartered in Carlsbad, California, Bunkspeed’s products
include Bunkspeed Shot™, a “virtual digital camera” with interactive
ray-tracing empowering anyone to quickly create photographic-
quality images, Bunkspeed Move™, a “virtual movie camera”
bringing products to life by quickly creating various types of
animations, and Bunkspeed Drive™, a fully featured visualization
application tuned for the automotive industry. Bunkspeed’s
customers include Frog Design, Pininfarina, Unilever, Rubbermaid,
Nike, Ford Motor Company, Honda, and Tiffany’s. For more
information on Bunkspeed’s products and services, visit
www.bunkspeed.com.

Cinnafilm, Inc. Cinnafilm™ was founded to address the absence of quality,

software-based tools to visually optimize, convert and repurpose
video images in the post production market. Recognizing the power
of modern Graphics Processing Units (GPUs) and the accelerating
migration to file-based workflows, Cinnafilm disregarded prevailing
industry methods in 2005 and started from scratch to create what
has become the world’s fastest, most accurate GPU-based image
processing engine, Pixel Strings™. Cinnafilm is a growing company
that has successfully partnered with some of the strongest names
in the motion picture and television industry: ARRI, Quantel, Rhozet,
and NVIDIA. Our partners have recognized the power of Pixel
Strings and the superlative image quality which can result when
this power is properly harnessed. Cinnafilm is a privately held
company, headquartered in Albuquerque, NM, amongst powerful
resources such as the nation’s defense laboratories and New
Mexico’s highly competitive film tax incentives.

Code Sourcery CodeSourcery builds software tools that enable its customers to
get the most out of hardware platforms ranging from embedded
devices to supercomputers: Sourcery G++, tools for professional
embedded C and C++ developers, and Sourcery VSIPL++, a C++
library for developing high performance signal- and image-
processing applications.

Cooliris, Inc Cooliris was founded in January 2006 with a simple mantra:
“Think beyond the browser.” We focus on creating products that
make discovering and enjoying the Web more exciting, efficient,
and personal. Each of us is passionate about serving our users
without compromise and seeing that our products deliver the best
experience possible. Headquartered in Palo Alto, CA, Cooliris is
backed by Kleiner Perkins Caufield & Byers, DAG Ventures, the
Westly Group, and T-Venture. For more information, please visit
www.cooliris.com/company.
Cyberlink CyberLink Corp. is the leader in enabling digital multimedia on
PCs and CEs. Our software solutions include 3D stereoscopic
applications, Blu-ray Disc playback and creation, digital home
entertainment, and touch-enabled media solutions. CyberLink’s
partners among worldwide business leaders in the PC industry include
top-5 desktop and notebook brands, drive manufacturers, and
graphics card makers. CyberLink offers the most extensive array
of 3D software for the consumer market, from Blu-ray™ 3D to 3D
conversion of 2D video and even 3D slideshow applications. Hardware
support from NVIDIA® 3D Vision™ and optimization for NVIDIA®
CUDA™ decoding and encoding technology results in a superb
viewing experience with reduced loading on PC system resources.

Discretix Security and content protection lie at the heart of the mobile
and home entertainment markets. Discretix’ suite of embedded
security solutions includes security co-processors, security sub-
systems, cryptographic cores and content protection applications.
The Discretix Downloadable Secure Player allows content service
providers (CSP) to target the large number of devices already
in the market, overcoming the dependency on pre-installed or
embedded device applications and accelerating the deployment
of new services. This unique secure player supports both industry
standard and proprietary content protection schemes and is
compliant with the requirements of the content owners. Suitable to
a wide variety of connected devices including smartphones, e-book
readers, tablet computers, netbooks and internet-enabled TVs, the
secure player enables a broad range of premium content services
and applications. The secure player is available on various open
operating systems including iPhone OS, Android, Windows Mobile
/ Windows Phone, Linux and Symbian. Discretix’ content protection
solutions are field proven and trusted by some of the world’s best-
known semiconductor and device manufacturers including Intel,
HTC, SonyEricsson, Acer and Motorola. For more information,
visit www.discretix.com.

embodee embodee™ gives you an Online Try-on℠ service with digital

versions of real-world jeans, enabling you to quickly find a great
looking pair. Garment sizes are automatically matched with your
body, making it easy to select a size. Embodee’s GPU equipped
servers run highly accurate simulations of look and fit - of any jeans
style, in any size, on your body – and we deliver them on-demand to
any device. www.embodee.com

EM Photonics EM Photonics is a recognized leader in implementing computationally

intense algorithms on commodity hardware platforms. We develop
custom solutions using GPUs, FPGAs, and embedded systems,
for clients seeking to optimize their scientific computing, image
processing, and numerical analysis applications. Our most popular
commercially available software product is CULA, a GPU-accelerated
linear algebra library built in collaboration with NVIDIA.

Empulse GmbH Empulse is an IT service company focussing on the development

of CUDA applications for High Performance Computing. Based in
Cologne, Germany, a team of highly professional developers and
architects designs and implements high quality solutions. The main
product of empulse is Parstream, the first database engine running
on GPUs while using optimized indexing algorithms. Parstream
is the optimal solution for filtering and analyzing very large
structured datasets with billions of records, like web-logs, financial
transaction histories, call data records etc. It needs a fraction of
the time required by conventional solutions.
Filter Foundry Filter Foundry is the first social network marketplace for the
worldwide audience of designers and visual artists, enabling
“members” to make money and enhance their careers daily. Individuals
and creative companies build and manage a profile and creative
work, market talents, get work, hire and manage projects, search
and answer vital questions, teach and learn, and buy digital products.

Geomerics Geomerics is an innovation-led company specialising in graphics

software. We are built on a combination of advanced in-house IP, a
world-class research team, and strong management experience.
Geomerics’ first product is Enlighten. Enlighten redefines the way
lighting is handled in computer games. Instead of pre-baking the
effects of global illumination into the scene they are computed
at run time, allowing for fully dynamic lighting that dramatically
enhances quality. Enlighten gives artists total control over
lighting, driving a new generation of games that rival film for their
manipulation of mood and atmosphere. Licensees of Enlighten
include AAA titles in production at EA DICE, CCP and FunCom.

HPC Projects/ Wild Systems HPC Project is a high-tech company whose mission is to supply
combined hardware and software solutions to users requesting
high performance computing for their applications. Under the
brand name Wild Systems, HPC Project provides dedicated
appliances. Using the open source software Par4All to translate
standard C code into CUDA-capable code, the appliances take full
advantage of NVIDIA GPU technology. One of Wild Systems’
turnkey appliances is the WildLab for the simulation community.
With the WildLab, users creating models in the Scilab script
language seamlessly produce GPU-accelerated autonomous
executable code.

Israel Economic Mission (IEM) “The Government of Israel Economic Mission (IEM) is responsible
for enhancing bi-national trade relations between the West Coast
and Israeli business communities. By leveraging its networking
capacity and industry knowledge in Israel and the Western United
States, IEM is able to seamlessly engage prospective business
partners half a world apart. Such actions manifest an array of
high-level connections ranging from brokering introductions to
organizing international trade missions. Our operations span a
variety of industry sectors, with a focus on high-tech, security, new
media, cleantech, and biotechnology.”

Jedox Business Intelligence Jedox develops a centralized, in-memory, GPU-based calculation

engine that controls and stores the spreadsheet-based business
Intelligence data contained in every Excel, Open Office and Google
spreadsheet in an organization. This technology stops “Spreadsheet
Spreadmart Chaos” (hundreds of spreadsheets with uncoupled,
non-verifiable data “running amok” in an organization).

Mersive Technologies Inc Mersive Technologies’ display management software and solutions
bring unprecedented simplicity and affordability to large-scale,
beyond-HD displays allowing visual collaboration to go mainstream.
Mersive’s patented SOL software automatically aligns multiple
projectors into one seamless image of extraordinary quality and
resolution without the expense of specialized hardware, building
infrastructure or services.

miGenius Limited Based on the powerful combination of NVIDIA CUDA-based GPUs
and mental images ‘iray’ rendering technology, miGenius has
developed an easy to use toolset, EasyRS, allowing non-technical
users to create fully customized User Interfaces with a wide
range of viewing and management controls that can be quickly
uploaded either onto a dedicated GPU server or onto the rapidly
emerging ‘cloud computing’ networks. With these toolsets, the
vast market of both businesses and consumers alike will be able
to rapidly transform these powerful web-based rendering platform
technologies into a customisable and truly revolutionary 3D visual
communication and collaboration medium.

Milabra Milabra uses proprietary, patent pending machine vision software to

create visual data about the web. We bridge the visual relationship
between the Ad, the Page, and the Audience for increased
advertiser performance. We analyze the visual attributes and
content of webpages in order to target online display advertising,
optimizing creative choice based on the visual environment within
which it will appear. We analyze page elements as well as the
photo and video content, providing real time data for targeting and
optimization as well as defending advertisers against negative
content associations.

Mingleverse Revolutionizing the way people meet and interact, Mingleverse

brings to the market an entirely new “Virtual Telepresence”
communications service allowing you to talk and share media live
within a browser based virtual “Mingle Room”. It is simply the new,
easy, and fun way to talk and meet online!

NaturalMotion Ltd NaturalMotion Ltd. is a leading entertainment software company

with offices in Oxford (England) and San Francisco (California).
The company produces the widely-adopted animation technologies
euphoria, morpheme and endorphin, used across the game
and movie industries by companies such as Rockstar Games,
LucasArts, Disney and Bioware as well as in the development
of Backbreaker, the company’s first in-house game.

OptiTex OptiTex is the leading developer of 2D & 3D CAD solutions for

virtually all sewn-products industries. OptiTex presents a complete
content creation solution, to create and visualize customized
garments, to simulate fitting and draping of garments on fully
parametric virtual models and to create movie clips in an
immediate and direct manner.

Perceptive Pixel Perceptive Pixel is dedicated to the research, development and

deployment of multi-touch interfaces for the knowledge worker.
The company’s hardware and software solutions enable both novice
and expert users to manipulate complex datasets through a new
class of intuitive yet powerful and visually rich interface techniques.

PhaseSpace PhaseSpace uses NVIDIA GPUs with custom 4-megapixel cameras for
pose tracking, eye tracking, 3D scanning and computer vision tasks
in real time. PhaseSpace’s list of clients includes the U.S. Navy,
Air Force, Army, NASA, Boeing, Disney, Honda, Google, Stanford,
Berkeley, MIT, Cambridge, Sandia Labs, Sony, MTV, and others.

Phototour Phototour is the “social Google streetview”, created by
crowdsourcing geotagged photographs and panoramas. Although 40% of
the revenue generated online relates to travel, exploring and
discovering new places still remains a challenge. Phototour
conquers this challenge by providing a convenient, intuitive, and
reliable way to browse geospecific photos. Users also have the
freedom to create their own virtual tour using photos or panoramas
created by our patented web based photostitcher without having to
download any software.

Playcast Media Systems Playcast Media Systems brings video games to the world’s largest
media distribution platform – Pay TV networks. The company’s
solution includes a head-end based system, which streams a
game’s audiovisual content as a standard MPEG stream, as well as
the provisioning of the content and programming itself. Playcast’s
media streaming systems, located in operators’ headends, host
the games and stream them over the existing video network to an
already distributed base of set-top boxes. Playcast is a privately
owned, venture capital backed company, based in Israel and the UK.
Prometech Prometech Software, Inc. provides a particle-based CAE
software “Particleworks” for Japanese manufacturing industries.
Prometech, as a university-launched technology venture, has
strong technical capabilities in complex physics simulation,
visualization and many-core acceleration and offers physics-based
simulation in the fields of manufacturing, VFX and scientific
research.

RealityFrontier RealityFrontier, an affiliate of CoroWare Inc. and its partners, is

dedicated to delivering services and sales enablement solutions
in three converging technical fields of expertise: hardware
acceleration, connected systems and telepresence devices.
RealityFrontier’s market is with service providers and designers.
RealityFrontier’s long-term goal is to become an ISV in augmented
reality, defined as a virtual interaction layer between the physical
world and the user. To pursue that goal, RealityFrontier capitalizes
on its partners’ know-how in video streaming, embedded systems,
user interaction, data manipulation and high-performance
computation. RealityFrontier is actively reaching out to sponsors
and partners.

Reservoir Labs Reservoir Labs has developed compiler algorithms for

automatically performing the resource tradeoffs (parallelism vs.
locality vs. contiguity) needed to automatically translate sequential
loop nests in C into CUDA. This can multiply the productivity of
users bringing new algorithms to NVIDIA GPUs while reaching a high
percentage of peak performance.

RTT Realtime Technology (RTT) AG stands for creative and fascinating

3D visualization solutions, which bring products to life in realtime
and portray them in a natural and realistic environment. The
company provides its clients with assistance during each stage
of the life cycle of their products – from the initial product design
stage through to development and subsequent marketing and
sales. The 3D data model from the product development stage
serves as the basis for all the following steps in the product
lifecycle. It can be used, for example, to rapidly create computer
generated, photorealistic product illustrations for the marketing
department or to develop a 3D online product configurator on a
website. In this way, RTT doesn’t just speed up decision making
and development processes for its clients, but it also opens up new
opportunities with regard to marketing and sales. The company
was founded in 1999 and its head office is in Munich, Germany.
RTT AG has over 400 employees and is represented in 14 locations
worldwide. Many leading businesses have put their trust in RTT and
its portfolio of clients includes names such as Adidas, Audi, BASF,
BMW, Bosch, Daimler, EADS, Harley-Davidson, Miele, Porsche,
Samsung, Thyssen-Krupp, Toyota and Volkswagen. RTT AG is
a stock market listed company (Xetra:R1T; WKN: 701220; ISIN:
DE0007012205). For more information visit www.rtt.ag.

Scalable Display Technologies Scalable Display Technologies, Inc. is a leading provider of

software used to create large projection-based displays with
resolution far beyond HD. Scalable’s patented software is the
catalyst for an emerging class of displays. Its software simplifies
the creation of super-resolution, multiprojector displays of the
highest quality and scalable size. EasyBlend opens the door to
widespread use of multi-projector edge-blended displays for a wide
range of applications including simulators based on off-the-shelf
components, as well as supporting new forms of digital signage
and data visualization tools.

ScaleForm Corporation Scaleform is the leading provider of user interface software for the
videogame and consumer electronics industries. Scaleform GFx leverages
the power of the Adobe® Flash® tool set and enables developers to
quickly create powerful and immersive user interface environments
while streamlining workflow and improving time to market.

Sea CO2 Seac02 provides augmented reality software aimed at
improving the quality and efficiency of engineering, marketing,
sales and communication processes and at reinforcing awareness
in consumer purchasing.

Stonetrip Stonetrip is a leading 3D engine company for games and 3D

applications. Headquartered in France, the company designs and
supports ShiVa, powerful easy-to-use tools for creating amazing
3D real-time applications and games. Stonetrip continues to
add additional platforms as it extends its reach to new markets,
maintaining its position as the most cross-platform-compatible
solution in the market today.

Softkinetic Softkinetic is the leader in natural interfaces that transform the

way people interact with the digital world. We provide the most
advanced software platform for building immersive, transparent
and intuitive user experiences within the fields of Interactive
Digital Entertainment, Serious Games, Interactive Marketing and
Consumer Electronics. More information: www.softkinetic.net

Tide Powerd Ltd. Founded in 2009 at the University of Alabama, TidePowerd Ltd.
was the first-ever participant accepted into Red Gate Software’s
(http://www.red-gate.com) “Springboard” incubator program. Our
core goal is to provide developers with powerful, yet easy-to-use
tools for numerical and high-performance computing (HPC) - and
to provide those tools with the best value and technical support
possible. To this end, we’ve created GPU.NET, a system that
allows developers to write their GPU code in any .NET-supported
language (e.g., C#, F#, IronPython). GPU.NET opens up the exciting
world of GPU computing to millions of new developers worldwide,
and we hope it will help to make GPU computing more popular
than ever before.

Trinity Racing Trinity Racing Concepts produces high quality motorsports

simulation products and services. Trinity is the leader in the
integration of Stereoscopic 3D into hyper-realistic control
systems to develop the most accurate and immersive motorsports
simulation experience available. In addition to creating custom and
retail simulators, Trinity also develops and manages race-themed
promotional tours, trade shows, company parties, team-building
activities, and other events.

Universal Robotics Universal Robotics creates software that enables machines to learn
from their experiences, react and adapt to their surroundings, and
perform tasks that are costly, dangerous or difficult for humans to
undertake. The company’s signature technology, Neocortex, which
was developed over seven years at NASA and Vanderbilt University,
will increase efficiency and worker safety across industries in
applications including warehousing, mining, handling hazardous
waste and automating vehicles such as forklifts.

Useful Progress Developments in computer graphics enable huge progress
in the knowledge of Life and Matter. In medical science, CT
scanners allow the whole body to be investigated with transparency.
A very important step in data analysis consists of converting signals
(X, MR, US) into digital data that can be processed by computers.
UsefulProgress develops new software strategies based on
computer graphics for high-performance visualisation.

acoustic signal processing algorithms with computer vision on
CUDA platforms to provide enabling technologies addressing real
world solutions. VisiSonics RealSpace offerings demonstrate
advanced telepresence combining vision and sound in a unified
solution space. Applications include industrial acoustic analytics,
virtual reality, surveillance and HRTF.
GTC Exhibitors

Ace Computers Ace Computers is a 27-year-old system integrator with a focus on
high performance computing, workstations, servers and storage.
We hold WSCA contract B27157, GSA schedule GS-35F-0400T and
other BPAs including major universities and federal agencies.

Aspen Systems Aspen Systems, founded in 1982, is an established, privately-held,

two-time Inc. 500 corporation that designs, manufactures, and
services computing products including high-performance compute
clusters, systems software, storage/file systems, and visualization.
Aspen Systems places its highest priority on first class technical
support and the creation of fully customized products that always
incorporate the latest technologies. This allows our customers to
enjoy the highest performing solutions at very competitive prices.

Boxx Technologies BOXX Technologies offers high-performance solutions that

empower your animation, VFX, architectural or industrial design
software and render your creations faster than any other. From
our XXtreme, 4 GPU workstations to our state-of-the-art rendering
systems, BOXX solutions are built by the one company that
understands and supports your creative business.

Bright Computing Bright Computing is a specialist in cluster management software

and services for high-performance computing (HPC) in general
and GPU clusters in particular. Its flagship product — Bright
Cluster Manager — makes clusters of any size easy to install, use
and manage. Bright Cluster Manager has earned a reputation for
being the best cluster management software on the market for
GPU clusters. Bright Computing works closely with NVIDIA and has
customers in industry, government and academia across the world.

CIARA Technologies CIARA designs, develops, markets, services, and supports a variety
of High Performance systems including TITAN Systems based
on NVIDIA® Tesla™ 20-series, NEXXUS-4000® and CX1 Personal
Cluster, the acclaimed VXRACK® high density blade server, VXPRO®
rack-mount/tower servers and GRAPHIXX® high-end workstations.

Cirrascale Corporation Cirrascale Corporation is a premier developer of blade-based

cloud servers, GPGPU systems and scale-out storage solutions.
Cirrascale’s patented Vertical Cooling Technology allows it to provide
the industry’s densest, most energy-efficient, reliable standards-
based solutions. To learn more about Cirrascale and its unique
blade-based solutions, please visit http://www.cirrascale.com or
call (888) 942-3800.

Colfax International Buy it from a trusted expert. Colfax provides the most comprehensive
range of innovative, cutting-edge and highly customized GPU
solutions. Colfax has been first-to-market with a 4GPU PSC and the
revolutionary CXT8000 - the world’s first 8GPU server unveiled at
GTC 09. Leading universities, labs and companies are accelerating
their research and business outcomes with optimally configured,
ready-to-go Colfax GPU solutions. Join us for an in-person
conversation at Booth #xxxx. Or visit www.colfax-intl.com

Creative Consultants Creative Consultants LLC is a New Mexico based business
producing efficient high performance computing for scientists and
engineers. We design and support computing systems advancing
leading edge, but sensible technologies. We specialize in the
integration of outside source and in-house developed software
with our custom GPU accelerated hardware to produce complete
solutions. Creative Consultants has serviced America’s National
Laboratories for more than twenty years. Our company offers an
ecosystem of products enabling the creation of GPU clustered
appliances. This includes the Stelletto dual node workstation, our
eStella cloud service, and Stella, a high throughput conSTELLAtion
cluster for local GPU supercomputing.

Cubix Corporation Cubix Visual & GPU Compute Solutions is a new division of Cubix
Corporation, a vertically-integrated manufacturer with 35 years
of manufacturing experience, focused on delivering deskside and
rackmount modular, scalable, GPU Compute hardware solutions
for demanding applications such as physically-based rendering,
animation, visualization, and cloud computing applications.

Exxact Founded in 1992, Exxact develops and manufactures high

performance computing workstations, servers, and clusters that
serve compute-intensive applications such as space exploration,
medical imaging, financial modeling, broadcasting, and industrial
design. Exxact is also a value-added distributor of professional
workstation graphics by ATI FirePro™ and Nvidia® Quadro® by PNY.

GIADA Giada is a leading innovator of small form-factor PCs. Its
state-of-the-art technology and stylish design are exemplified in its flagship
product, the $299 Slim-N20. At only 1.2lbs, it is the world’s smallest
PC that uses NVIDIA’s next generation ION processor, and is the
perfect digital home entertainment solution, providing a robust
platform for Internet entertainment. Giada, already a leading brand
in the entertainment category for mini-PCs in China, is establishing
a presence in the U.S. to promote its innovative products, including
all-in-one PCs, OEM motherboards, graphics cards and Netbooks.

GraphStream GraphStream is a leading provider of custom-integrated scalable
computing+storage systems with GPU acceleration. GraphStream
partnered with NVIDIA, Mellanox, and Supermicro in 2003 to deliver
the world’s first commercially integrated scalable GPU-computing
systems with InfiniBand cluster interconnect. Since then,
GraphStream has delivered advanced GPU-computing systems
to more than 200 customers worldwide, including four Top10
supercomputing sites.

James River Technology Inc. JRTI, a leading provider of HPC solutions to the marketplace, and
Velocity Micro, the premier high-performance personal computer
provider in North America, are pleased to introduce VelocityHPC,
our latest initiative focused on NVIDIA Tesla GPU Accelerated
Computing solutions.

JMR Electronics Inc JMR is a leading value provider of scalable storage systems for
performance and capacity driven applications in the government,
DCC, VOD, video surveillance and Web 2.0 markets. Headquartered
in Chatsworth, California, JMR has been developing reliable,
high performance RAID storage technologies since 1982. JMR’s
complete line of BlueStor PeSAN™ DAS, NAS and SAN solutions,
manufactured entirely in the U.S.A., is ideal for nearly every IT
and video production need. For further information, please visit
www.jmr.com.

Koi Computers Koi Computers, Inc. has over fifteen years of experience in the IT
hardware and systems integration industry. We offer a wide range
of custom configured computer systems and an extensive catalog
of technology products. Our core competencies include high
performance computer clusters; server, storage, and blade solutions;
desktops, laptops, and workstations; mounting and cabling
solutions; and strategic sourcing for technology products.

Los Alamos National Labs Los Alamos National Laboratory is a premier national security
research institution, delivering scientific and engineering solutions
for the nation’s most crucial and complex problems. Our primary
responsibility is ensuring the safety, security, and reliability of the
nation’s nuclear deterrent.

Mathworks Over 1,000,000 engineers and scientists in more than 100 countries,
on all seven continents, use MATLAB® and Simulink®. These
products have become fundamental tools for work at the world’s
most innovative technology companies, government research labs,
financial institutions, and at more than 5000 universities. For more
information, visit www.mathworks.com

Mazda Technologies Mazda Technologies creates significant business advantages for
its customers by providing differentiated, value-added hardware
and software IT solutions and professional services based on GNU/
Linux and other open source and current OS solutions. At Mazda
Technologies our main goal and mission is to always ensure that our
team is knowledgeable, competent and customer focused in dealing
with cutting edge technology thus providing our customers the
absolute best value for their IT dollars and the necessary peace of
mind to run their IT Operations efficiently. We strive to deliver custom
high performance solutions that will fit the needs of the customer.

Mellanox Technologies Mellanox Technologies is a leading supplier of end-to-end
connectivity solutions for servers and storage that optimize data
center performance.

Micoy Micoy is the only true provider of full omni-directional 3D Stereo
Immersive systems. This revolutionary 3D format is a breakthrough
patented method in 360 degree panospheric & cylindrical 3D optics.
With a secure patent portfolio for the technology, Micoy is looking
for development partners for applications of its technology and
co-development opportunities in real time solutions.

Microway Inc. Since 1982, Microway has earned an international reputation
for building robustly designed and cooled HPC clusters and
WhisperStations. Since 2007, these have included Tesla GPUs. Utilizing
multi-core hosts with high-efficiency power, excellent cooling, and
QDR InfiniBand interconnects, Microway’s GPU clusters deliver
more TFLOPs with fewer watts. Microway is known for the quality
of its design, integration, and support teams and high level of
customer satisfaction.

Morgan Kaufmann Morgan Kaufmann has been bringing the knowledge of experts to
the computing community since 1984. Our goal is to provide timely
yet timeless content to research and development professionals,
business leaders and IT managers, everyday practitioners, and
academia. We publish textbooks and references in Artificial
Intelligence, Computer Networking, Computer Architecture,
Computer Graphics & Game Development, Data Management &
Business Intelligence, Software Engineering, and User Experience
& Human Computer Interaction. For more information, visit mkp.com.

Next Computing NextComputing manufactures high-performance portable and
small form-factor workstations and servers, designed to run high-end
applications across a range of industries. Leveraging NVIDIA’s
‘Fermi’ GPU architecture, NextComputing offers unique portable
and small footprint platforms for professional power-users who
require more performance and flexibility than ordinary notebook
and desktop computers can provide.

PEER 1 hosting PEER 1 Hosting is one of the world’s leading IT hosting providers, built on two
obsessions: Ping & People. A 10Gb SuperNetwork™ connects 17
datacenters and, paired with 24x7 FirstCall Support, helps power
over 10,000 customers worldwide. PEER 1 recently launched the
first ever large-scale GPU Cloud hosted in North America and the UK.

Penguin Computing Penguin Computing is a global leader in high-performance
computing (HPC), delivering complete, integrated HPC solutions,
from the workstation to the cloud. With a focus on cutting-edge
technology, ease-of-use and exceptional customer service,
Penguin cost-effectively meets the needs of the world’s most
demanding HPC users, including Caterpillar, Lockheed Martin,
the U.S. Air Force, and the U.S. Navy. Today, Penguin delivers
a range of solutions, from massive Linux clusters to “Penguin
on Demand” (POD), a new service that provides a complete
HPC solution in the cloud. Penguin has been an innovator in
HPC solutions for over a decade, and the company’s founder
Donald Becker is recognized as the “Father of Linux Clustering.”
For more information about Penguin Computing and Penguin
products please go to www.penguincomputing.com.

Platform Computing Platform Computing is the leader in cluster, grid and cloud
management software – serving more than 2,000 of the world’s
most demanding organizations. For 18 years, our workload and
resource management solutions have delivered IT responsiveness
and lower costs for enterprise and HPC applications. Platform has
strategic relationships with Cray, Dell, HP, IBM, Intel, Microsoft,
Red Hat, and SAS. GPU-accelerated clusters are rapidly growing
in popularity as powerful, cost-effective High Performance
Computing (HPC) solutions. NVIDIA’s GPU hardware, along with
the CUDA computing environment, is delivering impressive results
for commercial HPC workloads. Platform HPC suite, with CUDA
kit, enables analysts, engineers, and scientists to unlock the
power of NVIDIA GPU Clusters, making them easier to deploy,
run and manage.

Polywell Polywell is a manufacturer of high quality computer products
ranging from PCs to high-performance workstations, storage
solutions and high-end servers, and has been serving the needs
of consumers, businesses and government entities since 1987. Its
direct sales channel and Just-in-time manufacturing processes
allow competitive pricing. There are system specialists to support
the various vertical markets, including data centers, IPTV,
entertainment, gaming, biotech, oil/gas, CAD/CAM, animation,
and content creation. Polywell also provides OEM service for
PC/Linux appliances in digital signage, set-top box, POS, kiosks,
medical equipment and network appliances. Its full-range service
includes product design, prototyping, production, fulfillment, and
warranty repair.

Portland Group The Portland Group® (PGI®) offers high performance parallel
compilers and tools for workstations, servers and clusters based
on 64-bit x86 processors with NVIDIA CUDA-enabled GPUs running
under Linux, MacOS or Windows operating systems. PGI GPU
accelerator products include directive-based PGI Accelerator™
Fortran and C compilers and CUDA Fortran.

PSSC Labs PSSC Labs is everything you expect from your technology provider,
and more. With 20 years in business, PSSC Labs possesses the
knowledge, expertise and procedures to deliver high performance
computing solutions to the world’s most demanding organizations.
PSSC Labs computing solutions empower next generation science.

Tech-X Corporation Tech-X offers products and services for high-performance
computing. GPULib enables users of MATLAB and IDL to take
advantage of GPUs from within these high-productivity languages.
We offer consulting, training, and custom software development
to migrate customers’ scientific computing problems onto
hardware accelerated architectures, using technologies like
CUDA, OpenCL, or MPI.

T-Platforms T-Platforms is the major Russian supercomputing group.
T-Platforms installations occupy 38% of the current Top50 list of
Russia’s most powerful computer systems. T-Platforms designed
and manufactured 5 supercomputers featured in the global Top500
list, 36th being the highest rank. T-Platforms has accumulated
substantial expertise with over 200 HPC installations.

Wolfram Research Wolfram Research, Inc. is the technical innovation powerhouse
behind the world’s most powerful global computation system,
Mathematica, and the world’s first-ever computational knowledge
engine, Wolfram|Alpha. Wolfram Research continues its strong
commitment to technology and education with resources like
MathWorld, load-on-demand curated data, and the Wolfram
Demonstrations Project. www.wolfram.com

Marketing Partners

VISUALIZE A GREEN EVENT
Place compostables and recyclables in proper bins
Use public transportation during the show
In hotel, decline new sheets and towels
Also, unplug phone and laptop chargers
Offset your travel at www.cool-it.us
Take only collateral/giveaways you will use

What We’re Doing
>> 100% of convention center’s greenhouse gas is offset
>> Extensive composting and recycling
>> Producers and vendors agree to green guidelines
>> Minimizing printed materials
>> Using recycled and biodegradable paper/non-toxic inks
>> Monitoring lighting and A/C usage
>> Local and organic food options
>> Non-toxic cleaning materials

Cert no. SCS-COC-001334

VENUE MAP
EXHIBIT LEVEL: Exhibit Hall, Keynote Hall, Show Management & Sales Office, Emerging Companies Summit, Speaker Green Room; rooms A1, A2, A3, A5, A7, A8, B, C, D, E, F1, F2, G, H, J1, J2, J3, J4.
STREET LEVEL: Registration, Press Lounge, Parallel Nsight Lounge by Microsoft, Silicon Valley Board Room, Think Tank, Guadalupe, Willow Glen, San Jose Ballroom, Parking, Coat & Bag Check, Main Entrance; rooms K, L, M, N; Marriott and Hilton access; elevator to Blossom Hill rooms (3rd floor).
