Program Guide: Sept 20-23, 2010 San Jose Convention Center
Microsoft and Parallel Nsight – a powerful combination. Use Parallel Nsight to integrate into Microsoft
Visual Studio, the world’s most popular development environment. Run GPGPU accelerated applications
on your desktop, on Windows Server or use Windows HPC Server 2008 for cluster applications.
Learn more at table 87 or the Parallel Nsight Lounge by Microsoft in the concourse area.
INTRODUCTION
The world, these days, isn’t flat. It’s parallel.

The parallel-processing power unleashed by the graphics processing unit, or GPU, is changing the
face of computing. And, as it changes, our ability to address some of the world’s most vexing
challenges is improving.

Performing safer heart surgery. Making cars safer to drive. Drilling oil wells more accurately.
Solutions to these and other complex computational problems have rapidly moved within reach,
yielding results that have the potential to change lives and, ultimately, society as a whole.

Facilitating this work is the GPU, one of the most sophisticated processors ever manufactured.
With up to three billion transistors in an area the size of a postage stamp, it can accelerate
applications by several hundred times, shortening time to discovery from days to minutes.

At the same time, GPUs are far more power efficient than clusters designed exclusively with CPUs.
And they are significantly less costly. In June, China’s National Supercomputing Center, in
Shenzhen, unveiled the world’s second-most powerful supercomputer, powered by Tesla GPUs, which
was developed in a matter of months. Other enormously powerful GPU-powered supercomputers are on
the way.

Another indicator of the triumph of parallel computing is the growing relevance of CUDA, the
architecture developed by NVIDIA that enables GPUs to understand industry standard computing
languages, as well as graphics APIs. Over the past year, NVIDIA has shipped more than 70 million
CUDA-enabled GPUs; 10 textbooks about CUDA have been published in Chinese, English, Japanese and
Russian; and two more universities each week, on average, are adopting CUDA into their curriculums.

And in the graphics space, NVIDIA’s new Quadro professional graphics are ushering in a new era of
computational visualization, bringing significant change to broadcast and film production, medical
imaging and seismology, among other fields.

Perhaps the most immediately obvious sign of the importance of parallel computing, though, is the
GPU Technology Conference itself. Last year’s event far outstripped expectations in terms of
attendance. And interest this year suggests that a revolution is at hand. Consider the following:
> The response to the call for talks at this year’s conference rose more than fourfold from last
year, while the number of sessions has more than doubled to some 300 hours.
> Representatives from more than 100 universities are registered.
> Attendees have arrived from some 50 countries.

For all these metrics, the ultimate success of the GPU Technology Conference, though, will be
measured by the level of engagement it inspires, by the side conversations that it generates and
the collaboration that it leads to.

Brace yourself for immersion in this brave new world!
The wait is over.
Experience breakthrough performance with the all-new
Mercury Playback Engine in Adobe® Premiere® Pro CS5.
www.adobe.com/go/productionpremium
IMPORTANT INFORMATION
If there is anything else we can do to make your conference experience better,
please stop by the info desk and let us know!
ENROLL INTO YOUR SESSIONS Go to www.nvidia.com/gtc, click on “view schedule,” and log in to start adding
sessions to your personal schedule. Priority access to each session will be
given to those who enroll. Enrolling in sessions also helps us place the most
popular sessions in the largest rooms.
WIRELESS INTERNET ACCESS Free wireless internet access is available in most session rooms, the keynote
hall, and also in the concourse outside of Ballroom A and the Exhibit Hall, under
“GTC2010.”
FIND OUT THE LATEST HAPPENINGS WITH THE CONFERENCE Log on to www.nvidia.com/gtc to get the latest
coverage of the event, along with any updates on room changes, access to the session feedback survey, etc.
BUSINESS CENTER / SHIPPING The Marriott Hotel and the Hilton Hotel both have business centers located
on the first floor, near their front lobby. You can work out shipments with their
respective front desks or bell desks. Alternatively, there is a FedEx Office Print
& Ship Center at 93 E. San Carlos St, near 3rd St (3 blocks from the Convention
Center, call 408-295-4336 for hours).
GO GREEN! Take part in the shared goal of minimizing our collective impact on the
environment. Please take only the conference materials you need and recycle
your badges at the conclusion of the event. Also, we have provided GTC coffee
mugs for those who opted in, so please use them for your hot and cold
beverages to avoid adding more waste to the environment.
BAG AND COAT CHECK Bag check is available at the bell desk of the Marriott and Hilton hotels,
connected to the Convention Center. It is also available for a small fee on the
ground floor of the San Jose Convention Center.
LOST AND FOUND Please check the info desk should you lose or find an article.
FIRST AID / EMERGENCY Should there be a medical emergency, please dial 911 and alert the nearest
conference personnel.
Unparalleled PowerEdge
flexibility with the M610x
The Dell™ PowerEdge™ M610x blade server allows you to
creatively incorporate a vast array of expansion solutions,
including the NVIDIA® Tesla™ GPGPU card.
Now, a single M610x, equipped with an NVIDIA Tesla GPGPU card and
installed in a PowerEdge M1000e blade enclosure, can perform over 400
Gigaflops of double-precision computations for demanding, floating-point-
intensive workloads. The x16 Gen2 PCIe slots in the PowerEdge M610x
bring a new dimension of flexibility and performance to your data center.
PowerEdge M1000e
PowerEdge M610x
NVIDIA Tesla
Important Information
Conference Highlights – Don’t Miss These Events!
Recommended for Academics
Emerging Companies
Sessions Listing – Monday
Sessions Listing – Tuesday
Sessions Listing – Wednesday
Sessions Listing – Thursday
Research Posters Listing
Speaker and Panel Listing
Exhibit Hall Map & Directory
Sponsors & Exhibitors
Quadro Professional Graphics
NVIDIA® Quadro® by PNY®
Professional Solutions. Professional Support.
Visit the PNY booth and see how NVIDIA Quadro by PNY and Quadro Plex multi-GPU
solutions are used today to enhance real-world professional graphics and HPC applications.
Meet with our experienced product managers and partners to discuss your development and
application needs. Learn how NVIDIA Quadro by PNY professional solutions enable new
technical and business possibilities.
Features and specifications subject to change without notice. The PNY logo is a registered trademark of
PNY Technologies, Inc. All other trademarks are the property of their respective owners. © 2009 PNY Technologies, Inc. All rights reserved.
CONFERENCE HIGHLIGHTS
DON’T MISS THESE EVENTS!
ALL WEEK LONG
Parallel Nsight Lounge, by Microsoft
While attending GTC, come learn from the experts at the Parallel Nsight™ Lounge
by Microsoft, a casual environment for hands-on learning and instruction on
Parallel Nsight, the industry’s first development environment for GPU-accelerated
applications. Experts from NVIDIA and Microsoft will be available from 10am to
8pm each day to answer questions and provide instruction on Parallel Nsight,
Visual Studio 2010, Windows HPC Server 2008 and CUDA C/C++ development.
TUESDAY
09:00 - 10:30 Opening Keynote with Jen-Hsun Huang, NVIDIA CEO and Co-Founder
>Keynote Hall
12:00 - 14:00 Exhibits Open / Networking Lunch >Exhibit Hall
WEDNESDAY
09:00 - 09:50 Day 2 Keynote with Dr. Klaus Schulten, University of Illinois at Urbana-
Champaign >Keynote Hall
10:00 - 10:50 Emerging Companies Summit Opening Address and Highlights >Keynote Hall
12:00 - 14:00 Exhibits Open / Networking Lunch >Exhibit Hall
THURSDAY
09:00 - 09:50 Emerging Companies Summit “Fireside Chat” featuring Quentin Hardy (Forbes
Magazine) and Jen-Hsun Huang (NVIDIA) >Keynote Hall
12:00 - 14:00 Exhibits Open / Networking Lunch >Exhibit Hall
17:00 - 18:30 Closing Keynote with Sebastian Thrun, Professor / Distinguished Engineer,
Stanford University and Google >Keynote Hall
What happens when you throw together live music, raffle prizes, your GTC
colleagues and a charity? A great party for a great cause.
You can feel even better about letting loose as every dollar raised will be …
For $10, you get a free drink ticket plus an entry into a raffle to win some stellar
prizes. Learn more and buy your tickets at the NVIDIA Foundation table on the
concourse, the Gear Store booth, or buy at the door.
WHEN: Thursday, September 23 at 8:00 PM
WHERE: Voodoo Lounge, 14 S. Second Street, near Santa Clara Street
[Map of downtown San Jose showing the San Jose McEnery Convention Center (“You are here”), Marriott Hotel, Hilton Hotel, Civic Auditorium, Tech Museum of Innovation, Hotel Montgomery, Fairmont Hotel and San Jose Museum of Art]
Tuesday / September 21
TIME ID / SESSION TITLE
09:00 - 10:30 1001 – Opening Keynote with Jen-Hsun Huang
11:00 - 12:00 2223 – Academic Welcome Social and Poster Review
11:00 - 11:50 2112 – The Heisenberg Spin Glass Model on GPU: Myth versus Fact
14:00 - 14:50 2262 – CUDA Centers of Excellence Super-Session I
15:00 - 15:50 2263 – CUDA Centers of Excellence Super-Session II
16:00 - 16:50 2264 – CUDA Centers of Excellence Super-Session III
17:00 - 17:50 2265 – CUDA Centers of Excellence Super-Session IV
18:00 - 18:50 1005 – Research Poster Showcase / Exhibits Open / Networking Reception
Wednesday / September 22
TIME ID / SESSION TITLE
09:00 - 09:50 1002 – Keynote with Dr. Klaus Schulten,
University of Illinois at Urbana-Champaign
10:00 - 10:50 2280 – TSUBAME2.0 Experience
10:00 - 10:50 2082 – CU-LSP: GPU-based Spectral Analysis of Unevenly Sampled Data
10:00 - 10:20 2163 – Leveraging GPUs for Evolutionary Game Theory
10:00 - 10:50 2249 – New Programming Tools GPU Computing
10:00 - 10:50 2166 – The Triad of Extreme Computing-Fast Algorithms, Open Software and
Heterogeneous Systems
10:00 - 10:50 2058 – A Practical Introduction to Computational Fluid Dynamics on GPUs
11:00 - 11:50 2078 – Shockingly fast and accurate CFD simulations
11:00 - 11:50 2286 – Towards Peta-Scale Green Computation- Applications of the GPU
Supercomputers in the Chinese Academy of Sciences (CAS)
11:00 - 11:50 2177 – Simplifying Parallel Programming with Domain Specific Languages
14:00 - 14:50 2248 – Parallel Processing on GPUs at the University of Utah
14:00 - 14:50 2137 – CUDA for Real-Time Multigrid Finite Element Simulation of Soft Tissue
Deformations
14:00 - 14:50 2000 – Gravitational N-body Simulations: How Massive Black Holes Interact with
Stellar Systems
14:00 - 14:50 2164 – Analytical Performance Models to Improve the Efficiency of
GPU Computing
14:00 - 14:50 2068 – Parallelizing FPGA Technology Mapping using GPUs
14:00 - 14:50 2204 – Bridging GPU Computing and Neuroscience to Build Large-Scale Face
Recognition on Facebook.
15:00 - 15:50 2050 – Copperhead: Data-Parallel Python for the GPU
15:00 - 15:50 2029 – Computer Vision Algorithms for Automating HD Post-Production
15:00 - 15:50 2044 – GRASSY: Leveraging GPU Texture Units for Asteroseismic Data Analysis
15:00 - 15:50 2122 – Using GPUs for Real-Time Brain-Computer Interfaces
15:00 - 15:50 2281 – Domain-Specific Languages
16:00 - 16:50 2108 – Binary Black Holes Simulations using CUDA
16:00 - 16:50 2118 – Large-scale Gas Turbine Simulations on GPU Clusters
16:00 - 16:50 2135 – Processing Petabytes per Second with the ATLAS experiment at the Large Hadron
Collider at CERN
16:00 - 16:50 2226 – Reverse Time Migration with GMAC
17:00 - 17:50 2005 – Porting Large-Scale Legacy Fortran Codes
17:00 - 17:50 2242 – Swarming Bacteria and Diffusing Particles: High-Throughput Analysis of
Microscopic 3D Motion
17:00 - 17:50 2167 – Designing a Geoscience Accelerator Library Accessible from High Level Languages
Thursday / September 23
TIME ID / SESSION TITLE
09:00 - 09:50 2030 – High-Throughput Cell Signaling Network Learning with GPUs
09:00 - 09:50 2236 – A Work-Efficient GPU Algorithm for Level Set Segmentation
10:00 - 10:50 2001 – Acceleration of the Freesurfer Suite for Neuroimaging Analysis
10:00 - 10:50 2269 – Bringing GPUs to Mainstream Molecular Dynamics Packages
10:00 - 10:50 2176 – Easy GPU Meta-programming: A Case Study in Biologically-Inspired
Computer Vision
10:30 - 10:50 2292 – Implementation of High-Order Adaptive CFD Methods on GPUs
11:00 - 11:50 2007 – Folding@home: Petaflops on the Cheap Today; Exaflops Soon?
14:00 - 14:50 2054 – NAMD, CUDA, and Clusters: Taking GPU Molecular Dynamics Beyond
the Desktop
14:00 - 14:50 2210 – GPU-Ocelot: An Open Source Debugging and Compilation Framework
for CUDA
15:00 - 15:50 2062 – HOOMD-blue: Fast and Flexible Many-Particle Dynamics
17:00 - 18:30 1003 – Closing Keynote with Dr. Sebastian Thrun, Stanford University
20:00 - ??? 1007 – Closing Night Party for Charity
emerging companies
summit
Welcome back to NVIDIA’s annual Emerging Companies Summit. I’m delighted to report that 2010
marks the third consecutive and successful year for ECS, and our momentum continues to build!

The Emerging Companies Summit (an integral part of the GPU Technology Conference) is now the
premier event for startups to share new applications, based on GPUs (graphics processing units)
that are revolutionizing the computing industry. At the same time, these startups will have an
opportunity to meet with hundreds of technologists, investors, analysts and executives who add
additional “fuel” to the GPU computing ecosystem.

Familiar sectors such as media and entertainment have already been fundamentally altered by the
GPU. Movie director James Cameron often says that Avatar could not have been created 10 years
ago – it required powerful graphics processors to bring his vision to life. In addition to
entertainment, new industries and applications are also being unleashed and significantly
enhanced by GPU-based technologies.

At this year’s ECS, I hope you will take full advantage of the opportunity to see and hear from
60 of the most promising companies in these fields. The companies, representing several countries
from around the world, will showcase new technology in the fields of computer vision, robotics,
video processing, cloud computing and mobile computing. What they all share in common is that
they harness the massive power of GPUs to drive amazing performance for their applications.

In the spirit of innovation, this year we decided to introduce a new and exciting format for the
startup presentations at ECS. While you will still be able to find booths in the exhibit hall for
almost all 60 of the emerging companies, with the help of an advisory committee we have chosen a
select group of 24 to participate in action-packed “CEO on Stage” sessions. The CEOs from these
24 emerging companies will have the special opportunity to both present and discuss their business
strategies with panels comprised of some of the world’s leading and most impressive venture
capitalists, technology executives and industry analysts. I believe this will be one of the major
highlights of this year’s GPU Technology Conference, and I urge you to participate in as many of
these sessions as possible.

In closing, I am extremely excited and honored to once again be part of the Emerging Companies
Summit. The GPU computing ecosystem has now gathered a full head of steam, which will be clearly
evident over the next few days. I would also like to add a special note of thanks to our sponsors
which include Cooley Godward Kronish, Citi, Sutter Hill Ventures, Silicon Valley Bank, Deloitte,
Mandel Communications, Churchill Club, and VentureBeat.

Thank you for attending, and welcome to the GPU computing revolution!
Jeff Herbst
Vice President of Business Development
NVIDIA
Experienced Guides
Company Summit.
Palo Alto | New York | San Diego | San Francisco | Reston, VA | Broomfield, CO | Washington, DC | Boston | Seattle
© 2010 Cooley LLP, 101 California Street, 5th Floor, San Francisco, CA 94111. 415/693-2000.
RECOMMENDED SESSIONS FEATURING EMERGING COMPANIES
Emerging Companies Summit Agenda
Tuesday / September 21
TIME ID / SESSION TITLE
9:00 – 10:30 1001 – Opening Keynote with Jen-Hsun Huang
12:00 – 14:00 1004 – Exhibits Open / Networking Lunch
18:00 – 20:00 1005 – Exhibits Open / Networking Happy Hour / Research Posters Showcase
Wednesday / September 22
TIME ID / SESSION TITLE
9:00 – 9:50 1002 – Day 2 Keynote with Dr. Klaus Schulten, University of Illinois at
Urbana-Champaign
10:00 – 10:50 4000 – Emerging Companies Summit Opening Address featuring
Jeff Herbst (NVIDIA)
11:00 – 11:50 4001 – Emerging Companies Summit “CEO on Stage” featuring Sam Blackman
(Elemental Technologies, Inc.), Sam Cox (Milabra), Chris Doran
(Geomerics) and panelists Drew Lanza (Partner, Morgenthaler), Dan’l
Lewin (Corporate VP and Strategic & Emerging Business Development,
Microsoft), Jon Peddie (President, JPR), Jeff Herbst (Vice President of
Business Development, NVIDIA)
12:00 – 14:00 1004 – Exhibits Open / Networking Lunch
14:00 – 14:50 4002 – Emerging Companies Summit “CEO on Stage” featuring Christopher
Blewitt (miGenius), Sebastien Deguy (Allegorithmic), Philip Lunn
(Bunkspeed) and panelists Drew Lanza (Partner, Morgenthaler), Dan’l
Lewin (Corporate VP and Strategic & Emerging Business Development,
Microsoft), Jon Peddie (President, JPR), Jeff Herbst (Vice President of
Business Development, NVIDIA)
15:00 – 15:50 4003 – Emerging Companies Summit “GPUs for Computer Vision” moderated by
Jon Peddie (Jon Peddie Research), featuring panelists Sam Cox (CEO,
Milabra), Tom Dean (Research Scientist, Google), Janko Mrsic-Flogel (CTO,
MirriAd), Joe Stam (Sr. Applications Engineer, NVIDIA), Yoram Yaacovi
(CTO & General Manager, Technologies at Microsoft)
16:00 – 16:50 4004 – Emerging Companies Summit “CEO on Stage” featuring Michael
Hummel (empulse GmbH), Natan Peterfreund (Playcast Media Systems),
Austin Shoemaker (Cooliris) and panelists Nathan Brookwood (Research
Fellow, Insight64), Charles Carmel (VP of Corporate Business
Development, Cisco), Flip Gianos (General Partner, InterWest
Partners), Jeff Herbst (Vice President of Business Development, NVIDIA)
17:00 – 17:50 4005 – Emerging Companies Summit “CEO on Stage” featuring Michel Tombroff
(Softkinetic), Uri Tal (Rocketick), Kristian Raue (Jedox Business
Intelligence) and panelists Nathan Brookwood (Research Fellow,
Insight64), Charles Carmel (VP of Corporate Business Development,
Cisco), Flip Gianos (General Partner, InterWest Partners),
Jeff Herbst (Vice President of Business Development, NVIDIA)
18:00 – 20:00 1005 – Exhibits Open / Networking Happy Hour
Thursday / September 23
TIME ID / SESSION TITLE
9:00 – 9:50 4006 – Emerging Companies Summit “Fireside Chat” featuring Quentin Hardy
(Forbes Magazine) and Jen-Hsun Huang (Co-founder & CEO, NVIDIA)
10:00 – 10:50 4007 – Emerging Companies Summit “CEO on Stage” featuring Andrew Jamison
(Scalable Display Technologies), Jeroen Snepvangers (RTT), Michael
Zeitlin (Aqumin) and panelists Rob Enderle (Analyst, Enderle Group),
Jeff Herbst (Vice President of Business Development, NVIDIA), Savitha
Srinivasan (Corporate Venture Partner, IBM), Norman Winarsky (VP of
Ventures, Licensing and Strategic Programs, SRI)
11:00 – 11:50 4008 – Emerging Companies Summit “CEO on Stage” featuring David Peters
(Universal Robotics), David Hayes (ICD) and panelists Rob Enderle
(Analyst, Enderle Group), Jeff Herbst (Vice President of Business
Development, NVIDIA), Savitha Srinivasan (Corporate Venture Partner,
IBM), Norman Winarsky (VP of Ventures, Licensing and Strategic
Programs, SRI)
12:00 – 14:00 1004 – Exhibits Open / Networking Lunch
14:00 – 14:50 4009 – Emerging Companies Summit “The ‘New Normal’ For Building Emerging
Companies Based On Disruptive Technologies” moderated by Jeff Herbst
(NVIDIA), featuring panelists Gerald Brady (Silicon Valley Bank), Bill
Frauenhofer (Managing Director, Citigroup Global Markets), Garrett
Herbert (Partner, M&A Transaction Services, Deloitte & Touche LLP), Eric
Jensen (Partner, Business Department Chair, Cooley LLP), Andrew T.
Sheehan (Managing Director, Sutter Hill Ventures)
15:00 – 15:50 4010 – Emerging Companies Summit “CEO on Stage” featuring Yoram Burg
(OptiTex), Sylvain Ordureau (Useful Progress), Torsten Reil (NaturalMotion)
and panelists Tim Bajarin (Creative Strategies), Bill Tai (Charles River
Ventures), Paul Weiskopf (Adobe)
16:00 – 16:50 4011 – Emerging Companies Summit “CEO on Stage” featuring Jeff Han
(Perceptive Pixel), Lance Maurer (Cinnafilm, Inc.), Bruno Uzzan (Total
Immersion) and panelists Tim Bajarin (President, Creative Strategies),
Jeff Herbst (Vice President of Business Development, NVIDIA), Bill
Tai (General Partner, CRV), Paul Weiskopf (Sr. VP of Corporate
Development, Adobe)
17:00 – 18:30 1003 – Closing Ceremony / “Ones to Watch” Award Presentation and Closing
Keynote with Dr. Sebastian Thrun, Stanford University
20:00 – ? 1007 – Closing Party for Charity
Allegorithmic
Aqumin
Carlsbad, California. Our philosophy has been to create easy-to-use software
for creative people with no prior 3D modeling or rendering experience, thus
expanding the market beyond the traditional boundaries established by complex
rendering software. Founded in 2003, Bunkspeed software has become the
standard in the industrial design community and is spreading rapidly to the
engineering design and marketing communities. Recently Bunkspeed
introduced its new generation of 3D rendering software based on mental
images iray accelerated by NVIDIA CUDA GPUs.
Cinnafilm
Investors Undisclosed
Capital Raised Undisclosed
Financial capital. Intellectual capital.
In more than 100 countries around the world, Citi is helping companies,
governments and institutions overcome business challenges, raise capital,
mitigate risk and extend their reach.
When you partner with Citi, you gain access to an unparalleled global
platform, capital markets, insightful advice and award-winning solutions —
so you can realize your goals today and in the future.
Success requires both financial and intellectual capital, and that’s why
Citi never sleeps.
© 2009 Citigroup Inc. All rights reserved. Citi and Arc Design is a trademark and service mark of Citigroup Inc., used
and registered throughout the world. Citi Never Sleeps is a service mark of Citigroup Inc.
Cooliris
Cooliris was founded with a simple mantra: “Think beyond the browser”. The
company creates products that make discovering and enjoying the Web more
exciting, efficient, and personal. Core products include Cooliris (formerly
PicLens), which transforms your browser into an interactive, full-screen
“cinematic” experience for web media, and CoolPreviews, which lets you preview
links instantly. Cooliris has reached over 12 million installs of the product, with
thousands more downloads every day.
Elemental Technologies
Geomerics
design principles using the latest hardware and software technologies available.
Our products are optimized for usability, aesthetics, technology and purpose. We
specialize in NVIDIA chipset technologies running Google and Windows OS. We
have basic goals: Simplify. Innovate. Impress.
vast potential in technology and life science companies. From jump starting start ups to global cash
management to providing debt and asset management for industry leaders, SVB understands what you
do and has the resources to provide what you need. Beyond commercial banking, we offer venture
capital services and funds, valuations and analytics, private banking and more. Silicon Valley Bank.
©2010 SVB Financial Group.SM All rights reserved. Member Federal Reserve. Silicon Valley Bank.® All rights reserved. Member of FDIC and Federal Reserve. Rev. 08-24-10.
miGenius
renderer and NVIDIA’s CUDA based GPU hardware systems; both businesses
and consumers alike can now rapidly and simply upload any 3D content to
individually customised websites that can be immediately shared and explored
with friends and colleagues, in both accurate photorealistic detail and real-
time. miGenius is developing a toolset to enable predominantly ‘non-technical’
users to easily create customised User Interfaces with a wide range of viewing
and management controls and upload their detailed 3D scenes onto either
a dedicated GPU server or onto the rapidly emerging ‘GPU cloud computing’
networks.
Milabra
OptiTex
OptiTex is the premier 2D and 3D CAD software for virtually all sewn-products
industries. OptiTex technologies allow designers to create, correct and
adjust compelling designs before the first piece of fabric is cut, giving a new
dimension to the motto, “Virtual is Real”. The OptiTex system consists of three main
components: a cloth content creation system with our PDS software; 3D Runway
Designer, a virtual try-on system that includes both cloth simulation and
accurate 3D parametric mannequins; and a motion animation engine that enables
the generation of motion sequences with interactive cloth. OptiTex brings a
wealth of virtual textile experience to the gaming, feature animation and digital
effects industries. OptiTex’s products are second only to real life in depicting
fabric movement and dynamics.
multi-touch interfaces for the knowledge worker. The company’s hardware and
software solutions enable both novice and expert users to manipulate complex
datasets through a new class of intuitive yet powerful and visually rich interface
techniques.
Playcast Media Systems brings video games to the world’s largest media
distribution platform – Pay TV networks. The company’s solution includes
a head-end based system, which streams a game’s audiovisual content as
a standard MPEG stream, as well as the provisioning of the content and
programming itself. Playcast’s media streaming systems, located in operators’
headends, host the games and stream them over the existing video network
to an already distributed base of set-top boxes. Playcast is a privately owned,
venture capital backed company, based in Israel and the UK.
www.churchillclub.org
Top Ten Tech Trends 2010
Ignite your own conversations
Join us Sept. 28th for Churchill Club’s Annual Dinner 2010 with Juniper Networks’ CEO, Kevin Johnson
Churchill Club is a member-supported non-profit organization.
For information about membership and upcoming programs,
please visit www.churchillclub.org
or contact us at 408.265.0130.
RTT
with assistance during each stage of the life cycle of their products – from the
initial product design stage through to development and subsequent marketing
and sales. The 3D data model from the product development stage serves as
the basis for all the following steps in the product lifecycle. It can be used,
for example, to rapidly create computer generated, photorealistic product
illustrations for the marketing department or to develop a 3D online product
configurator on a website. In this way, RTT doesn’t just speed up decision
making and development processes for its clients, but it also opens up new
opportunities with regard to marketing and sales. The company was founded in
1999 and its head office is in Munich, Germany. RTT AG has over 400 employees
and is represented in 14 locations worldwide. Many leading businesses have
put their trust in RTT and its portfolio of clients includes names such as Adidas,
Audi, BASF, BMW, Bosch, Daimler, EADS, Harley-Davidson, Miele, Porsche,
Samsung, Thyssen-Krupp, Toyota and Volkswagen. RTT AG is a stock market
listed company (Xetra:R1T; WKN: 701220; ISIN: DE0007012205). For more
information visit www.rtt.ag.
Speaker Jeroen Snepvangers, President and CEO
Speaker Session
4007 - Emerging Companies: CEO on Stage featuring Aqumin, RTT, and Scalable
Display Technologies (Thursday, Sept 23, 10:00)
CEOs Ludwig A. Fuchs and Jeroen Snepvangers
Investors Balderton Capital, Siemens VC, Heliad
Capital Raised 15 Million Euros
Rocketick
Total Immersion
Total Immersion is the global leader in augmented reality. Through its patented
D’Fusion™ technology, Total Immersion blurs the line between the virtual
world and the real world by integrating real time interactive 3D graphics into
a live video stream. Total Immersion offers consumers a compelling way to
interact with brands in their own environment. With augmented reality, the
brand temporarily “resides” in the viewer’s space. Imagine a favorite animated
character sitting in the next chair, or a static product suddenly “come to life” –
that’s Total Immersion’s augmented reality.
Universal Robotics creates software that enables machines to learn from their
experiences, react and adapt to their surroundings, and perform tasks that are
costly, dangerous or difficult for humans to undertake. The company’s signature
technology, Neocortex, which was developed over seven years at NASA and
Vanderbilt University, will increase efficiency and worker safety across industries
in applications including warehousing, mining, handling hazardous waste and
automating vehicles such as forklifts.
Useful Progress
GTC SYNNEX
NETWORK
Please visit these Tesla Preferred Partner
exhibits and be entered in a drawing to win a free
NVIDIA Tesla C2050!
SGI
Monday, Sept 20, 13:00 (80 minutes) programming with CUDA through a number of hands-on code
Marriott San Jose Ballroom examples. Examine more deeply the various APIs available to
CUDA applications and learn the best (and worst) ways in which
2004 Languages, APIs and Development Tools for GPU
to employ them in applications. Master the first half of the book
Computing (Pre-Conference Tutorial)
“CUDA by Example” as taught by the author, pointing you on a
Get a head start on the conference with this first-day introduction trajectory to complete the second half on your own after course
to key technologies for GPU Computing. This 90-minute tutorial completion.
session will cover the key features and differences between the
major programming languages, APIs and development tools Speaker(s): Jason Sanders (Senior Software Engineer, NVIDIA)
available today. Attendees will also learn several high level design Topic(s): Programming Languages & Techniques,
patterns for consumer, professional and HPC applications, with
practical programming considerations for each. Monday, Sept 20, 14:30 (80 minutes)
Room C
Speaker(s): Phillip Miller (Director, Workstation Software Product
MONDAY
Management, NVIDIA), Holger Kunz (Director, 2159 Programming the NVIDIA Digital Video Pipeline with
Workstation Software Development, NVIDIA), Brian Direct3D (Pre-Conference Tutorial)
Harrison (NVIDIA), Thomas Ruge (Software Learn how to program the NVIDIA Quadro Digital Video pipeline
Manager, NVIDIA) using Direct3D. This session will provide an overview of the SDK,
Topic(s): Programming Languages & Techniques, discuss device control, data transfers, performance measuring
Tools & Libraries and tuning, ancillary data and application design considerations.
Speaker(s): Thomas True (Applied Engineer, NVIDIA)
Monday, Sept 20, 13:00 (80 minutes)
Topic(s): Programming Languages & Techniques,
Room B
Video Processing, Computer Graphics
2024 NVIDIA Acceleration Engines Overview
(Pre-Conference Tutorial) Monday, Sept 20, 14:30 (80 minutes)
Come learn of the software engines NVIDIA freely provides to Room A5
application developers to rapidly leverage new GPU capabilities 2260 DirectCompute (Pre-Conference Tutorial)
and dramatically reduce the time it takes to bring compelling
Learn how to use the DirectCompute API to solve GPU
features to end users.
computing problems. This tutorial will introduce the
Speaker(s): Phillip Miller (Director, Workstation Software Product DirectCompute API, cover the recommended best practices for
Management, NVIDIA), Holger Kunz (Director, GPU programming, and go over examples of how to use this API
Workstation Software Development, NVIDIA), Brian efficiently and effectively to solve compute-intensive problems.
Harrison (NVIDIA)
Speaker(s): Eric Young (Manager of Developer Technology
Topic(s): Programming Languages & Techniques, Computer
Professional and Consumer Applications, NVIDIA)
Vision, Ray Tracing
Topic(s): Programming Languages & Techniques
Video Processing, Computer Graphics Let’s dive into the 3rd dimension. This talk presents a
comprehensive technical overview of NVIDIA’s stereo technology
Monday, Sept 20, 14:30 (80 minutes) and tools. After a complete introduction to NVIDIA’s stereo
Marriott San Jose Ballroom technology, we will then explore in more detail production
techniques for the new artistic space of effects and creativity
2131 Introduction to CUDA C (Pre-Conference Tutorial)
offered by 3D stereo. The takeaway of this session will be a solid
Starting with a background in C or C++, learn everything you need understanding of NVIDIA’s stereo technology and how to take best
to know in order to start programming in CUDA C. Beginning advantage of it.
with a “Hello, World” CUDA C program, explore parallel
Speaker(s): Samuel Gateau (Developer Technology Engineer,
NVIDIA), Steve Nash (Applied Engineer, NVIDIA)
Topic(s): Programming Languages & Techniques,
Stereoscopic 3D
Tuesday, Sept 21, 09:00 (90 minutes) Tuesday, Sept 21, 11:00 (50 minutes)
Keynote Hall Room A2
1001 Opening Keynote with Jen-Hsun Huang, NVIDIA 2096 High-Speed CT Reconstruction in Medical Diagnosis &
Do not miss this opening keynote, featuring Jen-Hsun Huang, Industrial NDT Applications
CEO and Co-Founder of NVIDIA and special guests. Hear about We present the software platform CERA developed by Siemens,
what’s next in computing and graphics, and preview disruptive which utilizes (multiple) graphics processing units (GPUs) in
technologies and exciting demonstrations from across industries. order to deliver high-speed CT reconstructions, and describe
its implementation challenges using CUDA and OpenCL. We
Jen-Hsun Huang co-founded NVIDIA in 1993 and has served since
further show how GPU acceleration enables the utilization
its inception as president, chief executive officer and a member of
of reconstruction approaches which provide highly improved
the board of directors.
reconstruction quality in NDT applications.
Speaker(s): Jen-Hsun Huang (CEO & Co-Founder, NVIDIA)
Speaker(s): Holger Scherl (Computer Scientist, Siemens AG)
Topic(s): General Interest
Topic(s): Medical Imaging & Visualization, Imaging
TUESDAY
Room B
2165 Rendering Revolution
Learn how GPU technologies are transforming the making of 2149 Overview of Parallel Nsight for Visual Studio
pixels. This talk will cover GPU-centric rendering techniques NVIDIA Parallel Nsight provides access to the power of the GPU
that leverage both the raw computational capabilities of NVIDIA’s from within the familiar environment of Microsoft Visual Studio.
GPUs and advanced pixel-shading techniques for interactive This session is an entry level overview of the GPU computing and
visualization and rendering. graphics development features of Parallel Nsight as well as a
glimpse into the future of this powerful tool.
Speaker(s): Ken Pimentel (Director, Media & Entertainment,
Autodesk) Speaker(s): Kumar Iyer (Product Manager, NVIDIA)
Topic(s): Computer Graphics, Film Topic(s): Tools & Libraries
Tuesday, Sept 21, 11:00 (50 minutes) Tuesday, Sept 21, 12:00 (120 minutes)
Room D Exhibit Hall
2172 Unveiling Cellular & Molecular Events of Cardiac 1004 Exhibits Open / Networking Lunch
Arrhythmias Join your colleagues in the exhibit hall to preview emerging
George Mason University is using CUDA technology to get a technologies and see some of the most innovative solutions
20x speed-up in simulations of intracellular calcium dynamics, available today. Lunch will be served.
thought to play a major role in the generation of cardiac
Topic(s): General Interest
arrhythmias. We will discuss the novel algorithms we have
developed for Markov Chain Monte Carlo Simulation and their
Tuesday, Sept 21, 14:00 (50 minutes)
use in investigating elementary events of calcium release in the
Room A8
cardiac myocyte. The resulting extremely fast simulation time
has generated new insights into how defects in the control of 2013 iray - GPUs and the Photorealistic Rendering
intracellular calcium may lead to cardiac arrhythmia. Revolution
Speaker(s): Tuan Hoang-Trong (PhD student, George Hear about the ongoing revolution in the production of
Mason University) photorealistic imagery being powered by GPUs. We will explore
Topic(s): Life Sciences, Algorithms & Numerical Techniques,
the algorithms and concepts behind iray – a CUDA accelerated
Physics Simulation
software library from mental images/NVIDIA that provides
an interactive, push-button, fast synthetic digital camera in
software to a variety of OEM applications and platforms. We
Tuesday, Sept 21, 11:00 (50 minutes)
will demonstrate iray embedded in commercial CAD and Digital
Room L
Content Creation applications as well as in 3D cloud computing
2214 Faster Simulations of the National Airspace System platforms.
Learn about twenty-four hour, fast-time simulations of traffic Speaker(s): Michael Kaplan (Vice President of Strategic
in the National Airspace System, which use GPU technology to Development, mental images/NVIDIA)
help perform key steps in the trajectory prediction of flights.
Topic(s): Digital Content Creation (DCC), Cloud Computing,
GPUs enabled us to improve the runtime by up to two orders of
Ray Tracing
magnitude versus the previously required tens of minutes per
FULL CONFERENCE GUIDE 2010
Topic(s): Digital Content Creation (DCC), Stereoscopic 3D approach and architecture of the solution. Benchmark results
will show where success has been found. Plans for future products
Tuesday, Sept 21, 14:00 (50 minutes) will also be covered.
Room M Speaker(s): Chris Gottbrath (Principal Product Manager,
2233 Solving Your GPU Computing Needs (Sponsored by HP) TotalView Technologies, Inc., a Rogue Wave
Software company)
In this session we will go into detail and you will learn about HP’s
GPU enabled systems, from Workstations to our GPU enabled Topic(s): Tools & Libraries
servers and clusters. You will get the latest information on
configurations, options, GPU management and use cases. Tuesday, Sept 21, 14:00 (50 minutes)
Room D
Speaker(s): Dave Korf (Marketing, HP), Will Wade (Business
Alliance Manager, HP) 2303 Using Tegra to Solve The Electric Car Power Dilemma
Topic(s): High Performance Computing Explore how advanced SoC technologies are transforming the
automotive industry. Learn how using NVIDIA Tegra
Tuesday, Sept 21, 14:00 (50 minutes) increased the available range while pushing the envelope on next-
Marriott San Jose Ballroom gen driver experience. We will share the lessons learned in the world
of electric cars and the challenges in constructing a mass-production
2262 CUDA Centers of Excellence Super-Session I electric vehicle.
Come hear about the groundbreaking research taking place at
Speaker(s): Theo Valich (President, Bright Side Network Inc.)
the CUDA Centers of Excellence, an elite group of world-renowned
research universities that are pushing the frontier of massively Topic(s): Embedded & Automotive, Computer Vision, Video
parallel computing using CUDA. Researchers from these top Processing, Computer Graphics
institutions will survey cutting-edge research that is advancing
the state of the art in GPU computing and dozens of application Tuesday, Sept 21, 15:00 (50 minutes)
fields across science and engineering. Room A3
In this session we will hear from Professor Hanspeter Pfister of 2017 Lessons Learned Deploying the World’s First GPU-
Harvard University and Professor Jeff Vetter of Georgia Tech and Based Petaflop System
Oak Ridge National Laboratory. Learn what to expect when deploying PetaFLOP or larger systems.
The June 2010 list of the Top 500 computer systems featured the
Speaker(s): Hanspeter Pfister (Professor, Harvard University),
first GPU-based cluster to exceed 1 PetaFLOP of floating-point
Jeffrey Vetter (Professor, Georgia Tech / Oak Ridge
power -- a system that was built in a fraction of the time and the
National Laboratory)
cost a CPU-only system of that performance would have required.
Topic(s): General Interest This session gives an overview of how system builders and administrators should
prepare for large-scale HPC deployments.
Tuesday, Sept 21, 14:00 (50 minutes)
Room L Speaker(s): Dale Southard (Senior Solution Architect, NVIDIA)
Topic(s): High Performance Computing
2276 Using GPUs to Run Next-Generation Weather Models
We are using GPUs to run a new weather model being developed Tuesday, Sept 21, 15:00 (50 minutes)
at NOAA’s Earth System Research Laboratory (ESRL) called
Room A2
the Non-hydrostatic Icosahedral Model (NIM). NIM is slated to
run at high resolution (4km global scale) within two years. This 2074 Driving a Product from Rasterization to Ray Tracing:
presentation will highlight work required to parallelize and run The Developer Experience
the NIM. We will describe progress running on multiple GPUs, Learn from the challenges encountered while using DirectX to
report on our evaluation of two FORTRAN GPU compilers, and update the Bunkspeed Move rasterization engine to work with
give performance updates of NIM using Fermi. We will also Mental Images’ iRay. This work was part of the creation of
discuss special challenges developing and running operational Bunkspeed Shot, which allows the user to leverage both the high
weather models on GPUs. quality image generation of iRay and a highly interactive, good
WebSockets integrate with WebGL. Experienced OpenGL
optimal switching point between algorithms on a per-machine
developers will learn how to transition their knowledge to WebGL
basis. Next, we present a technique to handle large systems,
development.
where shared memory constraints prevent previous work from solving
these systems directly. Finally, we will discuss optimizations on Speaker(s): Vladimir Vukicevic (Principal Engineer, Mozilla
a cyclic reduction technique that avoid bank conflicts on current Corporation)
hardware. Topic(s): GPU Accelerated Internet, Tools & Libraries,
Computer Graphics
Speaker(s): Andrew Davidson (Graduate Student, University
of California, Davis), Yao Zhang (Graduate Student,
University of California, Davis) Tuesday, Sept 21, 15:00 (50 minutes)
Topic(s): Algorithms & Numerical Techniques, Computational
Room B
Fluid Dynamics 2147 GPGPU Development for Windows HPC Server
Attend this demo-driven session to see how to schedule jobs to
Tuesday, Sept 21, 15:00 (50 minutes) a Windows compute cluster that includes GPUs. We will also
Room N demonstrate GPU-enhanced versions of some commonly used
2090 Developing Highly Scalable Particle-Mesh Codes for HPC open-source codes, and show how NVIDIA Parallel Nsight™
GPUs: A Generic Approach can be used to debug GPU applications on a cluster. We will also give
a brief introduction to performance profiling tools that allow
Dive deep into a multi-parallel Particle in Cell code that utilizes
developers to analyze system, CPU and GPU events.
MPI, pthreads, and CUDA. Around this specific application a
general C++ framework for transparent data transfers between Speaker(s): Calvin Clark (Senior Consultant, Microsoft)
GPUs has been developed and will be presented. Further Topic(s): High Performance Computing, Tools & Libraries
techniques employed include interleaving of communication
and computation, particle tiling and a study of how well CUDA Tuesday, Sept 21, 15:00 (20 minutes)
performance can be transferred to OpenCL. Room K
Speaker(s): Guido Juckeland (Senior System Engineer (HPC), 2148 Rapid Prototyping and Visualization with OpenCL
Leader Hardware Accelerator Group, TU Dresden Studio
- ZIH), Michael Bussmann (Junior Group Leader Learn about OpenCL Studio, an integrated OpenCL and OpenGL
Computational Radiation Physics, development environment for parallel programming and
Forschungszentrum Dresden-Rossendorf) visualization. We will discuss building end user applications and
Topic(s): Physics Simulation, Astronomy & Astrophysics, High using its integrated visualization capabilities to better understand
Performance Computing the output and internal structure of parallel algorithms. We
will also demonstrate its capabilities using several sample
Tuesday, Sept 21, 15:00 (50 minutes) applications including particle systems, volumetric rendering,
Room L and image processing.
2103 Development of an Efficient GPU-Accelerated Model Speaker(s): Jochen Stier (Founder, Geist Software Labs)
for Fully Nonlinear Water Waves Topic(s): Tools & Libraries
This work is concerned with the development of an efficient high-
throughput scalable model for simulation of fully nonlinear water Tuesday, Sept 21, 15:00 (50 minutes)
waves (OceanWave3D) applicable to solve and analyze large-scale Room A7
problems in coastal engineering. The goal can be achieved through
algorithm redesign and parallelization of an optimized sequential
2224 GPU Acceleration in Adobe Creative Tools
single-CPU algorithm based on a flexible-order Finite Difference Hear experts explain how Adobe Creative Suite 5 harnesses
Method. High performance is pursued by utilizing many-core the power of CUDA technology in several of its core software
processing in the model focusing on GPUs for acceleration of code applications. We will focus on the complete redesign of the
execution. This involves combining analytical methods with an core video playback and rendering engine in Adobe Premiere
algorithm redesign of the current numerical model. Pro CS5 and how it uses the power of GPUs to deliver superior
performance and change the game for Adobe in professional Tuesday, Sept 21, 15:00 (50 minutes)
video production. Room M
Speaker(s): Paul Young (Adobe), Steve Hoeg (Adobe), 2270 Appro’s GPU Computing Solutions
Al Mooney (Adobe) Learn how GPUs are changing the High Performance Computing
Topic(s): Video Processing, Imaging landscape to deliver price/performance levels that were previously
considered unachievable. Join Appro (http://www.appro.com),
Tuesday, Sept 21, 15:00 (50 minutes) a leading provider of supercomputing solutions, to discuss the
Room A1 introduction of the Appro Tetra server, the most powerful GPU
2227 OpenGL 4.0 Tessellation for Professional Applications server available today in a 1U form factor and the availability
of a new modular GPU expansion blade, both based on NVIDIA
The new generation of accelerated graphics is elevating
Tesla 20-series GPUs. The availability of these two products is
visual computing to new heights. Tessellation, one of its most
a confirmation of Appro’s commitment to providing the most
anticipated features, is already used in many scenarios to bring
innovative and powerful computing platforms at very attractive
3D graphics to an unprecedented level of realism.
prices to the High Performance Computing markets.
This talk will introduce tessellation using OpenGL 4.0. We will also Speaker(s): John Lee (Vice President, Appro International, Inc.)
describe how an existing application can be adapted to efficiently
Topic(s): High Performance Computing
take advantage of this new feature and also how to overcome
Speaker(s): Xiaohui Cui (Research Scientist, Oak Ridge
the CPU, GPU, and operating system, and explore the depths of
National Laboratory)
Parallel Nsight profilers, including GPU performance counters
Topic(s): High Performance Computing
and how to use them.
Tuesday, Sept 21, 16:00 (50 minutes) Speaker(s): Sebastien Domine (Sr. Dir. Developer Tools, NVIDIA)
Room C Topic(s): Tools & Libraries, Programming Languages &
Techniques
2056 Next-Generation Rendering with CgFX
Dive into the details of using CgFX – Cg’s effect framework – to
Tuesday, Sept 21, 16:00 (50 minutes)
combine ray-tracing with real-time rendering and enable the next
Room A7
generation of complex high-quality rendering. You will learn how
to use CgFX to create complex rendering effects in a concise and 2161 NVIDIA Quadro Digital Video Pipeline Overview
elegant fashion by: This session will provide an overview of the Quadro Digital Video
Pipeline. It will cover a description of the DVP components,
• Blending material-level and scene-level effects in a consistent way,
application architectures, software architectures, and
• Seamlessly integrating CUDA-based data processing within the programming resources available.
CgFX rendering pipeline,
Speaker(s): Thomas True (Applied Engineer, NVIDIA)
• Mixing OptiX-based rendering with CgFX and OpenGL. Topic(s): Computer Graphics, Video Processing, Programming
Languages & Techniques
Speaker(s): Tristan Lorach (Computer Graphics Engineer,
NVIDIA)
Tuesday, Sept 21, 16:00 (20 minutes)
Topic(s): Computer Graphics
Room K
Tuesday, Sept 21, 16:00 (50 minutes) 2179 GPU - An R Library for Native GPU Objects
Room A5 Come learn about the GPU R package. R is a widely popular
open source statistical programming language. The GPU package
2067 Experiences with Code Optimizations for High
extends R by providing GPU-based types, classes and methods
Performance GPGPU Programs
implementing GPU versions of R vectors, matrices, lists and
Attend this session to learn and share code optimizations to
data frames. Subsequent operations with these are executed
achieve high performance GPU computing. We will cover code
on the GPU. Users are not required to create special bindings or
transformations for memory coalescing, workload management at
implement special syntax, nor do they need to copy objects between
both thread and thread-block levels, and different ways to handle
CPU and GPU. The GPU package allows programmers access
memory partition conflicts. We will also discuss integration of
to the computational power of GPUs with little modification to
code optimizations into a compiler.
existing code.
Speaker(s): Huiyang Zhou (Associate Professor, North Carolina
Speaker(s): Christopher Brown (Partner, Open Data)
State University), Yi Yang (Ph.D. student, North
Topic(s): Tools & Libraries, Algorithms & Numerical
Carolina State University)
Techniques, High Performance Computing
Topic(s): Programming Languages & Techniques
maintaining high performance.
Tuesday, Sept 21, 17:00 (50 minutes)
Speaker(s): Matthew Curry, Sandia National Laboratories and
Room A8
the University of Alabama at Birmingham
2060 GPUs in a Flash: Mapping the Flash Animated Topic(s): High Performance Computing
Software Vector Rendering Model to the GPU
Explore the Flash rendering architecture including the challenges Tuesday, Sept 21, 17:00 (50 minutes)
of mapping from an animated software vector rendering model Room B
to a GPU. We will also discuss how the landscape of mobile,
2212 Parallel Nsight for Accelerated DirectX 11
desktop, devices, drivers, and APIs impacts the design and
Development [Advanced]
deployment of a GPU based Flash Player.
Parallel Nsight is NVIDIA’s new development environment for
Speaker(s): Lee Thomason (Principal Scientist, Adobe Systems) graphics and GPU computing. In this advanced session, you
Topic(s): GPU Accelerated Internet will learn how Parallel Nsight can accelerate debugging and
profiling of Direct3D 11 applications. Attendees will learn how
Tuesday, Sept 21, 17:00 (50 minutes) to debug Direct3D frames and HLSL shaders using Parallel
Room A5 Nsight’s powerful Graphics Inspector and Debugger which
allows developers to inspect Direct3D resources and state, set
2084 State of the Art in GPU Data-Parallel Algorithm
breakpoints in HLSL shaders, examine shader variables, and
Primitives
see which graphics primitives are live on the GPU. Attendees
Learn about the importance of optimized data-parallel algorithm
will also learn how to use the Frame Profiler to capture and
primitives as building blocks for efficient real-world applications.
mine performance information, and easily pinpoint bottlenecked
Fundamental parallel algorithms like sorting, parallel reduction,
GPU units.
and parallel scan are key components in a wide range of
applications from video games to serious science. This session Speaker(s): Simon Barrett (Senior Software Engineer, NVIDIA)
will cover the state of the art in data-parallel primitive algorithms Topic(s): Programming Languages & Techniques,
for GPUs. Starting with an explanation of the purpose and Computer Graphics
applications of the algorithms, we will discuss key algorithm
design principles, demonstrate current open source algorithm Tuesday, Sept 21, 17:00 (50 minutes)
libraries for GPUs (CUDPP and Thrust), describe optimizations Room A3
using new features in the Fermi architecture, and explore future
2225 Tools for Managing Clusters of NVIDIA GPUs
directions.
Learn about the suite of tools NVIDIA provides to manage
Speaker(s): Mark Harris (Senior Developer Technology large installations of GPUs from the NVIDIA Tesla Series. The
Engineer, NVIDIA) presentation will cover cluster management (tool and library),
Topic(s): Algorithms & Numerical Techniques, High as well as the GPUDirect technology that enables GPUs to
Performance Computing, Tools & Libraries communicate faster across the network.
Speaker(s): Peter Buckingham (Tesla Software Manager,
Tuesday, Sept 21, 17:00 (50 minutes)
NVIDIA), Andrew Iles (Software Engineer, NVIDIA)
Room N
Topic(s): Tools & Libraries
WEDNESDAY
fast algorithms, such that our science can most benefit, an
ideal environment is created by the open software model, where 2249 New Programming Tools GPU Computing
efforts can be shared. We will describe one area of application This session will focus on new parallel programming tools
--electrostatics of biomolecules in solution-- where we see at for GPU computing. The type of tools that fit into the session
work the triad of extreme computing: fast algorithms, open include (1) Planning tools for porting legacy applications to use
software, and heterogeneous computing. GPU computing, (2) High-level programming and scripting tools
for GPU computing, (3) Automation of common performance
Speaker(s): Lorena Barba (Assistant Professor, Boston
optimizations for GPU computing, (4) Performance analysis
University)
and diagnosis tools for GPU computing, (5) Tools that simplify
Topic(s): Algorithms & Numerical Techniques, Physics
heterogeneous parallel computing.
Simulation
Speaker(s): Wen-mei Hwu (Professor, University of Illinois,
Wednesday, Sept 22, 10:00 (50 minutes) Urbana-Champaign), Andrew Schuh (Project
Room B Manager, University of Illinois)
Topic(s): Tools & Libraries
2168 Interactive Molecular Dynamics for Nanomechanical
and Nanochemical Experiments
Wednesday, Sept 22, 10:00 (50 minutes)
Hear how the combination of GPU accelerated molecular Marriott San Jose Ballroom
dynamics simulation software, 3D TV displays, affordable haptic
game controllers, and high performance molecular visualization 2280 TSUBAME2.0 Experience
is leading to new ways to study materials and objects on the Tsubame2.0 is the next-generation multi-petaflops
nanoscale. We will present the concept of an appliance for supercomputer that has been designed and built at Tokyo Tech,
integrated virtual nanoscale experiments and challenges related with more than 4000 NVIDIA Fermi GPUs, as a successor to the
to software and hardware. highly successful Tsubame1. Deep design considerations were
made based on experiences on Tsubame1 retrofitted with the
Speaker(s): Axel Kohlmeyer (Associate Director, Institute for
previous generation Tesla to maximize the versatility and the
Computational Molecular Science, Temple University)
competitiveness of the system across a considerable number of
Topic(s): Molecular Dynamics
application domains, as well as accommodating as much strong
scaling as possible. This resulted in a totally new custom system
Wednesday, Sept 22, 10:00 (50 minutes) design in collaboration with HP and NEC, rather than a machine
Room A7 with retrofitted GPUs. The resulting supercomputer hopefully
2169 Real-time Volumetric Medical Ultrasound will become a design template for future large-scale GPU systems.
Applications for GPU Computing
Real-time volumetric medical ultrasound requires computationally Speaker(s): Satoshi Matsuoka (Professor, Tokyo Institute
intensive rapid processing of data for visualization of acquired of Technology)
acoustic data. Clinical applications of GPU-based technologies in Topic(s): High Performance Computing
obstetrics and cardiology will be discussed.
Speaker(s): Roee Lazebnik (Director of Product Development, Wednesday, Sept 22, 10:00 (50 minutes)
Siemens Healthcare) Room A8
Topic(s): Medical Imaging & Visualization, Imaging, 2305 PantaRay: Accelerating Out-Of-Core Ray Tracing of
Stereoscopic 3D, Computer Graphics Sparsely Sampled Occlusion
Modern VFX rendering pipelines are faced with major complexity
challenges: a film like Avatar requires rendering hundreds of
thousands of frames, each containing hundreds of millions
or billions of polygons. Furthermore, the process of lighting Wednesday, Sept 22, 10:30 (20 minutes)
requires many rendering iterations across all shots. In this Room D
talk, we present the architecture of an efficient out-of-core ray
2109 Migration of a Complete 3D Poisson Solver from
tracing system designed to make rendering precomputations
Legacy Fortran to CUDA
of gigantic assets practical on GPUs. The system we describe,
dubbed PantaRay, leverages the development of modern ray We describe our journey of migrating a legacy direct solver
tracing algorithms for massively parallel GPU architectures and library for Poisson equations written in Fortran77 to CUDA in
combines them with new out-of-core streaming and level of detail order to harness the computational power provided by the Tesla
rendering techniques. device (“Fermi”). This legacy library is still widely used today as
it is the most complete library that can deal with three different
Speaker(s): Luca Fascione (Senior Research and Development boundary conditions (Dirichlet, Neumann and Cyclic) and two
Engineer, Weta Digital) grid configurations (staggered and centered) independently in
Topic(s): Digital Content Creation (DCC) any of the three dimensions (x, y, z), giving a total of over 200
configurations.
Wednesday, Sept 22, 10:00 (20 minutes)
Speaker(s): Huynh Phung (Research Engineer, A*STAR Institute
Room K
of High Performance Computing)
2306 Gate-Level Simulation with GP-GPUs Topic(s): Tools & Libraries, Computational Fluid Dynamics
Logic simulation is a critical component of the digital design
tool flow. It is used from high-level descriptions down to gate- Wednesday, Sept 22, 10:30 (20 minutes)
level to validate several aspects of the design, particularly Room K
functional correctness. Despite development houses investing
2300 High-Performance Compressive Sensing using Jacket
Wednesday, Sept 22, 11:00 (50 minutes) Wednesday, Sept 22, 11:00 (50 minutes)
Room A2 Room L
2059 Industrial Seismic Imaging on GPUs 2092 Integrating CUDA into a Large-Scale Commercial
At Hess Corporation, we have moved the most computationally Database Management System
intensive parts of our seismic imaging codes from CPUs to GPUs In a large-scale database installation where data tables are
over the past few years. In this talk I will give an overview of distributed across multiple servers, computational throughput
seismic imaging, highlighting the physical and computational can be optimized by using GPUs on each server and integrating
algorithms of these codes. I will discuss our software approach database management with GPU resources.
and the programming effort to port them to GPUs, concluding
In the Department of Physics and Astronomy at The Johns
with a summary of our progress in adopting GPUs in production.
Hopkins University, we are experimenting with a set of software
Speaker(s): Scott Morton (Geophysical Advisor, tools that closely couple SQL statements with GPU functionality.
Hess Corporation) While still under development, the new framework is now
Topic(s): Energy Exploration, High Performance Computing routinely used in our research projects, e.g., to study the spatial
clustering of galaxies as well as genomics.
Wednesday, Sept 22, 11:00 (50 minutes) Speaker(s): Richard Wilton (Research Scientist, The Johns
Room E Hopkins University), Tamas Budavari (Research
2065 Massively Accelerating Iterative Gauss-Newton Scientist, Johns Hopkins University)
Fitting Topic(s): Databases & Data Mining, Astronomy &
To measure three-dimensional shape data of objects, we build Astrophysics, High Performance Computing,
up a measurement system that assigns three-dimensional Tools & Libraries
coordinates to the position of projected measurement labels in
a camera image. To achieve high measurement accuracy across Wednesday, Sept 22, 11:00 (50 minutes)
large numbers of measurement points, we need a very fast Marriott Guadalupe Room
routine to localize measurement labels with high precision. To 2099 Cosmology Powered by GPUs Redux
speed up the computation, we evaluate the fits using the CUDA
Cosmological simulations aim at reproducing the physical
architecture. The final implementation speeds up the fitting of 104
processes which occur on the largest scales of the Universe
two-dimensional Gauss functions by a factor of 90.
since the Big-Bang by means of numerical calculations on
Speaker(s): Daniel Härter (University of Freiburg, IMTEK, supercomputers. Using CUDA, I have implemented standard
Laboratory for Process Technology) cosmological techniques on GPU architecture (PM N-Body solver,
Topic(s): Computer Vision, Stereoscopic 3D Hydrodynamics & moment-based radiative transfer) and designed
them to run on supercomputing facilities by means of MPI+CUDA
Wednesday, Sept 22, 11:00 (50 minutes) mixed programming. These applications are able to run on 100 or
Room A1 more graphics devices with typical scalar x50 accelerations and
with a communication overhead limited to 15%. This allows us to explore
2071 Large Scale Visualization Soup physical regimes which were out of reach of current simulations.
The unprecedented realism that is possible today allows for
visualization at an ever larger scale. This talk will walk through Speaker(s): Dominique Aubert (Lecturer, Strasbourg University)
several case studies from high resolution single displays to Topic(s): Astronomy & Astrophysics
completely immersive environments. Details will be shared
on how to architect and implement these installations, with Wednesday, Sept 22, 11:00 (50 minutes)
attention to the typical issues encountered. It will cover how to Room N
implement stereo 3D in OpenGL, Direct3D, as well as how that 2104 Rapid Prototyping Using Thrust: Saving Lives with
relates to the different display technologies (projectors, multi- High Performance Dosimetry
display, CAVEs, etc.).
Radiation poisoning is an everpresent danger for intervention
Speaker(s): Steve Nash (Applied Engineer, NVIDIA) teams that must visit nuclear sites. Virtual reality can help teams
Topic(s): Computer Graphics, Stereoscopic 3D prepare for intervention, but efficient computation of radiation
dosage is critical to study complex scenarios. Radiation protection
Wednesday, Sept 22, 11:00 (50 minutes) research often uses codes based on the straight line attenuation
Marriott San Jose Ballroom method. As with other approaches, geometrical computations
(finding all the interactions radiation rays/objects intersection)
2078 Shockingly fast and accurate CFD simulations remain the simulation bottleneck. This talk will describe how we
In the last three years we have demonstrated how GPU
FULL CONFERENCE GUIDE 2010
have used the Thrust high-level library for CUDA C/C++ to quickly
accelerated discontinuous Galerkin methods have enabled prototype innovative algorithms and achieve a significant speed
simulation of time-dependent, electromagnetic scattering from up.
airplanes and helicopters.
Speaker(s): Guillaume Saupin (CEA)
In this talk we will discuss how we have extended these Topic(s): High Performance Computing, Algorithms &
techniques to enable GPU accelerated simulation of supersonic Numerical Techniques, Physics Simulation,
airflow as well. Ray Tracing
Speaker(s): Timothy Warburton (Associate Professor,
Rice University)
51
Wednesday, Sept 22, 11:00 (50 minutes)
Room A7
2146 Virtual Surgery
Come see how 3D Vision technology is used in Virtual Surgery
Training for Medical Education. BioDigital Systems, in conjunction
with University of California San Francisco (UCSF), has developed
a dental injection simulator to teach students of dentistry the
mechanics of nerve block injection. 3D Vision Technology has
added a new dimension of realism by providing users with a
unique immersive experience.
Speaker(s): Aaron Oliker (Managing Partner/Director 3D
Technology, BioDigital)
Topic(s): Medical Imaging & Visualization, Stereoscopic 3D

Wednesday, Sept 22, 11:00 (50 minutes)
Room C
2177 Simplifying Parallel Programming with Domain
Specific Languages
Explore a new approach in parallel programming which leverages

Elif Albuz (NVIDIA), Nathan Whitehead (CUDA
Software Engineer, NVIDIA), Frank Jargstorff
(Software Engineer, NVIDIA)
Topic(s): Tools & Libraries

Wednesday, Sept 22, 11:00 (50 minutes)
Room A5
2275 The Evolution of GPUs for General Purpose Computing
Learn how the GPU evolved from its humble beginning as a “VGA
Accelerator” to become a massively parallel general purpose
accelerator for heterogeneous computing systems. This talk will
focus on significant milestones in GPU hardware architecture and
software programming models, covering several key concepts
that demonstrate why advances in GPU parallel processing
performance and power efficiency will continue to outpace CPUs.
Speaker(s): Ian Buck (Software Director of GPU Computing,
NVIDIA)
Topic(s): General Interest

Wednesday, Sept 22, 11:00 (50 minutes)
In a lively and fast-paced exchange, the “Emerging Companies
Summit - CEO on Stage” sessions will feature CEOs from three
startups who will each have 15 minutes to introduce their
companies and interact with a panel of leading venture capitalists,
technology executives, and industry analysts.
Panelist(s): Drew Lanza (Partner, Morgenthaler), Dan’l Lewin
(Corporate VP and Strategic & Emerging Business
Development, Microsoft), Jon Peddie (President,
JPR), Jeff Herbst (Vice President of Business
Development, NVIDIA)
Speaker(s): Sam Cox (CEO, Milabra), Sam Blackman (CEO
and Co-Founder, Elemental Technologies, Inc.),
Chris Doran (Founder and Chief Operating
Officer, Geomerics)
Topic(s): General Interest, Video Processing,
Computer Graphics

Wednesday, Sept 22, 11:30 (20 minutes)
Room B
2035 Simulations of Large Membrane Regions
Learn how to study membrane-bound protein receptors by
moving beyond the current state-of-the-art simulations that only
consider small patches of physiological membranes. Towards this
end, this session presents how to apply large-scale GPU-enabled
computations of extended phospholipid bilayer membranes using
a GPU code based on the CHARMM force field for MD simulations.
Our code enables fast simulations of large membrane regions in
NVT and NVE ensembles and includes different methods for the
representation of the electrostatic interactions, i.e., reaction force
field and Ewald summation (PME) methods. Performance and
scientific results for dimyristoylphosphatidylcholine (PC) based
lipid bilayers are presented.
Speaker(s): Michela Taufer (Assistant Professor, University
of Delaware), Narayan Ganesan (Research Scientist,
University of Delaware), Sandeep Patel (Assistant
Professor, University of Delaware)
Topic(s): Molecular Dynamics, High Performance Computing,
Physics Simulation

Wednesday, Sept 22, 11:30 (20 minutes)
Room K
2117 Migration of C and Fortran Apps to GPGPU using HMPP
GPGPU is a tremendous opportunity for many application fields.
Migrating legacy software to GPGPU is a complex process that
requires mastering the technological risks (e.g. loss of code
portability, extensive code restructuring, debugging complexity)
as well as costs. In this talk, we present a methodology based on
HMPP (Heterogeneous Multicore Parallel Programming), allowing
incremental processes that reduce the cost and risks of porting
codes to GPGPU.
Speaker(s): Francois Bodin (CTO, CAPS Entreprise)
Topic(s): High Performance Computing, Tools & Libraries

Topic(s): General Interest

Wednesday, Sept 22, 14:00 (50 minutes)
Marriott Guadalupe Room
2000 Gravitational N-body Simulations: How Massive Black
Holes Interact with Stellar Systems
Astrophysics is a field where supercomputing is a must to obtain
new scientific results. In particular, the study of the interaction
among massive black holes and surrounding stars is a hot topic,
which requires heavy computations to have good representation
of what happens in the inner regions of galaxies. We present
the results obtained with our high-precision N-body code,
NBSymple, which exploits the joint power of a multi-core CPU
system together with the high performance NVIDIA Tesla C1060
GPUs.
The code is available at the website:
astrowww.phys.uniroma1.it/dolcetta/nbsymple.html
Speaker(s): Roberto Capuzzo-Dolcetta (Professor, Sapienza Univ.
of Roma), Alessandra Mastrobuono Battisti (PhD
Student, Sapienza University of Rome)
Topic(s): Astronomy & Astrophysics, Algorithms & Numerical
Techniques

Wednesday, Sept 22, 14:00 (50 minutes)
Room D
2038 The Best of Both Worlds: Flexible Data Structures for
Heterogeneous Computing
Learn how to switch between array of structs (AoS) and struct of
arrays (SoA) storage without having to change the data access
syntax. A few changes to the struct and container definitions will
enable you to evaluate the performance of AoS vs. SoA on your
existing AoS code. We present a simple abstraction that retains
the more intuitive AoS syntax array[index].component, yet allows
you to switch between AoS and SoA storage with a single template
parameter at class definition.
Speaker(s): Robert Strzodka (Senior Researcher, Max Planck
Institut Informatik)
Topic(s): Algorithms & Numerical Techniques,
Tools & Libraries

Wednesday, Sept 22, 14:00 (50 minutes)
Room A3
2041 PyCUDA: Even Simpler GPU Programming with
Python
Explore PyCUDA, a robust, open-source toolkit that lets you
control your GPU from the comfort of Python, a Matlab-like
scripting language. Learn about Fermi tuning with PyCUDA,
the new interfaces for CUBLAS and CUFFT, the ecosystem of
third-party libraries built on PyCUDA, and examples illustrating
PyCUDA’s benefits to large-scale applications.
Speaker(s): Andreas Kloeckner (Courant Instructor, Courant
Institute, NYU)
Topic(s): Tools & Libraries, Computational Fluid Dynamics,
Physics Simulation
Wednesday, Sept 22, 14:00 (20 minutes)
Room K

for GPU implementation. The algorithm uses data in irregular
ways since it is a graph-based algorithm. It also makes heavy
use of constructs like recursion, which is not supported by
GPU hardware. In this paper, we take a state-of-the-art FPGA
technology mapping algorithm within Berkeley’s ABC package
and attempt to parallelize it on a GPU. We show that runtime
gains of 3.1x are achievable while maintaining identical quality as
demonstrated by running these netlists through Altera’s Quartus
II place-and-route tool.
Speaker(s): Doris Chen (Student, University of Toronto)
Topic(s): Algorithms & Numerical Techniques,

Wednesday, Sept 22, 14:00 (50 minutes)
Room L
2120 High Performance Complex Event Processing on
GPGPU
Complex Event Processing (CEP), a crucial component in
enterprise-scale applications, allows applications to process
incoming event streams and apply relevant techniques in
real time for quicker decisions, making it easy to identify
complex patterns in the events. Much of this system’s time
is consumed by the event matching algorithms. Our work uses
the highly parallel GPU for an event matching algorithm in
which every incoming event is processed by the algorithm,
resulting in high throughput.
Speaker(s): Murali Krishna (Junior Research Associate, Infosys
Technologies Limited), Sudeep Mallick (Principal
Research Scientist, Infosys)
Topic(s): Databases & Data Mining, Finance

Wednesday, Sept 22, 14:00 (50 minutes)
Room A1
2125 Developing GPU Enabled Visual Effects For Film
And Video
Speaker(s): Bruno Nicoletti (CTO, The Foundry)
Topic(s): Film, Tools & Libraries, Video Processing

Technische Universität München)
Topic(s): Physics Simulation, Algorithms & Numerical
Techniques, High Performance Computing

Wednesday, Sept 22, 14:00 (50 minutes)
Room A7
2139 Interactive Histology of Large-Scale Biomedical
Image Stacks
Get the latest information on leveraging GPU computing to
process and visualize large-scale biomedical image stacks. We
will discuss both display-aware processing and GPU-accelerated
texture compression for histology applications on the GPU.
Speaker(s): Won-Ki Jeong (Research Scientist, Harvard
University), Jens Schneider (Postdoctoral Fellow,
King Abdullah University of Science and Technology)
Topic(s): Medical Imaging & Visualization, Imaging,
Life Sciences

Wednesday, Sept 22, 14:00 (50 minutes)
Room A5
2140 Superfast Nearest Neighbor Searches Using a
Minimal kd-tree
Learn how to adapt a kd-tree spatial data structure for efficient
nearest neighbor (NN) searches on a GPU. Although the kd-tree
is not a natural fit for GPU implementation, it can still be effective
with the right engineering decisions. By bounding the maximum
height of the kd-tree, minimizing the memory footprint of data
structures, and optimizing the GPU kernel code, multi-core GPU
NN searches with tens of thousands to tens of millions of points
run 10-40 times faster than the equivalent single-core CPU NN
searches.
Speaker(s): Shawn Brown (Graduate Student, UNC, Chapel Hill)
Topic(s): Algorithms & Numerical Techniques, Databases &
Production
Discover how post-production tasks can be accelerated by taking
advantage of GPU-based algorithms. In this talk we present
computer vision algorithms for corner detection, feature point
tracking, image warping and image inpainting, and their efficient
implementation on GPUs using CUDA. We also show how to
use these algorithms to do real-time stabilization and temporal
re-sampling (re-timing) of high definition video sequences, both
common tasks in post-production. Benchmarking of the GPU
implementations against optimized CPU algorithms demonstrates
a speedup of approximately an order of magnitude.
Speaker(s): Hannes Fassold (Scientist, JOANNEUM RESEARCH)
Topic(s): Computer Vision, Video Processing

Wednesday, Sept 22, 15:00 (50 minutes)
Marriott Guadalupe Room
2044 GRASSY: Leveraging GPU Texture Units for
Asteroseismic Data Analysis
Learn how to use the hidden computation capability of GPU
texture units for general purpose computation. We describe
GRASSY, a system for stellar spectral synthesis where the
core problem is interpolation between pre-computed intensity
values. We map these pre-computed tables to the GPU’s texture
memory. Interpolation then becomes a texture lookup where
the hardware automatically performs the interpolation, albeit at
very low precision. Our mathematical framework reasons about
the impact of this precision and our performance results show
500X speedups. This work generalizes the GPU texture units
as computation engines and opens up new problems for GPU
acceleration.
Speaker(s): Matt Sinclair (Research Assistant, UW-Madison)
Topic(s): Astronomy & Astrophysics, High
Performance Computing

Amolak Badesha (Senior Application Expert &
Strategist, Agilent Technologies), Hany Fahmy
(Director, SI/EMC Engineering, NVIDIA)
Topic(s): Physics Simulation, Tools & Libraries

Wednesday, Sept 22, 15:00 (50 minutes)
Room C
2122 Using GPUs for Real-Time Brain-Computer Interfaces
Learn how GPU processing can provide researchers with
an inexpensive and versatile alternative to dedicated signal
processing hardware for real-time neural prosthetics. Topics
will include an overview of algorithms, current state-of-the-art
hardware, GPU processing in a real-time environment,
multi-platform processing, and future directions in BCIs using GPU
processing.
Speaker(s): Adam Wilson (Postdoctoral Fellow, University
of Cincinnati)
Topic(s): Neuroscience, Algorithms & Numerical Techniques,
Signal processing

Wednesday, Sept 22, 15:00 (50 minutes)
Room E
2170 Lattice Boltzmann Multi-Phase Simulations in Porous
Media using GPUs
Learn how a very efficient implementation of multiphase lattice
Boltzmann methods (LBM) based on CUDA delivers significant
benefits for predictions of properties in rocks. This simulator on
NVIDIA hardware enables us to perform pore-scale multi-phase
(oil-water-matrix) simulations in natural porous media and to
predict important rock properties like absolute permeability,
relative permeabilities, and capillary pressure. We will show
videos of these simulations in complex real-world porous media
and rocks.
Wednesday, Sept 22, 15:00 (50 minutes)
Room A7
2211 Modern Architecture for Massively Parallel Medical
Tomographic Image Reconstruction on a GPU Cluster
Learn how to combine GPU and Cluster Programming with a
real-world example. Many aspects of medical tomographic image
reconstruction are embarrassingly parallel, but require massive
compute power. We distribute the load onto a cluster of multi-GPU
equipped nodes using Message Passing Interface (MPI) and CUDA.
The Thrust library allows for a modern object-oriented approach.
Speaker(s): Sven Prevrhal (Staff Research Scientist, Philips),
Jingyu Cui (Graduate Student, Stanford University)
Topic(s): Medical Imaging & Visualization, Algorithms &
Numerical Techniques, High Performance
Computing, Tools & Libraries

Wednesday, Sept 22, 15:00 (50 minutes)
Room B
2218 Redesigning Molecular Dynamics for GPUs and GPU
Clusters
Generalized Born and Particle Mesh Ewald (PME) molecular
dynamics are two computationally intensive algorithms for
simulating biological molecules. While several adaptations of
Generalized Born have attained excellent speedup on GPUs, high
performance Particle Mesh Ewald has been more elusive. Here
we describe in detail a recent port of PME implemented within
AMBER 11 that has achieved performance on par with up to 128
nodes of a top ten supercomputer.
Speaker(s): Scott Le Grand (Principal Engineer, NVIDIA)
Topic(s): Molecular Dynamics, Algorithms & Numerical
Techniques, High Performance Computing,
Life Sciences

Wednesday, Sept 22, 15:00 (50 minutes)
Room A3
2234 Unstructured Finite Volume Code on a Cluster with
Multiple GPUs per Node
Explore how a code written to run in parallel using OpenMP and
on a single GPU was modified to run across multiple GPUs and
nodes on a multi-CPU, multi-GPU cluster installed at the Naval
Research Laboratory. We will discuss the performance of this
code running in parallel using MPI/OpenMP and MPI/CUDA.
Speaker(s): Keith Obenschain (Computer Scientist, Naval
Research Lab), Andrew Corrigan (Naval
Research Laboratory & George Mason University)
Topic(s): Computational Fluid Dynamics,
High Performance Computing

Speaker(s): Tobias Lauer (Researcher, University of Freiburg),
Christoffer Anselm (Software Developer, Jedox
Business Intelligence)
Topic(s): Databases & Data Mining

Wednesday, Sept 22, 15:00 (50 minutes)
Room A5
2238 Better Performance at Lower Occupancy
It is usually advised to optimize CUDA kernels for higher
occupancy to hide memory and arithmetic latencies better. In
this presentation, I show that increasing occupancy is not the
only way and not always the best way to hide latency on a GPU.
Instead, it may be advantageous to rely on the parallelism within
threads (instruction-level parallelism). This insight yields a simple
optimization technique that is used in later versions of CUBLAS
and CUFFT. I discuss the rationale behind the technique and
illustrate it by speeding up matrix multiplication, starting with the
basic implementation found in the NVIDIA GPU Computing SDK.
Speaker(s): Vasily Volkov (Student, UC Berkeley)
Topic(s): High Performance Computing

Wednesday, Sept 22, 15:00 (20 minutes)
Room K
2251 TotalView Debugger for CUDA
Hear how the TotalView debugger is being extended to support
GPU computation with CUDA. In addition to the basic challenges
associated with debugging parallel programs, CUDA
programming introduces a number of new concepts for which
developers need visibility in debugging: a hierarchical memory,
near-SIMD warps, streams, and kernels, among others. How do
we create a tool that handles it all? We’ll be discussing the status
of our work and the challenges encountered in bringing this all
together into a single package, TotalView for CUDA.
Speaker(s): Chris Gottbrath (Principal Product Manager,
TotalView Technologies, Inc., a Rogue Wave
Software company)
Topic(s): Tools & Libraries

Wednesday, Sept 22, 15:00 (50 minutes)
Room A8
2273 GPUs In the Front Line of our Defenses (Sponsored
by GE)
Find out how GPUs are accelerating defense and aerospace
applications and providing superior information processing
to drive the next generation of capabilities to protect both
homelands and soldiers. Learn how rugged VPX hardware and
software architectures are able to scale from small power- &
weight-constrained vehicles through to large complex processing
arrays, on platforms as diverse as unmanned aerial vehicles
(UAV), through tracked ground vehicles, and to shipborne radar.
Speaker(s): Simon Collins (Product Manager,
GE Intelligent Platforms)
Topic(s): High Performance Computing

Wednesday, Sept 22, 15:00 (50 minutes)
Marriott San Jose Ballroom

Topic(s): Programming Languages & Techniques,
General Interest

Wednesday, Sept 22, 15:00 (50 minutes)
Room D
The GPU (graphics processing unit) runs advanced applications
which are transforming existing industries and creating new ones.
Join our panel of leading industry experts as they discuss the
latest technology advances in the usage of GPUs for Computer
Vision. They will cover facial, gesture, human motion, and
biometrics recognition, augmented reality, robotic computing
and more.
Panelists: Sam Cox (CEO, Milabra), Tom Dean (Research
Scientist, Google), Janko Mrsic-Flogel (CTO, MirriAd),
Joe Stam (Sr. Applications Engineer, NVIDIA), Yoram
Yaacovi (CTO & General Manager, Technologies
at Microsoft)
Topic(s): Computer Vision

Speaker(s): Brent Leback (Engineering Manager,
The Portland Group)
Topic(s): Tools & Libraries, High Performance Computing,
Programming Languages & Techniques

2072 GPUs at the Computer Animation Studio
Learn five simple ways in which GPUs have been adopted in the
production pipeline at Blue Sky Studios. Covers how we use
GPUs to improve animation tools, add real-time anaglyph support,
and accelerate noise functions, including code samples from
production tools.
Speaker(s): Hugo Ayala (Sr. Research Associate,
Blue Sky Studios)
Topic(s): Film, Stereoscopic 3D, Tools & Libraries

Wednesday, Sept 22, 16:00 (50 minutes)
Room B
2073 High Performance Molecular Simulation,
Visualization, and Analysis on GPUs
This talk will present recent successes in the use of GPUs to
accelerate interactive visualization and analysis tasks on desktop
computers, and batch-mode simulation and analysis jobs on
GPU-accelerated HPC clusters. We’ll present Fermi-specific
algorithms and optimizations and compare with those for other
devices. We’ll also present performance and performance/watt
results for NAMD molecular dynamics simulations and
VMD analysis calculations on GPU clusters, and conclude with
a discussion of ongoing work and future opportunities for GPU
acceleration, particularly as applied to the analysis of petascale
simulations of large biomolecular complexes and long simulation
timescales.
Speaker(s): John Stone (Senior Research Programmer,
University of Illinois at Urbana-Champaign)
Topic(s): Molecular Dynamics, Algorithms & Numerical
Techniques, High Performance Computing,
Life Sciences

Wednesday, Sept 22, 16:00 (50 minutes)
Room E
2083 GPU Accelerated Solver for the 3D Two-phase
Incompressible Navier-Stokes Equations
This talk demonstrates the potential of GPUs for solving complex
free surface flow problems using level set methods. These methods
are capable of producing complex surface deformations, and
therefore are used widely in computer graphics, as well as
engineering applications. This work demonstrates that GPUs
can be used to accelerate the most computationally expensive
part of free surface flow calculations, and therefore allows much
larger problems to be solved on workstation machines than was
previously possible. These techniques will be exemplified by our
current project to port our in-house fluid solver NaSt3DGPF to
the GPU.
Speaker(s): Peter Zaspel (Research Assistant, University of Bonn)
Topic(s): Computational Fluid Dynamics, Algorithms &
Numerical Techniques, High Performance
Computing, Physics Simulation

Wednesday, Sept 22, 16:00 (50 minutes)
Room C
2093 Computational Photography: Real-Time Plenoptic
Rendering
Get the latest information on GPU-based plenoptic rendering
including a demonstration of refocusing, novel view generation,
polarization, high dynamic range, and stereo 3D. Learn how GPU
hardware enables plenoptic rendering tasks with high-resolution
imagery to be performed interactively, opening up entirely new
possibilities for modern photography.
Speaker(s): Andrew Lumsdaine (Professor, Indiana University),
Georgi Chunev (Research Assistant, Indiana
University), Todor Georgiev (Senior Research
Scientist II, Adobe Systems)
Topic(s): Imaging, Computer Vision, Stereoscopic 3D

Wednesday, Sept 22, 16:00 (50 minutes)
Marriott Guadalupe Room
2108 Binary Black Holes Simulations using CUDA
Get the latest information on how to evolve binary black hole
simulations on GPUs.
Speaker(s): Abdul Mroue (Post-Doc Fellow, CITA, University
of Toronto)
Topic(s): Astronomy & Astrophysics, Algorithms & Numerical
Techniques, Physics Simulation

Wednesday, Sept 22, 16:00 (50 minutes)
Room A8
2118 Large-scale Gas Turbine Simulations on GPU Clusters
This talk describes a strategy for implementing structured grid
PDE solvers on GPUs. Techniques covered include the use of
source-to-source compilation and the use of sparse matrix-vector
multiplications for complicated boundary conditions. A
new production-quality solver for flows in turbomachines called
Turbostream that uses these techniques is presented. The impact
of the use of GPUs on the turbomachinery design process is
demonstrated by two 64-GPU simulations that have recently been
performed on the University of Cambridge’s GPU cluster.
Speaker(s): Tobias Brandvik (PhD Student, University
of Cambridge)
Topic(s): Computational Fluid Dynamics,

Wednesday, Sept 22, 16:00 (50 minutes)
Marriott San Jose Ballroom
2135 Processing Petabytes per Second with the ATLAS
Experiment at the Large Hadron Collider at CERN
Learn how GPUs could be adopted by the ATLAS detector at the
Large Hadron Collider (LHC) at CERN. The detector, located at
one of the collision points, must trigger on unprecedented data
acquisition rates (PB/s) to decide whether to record an event
or lose it forever. First, we introduce the ATLAS
experiment and the computational challenges it faces. The
second part will focus on how GPUs can be used for algorithm
acceleration, using two critical algorithms as exemplars. Finally,
we will outline how GPGPU acceleration could be exploited and
incorporated into the future ATLAS computing framework.
Speaker(s): Philip Clark (Reader (Associate Professor) in Particle
Physics, University of Edinburgh), Andy Washbrook
(Postdoctoral Research Assistant, University
of Edinburgh)
Topic(s): High Performance Computing, Algorithms &
Numerical Techniques, Physics Simulation

Wednesday, Sept 22, 16:00 (50 minutes)
Room A7
2144 Large-Scale Visualization Using A GPU Cluster
Learn how to visualize extremely large-scale scientific data using
GPGPU techniques on a GPU-accelerated visualization cluster.
Recent advances in general-purpose GPU (GPGPU) computing
provide a promising solution to compute-intensive scientific
visualization. However, the largest scientific simulations produce
datasets that are orders of magnitude larger than the memory
available on current GPUs. Many distributed GPUs must be used
in parallel. We present Longhorn, currently the world’s largest
GPU-enhanced cluster dedicated to visualization and data
analysis, and describe the distributed memory architecture and
GPGPU techniques to interactively visualize massive datasets
using distributed GPUs on Longhorn.
Speaker(s): Byungil Jeong (Visualization Scientist, TACC /
UT-Austin), Paul Navratil (Visualization Scientist,
Texas Advanced Computing Center)
Topic(s): Medical Imaging & Visualization, High Performance
Computing

Wednesday, Sept 22, 16:00 (50 minutes)
Room A5
2154 The Impact of Data Movement on GPU Performance
GPU computing has taken the scientific computing landscape
by storm, fueled by the massively parallel arithmetic hardware.
When coding, researchers rely on best practices that have
been developed in the short timespan of GPGPU. This session
presents a case study on CULA, our library for dense linear
algebra, to challenge the widely held belief that transfers to/from
the GPU device must be minimized to achieve the best
performance. The topics to be discussed include the relationship
between computation and transfer time for synchronous/asynchronous
transfers, and the impact that data allocations have on
memory performance and overall solution time.
Speaker(s): John Humphrey (Senior Engineer, EM Photonics,
Inc.), Daniel Price (Engineer, EM Photonics, Inc.)
Topic(s): High Performance Computing, Algorithms &
Numerical Techniques, Tools & Libraries

Wednesday, Sept 22, 16:00 (50 minutes)
Room A3
2201 A Case Study of Accelerating Matlab Based
Applications using GPUs
Learn how to accelerate Matlab-based applications using GPUs.
We cover a popular neuro-imaging software called SPM and
show how to use CUDA and Jacket to speed up computationally
intensive Matlab applications.
Speaker(s): Aniruddha Dasgupta (Graduate Student, Georgia
Institute of Technology)
Topic(s): Medical Imaging & Visualization

Wednesday, Sept 22, 16:00 (50 minutes)
Room N
2217 GPU-Based Conjugate Gradient Solvers for
Lattice QCD
Learn how to perform state-of-the-art quantum chromodynamics
(QCD) computation using NVIDIA GPUs at 1% of the cost of a
conventional supercomputer and 10% of its power consumption.
We will discuss how physicists around the world are using GPU
clusters to solve QCD. We will focus upon how TWQCD have been
using a large GPU cluster (200 GPUs) to simulate QCD, attaining
36 Teraflops (sustained).
Speaker(s): Ting-Wai Chiu (Professor, National Taiwan
University)
Topic(s): High Performance Computing, Physics Simulation

Wednesday, Sept 22, 16:00 (50 minutes)

and discuss handling boundary conditions and using separate
kernels to improve efficiency.
Speaker(s): Javier Cabezas (Researcher, Barcelona
Supercomputing Center), Mauricio Araya (Senior
Researcher, Barcelona Supercomputing Center)
Topic(s): Energy Exploration, Algorithms & Numerical
Techniques, High Performance Computing

Wednesday, Sept 22, 16:00 (50 minutes)
Room L
2252 Simulating Housefly Vision Elements Using OpenCL
An OpenCL GPU-based computer simulation of a biologically
motivated model, based on the anatomy of the housefly’s first optic
ganglion, the lamina ganglionaris (the lamina layer), is presented.
Specific to GPU technology, the computer model demonstrates:
the implementation of a second-order Runge-Kutta method to
approximate coupled differential equations using GPU hardware;
and the mapping of a non-Cartesian coordinate system onto the
Cartesian layout of the threads. Testing examined usage and
access across device memory spaces to determine the optimal
usage/access method for the ANN. This result was generalized
for OpenCL GPU devices, using the capabilities of OpenCL.
Speaker(s): Karen Haines (Professor, WASP/The University of
Western Australia)
Topic(s): Neuroscience, Algorithms & Numerical Techniques,
Signal processing

Wednesday, Sept 22, 16:00 (50 minutes)
Room M
2302 Microsoft Technologies for HPC
NVIDIA Parallel Nsight provides access to the power of the GPU
from within the familiar environment of Microsoft Visual Studio. In
this session, we will expand on the computational power of Visual
Studio 2010, Windows HPC Server and the Technical Computing
Libraries and show how to increase your performance.
Speaker(s): Calvin Clark (Senior Consultant, Microsoft)
Topic(s): High Performance Computing

Wednesday, Sept 22, 16:00 (50 minutes)
Keynote Hall
4004 Emerging Companies: CEO on Stage featuring
Cooliris, empulse GmbH, and Playcast Media Systems
See the hottest new technologies from startups that could
transform computing.
In a lively and fast-paced exchange, the “Emerging Companies
Summit - CEO on Stage” sessions will feature CEOs from three
startups who will each have 15 minutes to introduce their
companies and interact with a panel of leading venture capitalists,
technology executives, and industry analysts.
Panelist(s): Nathan Brookwood (Research Fellow, Insight64),
Charles Carmel (VP of Corporate Business
Development, Cisco), Flip Gianos (General Partner,
CUDA C. We will cover building the neighbor list and calculating Vision, Imaging, Medical Imaging & Visualization
the forces on the GPU. To handle the case where a few particles
have significantly more neighbors than most other particles,
Wednesday, Sept 22, 17:00 (50 minutes)
we propose a hybrid data structure for the neighbor list that
Room A8
can achieve a good balance between performance and storage
efficiency. A CUDA C implementation of the technique for 2077 Catastrophic Risk Management: Fast and Flexible
Leonard-Jones forces can be found in the LAMMPS molecular with GPU Analytics
dynamics open source code. RMS will describe our experience leveraging GPUs and simple
Speaker(s): Peng Wang (Developer Technology Engineer, NVIDIA)
software architectural principles to deliver both spectacular
performance gains and enhanced flexibility in next generation
Topic(s): Molecular Dynamics
portfolio risk management applications.
Wednesday, Sept 22, 17:00 (60 minutes) Speaker(s): Philippe Stephan (CTO, RMS)
Marriott San Jose Ballroom Topic(s): Finance
WEDNESDAY
pipeline, and algorithmic and implementation-level details Speaker(s): Peter Lu (Post-Doctoral Research Fellow, Harvard
for key rendering stages. We cover several issues concerning University)
GPU efficiency, including those involving work scheduling, Topic(s): Computer Vision, Imaging, Life Sciences
parallelization of traditional stages, and balancing of rendering
workloads. We expect the audience to gain an in-depth exposure Wednesday, Sept 22, 17:00 (50 minutes)
to the state of research in programmable graphics, and an insight Room A7
into efficient pipeline design for irregular workloads.
2243 Microsoft RemoteFX - GPU Virtualization for Desktop
Speaker(s): Anjul Patney (Graduate Student, University of Centralization
California, Davis), Stanley Tzeng (Graduate Student, Learn about Microsoft’s upcoming GPU Virtualization feature,
University of California, Davis) RemoteFX, which will ship in Windows Server 2008 R2 SP1.
Topic(s): Computer Graphics, Film Microsoft RemoteFX enables GPUs to be hosted in the datacenter
as a service that can be shared by multiple users for streaming
Wednesday, Sept 22, 17:00 (50 minutes) the real-time and complete Windows 7 desktop experience
Room D to ultra-lightweight client devices anywhere on the corporate
2167 Designing a Geoscience Accelerator Library network. With Microsoft RemoteFX, users will be able to work
Accessible from High Level Languages remotely in a Windows Aero desktop environment, watch
full-motion video, enjoy Silverlight animations, and run 3D
Explore a library for geoscience applications on CUDA and
applications – all with the fidelity of local-like performance.
OpenCL platforms. Target applications span atmosphere, ocean,
geomorphology and porous media flows. These areas are linked Speaker(s): Tad Brockway (Product Unit Manager, Microsoft)
by common numerical techniques encapsulated in our library. Topic(s): Cloud Computing, Computer Graphics
We will review the scope of the library, its meta-programming
approaches, and its key design attributes. We will also Wednesday, Sept 22, 17:00 (50 minutes)
demonstrate its support for multi-GPU parallelism within and Room K
across address spaces and provide examples of its use from high
level languages including C, Fortran, and Python.
2282 GPU-Enabled Biomedical Imaging
The purpose of this presentation is to describe several novel
Speaker(s): Chris Hill (Principal Research Scientist, M.I.T), Alan biomedical imaging applications which make extensive use
Richardson (Graduate Student, M.I.T) of GPUs. In CT iterative reconstructions, for example, high
Topic(s): Programming Languages & Techniques, Algorithms performance computing is allowing us to see details and
& Numerical Techniques, Computational Fluid structures we previously were not able to discern.
Dynamics, Tools & Libraries
Speaker(s): Homer Pien (Director of the Laboratory for Medical
Imaging and Computations, Massachusetts General
Wednesday, Sept 22, 17:00 (50 minutes)
Hospital / Harvard Medical School)
Marriott Guadalupe Room
Topic(s): Medical Imaging & Visualization, High Performance
2178 Using GPUs to Track Changes in the Sun
Thursday, Sept 23, 09:00 (50 minutes) Thursday, Sept 23, 09:00 (50 minutes)
Room L Room A3
2030 High-Throughput Cell Signaling Network Learning 2138 Faster, Cheaper, Better – Hybridization of Linear
with GPUs Algebra for GPUs
Explore how GPUs are being used to enable high-throughput cell Learn how to develop faster, cheaper and better linear algebra
signaling network discovery and data-intensive computational software for GPUs through a hybridization methodology that is
systems biology more generally. Systems biology is transitioning built on (1) Representing linear algebra algorithms as directed
from a largely reductive discipline to one focused on building acyclic graphs where nodes correspond to tasks and edges to
predictive models of large-scale biological systems. New dependencies among them, and (2) Scheduling the execution
instrumentation will provide the necessary raw data for such an of the tasks over hybrid architectures of GPUs and multicore.
THURSDAY
approach; the key challenge now is building the hardware and Examples will be given using MAGMA, a new generation of
software tools to efficiently and interactively build these models. linear algebra libraries that extends the sequential LAPACK-
This session will describe how GPUs can and will play a key role style algorithms to the highly parallel GPU and multicore
in these efforts. heterogeneous architectures.
Speaker(s): Michael Linderman (Engineering Research Speaker(s): Hatem Ltaief (Sr. Research Associate, University of
Associate, Stanford University) Tennessee), Stan Tomov (Research Scientist,
Topic(s): Life Sciences, Algorithms & Numerical Techniques, University of Tennessee)
Machine Learning & Artificial Intelligence Topic(s): High Performance Computing, Algorithms &
Numerical Techniques, Tools & Libraries
Thursday, Sept 23, 09:00 (50 minutes)
Room A7 Thursday, Sept 23, 09:00 (50 minutes)
Room K
2033 Accelerating Pricing Models with virtual GPUs
Join Citadel to explore our three-year undertaking on the 2145 Photo Editing on the GPU with MuseMage
feasibility of GPGPU computing for option pricing. We will discuss See how MuseMage greatly accelerates image processing and
our 140X performance boost and the hurdles we had to overcome editing while providing real-time feedback by harnessing the
to integrate GPUs into our existing infrastructure. Please note power of GPUs. We will discuss the majority of MuseMage tools
that our talk will not get into the details of the model (that’s which are fully implemented on GPUs.
proprietary information), but we will share our innovative solution
Speaker(s): Kaiyong Zhao (Graduate Student, HKBU), Yubo
to drive a grid of virtual GPUs.
Zhang (PhD student, UC Davis)
Speaker(s): Scott Donovan (System Architect, Citadel Topic(s): Imaging
Investment Group)
Topic(s): Finance, High Performance Computing Thursday, Sept 23, 09:00 (50 minutes)
Marriott San Jose Ballroom
Thursday, Sept 23, 09:00 (50 minutes)
2156 GMAC: Global Memory For Accelerators
Room C
Learn how to use GMAC, a novel run-time for CUDA GPUs.
2048 H.264/AVC Video Encoding with CUDA and OpenCL GMAC unifies the host and device memories into a unified virtual
Join experts from MainConcept, a leading provider of video codecs address space, enabling the host code to directly access the
to the professional market, as they demonstrate the latest version device memory, and removing the need for data transfers between
of their CUDA-based H.264/AVC Encoder. host and device memories. Moreover, GMAC also allows pointers
to be used interchangeably by both the host and device code.
Speaker(s): Thomas Kramer (VP Product Management,
MainConcept) This session will present the GMAC run-time and show how to use
Topic(s): Video Processing, Tools & Libraries it in current applications. This session will cover everything from the basics
of GMAC to multi-threaded applications using POSIX threads,
OpenMP and MPI.
Speaker(s): Isaac Gelado (Lecturer and Researcher, Universitat Speaker(s): Frank Mueller (Associate Professor, North Carolina
Politecnica de Catalunya) State University), Xing Wu (Research Assistant,
Topic(s): Tools & Libraries North Carolina State University)
Topic(s): Tools & Libraries, High Performance Computing
Thursday, Sept 23, 09:00 (50 minutes)
Room A1 Thursday, Sept 23, 09:00 (20 minutes)
Room M
2202 A Programming Model and Tool for Automatic High-
Performance C to CUDA Mapping 2278 Strategies for Code Encapsulation in GPU
Discover our automatic C-to-CUDA mapper prototype, and how Implementations
it optimizes execution and data movement for a broad class of Code encapsulation is a common technique used to reduce code
loop codes. Coupled with our powerful mapper, C as an input complexity that a given programmer has to understand. It allows
language offers not only portability but also performance and the use of increasingly complex systems of hardware, software,
performance portability. Learn about our optimizations and some and algorithms to tackle increasingly difficult scientific problems.
of the performance obtained through different uses of the mapper. Unfortunately, code encapsulation is not easily attainable
in current GPU environments. We will share our OpenCL
Speaker(s): Benoit Meister (Senior Engineer, Reservoir Labs)
development experiences for achieving partial encapsulation in
Topic(s): Tools & Libraries
GPU implementations, and discuss best practices in this area.
Thursday, Sept 23, 09:00 (20 minutes) Speaker(s): Brian Cole (Developer, OpenEye Scientific Software)
Room A8 Topic(s): Programming Languages & Techniques, High
Performance Computing, Life Sciences
2206 Accelerated Computational Fluid Dynamics
Employing GPUs
Thursday, Sept 23, 09:00 (50 minutes)
Speaker(s): Daniel Gaudlitz (Project Manager, FluiDyna)
Room A2
Topic(s): Computational Fluid Dynamics,
High Performance Computing 2301 GPU Cluster Computing: Accelerating Scientific
Discovery
Thursday, Sept 23, 09:00 (50 minutes) We propose holding a research roundtable focussed on using
Room B GPU clusters to support scientific research. The roundtable
will bring together researchers that have recently deployed or
2236 A Work-Efficient GPU Algorithm for Level Set
are interested in deploying GPU clusters to enable scientific
Segmentation
research. At the research roundtable they will be able to share
Explore a novel GPU level set segmentation algorithm that their experiences in deploying this new technology and discuss
is both work-efficient and step-efficient. Our algorithm has the future of this technology in supporting research to tackle the
O(log n) step-complexity, in contrast to previous GPU algorithms world’s most challenging scientific problems.
which have O(n) step-complexity. We apply our algorithm to 3D
medical images and we show that in typical clinical scenarios, our To open discussion we will provide a brief presentation about
algorithm reduces the total number of processed level set field deployment of the CSIRO’s latest supercomputer cluster, which
elements by 16x and is 14x faster than previous GPU algorithms is among the world’s first to combine traditional CPUs with
with no reduction in segmentation accuracy. more powerful NVIDIA GPUs, that is providing a world class
computational and simulation science facility to advance priority
Speaker(s): Mike Roberts (Research Assistant, Hotchkiss Brain CSIRO science.
Institute, University of Calgary, Canada)
Topic(s): Medical Imaging & Visualization, Algorithms & Speaker(s): John Taylor (Science and Business Leader, CSIRO),
Numerical Techniques, Computer Vision, Dragan Dimitrovici (XENON Systems Pty Ltd)
Computer Graphics Topic(s): High Performance Computing
Thursday, Sept 23, 09:00 (50 minutes) Thursday, Sept 23, 09:00 (50 minutes)
Room D Keynote Hall
2272 GStream: A General-Purpose Data Streaming 4006 Fireside Chat with Jen-Hsun Huang - Co-founder &
Framework on GPUs CEO, NVIDIA
We present GStream, a general-purpose, scalable and C++ Jen-Hsun Huang will take part in a fireside chat with Quentin Hardy,
template run-time framework amenable to both the streaming National Editor at Forbes Magazine. They will discuss the rise of
problem and GPU architectures. GStream offers transparent GPUs, current trends in visual and parallel computing, and the
streaming data transmissions and automatic memory transformational changes ahead for the industry.
synchronization over a rich collection of computing resources Speaker(s): Quentin Hardy (National Editor, Forbes Magazine),
that are transparently allocated and reused. Various problems Jen-Hsun Huang (CEO & President, NVIDIA)
other than streaming applications, such as scientific computing, Topic(s): General Interest
numerical codes and text processing, can be easily expressed
using GStream and subsequently integrated with our GStream
Thursday, Sept 23, 09:30 (20 minutes)
library. GStream’s ease of use combined with efficient exploitation
Room A8
of GPU resources has the potential to lead to higher coding
productivity and application performance through our data- 2037 Numtech & GPGPU, a SME Point of View
centric specification paradigm. Hear why and how Numtech, a french SME working in the field of
atmospheric dispersion and expertise of meteorological events, is
benchmarking GPGPU for its future applications. A compressible Discusses how the results of this work may lead to better
and an incompressible interactive flow solver are described. diagnostics for detecting leukemia in blood cells.
Speaker(s): Emmanuel Buisson (CEO, Numtech) Speaker(s): Robert Zigon (Sr Staff Development Engineer,
Topic(s): Computational Fluid Dynamics, Physics Simulation Beckman Coulter)
Topic(s): Life Sciences
Thursday, Sept 23, 10:00 (50 minutes)
Room L Thursday, Sept 23, 10:00 (20 minutes)
Room A8
2001 Acceleration of the Freesurfer Suite for
Neuroimaging Analysis 2110 Acceleration of a Novel Rotorcraft Wake Simulation
See how GPU technology has dramatically accelerated the Dive deep as we present the details of a new CUDA-based
Freesurfer suite of tools used by thousands of researchers for the algorithm for accurate rotorcraft wake simulations. We use
analysis of neuroimaging data. a vortex particle method, accelerated with a multipole tree
algorithm, combined with a traditional grid-based CFD code. This
Speaker(s): Richard Edgar (Assistant in Neuroscience,
CUDA algorithm can evaluate the velocity and velocity-gradient
Massachusetts General Hospital, Harvard University)
with an effective throughput approaching 300 billion interactions
Topic(s): Medical Imaging & Visualization, Imaging,
per second on a C1060. This gives 10x speed-up and 2.5x better
Tools & Libraries
accuracy compared to the parallel CPU version.
Thursday, Sept 23, 10:00 (50 minutes) Speaker(s): Christopher Stone (Research Scientist,
Room A3 Intelligent Light)
Topic(s): Computational Fluid Dynamics, Algorithms &
2002 CUDA Debugging on Linux and MacOS with cuda-gdb
Numerical Techniques
Boost your development speed by mastering the CUDA debugging
tools NVIDIA provides. In this session you will learn the basics of Thursday, Sept 23, 10:00 (50 minutes)
cuda-gdb and cuda-memcheck, as well as their more advanced Room N
features with live demonstrations on Linux and MacOS.
2116 Real-time Multichannel Audio Convolution
Speaker(s): Satish Salian (Manager CUDA Debugger Tools,
Speaker(s): Bob Archer (Senior Computer Scientist, Adobe Speaker(s): Christopher Rossbach (Researcher, Microsoft
Systems Inc) Research), Emmett Witchel (Professor, University
of Texas at Austin)
Topic(s): Tools & Libraries
Topic(s): Programming Languages & Techniques,
Tools & Libraries
Thursday, Sept 23, 10:00 (50 minutes)
Room A2
2055 Application of Fermi GPU to Flow Cytometry and
Cancer Detection
Learn how a Tesla C2050 enabled scientists to explore cancer
data sets 400 times faster than a PC-only implementation.
Thursday, Sept 23, 10:00 (20 minutes) package provides results that are indistinguishable from the CPU
Room B code is extremely tricky and often the desire to take shortcuts to
boost performance can affect accuracy with unpredictable results.
2149 Overview of Parallel Nsight for Visual Studio
We have developed a comprehensive validation suite that can be
NVIDIA Parallel Nsight provides access to the power of the GPU used to perform the detailed testing that is required to ensure the
from within the familiar environment of Microsoft Visual Studio. approximations necessary for GPU performance do not impact the
This session is an entry level overview of the GPU computing and scientific results. Additionally we will discuss how we have made
graphics development features of Parallel Nsight as well as a careful use of mixed single and double precision arithmetic in
glimpse into the future of this powerful tool. the AMBER implementation to achieve equivalence in the results
Speaker(s): Kumar Iyer (Product Manager, NVIDIA) without excessively compromising performance. Finally we
Topic(s): Tools & Libraries provide examples of recent breakthrough simulations conducted
using GPU enabled AMBER 11.
Thursday, Sept 23, 10:00 (50 minutes) Speaker(s): Ross Walker (Research Professor, San Diego
Room A1 Supercomputer Center)
2176 Easy GPU Meta-programming: A Case Study in Topic(s): Molecular Dynamics
Biologically-Inspired Computer Vision
Learn how to let the computer optimize your CUDA and OpenCL Thursday, Sept 23, 10:00 (50 minutes)
code for you with easy GPU Meta-programming and Scripting (e.g. Keynote Hall
PyCUDA). We will present a case study in which we consider the 4007 Emerging Companies: CEO on Stage featuring
step-wise optimization of a 3D filter bank convolution, using a Aqumin, RTT, and Scalable Display Technologies
suite of open-source tools. See the hottest new technologies from startups that are
Speaker(s): Nicolas Pinto (PhD Student, MIT) transforming computing.
Topic(s): Tools & Libraries, Computer Vision, High In a lively and fast-paced exchange, the “Emerging Companies
Performance Computing, Neuroscience Summit - CEO on Stage” sessions will feature CEOs from three
startups who will each have 15 minutes to introduce their
Thursday, Sept 23, 10:00 (50 minutes) companies and interact with a panel of leading venture capitalists,
Room C technology executives, and industry analysts.
2215 Extending OpenCV with GPU Acceleration Panelist(s): Norman Winarsky (VP of Ventures, Licensing and
OpenCV is a widely popular computer vision library, with millions Strategic Programs, SRI), Savitha Srinivasan
of downloads and hundreds of thousands of users. Applications (Corporate Venture Partner, IBM), Rob Enderle
span many industries including robotics, industrial machine (Analyst, Enderle Group), Jeff Herbst (Vice President
vision, automotive, film & broadcast, medical, and consumer of Business Development, NVIDIA)
applications. NVIDIA and the OpenCV development team are Speaker(s): Andrew Jamison (CEO, Scalable Display
collaborating to provide CUDA implementations of the most Technologies), Jeroen Snepvangers (CEO, RTT),
demanding algorithms, thus enabling a new level of real-time Michael Zeitlin (CEO, Aqumin)
capability and higher quality results. Topic(s): General Interest, Finance, Imaging,
This talk will introduce OpenCV, summarize the new
CUDA-enabled capabilities, and provide an overview of future plans.
Thursday, Sept 23, 10:30 (20 minutes)
Speaker(s): Joe Stam (Sr. Applications Engineer, NVIDIA) Room A8
Topic(s): Computer Vision, Imaging, Stereoscopic 3D,
Video Processing 2061 Accelerating Explicit FEM Shock & Blast Simulations
Explicit finite element codes are widely used to simulate the
Thursday, Sept 23, 10:00 (50 minutes) response of structures and mechanical equipment subjected to
Marriott San Jose Ballroom shock, blast and wave propagation phenomena. High resolution
models with run times ranging from a few seconds to a few
2269 Bringing GPUs to Mainstream Molecular months are common and hence the payoff from GPU acceleration
Dynamics Packages is tremendous. We describe the acceleration of our commercial
Recent work in close collaboration with NVIDIA has produced a finite element code NLFLEX using CUDA. We developed GPU
GPU accelerated version of the AMBER Molecular Dynamics Code kernels in CUDA based on our production code NLFLEX, for linear
PMEMD that runs between 20 and 130 times the speed of a single elasticity, explosives, elasto-plasticity and large deformation
2.8GHz Intel Nehalem Processor, with even higher performance elasticity. We attained an order-of-magnitude (10X) acceleration in
on multiple GPUs, but which does not make sacrifices in the single precision and approximately 5X in double precision mode.
accuracy or validity of such calculations to achieve this. The GPU
Speaker(s): Nachiket Gokhale (Senior Research Engineer,
accelerated version supports both explicit solvent particle mesh
Weidlinger Associates Inc)
Thursday, Sept 23, 11:00 (50 minutes) Thursday, Sept 23, 11:00 (50 minutes)
Room N Room A1
2042 Interactive 3D Audio Rendering Systems 2075 GPU-Accelerated Video Encoding
Learn how to leverage GPUs for interactive audio rendering. This Learn how to accelerate video encoding using the GPU. We
session will give a short overview of the architecture of current will give an overview of the typical video encoding pipeline and
GPUs, emphasizing some key differences between GPU and CPU discuss how different parts of the pipeline can be ported to
programming models for audio processing. We will illustrate the GPU using various approaches. We will focus on block-based
benefits of GPU-accelerated audio rendering with results from Motion Estimation, in particular, as it is the cornerstone of video
3D audio processing and sound scattering simulations. Finally, encoding algorithms. The efficiency of its implementation on the
we will discuss best practices for GPU implementations as well GPU is crucial to the speed and quality of the encoder.
as future opportunities for audio rendering on massively parallel
Speaker(s): Anton Obukhov (Developer Technology Engineer,
architectures.
NVIDIA)
Speaker(s): Nicolas Tsingos (Senior Staff Engineer, Topic(s): Video Processing
Dolby Laboratories)
Topic(s): Audio Processing, Ray Tracing, Signal processing
Thursday, Sept 23, 11:00 (50 minutes) Speaker(s): Daniel Ayres (PhD Candidate, University of Maryland)
Room A7 Topic(s): Life Sciences
Thursday, Sept 23, 11:00 (20 minutes) Speaker(s): Nathan Bell (Research Scientist, NVIDIA Research)
Room A8 Topic(s): Tools & Libraries
separation distance computation, moment computation, etc.) See the hottest new technologies from startups that are
that are one to two orders of magnitude faster, and often more transforming computing.
accurate, than current commercial CPU implementations. We will
touch on strategies we have employed to meet GPU programming In a lively and fast-paced exchange, the “Emerging Companies
challenges, such as the separation of CPU/GPU operations, Summit - CEO on Stage” sessions will feature CEOs from three
imposing artificial structure on computations, and transforming startups who will each have 15 minutes to introduce their
problem definitions to suit GPU-computation models. companies and interact with a panel of leading venture capitalists,
technology executives, and industry analysts.
Speaker(s): Sara McMains (Associate Professor, University of
Panelist(s): Rob Enderle (Analyst, Enderle Group), Jeff Herbst
California Berkeley), Adarsh Krishnamurthy
(Vice President of Business Development, NVIDIA),
(Student, University of California Berkeley)
Savitha Srinivasan (Corporate Venture Partner, IBM),
Topic(s): Algorithms & Numerical Techniques, Tools &
Norman Winarsky (VP of Ventures, Licensing and
Libraries, Computer Graphics
Strategic Programs, SRI)
Speaker(s): David Peters (Founder and CEO, Universal
Thursday, Sept 23, 11:00 (50 minutes)
Robotics), David Hayes (CEO, ICD)
Room C
Topic(s): General Interest, Machine Learning & Artificial
2173 Enabling Large-Scale CCTV Face Recognition Intelligence, Mobile Devices
Learn how to use CUDA and GPGPU to perform large scale face
search for both forensics as well as CCTV face recognition. Thursday, Sept 23, 11:30 (20 minutes)
Speaker(s): Ben Lever (Senior Research Engineer, NICTA),
Room A8
Abbas Bigdeli (Senior Researcher and 2106 Particleworks: Particle-based CAE Software on
Technology Manager, NICTA) Multi-GPU
Topic(s): Computer Vision, Video Processing Prometech Software, Inc. is a university-launched technology
venture in Japan and has been working in the field of particle-
Thursday, Sept 23, 11:00 (50 minutes) based computational fluid dynamics for several years. Through
Room A2 collaborations with major automotive and material companies in
2203 Modeling Evolution Computing the Tree of Life Japan, Prometech has implemented our Particle technology on
Multi-GPU and delivered it as CAE software, “Particleworks”. In
Learn how GPUs are being used to accelerate our understanding
Normal” For Building Emerging Companies Based On
every generation. This talk will discuss how to exploit the Disruptive Technologies
new features introduced by the Fermi architecture (such as
Moderated by Jeff Herbst – Vice President of Business
concurrent kernel execution, writes to texture) to accelerate
Development, NVIDIA
computer vision algorithms.
Speaker(s): James Fung (Developer Technology, NVIDIA) Start-ups are facing unique challenges as a result of the current
economic and business environment. Not only is the venture
Topic(s): Computer Vision, Tools & Libraries
funding environment very difficult, but small companies are
finding it increasingly difficult to “break out” of the pack through
Thursday, Sept 23, 14:00 (50 minutes)
IPOs and attractive M&A exits. This panel of experts (which
Room A3
includes VC and corporate investors) will attempt to assess the
2210 GPU-Ocelot: An Open Source Debugging and current state of both the public and private markets, and will
Compilation Framework for CUDA explore various strategies and options for building successful
Learn how to debug and profile CUDA applications using GPU- companies in this “new” environment. Topics will include
Ocelot. Ocelot is a compilation and emulation framework for traditional forms of equity and debt, angel financing, as well as
CUDA that includes debugging and profiling tools as well as other creative/strategic financing options (e.g., NRE arrangements,
backend compilers for NVIDIA GPUs and x86 CPUs. We will strategic partnerships, etc.). The discussion promises to be both
present examples of applications developed on x86 CPUs lively and provocative.
and deployed on NVIDIA GPUs. We will also discuss memory Panelist(s): Gerald Brady (Managing Director, Silicon Valley
checking, race detection, and deadlock detection tools available Bank), Bill Frauenhofer (Managing Director,
within Ocelot. Citigroup Global Markets), Garrett Herbert (Partner,
Speaker(s): Gregory Diamos (PhD Student, Georgia Institute M&A Transaction Services, Deloitte & Touche
of Technology), Andrew Kerr (PhD Student, Georgia LLP), Eric Jensen (Partner, Business Department
Institute of Technology), Sudhakar Yalamanchili Chair, Cooley LLP), Andrew T. Sheehan (Managing
(Professor, Georgia Institute of Technology) Director, Sutter Hill Ventures)
Topic(s): Tools & Libraries Topic(s): Finance, General Interest
Thursday, Sept 23, 14:00 (50 minutes) Thursday, Sept 23, 14:30 (20 minutes)
Room B Room A8
2220 Thrust by Example: Advanced Features and 2240 Accelerating LS-DYNA with MPI, OpenMP, and CUDA
Techniques When solving implicit problems, the computational bottleneck
Thrust is a parallel template library for developing CUDA in LS-DYNA is the multifrontal linear solver. These operations
applications which is modeled after the C++ Standard Template are performed with double precision arithmetic, hence until the
Library (STL). In this session we’ll show how to arrival of the Tesla 2050, experiments with GPU acceleration
decompose problems into the algorithms provided by Thrust. were only a curiosity. This is no longer the case, and in this
We’ll also discuss the performance implications of “kernel talk we will describe how LS-DYNA’s hybrid (MPI and OpenMP)
fusion” and “array of structs” vs. “structure of arrays” memory solver is further accelerated using GPUs to factor large dense
frontal matrices.
Speaker(s): Bob Lucas (Computational Sciences Division 10x faster performance compared to sequentially processing ASR
Director, University of Southern California) on a CPU. The state-of-the-art algorithm for ASR performs a graph
Topic(s): High Performance Computing, Algorithms & traversal on a large, irregular graph with millions of states and
Numerical Techniques arcs, guided by speech input only known at runtime. We present
four generalizable techniques including: dynamic data-gather
Thursday, Sept 23, 15:00 (50 minutes) buffer, find-unique, lock-free data structures using atomics,
Room K and hybrid global/local task queues. When used together, these
techniques can effectively resolve ASR implementation challenges
2003 Using CUDA to Accelerate Radar Image Processing on a GPU.
Come see how current GPU technology provides the means for
the first portable real-time radar image processing algorithm. Speaker(s): Jike Chong (Principal Software Architect,
This session will outline how the GPU has afforded nearly three Parasians, LLC)
orders of magnitude improvement in performance for Synthetic Topic(s): Machine Learning & Artificial Intelligence,
Aperture Radar’s (SAR) hallmark image processing algorithm. Algorithms & Numerical Techniques,
We will present algorithm details and further improvements. Audio Processing
Thursday, Sept 23, 15:00 (50 minutes) Thursday, Sept 23, 15:00 (20 minutes)
Room A2 Room A8
2105 CUDA-FRESCO: An Efficient Algorithm for Mapping 2213 BCSLIB-GPU: Significant Performance Gains for CAE
Short Reads Hear product architects and developers describe the algorithmic
Learn about CUDA-FRESCO and how it addresses issues with depths and high level breadth of the use of GPUs that have been
MUMmerGPU. We will detail how CUDA-FRESCO overcomes employed to create BCSLIB-GPU, the GPU enablement of the
MUMmerGPU’s problems processing reads with errors or industry standard sparse matrix software suite, BCSLIB-EXT.
mismatches and delivers additional performance beyond We provide a range of comparison data with Tesla and Fermi
MUMmerGPU’s 5-12x speedup with less than 100bp query length. compared with multi-core CPU only systems and for a wide range
of realistic, demanding real-world test problems.
Speaker(s): Chun-Yuan Lin (Assistant Professor, Department of
CSIE, Chang Gung University) Speaker(s): Danl Pierce (Partner, Access Analytics Int’l, LLC)
Topic(s): Life Sciences, Algorithms & Numerical Techniques, Topic(s): Tools & Libraries, Algorithms & Numerical
Tools & Libraries Techniques, High Performance Computing,
Embedded & Automotive
Thursday, Sept 23, 15:00 (50 minutes)
Room L Thursday, Sept 23, 15:00 (50 minutes)
Keynote Hall
2107 Accelerating Stereographic and Multi-View Images
Using Layered Rendering 4010 Emerging Companies: CEO on Stage featuring
Explore applications of geometry shaders in improving the NaturalMotion Ltd, OptiTex, and Useful Progress
performance of stereo pair or multi-viewer image generation. See the hottest new technologies from startups that are
This session will cover the basic approach of single-pass stereo- transforming computing.
pair creation and provide guidelines for when layered rendering
In a lively and fast-paced exchange, the “Emerging Companies
can be used to increase performance. A particular emphasis
Summit - CEO on Stage” sessions will feature CEOs from three
will be placed on virtual reality and scientific visualization, but
startups who will each have 15 minutes to introduce their
the techniques discussed apply to a wide range of rendering
companies and interact with a panel of leading venture capitalists,
environments. Results will be shown for three GPU architectures,
technology executives, and industry analysts.
including the new GF100 GPU.
Panelist(s): Tim Bajarin (President, Creative Strategies),
Speaker(s): Jonathan Marbach (Director of Software
Jeff Herbst (Vice President of Business
Architecture and Engineering, TerraSpark
Development, NVIDIA), Bill Tai (General Partner,
Geosciences, LLC)
CRV), Paul Weiskopf (Sr. VP of Corporate
Topic(s): Stereoscopic 3D
Development, Adobe)
Speaker(s): Yoram Burg (President, OptiTex), Sylvain Ordureau
Thursday, Sept 23, 15:00 (50 minutes)
(CEO, Useful Progress), Torsten Reil (CEO,
Room C
NaturalMotion Ltd)
2123 Enabling Augmented Reality with GPU Computing Topic(s): General Interest, Medical Imaging & Visualization,
This talk will take a detailed look at Sportvision’s “First and 10” Physics Simulation, Computer Graphics
system, perhaps the most widely experienced example of AR ever,
with 106 million viewers during the 2010 Super Bowl alone. We’ll Thursday, Sept 23, 15:30 (20 minutes)
examine the current implementation and the GPU features that Room A7
enable low latency, video-rate performance.
2063 Banking on Monte Carlo… and Beyond
Speaker(s): Ryan Ismert (Director of Engineering, Sportvision, Inc.) Last year NAG presented spectacular results for Monte Carlo
Topic(s): Computer Vision techniques on GPUs using NAG’s GPU library. This year we will
talk about new projects in the areas of Monte Carlo and PDE
Thursday, Sept 23, 15:00 (50 minutes) techniques, delivering additional benefits to the finance industry
Room A3 for real-world problems, including credit modeling.
2153 CULA - A Hybrid GPU Linear Algebra Package Speaker(s): Ian Reid (Chief Commercial Officer, NAG)
Get the latest information on CULA, an implementation of Topic(s): Finance
hybrid GPU/CPU linear algebra solvers for NVIDIA GPUs. CULA
launched at GTC2009 and has since received large speedups and Thursday, Sept 23, 15:30 (20 minutes)
many new features. We will cover all the features, old and new, Room A8
along with performance, inner workings, and how users can
2208 Acceleration of SIMULIA’s Abaqus Solver on
integrate CULA into their applications. Learn how your existing
NVIDIA GPUs
linear algebra applications can benefit from a high quality library.
Much more information is available at www.culatools.com and at Learn about Acceleware’s and Dassault Systemes’ integrated
our presentation and booth. solution that performs an LDL^T factorization on GPUs within
the Abaqus software package. We will discuss efficient GPU
parallelization of the factorization algorithm and enabling the how GPU-based computation enables visual servoing and box
CPU and GPU to overlap their computations and data transfers. moving. We also discuss the potential of the GPU to solve more
Includes an end user simulation case study and GPU performance difficult sensory problems such as multi-robot cooperation,
measurements including 300 GFlops in single precision and 145 multimodal sensor binding, attention, sensitization, and
GFlops in double precision on NVIDIA Tesla C2050. habituation.
Speaker(s): Chris Mason (Product Manager, Acceleware) Speaker(s): Dr. Alan Peters (CTO, Universal Robotics, Inc.)
Topic(s): High Performance Computing Topic(s): Machine Learning & Artificial Intelligence
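For readers unfamiliar with the factorization named in session 2208: an LDL^T factorization writes a symmetric matrix A as L * D * L^T, where L is unit lower triangular and D is diagonal. The following is only a minimal dense, unpivoted CPU sketch in Python to show the arithmetic. It is not the Acceleware or Abaqus implementation, and the function name `ldlt` is a hypothetical one chosen for this illustration.

```python
def ldlt(a):
    """Factor a symmetric matrix a (list of lists) as L * D * L^T.

    Returns (l, d): l is unit lower triangular, d holds the diagonal of D.
    No pivoting is done, so this assumes no zero pivot is encountered
    (true, for example, when a is symmetric positive definite).
    """
    n = len(a)
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        # Diagonal entry: remove the contribution of earlier columns.
        d[j] = a[j][j] - sum(l[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, n):
            # Entries below the diagonal in column j.
            s = sum(l[i][k] * l[j][k] * d[k] for k in range(j))
            l[i][j] = (a[i][j] - s) / d[j]
    return l, d


# Small symmetric positive definite example.
a = [[4.0, 12.0, -16.0],
     [12.0, 37.0, -43.0],
     [-16.0, -43.0, 98.0]]
l, d = ldlt(a)
```

For this example matrix the factors are D = diag(4, 1, 9) and L with rows (1, 0, 0), (3, 1, 0), (-4, 5, 1). Each column depends on all earlier columns, which is why efficient parallelization and CPU/GPU overlap are the focus of the talk.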
Thursday, Sept 23, 14:00 (50 minutes) Thursday, Sept 23, 16:00 (50 minutes)
Room A5 Room A1
2008 OpenCL Optimization 2095 Building High Density Real-Time Video
Learn how to optimize your OpenCL application to achieve Processing Systems
maximum performance on NVIDIA GPUs. We will first briefly Learn how GPU Direct can be used to effectively build real time,
discuss how the OpenCL programming model maps onto NVIDIA high performance, cost effective video processing products. We
GPU’s architecture. We will then talk about memory, instruction, will focus especially on how to optimize bus throughput while
and NDRange optimization techniques, illustrating each with keeping CPU load and latency minimal.
small code samples.
Speaker(s): Ronny Dewaele (Director Technology Center, Barco)
Speaker(s): Peng Wang (Developer Technology Engineer, NVIDIA) Topic(s): Video Processing, Imaging
Topic(s): Tools & Libraries, High Performance Computing
Thursday, Sept 23, 16:00 (50 minutes)
Thursday, Sept 23, 16:00 (50 minutes) Room A3
Room B
2100 Hybrid GPU/Multicore Solutions for Large Linear
2086 GPGPU DL_POLY Algebra Problems
Discover DL_POLY. Large linear algebra problems may be solved using recursive
DL_POLY: an MD code ICHEC has ported to CUDA. The block decomposition in which GPUs efficiently compute the sub-
presentation especially focuses on the auto-tuning of the work blocks and multicore CPUs put the sub-blocks back together
distribution between CPU and GPU within a large shared memory space. This talk will present
Speaker(s): Gilles Civario (Head of Capability Computing and benchmark results for such a hybrid approach, implemented in
Novel Architecture Group, ICHEC) Matlab® and using Jacket® to access the GPU compute power.
Topic(s): Molecular Dynamics, High Performance Computing Speaker(s): Nolan Davis (Research Scientist, SAIC)
Topic(s): High Performance Computing, Algorithms &
Thursday, Sept 23, 16:00 (50 minutes) Numerical Techniques, Signal processing
Room A2
2088 Nucleotide String Matching Using Thursday, Sept 23, 16:00 (50 minutes)
CUDA-Accelerated Agrep Room C
Dive deep into the intelligent utilization of various CUDA 2114 Cascaded HOG on GPU
memory spaces to remarkably speed up approximate DNA/ We propose a real time HOG based object detector implemented
RNA nucleotide sequence matching algorithm in bioinformatics on GPU. To accelerate the detection process, the proposed
by an amazing factor of 67 compared to multi-threaded quad method uses two serially-cascaded HOG detectors. The first low
core CPU counterpart. Our talk provides a very good example dimensional HOG detector discards detection windows obviously
to demonstrate how to use indexable array to save frequently not showing target objects. It reduces the computational cost of
updated variables directly into GPU registers, how to organize the second high dimensional HOG detector. This method was tested on
shared memory into a 2D array to avoid bank conflict, and how to 640x480 color image and the same size movie. The computation
shuffle the data structure to satisfy the requirement for coalesced time decreases to 70ms per image. That is 4 times faster
global memory access. Our CUDA implementation employs an online than with a single detector. This method provides real time
approach and can be applied in real time. performance even on middle end GPUs such as GeForce GTS 250.
Speaker(s): Hongjian Li (Graduate Student, The Chinese Speaker(s): Kento Tarui (Researcher, AquaCast Corporation)
University of Hong Kong) Topic(s): Computer Vision, Machine Learning &
Topic(s): Life Sciences, Algorithms & Numerical Techniques Artificial Intelligence
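As background for session 2088: agrep is the approximate-matching tool by Wu and Manber, built on the bit-parallel "bitap" technique. The abstract does not spell out the algorithm, so the following is only a plain CPU sketch in Python of bitap matching with up to k edits (substitutions, insertions, deletions). It is not the speakers' CUDA code, and `bitap_search` is a hypothetical name used for this illustration.

```python
def bitap_search(text, pattern, k):
    """Return end indices in text where pattern matches with at most k edits."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must not be empty")
    full = (1 << m) - 1
    # masks[ch] has bit j set when pattern[j] == ch.
    masks = {}
    for j, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << j)
    # r[d] bit j set: pattern[:j+1] matches a suffix of the text read so far
    # with at most d edits. Initially only deletions can explain a prefix.
    r = [((1 << d) - 1) & full for d in range(k + 1)]
    hits = []
    for i, ch in enumerate(text):
        mask = masks.get(ch, 0)
        prev_old = r[0]                      # old value of r[d - 1]
        r[0] = ((r[0] << 1) | 1) & mask      # exact extension only
        for d in range(1, k + 1):
            old = r[d]
            match = ((old << 1) | 1) & mask
            insertion = prev_old                           # extra text character
            subst_or_del = ((prev_old | r[d - 1]) << 1) | 1
            r[d] = (match | insertion | subst_or_del) & full
            prev_old = old
        if r[k] & (1 << (m - 1)):
            hits.append(i)
    return hits


exact = bitap_search("TTACGTTT", "ACGT", 0)
one_error = bitap_search("TTACGTTT", "ACCT", 1)
```

Here `exact` is `[5]` and `one_error` is also `[5]`: the pattern "ACCT" matches the text's "ACGT" with one substitution. The state is a handful of machine words updated per text character, which is the kind of frequently updated data the talk proposes keeping in GPU registers.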
Thursday, Sept 23, 16:00 (50 minutes) Thursday, Sept 23, 16:00 (50 minutes)
Room N Room K
2091 The GPU in the Reactive Control of Industrial Robots 2126 Accelerating Signal Processing: Introduction to
Universal Robotics is using GPUs for real-time visual sensing GPU VSIPL
in the reactive control of industrial robots. For a robot to work Learn how to use the Vector Signal Image Processing Library
in a complex dynamic environment to achieve a more loosely to accelerate signal processing applications without needing to
specified goal, such as moving arbitrary boxes from a pallet to a understand platform-specific programming and optimization
conveyor, requires reactivity. Reactive control requires intensive, techniques. We will discuss how GPU VSIPL implements
concurrent, low-latency computation for motion planning, the VSIPL API and uses CUDA-capable GPUs to maximize
exception handling, and sensing. We describe and demonstrate performance of several example applications.
Speaker(s): Dan Campbell (Research Engineer, Georgia Tech intended to provide a pragmatic guide to creating prosumer 3D
Research Institute) video content and how the GPU greatly assists and speeds up this
Topic(s): Signal processing, Tools & Libraries process. The intended audience is anyone interested in how to
create compelling 3D movies at a prosumer level.
Thursday, Sept 23, 16:00 (20 minutes) Speaker(s): Rudy Sarzo (Principal, SMI), Ian Williams (Director
Room A8 PSG Applied Engineering, NVIDIA), Kevan O’Brien
2133 3D Full Wave EM Simulations Accelerated by GPU (NVIDIA)
Computing Topic(s): Digital Content Creation (DCC)
3D Full Wave Electromagnetic simulations of RF components,
antennas, printed circuit boards, can be quite time consuming. Thursday, Sept 23, 16:00 (50 minutes)
Computer Simulation Technology (CST) toolsuite includes the Room L
capability to activate GPU Computing. Examples will be shown of 2283 500 Teraflops Heterogeneous Cluster
using Tesla C1060 and S1070 configurations to provide significant
HPC Affiliated Resource Center (ARC) will be host of a very large
performance improvement of complex simulations.
interactive HPC. The large cluster (CONDOR) will integrate cell
Speaker(s): Fabrizio Zanella (Systems Manager, CST of America) broadband engine processors, GPGPUs and powerful x86 server
Topic(s): High Performance Computing, nodes, with a combined capability of 500 Teraflops. Applications
will include neuromorphic computing, video synthetic aperture
Thursday, Sept 23, 16:00 (20 minutes) radar backprojection, matrix multiplications, and others. This
Room A7 presentation will discuss progress on performance optimization
using the Heterogeneous Cluster and lessons learned from
2136 Pseudo Random Number Generators for Massively this research.
Parallel Apps
Learn how to select the best and fastest pseudo random number Speaker(s): Mark Barnell (HPC Director, Air Force Research
generator for your massively parallel Monte Carlo simulation. Lab (AFRL))
Pseudo random number generators (PRNGs) are a fundamental Topic(s): High Performance Computing
building block of these simulations and it is thus required to
select suitable PRNGs with regard to the specific problem at Thursday, Sept 23, 16:00 (50 minutes)
hand while considering the parallel hardware architecture. Keynote Hall
Recent developments in random number generation provide 4011 Emerging Companies: CEO on Stage featuring
a wide variety of choices, each with different properties and Cinnafilm Inc., Perceptive Pixel, and Total Immersion
trade-offs. We provide a comprehensive survey of the current
See the hottest new technologies from startups that are
state of the art for massively parallel PRNG and show a broad
transforming computing.
range of applications.
In a lively and fast-paced exchange, the “Emerging Companies
Speaker(s): Holger Dammertz (PhD Student, Ulm University)
Summit - CEO on Stage” sessions will feature CEOs from three
Topic(s): Algorithms & Numerical Techniques, Finance
startups who will each have 15 minutes to introduce their
companies and interact with a panel of leading venture capitalists,
Thursday, Sept 23, 16:00 (50 minutes) technology executives, and industry analysts.
Room A5
Panelist(s): Tim Bajarin (President, Creative Strategies),
2271 Compose CUDA Masterpieces! Write better, Jeff Herbst (Vice President of Business
Leverage More Development, NVIDIA), Bill Tai (General Partner,
Not all CUDA code is created equally. Learn how to step up your CRV), Paul Weiskopf (Sr. VP of Corporate
CUDA game. Also, learn how to build large, multi-person CUDA Development, Adobe)
projects for your organization. In very clear descriptions, learn Speaker(s): Lance Maurer (CEO, Cinnafilm, Inc.), Bruno Uzzan
the difference between naïve GPU code, intermediate GPU code, (Founder and CEO, Total Immersion), Jeff Han
and advanced GPU mastery. We show how careful construction (Founder and Chief Scientist, Perceptive Pixel)
of CUDA kernels can affect application performance. We also
Topic(s): General Interest, Computer Vision, Film, Imaging
discuss how Jacket tools greatly facilitate the development of
CUDA-based projects. Finally, we will debut the Jacket runtime’s
Thursday, Sept 23, 16:30 (20 minutes)
new C/C++ library. With this library, the technical computing
Room A8
functions in Jacket’s MATLAB engine are made available in C/C++.
2066 Accelerating System Level Signal Integrity Simulation
Speaker(s): James Malcolm (VP of Engineering, AccelerEyes)
Discuss how GPU acceleration for key parts of the ANSYS Nexxim
Topic(s): Tools & Libraries
Simulator resulted in significant speedup over multi-core
processors. We will cover time consumption and data parallelism
Thursday, Sept 23, 16:00 (50 minutes)
exposure considerations, and focus on key areas where GPU
Room D
acceleration was applied including convolution and Eye rendering.
2279 Working Man’s Guide to 3D Video Editing
Speaker(s): Danil Kirsanov (Scientist, ANSYS), Ekanathan
Video editing is currently at two simultaneous inflection points: Palamadai (Research & Development Engineer,
use of GPUs for video processing and the beginning of ANSYS)
widespread adoption of 3D. At this time, however, identifying and
Topic(s): Physics Simulation, Algorithms & Numerical
navigating through the necessary tools and equipment to create
Techniques, Signal processing
compelling 3D video content is challenging. This session is
Thursday, Sept 23, 16:30 (20 minutes)
Room A7
2101 Pricing American Options Using GPUs
This presentation focuses on the challenging problem of Pricing
High-Dimensional American Options (PHAO) and how GPUs can
be involved in this task. On the one hand, we present a method
based on Malliavin calculus which is effective for parallel
architectures. On the other hand, we compare this method
with the Longstaff & Schwartz method, which is better suited to
sequential architectures. We will conclude with some ideas about
the parallelization of the former method on a cluster of machines
and finally we will discuss this method considering it as a
reformulation of a non-linear parabolic problem using BSDEs.
Speaker(s): Lokman A. Abbas-Turki (PhD Student in Applied
Mathematics, Paris-Est University)
Topic(s): Finance, Physics Simulation
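To make the comparison in session 2101 concrete, here is a minimal one-dimensional sketch of the Longstaff & Schwartz least-squares Monte Carlo idea for an American put, in plain Python. It is an illustration under stated assumptions, not the speaker's method: the talk concerns high-dimensional options and a Malliavin-calculus approach, whereas this sketch uses a single asset, a quadratic regression basis, and hypothetical helper names (`solve3`, `american_put_lsm`).

```python
import math
import random


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def american_put_lsm(s0, strike, r, sigma, t, steps, paths, seed=0):
    """Estimate an American put price by least-squares Monte Carlo."""
    rng = random.Random(seed)
    dt = t / steps
    disc = math.exp(-r * dt)
    drift = (r - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    # Simulate geometric Brownian motion; prices[i][k] is path i at step k.
    prices = []
    for _ in range(paths):
        s, path = s0, [s0]
        for _ in range(steps):
            s *= math.exp(drift + vol * rng.gauss(0.0, 1.0))
            path.append(s)
        prices.append(path)
    cash = [max(strike - p[steps], 0.0) for p in prices]
    # Backward induction: each step depends on the result of the later step.
    for k in range(steps - 1, 0, -1):
        cash = [c * disc for c in cash]
        itm = [i for i in range(paths) if prices[i][k] < strike]
        if len(itm) < 3:
            continue
        # Least-squares fit of continuation value on (1, x, x^2), x = S / K.
        xs = [prices[i][k] / strike for i in itm]
        pw = [sum(x ** p for x in xs) for p in range(5)]
        a = [[pw[i + j] for j in range(3)] for i in range(3)]
        b = [sum(cash[i] * x ** p for i, x in zip(itm, xs)) for p in range(3)]
        beta = solve3(a, b)
        for i, x in zip(itm, xs):
            cont = beta[0] + beta[1] * x + beta[2] * x * x
            exercise = strike - prices[i][k]
            if exercise > cont:
                cash[i] = exercise
    return disc * sum(cash) / paths


price = american_put_lsm(36.0, 40.0, 0.06, 0.2, 1.0, 25, 4000)
```

The path simulation parallelizes trivially, but the backward loop with a regression across all paths at every exercise date is a sequential dependency, which is consistent with the abstract's remark about this method and sequential architectures.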
A03 - Particle-In-Cell Simulations on the GPU A07 - A Hybrid Method for Solving Tridiagonal
Particle-In-Cell simulations represent an Systems on GPU
important technique in the field of kinetic plasma Tridiagonal linear systems are of importance
simulations. 2D particle pushing and conserved to many problems in numerical analysis and
current aggregation has been implemented in computational fluid dynamics, as well as to
NVIDIA RESEARCH SUMMIT
CUDA. On a TESLA C1060 the CUDA code is 4 computer graphics applications in video games
times faster than SSE2 optimized code on a quad and computer-animated films. This poster
core INTEL XEON processor. presents our study on the performance of multiple
POSTER LISTING
vision and other related fields. However, high We present GPU algorithms and strategies for
computation cost prevents applying this method accelerating distance queries and clearance
to real-time and interactive scenarios. This computations on models made of trimmed
work intensively used parallel design patterns NURBS surfaces. We provide a generalized
that are implemented in the thrust library, framework for using GPUs as co-processors
like compaction, reduction and scattering, to in accelerating CAD operations. The accuracy
parallelize the particle level set method in order to of our algorithm is based on the model space
attain real-time performance. precision, unlike earlier graphics algorithms that
Author: Wen Zheng (Stanford University) were based only on image space precision. Our
algorithms are at least an order of magnitude
faster and about two orders of magnitude more
A12 - Accelerating CUDA Graph Algorithms at
accurate than the commercial solid modeling
Maximum Warp
kernel ACIS.
Graphs are powerful data representations favored
Author: Adarsh Krishnamurthy (University of
California, Berkeley)
in many computational domains. GPUs have
shown promising results in this domain, but their
performance when the graph is highly irregular. A16 - Gate-Level Simulation with GP-GPUs
In this study, we propose three general schemes This poster describes my research work on how
to accelerate graph algorithms on a modern GPU to leverage the GP-GPU execution parallelism to
architecture: (i) deferred processing of outliers, achieve high performance in the time consuming
(ii) efficient dynamic workload balancing and problem of gate-level simulation of digital
(iii) warp-based execution exploiting threads hardware designs.
A17 - CUDA Implementation of Barrier Option C02 - Efficient Automatic Speech Recognition on
Valuation using Jump-Diffusion Model and the GPU
Brownian Bridge Automatic speech recognition (ASR) technology
Impressive speedups up to 100x using GPUs is emerging as a critical component in data
compared to CPUs are achieved by taking analytics for a wealth of media data being
advantage of data parallelism, increased bandwidth generated every day. ASR-based applications
and the ability to hide latency. We have contain fine-grained concurrency that has great
implemented a Monte Carlo valuation of a barrier potential to be exploited on the GPU. However,
option modeled by a standard diffusion process the state-of-the-art ASR algorithm involves a highly
with a jump diffusion term obeying an underlying parallel graph traversal on an irregular graph
Poisson process to account for rare events. In with millions of states and arcs, making efficient
addition, a Brownian Bridge is incorporated to parallel implementations highly challenging. We
account for barrier crossings in between diffusion present four generalizable techniques including:
trajectories and to reduce bias. This option is dynamic data-gather buffer, find-unique, lock-
representative of exotic options which lack a free data structures using atomics, and hybrid
closed-form solution and are amenable to Monte global/local task queues. When used together,
Carlo type methods for valuation. these techniques can effectively resolve ASR
Author: Vincent Natoli (Stone Ridge Technology) implementation challenges on an NVIDIA GPU.
Author: Jike Chong (Parasians, LLC)
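The Brownian bridge correction mentioned in poster A17 can be sketched as follows. Between two simulated log-prices that both lie below an upper barrier, the probability that the continuous path touched the barrier is exp(-2 (b - x0)(b - x1) / (sigma^2 dt)). The Python sketch below prices an up-and-out call with this correction. It is a simplified illustration only: it assumes pure diffusion (the poster's jump term is omitted), weights each path by its survival probability rather than sampling the crossing, and uses hypothetical function names.

```python
import math
import random


def bridge_cross_prob(x0, x1, barrier, sigma, dt):
    """Probability that a Brownian bridge in log-price from x0 to x1
    touches an upper barrier (also in log-price) within a step of length dt."""
    if x0 >= barrier or x1 >= barrier:
        return 1.0
    return math.exp(-2.0 * (barrier - x0) * (barrier - x1) / (sigma * sigma * dt))


def up_and_out_call(s0, strike, barrier, r, sigma, t, steps, paths, seed=0):
    """Monte Carlo price of an up-and-out call with bridge-corrected monitoring."""
    rng = random.Random(seed)
    dt = t / steps
    drift = (r - 0.5 * sigma * sigma) * dt
    vol = sigma * math.sqrt(dt)
    log_b = math.log(barrier)
    total = 0.0
    for _ in range(paths):
        x = math.log(s0)
        survive = 1.0  # probability the path has not knocked out so far
        for _ in range(steps):
            x_next = x + drift + vol * rng.gauss(0.0, 1.0)
            survive *= 1.0 - bridge_cross_prob(x, x_next, log_b, sigma, dt)
            if survive == 0.0:
                break  # knocked out at a grid point; payoff weight is zero
            x = x_next
        total += survive * max(math.exp(x) - strike, 0.0)
    return math.exp(-r * t) * total / paths


price = up_and_out_call(100.0, 100.0, 120.0, 0.05, 0.2, 1.0, 10, 2000)
```

Without the bridge term, a coarse time grid misses crossings between steps and overprices the option, which is the bias the poster refers to. Each path is independent, so the outer loop maps naturally onto one GPU thread per path.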
with two or more massive bodies (black holes), with the Tesla C1060 compared to the Intel i7 CPU.
including if necessary relativistic corrections to Memory access was optimized using shared and
the classical Newtonian gravitational forces (Kupi texture memory.
et al. 2006, Berentzen et al. 2009). Author: Patrice Castonguay (Stanford University)
Author: Rainer Spurzem (National Astronomical
Observatories, Chinese Academy of Sciences) D02 - Parallel 3D Geometric Multigrid Solver on
GPU Clusters
An investigation of the performance and scalability
of a multigrid pressure Poisson equation solver
Audio Processing running on a GPU cluster.
Author: Dana Jacobsen (Boise State University)
C01 - Exploring Recognition Network
Representations for Efficient Speech Inference
D03 - Acceleration of mesh-free CFD using CUDA
on the GPU
In this work, the acceleration of a mesh-free
We explore two contending recognition network
Computational Fluid Dynamics (CFD) code is
representations for speech inference engines:
performed using CUDA. The poster gives an
the linear lexical model (LLM) and the weighted
overview of the CUDA implementation strategy
finite state transducer (WFST) on NVIDIA GTX285
and the resulting performance increase.
and GTX480 GPUs. We demonstrate that while
Author: Ruairi Nestor (Irish Centre for
an inference engine using the simpler LLM
High-End Computing)
representation evaluates 22x more transitions per
second than the advanced WFST representation,
D04 - Airblast Modelling on Multiple Tesla units
with a lower memory overhead than previous GPGPU card costs ~$500 and is ~size of mouse
approaches. Using our data structure, we have cortex. We present results demonstrating image
seen significant improvements in both volume ray classification on UAV aerial video with a visual
casting and ray tracing applications over previous cortex model running on a 240-core NVIDIA
state-of-the-art methods. GeForce GTX285, and see >10x speed-up. As
Author: Nathan Andrysco (Purdue University) this technology continues to improve, cortical
modeling on GPGPU devices has the potential to
revolutionize computer vision.
E02 - Fragment-Parallel Composite and Filter Author: Steven Brumby (Los Alamos
In this poster, we describe our recent work in National Laboratory)
the area of programmable graphics pipelines by
presenting a fragment-parallel formulation of an
A-buffer-style composite and filter equation, and F04 - Fermi in Action: Robust Background
describe its implementation on a modern GPU. Subtraction for Real-time Video Analysis
Author: Anjul Patney (University of California, Davis) Background subtraction is one of the important
image processing steps for video surveillance and
many computer vision problems such as tracking
& recognition. However, robust background
Computer Vision subtraction that adapts well to variable
environment changes is computationally expensive
F01 - Architecture Aware Design for a Parallel and consumes a large amount of memory. Thus,
Object Recognition System its practical application is often limited. Here,
We have developed a parallel object we aimed to expand its usage and tackle vision
recognition system using CUDA, achieving problems that require high-frame-rate cameras,
70x-80x speedup against the original serial such as real-time sports analysis and real-time
implementation. In order to optimize our object detection and recognition. Using recent
implementation, we evaluated the performance advances in accelerator hardware (the NVIDIA
of different parallelization strategies on some Fermi architecture) and taking advantage of
key computations in the object recognition heterogeneous computing, we achieve performance
system. Finally we concluded that the parallel good enough for use in these
implementation performance is sensitive to input practical applications.
Author: Melvin Wong (Institute for Infocomm Research)
F05 - Bridging Neuroscience and GPU Computing clustered using hamming distances. Each of
to Build General Purpose Computer Vision these clusters is geometrically verified and
The construction of artificial vision systems connected using Geotags. Connected clusters
and the study of biological vision are naturally are bundle adjusted and the obtained registration
intertwined as they represent simultaneous is used to estimate depthmaps that are finally
efforts to forward- and reverse-engineer systems fused to obtain dense 3D models. Each of the
with similar goals. Here, we present a high- above steps, except Bundle Adjustment, is
throughput approach to more expansively explore implemented in CUDA and runs on multiple GPUs.
biologically-inspired models by leveraging GPUs. Our pipeline is two orders
We show that this approach can yield significant of magnitude faster, on an order of magnitude more images,
gains in performance on object and face compared to state-of-the-art methods.
recognition (including “Labeled Faces in the Wild” Author: Jan-Michael Frahm (University of North
challenge and faces from Facebook), consistently Carolina, Chapel Hill)
outperforming the state-of-the-art. We highlight
how the application of flexible programming
F10 - Portable Central Vision Enhancement
tools, such as high-level scripting, template
System for Macular Degeneration Patients
metaprogramming/auto-tuning, can enable large
Vision enhancement systems are an alternative
performance gains, while managing complexity for
visual aid to enhance the remaining vision
the developer.
of visually impaired subjects. Our aim is to
Author: Nicolas Pinto (Massachusetts Institute
develop a mobile central vision enhancement
of Technology)
system for macular degeneration patients. Three
different types of enhancement algorithms have
F06 - CUDA for Vision and Imaging Library been developed and their efficiency was tested on
CUVI Lib (CUDA for Vision and Imaging Library) low vision patients. These three algorithms have
is a software library that provides a set of been implemented on a portable low power device.
GPU accelerated computer vision and image The NVIDIA system-on-a-chip Tegra has been
processing functions. CUVI can both be utilized as chosen for this implementation.
an add-on library for NVIDIA’s NPP (NVIDIA Author: Chloe Vaniet (Imperial College London)
Performance Primitives) as it complements the
functionality present in NPP as well as it can be
F11 - Dense Stereo Vision on GPU
used as a standalone library ready to be plugged
A dense stereo vision for a material handling
F07 - GPU-Friendly Multi-View Stereo using CUDA.
Reconstruction Using Surfel Representation and Author: Esubalew Bekele (Universal Robotics Inc.)
Graph Cuts
We present a new surfel (surface element) based
F12 - Upsampling Range Data in Dynamic
multi-view stereo algorithm which runs entirely
Environments
on GPU. We utilize flexibility of surfel-based 3D
We present a flexible, parallelized method for
shape representation and global optimization
fusing information from optical and range sensors
by graph cuts in the same framework. The
based on an accelerated high-dimensional
orientation of the constructed surfel candidates
filtering approach. Our system takes as input a
imposes an effective constraint that reduces the
sequence of monocular camera images as well
effect of the minimal surface bias. The entire
as a stream of sparse range measurements as
processing pipeline is implemented on the latest
obtained from a laser or other sensor system.
GPU to speed up the processing significantly.
Our method produces a dense, high-resolution
Experimental results show that the proposed
depth map of the scene, automatically generating
approach reconstructs the 3D shape of an object
confidence values for every interpolated depth
accurately and efficiently, which runs more than
point. We describe how to integrate priors on
100 times faster than on CPU.
object shape, motion and appearance and how to
Author: In Kyu Park (Inha University)
achieve an efficient implementation using parallel
processing hardware such as GPUs.
F08 - CUDA Accelerated Face Recognition Author: Jennifer Dolson (Stanford University)
A GPU based implementation of a face recognition
solution using PCA with Eigenfaces algorithm.
consisting of functions implemented on GPU detecting speed-limit signs which is only possible
was introduced in OpenCV. It consists of several with the aid of GPU processing.
methods for calculating stereo correspondence Author: Vladimir Glavtchev (BMW)
between two images that is used to reconstruct
a 3D scene. A simple block-matching algorithm
works up to 10x faster compared to a CPU H02 - Complex Automotive Applications
implementation in OpenCV providing real-time The NVIDIA GPU architecture is becoming a very
processing of HD stereo pairs on Tesla cards. interesting hardware target for complex
Belief propagation-based algorithms show 20-50x automotive applications. We implemented the
speedup compared to a CPU implementation. same automotive application on several different
Author: Anatoly Baksheev (ITEEZ) hardware targets and analyzed the maximum
frame rate and the effective CPU load. This
paper shows how real-time applications like
pedestrian detection and driving assistance
Databases & Data Mining benefit from a massively parallel
“central” architecture like GPU/CUDA. Real-
G02 - Speculative Query Processing time performance and zero-delay transfers
With an increasing amount of data and user can be achieved using a full asynchronous
demands for fast query processing, the implementation. The same approach can really
optimization of database operations continues to multiply the application performance by the
be a challenging task. A common optimization number of GPU devices present on the embedded
method is to leverage parallel hardware system, at a reasonable power consumption.
architectures. With the introduction of general- Author: Marius Vasiliu (University of Paris Sud)
purpose GPU computing, massively parallel
hardware has become available within commodity High Performance Computing
hardware. To efficiently exploit this technology,
we introduce the method of speculative query I01 - A GPU-based Architecture for Real-Time
processing. This speculative query processing Data Assessment at Synchrotron Experiments
works on index structures to efficiently support Modern X-ray imaging cameras provide millions
heavily used database operations. To show the of pixels and several thousand frames per second.
To process such an amount of information we divergence in parallel code kernels. The use of
have optimized the reconstruction software the GPU allows AB models to be visualised in real
employed at the tomography beamlines of ANKA time, which further widens the application of ABM
and ESRF synchrotrons to use the computational to real-time simulations.
power of modern graphic cards. Using GPUs as Author: Paul Richmond (University of Sheffield)
compute coprocessors we were able to reduce
the reconstruction time by a factor of 30 and
I05 - The Scalable HeterOgeneous Computing
process a typical data set of 20GB in 40 seconds.
(SHOC) Benchmark Suite
The time needed for the first evaluation of the
SHOC is a benchmark suite for heterogeneous
reconstructed sample is reduced significantly and
systems. This poster describes the suite and
quasi real-time visualization is now possible.
presents recent performance measurements.
Author: Suren Chilingaryan (Karlsruhe Institute
Author: Kyle Spafford (Oak Ridge National Laboratory)
of Technology)
I03 - CSIRO Advances in GPU Computing. What I08 - Particle Simulations using DEM on GPUs
could you do with 256 GPUs? Particle based numerical methods are an
The Commonwealth Scientific and Industrial emerging field since the GPU/CUDA technique
Research Organisation (CSIRO) is Australia’s became widely accepted in recent years. 80%
national science agency. CSIRO is currently of all material used in pharmaceutical
applying GPU Computing on a scale ranging from technology is powder. Numerical simulation
single GPU workstations through to their 256 GPU of such material is possible using the Discrete
cluster. This poster showcases some of CSIRO’s Element Method (DEM). The main restriction
work in the areas of GPU-accelerated biological here is compute power together with the problem
imaging, image deconvolution, synchrotron size. Even a few tens of thousands of particles lead to
science and CT reconstruction, and statistical weeks to months of compute time in order to
inference in complex environmental models. reflect processes of a few minutes in real time.
Speedups of 8x to 230x have been seen DEM scales excellently with the massively parallel
across these application areas using a broad CUDA environment, enabling us to access the
range of GPU computing platforms. million particle range in acceptable job runtimes.
Author: Luke Domanski (CSIRO) Author: Charles Radeke (University Graz)
I04 - High Performance Agent-Based Simulation I09 - Mastering Multi-GPU Computing on a Torus
with FLAME for the GPU Network
The Flexible Large-scale Agent Modelling We describe APEnet+, the new generation of
Environment for the GPU (FLAME GPU) addresses our 3D torus network which scales up to tens
the performance and architecture limitations of of thousands of cluster nodes with linear cost.
previous work by presenting a flexible framework The basic component is a custom PCIe adapter
approach to ABM on the GPU. Most importantly with six high-speed links, designed around a
it addresses the issue of agent heterogeneity programmable HW component (FPGA), a nice
through the use of state machine based agent environment for studying integration techniques
representation. This representation allows between GPUs and network interfaces. The
agents to be separated into associated state lists high-level programming model is MPI, while a low-
which are processed in batches to allow a very level RDMA API is also available.
diverse population of agents whilst avoiding large Author: Davide Rossetti (National Institute of
Nuclear Physics)
87
I10 - Atmospheric Modelling, Simulation and Visualization using CUDA
The Laboratory Meteorological Dynamics (LMD) weather model by CNRS is used extensively for
research and weather forecasting purposes. Simulation of atmospheric climate is one of the
most challenging computational tasks because of its numerical complexity and simulation time.
The numerical simulations must obviously run faster than real time to be usable in
decision support.
Author: Priyanka Sah (Indian Institute of Technology, Delhi)

I11 - Automatic Program Generation for the Fermi - DFT Transform
The goal of SPIRAL is to push the limits of automation in software and hardware
development and optimization of numerical kernels beyond what is possible with current tools.
In this research, we address the problem of building an efficient high-performance computing
platform from libraries that are automatically generated by a computer for NVIDIA GPU
architectures. Spiral generates code that automatically works around the architectural
restrictions on GPUs, such as shared memory bank conflicts and global memory coalescing, and
pushes code to the limits (maximum number of threads, register pressure, etc.). The procedure
of code generation is fast, platform dependent, easy to rewrite and problem adaptable.

I14 - An Atomic Tesla
We examined the possibility of using an Atom-based host system to control a Tesla S1070.
Our simple benchmarks found that Atom-based systems should be viable for codes with serial
portions small enough to make Amdahl’s Law irrelevant. Such systems would have a much lower
power draw than ‘traditional’ GPU clusters.
Author: Richard Edgar (Massachusetts General Hospital)

I15 - ICHEC’s GPU Research: Porting of Scientific Application on NVIDIA GPU
ICHEC is the Irish National HPC centre, with a mission to provide both high performance
computing resources and expertise for the Irish research community. In addition to its core
mission of research enablement, ICHEC started in May 2009 an exploratory activity in GPGPU
and CUDA programming. Quantum Espresso is an increasingly popular molecular dynamics
package, mainly developed by the DEMOCRITOS group in Trieste (IT). PWscf is part of the Quantum
Espresso suite, which performs electronic and ionic structure calculations. An interesting part
of the PWscf port is a high-performance [ZD]GEMM that executes in parallel across CPU and GPU.
Author: Ivan Girotto (Irish Centre for High-End Computing)

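
The reference to Amdahl’s Law in the I14 abstract can be made concrete with a short
calculation. The sketch below is illustrative only: the function name, the serial fractions
and the 100x accelerator speedup are assumed values chosen for the example, not figures from
the poster.

```python
def amdahl_speedup(serial_fraction: float, parallel_speedup: float) -> float:
    """Overall speedup when only the non-serial part of a code is accelerated.

    serial_fraction  -- share of the original runtime that stays serial (0..1)
    parallel_speedup -- speedup applied to the remaining, parallel share
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in [0, 1]")
    if parallel_speedup <= 0.0:
        raise ValueError("parallel_speedup must be positive")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / parallel_speedup)


if __name__ == "__main__":
    # Hypothetical inputs: a GPU that runs the parallel part 100x faster.
    for s in (0.0001, 0.01, 0.10):
        print(f"serial fraction {s:.4f}: overall speedup "
              f"{amdahl_speedup(s, 100.0):.1f}x")
```

The smaller the serial fraction, the closer the overall speedup gets to the accelerator’s own
speedup, which is one way to read the abstract’s condition that the serial portions be small
enough for the host CPU not to matter.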
I12 - Fast N-body Algorithms for Dynamic Problems on the GPU
We present an extension of the earlier algorithm by Gumerov & Duraiswami (J. Comput. Phys.,
2008) which adapts the FMM to the GPU, where the data structures are efficiently generated on
the GPU as well. Details and performance on current architectures will be presented.
Author: Qi Hu (University of Maryland)

I13 - GPU Acceleration of Cube Calculus Operations
In our current work, we present the first massively parallel, GPU accelerated implementation
of the Cube Calculus operations for multivalued and binary logic, also called Cube Calculus
Machine (CCM). Substantial speedups up to the order of 85x are achieved using the CUDA enabled
NVIDIA Tesla GPU compared to the CPU implementation on a sequential processor. CC is a very
efficient and convenient mathematical formalism for representation, processing and synthesis of
binary and multivalued logic which has significant applications in logic synthesis, image
processing and machine learning. Thus, massive speedups achieved using GPUs are very
encouraging to build future parallel VLSI EDA systems.
Author: Vamsi Parasa (Portland State University)

of Smith-Waterman algorithm done in OpenCL. This implementation is capable of computing
similarity indexes between query sequences and a reference sequence with or without sequence
alignment paths. In accordance with the requirement for the target application in cancer
research the implementation provides processing of very long reference sequences (in the order
of millions of nucleotides). Performance compares favorably against CPU, being on the order of
14 to 610 times faster, and 4.5 times faster than Farrar’s implementation. It is also on par
with CUDASW++ v2.0.1 performance, but with fewer constraints in sequence length.
Author: Dzmitry Razmyslovich (Institute of Computer Engineering, University of Heidelberg)

I17 - Computing Strongly Connected Components in Parallel on CUDA
The problem of decomposition of a directed graph into its strongly connected components is
a fundamental graph problem inherently present in many scientific and commercial applications.
We show how existing parallel algorithms can be reformulated in order to be accelerated by
NVIDIA CUDA technology. We design a new CUDA-aware procedure for pivot selection and
we redesign the parallel algorithms in order to allow for CUDA accelerated computation. We
experimentally demonstrate that with a single
GTX 280 GPU card we can easily outperform the optimal serial CPU algorithm.
Author: Milan Ceska (Masaryk University)

I18 - A CUDA Runtime Target for the Sequoia Compiler
We describe an implementation of the Sequoia Runtime interface in CUDA that enables the
Sequoia compiler to target programs written in Sequoia for single and multiple GPU systems.
Author: Michael Bauer (Stanford University)

I19 - GPU Computing for Real-Time Optical Measurement Techniques
Measuring displacement and strains during deformation of advanced materials which are
too small, big, compliant, soft or hot are typical scenarios where non-contact techniques are
needed. Using Digital Image Correlation and Tracking, strain can be calculated from a series
of consecutive images with sub pixel resolution. However, the image processing is a computation
intensive task and can’t be performed in real time using general purpose processors. We
implemented a 3-stage pipelined architecture: images are loaded, preprocessed using CPU, and
correlated on GPUs. Using two GTX295 cards we were able to reach a 35 times speedup compared
to the fastest Core i7 processor.
Author: Suren Chilingaryan (Karlsruhe Institute of Technology)

Maxwell’s Equations
We describe an MPI/CUDA approach to solve Maxwell’s equations in time domain by means
of an Interior Penalty Discontinuous Galerkin Time Domain Method and a local time stepping
algorithm. We show that MPI/CUDA provides a 10x speedup versus MPI/CPU, in double precision.
Moreover, we present scalability results and an 85% parallelization efficiency up to 40 GPUs on
the Glenn cluster of Ohio Supercomputing Center. Finally, we study an electromagnetic cloaking
example for a broad band signal (8-11 GHz), to show the potential of our approach to solve real
life examples in short simulation times.
Author: Stylianos Dosopoulos (Ohio State University)

Imaging

J01 - Neurite Detection using CUDA, GPU Accelerated Biological Imaging for
processing in these situations greatly affects the workflow throughput. We report some early
results on GPU acceleration of the Neurite Detection module in our group’s HCA-Vision.
The most time consuming algorithm steps are accelerated by up to 13.6x resulting in a
3.3x speedup for the entire algorithm (70% of theoretical maximum).
Author: Luke Domanski (CSIRO)

J02 - Fast Radon Transform via Fast Non-uniform FFTs on GPUs
Fast Radon Transform is required in X-ray Phase Contrast Tomography performed at the Advanced
Light Source, Lawrence Berkeley National Lab. We describe a fast implementation based on fast
non-uniform FFTs on GPUs.
Author: Chao Yang (Lawrence Berkeley National Laboratory)

J03 - Projected Conjugate Gradient Solvers on GPU and its Applications
In this work, the focus is specifically on how to speed up the projected CG algorithm utilizing
the GPU. It is shown that the projected CG method can be used within the single precision
accuracy of the current GPU. One benefit gained through use of the projected CG is that it
reduces the total number of matrix vector multiplications, which is usually a bottleneck for an
efficient GPU-based Krylov-based algorithm. A modified projection
support the proposed algorithm.
Author: Youzuo Lin (Arizona State University)

J04 - Real-time Direct Georeferencing of Images from Airborne Line Scan Cameras
The Norwegian Defence Research Establishment (FFI) is developing a technology demonstrator
for airborne real-time hyperspectral target detection. The system includes two nadir-pointing
line scan cameras. The line scanned images are georeferenced in real-time by intersecting rays
cast from the cameras with a 3D model of the terrain underneath. The georeferenced images may
then easily be ortho-rectified (e.g. by using texture mapping in OpenGL) and overlaid on digital
maps. This poster presents the performance of a CUDA implementation of the georeferencing
method.
Author: Trym Vegard Haavardsholm (Norwegian Defence Research Establishment (FFI))

Molecular Dynamics Physics Simulation
U01 - Mint: An OpenMP to CUDA Translator
We aim to facilitate GPU programming for finite difference applications. We have developed
Mint, a source-to-source compiler to generate CUDA code from OpenMP code. Mint transforms omp
parallel for loops into CUDA kernels and applies domain specific optimizations such as shared
memory, register and kernel fusion optimizations. Since our translator targets structured grid
problems, it optimizes the code better than general purpose compilers. In this poster, we
present translation and optimization steps along with our initial performance results.
Author: Didem Unat (University of California, San Diego)

U02 - Real-Time Particle Simulation in the Blender Game Engine with OpenCL
The goal of this project is to produce interactive scientific visualizations that can be used
in educational games. We use the computational power of OpenCL to enable features in the
Blender Game Engine that would otherwise not be possible in real-time. By adding an interactive
particle system to the game engine, we set the stage to demonstrate many interesting scientific
phenomena (molecular dynamics, fluid dynamics, statistics) with the added benefit of real-time
special effects for games in general.

V01 - Real-Time Color Space Conversion for High Resolution Video
Color space conversion or color correction is a widely used technique to adapt the color
characteristics of video material to the display technology employed (e.g. CRT, LCD, projection)
or to create a certain artistic look. As color correction often is an interactive task and
colorists need a direct response, state-of-the-art real-time color correction systems for video
are so far based on expensive dedicated hardware. This submission shows that it is feasible to
replace dedicated color correction systems with general purpose GPUs. It is shown that a single
Tesla C2050 GPU supports real-time color correction up to a resolution of 4096x2048 pixels.
Author: Klaus Gaedke (Technicolor)

V02 - 3D Object Detection in Digital Holographic Microscope Images
Digital Holographic Microscopy (DHM) is based on the classical holographic principle invented
by Hungarian physicist Dennis Gabor. The holographic images are acquired by a CCD
camera. Depth slices can be reconstructed using Fourier transform. The numerical reconstruction
and further image processing for object detection is done using general purpose graphics
processing units (GPGPU).

Streaming Framework on GPU Clusters
In this poster, we propose GStream, a general-
purpose, scalable data streaming framework
on GPUs. The contributions of GStream are as
follows: (1) We provide powerful, yet concise
language abstractions suitable to describe
conventional algorithms as streaming problems.
(2) We project these abstractions onto GPUs to fully
exploit their inherent massive data-parallelism.
(3) We demonstrate the viability of streaming on
accelerators. Experiments show that the proposed
framework provides flexibility, programmability
and performance gains for various benchmarks
from a variety of domains, including but not
limited to data streaming, data parallel problems,
numerical codes and text search.
Author: Yongpeng Zhang (North Carolina
State University)
SPEAKERS AND PANELISTS

Laboratory for Computational Nanoscience & Soft Matter Simulation at the University of Michigan.
Dr. Anderson holds a Ph.D. degree in Condensed Matter Physics from Iowa State University and is
the lead developer of HOOMD-blue, a high performance particle simulation tool. His current
research interests include GPU computing, polymer physics, and nanoparticle self-assembly.
Session(s): 2062 - HOOMD-blue: Fast and Flexible Many-Particle Dynamics (Thursday, Sept 23, 15:00)

the transfer of light through the large scale structures of the Universe or the dynamics of
astrophysical fluids. Lately he ported such applications to the large scale multi-GPU cluster
provided by the Titane supercomputer in collaboration with the CEA, France. In 2010, these
numerical investigations on GPUs led to his being chosen as one of the “Young Astrophysicist of
the Year” by the French Astrophysical Society.
Session(s): 2099 - Cosmology Powered by GPUs Redux (Wednesday, Sept 22, 11:00)

external customers for outstanding performance and strategic vision.

Session(s): 2283 - 500 Teraflops Heterogeneous Cluster (Thursday, Sept 23, 16:00)

Adjunct Professor of Computer Science, “La Sapienza” University, Rome. He is the author of more
than 120 papers in international journals and proceedings of international conferences.
Session(s): 2112 - The Heisenberg Spin Glass Model on GPU: Myth versus Fact (Tuesday, Sept 21, 11:00)

Bigdeli, Abbas
Senior Researcher and Technology Manager (NICTA)
Abbas is currently a Senior Researcher and Technology Manager for the Advanced Surveillance
Project at the National ICT Australia lab. He has collaborated on industrial projects with
various companies in New Zealand, Australia and USA. He has more than 10 years of experience in
consultancy, scientific research and technology leadership in the areas of digital signal
and image processing, computer architecture, and information security. He has published over
60 papers in journals, book chapters and refereed international

contributed numerous publications to SIGGRAPH and Graphics Hardware conferences.
Session(s): 2207 - Playing Zero-Sum Games on the GPU (Wednesday, Sept 22, 11:00)

Blewitt, Chris
CFO (miGenius)
Combining deep knowledge and experience in the development and implementation of ‘cutting edge’
3D systems and physically based rendering technologies, Chris has acquired a unique perspective
and innate understanding of how these two diverse fields can be brought together most
effectively for the diverse range of users needing these capabilities. Chris has over 25
years of experience in developing effective and efficient solutions for the wider needs of the
non-technical user.
Session(s): 4002 - Emerging Companies: CEO on Stage featuring Allegorithmic SAS, Bunkspeed,
and miGenius (Wednesday, Sept 22, 14:00)

Computational Radiation Physics Group at the FZD.
Session(s): 2090 - Developing Highly Scalable Particle-Mesh Codes for GPUs: A Generic
Approach (Tuesday, Sept 21, 15:00)

physicists in the Consiglio Universitario Nazionale.
Session(s): 2000 - Gravitational N-body Simulations: How Massive Black Holes Interact
with Stellar Systems (Wednesday, Sept 22, 14:00)

Cambridge)
Dr. Paul Calleja is Director of High Performance Computing at Cambridge University, where he
provides research computing services across all academic disciplines. Dr. Calleja obtained his
Ph.D. in computational bio-physics at Bath University. After filling a post-doctoral research
position at Birkbeck College, he moved into private industry, where he spearheaded
early commercialization of HPC cluster solutions within

Digital, which propelled Cisco into a leading role for the Consumer technology market;
Scientific Atlanta, which positioned Cisco as the leader in the delivery of digital
video; and Webex and Tandberg, which established Cisco’s leadership in the Collaboration market.
Transactions led by Charles have resulted in more than $10 billion in incremental revenue to
Cisco since 2002. In addition to his work on acquisitions, Charles has helped accelerate Cisco’s
role as a leading Corporate Venture Capitalist with responsibility for managing
Cisco’s broad based $1.5 billion venture portfolio. He has led investments across many segments
and stages which have created strategic and financial returns for Cisco and has been an active
participant on the board of directors for a number of portfolio companies. Prior to
joining Cisco in 2001, Charles was an investment banker at Goldman Sachs where he was an early
member of the Technology Investment Banking group. During his time at Goldman, Charles was
involved with over $2.5 billion of financings and M&A transactions across a diverse set
of industry leading technology companies. Charles holds an MBA from Stanford’s Graduate School
of Business and a Bachelor’s degree from Tufts University.
Session(s): 4004 - Emerging Companies: CEO on Stage featuring Cooliris, empulse GmbH,
and Playcast (Wednesday, Sept 22, 16:00)
4005 - Emerging Companies: CEO on Stage featuring Jedox Business Intelligence, Rocketick,
and Softkinetic (Wednesday, Sept 22, 17:00)

Castonguay, Patrice
PhD Candidate (Stanford University)
Mr. Castonguay is a Ph.D. candidate in the Aeronautics and Astronautics department at Stanford
University working in the Aerospace Computing Laboratory under the supervision of Professor
Antony Jameson. The Aerospace Computing Lab focuses on developing more efficient and robust
algorithms for modeling fluid dynamics. Mr. Castonguay is interested in developing
efficient high-order methods for the Navier-Stokes equations. More specifically, his research
interests include developing stable and efficient high-order methods for mixed grids.
Session(s): 2079 - A Fast, Scalable High-Order Unstructured Compressible Flow
Solver (Tuesday, Sept 21, 11:00)

Catanzaro, Bryan
PhD Candidate (University of California, Berkeley)
Bryan Catanzaro received his BS and MS degrees from Brigham Young University, and is currently
a PhD candidate at the University of California, Berkeley. His interests center on programming
models for manycore computers, with an applications driven emphasis.
Session(s): 2050 - Copperhead: Data-Parallel

His research interests include computer architectures and parallel programming models.
Session(s): 2177 - Simplifying Parallel Programming with Domain Specific
Languages (Wednesday, Sept 22, 11:00)

Chatterjee, Debapriya
Graduate student (University of Michigan)
Debapriya Chatterjee is a Ph.D. candidate working with Prof. Valeria Bertacco in the Electrical
Engineering and Computer Science Department at University of Michigan, Ann Arbor. His research
focuses on validation and verification solutions for industry-scale digital designs.
His solutions entail both novel and adaptable semi-formal verification methods and the use of
massively parallel multi-core platforms to boost the performance of key validation
applications. Debapriya holds a B.Tech. degree from the Indian Institute of Technology,
Kharagpur, India and an MS degree from the University of Michigan; both degrees are in Computer
Science and Engineering.
Session(s): 2306 - Gate-Level Simulation with GP-GPUs (Wednesday, Sept 22, 10:00)

Chen, Doris
Student (University of Toronto)
Doris received her M.A.Sc and B.A.Sc degrees in Computer Engineering from the
University of Waterloo in 2007 and 2005 respectively. She currently works at Altera’s Toronto
Technology Center on advanced algorithms in the fields of device modeling, CAD optimizations
and logic synthesis. She has authored or co-authored 6 papers in FPGA CAD and
device reliability.
Session(s): 2068 - Parallelizing FPGA Technology Mapping using GPUs (Wednesday, Sept 22, 14:00)

Chen, Hanning
Research Associate (Northwestern University)
Session(s): 2128 - Hybrid Quantum Mechanics/Electrodynamics (QM/ED) Modeling of Solar Cells
on a CUDA Cluster (Wednesday, Sept 22, 17:00)

Cheung, Mark
Physicist (Lockheed Martin Solar & Astrophysics

information, please visit: http://www.ichec.ie/about_us/gilles_civario
Session(s): 2086 - GPGPU DL_POLY (Thursday, Sept 23, 16:00)

Clark, Calvin
Senior Consultant (Microsoft)
Calvin Clark has been at Microsoft for 14 years. He is a Senior Consultant in the Application
Development Consulting Group. Since 2006, his focus has been on High Performance Computing
solutions, delivering trainings and consulting services to numerous ISVs,
Solution Integrators, and OEMs in the HPC space. He lives in Menlo Park, CA with his wife and
daughter.
Session(s): 2147 - GPGPU Development for Windows HPC Server (Tuesday, Sept 21, 15:00)

Clark, Philip
Reader (Associate Professor) in Particle Physics (University of Edinburgh)
Dr Philip Clark is a reader (associate professor) at the University of Edinburgh. He is the
principal investigator for the Edinburgh ATLAS and GridPP particle physics research groups. He
is the chairman of the ScotGrid tier-2 compute and data centre. His primary research
is in elementary particle physics, but he is also interested in evolving computer architectures,
particularly the advent of many-core and GPGPU devices. He has 672
publications (427 in peer reviewed journals).

Simon has been Product Manager for video and graphics products at GE Intelligent Platforms
since 1998, during which time he has maintained the company’s leading position at the forefront
of leading-edge, high-performance commercial technology applied to the rugged defense and
aerospace market. Traditional graphics applications have seen the company’s
products deployed into diverse applications such as cockpit displays in fighter jets, mission
computers in helicopters and tanks, and into embedded training systems on naval weapons
systems. As the trend towards GPGPU has grown, Simon has defined new products suited to the
next generation of Intelligence, Surveillance and Reconnaissance applications, once
again taking the performance lead in the rugged marketplace. Prior to taking this role, Simon
worked in a number of engineering roles in the nuclear and scientific research industries. He
graduated with a B.Sc. in Microelectronics, and an M.Sc. (Eng) in Advanced
Manufacturing Technology.
Session(s): 2273 - GPUs In the Front Line of our Defenses (Sponsored by
GE) (Wednesday, Sept 22, 15:00)

Corrigan, Andrew
Research Mathematician (Naval Research Laboratory & George Mason University)
Andrew Corrigan is a research mathematician at the Naval Research Laboratory, where he is
working on
the GPU implementation of CFD codes and supersonic jet noise reduction. He received his Ph.D.
from George Mason University in Computational Mathematics in May 2009. He also performed a
post-doc through early 2010 under Prof. Rainald Löhner porting FEFLO to graphics
hardware, as well as developing specialized numbering schemes for edge-based unstructured grids
on GPUs.
Session(s): 2005 - Porting Large-Scale Legacy Fortran Codes (Wednesday, Sept 22, 17:00)
2234 - Unstructured Finite Volume Code on a Cluster with Multiple GPUs per
Node (Wednesday, Sept 22, 15:00)

Cox, Sam
CEO (Milabra)
Sam is a technology entrepreneur and investor, focusing on technology and media ventures. He
founded a successful design & software firm in Canada and has consulted for firms in the UK,
China, the United States and Canada. He graduated with his MBA from Cass
Business School in London, UK with a degree in strategy and completed his undergraduate degree
in Art History, Chinese Language and Economics at Queen’s University in Canada.
Session(s): 4001 - Emerging Companies: CEO on Stage featuring Elemental
Technologies, Geomerics, and Milabra (Wednesday, Sept 22, 11:00)
4003 - Emerging Companies Summit Panel: GPUs for Computer Vision (Wednesday, Sept 22, 15:00)

Crivelli, Luis
Director Solver Development (Dassault Systems Simulia Corporation)
PhD in Aerospace Engineering. Director Of Solver Development at Dassault Systems Simulia
Corporation. 16+ years experience in High Performance Computing and Parallel Computing.
Session(s): 2155 - GPGPU in the real world. The ABAQUS experience (Thursday, Sept 23, 14:00)

Cui, Jingyu
Graduate Student (Stanford University)
Jingyu Cui received his B.E. (2005) and M.S. (2008) degrees from Tsinghua University, and an
M.S. (2010) degree from Stanford University. He is currently pursuing his Ph.D.
degree working on high speed dynamic 4-dimensional medical imaging. Jingyu has published 8
peer-reviewed papers in conference proceedings as the leading author,
2 journal articles, and a book chapter. He also holds a US patent. Jingyu worked with Microsoft
and Google, and made important contributions to several products.
Session(s): 2211 - Modern Architecture for Massively Parallel Medical Tomographic
Image Reconstruction on a GPU Cluster (Wednesday, Sept 22, 15:00)

Cui, Xiaohui
Research Scientist (Oak Ridge National Laboratory)
Dr. Xiaohui Cui is a staff scientist in the Computational Sciences & Engineering Division, Oak
Ridge National Laboratory of the Department of Energy, and an adjunct
associate professor at the University of Louisville in Kentucky. His research interests include
swarm intelligence, agent based modeling and simulation, GPU computing, and information
retrieval. His research has been reported by MSNBC, New Scientist, etc. In 2008 and
2009, he received the Department of Energy Outstanding Mentor Awards.
Session(s): 2052 - Power Management Techniques for Heterogeneous Exascale
Computing (Tuesday, Sept 21, 16:00)

Curry, Matthew
A Highly Reliable RAID System Based on GPUs (Sandia National Laboratories and the University of
Alabama at Birmingham)
Matthew Curry is a Ph.D. candidate at the University of Alabama at Birmingham. He is a member
of the High Performance Computer Laboratory in the Computer and Information Sciences Department
under the advisement of Dr. Anthony Skjellum. He is interested in GPU
computing, operating systems, and high performance storage.
Session(s): 2205 - A Highly Reliable RAID System Based on GPUs (Tuesday, Sept 21, 17:00)

Dammertz, Holger
PhD Student (Ulm University)
As a PhD student at Ulm University, Germany, my main focus of research is fast Ray Tracing for
Global Illumination and related rendering techniques. I am also researching quasi-Monte Carlo
methods and parallel algorithms for graphics.
Session(s): 2136 - Pseudo Random Number Generators for Massively Parallel
Apps (Thursday, Sept 23, 16:00)

Dasgupta, Aniruddha
Graduate Student (Georgia Institute of Technology)
Aniruddha Dasgupta is currently working towards a Master’s degree from the department of
Electrical and Computer Engineering at Georgia Tech.
His research interests are GPGPU and GPU architecture.
Session(s): 2201 - A Case Study of Accelerating Matlab Based Applications
using GPUs (Wednesday, Sept 22, 16:00)

Davidson, Andrew
Graduate Student (University of California, Davis)
Andrew Davidson is a graduate student in the Computer Engineering Department at the University
of California, Davis. His research interests include data-parallel algorithms and primitives,
numerical methods, and auto-tuning. He is also a developer for the CUDA
Parallel Primitives Library (CUDPP).
Session(s): 2085 - Tridiagonal Solvers: Auto-Tuning and Optimizations (Tuesday, Sept 21, 15:00)

Davis, Nolan
Research Scientist (SAIC)
Nolan R. Davis is a Senior Scientist with SAIC in San Diego. He holds a doctorate in physics
from the University of Texas at Dallas, and has spent over 25 years working in physics research,
signal and image processing, and high performance computing. He
has worked with large corporations and laboratories including SAIC, Lockheed-Martin, the
Johns-Hopkins Applied Physics Laboratory, the Naval Research
Laboratory, and Walt Disney Feature Animation.
Session(s): 2100 - Hybrid GPU/Multicore Solutions for Large Linear Algebra
Problems (Thursday, Sept 23, 16:00)

Dean, Loren
Director of Engineering, MATLAB Products (MathWorks)
Loren Dean is a Director in the MATLAB® development organization. He has responsibility for
MathWorks parallel computing products, the Test & Measurement application area and the eProducts
and Services organization. Loren has been with MathWorks since
1995. Prior to joining MathWorks, Loren worked for AlliedSignal Aerospace, performing systems
analysis and integration for aircraft engines, with extensive use of
MATLAB and Simulink®. Loren has a B.S. and an M.S. in Aeronautical Engineering from Purdue
University and an M.B.A. from Northeastern University.
Session(s): 2267 - GPU Computing with MATLAB® (Tuesday, Sept 21, 11:00)

Dean, Tom
Research Scientist (Google Inc.)
Tom Dean is a full-time research scientist at Google in Mountain View, California. From 1993 to
2007 he was Professor of Computer Science and Cognitive and Linguistic Sciences at Brown
University. He received his B.A. in mathematics from Virginia Polytechnic Institute
& State University in 1982 and his M.Sc. and Ph.D. in computer science from Yale University in
1984 and 1986. His research interests include automated planning and
control, computational biology, machine learning, neural modeling, probabilistic inference,
robotics and spatial and temporal reasoning. For more information, please
visit: http://www.cs.brown.edu/people/tld/pages/bio.html
Session(s): 2132 - Accelerating Biologically Inspired Computer Vision
Models (Tuesday, Sept 21, 11:00)
4003 - Emerging Companies Summit Panel: GPUs for Computer Vision (Wednesday, Sept 22, 15:00)
2297 - Developing CUDA Accelerated .NET Plugins for Microsoft Excel (Tuesday, Sept 21, 17:00)

Deng, Yangdong
Associate Professor (Tsinghua University)
Yangdong (Steve) Deng received his Ph.D. degree in Electrical and Computer Engineering from
Carnegie Mellon University, Pittsburgh, PA, in 2006. He received
his ME and BE degrees from the Electronic Department of Tsinghua University, Beijing, in 1998
and 1995, respectively.
Session(s): 2081 - Morphing a GPU into a Network Processor (Thursday, Sept 23, 15:00)
2264 - CUDA Centers of Excellence Super-Session III (Tuesday, Sept 21, 16:00)

Dewaele, Ronny
Director Technology Center (Barco)
Ronny Dewaele is responsible for the corporate Technology Center for Networked Visualization in
Barco. Ronny and his team focus on exploring new technologies in the domain of network centric
video processing. Ronny Dewaele has a Master’s degree in Computer Science and
Applied Mathematics from KU Leuven, Leuven, Belgium. He lives and works in Belgium.
Session(s): 2095 - Building High Density Real-Time Video Processing
Systems (Thursday, Sept 23, 16:00)

Gregory Diamos is a PhD student at the Georgia Institute of Technology, under the direction of
Professor Sudhakar

Mr. Donovan has a master's degree in computer science
and over 20 years IT experience. Throughout his career
he has held positions at exchanges, investment banks,
and hedge funds. He is currently a System Architect
at Citadel where his main area of focus is accelerating
financial models with a combination of grid computing,
virtualization, and CUDA / OpenCL.
hh Session(s): 2033 - Integrating GPGPU Accelerated
Pricing Models into an Existing Financial Services
Infrastructure (Thursday, Sept 23, 09:00)

Doran, Chris
Founder and Chief Operating Officer (Geomerics)
Dr. Chris Doran is Founder and Chief Operating Officer
at Geomerics. He is a leading research scientist with 20
years experience in applied mathematics and theoretical
physics, and is the author of a major book on geometry
and physics and of over 50 papers. Chris is a regular
speaker at major international conferences, including
SIGGRAPH, Develop, Nordic Game and Montreal Games

Group. While there he ran the eCommerce, Security, and
Mobile research practices.
hh Session(s): 4007 - Emerging Companies: CEO
on Stage featuring Aqumin, RTT, and Scalable
Display Technologies (Thursday, Sept 23, 10:00)
hh 4008 - Emerging Companies: CEO
on Stage featuring ICD and Universal
Robotics (Thursday, Sept 23, 11:00)

Engsig-Karup, Allan Peter
Assistant Professor, Scientific Computing (Technical
University of Denmark)
MSc, PhD in Applied math. Learn more:
http://www.imm.dtu.dk/~apek Involved in research
related to utilization of GPUs for Scientific Computing.
Learn more: http://gpulab.imm.dtu.dk. Research interest
in Computational Fluid Dynamics, High-Performance
Computing, Coastal Engineering, Scientific Computing,
Numerical analysis.
hh Session(s): 2103 - Development of an Efficient
Isaac Gelado is an Assistant Professor at the Computer finite element software NLFLEX. He is interested
Architecture Department in Universitat Politecnica de in the development of fast GPU enabled algorithms
Catalunya at Barcelona. Isaac Gelado holds a Master’s and computer codes for the high-fidelity solution of
degree in Telecommunications Engineering from challenging problems in computational mechanics with
Universidad de Valladolid, and will get a PhD degree an emphasis on transient phenomena in structural
from Universitat Politecnica de Catalunya in July, 2010. mechanics, such as shock and blast; FEM simulation
hh Session(s): 2156 - GMAC: Global Memory For of ultrasound with an emphasis on biomedical imaging
Accelerators (Thursday, Sept 23, 09:00) and therapy; and the computational design of novel
acoustic metamaterials. His project experience includes
Georgiev, Todor SBIR and STTR research for DARPA, ONR, and various
Senior Research Scientist II (Adobe Systems) protective design and structural engineering efforts
involving large finite element analyses. He earned
Todor Georgiev is a Senior Research Scientist at his Ph.D. and M.S. both in Mechanical Engineering
Adobe Systems, working closely with the Photoshop from Boston University where his work involved the
group. His contributions are often based on transfer finite element solution of linear and non-linear inverse
of mathematical methods from physics to image problems in biomechanical imaging.
processing and vision. Currently he is focusing on
developing cameras for radiance capture and interactive hh Session(s): 2061 - Accelerating Explicit FEM Shock
plenoptic / lightfield rendering. & Blast Simulations (Thursday, Sept 23, 10:30)
hh Session(s): 2093 - Computational Goldsmith, Kevin
Photography: Real-Time Plenoptic
has pursued this goal in a variety of customer-focused hh Session(s): 2049 - Deflated Preconditioned
technical roles with the TotalView team over the last Conjugate Gradient on the GPU
seven years. Prior to that, as a graduate student of (Wednesday, Sept 22, 14:30)
astrophysics at the University of Arizona in Tucson, he
wrote cosmological simulations (with the occasional Haines, Karen
bug) using C and MPI on a small-scale Beowulf cluster. Professor (WASP/The University of Western Australia)
Chris is a regular contributor to HPC and software Dr. Haines completed her PhD in Electrical Engineering
development industry conferences worldwide. at the University of New Mexico. She received her
hh Session(s): 2299 - Integrating CUDA BLAS Masters in Engineering at Carnegie Mellon University
with IMSL Fortran (Tuesday, Sept 21, 14:00) and her Bachelor of Arts in Mathematics at the
University of California, San Diego. Her PhD research
hh 2251 - TotalView Debugger for CUDA efforts have led to the development of a parallel motion
(Wednesday, Sept 22, 15:00) detection algorithm, which is based on the fly’s visual
processing system. The resulting model is suitable for
Govett, Mark robotic or computer vision applications. This work relied
Chief, Advanced Computing Section (NOAA Earth System on distributed parallel programming and advanced
Research Laboratory) scientific visualization methods.
I manage NOAA Earth System Research Laboratory’s hh Session(s): 2252 - Simulating Housefly
Advanced Computing Section, a software group Vision Elements Using OpenCL
that supports weather model development, code (Wednesday, Sept 22, 16:00)
parallelization, and exploring advanced computing
technologies including GPUs. I have a background in Han, Jeff
high performance computing, code parallelization and Founder, Chief Scientist (Perceptive Pixel)
compiler development. Recently, I wrote a Fortran to
CUDA compiler to parallelize and run a next generation Jeff Han is the founder and chief scientist of Perceptive
weather model on GPUs. Pixel. A TED speaker in 2006, and named to the Time
100 most influential persons list in 2008, Jeff continues
hh Session(s): 2276 - Using GPUs to Run to contribute frequently to the research communities.
Next-Generation Weather Models Jeff’s formal training was in electrical engineering and
(Tuesday, Sept 21, 14:00) computer science at Cornell University, where he worked
on the innovative CU-SeeMe videoconferencing system.
hh 4011 - Emerging Companies: CEO on Stage hh Session(s): 2152 - Using Virtual
featuring Cinnafilm, Inc., Perceptive Pixel and Texturing to Handle Massive Texture
Total Immersion (Thursday, Sept 23, 16:00) Data (Tuesday, Sept 21, 14:00)
GPUs, physically based simulation, real-time rendering, in advising financial and strategic buyers on due
and gastronomy. Mark earned his PhD in computer diligence, accounting structuring and financial reporting
science from the University of North Carolina at Chapel aspects of transactions in technology, semiconductors,
Hill in 2003. He founded and maintains GPGPU.org, a and software transactions both domestically and
web site dedicated to general-purpose computation on internationally. In addition to his M&A experience with
GPUs. Deloitte, Garrett has M&A experience as an investment
hh Session(s): 2084 - State of the Art in professional in industry with Mentmore Holdings
GPU Data-Parallel Algorithm Primitives Corporation (a private equity group), Stellex Technologies
(Tuesday, Sept 21, 17:00) (wireless communications equipment), and Register.
com (NASDAQ: RCOM) where he was responsible for
Harrison, Brian target evaluation, due diligence, divestitures, and post-
(NVIDIA) transaction integration.
hh Session(s): 4009 - Emerging Companies
hh Session(s): 2024 - NVIDIA Acceleration Summit Panel: The “New Normal” For Building
Engines Overview (Pre-Conference Emerging Companies Based On Disruptive
Tutorial) (Monday, Sept 20, 13:00) Technologies (Thursday, Sept 23, 14:00)
hh 2308 - Building Cutting-Edge Realtime
3D Applications with NVIDIA SceniX Herbst, Jeff
(Wednesday, Sept 22, 10:00) Vice President of Business Development (NVIDIA)
Jeff is the Vice President of Business Development
atmosphere and ocean processes for 20 years. He
is a lead developer of the open-source M.I.T General
Circulation Model (http://mitgcm.org) and has been
exploring applications of accelerators for several years.
With colleagues, he is developing a GPU oriented
accelerator library for geosciences.
hh Session(s): 2167 - Designing a Geoscience
Accelerator Library Accessible from High Level
Languages (Wednesday, Sept 22, 17:00)

Hoang-Trong, Tuan
PhD Student (George Mason University)
Tuan got his B.Eng. in Computer Science and
Engineering from HoChiMinh City University in Vietnam
in 2005. His M.Eng at Chonnam National University
(South Korea) was in Computer Engineering, where he
conducted research in artificial neural networks and
protein-spot matching in 2-dimensional gel electrophoresis
from 2006-2008. Since 2008, he has been a PhD student at
George Mason University, Department of Bioinformatics
and Computational Biology. His current research
interests are in calcium signalling and building cardiac cell
models using high-performance computing with GPU
technology.
hh Session(s): 2172 - Unveiling Cellular &
Molecular Events of Cardiac Arrhythmias
(Tuesday, Sept 21, 11:00)

Cranfield University, University of Hertfordshire and
FHT-Esslingen. He received several awards for outstanding
achievements. In 1994 he joined Accenture and worked
as a process and technology consultant. As the youngest
manager in Germany, he left Accenture in 1999 and
continued his career as project and program manager
for complex business and technology projects. In 2007
he founded “empulse” together with a former colleague
as a professional service and software development
company.
hh Session(s): 4004 - Emerging Companies:
CEO on Stage featuring Cooliris, empulse
GmbH, and Playcast Media Systems
(Wednesday, Sept 22, 16:00)

Humphrey, John
Senior Engineer (EM Photonics, Inc)
John Humphrey is a member of the Accelerated
Computing Solutions group at EM Photonics. He earned
his MSEE degree from the University of Delaware,
studying the acceleration of electromagnetics algorithms
using custom hardware platforms. At EM Photonics, he
launched a GPU research effort in 2005 with an FDTD
solver based on OpenGL methods. Since then, he has
worked on accelerated algorithms in a variety of fields,
including linear algebra solvers and computational fluid
dynamics engines.
hh Session(s): 2153 - CULA - A Hybrid GPU Linear was vice president of marketing for software maker
Algebra Package (Thursday, Sept 23, 15:00) Patron Systems. Prior to that, he was vice president
hh 2154 - The Impact of Data Movement on GPU of sales for Entelagent Software Corporation and
Performance (Wednesday, Sept 22, 16:00) ViewTech Corporation. He founded GroupNet, Inc., a
PictureTel Corporation reseller. Jamison spent ten years
Hwu, Wen-mei with PictureTel Corporation where he was the fourth
Professor (University of Illinois, Urbana-Champaign) employee of the company and also a founding member
of the European management team of PictureTel
Wen-mei W. Hwu is a Professor of ECE at the University International LTD. During his years at PictureTel,
of Illinois at Urbana-Champaign. He received the ACM revenue grew to over $200M and market value reached
Maurice Wilkes Award, the ACM Grace Murray Hopper more than $1.0B. He received his undergraduate
Award, and the ISCA Most Influential Paper Award. business education from Northeastern University and
He is a fellow of IEEE and ACM and leads the GSRC conducted post-graduate studies in finance at Fairfield
Concurrent Systems Theme. He directs the UIUC CUDA University.
Center of Excellence. Dr. Hwu also received his Ph.D.
degree in Computer Science from UC Berkeley. hh Session(s): 4007 - Emerging Companies: CEO
on Stage featuring Aqumin, RTT, and Scalable
hh Session(s): 2264 - CUDA Centers of Excellence Display Technologies (Thursday, Sept 23, 10:00)
Super-Session III (Tuesday, Sept 21, 16:00)
hh 2249 - New Programming Tools GPU Jargstorff, Frank
Computing (Wednesday, Sept 22, 10:00) Software Engineer (NVIDIA)
Frank Jargstorff is a software engineer leading NVIDIA’s
Iribe, Brendan Performance Primitives effort (NPP). Frank received his
President (Scaleform) degree in computer science in 1997 from the University
Brendan Iribe co-founded Scaleform and established of Tübingen, Germany.
the company as the #1 video game user interface (UI) hh Session(s): 2216 - CUDA Libraries Open
and video codec provider. Brendan pioneers all aspects House (Wednesday, Sept 22, 11:00)
of product research, development, and promotion at
Scaleform. Under his leadership, Scaleform GFx has Jensen, Eric
been adopted by most commercial 3D engines (UE3, Partner, Business Department Chair (Cooley LLP)
CryEngine, Gamebryo) and licensed for use in over 600
titles in less than 4 years, including hit games from 19 of Eric C. Jensen is a business partner in the Cooley Palo
the top 20 worldwide video game publishers. Alto office. Mr. Jensen is head of the Firm’s Business
department and a member of the Management
hh Session(s): 2241 - Standing Out: Implementing Committee. Mr. Jensen has been with the Firm
a Great Stereo UI (Thursday, Sept 23, 14:00) since 1988 and a partner since 1994. Mr. Jensen
practices securities and general corporate law, with
Iles, Andrew an emphasis on the representation of emerging and
Software Director (NVIDIA) public software, semiconductor, internet, and other
hh Session(s): 2225 - Tools for Managing Clusters of information technology companies. He also has
NVIDIA GPUs (Tuesday, Sept 21, 17:00) extensive experience representing venture capital
funds and underwriters. He has counseled clients in
Ismert, Ryan the areas of corporate formations, venture financings,
Director of Engineering (Sportvision, Inc.) public offerings of equity and debt, mergers and
acquisitions, joint venture, licensing and related
Ryan has been building augmented reality systems for strategic transactions, employee incentive matters and
broadcast TV at Sportvision for 7 years. He currently SEC reporting and compliance. Mr. Jensen has been
leads a team focused on disrupting the current state of included as one of The Best Lawyers in America in 2006
camera tracking and broadcast rendering by leveraging
- 2011 and named as one of Northern California’s “Super
the power of multiple GPUs. Lawyers” in 2007 - 2010. Mr. Jensen has also been
hh Session(s): 2123 - Enabling Augmented Reality ranked as a leading lawyer in Investment Funds: Venture
with GPU Computing (Thursday, Sept 23, 15:00) Capital in Chambers USA 2010 edition.
hh Session(s): 4009 - Emerging Companies
Iyer, Kumar Summit Panel: The “New Normal” For Building
Product Manager (NVIDIA) Emerging Companies Based On Disruptive
Kumar Iyer is a Product Manager of Developer Tools Technologies (Thursday, Sept 23, 14:00)
at NVIDIA, where he works on the most advanced GPU
development tools in the world. Prior to his work at Jeong, Byungil
NVIDIA, Kumar worked on PC and console games at Visualization Scientist (TACC / UT-Austin)
Electronic Arts, and in research at the USC Institute for Byungil Jeong is a visualization scientist with the
Creative Technologies. Kumar holds a B.S. in Computer Texas Advanced Computing Center at the University
Science from UCLA, and an MBA from the UCLA of Texas at Austin and a primary Scalable Adaptive
Anderson School of Management. Graphics Environment (SAGE) architect. His research
hh Session(s): 2245 - Parallel Nsight for interests include scalable parallel graphics architecture,
Microsoft Visual Studio (Pre-Conference collaborative remote visualization, large-scale data
Tutorial) (Monday, Sept 20, 16:00) visualization, and high-resolution display systems. Jeong
hh 2149 - Overview of Parallel Nsight for has a PhD in computer science from the University of
Visual Studio (Thursday, Sept 23, 10:00) Illinois at Chicago.
hh 2149 - Overview of Parallel Nsight for hh Session(s): 2144 - Large-Scale Visualization
Visual Studio (Tuesday, Sept 21, 11:30) Using A GPU Cluster (Wednesday, Sept 22, 16:00)
hh Session(s): 2013 - iray - GPUs and the Dynamics for Nanomechanical and Nanochemical
Photorealistic Rendering Revolution Experiments (Wednesday, Sept 22, 10:00)
OmniPV, Unity Semiconductor, Autonet Mobile, SiPort, has worked in various positions over the last 26 years
ZeroG, and R2 Semiconductor. Drew spent 15 years in in HPC customer support, math library development,
senior operating positions in the telecommunications applications engineering and consulting at QTC, Axian,
industry starting companies in both the components PGI and STMicroelectronics.
and the systems sectors of that industry. Drew was hh Session(s): 2143 - CUDA Fortran Programming
a founder and VP of Engineering at E/O Networks for NVIDIA GPUs (Wednesday, Sept 22, 15:30)
where he helped to design and produce a long reach
rural fiber optic telephony system. Drew started his Lecomber, David
optical telecommunications career in 1986 at Raynet, CTO (Allinea Software)
a pioneering company in the development of fiber to David Lecomber is one of the founders of Allinea and
the home technologies. Drew’s many roles at Raynet leads the research and development team behind Allinea
included VP of Marketing and VP of International DDT, the world’s most scalable parallel debugger.
Development. Drew was the founding CEO of Lightwave
Microsystems, a leader in the design and manufacture of hh Session(s): 2039 - GPU Debugging with
high volume optical integrated circuits. Drew graduated Allinea DDT (Wednesday, Sept 22, 11:00)
magna cum laude from Harvard with an MBA in 1987.
He received his BSEE & MSEE degrees from Stanford in Lee, HyoukJoong
1979. PhD Student (Stanford University)
hh Session(s): 4001 - Emerging Companies: HyoukJoong Lee is a PhD student in electrical
CEO on Stage featuring Elemental engineering at Stanford University. His research
Technologies, Inc., Geomerics, and interests include parallel systems architecture and
Milabra (Wednesday, Sept 22, 11:00) programming models. He has a BS degree from Seoul
National University.
hh 4002 - Emerging Companies: CEO on Stage
featuring Allegorithmic SAS, Bunkspeed, hh Session(s): 2177 - Simplifying Parallel
and miGenius (Wednesday, Sept 22, 14:00) Programming with Domain Specific
Languages (Wednesday, Sept 22, 11:00)
Lauer, Tobias
Researcher (University of Freiburg) Lee, John
(Appro)
Tobias Lauer received his PhD in computer science
from the University of Freiburg in 2007. His current John K. Lee joined Appro in 2001, and is responsible
for leading Appro’s hardware product development
engineering team. In addition, Mr. Lee leads the
company’s Project Management team that is responsible
for deploying Appro’s complex cluster solutions. He has
served as the Program Executive for some of Appro’s
most important cluster projects such as 2006 Peloton
Project as well as 2007 TLCC Cluster Project. Prior to his
role at Appro, Mr. Lee served in both Sales and Service
Management capacities at multiple storage and telecom
companies.
hh Session(s): 2270 - Appro’s GPU Computing
Solutions (Tuesday, Sept 21, 15:00)

Lefebvre, Matthieu
PhD Student (ONERA)
Matthieu Lefebvre is a PhD student at ONERA, the
French aerospace lab, and at Université Paris 13. He is
working on accelerating CFD simulations on GPU.
hh Session(s): 2045 - Roe-Pike Scheme for 2D
Euler Equations (Wednesday, Sept 22, 14:00)

Lever, Ben
Senior Research Engineer (NICTA)
Ben Lever is a senior research engineer at NICTA
currently developing new methodologies and
frameworks for describing computer vision algorithms
that can target heterogeneous, highly-parallel platforms.
Prior to NICTA, Ben was a hardware design engineer at
Canon Research before joining Synopsys as a software
engineer for developing real-time simulation models of
embedded processors.
hh Session(s): 2173 - Enabling Large-Scale CCTV
Face Recognition (Thursday, Sept 23, 11:00)

Lewin, Dan’l
Corporate Vice President, Strategic and Emerging
Business Development (Microsoft)
Dan’l Lewin is responsible for leading Microsoft’s global
engagement with startups and venture capitalists.
In addition, Lewin has executive, site, and citizenship
responsibility for the company’s operations in
Silicon Valley, based in Mountain View, California,
which currently employ 2,500 people and support
business relationships with industry partners in Silicon
Valley. Lewin’s business development teams focus
on supporting the software startup and entrepreneur
ecosystem developing on the Microsoft platform while
helping foster and grow local software economies
worldwide. Through the Microsoft BizSpark Program,
and the Microsoft Innovation Center Program, the
groups help accelerate startup success in more than 100
countries.
hh Session(s): 4001 - Emerging Companies:
CEO on Stage featuring Elemental
Technologies, Inc., Geomerics, and
Milabra (Wednesday, Sept 22, 11:00)
hh 4002 - Emerging Companies: CEO on Stage
featuring Allegorithmic SAS, Bunkspeed,
and miGenius (Wednesday, Sept 22, 14:00)

Li, Hongjian
Graduate Student (The Chinese University of Hong Kong)
Hongjian Li is currently working on an M.Phil. degree
under the supervision of Prof. Kwong Sak Leung and
Prof. Man Hon Wong. His major research interests
are GPU applications in bioinformatics, particularly
computer-aided drug design by means of high
throughput in silico virtual screening via structure-based
ligand-protein docking. He is developing new software
that utilizes the computational horsepower of graphics
processors with a purpose of accelerating the pipeline of
drug discovery.
hh Session(s): 2088 - Nucleotide String
Matching Using CUDA-Accelerated
Agrep (Thursday, Sept 23, 16:00)

Lichtenbelt, Barthold
Sr. OpenGL Manager (NVIDIA)
Barthold Lichtenbelt is a Sr. Manager of the OpenGL
core driver team at NVIDIA. He is also the Chair of the
OpenGL ARB Khronos Working Group.
hh Session(s): 2127 - OpenGL (Pre-Conference
Tutorial) (Monday, Sept 20, 16:00)

Lin, Chun-Yuan
Assistant Professor (Department of CSIE, Chang Gung
University)
Chun-Yuan Lin joined the Department of Computer
Science and Information Engineering at Chang Gung
University as an assistant professor. His research
interests are in the areas of parallel and distributed
computing, parallel algorithms, algorithm analysis,
information retrieval, proteomics, and bioinformatics.
hh Session(s): 2105 - CUDA-FRESCO: An
Efficient Algorithm for Mapping Short
Reads (Thursday, Sept 23, 15:00)

Linderman, Michael
Engineering Research Associate (Stanford University)
Michael Linderman is an Engineering Research
Associate in the Computer Systems Laboratory
at Stanford University. His research focuses on
using graphics processing units (GPUs) and other
heterogeneous computer systems to accelerate
computational systems biology and other data- and
compute-intensive applications. Michael earned a
Ph.D. and MS in Electrical Engineering from Stanford
University in 2009 and 2006 respectively, and B.S. from
Harvey Mudd College in 2003.
hh Session(s): 2030 - High-Throughput
Cell Signaling Network Learning with
GPUs (Thursday, Sept 23, 09:00)

Loddoch, Alex
Research Scientist (Chevron)
Alex received an MSc in Physics in 2001 and a PhD
in Geophysics in 2007 from University in Muenster,
Germany, working on topics including geophysical fluid
dynamics, parallel computing and data compression.
In 2007 he joined Chevron as a Research Scientist,
working in the area of High Performance Computing and
particularly GPU computing.
hh Session(s): 2174 - Reverse Time Migration
on GPUs (Wednesday, Sept 22, 15:00)

Löhner, Rainald
Professor (George Mason University)
Rainald Löhner is the head of the CFD center at the
department of computational and data sciences of
George Mason University in Fairfax, VA, in the outskirts
of Washington, D.C. He received an MSc in Mechanical
Engineering from the Technische Universitaet
Braunschweig, Germany, as well as a PhD and DSc
in Civil Engineering from the University College of
Swansea, Wales, where he studied under Profs. Ken
Morgan and Olgierd Zienkiewicz. His areas of interest
include numerical methods, solvers, grid generation,
parallel computing, visualization, pre-processing,
fluid-structure interaction as well as shape and process
optimization. His codes and methods have been applied
in many fields, including aerodynamics of airplanes,
cars and trains, hydrodynamics of ships, submarines
and UAVs, shock-structure interaction, dispersion
analysis in urban areas and haemodynamics of vascular
diseases. He is the author of more than 600 articles
covering the fields enumerated above, as well as a
textbook on Applied CFD Techniques.
hh Session(s): 2005 - Porting Large-Scale Legacy
Fortran Codes (Wednesday, Sept 22, 17:00)

Loop, Charles
Senior Researcher (Microsoft Research)
Charles Loop is a Senior Researcher in the Microsoft
Research Graphics Group. He has worked extensively in
the areas of curve and surface modeling and rendering,
including work on n-sided patches, smooth patch
complexes, as well as GPU algorithms for rendering
vector art. Charles is best known for the triangle mesh
subdivision algorithm that bears his name. He is
currently working on data parallel algorithms for REYES
style rendering and accelerating raytracing of surface
primitives.
hh Session(s): 2129 - Hardware Subdivision
and Tessellation of Catmull-Clark
Surfaces (Tuesday, Sept 21, 16:00)

Lorach, Tristan
Computer Graphics Engineer (NVIDIA)
Tristan Lorach has worked on many realtime interactive
events all over the world. Tristan is now working at
NVIDIA in the developer technical relations department
(a.k.a. Devtech), participating on a variety of projects in
relation with NVIDIA partners. At the same time, he is
also contributing to R&D writing demos for new GPU
chips.
hh Session(s): 2056 - Next-Generation Rendering
with CgFX (Tuesday, Sept 21, 16:00)

Ltaief, Hatem
Sr. Research Associate (University of Tennessee)
Hatem Ltaief received the MSc degree from the school
of engineering at the University of Claude Bernard
Lyon I, France, the MSc in applied mathematics at the
University of Houston and the PhD degree in computer
science from the University of Houston. He is a research
associate in the Innovative Computing Laboratory in the
Department of Electrical Engineering and Computer

Bob Lucas is the Computational Sciences Division
Director at the University of Southern California’s
Information Sciences Institute. He has been developing
parallel linear solvers since 1985.
hh Session(s): 2240 - Accelerating LS-DYNA with MPI,
OpenMP, and CUDA (Thursday, Sept 23, 14:30)

Lumsdaine, Andrew
Professor (Indiana University)
Andrew Lumsdaine is a professor in the School of
Informatics & Computing at Indiana University, and an
Associate Director of the Digital Science Center and
Director of the Open Systems Lab at the Pervasive
Technology Institute. Lumsdaine received his Ph.D.
from MIT in 1992, and from 1992 through 2001, he
was a faculty member in the Department of Computer
Science and Engineering at the University of Notre
Dame. His research interests include computational
science and engineering, parallel and distributed
computing, software engineering, generic programming,
mathematical software, and numerical analysis.
Lumsdaine is a member of ACM, IEEE, and SIAM, as well
as the MPI Forum, the BLAS technical forum and the
ISO C++ standards committee. In 1995, he received the
Career Development Award from the National Science
Foundation.
hh Session(s): 2093 - Computational
Photography: Real-Time Plenoptic
Rendering (Wednesday, Sept 22, 16:00)

Lunn, Philip
CEO (Bunkspeed)
Philip Lunn is the visionary founder of Bunkspeed.
He brings his passion for computer graphics to
democratize the creation of photographic quality 3D
imagery and animation to every Bunkspeed product.
Simplicity without compromise enables explosive growth
of new users and new untapped markets. Mr. Lunn
has over 20 years of technology and entrepreneurial
experience and holds a Bachelor of Science degree in
Mechanical Engineering from the University of Arizona.
hh Session(s): 4002 - Emerging Companies: CEO on
Stage featuring Allegorithmic SAS, Bunkspeed,
and miGenius (Wednesday, Sept 22, 14:00)
Titech) in April 2001, leading the Research Infrastructure
Division Solving Environment Group of the Titech Benoit Meister received his BSc in Physics and his
campus. He has won several awards including the Sakai PhD in computer science from Strasbourg University,
award for research excellence from the Information in automatic program parallelization and optimization
Processing Society of Japan in 1999, and recently using a polyhedral model of computation loops. After
received the JSPS Prize from the Japan Society for a post-doc in Verimag Grenoble, Benoit has joined
Promotion of Science in 2006 from His Royal Highness Reservoir Labs where he contributed to the development
Prince Akishinomiya. and management of R-Stream, an advanced auto-
parallelizing compiler based on extensions of the
hh Session(s): 2265 - CUDA Centers of Excellence polyhedral optimization techniques to date. At the
Super-Session IV (Tuesday, Sept 21, 17:00) moment, R-Stream successfully targets 6 radically
hh 2280 - TSUBAME2.0 Experience different architectures and parallelizes a broad range of
(Wednesday, Sept 22, 10:00) applications.
hh Session(s): 2202 - A Programming Model and
Maurer, Lance Tool for Automatic High-Performance C to
CEO (Cinnafilm, Inc.) CUDA Mapping (Thursday, Sept 23, 09:00)
Lance Maurer is the founder, president and CEO of
Cinnafilm, Inc. – an American engineering company Menon, Shashi
dedicated to global leadership in image optimization R&D Manager (Schlumberger)
using innovative and affordable parallel-processing
methods. Prior to launching Cinnafilm, Maurer hh Session(s): 2141 - Moving the Frontier of Oil
spent ten years, primarily with Goodrich Aerospace, and Gas Exploration and Production with GPUs
designing, analyzing and testing technology used in (Wednesday, Sept 22, 10:00)
the world’s most advanced spacecraft and launch
vehicles; clients include: NASA, Boeing, Ball, Kodak Meredith, Jeremy
and Lockheed-Martin. He would eventually become Computer Scientist (Oak Ridge National Laboratory)
an expert in thermal, structural and materials design hh Session(s): 2089 - Analyzing CUDA Accelerated
for extreme environmental applications, honing Application Performance at 20 PFLOP/s
regimented engineering philosophy in the true “failure- (Wednesday, Sept 22, 17:00)
is-not-an-option” defense industry. During his tenure
as an aerospace engineer he also pursued his life
Merrill, Duane Morrison, Michael
Ph.D Candidate (University of Virginia) (NVIDIA)
Duane Merrill is a Ph. D. candidate at the University of hh Session(s): 2308 - Building Cutting-Edge Realtime
Virginia, Department of Computer Science. His advisor 3D Applications with NVIDIA SceniX (Wednesday,
is Professor Andrew Grimshaw. Before graduate school, Sept 22, 10:00)
he was a software developer for Avaki Corporation,
specializing in grid computing middleware. His Mooney, Al (Adobe)
current research interests lay in parallel and high-
performance computing, specifically in regard to hh Session(s): 2224 - GPU Acceleration in Adobe
programming models and algorithmic primitives for Creative Tools (Tuesday, Sept 21, 15:00)
GPGPU, stream, and many-core architectures. Much
of his prior academic work has involved concurrent Morton, Scott
systems in one form or another, including grid and Geophysical Advisor (Hess Corporation)
distributed computing; virtual machines and hypervisor Scott Allyn Morton received his B.A. in physics &
technologies; operating systems and meta-systems; and math from Gustavus Adolphus College and his
security architecture and protocols. Ph.D. in astrophysics from University of Illinois at
hh Session(s): 2296 - CUDA Optimization for Urbana-Champaign. He has 25 years of experience
Ninjas: A Case Study of High-Performance in computational and theoretical physics distributed
Sorting (Wednesday, Sept 22, 15:00) between academia, the computer industry and the
petroleum industry. Scott has worked at NCSA (National
Micikevicius, Paulius Center for Supercomputing Applications), Shell, Thinking
Developer Technology Engineer (NVIDIA) Machines, Cray Research and SGI (Silicon Graphics Inc).
Scott manages the geophysical technology development
Paulius Micikevicius is a Developer Technology group for Hess Corporation and is responsible for
Engineer at NVIDIA with a focus on parallel computing monitoring, testing, adopting and developing new
and performance. Prior to joining NVIDIA, he was an geophysical and computational technologies.
assistant professor of Computer Science at Armstrong
Atlantic State University as well as a research associate hh Session(s): 2059 - Industrial Seismic Imaging
at the Media Convergence Laboratory at UCF. Paulius on GPUs (Wednesday, Sept 22, 11:00)
holds a PhD in Computer Science from the University
of Central Florida and a B.S. in Computer Science from Moulik, Supratik
Midwestern State University. (University of Pennsylvania)
hh Session(s): 2012 - Analysis-Driven Performance Dr. Moulik is a cardiovascular imaging fellow at the
Optimization (Thursday, Sept 23, 15:00) University of Pennsylvania. Combining an engineering
degree from Carnegie Mellon University with 10 years of
hh 2011 - Fundamental Performance Optimizations graduate medical education, Dr. Moulik is a unique blend
for GPUs (Wednesday, Sept 22, 17:00) of physician and programmer. The breadth of training
has allowed him to develop GPU computing algorithms
Miller, Phillip for the medical imaging community which are both
Director, Workstation Software Product Management intuitive and robust.
(NVIDIA)
hh Session(s): 2036 - Algorithms for Automated
Phillip Miller is an accomplished software product Segmentation of Medical Imaging Studies
manager with 16 years experience guiding industry Utilizing CUDA (Tuesday, Sept 21, 16:00)
leading solutions from companies such as Autodesk and
Adobe. He is also a registered architect, bringing real- Mroue, Abdul
world experience in using tools and directing projects Post-Doc Fellow (CITA, Univ. Of Toronto)
to software creation. At NVIDIA, Mr. Miller directs much
Abdul Mroue is a PostDoc Fellow at the Canadian
hh 2071 - Large Scale Visualization Soup returned to academia to study for a PhD at Stanford
(Wednesday, Sept 22, 11:00) University. He has since worked for Schlumberger in
a number of roles ranging from managing seismic
Naumov, Maxim research at the long term research labs to managing the
Software Engineer (NVIDIA) process for introducing new technologies.
Maxim Naumov’s expertise is in the area of parallel hh Session(s): 2142 - Complex Geophysical
numerical linear algebra. In particular, he has worked Imaging Algorithms Enabled by GPU
on parallel iterative linear systems and eigenvalue technology (Wednesday, Sept 22, 14:00)
solvers. He received his Ph.D. in Computer Science
(with specialization in Computational Science and Nicoletti, Bruno
Engineering) in 2009 and his B.Sc. in Computer Science CTO (The Foundry)
and Mathematics in 2003, all from Purdue University Bruno Nicoletti has worked in visual effects since
– West Lafayette. He currently works in NVIDIA CUDA graduating with a degree in Computer Science and
Platform team developing parallel numerical algorithms Mathematics from Sydney University in 1987. He
for Graphics Processing Units (GPUs). He has previously has worked in TV and Film production at companies
worked in the Intel Corporation Microprocessor such as Rushes, The Computer Film Company (now
Technology Lab and Computational Software Lab, and Framestore) and Animal Logic, as well as developing
received a 2008-09 Intel Foundation Ph.D. Fellowship. software commercially at Animal Logic, Softimage and
hh Session(s): 2070 - CUSPARSE Library: A Discreet Logic (now both part of Autodesk). He has
Set of Basic Linear Algebra Subroutines for developed 2D image processing software as well as 3D
animation, rendering and modelling tools. In 1996 he
or a torch bearer burning the books old rules. In his Prof. Pande is currently an Associate Professor of
23 year history in filmmaking, computer technology Chemistry and (by courtesy) of Structural Biology and of
has evolved at an alarming rate enabling consumers Computer Science at Stanford University. Prof. Pande
and professionals to harness the power of desktop received a BA in Physics from Princeton University
workstations. With the advent of CUDA’s mark in the in 1992 and PhD in physics from MIT in 1995. Prof.
HD and film making, the next logical step is Stereo 3D Pande’s current research centers on the development
on the desktop and Kevan’s unique vision and insight and application of novel grid computing simulation
will bring relevance to any audience as he is still a techniques to address problems in chemical biology. In
commissioned filmmaker in his own right. Kevan particular, he has pioneered novel distributed computing
currently works in the Quadro group at NVIDIA hoping methodology to break fundamental barriers in the
the IRS won’t catch up with him too quickly as he’s spent simulation of kinetics and thermodynamics of proteins
most of his salary on large cups of Starbucks coffee. and nucleic acids.
hh Session(s): 2279 - Working Man’s Guide to 3D hh Session(s): 2007 - Folding@home:
Video Editing (Thursday, Sept 23, 16:00) Petaflops on the Cheap Today; Exaflops
Soon? (Thursday, Sept 23, 11:00)
hh 2222 - Working Man’s Guide to 3D Video
Editing (Tuesday, Sept 21, 14:00)
Obukhov, Anton
Developer Technology Engineer (NVIDIA)
Anton Obukhov has been a Developer Technology Engineer at
NVIDIA Corporation since 2008. His fields of interest
include computer vision, video encoding, and multimedia
processing. He graduated from Moscow State University
in 2008 with a Masters Degree in Computer Science
from the Computational Mathematics and Cybernetics
Department in Russia. Before joining NVIDIA, he
conducted research and development in the Graphics
and Multimedia Lab at Moscow State University while
also working at YUVsoft Corporation.
Pappas, Jack
Co-founder, CEO (TidePowerd)
Mathematician, entrepreneur, “yankee”. Put another
way: I grew up in Princeton, NJ, then attended
the University of Alabama to study Mathematics.
Coincidentally, this is also where I met Nick,
TidePowerd’s co-founder, CFO, and resident
Frenchman. Born from my love of math and hatred
of C programming, TidePowerd was accepted as
the first-ever participant in Red Gate Software’s
“Springboard” startup incubator. Since then, we’ve
built our first tool, GPU.NET, which makes GPU
computing easier than ever!
hh Session(s): 2294 - GPU.NET with has worked on both a creative R&D team as well as
TidePowerd (Wednesday, Sept 22, 17:00) shipped several game titles at EA. Before that he worked
as a consultant while pursuing his M.Sc. in real-time
Patel, Sandeep GPU volume rendering. Eric enjoys developing real-
Assistant Professor (University of Delaware) time algorithms in traditional rendering pipelines and
volume renderers. Some of his latest work includes
hh Session(s): 2035 - Simulations of Large two chapters in the upcoming book GPU Pro 2, on skin
Membrane Regions (Wednesday, Sept 22, 11:30) shading and pixel shader amortization.
Patney, Anjul hh Session(s): 2235 - Advanced Medical
Graduate Student (University of California, Davis) Volume Rendering and Segmentation on
the GPU (Tuesday, Sept 21, 15:00)
Anjul is a third year PhD student in the Department
of Electrical and Computer Engineering at University Peterfreund, Natan
of California, Davis. He works under the guidance of CTO (Playcast Media Systems)
Prof. John Owens in the area of graphics and computer
architecture. Anjul is interested in pursuing hardware Dr. Natan Peterfreund is a world-renowned expert with
and software challenges in the design of programmable more than 20 years of research experience in video
rendering architectures. Before UC Davis, he received and image processing technologies. Prior to founding
his B.Tech. in Electrical Engineering from Indian Playcast, Dr. Peterfreund was Chief Scientist, Video
Institute of Technology, Delhi in 2007. Technologies in the DSP Group [NASDAQ: DSPG].
While serving as a principal scientist in Harmonic
hh Session(s): 2162 - Real-time Reyes: [NASDAQ: HLIT], he was one of the authors of the
Programmable Rendering on Graphics H.264 compression standard. Dr. Peterfreund holds a
Processors (Wednesday, Sept 22, 17:00) D.Sc degree in EE from the Technion Israel Institute of
Technology.
Peddie, Jon
President (Jon Peddie Research) hh Session(s): 4004 - Emerging Companies:
CEO on Stage featuring Cooliris, empulse
Dr. Jon Peddie is one of the pioneers of the graphics GmbH, and Playcast Media Systems
industry. After the successful launch of several (Wednesday, Sept 22, 16:00)
graphics manufacturing companies, Peddie began JPA
in 1984 to provide comprehensive data, information Peters, Alan
and management expertise to the computer graphics CTO (Universal Robotics, Inc.)
industry. In 2001 Peddie left JPA and formed Jon Peddie
Research (JPR) to provide customer intimate consulting Richard Alan Peters II is an Associate Professor of
and market forecasting services. Peddie lectures at Electrical Engineering at Vanderbilt University and the
numerous conferences on topics pertaining to graphics Chief Technology Officer of Universal Robotics, Inc (UR).
technology and the emerging trends in digital media During the years 2000-2008 he was a visiting researcher
technology, is the author of several books on computer on the NASA-JSC Robonaut Project. For Robonaut, he
graphics, and was named one of the most influential developed a multimodal short-term memory system
analysts. He is frequently quoted in trade and business and a sensory-motor coordination (SMC) control system
publications, and contributes articles to numerous that learns tasks from teleoperation. This technology
publications. led to the development of UR’s adaptive control
systems that learn SMC to enable industrial robots to
hh Session(s): 4001 - Emerging Companies: work in uncertain, dynamic environments. He is a Phi
CEO on Stage featuring Elemental Beta Kappa graduate of Oberlin College (Ohio) where
Technologies, Inc., Geomerics, and he received an A.B. in Mathematics (May 1979). He
Milabra (Wednesday, Sept 22, 11:00) received an M.S. (1985) and a Ph.D. (1988) in Electrical
hh 4002 - Emerging Companies: CEO on Stage Engineering from the University of Arizona, where he
featuring Allegorithmic SAS, Bunkspeed, was a fellow of the American Electronics Association. He
and miGenius (Wednesday, Sept 22, 14:00) is the author of more than 70 scientific papers and holds
hh 4003 - Emerging Companies Summit Panel: GPUs four US patents. His research interests include sensory-
for Computer Vision (Wednesday, Sept 22, 15:00) guided robotics, computer vision, image processing,
embedded systems, and mathematics.
Pedersen, Chris hh Session(s): 2091 - The GPU in the Reactive Control
Market Development Manager (NVIDIA) of Industrial Robots (Thursday, Sept 23, 16:00)
Chris studied electrical engineering at BYU, but has
spent most of his career as an intrapreneur, helping Peters, Amanda
start new businesses in large companies. He began PhD Candidate (Harvard University)
his career at Hewlett Packard where he helped Amanda received a M.S. degree in Computer Science
start businesses in communications testing, video from Harvard University and a bachelor’s degree from
servers, consumer PCs, handheld devices and digital Duke University in both physics and computer science.
entertainment products and services. He’s also created She spent three years working at IBM on the Blue Gene
startup businesses and worked as an industry analyst supercomputers where her job responsibilities included
/ consultant. Today he works with ISVs to develop applications porting and optimizing as well as system
compelling mobile applications that run on NVIDIA Tegra- performance analysis. She is currently pursuing a PhD
powered devices. in Applied Physics at Harvard University with a primary
Senior Research Programmer (University of Illinois) Pinskiy, Dmitriy
James Phillips is a Senior Research Programmer in the Sr Software Engineer (Walt Disney Animation Studios)
Theoretical and Computational Biophysics Group at the Senior Software engineer at Walt Disney Animation
Beckman Institute for Advanced Science and Technology Studios; designed and developed CG animation tools
at the University of Illinois at Urbana-Champaign. He for feature animation movies Chicken Little, Meet the
has a Ph.D. in Physics from the University of Illinois. Robinsons, Bolt, Tangled. Prior to that, working at
Since 1999, James has been the lead developer of the Alias|Wavefront (now Autodesk) on Maya, an award
highly scalable parallel molecular dynamics program winning software, implemented various animation
NAMD, for which he received a Gordon Bell Award in deformers as well as modeling / sculpting tools.
2002. His research interests include improving the Published a number of computer graphics research
performance and accuracy of biomolecular simulations papers (as recent as Eurographics 2010) and presented
through parallelization, optimization, hardware SIGGRAPH talks.
acceleration, better algorithms, and new methods. hh Session(s): 2284 - GPU implementation
hh Session(s): 2054 - NAMD, CUDA, and Clusters: of Collision-Based Deformation
Taking GPU Molecular Dynamics Beyond (Wednesday, Sept 22, 17:30)
the Desktop (Thursday, Sept 23, 14:00)
Pinto, Nicolas
Phung, Huynh PhD Student (MIT)
Research Engineer (A*STAR Institute of High Nicolas Pinto is a PhD Student in Computational
Performance Computing) Neuroscience at MIT. He is currently a member of the
Huynh Phung received a Ph.D. degree in Computer DiCarlo Lab at MIT, the Sinha Lab for Vision Research at
Science from School of Computing, National University MIT, and the Cox Visual Neuroscience Group at Harvard/
of Singapore. Currently, Huynh is working as a research Rowland. His research interests lie at the intersection of
engineer at A*STAR Institute of High Performance Brain and Computer Sciences.
Computing, Singapore. hh Session(s): 2204 - Bridging GPU Computing
hh Session(s): 2109 - Migration of a Complete and Neuroscience to Build Large-
3D Poisson Solver from Legacy Fortran Scale Face Recognition on Facebook.
to CUDA (Wednesday, Sept 22, 10:30) (Wednesday, Sept 22, 14:00)
hh 2176 - Easy GPU Meta-programming: A Case product manager and software engineer. He holds a
Study in Biologically-Inspired Computer BA in Computer Science from Willamette University
Vision (Thursday, Sept 23, 10:00) and completed the Japan Studies Program at Tokyo
International University. Outside of work, Will learns
Pissoort, Davy something new every day, usually from his two kids. He
Professor (KHBO-FMEC) enjoys hiking, camping, swimming, spending time with
Davy Pissoort was born in 1978. He received the M.S. his wonderful wife, and playing The Game.
and Ph.D. degrees in electrical engineering from hh Session(s): 2004 - Languages, APIs and
Ghent University, Ghent, Belgium, in 2001 and 2005, Development Tools for GPU Computing (Pre-
respectively. From October 2005 to October 2006, he was Conference Tutorial) (Monday, Sept 20, 13:00)
a Postdoctoral Researcher with the Fund for Scientific
Research-Flanders (Belgium) (FWO-Vlaanderen) at Rasmusson, Allan
the Department of Information Technology, Ghent PhD Candidate (University of Aarhus (NVIDIA intern))
University. In November 2006, he joined the Eesof-EDA Works within the field of Quantitative Tissue Analysis,
Department, Agilent Technologies, Ghent, Belgium, as primarily with optimizing the process of counting and
a Research Engineer. Since August 2009, he has been with the measuring cells in microscopic images of histological
KHBO, Belgium where he is also head of the Flanders’ tissue sections using 3D visualization and medical
Mechatronics Engineering Centre. His current research imaging computations.
interests include the development of fast and efficient
electromagnetic simulation methods, electromagnetic hh Session(s): 2021 - Efficient Volume Segmentation
compatibility, as well as the analysis and testing of on the GPU (Wednesday, Sept 22, 17:00)
the mechanical and thermal reliability of electronic
modules. Raue, Kristian
CEO and Founder (Jedox Business Intelligence)
hh Session(s): 2080 - Tackling Multi-Gigabit
Design Challenges with a Practical Virtual 2002 - today: CEO & Founder, Jedox Business
EMI/ESD Lab (Wednesday, Sept 22, 15:00) Intelligence
1991 - 2000: CEO & Founder, IntelliCube AG
Prevrhal, Sven 1989 - 1991: Management Consultant
Staff Research Scientist (Philips) 1982 - 1989: Technical University of Darmstadt (double
major in engineering and business administration)
PhD in Physics from Technical University Vienna, Austria
1997: work on medical imaging of Osteoporosis hh Session(s): 4005 - Emerging Companies:
Faculty Member University of California, San Francisco, CEO on Stage featuring Jedox Business
until 2009: work on Computed Tomography image Intelligence, Rocketick, and Softkinetic
reconstruction. Now with Philips Medical Imaging, San (Wednesday, Sept 22, 17:00)
Jose: work on Positron Emission Tomography image
reconstruction. Reid, Ian
Chief Commercial Officer (NAG)
hh Session(s): 2211 - Modern Architecture for
Massively Parallel Medical Tomographic As Chief Commercial Officer for the NAG Group, Ian
Image Reconstruction on a GPU Cluster has responsibility for driving all aspects of commercial
(Wednesday, Sept 22, 15:00) strategy. Ian has been with NAG for over 20 years and
has held various technical and commercial positions
Price, Daniel within the company during this time. Most recently he
Engineer (EM Photonics, Inc.) was Vice President for Business Development and he
continues to lead a worldwide team with responsibility
Dan Price is a member of the Accelerated Computing
Software Manager (NVIDIA)
hh 2236 - A Work-Efficient GPU Algorithm for Level
Set Segmentation (Thursday, Sept 23, 09:00) hh Session(s): 2024 - NVIDIA Acceleration
Engines Overview (Pre-Conference
Robison, Austin Tutorial) (Monday, Sept 20, 13:00)
Lead Developer, OptiX integration (NVIDIA)
Austin Robison is a Research Scientist at NVIDIA Sakharnykh, Nikolai
working as part of the OptiX team on GPU ray tracing. Developer Technology Engineer (NVIDIA)
His research interests include high performance Nikolai Sakharnykh is a developer technology engineer
ray tracing, physically-based rendering and hybrid at NVIDIA. He has worked with game developers
rendering. Austin holds a B.S. in Computer Science providing support for graphics technology content.
from the University of Chicago and an M.S. in Computer Recently he focused on GPU compute and CUDA.
Science from the University of Utah. Currently he is working on CFD-related projects and
hh Session(s): 2250 - GPU Ray Tracing Exposed: supporting CUDA customers. His interests include
Under the Hood of the NVIDIA OptiX Ray computational fluid dynamics, sparse matrix solvers
Tracing Engine (Tuesday, Sept 21, 17:00) and visualization techniques. Nikolai graduated with
honours from Moscow State University, the department
Rogan, Aaron of Computational Mathematics and Cybernetics as a
Research Scientist and System Administrator (Neva Ridge specialist in applied mathematics and informatics.
Technologies) hh Session(s): 2015 - Efficient Tridiagonal
I currently work for Neva Ridge Technologies where Solvers for ADI methods and Fluid
I specialize in radar image processing. Over the past Simulation (Tuesday, Sept 21, 14:00)
year and a half I have transitioned legacy code to run on a
GPU which has resulted in performance improvements Salian, Satish
anywhere from 20x to 400x depending on the algorithm. Manager CUDA Debugger Tools (NVIDIA)
My interests are in GPGPU programming, image Satish Salian is the Manager for CUDA debugger tools
processing, algorithm development and numerical at NVIDIA, where he is responsible for the strategy,
simulations. direction and development of CUDA tools and support for
hh Session(s): 2003 - Using CUDA to Accelerate Radar CUDA developers. He joined NVIDIA in 2001 and before
Image Processing (Thursday, Sept 23, 15:00) moving to CUDA was responsible for the development of
NVIDIA’s Graphics tools and NVAPI SDK. Satish received University in 1974. He is Swanlund Professor of
his Bachelors degree in Computer Engineering from Physics and is also affiliated with the Department of
University of Pune, India. Chemistry as well as with the Center for Biophysics and
hh Session(s): 2002 - CUDA Debugging on Linux and Computational Biology. Professor Schulten is a full-time
MacOS with cuda-gdb (Thursday, Sept 23, 10:00) faculty member in the Beckman Institute and directs
the Theoretical and Computational Biophysics Group at
Sanders, Jason the University of Illinois Urbana-Champaign, IL. Honors
Senior Software Engineer (NVIDIA) and awards: Award in Computational Biology 2008;
Humboldt Award of the German Humboldt Foundation
Jason Sanders is a senior software engineer in the (2004); University of Illinois Scholar (1996); Fellow of the
CUDA Platform Group at NVIDIA. While at NVIDIA, he American Physical Society (1993); Nernst Prize of the
helped develop early releases of CUDA system software Physical Chemistry Society of Germany (1981).
and co-wrote the book CUDA by Example. Jason received
his M.S. in computer science from the University of hh Session(s): 1002 – Day 2 Keynote
California Berkeley where he published research in with Dr. Klaus Schulten, University of
GPU computing, and he holds a B.S.E. in electrical Illinois at Urbana-Champaign
engineering from Princeton University. Prior to joining
NVIDIA, he held positions at ATI Technologies, Sheehan, Andrew T.
Apple, and Novell. When he’s not coding or writing Managing Director (Sutter Hill Ventures)
books, Jason is typically in the gym, playing soccer, or Andy Sheehan focuses his investments on internet
shooting photos. software, services and digital media companies.
hh Session(s): 2131 - Introduction to CUDA C (Pre- Andy currently is a director of Buzz Media, Inc., Global
Conference Tutorial) (Monday, Sept 20, 14:30) Liquid Markets, LLC, Grain Communications Group, Inc.,
and Yext, Inc. His prior directorships have included, @
Sarzo, Rudy Road (acquired by Trimble), AllBusiness.com, BakBone
Principal (SMI) Software, Datran Media, Intermix Media and Myspace
(acquired by News Corp.) and ReachLocal. Andy joined
Rudy has been a professional recording and performing the firm in 2007 from VantagePoint Venture Partners,
artist worldwide for over 20 years. As a member of Ozzy where he was a managing director. Previously, he worked
Osbourne’s band, from March 1981 to September 1982, at Alex. Brown & Sons and ABS Capital Partners. Andy
Rudy toured the world in support of the “Blizzard of Ozz” received his BA from Dartmouth College with a degree
and “Diary Of a Madman” records. His bass playing can in English. He earned his MBA in 1985 from the Wharton
be heard on Ozzy’s multimillion selling CD “Tribute” and School.
“Speak of the Devil” CD and DVD.
hh Session(s): 4009 - Emerging Companies
hh Session(s): 2279 - Working Man’s Guide to 3D Summit Panel: The “New Normal” For Building
Video Editing (Thursday, Sept 23, 16:00) Emerging Companies Based On Disruptive
Technologies (Thursday, Sept 23, 14:00)
Scherl, Holger
Computer Scientist (Siemens AG) Shoemaker, Austin
Holger is a computer scientist in the field of medical CTO and Co-Founder (Cooliris)
image processing. He pursues his PhD studies at Austin is CTO and co-founder of Cooliris. Austin was
the University of Erlangen-Nuremberg, Germany, a master’s degree student in Computer Science at
specializing in hardware-accelerated cone-beam CT Stanford University specializing in artificial intelligence,
reconstruction. Since 2007 he has been a system architect and stopped out to lead technology and product
in R&D at Siemens Healthcare and focuses on the development for the Cooliris platform. Austin is fluent in
development and implementation of cone-beam CT Spanish and Mandarin Chinese. Prior to his involvement
reconstruction algorithms on graphics hardware. with Cooliris, Austin worked at Apple Computer for
hh Session(s): 2096 - High-Speed CT Reconstruction seven years, contributing to product development efforts
in Medical Diagnosis & Industrial NDT in several divisions.
Applications (Tuesday, Sept 21, 11:00) hh Session(s): 4004 - Emerging Companies:
CEO on Stage featuring Cooliris, empulse
Schneider, Jens GmbH, and Playcast Media Systems
Postdoctoral Fellow (King Abdullah University of Science (Wednesday, Sept 22, 16:00)
and Technology)
Dr. Jens Schneider received his MA from RWTH Aachen, Silva, Claudio
Germany in 2003 and his doctorate from Technische Professor (University of Utah)
Universitaet Muenchen in 2009. He is currently a Claudio T. Silva is a professor of computer science
postdoctoral fellow at the King Abdullah University of and a faculty member of the Scientific Computing
Science and Technology (KAUST) where he works in the and Imaging (SCI) Institute at the University of Utah.
Geometric Modeling and Scientific Visualization Center. He coauthored more than 150 technical papers and
His research interests include GPU-based algorithms, eight U.S. patents, primarily in visualization, geometric
Computer Graphics, Scientific Visualization, and GPU- computing, and related areas. He received IBM Faculty
friendly Data Compression. Awards in 2005, 2006, and 2007, and best paper awards
hh Session(s): 2139 - Interactive Histology at IEEE Visualization 2007 and IEEE Shape Modeling
of Large-Scale Biomedical Image Stacks International 2008. He is a member of the ACM,
(Wednesday, Sept 22, 14:00) Eurographics, and IEEE.
hh Session(s): 2248 - Parallel Processing on GPUs at
Schuh, Andrew the University of Utah (Wednesday, Sept 22, 14:00)
Project Manager (University of Illinois)
hh Session(s): 2249 - New Programming Tools Sinclair, Matt
GPU Computing (Wednesday, Sept 22, 10:00) Research Assistant (UW-Madison)
Matt Sinclair is a Ph.D. student at the University of
Schulten, Klaus Wisconsin-Madison in the Electrical and Computer
Professor (University of Illinois, Urbana-Champaign) Engineering Department. He received his B.S. degrees
Klaus Schulten received his Ph.D. from Harvard
in Computer Engineering and Computer Science with HPC machines.
Honors from the University of Wisconsin-Madison in hh Session(s): 2017 - Lessons Learned
2009. His interests lie in processor microarchitecture Deploying the World’s First GPU-Based
and high performance computing. Petaflop System (Tuesday, Sept 21, 15:00)
hh Session(s): 2044 - GRASSY: Leveraging
GPU Texture Units for Asteroseismic Data Spatz, Pierre
Analysis (Wednesday, Sept 22, 15:00) Head of Quantitative Research (Murex SAS)
Pierre joined Murex in 1989 and has a master's degree
Snepvangers, Jeroen in computer science and applied mathematics from
President and CEO (RTT USA) ENSIMAG. After various leading positions in the Murex
Jeroen Snepvangers is President and CEO of RTT software development team, Pierre launched the Murex
USA, Inc., a subsidiary of RTT AG, a public company Analytics initiative in 2002.
trading at the Frankfurt Stock Exchange. RTT is the hh Session(s): 2032 - Practical Methods Beyond
largest real-time 3D computer graphics software and Monte Carlo in Finance (Thursday, Sept 23, 10:00)
CGI animation services provider to the Automotive and
Aircraft industries. The company serves its customers in Srinivasan, Savitha
industrial design and digital (3D CGI) marketing. Prior Corporate Venture Partner (IBM)
to joining RTT, Jeroen was a Management Consultant
at Urban Science Applications Inc., a technology- Savitha Srinivasan is a Corporate Venture Partner in
based management consultancy for the automotive IBM’s Venture Capital Group in Corporate Strategy
and financial industries, focusing on optimizing retail where she develops strategic relationships with venture
networks. He obtained his MBA at IMD, Switzerland capitalists and their portfolio companies to leverage
in 2002. He has a Bachelors Degree in Applied external innovation for mutual strategic advantage.
Mathematics from Warwick University, UK. He has an She has nearly 20 years of experience at IBM in
IB, International Baccalaureate from St. Clare’s College, leadership roles addressing the strategic priorities of
Oxford, UK IBM’s Services businesses. She leads the strategic
development of IBM’s Services venture ecosystem,
hh Session(s): 4007 - Emerging Companies: CEO with each of the Global Technology Services business
on Stage featuring Aqumin, RTT, and Scalable units – Strategic Outsourcing, Integrated Technology
Display Technologies (Thursday, Sept 23, 10:00) Services and Managed Business Process Services
by early identification of companies, fostering pilots,
Solano, Lizandro partnerships and contributing to M&A pipeline.
(Iowa State University)
hh Session(s): 4007 - Emerging Companies: CEO
Lizandro Solano-Quinde received his MSc. in Electrical on Stage featuring Aqumin, RTT, and Scalable
Engineering in 2006 at Iowa State University; currently Display Technologies (Thursday, Sept 23, 10:00)
he is a Ph.D. candidate in Computer Engineering at
Iowa State University under the advisement of Dr. hh 4008 - Emerging Companies: CEO on
Brett Bode and Dr. Arun Somani. He is affiliated with the Stage featuring ICD and Universal
Scalable Computing Laboratory at the Ames Laboratory, Robotics (Thursday, Sept 23, 11:00)
Department of Energy. His research interests are in
the fields of High Performance Computing, Computer Stam, Joe
Architecture and Fault Tolerance. Sr. Applications Engineer (NVIDIA)
hh Session(s): 2292 - Implementation of Joseph Stam is a Senior Applications Engineer for
High-Order Adaptive CFD Methods on NVIDIA Corporation. He has a focus on computer vision,
GPUs (Thursday, Sept 23, 10:30) video, and image processing applications of Graphics
Processors for both professional and embedded
Somani, Arun products. Prior to joining NVIDIA in 2007, he worked
Anson Marston Professor (Iowa State University) in the automotive industry for 12 years on research
and development of imaging hardware and computer
Arun K. Somani is currently Anson Marston vision algorithms for vehicle based vision products.
Distinguished Professor and Jerry R. Junkins Endowed Joe received a B.S. degree in Engineering Physics
Chair Professor of Electrical and Computer Engineering & Computer Science from Hope College in Holland,
at Iowa State University where he first served as David Michigan and an M.S. degree in Electrical Engineering
C. Nicholas Professor during 1997-2002. He earned his from Michigan State University. He is an inventor on
MSEE and PhD degrees in electrical engineering from 82 U.S. patents and several foreign patents, many of
McGill University, Montreal, Canada, in 1983 and which relate to computer vision software and imaging
1985, respectively. He worked as Scientific Officer for hardware technologies. Joe resides in Holland, Michigan
Govt. of India, New Delhi from 1974 to 1982 and as a with his wife and four children.
faculty member at the University of Washington, Seattle,
WA from 1985 to 1997 in electrical engineering and hh Session(s): 2215 - Extending OpenCV with
computer science and engineering departments where GPU Acceleration (Thursday, Sept 23, 10:00)
he was promoted to the level of Professor in September hh 4003 - Emerging Companies Summit Panel: GPUs
1995. for Computer Vision (Wednesday, Sept 22, 15:00)
hh Session(s): 2292 - Implementation of
High-Order Adaptive CFD Methods on Stephan, Philippe
GPUs (Thursday, Sept 23, 10:30) CTO (RMS)
hh Session(s): 2110 - Acceleration of a establishing the semiconductor research practice at
Novel Rotorcraft Wake Simulation Alex. Brown and Sons.
(Thursday, Sept 23, 10:00) hh Session(s): 4010 - Emerging Companies: CEO
on Stage featuring NaturalMotion Ltd, OptiTex,
Stone, John and Useful Progress (Thursday, Sept 23, 15:00)
Senior Research Programmer (University of Illinois at
Urbana-Champaign) hh 4011 - Emerging Companies: CEO on Stage
featuring Cinnafilm, Inc., Perceptive Pixel, and
John Stone is a Senior Research Programmer in the Total Immersion (Thursday, Sept 23, 16:00)
Theoretical and Computational Biophysics Group
at the Beckman Institute for Advanced Science and Tal, Uri
Technology, and Associate Director of the NVIDIA CUDA CEO (Rocketick)
Center of Excellence at the University of Illinois. Mr.
Stone is the lead developer of VMD, a high performance Uri Tal has 12 years of experience in management,
molecular visualization tool used by researchers all over design and implementation of hardware acceleration
the world. His research interests include molecular technologies. Before founding Rocketick, Uri managed
visualization, GPU computing, parallel processing, ray a large R&D team that developed FPGA-based
tracing, haptics, and virtual environments. acceleration solutions in the intelligence corps of the
Israeli Defense Forces, and was a system architect at
hh Session(s): 2073 - High Performance Molecular Siliquent / Broadcom. B.Sc. (Summa Cum Laude) and
Simulation, Visualization, and Analysis M.Sc. in Electrical Engineering, Technion.
on GPUs (Wednesday, Sept 22, 16:00)
hh Session(s): 4005 - Emerging Companies:
coordinating the customer support in cooperation with his last role was Head of Engineering. He received a B.S.
Engineering. Mr. Thamm joined mental images in 1989. EE from University of Brussels and an MSc in Computer
He has led several key projects of mental images, such Science from University of California, Santa Barbara.
as the definition of the extended OBJ format and the hh Session(s): 4005 - Emerging Companies:
integration of mental ray into many of the major CAD CEO on Stage featuring Jedox Business
systems. He has studied Mathematics and subsequently Intelligence, Rocketick, and Softkinetic
developed free-form surface algorithms and various 3D (Wednesday, Sept 22, 17:00)
file formats.
hh Session(s): 2014 - Scalable Subsurface Tomov, Stan
Data Visualization Framework (University of Tennessee)
(Wednesday, Sept 22, 17:00) Stanimire (Stan) Tomov is a Research Scientist at ICL
and Adjunct Assistant Professor in EECS at UTK. He
Thomason, Lee received a Ph.D. in Mathematics from TAMU in 2002
Principal Scientist (Adobe Systems) and held positions at LLNL and BNL. Stan’s research
Lee Thomason is a principal scientist and Flash Player interests are in parallel algorithms, numerical analysis,
architect for Adobe Systems. He prototypes GPU and high-performance scientific computing. He co-leads
technology and leads the GPU development for the Flash the UTK’s CCOE on the development of MAGMA, a new
Player. generation of linear algebra libraries, extending the
hh Session(s): 2060 - GPUs in a Flash: Mapping sequential LAPACK-style algorithms, for highly parallel,
the Flash Animated Software Vector Rendering GPU and multicore heterogeneous architectures.
Model to the GPU (Tuesday, Sept 21, 17:00) hh Session(s): 2138 - Faster, Cheaper,
Better – Hybridization of Linear Algebra
Thrun, Sebastian for GPUs (Thursday, Sept 23, 09:00)
Professor / Distinguished Engineer (Stanford University hh 2263 - CUDA Centers of Excellence Super-
/ Google) Session II (Tuesday, Sept 21, 15:00)
Sebastian Thrun is a professor of computer science
and electrical engineering at Stanford, where he Townsend, Richard
directs the Stanford AI Lab. He is also a distinguished Assistant Professor (University of Wisconsin-Madison)
engineer at Google. Thrun’s team won the DARPA Grand Richard Townsend is a computational astrophysicist
at the University of Wisconsin-Madison, interested in Valich, Theo
the rotation, oscillations, magnetic fields and outflows President (Bright Side Network Inc)
of massive, luminous stars. His recent research has Theo Valich founded Bright Side Network and its
focused on investigating how GPU computing can be subsidiaries, developing products such as next-
brought to bear on the steep data analysis challenges generation automotive user interface, CPU/GPGPU
faced in Asteroseismology. Initial projects showing computational benchmark, proprietary web engine and
particular promise include GPU-accelerated period providing high-end 4K resolution videos in fully digital
searching in non-uniformly sampled data, and fast video production studio, utilizing latest GPU technology
spectrum interpolation by leveraging the untapped developments. Prior to founding Bright Side Network,
capabilities of GPU texturing units. Valich served as GPGPU technology analyst at JPR, CTO
hh Session(s): 2082 - CU-LSP: GPU-based at Provox, Senior Editor at TG Daily and Tom’s Hardware,
Spectral Analysis of Unevenly Sampled as well as Contributing Editor on The Inquirer.
Data (Wednesday, Sept 22, 10:00) hh Session(s): 2303 - Using Tegra to
Solve The Electric Car Power Dilemma
True, Thomas (Tuesday, Sept 21, 14:00)
Applied Engineer (NVIDIA)
hh 2304 - Harnessing the GPU to Accelerate
Tom is an Applied Engineer in NVIDIA’s Professional Automotive Development (Tuesday, Sept 21, 17:00)
Solutions Group where he focuses on the use of GPUs
in broadcast, video and film applications ranging from Vandermersch, Philippe
pre-visualization to post production and live to air. Prior Senior Software Engineer (NVIDIA)
to joining NVIDIA, Tom was an Applications Engineer at
SGI. Thomas has an M.S. degree in Computer Science Philippe Vandermersch joined the CUDA Platform
from the Graphic Lab at Brown University and a B.S. group in 2009, leading the development of the CUBLAS
Degree from the Rochester Institute of Technology. and CUSPARSE Libraries. Before that, Philippe was a
Senior Video architect, working on the NVIDIA Multi-
hh Session(s): 2159 - Programming the NVIDIA Standard Decoder solution (MSDEC). Prior to joining
Digital Video Pipeline with Direct3D (Pre- NVIDIA, Philippe worked as an embedded engineer
Conference Tutorial) (Monday, Sept 20, 14:30) at Equator Technologies and as a DSP engineer at
hh 2158 - Programming the NVIDIA Digital Siemens ICN. He is the inventor of two patents, and
Video Pipeline with OpenGL (Pre-Conference holds a Master’s degree from the Institut National des
Tutorial) (Monday, Sept 20, 13:00) Telecommunications in France.
hh 2161 - NVIDIA Quadro Digital Video Pipeline hh Session(s): 2216 - CUDA Libraries Open
Overview (Tuesday, Sept 21, 16:00) House (Wednesday, Sept 22, 11:00)
Tzeng, Stanley and a doctorate from Columbia University.
Graduate Student (University of California, Davis) hh Session(s): 2027 - GPU-Based Image Processing
Stanley is a 3rd year PhD student at the University of in Military Applications (Thursday, Sept 23, 09:00)
California, Davis working with Professor John Owens.
Over the summer he is working at Microsoft Research Varhol, Peter
in Redmond with the Extreme Computing Group. His HPC Editor (Desktop Engineering Magazine)
research involves GPGPU algorithms, task parallel Peter Varhol is an industry veteran with over twenty
scheduling and alternative rendering pipelines on the years’ experience as a technology journalist, software
GPU. In his free time he loves to go explore different developer, product manager, and university professor.
places to eat and go on puzzle hunts. Currently he is HPC editor for Desktop Engineering
hh Session(s): 2162 - Real-time Reyes: magazine, and also principal at industry consulting
Programmable Rendering on Graphics firm Technology Strategy Research. He has graduate
Processors (Wednesday, Sept 22, 17:00) degrees in computer science, applied mathematics, and
psychology.
Uzzan, Bruno hh Session(s): 2130 - GPU Computing and a Revolution
CEO & Founder (Total Immersion) in Design Engineering (Tuesday, Sept 21, 11:00)
Bruno oversees operations and business development
for Total Immersion. He is principally responsible Varshney, Amitabh
for building the company’s client roster, including Professor (University of Maryland)
Renault, Peugeot, BMW, Disney, EADS, CBS, Thomson Amitabh Varshney is a Professor of Computer Science
and SGI Japan. Before establishing Total Immersion, at the University of Maryland at College Park where
Uzzan served as a consultant for Pierre Henri Scacchi he directs the NVIDIA CUDA Center of Excellence. His
and Associates (Price Waterhouse Group). He holds a interests include GPU-based heterogeneous parallel
master’s degree in management from the University of computing for computational biology, nano assembly,
Paris Dauphine. plasma physics, climate modeling and several other
hh Session(s): 4011 - Emerging Companies: CEO on applications. Varshney received an NSF CAREER Award in
Stage featuring Cinnafilm, Inc., Perceptive Pixel, 1995 and the IEEE Visualization Technical Achievement
and Total Immersion (Thursday, Sept 23, 16:00) Award in 2004. He is a Fellow of IEEE.
hh Session(s): 2263 - CUDA Centers of Excellence
Super-Session II (Tuesday, Sept 21, 15:00)
Vo, Huy
Venkataraman, Shalini Research Assistant (University of Utah)
Applied Engineer (NVIDIA) Huy T. Vo is a PhD student at the SCI Institute, University
Shalini Venkataraman is an applied engineer of Utah, under the supervision of Professor Claudio
with NVIDIA’s professional solutions group where T. Silva. His main research interests are in High
she focuses on using GPUs to solve graphics and Performance Computing and Visualization Systems. Huy
visualization problems in the medical and oil & gas is currently working on HyperFlow, a parallel streaming
communities. Prior to joining NVIDIA, she was a framework for large-scale visualization where data-
research staff member in scientific visualization at several flows can be executed efficiently on clusters of machines
institutions including the Center for Computation and with multiple CPUs and GPUs.
Technology at LSU and in Singapore, at the Institute hh Session(s): 2248 - Parallel Processing on GPUs at
of High-Performance Computing and the Center for the University of Utah (Wednesday, Sept 22, 14:00)
Information-Enhanced Medicine. Her interests include
scalable graphics and display environments, large Volkov, Vasily
volume visualization and higher bit depth rendering. PhD Student (UC Berkeley)
She earned her Master’s degree from the Electronic
Visualization Lab at the University of Illinois-Chicago and Vasily Volkov has contributed to substantial performance
B.Sc from the National University of Singapore. improvements in CUBLAS and CUFFT and received
an NVIDIA Graduate Fellowship in 2008. He is currently a
hh Session(s): 2009 - 4D Visualization and Ph.D. candidate at UC Berkeley.
Analysis of Flow (Tuesday, Sept 21, 17:00)
hh Session(s): 2238 - Better Performance at Lower
Vermes, Domokos Occupancy (Wednesday, Sept 22, 15:00)
Associate Professor (Worcester Polytechnic Institute)
Vuik, Kees
Domokos Vermes received his MS in Electrical Professor (Delft University Of Technology)
Engineering at the Technische Universität Dresden.
He then earned his Ph.D. in Mathematics at the hh Session(s): 2049 - Deflated Preconditioned
University of Szeged and Doctorate in Mathematics at Conjugate Gradient on the GPU
the Hungarian Academy of Sciences. Currently, he is the (Wednesday, Sept 22, 14:30)
Associate Professor of Mathematics and the Founding
Director of Financial Mathematics Graduate Program
and Laboratory at the Worcester Polytechnic Institute. In Vukicevic, Vladimir
the past, he attended the University of Washington and Principal Engineer (Mozilla Corporation)
Brown University. He specializes in: optimization under Vladimir is a principal engineer at Mozilla, where he
uncertainty, optimal control of stochastic processes, computer works on core browser technology. He is involved in
assisted medical decision making, quantitative finance, adding new capabilities to the web platform for use by
portfolio optimization and risk management, and high- both web content and the Firefox browser, and focuses
performance data analysis. on improving the rich media and graphics capabilities of
hh Session(s): 2111 - Using R for High-Performance the web. His early experiments with 3D in HTML canvas
Data Analysis (Tuesday, Sept 21, 16:30) led to the WebGL standard.
hh Session(s): 2113 - WebGL: Bringing 3D
Vetter, Jeffrey to the Web (Tuesday, Sept 21, 15:00)
Professor / Distinguished R&D Staff Member (Georgia
Tech / Oak Ridge National Lab) Wade, Will
Jeffrey Vetter, Ph.D., has a joint appointment between Business Alliance Manager (HP)
Oak Ridge National Laboratory (ORNL) and the Georgia Will Wade is the Business Alliance Manager for Hewlett-
Institute of Technology (GT). At ORNL, Vetter is a Packard Company’s Workstation Global Business Unit.
Distinguished R&D Staff Member, and the founding His responsibilities include working with Intel, AMD, and
group leader of the Future Technologies Group in NVIDIA to develop programs and solutions that benefit
the Computer Science and Mathematics Division. At workstation users. Will joined Hewlett-Packard in 1997
GT, Vetter is a Joint Professor in the Computational as an engineer in Test & Measurement. In 1999 he
Science and Engineering School, where he serves as the transitioned to the Environmental Test Center for the
Principal Investigator of the NSF Track 2D Experimental workstation business and later moved into a technical
Computing Facility, named Keeneland, for large scale marketing role working with software partners for
heterogeneous computing using graphics processors, workstation applications. He became a Workstation
and of the NVIDIA CUDA Center of Excellence. Product Manager in 2003 where he led the workstation
hh Session(s): 2262 - CUDA Centers of Excellence graphics strategy, and later the entry workstation
Super-Session I (Tuesday, Sept 21, 14:00) business.
hh Session(s): 2233 - Solving Your GPU Computing
Vidal, Antonio M. Needs (Sponsored by HP) (Tuesday, Sept 21, 14:00)
Professor (Universidad Politecnica de Valencia)
Antonio M. Vidal received his Ph.D. degree in Computer Walker, Ross
Science in 1990, from the Universidad Politecnica de Research Professor (San Diego Supercomputer Center)
Valencia, Spain, where he is currently a full professor. He Ross Walker is a Research Professor at the San Diego
coordinates the project “High Performance Computing Supercomputer Center, an Adjunct Professor in the
on Current Architectures for Problems of Multiple Signal Chemistry and Biochemistry at UCSD and an NVIDIA
Processing”, financed by the Generalitat Valenciana in CUDA Fellow. He runs the Walker Molecular Dynamics
the frame of PROMETEO Program for research groups (MD) Lab leading a team that develops advanced
of excellence. His main areas of interest include parallel techniques for MD Simulations supporting simulations
computing with applications in numerical linear algebra improving drug and biocatalyst design. His work includes
and signal processing. improved Quantum Mechanical, Molecular Mechanical
hh Session(s): 2116 - Real-time Multichannel models and the development of a GPU accelerated
Audio Convolution (Thursday, Sept 23, 10:00) version of the AMBER Molecular Dynamics engine
PMEMD. for electromagnetics simulations. He co-authored the
hh Session(s): 2269 - Bringing GPUs first major text on discontinuous Galerkin methods,
to Mainstream Molecular Dynamics published by Springer in 2008. He is currently an
Packages (Thursday, Sept 23, 10:00) associate professor in the department of Computational
and Applied Mathematics at Rice University.
Wang, Long hh Session(s): 2078 - Shockingly fast and accurate
Associate Professor (Super Computing Center, Institute of CFD simulations (Wednesday, Sept 22, 11:00)
Computer Network Information of CAS)
Dr. Long Wang received his Ph.D. in Computational Warren, Stephen
Mathematics from AMSS, CAS (Chinese Academy of Snr Linux Software Engineer (NVIDIA)
Sciences), in 2004. He was then a postdoc in the Department of Stephen Warren is a Software Engineer at NVIDIA,
Scientific & Engineering Computing at Peking University working on VDPAU (Video Decode and Presentation
from 2004 to 2006. He is now an associate API for Unix) and related portions of the Linux graphics
professor in the Super Computing Center, Computer driver.
Network Information Center of CAS. His research hh Session(s): 2016 - VDPAU: PureVideo
interests include AMR algorithm, large scale GPU on Unix (Thursday, Sept 23, 15:00)
computing and high performance computing software.
He implemented a parallel galactic wind code using 8192 Washbrook, Andy
cores on DeepComp 7000 supercomputing machine in Postdoctoral Research Assistant (University of Edinburgh)
2008. This summer, he held the first international GPU
workshop in Harbin, China (See: gpu-smp.sccas.cn). Dr. Andrew Washbrook is a physicist programmer based
at the University of Edinburgh working for the GridPP
hh Session(s): 2286 - Towards Peta-Scale collaboration. His previous physics research investigated
Green Computation - Applications of the GPU evidence for supersymmetric particle production
Supercomputers in the Chinese Academy of at CERN and he has also been a technical account
Sciences (CAS) (Wednesday, Sept 22, 11:00) manager for a leading enterprise open source software
company. His current research interests include
Wang, Peng investigating emerging computing methods that can be
Developer Technology Engineer (NVIDIA) used to improve the future software framework of the
Peng Wang is a member of the Developer Technology ATLAS experiment.
group at NVIDIA, where he develops algorithms for GPU hh Session(s): 2135 - Processing Petabytes
computing. Dr. Wang received his Ph.D. in Computational per Second with the ATLAS experiment
Physics from Stanford University, where his primary at the Large Hadron Collider at CERN
research was the development of multi-physics codes (Wednesday, Sept 22, 16:00)
for computational fluid dynamical simulations of
astrophysical turbulence. Dr. Wang’s education also Weber, Jason
includes a M.S. in Physics and a B.S. in Scientific Internet Explorer Performance Lead (Microsoft)
Computing from Nankai University, Tianjin, P.R. China.
Jason Weber is an engineering lead on the Microsoft
hh Session(s): 2008 - OpenCL Optimization Internet Explorer team. Jason is focused on ensuring
(Thursday, Sept 23, 14:00) that Internet Explorer 9 is ready for the performance
hh 2006 - Short-Range Molecular Dynamics demands of HTML5 applications, including hardware
on GPU (Wednesday, Sept 22, 14:00) accelerated graphics and compiled JavaScript. Jason has
been with Microsoft for thirteen years. Before joining
Wang, Xiaowei the Internet Explorer team in 2008, Jason worked on
(Institute of Process Engineering) projects ranging from Microsoft Office to Visual Studio,
and was a member of Chairman Bill Gates’ technical
hh Session(s): 2286 - Towards Peta-Scale
staff.
Green Computation - Applications of the GPU
Supercomputers in the Chinese Academy of hh Session(s): 2274 - Harnessing the Power of the
Sciences (CAS) (Wednesday, Sept 22, 11:00) GPU in Internet Explorer 9 (Tuesday, Sept 21, 16:00)
High-Order Adaptive CFD Methods on an instrumental role in the $1.8 billion acquisition
GPUs (Thursday, Sept 23, 10:30) of Omniture, Inc., the $3.4 billion acquisition of
Macromedia, Inc., and a number of smaller acquisitions
Warburton, Timothy and venture investments.
Associate Professor (Rice University)
hh Session(s): 4010 - Emerging Companies: CEO
Tim Warburton specializes in devising new algorithms on Stage featuring NaturalMotion Ltd, OptiTex,
for solving partial differential equations. He is a leader
in the development of discontinuous Galerkin methods
and Useful Progress (Thursday, Sept 23, 15:00) Winarsky, Norman
hh 4011 - Emerging Companies: CEO on Stage VP Ventures, Licensing, and Strategic Programs (SRI
featuring Cinnafilm, Inc., Perceptive Pixel, and International)
Total Immersion (Thursday, Sept 23, 16:00) Norman Winarsky is SRI’s Vice President of Ventures,
Licensing, and Strategic Programs. As such he is
Whitehead, Nathan responsible for creating SRI’s highest value venture and
CUDA Software Engineer (NVIDIA) license opportunities. He is the creator and founder of
Nathan Whitehead works on the CUDA Platform team. SRI’s venture process, including venture and license
He holds a PhD in Computer Science from the University incubation, seed funding, the EIR program, and Venture
of California, Santa Cruz. Capital engagement. He chairs SRI’s Commercialization
Board and the nVention Board, a partnership with the
hh Session(s): 2216 - CUDA Libraries Open venture capital community that develops early-stage
House (Wednesday, Sept 22, 11:00) investment opportunities.
Williams, David M. hh Session(s): 4007 - Emerging Companies: CEO
PhD Candidate (Stanford University) on Stage featuring Aqumin, RTT, and Scalable
Display Technologies (Thursday, Sept 23, 10:00)
Mr. Williams is a Ph.D. candidate in the Aero/Astro
department at Stanford University under the advisement hh 4008 - Emerging Companies: CEO on
of Professor Antony Jameson. In particular, Mr. Williams Stage featuring ICD and Universal
is interested in developing efficient higher-order solvers Robotics (Thursday, Sept 23, 11:00)
capable of handling real world applications. He focuses
on characterizing fluid flow over complex geometries Witchel, Emmett
under viscous, unsteady, and compressible conditions. Professor (University of Texas at Austin)
hh Session(s): 2079 - A Fast, Scalable High- hh Session(s): 2124 - Operating System Abstractions
Order Unstructured Compressible Flow for GPU Programming (Thursday, Sept 23, 10:00)
Solver (Tuesday, Sept 21, 11:00)
Woolley, Cliff
Williams, Ian CUDA Developer Technology Engineer (NVIDIA)
Director PSG Applied Engineering (NVIDIA) Cliff Woolley is a developer technology engineer
Ian Williams is currently the Director of Applied at NVIDIA focused on enabling high-performance
Engineering within NVIDIA’s Professional Solutions computing on GPUs. He completed his Master of
Group, where he has worked since 2001. He holds Computer Science degree at the University of Virginia in
a BSc in Engineering Science and Technology from 2003, where his research group was among the earliest
Loughborough University (UK) as well as an MBA in academia to investigate the use of GPUs for general
from Pepperdine University (USA). He is a Chartered purpose computing.
Mechanical Engineer with the Institute of Mechanical hh Session(s): 2018 - OpenCL on the GPU (Pre-
Engineers (UK) and has been awarded 7 patents. He is Conference Tutorial) (Monday, Sept 20, 16:00)
also Chairman of the SPEC/GPC committee, which is part of
the Standard Performance Evaluation Corporation. Wu, Ren
hh Session(s): 2279 - Working Man’s Guide to 3D Senior Scientist (HP Labs)
Video Editing (Thursday, Sept 23, 16:00) Dr. Ren Wu is a Senior Research Scientist at HP Labs,
hh 2222 - Working Man’s Guide to 3D Video Palo Alto. His research interests include data-intensive
Editing (Tuesday, Sept 21, 14:00) high-performance computing, massively parallel
algorithms and computational intelligence. In recent
hh Session(s): 2085 - Tridiagonal Solvers: Auto-
Young, Eric Tuning and Optimizations (Tuesday, Sept 21, 15:00)
Manager of Developer Technology Professional and
Consumer Applications (NVIDIA) Zhang, Yubo
PhD Student (UC Davis)
Eric Young manages the developer technology group
responsible for professional and consumer developers. Yubo Zhang is a PhD student supervised by Prof.
He graduated from Cornell University in 1995 with Kwan-Liu Ma at the department of Computer Science,
a Master of Computer Engineering and from the University UC Davis. His research interests include numerical
of Michigan in 1994 with a Bachelor’s in Electrical methods, computer graphics and visualization.
Engineering. hh Session(s): 2145 - Photo Editing on the GPU
hh Session(s): 2260 - DirectCompute (Pre- with MuseMage (Thursday, Sept 23, 09:00)
Conference Tutorial) (Monday, Sept 20, 14:30)
Zhang, Yunquan
Young, Paul Professor (Institute of Software, CAS)
(Adobe) Prof. Yun-quan Zhang is the Associate Director of the
Parallel Computing Laboratory, Institute of Software,
hh Session(s): 2224 - GPU Acceleration in Adobe Chinese Academy of Sciences in Beijing, China. He
Creative Tools (Tuesday, Sept 21, 15:00) received his PhD degree in computer software and
theory from the same institute in 2000 and has worked
Zanella, Fabrizio at the Institute as a research scientist since then.
Systems Manager (CST of America) His major research interests are in the areas of high
Fabrizio Zanella has over 15 years of experience working performance parallel computing, with particular
on Signal Integrity characterization of high speed digital emphasis on the design of large scale parallel
systems. Mr. Zanella has worked for several companies, computation modes and numerical libraries, and large
including Teradyne and EMC Corporation. Currently, he system performance modeling and evaluation. He has
is the Systems Manager at CST of America, a worldwide published over 90 papers and trained over 20 master and
provider of full wave electromagnetic software. In this Ph.D. students.
role, he leads the high performance computing effort,
advising customers on improving overall performance of
hh Session(s): 2286 - Towards Peta-Scale
Green Computation - Applications of the GPU
Supercomputers in the Chinese Academy of
Sciences (CAS) (Wednesday, Sept 22, 11:00)
Zhao, Kaiyong
Graduate Student (HKBU)
I received my B.Eng. degree in Aircraft Design and
Technology from Beijing Institute of Technology (BIT),
Beijing, P. R. China, in 2005. I then worked at CCUR for
two years. I am currently an MPhil student in the
Department of Computer Science, Hong Kong Baptist
University.
hh Session(s): 2145 - Photo Editing on the GPU
with MuseMage (Thursday, Sept 23, 09:00)
Zhou, Huiyang
Associate Professor (North Carolina State University)
Huiyang Zhou is currently an associate professor in the
Department of Electrical and Computer Engineering at
North Carolina State University. His research focuses on
high performance microarchitecture, low-power design,
architecture support for system dependability, and GPU
Computing. He is a recipient of an NSF CAREER Award and
a senior member of the IEEE.
hh Session(s): 2067 - Experiences with Code
Optimizations for High Performance GPGPU
Programs (Tuesday, Sept 21, 16:00)
Ziegler, Gernot
Developer Technology (Compute) (NVIDIA)
Gernot Ziegler (MSc/civ.ing.) is an Austrian engineer with
an MSc degree in Computer Science and Engineering
from Linköping University, Sweden. He pursued his PhD
studies at the Max-Planck-Institute for Informatics
in Saarbrücken, Germany, where he specialized in
GPU algorithms for computer vision and data-parallel
algorithms for spatial data structures. As a member of
NVIDIA’s DevTech-Compute team, Gernot now consults
in high performance computing on graphics hardware.
hh Session(s): 2020 - GPU-Accelerated
Data Expansion for the Marching Cubes
Algorithm (Wednesday, Sept 22, 16:00)
Zigon, Robert
Sr Staff Development Engineer (Beckman Coulter)
Bob is the Software Technical Lead for Flow Cytometry
analysis products within Beckman Coulter. His interests
include high performance computing, numerical
analysis and information retrieval theory.
hh Session(s): 2055 - Application of Fermi
GPU to Flow Cytometry and Cancer
Detection (Thursday, Sept 23, 10:00)
A Powerful Platform for Amazing Performance
Performance. To get it right, you need a foundry with an Open Innovation Platform™ and process technologies that
provide the flexibility to expertly choreograph your success. To get it right, you need TSMC.
Whether your designs are built on mainstream or highly advanced processes, TSMC ensures your products achieve
maximum value and performance.
Product Differentiation. Increased functionality and better system performance drive product value. So you need
a foundry partner who keeps your products at their innovative best. TSMC’s robust platform provides the options you
need to increase functionality, maximize system performance and ultimately differentiate your products.
Faster Time-to-Market. Early market entry means more product revenue. TSMC’s DFM-driven design initiatives,
libraries and IP programs, together with leading EDA suppliers and manufacturing data-driven PDKs, shorten your yield
ramp. That gets you to market in a fraction of the time it takes your competition.
Investment Optimization. Every design is an investment. Function integration and die size reduction help drive your
margins. It’s simple, but not easy. We continuously improve our process technologies so you get your designs produced
right the first time. Because that’s what it takes to choreograph a technical and business success.
Find out how TSMC can drive your most important innovations with a powerful platform to create amazing performance.
Visit www.tsmc.com
Copyright 2010 Taiwan Semiconductor Manufacturing Company Ltd. All rights reserved. Open Innovation Platform™ is a trademark of TSMC.
EXHIBIT HALL
[Floor plan of Hall 1 showing numbered exhibitor booths, including the NVIDIA and PNY booths]
Dell Dell’s research and development (R&D) efforts span the globe, driven by
some of the industry’s foremost product designers and engineers. At the
core of Dell’s innovation approach, however, remains an unwavering
commitment to deliver new and better solutions that directly address
customer needs. Many innovations begin in-house, led by a global team of
top engineers, product designers and technical experts. Others begin as
a team effort with Dell’s strategic partners like NVIDIA. The mission is to
SPONSORS AND
Cooley Cooley LLP is a national law firm for the converging worlds of high
technology, high finance and high-stakes litigation. We are counselors,
strategists and advocates for the foremost private and public companies and
investors in all major technology fields. Our Emerging Companies practice
has a long tradition of representing emerging and high-growth companies
worldwide. The GPU space is an exciting growth area in the technology arena,
and Cooley has been at the forefront, advising both established and start-up
companies on the issues facing businesses in this industry. Our attorneys’
extensive experience in intellectual property protection and business
counseling along with the Firm’s deep roots in the technology sector give
us a unique perspective on the issues facing our clients. Cooley’s team
consists of experienced counselors and litigators that are equally skilled at
representing and advising clients on the protection and commercialization
of their intellectual property in a wide range of areas, including copyright,
trademark, patent, technology licensing, privacy, electronic security and
electronic commerce. We are dedicated to offering comprehensive and
creative legal support, utilizing the full resources of the Firm.
Acer Established in 1976, globally Acer ranks No. 2 for total PCs and notebooks.
A profitable and sustainable Channel Business Model is instrumental to the
company’s continuing growth, while its multi-brand approach effectively
integrates Acer, Gateway, Packard Bell, and eMachines brands in worldwide
markets. Overcoming the barriers between people and technology: this
is Acer’s long-term mission, to allow anyone to use and benefit from
technology. Acer is renowned for the development and manufacture of
sophisticated, environmentally friendly, intuitively designed and easy-to-use
products. For further information, please visit the website acer-group.com.
Gateway The Californian company Gateway is a historical brand in the IT market, and
has been a leading company in the field of computers and notebooks for
FULL CONFERENCE GUIDE 2010
GOLD SPONSORS
Next IO NextIO, Inc. is the leader in next-generation network consolidation
solutions for today’s dynamic data center in a variety of industries including
enterprise, oil and gas, High Performance Computing, digital media
and financial services. Leveraging PCI Express, NextIO offers true I/O
consolidation for any end-point technology
Citi Citi is today’s pre-eminent financial services company, with some 200 million
customer accounts in more than 100 countries. Our history dates back to the
founding of Citibank in 1812, Bank Handlowy in 1870, Smith Barney in 1873,
Banamex in 1884, and Salomon Brothers in 1910.
SILVER SPONSORS
GE Intelligent Platforms GE Intelligent Platforms is a leading manufacturer of rugged COTS
computer boards and systems for military programs. As a partner to NVIDIA
for Embedded Applications, GE-IP brings GPGPU technology into a wide
range of defense-related programs. It can now be used in ground tanks,
fighter aircraft, military helicopters, and UAVs for Radar, ISR, DSP, Sensor
Processing, Imaging and many other military applications.
AMAX Information Technologies With 31 years as a leading systems manufacturer, AMAX delivers the
most cutting edge GPU and GPGPU computing solutions to solve data-
intensive computing challenges in today’s leading industries. Using an
open-architecture approach optimizing best-of-breed components for
solutions designed to fit precise needs with maximum functionality, superior
performance and power efficiency: this is the AMAX advantage.
SGI SGI is focused on helping customers solve their most demanding technology
challenges by delivering: high performance computing, servers, storage, data
center and cloud computing solutions and professional services. We develop,
market and sell a broad line of low-cost, mid-range and high-end scale-out and
scale-up servers and data storage solutions as well as differentiating software.
Sutter Hill Ventures Sutter Hill Ventures has financed technology-based start-ups and
assisted entrepreneurs in building market-leading companies since 1964.
Through our decades of experience, we have developed strong industry
networks, considerable operating and venture capital experience, and
an understanding of the challenges that early-stage and high-growth
companies face.
Hynix Hynix is a leading producer of DRAM, NAND Flash memory and Image
Sensors. Besides computing memory solutions, Hynix is a leader in Graphics
memory with a portfolio of high performance products. The newly introduced,
eco-friendly 1.35V, 44nm 2Gb GDDR5 offers 7Gbps speed targeting high end
graphics and high performance computing.
Silicon Valley Bank Silicon Valley Bank is the premier commercial bank for companies in the
technology, life science, venture capital, private equity and premium wine
industries. SVB provides a comprehensive suite of financing solutions,
treasury management, corporate investment and international banking
services to its clients worldwide. Through its focus on specialized markets
and extensive knowledge of the people and business issues driving
them, Silicon Valley Bank provides a level of service and partnership that
measurably impacts its clients’ success. Founded in 1983 and headquartered
in Santa Clara, Calif., the company serves clients around the world through
26 U.S. offices and international operations in China, India, Israel and the
United Kingdom. Silicon Valley Bank is a member of global financial services
firm SVB Financial Group (Nasdaq: SIVB), with SVB Analytics, SVB Capital,
SVB Global and SVB Private Client Services. More information on the
company can be found at www.svb.com. Silicon Valley Bank is the California
bank subsidiary and the commercial banking operation of SVB Financial
Group. Banking services are provided by Silicon Valley Bank, a member of
the FDIC and the Federal Reserve System. SVB Private Client Services is a
division of Silicon Valley Bank. SVB Financial Group is also a member of the
Federal Reserve System.
Mandel Communications Inc. Mandel is a provider of communication skills coaching and training to
leading companies worldwide. Since 1993, Mandel has been dedicated
to helping executives, sales, and technical professionals achieve
improved business results through effective business conversations and
presentations, particularly when the stakes are high. Mandel’s proprietary
approach to strengthening presentation, conversation, and facilitation skills
turns spoken communications into a competitive advantage. With a global
presence, Mandel delivers its solutions both face-to-face and virtually in
more than 14 languages.
Emerging Companies
Exhibitors
3DreamTeam 3DreamTeam LLC have created a unique library of photo realistic
3D worlds and images as well as the Vizerra software platform and
ecosystem which allows their in-house team of thirty designers,
software engineers, technical artists or any independent studio
to assemble photo realistically rendered, 3D environments and
marketing content. The Vizerra Platform and Ecosystem will drive
the content boom creating demand for enterprise and consumer
hardware. The unique benefits of the Vizerra Platform and
Ecosystem are speed of creation, cost of production and file size versus
quality of rendering.
Strings and the superlative image quality which can result when
this power is properly harnessed. Cinnafilm is a privately held
company, headquartered in Albuquerque, NM, amongst powerful
resources such as the nation’s defense laboratories and New
Mexico’s highly competitive film tax incentives.
Code Sourcery CodeSourcery builds software tools that enable its customers to
get the most out of hardware platforms ranging from embedded
devices to supercomputers: Sourcery G++, tools for professional
embedded C and C++ developers, and Sourcery VSIPL++, a C++
library for developing high performance signal- and image-
processing applications.
Cooliris, Inc Cooliris was founded in January 2006 with a simple mantra:
“Think beyond the browser.” We focus on creating products that
make discovering and enjoying the Web more exciting, efficient,
and personal. Each of us is passionate about serving our users
without compromise and seeing that our products deliver the best
experience possible. Headquartered in Palo Alto, CA, Cooliris is
backed by Kleiner Perkins Caufield & Byers, DAG Ventures, the
Westly Group, and T-Venture. For more information, please visit
www.cooliris.com/company.
Cyberlink CyberLink Corp. is the leader in enabling digital multimedia on
PCs and CEs. Our software solutions include 3D stereoscopic
applications, Blu-ray Disc playback and creation, digital home
entertainment, and touch-enabled media solutions. CyberLink’s
worldwide partners in the PC industry include
top-5 desktop and notebook brands, drive manufacturers, and
graphics card makers. CyberLink offers the most extensive array
of 3D software for the consumer market, from Blu-ray™ 3D to 3D
conversion of 2D video and even 3D slideshow applications. Hardware
support from NVIDIA® 3D Vision™ and optimization for NVIDIA®
CUDA™ decoding and encoding technology results in a superb
viewing experience with reduced loading on PC system resources.
Discretix Security and content protection lie at the heart of the mobile
and home entertainment markets. Discretix’ suite of embedded
security solutions includes security co-processors, security sub-
systems, cryptographic cores and content protection applications.
The Discretix Downloadable Secure Player allows content service
providers (CSP) to target the large number of devices already
in the market, overcoming the dependency on pre-installed or
embedded device applications and accelerating the deployment
of new services. This unique secure player supports both industry
standard and proprietary content protection schemes and is
compliant with the requirements of the content owners. Suitable for
a wide variety of connected devices including smartphones, e-book
readers, tablet computers, netbooks and internet-enabled TVs, the
secure player enables a broad range of premium content services
and applications. The secure player is available on various open
operating systems including iPhone OS, Android, Windows Mobile
/ Windows Phone, Linux and Symbian. Discretix’ content protection
solutions are field proven and trusted by some of the world’s best-
known semiconductor and device manufacturers including Intel,
HTC, SonyEricsson, Acer and Motorola. For more information,
visit www.discretix.com.
HPC Projects/ Wild Systems HPC Project is a high-tech company whose mission is to supply
combined hardware and software solutions to users requesting
high performance computing for their applications. Under the
brand name Wild Systems, HPC Project provides dedicated
appliances. Using the open source software Par4All for translating
standard C code into CUDA-capable code, the appliances take full
advantage of NVIDIA GPU technology. One of Wild Systems’
turnkey appliances is the WildLab for the simulation community.
With the WildLab, users creating models in the Scilab script
language seamlessly produce a GPU accelerated autonomous
executable code.
Israel Economic Mission (IEM) “The Government of Israel Economic Mission (IEM) is responsible
for enhancing bi-national trade relations between the West Coast
and Israeli business communities. By leveraging its networking
capacity and industry knowledge in Israel and the Western United
States, IEM is able to seamlessly engage prospective business
partners half a world apart. Such actions manifest an array of
high-level connections ranging from brokering introductions to
organizing international trade missions. Our operations span a
variety of industry sectors, with a focus on high-tech, security, new
media, cleantech, and biotechnology.”
spreadsheet in an organization. This technology stops “Spreadsheet
Spreadmart Chaos” (hundreds of spreadsheets with uncoupled,
non-verifiable data “running amok” in an organization).
Mersive Technologies Inc Mersive Technologies’ display management software and solutions
bring unprecedented simplicity and affordability to large-scale,
beyond-HD displays allowing visual collaboration to go mainstream.
Mersive’s patented SOL software automatically aligns multiple
projectors into one seamless image of extraordinary quality and
resolution without the expense of specialized hardware, building
infrastructure or services.
miGenius Limited Based on the powerful combination of NVIDIA CUDA-based GPUs
and mental images ‘iray’ rendering technology, miGenius has
developed an easy to use toolset, EasyRS, allowing non-technical
users to create fully customized User Interfaces with a wide
range of viewing and management controls that can be quickly
uploaded either onto a dedicated GPU server or onto the rapidly
emerging ‘cloud computing’ networks. With these toolsets, the
vast market of both businesses and consumers alike will be able
to rapidly transform these powerful web based rendering platform
technologies into a customisable and truly revolutionary 3D visual
communication and collaboration medium.
PhaseSpace PhaseSpace uses NVIDIA GPUs with custom 4-megapixel cameras for
pose tracking, eye tracking, 3D scanning and computer vision tasks
in real time. PhaseSpace’s list of clients includes the U.S. Navy,
Air Force, Army, NASA, Boeing, Disney, Honda, Google, Stanford,
Playcast Media Systems Playcast Media Systems brings video games to the world’s largest
media distribution platform – Pay TV networks. The company’s
solution includes a head-end based system, which streams a
game’s audiovisual content as a standard MPEG stream, as well as
the provisioning of the content and programming itself. Playcast’s
media streaming systems, located in operators’ headends, host
the games and stream them over the existing video network to an
already distributed base of set-top boxes. Playcast is a privately
owned, venture capital backed company, based in Israel and the UK.
Prometech Prometech Software, Inc. provides “Particleworks”, particle-based CAE
software, for Japanese manufacturing industries.
Prometech, a university-launched technology venture, has
strong technical capabilities in complex physics simulation,
visualization and many-core acceleration, and offers
physics-based simulation in the fields of manufacturing, VFX and scientific
research.
its portfolio of clients includes names such as Adidas, Audi, BASF,
BMW, Bosch, Daimler, EADS, Harley-Davidson, Miele, Porsche,
Samsung, Thyssen-Krupp, Toyota and Volkswagen. RTT AG is
a stock market listed company (Xetra:R1T; WKN: 701220; ISIN:
DE0007012205). For more information visit www.rtt.ag.
ScaleForm Corporation Scaleform is the leading provider of user interface software for the
videogame and consumer electronics industries. Scaleform GFx leverages
the power of the Adobe® Flash® tool set and enables developers to
quickly create powerful and immersive user interface environments
Tide Powerd Ltd. Founded in 2009 at the University of Alabama, TidePowerd Ltd.
was the first-ever participant accepted into Red Gate Software’s
(http://www.red-gate.com) “Springboard” incubator program. Our
core goal is to provide developers with powerful, yet easy-to-use
tools for numerical and high-performance computing (HPC) - and
to provide those tools with the best value and technical support
possible. To this end, we’ve created GPU.NET, a system that
allows developers to write their GPU code in any .NET-supported
language (e.g., C#, F#, IronPython). GPU.NET opens up the exciting
world of GPU computing to millions of new developers worldwide,
and we hope it will help to make GPU computing more popular
than ever before.
Universal Robotics Universal Robotics creates software that enables machines to learn
from their experiences, react and adapt to their surroundings, and
perform tasks that are costly, dangerous or difficult for humans to
Ace Computers Ace Computers is a 27-year-old system integrator with a focus on
high performance computing, workstations, servers and storage.
We hold WSCA contract B27157, GSA schedule GS-35F-0400T and
other BPAs including major universities and federal agencies.
CIARA Technologies CIARA designs, develops, markets, services, and supports a variety
of High Performance systems including TITAN Systems based
on NVIDIA® Tesla™ 20-series, NEXXUS-4000® and CX1 Personal
Cluster, the acclaimed VXRACK® high density blade server, VXPRO®
rack-mount/tower servers and GRAPHIXX® high-end workstations.
call (888) 942-3800.
Colfax International Buy it from a trusted expert. Colfax provides the most comprehensive
range of innovative, cutting-edge and highly customized GPU
solutions. Colfax has been first-to-market with a 4GPU PSC and the
revolutionary CXT8000 - the world’s first 8GPU server unveiled at
GTC 09. Leading universities, labs and companies are accelerating
their research and business outcomes with optimally configured,
ready-to-go Colfax GPU solutions. Join us for an in-person
conversation at Booth #xxxx. Or visit www.colfax-intl.com
Cubix Corporation Cubix Visual & GPU Compute Solutions is a new division of Cubix
Corporation, a vertically-integrated manufacturer with 35 years
of manufacturing experience, focused on delivering deskside and
rackmount modular, scalable, GPU Compute hardware solutions
for demanding applications such as physically-based rendering,
animation, visualization, and cloud computing applications.
James River Technology Inc. JRTI, a leading provider of high performance computing (HPC) solutions to the marketplace, and
Velocity Micro, the premier high-performance personal computer
provider in North America, are pleased to introduce VelocityHPC,
our latest initiative focused on NVIDIA Tesla GPU Accelerated
Computing solutions.
JMR Electronics Inc JMR is a leading value provider of scalable storage systems for
performance and capacity driven applications in the government,
DCC, VOD, video surveillance and Web 2.0 markets. Headquartered
in Chatsworth, California, JMR has been developing reliable,
high performance RAID storage technologies since 1982. JMR’s
complete line of BlueStor PeSAN™ DAS, NAS and SAN solutions,
manufactured entirely in the U.S.A., are ideal for nearly every IT
and video production need. For further information, please visit
www.jmr.com.
Koi Computers Koi Computers, Inc. has over fifteen years of experience in the IT
hardware and systems integration industry. We offer a wide range
of custom configured computer systems and an extensive catalog
of technology products. Our core competencies include high
performance computer clusters; server, storage, and blade solutions;
desktops, laptops, and workstations; mounting and cabling
solutions; and strategic sourcing for technology products.
Los Alamos National Labs Los Alamos National Laboratory is a premier national security
research institution, delivering scientific and engineering solutions
for the nation’s most crucial and complex problems. Our primary
responsibility is ensuring the safety, security, and reliability of the
nation’s nuclear deterrent.
Mathworks Over 1,000,000 engineers and scientists in more than 100 countries,
on all seven continents, use MATLAB® and Simulink®. These
products have become fundamental tools for work at the world’s
most innovative technology companies, government research labs,
financial institutions, and at more than 5000 universities. For more
information, visit www.mathworks.com
Morgan Kaufmann Morgan Kaufmann has been bringing the knowledge of experts to
the computing community since 1984. Our goal is to provide timely
yet timeless content to research and development professionals,
business leaders and IT managers, everyday practitioners, and
academia. We publish textbooks and references in Artificial
Intelligence, Computer Networking, Computer Architecture,
Computer Graphics & Game Development, Data Management &
Business Intelligence, Software Engineering, and User Experience
& Human Computer Interaction. For more information, visit mkp.com.
Platform Computing Platform Computing is the leader in cluster, grid and cloud
management software – serving more than 2,000 of the world’s
most demanding organizations. For 18 years, our workload and
resource management solutions have delivered IT responsiveness
and lower costs for enterprise and HPC applications. Platform has
strategic relationships with Cray, Dell, HP, IBM, Intel, Microsoft,
Red Hat, and SAS. GPU-accelerated clusters are rapidly growing
in popularity as powerful, cost-effective High Performance
Computing (HPC) solutions. NVIDIA’s GPU hardware, along with
the CUDA computing environment, is delivering impressive results
for commercial HPC workloads. Platform HPC suite, with CUDA
kit, enables analysts, engineers, and scientists to unlock the
power of NVIDIA GPU Clusters, making them easier to deploy,
run and manage.
Portland Group The Portland Group® (PGI®) offers high performance parallel
compilers and tools for workstations, servers and clusters based
on 64-bit x86 processors with NVIDIA CUDA-enabled GPUs running
under Linux, MacOS or Windows operating systems. PGI GPU
accelerator products include directive-based PGI Accelerator™
Fortran and C compilers and CUDA Fortran.
PSSC Labs PSSC Labs is everything you expect from your technology provider,
and more. With 20 years in business, PSSC Labs possesses the
knowledge, expertise and procedures to deliver high performance
computing solutions to the world’s most demanding organizations.
PSSC Labs computing solutions empower next generation science.
Tech-X Corporation Tech-X offers products and services for high-performance
computing. GPULib enables users of MATLAB and IDL to take
advantage of GPUs from within these high-productivity languages.
We offer consulting, training, and custom software development
to migrate customers’ scientific computing problems onto
hardware accelerated architectures, using technologies like
CUDA, OpenCL, or MPI.
Marketing Partners
VISUALIZE A GREEN EVENT
Place compostables and recyclables in proper bins
Use public transportation during the show
In hotel, decline new sheets and towels
Also, unplug phone and laptop chargers
Offset your travel at www.cool-it.us
Take only collateral/giveaways you will use
What We’re Doing
>> 100% of convention center’s greenhouse gas is offset
>> Extensive composting and recycling
>> Producers and vendors agree to green guidelines
>> Minimizing printed materials
>> Using recycled and biodegradable paper/non-toxic inks
>> Monitoring lighting and A/C usage
>> Local and organic food options
>> Non-toxic cleaning materials
[Floor map, street level. Labeled areas: Exhibit Hall; Keynote Hall; Show Management & Sales Office; Emerging Companies Summit; Speaker Green Room; Registration; Press Lounge; Parallel Nsight Lounge by Microsoft; Silicon Valley Board Room; Think Tank; Guadalupe; San Jose Ballroom; Willow Glen 1, 2, 3; Elevator to Blossom Hill Rooms (3rd floor); Coat & Bag Check; Parking; Marriott; Hilton; Hilton Main Entrance. Labeled rooms: A1, A2, A3, A5, A7, A8, B, C, D, E, F1, F2, G, H, J1, J2, J3, J4, K, L, M, N.]