Silicon Valley May 8-11, 2017

Schedule Planner


S7113 - GPU-Accelerated Graph Analytics

Howie Huang Associate Professor, The George Washington University
Howie Huang is an associate professor in the Department of Electrical and Computer Engineering at George Washington University.

Future high-performance computing systems will enable fast processing of large datasets, as highlighted by President Obama's executive order on the National Strategic Computing Initiative. Of significant interest is the need for analyzing big graphs arising from a variety of areas -- from social networks and biology, to national security. We'll present our ongoing efforts at George Washington University in accelerating big graph analytics on GPUs. We've developed a GPU-based graph analytics system that delivers exceptional performance through efficient scheduling of a large number of GPU threads and effective utilization of GPU memory hierarchy.
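
The specifics of the GWU system aren't given in this abstract. As a minimal illustration of the level-synchronous traversal pattern that GPU graph frameworks parallelize (each frontier vertex or edge mapped to a thread), here is a Python sketch; the graph and function name are only for illustration:

```python
from collections import defaultdict

def bfs_levels(edges, source):
    # Build an undirected adjacency list.
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # Level-synchronous BFS: expand one whole frontier per step.
    # A GPU framework assigns the vertices of each frontier to threads.
    level = {source: 0}
    frontier = [source]
    depth = 0
    while frontier:
        depth += 1
        next_frontier = []
        for u in frontier:
            for v in adj[u]:
                if v not in level:
                    level[v] = depth
                    next_frontier.append(v)
        frontier = next_frontier
    return level

levels = bfs_levels([(0, 1), (1, 2), (0, 3), (3, 4)], source=0)
```

The inner loops over the frontier are exactly the work that a GPU system schedules across thousands of threads; the scheduling and memory-hierarchy techniques are the subject of the talk.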

Level: All
Type: Talk
Tags: Accelerated Analytics; Supercomputing & HPC

Day: TBD
Time: TBD
Location: TBD

S7125 - Efficient Imaging in Radio Astronomy Using GPUs

Bram Veenboer Ph.D. Researcher, ASTRON
Bram Veenboer is a Ph.D. researcher at ASTRON, the Netherlands Institute for Radio Astronomy. His work focuses on accelerator platforms for the world's biggest radio telescope: the SKA.

Realizing the next generation of radio telescopes such as the Square Kilometre Array requires hardware and algorithms more efficient than today's technology provides. We'll present our work on the recently introduced Image-Domain Gridding (IDG) algorithm, which avoids the performance bottlenecks of traditional AW-projection gridding, and demonstrate how we implemented this algorithm on various architectures. By applying a modified roofline analysis, we show that our parallelization approaches and optimizations lead to nearly optimal performance on all architectures. The analysis also indicates that, by leveraging dedicated hardware to evaluate trigonometric functions, NVIDIA GPUs are much faster and more energy-efficient than regular CPUs. This makes IDG on GPUs a candidate for meeting the computational and energy-efficiency constraints of future telescopes.
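
As a sketch of the basic roofline idea the abstract builds on (the talk's "modified" analysis goes further): attainable performance is the minimum of the compute peak and memory bandwidth times arithmetic intensity. The machine numbers below are made up for illustration, not measurements from the talk:

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    # Roofline model: a kernel is limited either by the compute peak or
    # by how fast memory can feed it (bandwidth x arithmetic intensity).
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)

# Illustrative machine: 10 TFLOP/s peak, 700 GB/s memory bandwidth.
memory_bound = attainable_gflops(10000.0, 700.0, 2.0)     # low intensity
compute_bound = attainable_gflops(10000.0, 700.0, 100.0)  # high intensity
```

A kernel at 2 flop/byte is capped by bandwidth (1400 GFLOP/s here); one at 100 flop/byte reaches the compute roof.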

Level: Intermediate
Type: Talk
Tags: Astronomy & Astrophysics; Performance Optimization

Day: TBD
Time: TBD
Location: TBD

S7129 - Scalable Deep Learning with Microsoft Cognitive Toolkit

Sayan Pathak Principal Engineer and ML Scientist, Microsoft
Sayan Pathak is a principal engineer and machine learning scientist at Microsoft. He is on the faculty at the University of Washington and IIT Kharagpur, India. His interests are in deep learning, vision, informatics, and online ads.

We'll introduce Microsoft's open source, production-grade deep learning Cognitive Toolkit (formerly CNTK) in a talk that will be a prelude to a detailed hands-on tutorial. The Cognitive Toolkit was recently used to achieve a major breakthrough in speech recognition by reaching human parity in conversational speech. The toolkit powers use cases on highly performant GPU platforms and is used by several customers both on premises and on the Azure cloud. We'll introduce different use cases leveraging fully connected networks, CNNs, RNNs/LSTMs, autoencoders, and reinforcement learning, and dive deep into the topics that enable the toolkit's superior performance in comparison with similar open source toolkits. We'll showcase scalability across multiple GPUs and multiple servers, and provide a teaser hands-on experience with Jupyter notebooks running on Azure, with use cases ranging from simple introductory examples to very advanced end-to-end scenarios.

Level: Beginner
Type: Talk
Tags: Deep Learning & AI; Accelerated Analytics

Day: TBD
Time: TBD
Location: TBD

S7143 - Anomaly Detection for Network Intrusions Using Deep Learning

Adam Gibson CTO, Skymind
Adam Gibson is the cofounder and CTO of Skymind as well as the creator of Deeplearning4j, the first commercial-grade deep learning library for the JVM. He helps companies deploy deep learning to production.

We'll describe how deep learning can be applied to detect anomalies, such as network intrusions, in a production environment. In part one of the talk, we'll build an end-to-end data pipeline using Hadoop for storage, Streamsets for data flow, Spark for distributed GPUs, and Deeplearning4j for anomaly detection. In part two, we'll showcase a demo environment that demonstrates how a deep net uncovers anomalies. This visualization will illustrate how system administrators can view malicious behavior and prioritize efforts to stop attacks. It's assumed that registrants are familiar with popular big data frameworks on the JVM.
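
A hedged sketch of the underlying idea, not the Deeplearning4j pipeline from the talk: score each sample by reconstruction error and flag anything far beyond what normal traffic produces. The mean-vector "model" below is a deliberately crude stand-in for a trained autoencoder, and all data is synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(1000, 8))     # features of benign traffic
center = normal.mean(axis=0)                       # stand-in for a trained autoencoder
errors = np.linalg.norm(normal - center, axis=1)   # "reconstruction" error per sample

# Threshold: mean + 3 sigma of the errors observed on normal traffic.
threshold = errors.mean() + 3.0 * errors.std()

suspect = np.full(8, 6.0)                          # unlike anything seen in training
is_anomaly = np.linalg.norm(suspect - center) > threshold
```

In a real deployment, the autoencoder's reconstruction error replaces the distance-to-mean proxy, and the threshold is tuned against labeled intrusions where available.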

Level: Intermediate
Type: Talk
Tags: Deep Learning & AI; Accelerated Analytics

Day: TBD
Time: TBD
Location: TBD

S7153 - Efficient Observations Forecast for the World's Biggest Eye Using DGX-1

Damien Gratadour Associate Professor, Université Paris Diderot & Observatoire de Paris
Damien Gratadour has been an associate professor at Université Paris Diderot and a research scientist at LESIA, Observatoire de Paris, since 2008. Damien holds an M.S. in theoretical physics and a Ph.D. in observational astronomy from Université Paris Diderot. In the past, Damien was responsible for the last stages of commissioning of the LGS upgrade to the Altair AO system on the Gemini North Telescope in Hawaii (2006). He then spent two years as an AO scientist, serving as instrument scientist for GeMS, the Gemini MCAO System, a $15 million facility, participating in the acceptance tests and integration of its sub-systems and the first stages of technical tests of the full instrument, most notably the DSP-based RTC. At Observatoire de Paris, Damien concentrates on high-performance numerical techniques for astronomy (modeling, signal processing, and instrumentation) and on the development of observational programs.
Hatem Ltaief Senior Research Scientist, KAUST
Highly-Rated Speaker
Hatem Ltaief is a senior research scientist in the Extreme Computing Research Center at KAUST, where he also advises several students in their M.S. and Ph.D. research. Hatem's research interests include parallel numerical algorithms, fault tolerant algorithms, parallel programming models, and performance optimizations for multicore architectures and hardware accelerators. His current research collaborators include Aramco, Total, Observatoire de Paris, Cray, NVIDIA, and Intel. Hatem received his engineering degree from Polytech Lyon at the University of Claude Bernard Lyon I, France, an M.S. in applied mathematics at the University of Houston, and a Ph.D. in computer science from the University of Houston. From 2008 to 2010, he was a research scientist in the Innovative Computing Laboratory in the Department of Electrical Engineering and Computer Science at the University of Tennessee, Knoxville.

Have you heard about the largest ground-based telescope ever built? Are you interested in the newest NVIDIA DGX-1 hardware accelerator? Come and learn how the DGX-1 architecture gives the computational astronomy community a dramatic leap forward in designing major, multimillion-dollar optical instruments for the European Extremely Large Telescope. Starting from the mathematical model up to the high-performance implementation on distributed-memory systems with hardware accelerators, we'll explain how the resulting matrix computations, combined with an efficient task-based programming model, help design the next generation of telescope instruments.

Level: Intermediate
Type: Talk
Tags: Astronomy & Astrophysics; Tools and Libraries

Day: TBD
Time: TBD
Location: TBD

S7172 - Autonomous Drone Navigation with Deep Learning

Nikolai Smolyanskiy Principal Software Engineer, NVIDIA
Nikolai Smolyanskiy is a principal software engineer on the NVIDIA autonomous vehicles team. Prior to NVIDIA, he worked on various projects at Microsoft, including a drone project at Microsoft Research, SLAM for AR headset pose tracking on HoloLens, and face tracking on Kinect, as well as other projects in machine learning, natural language processing, and online search. Nikolai is currently pursuing a Ph.D. focusing on minimax optimal control, and holds an M.S. in applied mathematics.
Alexey Kamenev Senior Deep Learning and Computer Vision Engineer, NVIDIA
Alexey Kamenev is a senior deep learning and computer vision engineer at NVIDIA on the autonomous vehicles team. Prior to NVIDIA, Alexey worked at Microsoft on various projects, including Microsoft Research (CNTK and deep learning), Azure machine learning (Machine Learning Algorithms team), and Bing (Relevance team). He has an M.S. in applied mathematics.

We'll present an autonomous drone piloted by a deep neural network (DNN) that can navigate through a forest by following trails while avoiding obstacles. The DNN receives video frames from the onboard drone camera as input and computes high-level control commands as output; these commands are sent to the drone's low-level autopilot for execution. Our DNN runs onboard an NVIDIA® Tegra® TX1 in real time. The drone uses the open source PX4 flight stack for low-level control and ROS for its runtime. We'll present the DNN's architecture, describe how we train it and run it as a ROS node, show flight videos, and present some qualitative analysis of the autonomous flights.

Level: All
Type: Talk
Tags: Intelligent Machines & IoT; Computer Vision & Machine Vision; Deep Learning & AI

Day: TBD
Time: TBD
Location: TBD

S7174 - An Architectural Design Firm's Journey Through Virtual GPU Technology for Global Collaboration

Jimmy Rotella Design Application Specialist, CannonDesign
Jimmy Rotella has a background in technology, architecture, and education, uniquely positioning him to help designers throughout the AEC industry build and realize their digital designs using cutting-edge technology.
Andrew Schilling Director of Information Technology, CannonDesign
Andrew Schilling's central charge is to develop and execute CannonDesign's information technology strategies, advancing tools, workflows, and emerging technologies that enable design teams to deliver outstanding solutions for clients.

Learn the benefits that virtualization provides for an architecture and engineering design firm, along with the journey through the advancements in virtualization technology it took to finally meet the graphics-intensive needs of our design software. We'll share our experiences in how virtualization allows a large company, with over 15 offices and 1,000 people worldwide, to collaborate and work as a single firm. We'll show some cost comparisons with virtualization, along with their management benefits and requirements. We'll also look at the methods we used to set and test metrics specific to our requirements, and follow the results of those metrics through the changes in graphics virtualization technology.

Level: All
Type: Talk
Tags: AEC Industries; Data Center & Cloud Computing; Graphics Virtualization

Day: TBD
Time: TBD
Location: TBD

S7175 - Exploratory Visualization of Petascale Particle Data in NVIDIA DGX-1

Benjamin Hernandez Computer Scientist, Oak Ridge National Laboratory
Benjamin Hernandez is a computer scientist in the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory. His research interests are in the intersection of crowd simulations, scientific visualization, interactive computer graphics, and human computer interaction using HPC systems.

Learn to leverage the visualization capabilities of the NVIDIA® DGX-1™ system to visualize particle data. We'll cover techniques suitable for exploratory visualization, such as parallel dataset reading and on-demand reduction with the ADIOS I/O library, and GPU-based optimization techniques for particle rendering, such as radar view frustum culling, occlusion culling, texture-less point sprites, and OpenGL near-zero driver overhead methods. We'll also include implementation details for taking advantage of the eight NVIDIA Pascal™ GPUs in the NVIDIA DGX-1.

Level: All
Type: Talk
Tags: In-Situ & Scientific Visualization; Real-Time Graphics; Supercomputing & HPC

Day: TBD
Time: TBD
Location: TBD

S7194 - Light Baking with IRAY

Martin-Karl Lefrançois DevTech Software Engineer Lead, NVIDIA
Martin-Karl Lefrancois is a senior software engineer and team lead in the Developer Technology organization at NVIDIA in Berlin. Martin-Karl works with various NVIDIA rendering and core development teams to bring to clients the best rendering experience. Prior to NVIDIA, he worked at mental images to deliver automatic GPU support in mental ray. After graduating with a degree in computer science and mathematics from the University of Sherbrooke in Quebec, he worked as a graphic developer for nearly 10 years at Softimage in Montreal and Tokyo before leading the core game engine team at A2M.

Learn how to achieve global illumination in a real-time engine using the Iray renderer. With the Iray Photoreal renderer, your real-time engine can use the full global illumination computation from the most advanced path tracer. The technique uses the properties of the physically based material (MDL) assigned to the object and all the various sources of energy in the scene. Sources can be high-dynamic-range images, the built-in sun and sky, implicit lights such as point, spot, and area lights, and also emissive objects. The method also allows the use of light path expressions, which can create a light map for a specific group of lights or exclude objects from the calculation.

Level: All
Type: Talk
Tags: Real-Time Graphics; Rendering & Ray Tracing

Day: TBD
Time: TBD
Location: TBD

S7199 - Interactive HPC: Large-Scale In-Situ Visualization Using NVIDIA IndeX in ALYA MultiPhysics

Vishal Mehta Senior Engineer, Barcelona Supercomputing Center
Vishal Mehta is a senior engineer at the Barcelona Supercomputing Center. He is motivated by a co-design approach driven by ambitious applications, influencing the software stack for the development of next-generation, exascale-ready HPC ecosystems. Vishal's fields of interest include computational mechanics, linear algebra, and GPU algorithms for computational science. He has six years of experience working with GPUs in the HPC ecosystem.
Christopher Lux Senior Graphics Software Engineer, NVIDIA IndeX R&D, NVIDIA
Christopher Lux is a senior graphics software engineer at the NVIDIA Advanced Rendering Center. He received his Ph.D. in computer science in 2013 from the Bauhaus-Universität Weimar, Germany. His interest in real-time computer graphics and scientific visualization led him to focus early on the interactive visualization of large-scale datasets from the geoscientific and medical domains.
Marc Nienhaus Sr. Engineering Manager, Product Technology Lead, NVIDIA IndeX, NVIDIA
Marc Nienhaus is the product technology lead for the NVIDIA IndeX™ commercial software at NVIDIA. He manages the NVIDIA IndeX software engineering team and is responsible for overall product architecture and applications in various domains. Before joining mental images' R&D rendering department and NVIDIA, Marc was a postdoc at Northwestern University and led research projects at the University of Potsdam. His research interests include parallel and distributed rendering and computing, scientific visualization, GPU-based rendering, and photorealistic and non-photorealistic expressive depictions. He holds a master's in mathematics with a minor in computer science from the University of Muenster and a Ph.D. in computer science from the Hasso Plattner Institute at the University of Potsdam. Marc has published various papers on GPU-based real-time rendering and non-photorealistic rendering.

We'll discuss how NVIDIA IndeX™ Advanced Rendering Tools are helping researchers get more insight through in-situ visualizations. HPC applications have always been centered around large computations, small input, and extremely large simulated output. HPC applications running on big supercomputers are executed using a queuing system, where researchers have to wait a couple of hours before analyzing the outputs. We've designed essential software components that allow in-situ visualizations of sparse volume data from ALYA multiphysics simulation code (Barcelona Supercomputing Center) using NVIDIA IndeX. ALYA multiphysics is one of the two European exascale benchmarks and is used in targeted medicine, cardiac modeling, renewable energy, etc. We'll guide you through techniques that have been used in enabling in-situ rendering and analysis of data.

Level: Intermediate
Type: Talk
Tags: In-Situ & Scientific Visualization; Supercomputing & HPC

Day: TBD
Time: TBD
Location: TBD

S7209 - Using NVIDIA FleX for Real-Time Fluid Simulation in Virtual Surgery

Bradley Hittle Senior Research Software Engineer, Ohio Supercomputer Center
Bradley Hittle is a senior research software engineer at the Ohio Supercomputer Center, specializing in software engineering for the development, support, and evaluation of virtual systems and virtual reality-based simulations for medical applications. Brad's primary areas of research include the integration and evaluation of computer interface technology for virtual simulation, developing tools that aid viewing and interactivity with large data, the development of software and hardware systems for real-time virtual simulations and visualizations, and using GPU technology to provide increased algorithm performance. Brad has contributed extensively to projects funded through ARDF, NIDCD, NIOSH, and the National Institutes of Health. His primary areas of expertise are real-time volume visualization, software engineering, computer interface technology for virtual systems, and GPU compute development.

Learn how to use NVIDIA FleX to simulate complex real-time fluid interaction. We'll use our virtual surgical environment to give a detailed overview of techniques and algorithms needed to incorporate FleX into your application. Topics include collision handling with dynamic volumetric data through signed distance field approximation, as well as tricks for emulating diffusion, bleeding, and absorption. We demonstrate the necessity for optimizations in a compute-intensive application through the use of threading and multi-GPU support. A basic understanding of the FleX library is assumed.

Level: Intermediate
Type: Talk
Tags: Real-Time Graphics; Healthcare & Life Sciences

Day: TBD
Time: TBD
Location: TBD

S7215 - Automating VR and Photoreal Imagery From Siemens Teamcenter

Dave Hutchinson Chief Technology Officer, Lightwork Design Ltd.
Dave Hutchinson is responsible for leading Lightworks into new technology and business opportunities through its Lightworks Iray+, Lightworks Web Configurator, Iray for 3DSMax, and connected VR solutions, all based around NVIDIA's world-leading Iray technology. Dave leads product development, engineering, support, and customer interactions. He is also responsible for driving commercial strategy and directing the marketing and sales teams to achieve new sales and existing customer satisfaction. Dave has an extensive background in visualization technology and the 3D market.

Learn how manufacturers are automating and bringing in-house their digital photorealistic and VR/AR visualization pipelines out of Siemens Teamcenter and NX through JT. This is leading to improved efficiency and cost reduction and, crucially, gives manufacturers control over digital assets so they can be repurposed across the business. We'll demonstrate how to set up an automated visual digital pipeline out of Siemens Teamcenter into NVIDIA Iray and Epic's Unreal Engine, accounting for configuration rules and buildability.

Level: All
Type: Talk
Tags: Manufacturing Industries; Rendering & Ray Tracing; Virtual Reality and Augmented Reality

Day: TBD
Time: TBD
Location: TBD

S7218 - Training of Deep Networks with Half-Precision Float

Boris Ginsburg Deep Learning Engineer, NVIDIA
Boris Ginsburg is a principal engineer working on deep learning algorithms at NVIDIA, which he joined in 2015. For the last five years, Boris has worked on distributed deep learning algorithms and hardware accelerators for deep learning. Before that, he worked on hardware accelerators for machine learning, computer vision, and speech recognition; CPU architecture; and wireless networking. He has 60 issued patents and 15 patent applications in the areas of CPU, GPGPU, and wireless networking. Boris earned his Ph.D. in applied math (non-smooth optimization) from the Technion in 1997.

We'll describe new algorithms used to train very deep networks with half-precision float. Float16 has two major potential benefits: better training speed and a reduced memory footprint. But float16 has a very narrow numerical range (roughly 0.00006 to 65,504), which can result in overflow (the "inf/nan" problem) or underflow (the "vanishing gradient" problem) during training of deep networks. We'll describe the new scaling algorithm, implemented in nvcaffe, which prevents these negative effects. With this algorithm, we successfully trained networks such as AlexNet, GoogLeNet, Inception_v3, and ResNets without any loss in accuracy. Other contributors to this work are S. Nikolaev, M. Houston, A. Kiswani, A. Gholaminejad, S. Migacz, H. Wu, A. Fit-Florea, and U. Kapasi.
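
The nvcaffe algorithm's details are the subject of the talk; as a generic illustration of why scaling is needed, the sketch below shows a small gradient vanishing in float16 and surviving once the loss is multiplied by a fixed scale (2^16 here, an assumed value) before being unscaled in float32:

```python
import numpy as np

# float16 can represent roughly 6e-8 (subnormal) up to 65504, so tiny
# gradients round to zero unless the loss is scaled up before backprop.
LOSS_SCALE = 65536.0                  # 2**16, an illustrative fixed scale

grad = np.float32(1e-8)               # a small gradient from a deep layer
vanished = np.float16(grad)           # rounds to 0.0: "vanishing gradient"

scaled = np.float16(grad * LOSS_SCALE)       # representable in float16
recovered = np.float32(scaled) / LOSS_SCALE  # unscaled again in float32
```

In practice, weights, activations, and gradients flow in float16 while a float32 master copy of the weights absorbs the unscaled updates.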

Level: Intermediate
Type: Talk
Tags: Deep Learning & AI

Day: TBD
Time: TBD
Location: TBD

S7248 - GPU Computing for the Construction Industry: AR/VR for Learning, Planning, and Safety

Kyle Szostek Sr. Virtual Construction Engineer, Gilbane Building Company
Kyle Szostek is a senior virtual design and construction engineer who has been with Gilbane Building Company for the last four years, managing virtual design and construction services for over $2 billion of construction projects. He's focused his work on research and development of collaborative BIM workflows, visualization techniques, and AR/VR tools. With a background in 3D art and a bachelor's degree in architecture from the University of Arizona, Kyle brings a unique 3D visualization skillset to Gilbane's VDC team.
Ken Grothman Sr. Virtual Construction Engineer, Gilbane Building Company
Ken Grothman is a senior virtual design and construction engineer who has been with Gilbane Building Company for two years, involved in over $1 billion of construction work, including high-end corporate, medical facilities, and mission-critical data infrastructure. Ken specializes in laser scanning and reality capture, and is an active member of the industry's laser scanning community. With a background in design-build architecture and a master's degree in architecture from the University of Kansas, Ken brings a pragmatic, problem-solving skillset to Gilbane's VDC team.

We'll dive headfirst into some of the current challenges of the construction industry and how we're planning to use virtual/augmented reality and real-time GPU computing to address them. To optimize the construction of a building, site logistics must be planned, and all systems analyzed and coordinated to confirm constructability. With building information modeling (BIM) and the advent of inexpensive GPU and AR/VR hardware, we're building tools to redefine the planning and analysis process for construction management. No longer are virtual and augmented reality systems just for entertainment; they can help us plan faster, confirm our clients' design goals, and facilitate stronger communication among our team members before and during the construction process.

Level: Beginner
Type: Talk
Tags: AEC Industries; Virtual Reality and Augmented Reality

Day: TBD
Time: TBD
Location: TBD

S7298 - Blasting Sand with NVIDIA CUDA: MPM Sand Simulation for VFX

Gergely Klar Software Engineer, DreamWorks Animation
Gergely Klar received his Ph.D. from the University of California, Los Angeles. During his graduate studies, he worked on a range of physically based animation projects, including MPM, SPH, and FEM simulations. Gergely then joined DreamWorks Animation's FX Research and Development team, where he helps artists create more magnificent effects. He is a Fulbright Science and Technology alumnus, an avid sailor, and a father of two.

We'll present our challenges and solutions for creating a material point method (MPM)-based simulation system that meets the production demands of fast turnaround for artistic look development. Our method fully utilizes the GPU and performs an order of magnitude faster than the latest published results. With this improvement, the technique's main limiting factor - its speed - has been eliminated, making MPM appealing for a wider range of VFX applications. Practitioners in computational physics and related fields are likely to benefit from attending the session as our techniques are applicable to other hybrid Eulerian-Lagrangian simulations.

Level: Intermediate
Type: Talk
Tags: Media & Entertainment; Computational Physics

Day: TBD
Time: TBD
Location: TBD

S7310 - 8-Bit Inference with TensorRT

Szymon Migacz CUDA Library Software Engineer, NVIDIA
Szymon Migacz has worked on the CUDA libraries team at NVIDIA since 2015. His main focuses include CUDA math library, cuRAND, and cuFFT. Recently he has been working on accelerating deep learning algorithms, including inference in reduced numerical precision.

Traditionally, convolutional neural networks are trained using 32-bit floating-point arithmetic (FP32), and by default inference on these models employs FP32 as well. We'll describe a method for converting FP32 models to 8-bit integer (INT8) models that doesn't require re-training or fine-tuning of the original FP32 network. A number of standard networks (AlexNet, VGG, GoogLeNet, ResNet) have been converted from FP32 to INT8, and the converted models achieve comparable Top-1 and Top-5 inference accuracy. The methods are implemented in TensorRT and can be executed on GPUs that support the new INT8 inference instructions.
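
As a minimal sketch of the symmetric linear quantization that INT8 inference is built on: each real value maps to an 8-bit integer times a per-tensor scale. The max-abs calibration below is the simplest possible choice of scale; the talk covers how TensorRT actually chooses scales, which this sketch does not reproduce:

```python
import numpy as np

def quantize_int8(x, scale):
    # Symmetric linear quantization: real_value ~= scale * int8_value.
    q = np.clip(np.round(x / scale), -127, 127)
    return q.astype(np.int8)

def dequantize(q, scale):
    return q.astype(np.float32) * scale

acts = np.array([0.02, -0.5, 0.49, 0.003], dtype=np.float32)
scale = np.abs(acts).max() / 127.0   # naive max-abs calibration
q = quantize_int8(acts, scale)
restored = dequantize(q, scale)      # within scale/2 of the originals
```

The quantization error per element is bounded by half the scale, which is why choosing a good scale (rather than letting outliers stretch it) matters for accuracy.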

Level: Intermediate
Type: Talk
Tags: Deep Learning & AI

Day: TBD
Time: TBD
Location: TBD

S7314 - Fast Flow-Based Distance Quantification and Interpolation for High-Resolution Density Distributions

Steffen Frey Postdoc, Visualization Research Center, University of Stuttgart
Steffen Frey is a postdoc in Thomas Ertl's lab at the Visualization Research Center of the University of Stuttgart. His research primarily focuses on performance-related aspects in scientific visualization. Steffen has made research contributions in situ visualization, (dynamic) parameter tuning and performance prediction, the analysis of time-dependent data, image-based visualization, as well as scheduling for interactive visualization. He also serves on numerous committees in the field. He has a diploma in computer science and a Ph.D. in visualization, both from the University of Stuttgart.

We'll discuss our GPU-targeted algorithm design for the efficient computation of distances and interpolates between high-resolution density distributions (based on the Earth Mover's Distance / the Wasserstein metric). We particularly focus on the changes - and their rationale - to transition from our previous multicore approach to a manycore design (utilizing NVIDIA® CUDA®, CUB, and Thrust) that yields a massive improvement in performance. Expressive distances and interpolates are a crucial building block for numerous applications in computer vision, computer graphics, and visualization, and we'll give examples from different areas to demonstrate both utility and performance of our improved approach.
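
In one dimension, the Earth Mover's Distance between equal-mass histograms reduces to the L1 distance between their cumulative distributions; the talk's method targets high-resolution (multi-dimensional) distributions on the GPU, which this tiny NumPy sketch does not attempt:

```python
import numpy as np

def emd_1d(p, q, bin_width=1.0):
    # EMD between two 1D histograms of equal total mass:
    # the area between their cumulative distribution functions.
    assert np.isclose(p.sum(), q.sum())
    return float(np.abs(np.cumsum(p - q)).sum() * bin_width)

p = np.array([1.0, 0.0, 0.0])
q = np.array([0.0, 0.0, 1.0])
# Moving one unit of mass two bins to the right costs 2.
```

Unlike pointwise metrics, this distance grows with how far mass must travel, which is what makes it expressive for comparing density distributions.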

Level: Intermediate
Type: Talk
Tags: In-Situ & Scientific Visualization; Rendering & Ray Tracing

Day: TBD
Time: TBD
Location: TBD

S7320 - Optimizing Efficiency of Deep Learning Workloads through GPU Virtualization

Tim Kaldewey Performance Architect, IBM Watson
Tim Kaldewey is the performance architect for the Watson Innovations (R&D) group at IBM. Before joining IBM, Tim worked in Oracle's special projects group as a senior researcher with a focus on high-performance data management. He joined IBM Research in 2010 and moved to his current position in the Watson Group when it was established in 2014. He received his Ph.D. in computer science from the University of California, Santa Cruz, in 2010. Tim has published over two dozen articles, including two best paper award winners at major conferences. He is also adjunct faculty at the University of Pennsylvania, where he teaches selected GPU acceleration topics.
David K. Tam Performance Analyst, IBM
David Tam is a performance analyst in the Power Systems Performance Department at the IBM Canada Lab. David has worked on performance analysis tools development, performance optimization of Watson Services on Power Systems, and performance analysis of hardware-accelerated databases. In 2016, he received an IBM Outstanding Technical Achievement Award for his work on deep optimization of Watson Services on Power Systems. David received his B.A., M.S., and Ph.D. in computer engineering from the University of Toronto in 1999, 2003, and 2010, respectively. He is the author or co-author of eight technical papers.

Cognitive applications are reshaping the IT landscape, with entire data centers designed and built solely for that purpose. Though computationally challenging, deep learning networks have become a critical building block for boosting the accuracy of cognitive offerings like Watson. We'll present a detailed performance study of deep learning workloads and show how sharing accelerator resources can improve throughput by a factor of three, effectively turning a four-GPU commodity cloud system into a high-end 12-GPU supercomputer. Using Watson workloads from three major areas that incorporate deep learning technology (language classification, visual recognition, and speech recognition), we document the effectiveness and scalability of this approach.

Level: Intermediate
Type: Talk
Tags: Deep Learning & AI; Performance Optimization

Day: TBD
Time: TBD
Location: TBD

S7332 - Accelerated Astrophysics: Using NVIDIA(R) DGX-1(TM) to Simulate and Understand the Universe

Brant Robertson Associate Professor of Astronomy and Astrophysics, University of California, Santa Cruz
Brant Robertson is an Associate Professor in the Department of Astronomy and Astrophysics at the University of California, Santa Cruz. His research interests include theoretical topics related to galaxy formation, dark matter, hydrodynamics, and numerical simulation methodologies. Brant was previously an assistant professor at the University of Arizona from 2011-2015, held a Hubble Fellowship in the Astronomy Department at the California Institute of Technology from 2009-2011, and a Spitzer and Institute Fellowship at the Kavli Institute for Cosmological Physics and Enrico Fermi Institute at the University of Chicago from 2006-2009. Brant earned his Ph.D. in astronomy from Harvard University in 2006, and received his B.S. in physics and astronomy at the University of Washington, Seattle in 2001. He can be found on Twitter at @brant_robertson.

Get an overview of how GPUs are used by computational astrophysicists to perform numerical simulations and process massive survey data. Astrophysics represents one of the most computationally heavy sciences, where supercomputers are used to analyze enormous amounts of data or to simulate physical processes that cannot be reproduced in the lab. Astrophysicists strive to stay on the cutting edge of computational methods to simulate the universe or process data faster and with more fidelity. We'll discuss two important applications of GPU supercomputing in astrophysics. We'll describe the astrophysical fluid dynamics code CHOLLA that runs on the GPU-enabled supercomputer Titan at Oak Ridge National Lab and can perform some of the largest astrophysical simulations ever attempted. Then we'll describe the MORPHEUS deep learning framework that classifies galaxy morphologies using the NVIDIA DGX-1 deep learning system.

Level: All
Type: Talk
Tags: Astronomy & Astrophysics; Supercomputing & HPC

Day: TBD
Time: TBD
Location: TBD

S7343 - Optimizer's Toolbox: Fast CUDA Techniques for Real-Time Image Processing

Sarah Kabala High-Performance Graphics Engineer, Aechelon Technology, Inc.
Sarah Kabala is a graphics engineer at Aechelon Technology, pioneers in image generator hardware and software for aircraft simulators and geospecific Earth database creation. Her development work includes real-time image-processing kernels for filter effects and object tracking and artist-friendly tools to detect features in aerial imagery with machine learning. Following a passion for applied programming, Sarah left her hometown Ph.D. program at Iowa State University to join Aechelon. Raised by a blind parent, she began thinking about vision and imagery at an early age. In her free time, Sarah plays Tetris and builds Duplos with her nephews.

Take your kernels to the next level with performance-enhancing techniques for all levels of the CUDA memory hierarchy. We'll share lessons gleaned from implementing demanding image-processing algorithms in the real-time visual simulation world. From CPU prototype to optimized GPU implementation, one algorithm saw a 150,000X speedup. Techniques to be presented include: instantaneous image decimation; CDF via warp shuffle; block and grid shapes for easy-to-program cache optimization; designing XY-separable kernels and their intermediate data; and sliding-window tradeoffs for maximum cache locality. Straightforward examples will make these optimizations easy to add to your CUDA toolbox.
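As a taste of the XY-separable idea mentioned in the abstract, here is an illustrative CPU-side sketch (ours, not Aechelon's code): a separable blur splits one 2D filter into a row pass and a column pass, with the row pass producing the intermediate data the column pass consumes, cutting work from O(k*k) to O(2k) taps per pixel.

```python
def convolve_rows(img, k):
    # Horizontal pass with clamped borders; img is a list of rows of floats.
    h, w, r = len(img), len(img[0]), len(k) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = sum(k[i] * img[y][min(max(x + i - r, 0), w - 1)]
                            for i in range(len(k)))
    return out

def convolve_cols(img, k):
    # Vertical pass, same kernel, same clamping.
    h, w, r = len(img), len(img[0]), len(k) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = sum(k[i] * img[min(max(y + i - r, 0), h - 1)][x]
                            for i in range(len(k)))
    return out

def separable_blur(img, k):
    # Row pass produces the intermediate data; the column pass consumes it.
    return convolve_cols(convolve_rows(img, k), k)

# A 5x5 test image: a single bright horizontal line.
img = [[0.0] * 5 for _ in range(2)] + [[1.0] * 5] + [[0.0] * 5 for _ in range(2)]
blurred = separable_blur(img, [0.25, 0.5, 0.25])
```

On a GPU, the intermediate rows are where the cache-locality and block-shape choices discussed in the talk come into play.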

Level: Intermediate
Type: Talk
Tags: Real-Time Graphics; Video and Image Processing; Performance Optimization

Day: TBD
Time: TBD
Location: TBD

S7346 - Real Time American Sign Language Video Captioning Using Deep Neural Networks

Syed Ahmed Research Assistant, Rochester Institute of Technology
Syed Tousif Ahmed is majoring in computer engineering at RIT and works there as a research assistant in the Future Everyday Technology Lab and in the Center on Access Technology. Syed's interests lie in computer vision, machine learning, embedded systems, and cryptography.

We'll demonstrate how to build an end-to-end video captioning system using deep neural networks. The specific application we'll discuss is an American Sign Language video captioning system. We'll discuss implementation details of the neural network with popular frameworks, like TensorFlow and Torch, as well as how to deploy the system on embedded platforms, like the NVIDIA® Jetson™ TX1 and NVIDIA SHIELD™ tablet, to achieve real-time captioning of live videos.

Level: Intermediate
Type: Talk
Tags: Deep Learning & AI; Computer Vision & Machine Vision; Media & Entertainment

Day: TBD
Time: TBD
Location: TBD

S7383 - Accelerating Cyber Threat Detection with GPU

Joshua Patterson Director of Applied Solution Engineering, NVIDIA
Joshua Patterson is the director of Applied Solution Engineering at NVIDIA and a former Presidential Innovation Fellow. Prior to NVIDIA, Josh was the principal data scientist in Accenture's Cyber Security Lab. Rethinking the cyber defense problem, Josh worked with leading experts across the public and private sectors and academia to build a next-generation cyber defense platform. His current passions are graph analytics, GPUs, and advanced visualization. Josh also loves storytelling with data, and some of his work can be seen at Hotshotcharts (http://www.hotshotcharts.com/), Disaster Visualization (http://www.disasterviz.com/), and Loan Risk Analysis (http://www.riskanalyticsviz.com). Josh holds a B.A. in economics from the University of North Carolina at Chapel Hill and an M.A. in economics from the University of South Carolina Moore School of Business.

Analyzing vast amounts of enterprise cyber security data to find threats is hard. Cyber threat detection is also a continuous task, and financial pressure forces companies to find optimized solutions for this volume of data. We'll discuss the evolution of big data architectures used for cyber defense and how GPUs are allowing enterprises to do better threat detection more efficiently. We'll cover: (1) the evolution from traditional platforms to lambda architectures with new approaches like Apache Kudu, and ultimately to GPU-accelerated solutions; (2) current GPU-accelerated database, analysis, and visualization technologies (such as Kinetica and Graphistry) and the problems they solve; (3) the need to move beyond traditional table-based data stores to graphs for more advanced data exploration, analytics, and visualization; and (4) the latest advances in GPU-accelerated graph analytics and their importance for improved cyber threat detection.

Level: Intermediate
Type: Talk
Tags: Accelerated Analytics

Day: TBD
Time: TBD
Location: TBD

S7391 - Turbocharging VMD Molecular Visualizations with State-of-the-Art Rendering and VR Technologies

John Stone Senior Research Programmer, University of Illinois at Urbana-Champaign
Highly-Rated Speaker
John Stone is a senior research programmer in the Theoretical and Computational Biophysics Group at the Beckman Institute for Advanced Science and Technology, and associate director of the NVIDIA CUDA Center of Excellence at the University of Illinois. John is the lead developer of VMD, a high-performance molecular visualization tool used by researchers all over the world. His research interests include molecular visualization, GPU computing, parallel processing, ray tracing, haptics, and virtual environments. John was awarded as an NVIDIA CUDA Fellow in 2010. In 2015, he joined the Khronos Group advisory panel for the Vulkan graphics API. He also provides consulting services for projects involving computer graphics, GPU computing, and high performance computing.

State-of-the-art molecular simulations pose many challenges for effective visualization and analysis due to their size, timescale, and the growing complexity of the structures under study. Fortunately, a panoply of new and emerging technologies can address these challenges. We'll describe our experiences and progress adapting VMD, a widely used molecular visualization and analysis tool, to exploit new rasterization APIs such as EGL and Vulkan, and the NVIDIA OptiX(TM) ray tracing API for interactive, in-situ, and post-hoc molecular visualization on workstations, clouds, and supercomputers, highlighting the latest results on IBM POWER hardware. Commodity VR headsets offer a tremendous opportunity to make immersive molecular visualization broadly available to molecular scientists, but they present many performance challenges for both rasterization- and ray tracing-based visualization. We'll present results from our ongoing work adapting VMD to support popular VR HMDs.

Level: Intermediate
Type: Talk
Tags: In-Situ & Scientific Visualization; Virtual Reality and Augmented Reality; Supercomputing & HPC

Day: TBD
Time: TBD
Location: TBD

S7405 - Bifrost: a Python/C++ Framework for Easy High-Throughput Computing

Miles Cranmer Research Assistant, Harvard-Smithsonian Center for Astrophysics
Miles Cranmer is a physics undergraduate at McGill University and a research assistant in Lincoln Greenhill's group at the Harvard-Smithsonian Center for Astrophysics, specializing in software instrumentation for radio telescopes. He loves astroinformatics and machine learning, and has a profound interest in singularities, both black holes and superintelligence.

Bogged down trying to build a fast GPU processing pipeline? We'll present a solution: Bifrost, a framework for rapidly composing real-time data collection and analysis pipelines. Real-time data processing lies at the heart of most modern radio telescopes, and while hardware capabilities and data collection rates advance to the petascale regime, development of efficient real-time processing codes remains difficult and time-consuming. Bifrost solves this problem by combining a TensorFlow-like Python API with a library of common algorithms and highly efficient data transport. We'll describe the design and implementation of this framework, and demonstrate its use as the backend for a large radio telescope.
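To illustrate the pipeline-composition style described above, here is a toy Python pipeline of chained generator blocks. The block names and structure are invented for illustration and are not Bifrost's actual API; Bifrost adds the GPU algorithms and efficient data transport that a sketch like this lacks.

```python
# A stand-in for data collection: yields fixed-size blocks of samples.
def source(blocks):
    for block in blocks:
        yield block

# A processing block: remove the mean from each data block.
def detrend(stream):
    for block in stream:
        mean = sum(block) / len(block)
        yield [x - mean for x in block]

# A reduction block: total power per data block.
def power(stream):
    for block in stream:
        yield sum(x * x for x in block)

# Blocks compose into a pipeline, and data flows through lazily.
pipeline = power(detrend(source([[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]])))
results = list(pipeline)   # one power value per input block
```

The appeal of this style is that each stage is independently testable and stages can be swapped without touching the rest of the pipeline.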

Level: Intermediate
Type: Talk
Tags: Astronomy & Astrophysics; Tools and Libraries

Day: TBD
Time: TBD
Location: TBD

S7424 - Introduction and Techniques with NVIDIA Voxels

Rama Hoetzlein Graphics Research Engineer, NVIDIA
Rama Hoetzlein is the lead architect of NVIDIA Voxels (GVDB) at NVIDIA, where he investigates applications of sparse volumes to 3D printing, scientific visualization, and motion pictures. In 2010, Rama's interdisciplinary thesis work in media arts at the University of California, Santa Barbara, explored creative support tools for procedural modeling. He studied computer science and fine arts at Cornell University, and co-founded the Game Design Initiative at Cornell in 2001.

We'll explore NVIDIA Voxels, a new open-source SDK framework for generic representation, computation, and rendering of voxel-based data. We'll introduce the features of the new SDK and cover applications and examples in motion pictures, scientific visualization, and 3D printing. NVIDIA Voxels, based on GVDB sparse volume technology and inspired by OpenVDB, manipulates large volumetric datasets entirely on the GPU using a hierarchy of grids. The second part of the talk will cover in-depth use of the SDK, with code samples and coverage of the design aspects of NVIDIA Voxels. A sample code walk-through will demonstrate how to build sparse volumes, render high-quality images with NVIDIA OptiX(TM) integration, produce dynamic data, and perform compute-based operations.
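The "hierarchy of grids" idea can be sketched in a few lines. This is an illustrative toy, not the GVDB API: voxels live in small dense bricks, and bricks are allocated only where data exists, so empty space costs nothing.

```python
BRICK = 8  # 8x8x8 voxels per brick, as in many sparse-volume schemes

class SparseVolume:
    def __init__(self):
        # Top-level grid: brick coordinate -> dense brick storage.
        self.bricks = {}

    def set(self, x, y, z, value):
        key = (x // BRICK, y // BRICK, z // BRICK)
        brick = self.bricks.setdefault(key, {})   # allocate on first touch
        brick[(x % BRICK, y % BRICK, z % BRICK)] = value

    def get(self, x, y, z):
        key = (x // BRICK, y // BRICK, z // BRICK)
        return self.bricks.get(key, {}).get(
            (x % BRICK, y % BRICK, z % BRICK), 0.0)  # background value

vol = SparseVolume()
vol.set(100, 5, 7, 1.0)
# Only one brick is resident, despite the large addressable space.
```

A real GPU implementation replaces the dictionaries with pooled device memory and adds more levels to the hierarchy, but the addressing arithmetic is the same.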

Level: All
Type: Talk
Tags: In-Situ & Scientific Visualization; Computational Physics; Manufacturing Industries; Media & Entertainment

Day: TBD
Time: TBD
Location: TBD

S7435 - Adapting DL to New Data: An Evolutionary Algorithm for Optimizing Deep Networks

Steven Young Research Scientist in Deep Learning, Oak Ridge National Laboratory
Steven Young is a researcher at Oak Ridge National Laboratory working in the Computational Data Analytics Group. His research focuses on applying deep learning to challenging datasets using HPC to enable faster training and quicker discovery. He has a Ph.D. in computer engineering from the University of Tennessee, where he studied machine learning in the Machine Intelligence Lab.

Deep learning has seen a surge of success in imaging and speech applications, thanks to its relatively automatic feature generation and, in the case of convolutional neural networks, high-accuracy classification abilities. While these models learn their parameters through data-driven methods, model selection (that is, architecture construction) through hyper-parameter choices remains a tedious and highly intuition-driven task. To address this, we propose multi-node evolutionary neural networks for deep learning (MENNDL), a method for automating network selection on computational clusters through hyper-parameter optimization performed via genetic algorithms. MENNDL evolves not only numeric hyper-parameters (for example, the number of hidden nodes or the convolutional kernel size) but also the arrangement of layers within the network.
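The genetic-algorithm loop at the heart of such hyper-parameter optimization can be sketched as follows. This is a hypothetical toy with an invented fitness function, not MENNDL itself; in practice each fitness evaluation is a full network training run farmed out to a cluster node.

```python
import random

random.seed(0)

# Invented stand-in for validation accuracy, with its optimum at
# hidden nodes = 96, kernel size = 3.
def fitness(ind):
    hidden, kernel = ind
    return -abs(hidden - 96) - 10 * abs(kernel - 3)

def mutate(ind):
    hidden, kernel = ind
    return (max(8, hidden + random.randint(-16, 16)),
            max(1, kernel + random.randint(-1, 1)))

# Random initial population of (hidden nodes, kernel size) pairs.
population = [(random.randint(8, 256), random.randint(1, 9)) for _ in range(20)]
initial_best = max(population, key=fitness)

for generation in range(40):
    population.sort(key=fitness, reverse=True)
    parents = population[:5]   # elitist truncation selection
    population = parents + [mutate(random.choice(parents)) for _ in range(15)]

best = max(population, key=fitness)
```

Because the top individuals are carried over each generation, the best fitness never degrades; MENNDL additionally mutates the layer arrangement itself, not just numeric values.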

Level: Intermediate
Type: Talk
Tags: Deep Learning & AI; Supercomputing & HPC

Day: TBD
Time: TBD
Location: TBD

S7437 - Deep Learning based Big Data Analytics for Medical Imaging

Di Zhao Dr., Chinese Academy of Sciences
Dr. Di Zhao presented research in Show Your Science at GTC 2016.

Medical big data includes electronic health records, medical imaging, genomic data, and more, with medical imaging accounting for more than 90 percent of the total. How can medical big data be applied in clinical practice? This question concerns both medical and computational researchers, and deep learning with GPU computing provides an excellent answer. We'll introduce our research on deep learning-based diagnosis of diseases such as Alzheimer's disease and mild cognitive impairment, and discuss the current status and approaches of deep learning-based medical big data analytics.

Level: All
Type: Talk
Tags: Healthcare & Life Sciences; Accelerated Analytics; Deep Learning & AI

Day: TBD
Time: TBD
Location: TBD

S7449 - Driving the Assembly of the Zebrafish Connectome through Deep Learning

Ishtar Nyawira Co-President, Timmy Global Health: Pitt Chapter, University of Pittsburgh
Ishtar Nyawira is a computer science major at the University of Pittsburgh (class of 2018). Upon entering her freshman year, she chose to study biology but quickly grew interested in computer science, despite having little background in the field. After changing her major in her third year, she became wholly dedicated to educating herself in computer science, both inside and outside the classroom. After she graduates with a B.S. in computer science and a minor in Korean, she will pursue a Ph.D. in machine learning or computer science. She works at the Pittsburgh Supercomputing Center on a machine learning project that will harness the power of deep learning to automate the process of high-resolution biomedical image annotation. Her current research interests include machine learning and deep learning, natural language processing and computational linguistics, software engineering, biological modeling and simulation, and the pairing of HPC and AI.
Nick Nystrom Senior Director of Research, Pittsburgh Supercomputing Center
Nick Nystrom is senior director of research at the Pittsburgh Supercomputing Center. Nick leads the scientific research and future technology teams of PSC, including the user support for scientific applications, biomedical, and public health applications groups, as well as a core team targeting strategic applications, allocations, and project management. He is principal investigator for "Bridges," a new kind of supercomputer that converges HPC and HPDA and aims to aid researchers who are new to HPC. His research interests include machine learning and data analytics, genomics, causal modeling, coupling HPC applications and AI, graph algorithms, hardware and software architecture, software engineering for HPC, and performance modeling. Nick earned his B.S. in chemistry, math, and physics and his Ph.D. in quantum chemistry from the University of Pittsburgh.

Tracing pathways through large volumes of data is an incredibly tedious, time-consuming process that significantly encumbers progress in neuroscience and the tracing of neurons through an organism. We'll explore the potential for applying deep learning to the automation of high-resolution scanning electron microscope image data segmentation. We've started with neural pathway tracing through 5.1GB of whole-brain serial-section slices from larval zebrafish collected by the Center for Brain Science at Harvard. This kind of manual image segmentation requires years of careful work to properly trace the neural pathways in an organism as small as a zebrafish larva, which is approximately 5 mm in total body length. Automating this process could vastly improve productivity, which would lead to faster data analysis and more breakthroughs in understanding the complexity of the brain.

Level: All
Type: Talk
Tags: Deep Learning & AI; Supercomputing & HPC

Day: TBD
Time: TBD
Location: TBD

S7452 - Cutting Edge OptiX Ray Tracing Techniques for Visualization of Biomolecular and Cellular Simulations in VMD

John Stone Senior Research Programmer, University of Illinois at Urbana-Champaign
Highly-Rated Speaker
John Stone is a senior research programmer in the Theoretical and Computational Biophysics Group at the Beckman Institute for Advanced Science and Technology, and associate director of the NVIDIA CUDA Center of Excellence at the University of Illinois. John is the lead developer of VMD, a high-performance molecular visualization tool used by researchers all over the world. His research interests include molecular visualization, GPU computing, parallel processing, ray tracing, haptics, and virtual environments. John was awarded as an NVIDIA CUDA Fellow in 2010. In 2015, he joined the Khronos Group advisory panel for the Vulkan graphics API. He also provides consulting services for projects involving computer graphics, GPU computing, and high performance computing.

We'll present the latest advances in the use of NVIDIA® OptiX™ for high-fidelity rendering of state-of-the-art biomolecular and cellular simulations, including the latest technical advances in the OptiX-based ray-tracing engines in VMD, which are heavily used both for interactive progressive ray tracing (local and remote) and for batch-mode in-situ or post-hoc visualization of petascale molecular dynamics simulations.

Level: All
Type: Talk
Tags: Rendering & Ray Tracing; In-Situ & Scientific Visualization; Healthcare & Life Sciences; Supercomputing & HPC

Day: TBD
Time: TBD
Location: TBD

S7466 - Production-Quality, Final-Frame Rendering on a GPU

Panagiotis Zompolas CTO, Redshift Rendering
Panagiotis Zompolas is a video game industry veteran driven by a passion for computer graphics and hardware. Panos has worked with GPUs since the days of 3dfx and has closely followed the GPU compute revolution since its inception in the mid-2000s. Panos' career in the video game industry includes leading companies like Sony Computer Entertainment Europe and Double Helix Games (now Amazon Games). He has led teams of graphics programmers in the creation of render engines spanning several generations of hardware. This experience, tied with his passion for the industry, is one of the key pillars of Redshift's success.
Robert Slater VP Engineering, Redshift
Robert Slater is a seasoned GPU software engineer and video game industry veteran, with a vast amount of experience in and passion for the field of programming. As a programmer, Rob has worked for companies such as Electronic Arts, Acclaim, and Double Helix Games (now Amazon Games). During this time, he was responsible for the core rendering technology at each studio, driving their creative and technical development. Rob's graphics engine programming experience and know-how ensures that Redshift is always at the forefront of new trends and advances in the industry.

We'll discuss the latest features of Redshift, the GPU-accelerated renderer running on NVIDIA GPUs that is redefining the industry's perception of GPU final-frame rendering, and demonstrate a few examples of customer work. This talk will be of interest to industry professionals who want to learn more about GPU-accelerated production-quality rendering, as well as software developers interested in GPU-accelerated rendering.

Level: Intermediate
Type: Talk
Tags: Rendering & Ray Tracing; Media & Entertainment

Day: TBD
Time: TBD
Location: TBD

S7467 - Multi-Dimensional Deep Learning for Medical Images

Bradley Erickson Director, Radiology Informatics Lab, Mayo Clinic
Brad Erickson received his M.D. and Ph.D. from Mayo Clinic. He went on to be trained in radiology, and then a neuroradiology fellowship at Mayo, and has been on staff at Mayo for 20 years. He does clinical neuroradiology, has been chair of the Radiology Informatics Division, and is currently associate chair for research. He has been vice chair of information technology for Mayo Clinic. He has been awarded multiple external grants, including NIH grants on MS, brain tumors, polycystic kidney disease, and medical image processing. He is a former president of the Society of Imaging Informatics in Medicine and is the chair of the board of directors for the American Board of Imaging Informatics and is on the board of the IHE USA. He holds several patents and has been involved in three startup companies.

Machine learning and deep learning have been applied to medical images to predict tumor type, genomics, and therapy effects. They can also be used to segment images, such as to define a tumor. While some traditional machine learning work has been multi-dimensional and multi-parametric, very few deep learning applications have gone beyond applying photographic networks to medical image problems. As such, they ignore some of the rich information available in other dimensions (3D and time) as well as parameter space (other types of images). We'll discuss some of the challenges and early results in extending traditional 2D convolutional neural networks to n-dimensional images, including space, time, and other parametric image types. Challenges include representational issues as well as computational ones (for example, memory constraints). Applications we'll show include multi-dimensional image segmentation of brain tumors as well as prediction of tumor genomics and therapy response.
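The shape bookkeeping behind the 2D-versus-nD distinction can be sketched with an illustrative helper (ours, not the authors' code): the same "valid" convolution rule applies per axis, so a 3D or 4D kernel simply spans additional dimensions of a medical volume.

```python
def conv_output_shape(input_shape, kernel_shape):
    # "Valid" convolution, stride 1: each axis shrinks by (kernel - 1).
    assert len(input_shape) == len(kernel_shape)
    return tuple(i - k + 1 for i, k in zip(input_shape, kernel_shape))

# A photographic 2D network sees one 256x256 slice at a time...
slice2d = conv_output_shape((256, 256), (3, 3))
# ...a 3D network also convolves across the slice axis...
volume3d = conv_output_shape((64, 256, 256), (3, 3, 3))
# ...and a 4D kernel additionally spans the time axis of a dynamic study.
study4d = conv_output_shape((10, 64, 256, 256), (2, 3, 3, 3))
```

The memory-constraint challenge mentioned above follows directly: activations grow multiplicatively with each added axis, which is why nD medical networks press against GPU memory far sooner than 2D photographic ones.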

Level: All
Type: Talk
Tags: Healthcare & Life Sciences; Deep Learning & AI

Day: TBD
Time: TBD
Location: TBD


TUTORIAL

Presentation
Details

S7271 - Leveraging Electronic Health Records Using Recurrent Neural Networks

David Ledbetter Data Scientist, Children's Hospital Los Angeles
David Ledbetter has an extensive and deep understanding of decision theory. He has experience implementing various decision engines, including convolutional neural networks, recurrent neural networks, random forests, extra trees, and linear discrimination analysis. His particular area of focus is in performance estimation, where he has demonstrated a tremendous ability to accurately predict performance on new data in nonstationary, real-world scenarios. David has worked on a number of real-world detection projects, including detecting circulating tumor cells in blood, automatic target recognition utilizing CNNs from satellite imagery, make/model car classification for the Los Angeles Police Department using CNNs, and acoustic right whale call detection from underwater sonobuoys. Recently, David has been developing RNNs to generate personalized treatment recommendations to optimize patient outcomes using unstructured electronic medical records from 10 years of data.

We'll explore how deep learning can be leveraged in a healthcare setting to predict severity of illness in patients based on information provided in electronic health records from a pediatric intensive care unit. This tutorial session will use the deep learning framework Keras to build a recurrent neural network. The result will be an analytic framework powered by deep learning that provides medical professionals the capability to generate patient mortality predictions at any time of interest.
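The core idea, folding a patient's time series into a recurrent hidden state that can be read out at any point, can be sketched with a toy single-unit recurrent network. The weights and vital-sign values below are invented for illustration; this is not the tutorial's Keras model.

```python
import math

def rnn_risk(seq, w_in=0.9, w_rec=0.5, w_out=2.0):
    # Fold a sequence of (hypothetical) hourly vital-sign values into one
    # hidden state, then read out a risk score through a sigmoid.
    h = 0.0
    for x in seq:
        h = math.tanh(w_in * x + w_rec * h)    # recurrent state update
    return 1.0 / (1.0 + math.exp(-w_out * h))  # sigmoid read-out

# A prediction can be generated at any time of interest, using only the
# measurements observed so far:
early_risk = rnn_risk([0.1, 0.2])
later_risk = rnn_risk([0.1, 0.2, 1.5, 1.8])    # deteriorating vitals
```

A trained Keras RNN replaces the scalar state with a learned vector state and learned weights, but the "predict at any time step" property works the same way.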

Level: All
Type: Tutorial
Tags: Healthcare & Life Sciences; Deep Learning & AI

Day: TBD
Time: TBD
Location: TBD


INSTRUCTOR-LED LAB

Presentation
Details

L7104 - Deep Learning Using Microsoft Cognitive Toolkit with Hands-on Tutorials

Sayan Pathak Principal Engineer and ML Scientist, Microsoft
Sayan Pathak is a principal engineer and machine learning scientist at Microsoft. He is also a faculty member at the University of Washington and IIT Kharagpur (India), with interests in deep learning, vision, informatics, and online ads.

Level: Intermediate
Type: Instructor-Led Lab
Tags: Deep Learning & AI

Day: TBD
Time: TBD
Location: TBD

L7105 - Learn to Create a Simple Object Detection Pipeline with CUDA EGL Streams and GIE

Yogesh Kini Manager, System Software, NVIDIA
Yogesh Kini manages the Tegra CUDA driver team at NVIDIA. For the last several years he has been working on enabling GPU compute software on different Tegra platforms. His team is responsible for CUDA API and compute system software on various embedded, mobile, and automotive platforms based on the Tegra SoC. He holds a B.S. in computer science from the Manipal Institute of Technology, India.

Level: Intermediate
Type: Instructor-Led Lab
Tags: Deep Learning & AI; Self-Driving Cars

Day: TBD
Time: TBD
Location: TBD

L7107 - Kokkos, Manycore Performance Portability Made Easy for C++ HPC Applications

H. Carter Edwards Principal Member of Technical Staff, Sandia National Laboratories
Highly-Rated Speaker
H. Carter Edwards currently leads the Kokkos project (github.com/kokkos/kokkos) at Sandia National Laboratories. Carter has a B.S. and M.S. in aerospace engineering and a Ph.D. in computational mathematics from the University of Texas at Austin. He has over three decades of experience in modeling and simulation software development and over two decades of experience in HPC, parallel processing, and C++ software development. His recent (six-year) HPC focus is on algorithms and programming models for thread-scalable and performance-portable parallelism across next-generation platform (NGP) node architectures. Carter represents Sandia on the ISO C++ language standard committee.
Christian Trott Senior Member Technical Staff, Sandia National Laboratories
Christian Trott is a high performance computing expert with experience in designing and implementing software for GPU and MIC compute clusters. He earned a Dr. rer. nat. in theoretical physics from the University of Technology Ilmenau. His prior scientific work focused on computational materials research using ab initio calculations, molecular dynamics simulations, and Monte Carlo methods. Since 2015, Christian has been a senior member of technical staff at Sandia National Laboratories. He is a core developer of the Kokkos programming model, with a large role in advising applications on adopting Kokkos to achieve performance portability for next-generation supercomputers.
Fernanda Foertter (tbd - on file with GTC), Oak Ridge National Laboratory
Highly-Rated Speaker
(tbd: on file with GTC)

Level: Intermediate
Type: Instructor-Led Lab
Tags: Supercomputing & HPC; Tools and Libraries

Day: TBD
Time: TBD
Location: TBD

L7108 - CUDA Programming in Python with Numba

Stanley Seibert Director of Community Innovation, Continuum Analytics
Dr. Stanley Seibert is the Director of Community Innovation at Continuum Analytics and also contributes to the Numba project. He received a Ph.D. in experimental high energy physics from the University of Texas at Austin and performed research at Los Alamos National Laboratory, University of Pennsylvania and the Sudbury Neutrino Observatory. Prior to joining Continuum Analytics, Stan was Chief Data Scientist at Mobi working on vehicle fleet tracking and route planning. Stan has more than a decade of experience using Python for data analysis and has been doing GPU computing since 2008.
Siu Kwan Lam Software Developer, Continuum Analytics
Siu Kwan Lam is a software developer at Continuum Analytics and the lead developer of the Numba open source compiler project. He has a B.S. and M.S. degree in Computer Engineering from San Jose State University. He taught CUDA at San Jose State University during his senior year and has researched TCP covert channel detection for NSF, STC, and TRUST.

Level: Intermediate
Type: Instructor-Led Lab
Tags: Programming Languages; Tools and Libraries

Day: TBD
Time: TBD
Location: TBD

L7109 - Advanced NVIDIA GRID Deployment Hands-on Lab

Jeff Weiss Director, Solution Architects West Territory, NVIDIA
Jeff Weiss is a director who leads the West Territory SA team within the Solution Architecture and Engineering group at NVIDIA. Prior to joining NVIDIA, Jeff spent seven years at VMware as an EUC staff engineer, as well as time at Symantec and Sun Microsystems. Along with his current focus on NVIDIA GPU-enabled computing solutions, his experience includes HPC, datacenter business continuity/disaster recovery solutions, software infrastructure identity management, and email security/archiving tools.
Shailesh Deshmukh Sr. Solutions Architect, NVIDIA
Shailesh Deshmukh is a seasoned systems engineer with 15+ years of experience in virtualization, networking, and storage technologies. He is passionate about new technologies such as NVIDIA GRID in VDI, VR, deep learning, and autonomous driving. He's a big fan of outdoor activities and sci-fi movies.

Level: Intermediate
Type: Instructor-Led Lab
Tags: Graphics Virtualization

Day: TBD
Time: TBD
Location: TBD

L7114 - Multi GPU Programming with MPI and OpenACC

Jiri Kraus Senior Devtech Compute, NVIDIA
Highly-Rated Speaker
Jiri Kraus is a senior developer in NVIDIA's European DevTech team. In his work he focuses on multi-GPU programming models and NVIDIA's collaborations with the Juelich Supercomputing Centre.
Robert Henschel Director Science Community Tools, Indiana University
Robert received his M.Sc. from Technische Universität Dresden, Germany. He joined Indiana University in 2008, first as manager of the Scientific Applications Group and, since 2016, as director for Science Community Tools. His responsibilities include ensuring that IU scientists can use IU's IT resources to enable scientific discoveries and breakthroughs.
Guido Juckeland Head of Computational Science Group, Helmholtz-Zentrum Dresden-Rossendorf
Guido Juckeland received his Ph.D. from Technische Universität Dresden for his work on trace-based performance analysis for hardware accelerators. He has a long history of working with GPUs and teaching GPU programming. In 2016 he joined HZDR to head the newly founded computational science group. His responsibilities include working with researchers to better utilize the central IT resources for scientific purposes.

Level: Intermediate
Type: Instructor-Led Lab
Tags: Programming Languages; Supercomputing & HPC

Day: TBD
Time: TBD
Location: TBD

L7115 - In-Depth Performance Analysis for OpenACC/CUDA®/OpenCL Applications with Score-P and Vampir

Guido Juckeland Head of Computational Science Group, Helmholtz-Zentrum Dresden-Rossendorf (HZDR)
Guido Juckeland received his Ph.D. from Technische Universität Dresden for his work on trace-based performance analysis for hardware accelerators. He has a long history of working with GPUs and teaching GPU programming. In 2016 he joined HZDR to head the newly founded computational science group. His responsibilities include working with researchers to better utilize the central IT resources for scientific purposes.
Robert Henschel Director Science Community Tools, Indiana University
Robert received his M.Sc. from Technische Universität Dresden, Germany. He joined Indiana University in 2008, first as manager of the Scientific Applications Group and, since 2016, as director for Science Community Tools. His responsibilities include ensuring that IU scientists can use IU's IT resources to enable scientific discoveries and breakthroughs.
Jiri Kraus Senior Devtech Compute, NVIDIA
Highly-Rated Speaker
Jiri Kraus is a senior developer in NVIDIA's European DevTech team. In his work he focuses on multi-GPU programming models and NVIDIA's collaborations with the Juelich Supercomputing Centre.

Level: Intermediate
Type: Instructor-Led Lab
Tags: Supercomputing & HPC; Tools and Libraries

Day: TBD
Time: TBD
Location: TBD

L7120 - Getting Started with Deep Learning (End-to-end Series Part 1)

Level: Beginner
Type: Instructor-Led Lab
Tags: Deep Learning & AI

Day: TBD
Time: TBD
Location: TBD

L7121 - Object Detection with Deep Learning (End-to-end Series Part 2)

Level: Beginner
Type: Instructor-Led Lab
Tags: Deep Learning & AI

Day: TBD
Time: TBD
Location: TBD

L7122 - Image Segmentation with Deep Learning

Level: Intermediate
Type: Instructor-Led Lab
Tags: Deep Learning & AI

Day: TBD
Time: TBD
Location: TBD

L7123 - Network Deployment with Deep Learning (End-to-end Series Part 3)

Level: Intermediate
Type: Instructor-Led Lab
Tags: Deep Learning & AI

Day: TBD
Time: TBD
Location: TBD

L7124 - Medical Image Analysis with Deep Learning

Level: Intermediate
Type: Instructor-Led Lab
Tags: Deep Learning & AI

Day: TBD
Time: TBD
Location: TBD