Program Details - 7th KAUST-NVIDIA Workshop

Monday, 23rd Nov - 2020

10am - 12pm

AI competition: Overview & Kick off

Dr. David R. Pugh, Visualization Scientist, Visualization Core Lab
Dr. Sameh M Abdullah, Research Scientist, Extreme Computing Research Center
Dr. Mohsin Ahmed Shaikh, Computational Scientist, KAUST Supercomputing Lab

Tuesday, 24th Nov - 2020

1pm - 3:30pm

Tutorial: Best Practices for Distributed Deep Learning on Ibex

Dr. David R. Pugh, Visualization Scientist, Visualization Core Lab
Dr. Glendon Holst, Visualization Scientist, Visualization Core Lab
Dr. Mohsin Ahmed Shaikh, Computational Scientist, KAUST Supercomputing Lab


With the increasing size and complexity of both Deep Learning (DL) models and datasets, the computational cost of training these models can be non-trivial, ranging from a few tens of hours to several days. By exploiting the data parallelism inherent in the training process of DL models, we can distribute training across multiple GPUs on one or more nodes of Ibex. We discuss and demonstrate the use of Horovod, a scalable distributed training framework, to train DL models on multiple GPUs on Ibex. Horovod integrates with TensorFlow 1 & 2, PyTorch, and MXNet. We also discuss some caveats to watch for when using Horovod for large mini-batch training.
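Two of the ideas behind the tutorial can be sketched without any GPU at all: each worker trains on its own shard of the data, and the learning rate is scaled up to compensate for the larger effective mini-batch. The sketch below is plain Python (not Horovod code); the names `rank` and `size` mirror the roles of Horovod's `hvd.rank()` and `hvd.size()`.

```python
# Illustrative sketch of data-parallel training concepts (plain Python, no GPUs):
# (1) sharding a dataset across workers by rank, and (2) the linear
# learning-rate scaling rule that the "large mini-batch" caveat refers to.

def shard_indices(num_samples: int, rank: int, size: int) -> list:
    """Indices of the samples assigned to one worker (round-robin shard)."""
    return list(range(rank, num_samples, size))

def scaled_learning_rate(base_lr: float, size: int) -> float:
    """Linear scaling rule: with `size` workers the effective batch grows by
    `size`, so the learning rate is scaled to match (in practice combined
    with a warm-up period to keep early training stable)."""
    return base_lr * size

# Example: 10 samples distributed over 4 workers -- full coverage, no overlap.
shards = [shard_indices(10, r, 4) for r in range(4)]
assert sorted(i for s in shards for i in s) == list(range(10))

print(shards)  # [[0, 4, 8], [1, 5, 9], [2, 6], [3, 7]]
print(f"{scaled_learning_rate(0.01, 4):.2f}")  # 0.04
```

In Horovod itself, the sharding is typically handled by a distributed sampler or `dataset.shard(size, rank)`, and the optimizer is wrapped so gradients are averaged across workers after each step.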

Wednesday, 25th Nov - 2020

9:30am - 12pm

Tutorial: Accelerated Data Science with NVIDIA RAPIDS

 Dr. Manal Jalloul, Lecturer, American University of Beirut, Lebanon


RAPIDS is a collection of data science libraries that enables end-to-end GPU acceleration of data science workflows. Learn to GPU-accelerate these workflows: use cuDF to ingest and manipulate massive datasets directly on the GPU, apply a wide variety of GPU-accelerated machine learning algorithms, including XGBoost and several other cuML algorithms, to perform data analysis, and carry out end-to-end analysis tasks on several realistic datasets. Upon completion, you'll be able to apply these tools to a wide variety of end-to-end data science tasks.
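The ingest-transform-aggregate shape of such a workflow is easy to preview, because cuDF deliberately mirrors the pandas API. The sketch below uses pandas (CPU) so it runs anywhere; on a machine with RAPIDS installed you would typically write `import cudf` instead and the same DataFrame operations run on the GPU. The CSV content is made up for illustration.

```python
# Minimal ingest -> transform -> aggregate sketch, shown with pandas because
# cuDF mirrors its API; swap in cuDF on a RAPIDS install to run on the GPU.
import io
import pandas as pd  # on a GPU system with RAPIDS: `import cudf as pd`

csv = io.StringIO(
    "city,price\n"
    "Thuwal,100\n"
    "Thuwal,120\n"
    "Jeddah,90\n"
)
df = pd.read_csv(csv)                       # ingest
df["price_k"] = df["price"] / 1000          # column-wise transform
means = df.groupby("city")["price"].mean()  # aggregate

print(means.to_dict())  # {'Jeddah': 90.0, 'Thuwal': 110.0}
```

For real workloads the payoff comes from the dataset size: the same few lines apply whether the table has three rows or hundreds of millions, which is where keeping the data resident on the GPU matters.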

Sunday, 29th Nov - 2020

9am - 3pm

Tutorial: Directive based GPU programming using OpenACC

Dr. Saber Feki, Senior Computational Scientist Lead @ KSL


Accelerated computing is fueling some of the most exciting scientific discoveries today. For scientists and researchers seeking faster application performance, OpenACC is a directive-based programming model designed to provide a simple yet powerful approach to programming accelerators without significant effort. With OpenACC, a single version of the source code delivers performance portability across platforms.

This training consists of instructor-led classes that include interactive lectures, hands-on exercises, and office hours with the instructors. You’ll learn everything you need to start accelerating your code with OpenACC on GPUs and CPUs. The courses cover how to parallelize, profile, and optimize your code, as well as how to manage data movement and utilize multiple GPUs.


Tutorial: Overview of NVIDIA Nsight Systems

Dr. Issam Said, Manager Solutions Architecture and Engineering @ NVIDIA



Monday, 30th Nov - 2020

9am - 10:30 am

HPC Hackathon -- Day 1

12pm - 12:15pm

Formal Opening of KAUST-NVIDIA Workshop on Accelerating Scientific Applications using GPUs 

Dr. Saber Feki, Senior Computational Scientist Lead @ KSL

HPC Hackathon: Overview & Kick off

Dr. Bilel Hadri, Computational Scientist @ KSL


Keynote: A Universal Accelerated Computing Platform for the Data Centre  

Dr. Timothy Lanfear, Director, Solution Architecture and Engineering EMEA @ NVIDIA


For more than two decades, NVIDIA has been building a comprehensive ecosystem for accelerated computing. At the base of the stack are high throughput GPUs and low latency network data processing units. A suite of SDKs offers the application developer access to the underlying hardware to support the task of scaling applications from the individual processor to large-scale data centre deployments.

The A100 Tensor Core GPU is the most recent hardware addition to the platform used in the DGX A100 universal accelerated computing system, which is equally suited to scientific simulation, data analytics, AI training and AI inference workloads. The launch of the A100 GPU is accompanied by new solutions such as GPU acceleration of Apache Spark 3.0, improved big data analytics benchmark results (TPCx-BB) and a new release of the CUDA-X SDKs.

The network data processing unit (DPU) is a new class of accelerator designed to take over functions from the server in the areas of in-flight data processing, security, storage interface, and virtualisation with the dual benefit of freeing up server resources for other tasks, and keeping the server isolated from malicious attacks.

An example of the success of the ecosystem has been the way NVIDIA’s platform has underpinned a variety of contributions to the fight against COVID-19 including data analytics, scientific simulation and visualisation, artificial intelligence, edge computing and robotics.

2 pm - 3pm

Tutorial: Parallelware Analyzer: Data race detection for GPUs using OpenMP and OpenACC 

Dr. Manuel Arenaz, CEO, Appentra Solutions


The development and maintenance of parallel software is far more complex than that of sequential software. Bugs related to parallelism are difficult to find and fix because a buggy parallel code might run correctly 99% of the time and fail only the remaining 1%. This is also true for Graphics Processing Units (GPUs). In order to take advantage of the performance promised by GPUs, developers must write bug-free parallel code in the C/C++ and Fortran programming languages. This talk presents an innovative approach to parallel programming based on two pillars: first, an open catalog of rules and recommendations that capture parallel programming best practices; and second, the automation of quality assurance for parallel programming through new static code analysis tools, specializing in parallelism, that integrate seamlessly into professional software development tools. We also present new Parallelware Analyzer capabilities for data race detection for GPUs using OpenMP and OpenACC.
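The "correct 99% of the time" behaviour of a data race is easy to reproduce in a few lines. The sketch below is my own plain-Python illustration (not from the talk, and not OpenMP/OpenACC code): two threads increment a shared counter, and without a lock the read-modify-write can lose updates, yet the program usually still prints the right answer.

```python
# Two threads increment a shared counter. Without a lock, `+= 1` is a
# non-atomic read-modify-write, so updates can occasionally be lost --
# the intermittent kind of bug that data race detectors are built to catch.
import threading

N = 100_000

def run(lock=None):
    counter = {"value": 0}

    def work():
        for _ in range(N):
            if lock is not None:
                with lock:
                    counter["value"] += 1
            else:
                counter["value"] += 1  # racy: not atomic

    threads = [threading.Thread(target=work) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]

print(run(threading.Lock()))  # always 200000: the lock serializes the updates
print(run())                  # often 200000, sometimes less -- a latent race
```

The locked version is deterministic; the unlocked one may pass any given test run, which is precisely why static analysis that flags the race without having to trigger it is valuable.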

Tuesday, 1st Dec - 2020

9am - 5pm

HPC Hackathon -- Day 2

2pm - 3pm

Tutorial: Overview of NVIDIA Nsight Compute

Felix Schmitt, Senior System Software Engineer @ NVIDIA



Wednesday, 2nd Dec - 2020

9am - 12pm

Tutorial: Introduction to deep learning image classification using Keras (Part 1)

Glendon Holst, Visualization Scientist, Visualization Core Lab


ImageNet, an image recognition benchmark dataset, helped trigger the modern AI explosion. In 2012, the AlexNet architecture (a deep convolutional neural network) rocked the ImageNet benchmark competition, handily beating the next best entrant. By 2014, all the leading competitors were deep learning based. Since then, accuracy scores have continued to improve, eventually surpassing human performance.

In this hands-on tutorial we will build on this pioneering work to create our own neural-network architecture for image recognition. Participants will use the elegant Keras deep learning programming interface to build and train TensorFlow models for image classification tasks on the CIFAR-10 / MNIST datasets. We will demonstrate the use of transfer learning (to give our networks a head start by building on top of existing, ImageNet pre-trained network layers) and explore how to improve model performance for standard deep learning pipelines. We will use cloud-based interactive Jupyter notebooks to work through our explorations step by step. Once participants have successfully trained their custom model, we will show them how to submit their model's predictions to Kaggle for scoring.

Participants are expected to have access to laptops/workstations and to sign up for free online cloud services (e.g., Google Colab, Kaggle). They may also need to download free, open-source software prior to the workshop.

12:30pm - 1:30pm

Keynote: GPU-Accelerated Applications: The Why and The How? 

Dr. Hatem Ltaief, Senior Research Scientist, Extreme Computing Research Center @ KAUST


Ever wondered how to squeeze performance out of these GPU accelerator beasts? Come and learn the key algorithmic ingredients developed in the Extreme Computing Research Center (ECRC) at KAUST to accelerate a broad range of scientific applications on GPUs, including computational astronomy, computational chemistry, and climate/weather prediction applications. This talk presents a five-year overview of accelerated computing at the ECRC and explains how to mitigate some of these applications' performance bottlenecks (e.g., via synchronization-reducing and communication-reducing algorithms) using manycore systems equipped with GPU hardware accelerators.

3:30pm - 4:30pm 

HPC Hackathon: Team presentations


HPC Hackathon: Prize ceremony & closing

Thursday, 3rd Dec - 2020

8:30am - 11:30am

Tutorial: Introduction to deep learning image classification using Keras (Part 2)

Hands-on session

Glendon Holst, Visualization Scientist, Visualization Core Lab

12pm - 1pm

Keynote: AI in Healthcare and Life Sciences

Craig Rhodes, EMEA Industry Lead for Healthcare and Life Sciences @ NVIDIA

12:30pm - 4:30pm

ML competition: Predicting house prices

Thursday, 10th Dec - 2020

10am - 11:30am

Keynote: AI @ KAUST

Prof Bernard Ghanem, Associate Professor, Electrical Engineering @ KAUST

2pm - 3:30pm

AI competition: Prize ceremony & closing