Wednesday 28 Nov 08:00

Workshop: Parallel Programming and Deep Learning

1 or 2 Day Course


LEVEL: Intermediate

2 Days of deep-dive hands-on training sessions with Intel experts – bring your own laptop*!

Following a series of technical workshops on Parallel Programming and Deep Learning with Intel® Software, we are now going to the next level with the first hands-on workshop where you’ll have the opportunity to try all the software and programming techniques by yourself on your own laptop, guided by the experts.

Please bring your own Intel®-based laptop*; we’ll provide all required software and technology. Detailed technical requirements will be sent to registered attendees.

REGISTER here. There are seats left.

Date: 28 & 29 November 2018

Day 1: Code Modernization for Intel Architecture

To improve application performance, one needs to re-architect and/or tune existing code to expose enough vectorization and parallelism. In this workshop we will dive into the code modernization framework, a systematic approach to follow in order to achieve the highest performance possible. With the help of examples, use cases and better use of the Intel® C/C++ compiler, we point out possible inefficiencies in both sequential and vectorized code, and we explain remedies, hints and strategies that help an application deliver great performance on today’s scalable hardware and on upcoming generations.


08:00-09:00      Registration with light breakfast

09:00-10:00      Introduction to Intel tools
This session will introduce the Intel tools and the different suites available for writing code for single-node or multi-node computers and for analyzing its performance.

10:00-10:30      Log in to GCP

10:30-11:00      Break

11:00-12:00      Compiler-Based Optimization - nbody - part 1
This session walks attendees through a range of compiler-based optimizations. Step by step, attendees will learn how to modify the code and the compiler arguments to achieve better performance.

12:00-13:00      Lunch

13:00-15:00      Compiler-Based Optimization - nbody - part 2
This session continues the walkthrough of compiler-based optimizations. Step by step, attendees will learn how to modify the code and the compiler arguments to achieve better performance.

15:00-15:30      Break

15:30-16:30      OpenMP for Threading and Vectorization
Modern architectures offer many ways to greatly speed up your applications, and parallelism is one of them. Within a single node, parallelization can be achieved through threading and vectorization. This presentation will explain how to create threaded and vectorized workloads with OpenMP.

16:30-17:00      Memory Optimization (Cache Blocking) - iso3dfd
Moving data efficiently is critical in HPC. Many HPC applications tend to be memory bound, so it is always good practice to verify that memory accesses are hardware friendly and that as much data as possible is reused from the cache.

17:00-17:30      Intel MKL
Intel MKL (Math Kernel Library) provides a large set of mathematical functions implemented and tuned by Intel engineers for high performance. This session will explain how to compile and link against MKL and will show some of the speedups that can be obtained by using the library.

17:30                Q&A Open discussion

17:30-19:00      Get-together Networking evening with drinks & food

Day 2: Enhance Performance with Intel Tools and Python

During the morning session we will show how performance analysis tools such as Intel® Advisor and Intel® VTune™ Amplifier can be used efficiently to investigate performance issues and get the best performance out of the latest Intel® Xeon® Scalable processors. During the afternoon we will show how to use the Intel® Distribution for Python, with insights into the most commonly used machine learning algorithms and into how libraries such as Scikit-Learn have been optimized for Intel® hardware.


08:00-09:00       Registration with light breakfast

09:00-10:00       Intel Advisor – nbody
Intel Advisor is a powerful tool for tracking down and solving vectorization problems. This presentation will introduce Intel Advisor, in particular the survey and trip count analyses. We will explain how to read Advisor's output to improve vectorization.

10:00-10:30       Roofline
A Roofline chart is a visual representation of application performance in relation to hardware limitations, including memory bandwidth and computational peaks.
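
As a rough illustration of the idea behind a Roofline chart, the sketch below computes the performance ceiling for a kernel from its arithmetic intensity (floating-point operations per byte moved). The machine numbers and the function names are hypothetical and chosen only for illustration; they are not measurements of any Intel processor, and this is not how Intel Advisor builds its chart.

```python
def attainable_gflops(arithmetic_intensity, peak_gflops, bandwidth_gb_s):
    """Roofline ceiling: the lower of the compute peak and the
    memory-bandwidth limit (bandwidth * FLOPs per byte)."""
    return min(peak_gflops, bandwidth_gb_s * arithmetic_intensity)


def ridge_point(peak_gflops, bandwidth_gb_s):
    """Arithmetic intensity at which a kernel stops being memory bound."""
    return peak_gflops / bandwidth_gb_s


# Hypothetical machine, for illustration only.
PEAK_GFLOPS = 1000.0     # assumed compute peak
BANDWIDTH_GB_S = 100.0   # assumed memory bandwidth

ridge = ridge_point(PEAK_GFLOPS, BANDWIDTH_GB_S)
for intensity in (0.25, 1.0, 10.0, 40.0):
    ceiling = attainable_gflops(intensity, PEAK_GFLOPS, BANDWIDTH_GB_S)
    regime = "memory bound" if intensity < ridge else "compute bound"
    print(f"{intensity:6.2f} FLOP/byte -> at most {ceiling:7.1f} GFLOP/s ({regime})")
```

Kernels to the left of the ridge point are limited by memory bandwidth, so reducing data movement helps most there; kernels to the right are limited by the compute peak.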

10:30-11:00       Break

11:00-12:00       Intel VTune optimization - iso3dfd
This session takes attendees from an unoptimized version of a wave propagation kernel to a much more optimized one. Using this real-world example, we will see how to detect bottlenecks and how to remove them.

12:00-13:00       Lunch

13:00-15:00       K-means clustering: From Scikit-Learn to DAAL to Cython
Lloyd's algorithm is the standard algorithm for K-means clustering, an unsupervised machine learning technique. We'll use it to reduce the number of colors used in an image. First, we'll use Scikit-Learn to perform this task and try to understand why the implementation is so slow. Then we'll move to Intel DAAL (Data Analytics Acceleration Library) to get better performance. Finally, we'll write our own implementation using Cython and improve its performance.
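
For readers who have not seen Lloyd's algorithm before, here is a minimal pure-Python sketch of it applied to color reduction. It is an illustration only, not the Scikit-Learn, DAAL or Cython code used in the lab; the function names and the six sample pixels are made up for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two color tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def lloyd_kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:  # converged: nothing moved
            break
        centroids = new_centroids
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


def quantize(pixels, k):
    """Replace every pixel by the rounded color of its cluster centroid."""
    centroids, labels = lloyd_kmeans(pixels, k)
    palette = [tuple(round(c) for c in centroid) for centroid in centroids]
    return [palette[label] for label in labels]


# Six hypothetical RGB pixels: three reddish, three bluish.
pixels = [(250, 10, 10), (245, 5, 20), (10, 10, 240),
          (20, 5, 250), (255, 0, 0), (0, 0, 255)]
reduced = quantize(pixels, k=2)
print(sorted(set(reduced)))  # the two-color palette that remains
```

Each iteration computes the distance from every pixel to every centroid, which is why naive implementations become slow on real images and why optimized libraries matter here.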

15:00-15:30       Break

15:30-16:30       Composed Multithreading: TBB and Python
In this lab, we'll show how Intel has incorporated TBB into the Python ecosystem. We'll first study an example on collaborative filtering, a technique used by recommender systems. We'll show how oversubscription can reduce the performance of the original program and how this problem has been fixed by Intel TBB. We'll then show how Dask can be used on top of TBB for composed parallelism.

16:30-17:00       Q&A Open discussion    

During the workshop, we will provide attendees with access to Skylake processors and Intel® tools using VM instances on Google Cloud Platform. Attendees should be comfortable with C/C++ or Fortran and with basic Linux commands such as make and ssh. No previous experience with vectorization, parallelization or profiling tools is required.

Date & Time:

Wednesday, November 28th 08.00 – 19.00 

Thursday, November 29th 08.00 – 17.00

You can register for both days or one of the days. 

REGISTER here. There are seats left.

* Other names and brands may be claimed as the property of others.
** Agenda is subject to change
Preparation, configuration, and usage of your laptop computer during the workshop is at your own risk. Please note that Intel cannot take on any responsibility for your hardware.
