Tuesday 04 Jun 17:45

Distributed Deep Learning, RackenFlow, AMD/ROCm

HopsML Stockholm


LEVEL: Advanced

At our fourth meetup, we are delighted to present more world firsts: the first asynchronous hyperparameter optimization framework for TensorFlow based on Spark, and the first platform (Hopsworks) to support AMD's ROCm platform (an alternative to Nvidia's CUDA) with TensorFlow. Christer Enfors will talk about RackenFlow, the data framework he uses.


17:45 Doors Open

18:10 - 18:35: Maggy: Open-source asynchronous hyperparameter optimization using TensorFlow and Apache Spark

18:35 - 19:20: Break with food and drink

19:20 - 19:40: The RackenFlow Journey

19:40 - 20:00: AMD ROCm on TensorFlow/Spark/Hopsworks

Maggy: Open-source asynchronous hyperparameter optimization using TensorFlow and Apache Spark

For the past two years, the open-source Hopsworks platform has used Spark to distribute hyperparameter optimization tasks for machine learning. Hopsworks provides some basic optimizers (grid search, random search, differential evolution) that propose combinations of hyperparameters (trials), which are run synchronously in parallel on executors as map functions. However, many such trials perform poorly, and a lot of CPU and hardware accelerator cycles are wasted on trials that could be stopped early, freeing up the resources for other trials. In this talk, we present our work on Maggy, an open-source asynchronous hyperparameter optimization framework built on Spark. Maggy transparently schedules and manages hyperparameter trials, which increases resource utilization and massively increases the number of trials that can be performed in a given period of time on a fixed amount of resources.
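To make the early-stopping idea concrete, here is a minimal single-process sketch in plain Python. It is not Maggy's API and does not use Spark: the function names, the toy score function, and the choice of a median stopping rule are all assumptions made for illustration. It runs trials one after another, so it shows only why stopping poor trials saves resources, not the asynchronous scheduling itself.

```python
import random
import statistics


def run_trial(learning_rate, steps, history, rng):
    """Simulate one trial (hypothetical toy model, not a real training run).

    The trial is stopped early when its score at a step falls below the
    median score that earlier trials reached at the same step.
    Returns (scores_so_far, stopped_early).
    """
    scores = []
    for step in range(steps):
        # Toy metric: grows with training steps and peaks near learning_rate = 0.1.
        quality = 1.0 - abs(learning_rate - 0.1)
        score = (step + 1) / steps * quality + rng.uniform(-0.01, 0.01)
        scores.append(score)

        # Scores of earlier trials that reached this step.
        peers = [h[step] for h in history if len(h) > step]
        if len(peers) >= 3 and score < statistics.median(peers):
            return scores, True  # free the resources for another trial
    return scores, False


def random_search(num_trials=20, steps=10, seed=0):
    """Random search over the learning rate with median-based early stopping."""
    rng = random.Random(seed)
    history = []
    steps_used = 0
    best = (float("-inf"), None)  # (final score, learning rate)
    for _ in range(num_trials):
        lr = rng.uniform(0.001, 1.0)
        scores, stopped = run_trial(lr, steps, history, rng)
        history.append(scores)
        steps_used += len(scores)
        if not stopped and scores[-1] > best[0]:
            best = (scores[-1], lr)
    return best, steps_used, num_trials * steps


if __name__ == "__main__":
    best, used, budget = random_search()
    print(f"best learning rate: {best[1]:.3f}")
    print(f"training steps used: {used} of {budget}")
```

Any steps that stopped trials did not run are the resources that an asynchronous scheduler could hand to new trials right away, instead of waiting for a whole synchronous batch to finish.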

Finally, we will perform a live demo on a Jupyter notebook, showing how to integrate Maggy in existing PySpark applications.

Moritz Meister is a Systems Research Intern at Logical Clocks, the developers and creators of Hopsworks. Moritz has a background in Econometrics and Operations Research and is about to finish MSc degrees in Computer Science from Politecnico di Milano and Universidad Politecnica de Madrid. He has previously worked as a Data Scientist for Deutsche Telekom and Deutsche Lufthansa in Germany, helping them to productionize machine learning models to improve customer relationship management.

The RackenFlow Journey

To support projects and customers, CGI has developed the RackenFlow platform for scalable data analytics and data science, based on open-source tools and frameworks. Recently, we've switched to using Hopsworks as the base platform for RackenFlow.

In this talk, I'll share:
- our thoughts on a large IT consultancy firm's position in the data space
- the challenges, journey and lessons learned in developing a data platform for AI
- customer experiences

Jonas Forsman is the lead data scientist within the RackenFlow initiative. He has worked at CGI since 2001, focusing on Applied Research in the fields of AI, Machine Learning and Innovation.

AMD/ROCm on TensorFlow/Spark/Hopsworks

ROCm, the Radeon Open Ecosystem, is an open-source software foundation for GPU computing on Linux. ROCm supports TensorFlow and PyTorch using MIOpen, AMD's counterpart to Nvidia's cuDNN library. In this talk, we describe how we enabled ROCm on the Hopsworks platform and how TensorFlow applications can be run, without changing a single line of code, on either AMD or Nvidia GPUs.

Jim Dowling

Robin Andersson, Software Engineer 

