Friday 15:00–16:45

Python for distributed systems

Guillem Borrell

Audience level:
Intermediate

Description

From big data to supercomputing, most modern high-performance tools are concurrent and parallel. This tutorial introduces some of the tools that are available in the Python ecosystem to develop, deploy and maintain modern and efficient distributed applications.

Abstract

This workshop will not cover trendy applications or bundled frameworks like Hadoop or Spark. It won't build recipes that you can reuse for any particular purpose. The goal is to build a general understanding of how to program distributed applications.

The workshop will walk through the following topics.

  1. Distributed hardware. A short introduction to clouds and supercomputers.
  2. Distributed software. Large distributed applications usually exploit task-based parallelism. Messaging is the way to make those tasks talk to each other. There are many different messaging strategies, protocols, transports, layers... Each one is suitable for a different case.
  3. Parallel algorithms. A short introduction to some algorithms that incorporate messaging.
  4. Threading and concurrency. A task that communicates and computes appears to be doing two things at the same time, but it is not, since Python has a GIL...
  5. Management, service discovery, logging and availability. Managing tens, hundreds or thousands of tasks can be tricky, but Python has tools that may simplify the management of parallel applications.
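
As a small illustration of topic 4, the sketch below shows a task that both communicates and computes, using only the standard library. It is not taken from the workshop material: the function names (`receive_messages`, `compute`, `communicate_and_compute`) are hypothetical, and `time.sleep` is an assumed stand-in for a blocking network read. In standard CPython builds only one thread runs Python bytecode at a time, but a blocking wait releases the GIL, so the computation can proceed while the receiver thread is waiting.

```python
import queue
import threading
import time


def receive_messages(inbox, n_messages, delay=0.01):
    """Hypothetical communicating task: waits for data, then queues it."""
    for i in range(n_messages):
        time.sleep(delay)  # stand-in for a blocking network read; releases the GIL
        inbox.put(f"message {i}")
    inbox.put(None)  # sentinel: no more messages will arrive


def compute(n):
    """Hypothetical CPU-bound task: pure Python, so it needs the GIL to run."""
    return sum(i * i for i in range(n))


def communicate_and_compute(n_messages=3, n=100_000):
    """Run the receiver in a thread while the main thread computes."""
    inbox = queue.Queue()
    receiver = threading.Thread(target=receive_messages, args=(inbox, n_messages))
    receiver.start()

    result = compute(n)  # proceeds while the receiver thread is blocked

    # Drain the queue until the sentinel is seen.
    messages = list(iter(inbox.get, None))
    receiver.join()
    return result, messages


if __name__ == "__main__":
    total, received = communicate_and_compute()
    print(total, received)
```

If both tasks were CPU-bound, the threads would take turns holding the GIL rather than run in parallel; separate processes exchanging messages would be one way around that.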
