Wednesday 9:00 AM–10:30 AM in Room 3

Simplifying large scale parallel processing with Storm and streamparse

Dan Blanchard

Audience level:
Intermediate

Description

Streamparse is a popular Python library for writing bolts/spouts (i.e., workers/producers) for use with Apache Storm. If you've ever been bitten by the GIL when trying to process data at scale, you will enjoy seeing how the Storm/streamparse combination can be used to sidestep the issue entirely.

This talk will cover the basics of Apache Storm and streamparse and why they're useful, discuss the work we're doing to unify the core components of the two competing Python Storm libraries (pyleus and streamparse), and show some of the command-line utilities that streamparse provides to simplify managing Storm topologies.
