Sunday 15:15–16:00 in Kursraum 1

Meaningful histogramming with Physt

Jan Pipek

Audience level:
Intermediate

Description

The histogram is a simple yet powerful statistical tool (disclaimer: it has its weaknesses too). Standard Python scientific libraries offer methods for calculating and visualizing histograms, but there is much more (fun as well as boring) that can be done with them. The physt library focuses on the fun parts.

Abstract

NumPy's algorithms for calculating bins and their contents are very efficient, matplotlib produces nice histogram plots, and several plotting libraries combine these with useful interactive exploratory features.

But what if you suddenly decide to add new values to an existing histogram (and did not specify a proper value range from the start)? What if you want to find human-friendly bin edges automatically (ever wondered why we should count people who are from 168.47854 to 173.45667 cm tall)? What if you want to project or slice your multidimensional histograms? What if you want cylindrical or spherical histograms? What if you want to add the values of two histograms? What if you want to persist bins and metadata alongside the calculated values?
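To make the question about human-friendly bin edges concrete, here is a minimal sketch in plain Python. It is not physt's implementation; the function name `friendly_bin_edges`, its parameters, and the choice of a 1/2/5 grid of bin widths are assumptions made purely for illustration.

```python
import math


def friendly_bin_edges(low, high, target_bins=10):
    """Return round bin edges covering [low, high].

    Hypothetical helper, not part of physt: the bin width is restricted
    to 1, 2 or 5 times a power of ten.
    """
    if not high > low:
        raise ValueError("high must be greater than low")
    raw_width = (high - low) / target_bins
    magnitude = 10 ** math.floor(math.log10(raw_width))
    # Pick the smallest round width that is not below the raw width.
    for factor in (1, 2, 5, 10):
        width = factor * magnitude
        if width >= raw_width:
            break
    # Extend the range outwards to the nearest multiples of the width.
    first = math.floor(low / width)
    last = math.ceil(high / width)
    return [i * width for i in range(first, last + 1)]


# Heights between 151.3 cm and 198.7 cm, aiming for roughly ten bins.
edges = friendly_bin_edges(151.3, 198.7, target_bins=10)
print(edges)  # edges at 150, 155, ..., 200
```

The number of bins may differ slightly from `target_bins`, because the edges are snapped to round values rather than to the exact data range.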

The physt library treats histograms as proper objects and combines the computing power of NumPy with the visualization possibilities of matplotlib (and optionally other backends), adding a layer of semantics and more advanced functionality.
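The idea of a histogram as an object that carries its bins, values and metadata together can be sketched in a few lines of plain Python. The class below is a hypothetical illustration only: `SimpleHistogram`, its fields and its methods are assumptions, not physt's actual classes or API.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class SimpleHistogram:
    """Hypothetical 1D histogram object (not physt's class)."""

    edges: list
    counts: list = None
    meta: dict = field(default_factory=dict)

    def __post_init__(self):
        if self.counts is None:
            self.counts = [0] * (len(self.edges) - 1)

    def fill(self, value):
        """Add one value; values outside the edges are ignored in this sketch."""
        if not self.edges[0] <= value < self.edges[-1]:
            return
        self.counts[bisect_right(self.edges, value) - 1] += 1

    def __add__(self, other):
        """Sum two histograms, which only makes sense for identical bins."""
        if self.edges != other.edges:
            raise ValueError("histograms must share bin edges")
        summed = [a + b for a, b in zip(self.counts, other.counts)]
        return SimpleHistogram(list(self.edges), summed, dict(self.meta))


a = SimpleHistogram([150, 160, 170, 180], meta={"axis": "height [cm]"})
b = SimpleHistogram([150, 160, 170, 180])
for h in (155.0, 162.5, 171.0):
    a.fill(h)
for h in (165.0, 179.9, 185.0):  # 185.0 lies outside the bins and is dropped
    b.fill(h)

total = a + b
print(total.counts, total.meta)
```

Keeping the edges inside the object is what lets the addition check that both operands describe the same bins, something that two bare arrays of counts cannot guarantee.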

In this talk, I will describe the object model behind the library and show a live demo of what it can help you accomplish.
