Industry: Business & Industry Applications

Language: Python

Features: High Performance Computing, Big Data, Data Mining

PyTables is a Python package for efficiently storing and querying large tabular datasets. It is built on top of the HDF5 library and the NumPy and numexpr packages, which provide the foundations for very compact storage and high-performance data management. PyTables also includes OPSI, an indexing engine designed to work with datasets that exceed RAM capacity while keeping query times competitive with those of relational database engines. Finally, PyTables ships with the high-speed Blosc compressor, which typically makes the performance overhead of compression negligible (and often even beneficial) when dealing with large datasets, even in-memory ones.
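
The pieces described above (HDF5 storage, Blosc compression, column indexing, and numexpr-style queries) can be sketched together in a few lines. This is a minimal illustration, not a recommended configuration: the file path, the table name `readings`, the column names `sensor_id` and `value`, and the compression level are all assumptions made for the example.

```python
import os
import tempfile

import numpy as np
import tables as tb

# Hypothetical two-column schema, chosen only for illustration.
READING_DTYPE = np.dtype([("sensor_id", np.int32), ("value", np.float64)])


def build_and_query(path, n_rows=10_000):
    """Write a Blosc-compressed table, index one column, and query it."""
    # Ask HDF5 to compress each chunk with Blosc (level 5 is an arbitrary choice).
    filters = tb.Filters(complevel=5, complib="blosc")

    # Synthetic data: sensor ids cycle 0..99, values rise evenly from 0 to 1.
    data = np.empty(n_rows, dtype=READING_DTYPE)
    data["sensor_id"] = np.arange(n_rows) % 100
    data["value"] = np.linspace(0.0, 1.0, n_rows)

    with tb.open_file(path, mode="w", filters=filters) as h5:
        table = h5.create_table(
            "/", "readings", description=READING_DTYPE, expectedrows=n_rows
        )
        table.append(data)
        table.flush()

        # Build an index on the column used in the range condition.
        table.cols.value.create_index()

        # The condition string is evaluated in-kernel rather than row by row
        # in Python; read_where returns the matches as a structured array.
        hits = table.read_where("(value > 0.5) & (sensor_id == 7)")
    return hits


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        result = build_and_query(os.path.join(tmp, "readings.h5"))
        print(len(result), "matching rows")
```

On a table this small neither the index nor the compression will make a measurable difference; they are included only to show where each feature plugs in.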

PyTables has been used in a variety of academic and industry contexts, including Caltech, the NASA Jet Propulsion Laboratory at Caltech, Universitat Politècnica de València, the University of Southampton School of Engineering Sciences, Max Planck Gesellschaft, SLAC, ACUSIM Software, NOAA, General Dynamics, Germanischer Lloyd, SarVision, Cellzome, and TeraView.

 
