Wednesday 2:45 p.m.–3:25 p.m.

A next-generation live tick database accessible through Python, backed by SciDB

Jonathan Rivers

Audience level:
Novice

Description

Trading firms run on market tick data, usually stored in files or legacy timeseries DBs. File-based methods are hard to maintain, and timeseries DBs require archaic programming. Together, SciDB and Python offer live access to tick data in an ACID DB, giving traders and quants self-service access to their data.

Abstract

A live tick database has to ingest massive amounts of market data, keeping up with market feeds while simultaneously servicing queries. We will present how a trading firm built a live tick database backed by SciDB that runs on commodity hardware and can ingest 1 TB per trading day while servicing interactive queries from users, all controlled from a Python interpreter.
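To give a rough sense of what 1 TB per trading day means as a sustained write rate, the arithmetic can be sketched as follows. The 6.5-hour session length and the decimal definition of a terabyte are assumptions made for illustration; the talk does not state either, and real feeds are bursty, so peak rates would be higher than this average.

```python
# Back-of-the-envelope average ingest rate for 1 TB per trading day.
# ASSUMPTIONS (not from the talk): a 6.5-hour trading session and
# a decimal terabyte (10**12 bytes).

TB_BYTES = 10**12
SESSION_SECONDS = 6.5 * 3600  # assumed session length in seconds


def required_rate_mb_per_s(bytes_per_day, session_seconds):
    """Return the average ingest rate in MB/s (1 MB = 10**6 bytes)."""
    return bytes_per_day / session_seconds / 10**6


rate = required_rate_mb_per_s(TB_BYTES, SESSION_SECONDS)
print(f"average ingest rate: {rate:.1f} MB/s")
```

Under these assumptions the average comes to roughly 43 MB/s, which the database has to sustain while also answering queries.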

In contrast to legacy timeseries databases that require specialized programming talent, SciDB can be controlled through Python, allowing users to self-serve their data needs and drive better analytics.

SciDB’s distributed array storage model stores and retrieves financial data efficiently and facilitates ingesting data at market rates. New ticks need to be appended rapidly, and the data needs to be read back efficiently to serve multiple users. The array storage model is also critical to efficient analytics. For example, computing the moving average of a symbol starts from prices that are already in a sequential layout, because SciDB stores data in order.
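The moving-average example can be illustrated with a minimal sketch in plain Python. This is not SciDB or SciDB-Py code; `moving_average` and the sample prices are hypothetical. It only shows why ordered input helps: when prices arrive already in time order, the average can be maintained in a single sequential pass with a running sum, with no sorting or random access.

```python
from collections import deque


def moving_average(prices, window):
    """Yield the simple moving average over a sliding window.

    Assumes `prices` is already in time order, which is the property
    the ordered storage layout is meant to provide.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    buf = deque()
    total = 0.0
    for price in prices:
        buf.append(price)
        total += price
        if len(buf) > window:
            # Drop the oldest price so the sum covers exactly `window` items.
            total -= buf.popleft()
        if len(buf) == window:
            yield total / window


# Hypothetical prices for one symbol, in time order.
prices = [10.0, 11.0, 12.0, 13.0, 14.0]
averages = list(moving_average(prices, 3))
print(averages)
```

Each price is touched once on the way in and once on the way out of the window, so the cost is linear in the number of ticks read.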

SciDB provides the infrastructure to warehouse tick data, and SciDB-Py, the Python interface to the database, serves financial data users in an expressive language they are familiar with while harnessing the distributed compute power provided by SciDB.
