Automating the Exploration of Databases for Data Science with AEDA

Diego Arenas

Prior knowledge:
No previous knowledge expected

Summary

AEDA is a new Python library that automatically explores the content of databases. It automates metadata extraction from different data sources, creates a data catalogue, and runs a data quality report on the data source.

Description

This lightning talk presents a new Python library for exploring and extracting metadata from databases. Using the library can save many hours of querying a source database to explore its content. Only read access to the target database is needed to extract and analyse its content, without the need to query large amounts of data. It runs from the terminal with a one-line command. The extracted metadata is stored locally or in a database of your choice; if needed, the library creates Docker containers to store the metadata. The library includes a Streamlit dashboard for exploring the metadata database, but the user can also query it with any data visualisation tool or SQL client.
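
To give a feel for what read-only metadata extraction can look like, here is a minimal sketch using only Python's standard sqlite3 module. It is not AEDA's API or implementation: the function name extract_metadata, the demo table, and the choice of statistics (row count, null count, distinct count) are assumptions made for illustration. The idea is that the database computes small aggregates per column, so no large result sets are pulled to the client.

```python
import sqlite3


def extract_metadata(conn):
    """Hypothetical helper: return one catalogue entry per column.

    Uses only read queries: the SQLite schema table, PRAGMA table_info,
    and aggregate counts computed inside the database.
    """
    catalogue = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # Each row is (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        for _cid, column, declared_type, *_rest in columns:
            # COUNT(col) and COUNT(DISTINCT col) both ignore NULLs.
            rows, non_null, distinct = conn.execute(
                f'SELECT COUNT(*), COUNT("{column}"), '
                f'COUNT(DISTINCT "{column}") FROM "{table}"'
            ).fetchone()
            catalogue.append(
                {
                    "table": table,
                    "column": column,
                    "type": declared_type,
                    "rows": rows,
                    "nulls": rows - non_null,
                    "distinct": distinct,
                }
            )
    return catalogue


# Demo on a throwaway in-memory database (made-up data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, country TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "CL"), (2, "CL"), (3, None)],
)
catalogue = extract_metadata(conn)
for entry in catalogue:
    print(entry)
```

A catalogue like this could then be written to a local file or a separate metadata database and explored with a dashboard or any SQL client, which is the workflow the talk describes.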