GeoPandas is the standard analytical tool for tabular geospatial data in Python. It is well loved but known to be slow. This talk describes recent work to accelerate GeoPandas with Cython and Dask, making it one of the fastest and most scalable geospatial libraries in existence.
Geospatial data is used in city planning, real estate, agriculture, and any other field in which location has an impact. GeoPandas is the standard analytical tool for tabular geospatial data in Python. It has a well-loved API and integrates cleanly with the rest of the geospatial ecosystem, but it can be very slow.
This talk describes two recent modifications to GeoPandas that both improve its performance and enable it to handle very large datasets.
The talk begins with a brief overview of geospatial data and the GeoPandas project, using examples from open datasets. It then describes the use of Cython and Dask to accelerate and scale the project so that it handles larger datasets more quickly. We end with benchmark results and plans for the future.