Gundam is a new, embarrassingly parallel code to calculate spatial two-point correlation functions (2pcf) of hundreds of millions of galaxies observed by modern surveys, employing several statistical estimators and error estimates. Gundam is designed to be very fast, flexible, and extensible, combining the user friendliness of Python with a fast skip list/linked list algorithm implemented in Fortran.
I will present Gundam, a new code to calculate spatial two-point correlation functions (2pcf) in large astronomical surveys. The code can efficiently estimate 3D, projected, and angular auto- and cross-correlation functions with a variety of statistical estimators and bootstrap errors. Gundam is designed to be very fast, flexible, extensible, and user friendly. It serves occasional users who just want the 2pcf from a given list of source coordinates and redshifts (e.g. a junior student), as well as more advanced users who need custom weighting schemes, fiber-collision corrections, modeling of selection effects, the full covariance matrix, etc.
Gundam is a fine example of Python as a glue language: it takes advantage of the inherent speed of Fortran for the raw calculations, and integrates optimized, well-tested packages already available in the astronomy community to perform certain tasks. There is no need to reinvent the wheel.
Gundam delegates raw pair counting to native Fortran routines compiled via f2py, which implement a fast skip list/linked list algorithm that avoids unnecessary distance calculations. A custom sorting algorithm rearranges the data so that structures close in real space remain close in memory, greatly increasing CPU cache friendliness and speed. As a bonus, this implementation is embarrassingly parallel, so the code automatically employs all cores of a desktop CPU using OpenMP, or of a large computer cluster using ipyparallel. Current performance is up to 2x better than that of the fastest publicly available state-of-the-art code.
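
To illustrate the general idea behind linked-list pair counting, here is a minimal pure-Python sketch. It is not Gundam's Fortran implementation: the function name `count_pairs_linked_list`, the Cartesian input, the single separation cut `rmax`, and the use of `np.lexsort` as a stand-in for the custom sorting step are all assumptions made for illustration. Points are binned into cells of side `rmax`, so each point only needs to be compared with points in its own and adjacent cells.

```python
import itertools
import numpy as np

def count_pairs_linked_list(points, rmax):
    """Count unique pairs closer than rmax using a cell linked list.

    Hypothetical helper for illustration only; not part of Gundam's API.
    """
    points = np.asarray(points, dtype=float)
    n, dim = points.shape

    # Assign every point to a cubic cell of side rmax. Any pair closer
    # than rmax must then lie in the same cell or in adjacent cells.
    cell_idx = np.floor((points - points.min(axis=0)) / rmax).astype(int)

    # Stand-in for a locality-preserving sort: order points by cell so
    # that spatial neighbours also sit close together in memory.
    order = np.lexsort(cell_idx.T[::-1])
    points, cell_idx = points[order], cell_idx[order]

    # Build the linked list: head[cell] is the most recently inserted
    # point of that cell, nxt[i] is the next point in the same cell.
    head = {}
    nxt = np.full(n, -1, dtype=int)
    for i in range(n):
        key = tuple(cell_idx[i])
        nxt[i] = head.get(key, -1)
        head[key] = i

    # Visit only the 3**dim neighbouring cells of each point.
    offsets = list(itertools.product((-1, 0, 1), repeat=dim))
    r2 = rmax * rmax
    count = 0
    for i in range(n):
        for off in offsets:
            j = head.get(tuple(cell_idx[i] + off), -1)
            while j != -1:
                if j > i:  # count each pair once
                    d = points[j] - points[i]
                    if d @ d < r2:
                        count += 1
                j = nxt[j]
    return count
```

Calling `count_pairs_linked_list(xyz, 0.1)` on an `(N, 3)` array returns the same count as a brute-force loop over all pairs, but for roughly uniform data the work grows with N times the typical cell occupancy rather than with N squared. Because each point's neighbour search is independent of the others, the outer loop is the natural place to split work across cores.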