Monday 11:30 AM–12:10 PM in Central Park East 6501a (6th fl)

Git Risky: Using git metadata to predict code bug risk

J. Henry Hinnefeld

Audience level:
Intermediate

Description

Git is a powerful tool for code versioning. If you follow its best practices and have good ‘commit hygiene’, it can also be a source of valuable data about your coding practices. In this talk I’ll describe a system we built at Civis that uses the metadata git collects, along with its logging and ‘blaming’ functionality, to score commits in real time on their likelihood of introducing a bug.

Abstract

git log and git blame are great for tracking down and understanding which commit introduced a bug into some code, but you have to notice the bug in the first place. What if it were possible to proactively identify buggy commits? That’s the goal of the system described here.

The system is based on metadata accessible via git log and git blame, so I’ll begin with a brief overview of these tools. Then I’ll spend the majority of the talk going over the steps we used to build a commit-level model of bug risk.

Finally, the commit risk model relies heavily on having good commit hygiene in the repository, so I’ll conclude with some things you can do to improve commit hygiene in your own repositories.
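
As a rough illustration of the kind of per-commit metadata that git log exposes, here is a minimal Python sketch. It is not the system described in the talk: the CommitStats fields, the function names, and the choice of features (lines added, lines deleted, files touched) are assumptions made for this example only. It relies on the standard git log options --numstat and --pretty=format, with the %x1e and %x1f hex escapes used as record and field separators.

```python
import subprocess
from dataclasses import dataclass

# Hypothetical container for illustration; not the talk's actual feature set.
@dataclass
class CommitStats:
    sha: str
    author: str
    timestamp: int      # author date, seconds since the Unix epoch
    subject: str
    lines_added: int
    lines_deleted: int
    files_changed: int

RECORD_SEP = "\x1e"     # emitted by git for %x1e, marks the start of a commit
FIELD_SEP = "\x1f"      # emitted by git for %x1f, separates header fields
PRETTY = "--pretty=format:%x1e%H%x1f%an%x1f%at%x1f%s"


def parse_git_log(raw: str) -> list[CommitStats]:
    """Parse the output of `git log --numstat` produced with PRETTY."""
    commits = []
    for chunk in raw.split(RECORD_SEP):
        if not chunk.strip():
            continue
        header, _, body = chunk.partition("\n")
        sha, author, timestamp, subject = header.split(FIELD_SEP, 3)
        added = deleted = files = 0
        for line in body.split("\n"):
            parts = line.split("\t")
            if len(parts) != 3:
                continue            # blank line or anything that is not numstat
            files += 1
            # Binary files are reported as "-" for both counts; skip those.
            if parts[0].isdigit():
                added += int(parts[0])
            if parts[1].isdigit():
                deleted += int(parts[1])
        commits.append(
            CommitStats(sha, author, int(timestamp), subject, added, deleted, files)
        )
    return commits


def read_git_log(repo_path: str, max_count: int = 100) -> list[CommitStats]:
    """Run git log in repo_path and return parsed per-commit statistics."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", f"--max-count={max_count}",
         "--numstat", PRETTY],
        capture_output=True, text=True, check=True,
    )
    return parse_git_log(result.stdout)
```

Calling read_git_log(".") inside a repository returns one CommitStats per commit. Numbers like these are only meaningful when each commit is a small, focused change, which is one reason commit hygiene matters for any model built on top of them.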
