Thursday 11:30 AM–12:10 PM in Room 3

Open Data, Networks and the Law

Iain Carmichael, Michael Kim

Audience level:
Intermediate

Description

What does network science have to say about the law? Can we determine which cases are the most influential in our legal system? Can we understand how legal doctrine evolves? Using tools from network statistics and data provided by Court Listener (an open legal data project), we analyze the network of legal case citations.

Abstract

Citation networks have recently been a topic of interest to network scientists. Court Listener, an open data initiative, provides the network of legal case citations as well as the text of almost every court case in the US. This network data set raises a rich array of questions that are of interest to legal scholars as well as network scientists.

Can we determine which cases are the most influential in our legal system? Can we understand how legal doctrine evolves? We will discuss what we learned about how the network of law cases evolves and what this means for legal practitioners.
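As a rough illustration of the first question, one simple proxy for a case's influence is the number of times it is cited (its in-degree in the citation network). The sketch below assumes the citations are available as (citing case, cited case) pairs. The case labels are made up, not Court Listener data, and in-degree is only one possible measure, not necessarily the one used in this talk.

```python
from collections import Counter

# Hypothetical toy data: each pair is (citing_case, cited_case).
citations = [
    ("C", "A"),
    ("C", "B"),
    ("D", "A"),
    ("D", "C"),
    ("E", "A"),
    ("E", "C"),
]


def rank_by_in_degree(edges):
    """Return (case, times_cited) pairs, most-cited first."""
    counts = Counter(cited for _citing, cited in edges)
    # Sort by citation count (descending), then by label for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_by_in_degree(citations)
for case, times_cited in ranking:
    print(case, times_cited)
```

Raw citation counts favor older cases, which have had more time to accumulate citations. That is one reason to look at how the network grows over time rather than only at a single snapshot.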

Inspired by this data set, we develop new statistical methodology to model how networks evolve. We also provide new techniques to assess the goodness of fit for both standard and novel probabilistic network growth models.
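To make "probabilistic network growth model" concrete, here is a minimal sketch of one standard example, preferential attachment, in which each new case cites earlier cases with probability proportional to their current citation count plus one. This is a textbook model chosen for illustration, not the new methodology described above. The function name and parameters are hypothetical.

```python
import random


def simulate_preferential_attachment(n_cases, cites_per_case, seed=0):
    """Grow a toy citation network one case at a time.

    Returns (edges, in_degree), where edges holds (citing, cited) pairs
    and in_degree[c] is the number of citations case c has received.
    """
    rng = random.Random(seed)
    in_degree = [0]  # case 0 exists at the start, uncited
    edges = []
    for new_case in range(1, n_cases):
        existing = list(range(new_case))
        # The +1 gives uncited cases a nonzero chance of being cited.
        weights = [in_degree[c] + 1 for c in existing]
        k = min(cites_per_case, new_case)
        cited = set()
        while len(cited) < k:  # resample until k distinct cases are chosen
            cited.add(rng.choices(existing, weights=weights, k=1)[0])
        for c in sorted(cited):
            edges.append((new_case, c))
            in_degree[c] += 1
        in_degree.append(0)
    return edges, in_degree


edges, in_degree = simulate_preferential_attachment(50, 2, seed=1)
```

Comparing summaries of a simulated network, such as its in-degree distribution, against the observed citation network is one informal way to think about goodness of fit for a growth model.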

We also discuss what we learned from this project about advancing undergraduate statistics education and how it can interface with industry or other areas of academia. This project involves a wide swath of the data science process, from acquiring and cleaning data, to building the Python infrastructure required to analyze a complex data set, and finally to developing new statistical theory for network data. We believe that data science research projects like this one are ideal for giving undergraduate (and graduate) statistics students comprehensive training in the computational techniques required for the real world.