You’ll learn how to take a never-before-seen legal document, like a contract or a convertible note, and use machine learning to “read” the document and answer questions like “Who’s the investor?” and “What interest rate did the parties agree to?”
If you’re a law firm, every new client is going to send you a giant zip file with hundreds or thousands of documents. You might then sic a team of highly educated, highly paid paralegals on the pile to read each document and painstakingly enter the counterparty, date of execution, etc. into a spreadsheet. But what if an algorithm could do it for free?
In this talk, we’ll cover extracting this information from a legal document using a Conditional Random Field (CRF), a model that extends the Hidden Markov Model. We’ll go through the math behind the model in some depth, give some tips for feature selection, and talk about how to vectorize a legal document. We’ll close with a description of the pluses and minuses of using CRFs in lieu of more complex deep learning-type models.
The target audience is anyone who’s ever heard of Bayes’ Rule.
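To give a flavor of what “vectorizing a legal document” can mean for a CRF, here is a minimal sketch that turns each token into a dictionary of hand-built features. The function names (`tokenize`, `token_features`, `document_features`), the particular features, and the sample sentence are all illustrative assumptions, not the talk’s actual pipeline.

```python
import re


def tokenize(text):
    # Hypothetical tokenizer: decimal numbers, word characters, or single punctuation marks.
    return re.findall(r"\d+(?:\.\d+)?|\w+|[^\w\s]", text)


def token_features(tokens, i):
    # Build a feature dictionary for the token at position i.
    # The feature set is an assumption chosen for illustration.
    tok = tokens[i]
    feats = {
        "bias": 1.0,
        "lower": tok.lower(),
        "is_title": tok.istitle(),      # e.g. "Investor", "Acme"
        "is_upper": tok.isupper(),      # e.g. "LLC"
        "is_number": tok.replace(".", "", 1).isdigit(),  # e.g. "5.5"
        "suffix3": tok[-3:].lower(),
    }
    # Neighboring tokens give the model local context.
    if i > 0:
        feats["prev_lower"] = tokens[i - 1].lower()
    else:
        feats["BOS"] = True  # beginning of sequence
    if i < len(tokens) - 1:
        feats["next_lower"] = tokens[i + 1].lower()
    else:
        feats["EOS"] = True  # end of sequence
    return feats


def document_features(text):
    # Return the tokens and one feature dictionary per token.
    tokens = tokenize(text)
    return tokens, [token_features(tokens, i) for i in range(len(tokens))]


sample = "The Investor, Acme Capital LLC, agrees to an interest rate of 5.5%."
tokens, feats = document_features(sample)
rate_index = tokens.index("5.5")
print(tokens[rate_index], feats[rate_index])
```

A sequence of per-token feature dictionaries like this, paired with one label per token (for example “investor”, “interest rate”, or “other”), is the kind of input a linear-chain CRF is typically trained on.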