This talk discusses using transfer learning and novel forms of feature engineering to detect and localize the absence of signatures in legal documents. It is aimed at attendees with some familiarity with computer vision who are interested in an applied case study, in learning how to leverage deep learning with small amounts of data, or in applying computer vision to text documents.
With the advent of large, pre-trained neural networks for object detection, it's pretty straightforward to leverage transfer learning and train a model that can recognize cats, hot dogs, or anything you please. But what about when you're trying to detect the ABSENCE of an object?
Let's say you're given a mass of contracts and you want to train a model that can tell which ones have been signed and which haven't. You label all the signatures and create a train/test split. You download the weights for ResNet or Inception, set your classes, choose your hyperparameters, and you're off to the races. Your final model can find signatures on a page with 95% precision and recall.
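A minimal sketch of that transfer-learning setup, assuming torchvision's Faster R-CNN with a ResNet-50 backbone; the dataset loader, class count, and hyperparameters are illustrative assumptions rather than the exact configuration used in this work.

```python
# Sketch: fine-tune a COCO-pretrained detector to find signatures.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor


def build_signature_detector(num_classes: int = 2) -> torch.nn.Module:
    """Load pretrained weights and swap in a new box-prediction head.

    num_classes includes the background class, so 2 = background + signature.
    """
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model


model = build_signature_detector()
optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad],
    lr=0.005, momentum=0.9, weight_decay=0.0005,
)

# Training-loop skeleton: `train_loader` is an assumed DataLoader yielding
# (images, targets) pairs, where each target has "boxes" and "labels" for the
# hand-labeled signature annotations.
# model.train()
# for images, targets in train_loader:
#     loss_dict = model(images, targets)
#     loss = sum(loss_dict.values())
#     optimizer.zero_grad()
#     loss.backward()
#     optimizer.step()
```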
You still haven't solved the original problem: finding the unsigned documents. To do this, you need to go further and treat the absence of a signature as an object in itself. That gets you part of the way there. To achieve the highest possible accuracy, we've used a couple of task-specific features: optical character recognition (the same technology banks use to read checks at ATMs) and page length data. This has allowed us to achieve very high accuracy even with only 3,000 labeled examples. We've also successfully extended the solution beyond binary classification to object localization.
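One way to picture how detector output, OCR text, and page length might be combined into a document-level signed/unsigned decision is sketched below. The feature choices, keyword list, and classifier are hypothetical stand-ins for illustration, not the exact pipeline described in the talk.

```python
# Illustrative feature engineering: fuse per-page detections with OCR cues
# and page length, then classify the whole document.
from dataclasses import dataclass
from typing import List

import numpy as np
from sklearn.linear_model import LogisticRegression


@dataclass
class PageResult:
    signature_scores: List[float]   # detector confidences for "signature" boxes
    blank_line_scores: List[float]  # detector confidences for boxes where a
                                    # signature is absent (the "absence" class)
    ocr_text: str                   # e.g. from pytesseract.image_to_string(page)


def document_features(pages: List[PageResult]) -> np.ndarray:
    """Collapse per-page detections and OCR into one feature vector per document."""
    max_sig = max((s for p in pages for s in p.signature_scores), default=0.0)
    max_blank = max((s for p in pages for s in p.blank_line_scores), default=0.0)
    # OCR cue: signature blocks are usually introduced by words like these.
    has_sig_keyword = any(
        kw in p.ocr_text.lower()
        for p in pages
        for kw in ("signature", "signed by", "date:")
    )
    return np.array([max_sig, max_blank, float(has_sig_keyword), len(pages)])


# Training on labeled documents (1 = fully signed, 0 = missing a signature).
# `train_docs` and `train_labels` are assumed to exist.
# X = np.stack([document_features(doc) for doc in train_docs])
# clf = LogisticRegression().fit(X, train_labels)
```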