Pointer Generator Summarisation and Explainability for Legal Documents

Brian Cechmanek

Prior knowledge:
No previous knowledge expected


Thomson Reuters CourtWire aids legal researchers in discovering new court cases. We used the OpenNMT Pointer Generator Network to create extractive-abstractive summaries containing both novel words and case-specific terms. To increase explainability, we display the network's attention as 'highlights' over the original document. Ecco is an open-source tool that provides similar visualisations for modern transformer networks.


CourtWireTM aids legal researchers in discovering court cases, as they arise, from over 200 courts across the USA. A key offering of CourtWire is case summarisation, produced via an in-house tool for editorialists. These summaries are typically 1-2 sentences long, cover 10-15 pages of the case document, and take an average of 6 minutes to write. They tend to be both abstractive and extractive.

TR LabsTM used the OpenNMT Pointer Generator Network (PGN), trained on ~1 million historical case document/summary pairs from 2008-2019, with an input length of 8,000 tokens. Pointer-generator networks aid the factual accuracy of a summary: the generator retains the ability to produce novel text, while the pointer copies source words, which may be out of vocabulary. A coverage mechanism prevents the model from repeating itself.
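The mixing of the generator's vocabulary distribution with the pointer's copy distribution can be sketched as follows. This is an illustrative toy example of the pointer-generator formulation, not the CourtWire model itself; all tokens, probabilities, and the `final_distribution` helper are invented for demonstration.

```python
def final_distribution(p_gen, vocab_probs, attention, source_tokens):
    """Combine the generator's vocabulary distribution with the pointer's
    copy distribution (attention over source tokens):

        P(w) = p_gen * P_vocab(w) + (1 - p_gen) * sum of attention on w
    """
    final = {w: p_gen * p for w, p in vocab_probs.items()}
    for attn, token in zip(attention, source_tokens):
        # Out-of-vocabulary source words enter the distribution here --
        # this is what lets the pointer copy case-specific terms.
        final[token] = final.get(token, 0.0) + (1.0 - p_gen) * attn
    return final

# Toy example: "Smith" is out of vocabulary but appears in the source text.
vocab_probs = {"the": 0.6, "court": 0.3, "ruled": 0.1}
attention = [0.7, 0.2, 0.1]            # attention over the source tokens
source = ["Smith", "court", "ruled"]

dist = final_distribution(p_gen=0.5, vocab_probs=vocab_probs,
                          attention=attention, source_tokens=source)
# "Smith" gets probability 0.5 * 0.7 = 0.35 despite being out of vocabulary.
```

Because the two distributions are weighted by `p_gen` and `1 - p_gen`, the result remains a valid probability distribution over the union of the vocabulary and the source tokens.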

Informally, a good summarisation should:

  1. Be shorter than the input text
  2. Capture the salient information for the task.
  3. Faithfully represent the document (do not introduce false information or provide out-of-context snippets)
  4. Be grammatically and semantically valid.

To monitor 1 and 4, summaries were scored using ROUGE, which we have previously found to be a good proxy metric for summarisation outputs. Ultimately, however, summaries must pass human evaluation (2 and 3), an assessment lawyers are particularly strict about. To increase user trust in the veracity of the summary and to give insight into the workings of the PGN, we created an in-house document visualisation highlighting the attention scores. We do this by overlaying the first 4,000 tokens used to evaluate a document and displaying their normalised attention scores over the original document. We call this 'attention highlights'. Editors enjoyed the experience of interacting with explainable AI (Norkute et al., 2021), and median task time was reduced by 37%.
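The normalisation step behind such highlights can be sketched as below. This is a minimal, hypothetical illustration of min-max normalising per-token attention into highlight intensities; the tokens, scores, and the `attention_highlights` helper are invented, and the in-house tool's actual aggregation may differ.

```python
def attention_highlights(tokens, attention_scores):
    """Min-max normalise per-token attention so the most-attended token
    gets highlight intensity 1.0 and the least-attended gets 0.0."""
    lo, hi = min(attention_scores), max(attention_scores)
    span = (hi - lo) or 1.0  # avoid division by zero when scores are flat
    return [(tok, (s - lo) / span) for tok, s in zip(tokens, attention_scores)]

# Invented tokens and attention scores for illustration.
tokens = ["The", "plaintiff", "filed", "suit", "in", "Delaware"]
scores = [0.02, 0.40, 0.10, 0.35, 0.01, 0.12]

highlights = attention_highlights(tokens, scores)
# "plaintiff" receives intensity 1.0, "in" receives 0.0; the intensities
# can then drive the opacity of highlights rendered over the document.
```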

For those wishing to experiment with visualising any of today’s state-of-the-art transformer-based networks, ecco is an unaffiliated, BSD-3-licensed open-source library for “exploring and explaining Natural Language Processing models using interactive visualizations.”