Monday 2:45 PM–3:25 PM in Central Park East 6501a (6th fl)

Bookworm: Social Networks From Novels

Harrison Pim

Audience level:
Intermediate

Description

Most novels are, in some way, a description of a social network. Bookworm leverages the diverse Python ecosystem to extract the implied social network from a novel and transform it into an intuitively understandable and deeply analysable graph.

Abstract

Detailed analysis of literature is traditionally left to academics in the humanities. But why should they get all the fun?

In this talk, I'll take the audience on an in-depth tour of Bookworm: a system that ingests novels and extracts the social networks of their characters. The process allows for the kind of detailed network analysis that is typically carried out on social networks at scale, using similar open-source tools and techniques:
- Jupyter Notebooks
- Pandas
- NLTK
- NetworkX
- d3.js
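
As a rough illustration of the extraction step, the sketch below counts how often pairs of characters are named in the same sentence and treats those counts as weighted edges. It is a minimal standard-library stand-in, not Bookworm's actual code: the function name `cooccurrence_edges`, the naive regex sentence splitter, the sample text and the character list are all assumptions made for this example. A real pipeline would more likely lean on NLTK for tokenisation and NetworkX for the graph itself.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_edges(text, characters):
    """Count how often each pair of characters is named in the same sentence.

    Hypothetical helper: sentence-level co-occurrence is assumed here as a
    simple proxy for an interaction between two characters.
    """
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    edges = Counter()
    for sentence in sentences:
        # Sort the names so each pair is always stored in the same order.
        present = sorted(name for name in characters if name in sentence)
        for pair in combinations(present, 2):
            edges[pair] += 1
    return edges


# Made-up sample text and character list, purely for illustration.
sample = (
    "Alice met Bob at the station. Bob waved to Carol. "
    "Alice and Carol left together! Dave stayed home."
)
edges = cooccurrence_edges(sample, ["Alice", "Bob", "Carol", "Dave"])
```

Each key in `edges` is a pair of character names and each value is the number of sentences the pair shares, which could then serve as an edge weight in a graph library.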

Having transformed the initially unstructured text into a graph, individual characters' communicability, connectivity, importance and clique-involvement can all be quantified. The analysis can also take on a temporal dimension, allowing novels' dynamics and narrative structures to be inspected and compared: how (and how often) are relationships formed, broken down, or developed over time, and do authors exhibit patterns in their work?
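
To make the temporal idea concrete, the sketch below computes one simple importance measure, degree centrality, separately for each chapter of a novel. It assumes the usual normalisation (a character's number of neighbours divided by the number of other characters in that chapter's graph), and the per-chapter edge lists are invented for the example. The talk does not specify which metrics or time slices Bookworm uses, so treat this as one possible approach.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Fraction of the other characters each character is directly linked to."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        # Fewer than two characters in the graph: nothing to normalise against.
        return {name: 0.0 for name in neighbours}
    return {name: len(adj) / (n - 1) for name, adj in neighbours.items()}


# Hypothetical edge lists, one per chapter, showing a network that grows.
chapters = [
    [("Alice", "Bob")],
    [("Alice", "Bob"), ("Alice", "Carol")],
    [("Alice", "Bob"), ("Alice", "Carol"), ("Alice", "Dave"), ("Bob", "Carol")],
]

# One centrality snapshot per chapter.
timeline = [degree_centrality(chapter_edges) for chapter_edges in chapters]
```

Comparing the entries of `timeline` shows how a character's position shifts as the story progresses; here Alice stays connected to everyone while Bob's share of the cast changes from chapter to chapter.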

In a world in which decision-making is increasingly data-driven, graph-based, and algorithmic, it's important that firm links be drawn between the sciences and the humanities. Those working in the humanities need to know that scientific techniques are not exclusive to the sciences, and those in the sciences must remain in touch with their humanity.

Alongside bridge-building, this talk can also be seen as an applied introduction to several other subjects, including Natural Language Processing, Text Mining, Graph Theory, Digital Humanities and Computational Social Science.
