Despite the internet becoming an increasingly visual place, the algorithms we use to explore it haven’t developed much since the text-heavy web of the 90s/00s.
Wellcome Collection is home to a lot of images, and we’re digitising tens of thousands of new ones every day. It’s impossible for our cataloguers to write detailed captions for all of them, but to most search algorithms, an image without a caption might as well not exist. We’re exploring a new approach: using some neat computer vision and NLP tricks, we’re connecting search queries directly to images’ visual features, removing the need for such rich, descriptive captions.
This talk should be relevant and fairly accessible to anyone with an interest in image search with neural networks, or a burning desire to see my favourite drawings of skeletons and psychedelic cats from the 1800s.
In this talk I’ll cover:
- Image feature extraction with neural networks, and how those feature vectors can be represented as points in a high-dimensional feature space (see the first sketch after this list)
- Word embeddings, and how to create state-of-the-art sentence embeddings (second sketch below)
- How bridging the gap between separate vector spaces opens the door to some unintuitive ways of exploring datasets (third sketch below)
- Some ethical considerations we take into account when building machine learning systems for expert researchers
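To give a flavour of the first point, here’s a minimal sketch of image feature extraction with a pretrained network. The choice of torchvision’s ResNet-50 (and its standard ImageNet preprocessing) is an illustrative assumption, not necessarily the model we use at Wellcome Collection.

```python
import torch
from torchvision import models, transforms
from PIL import Image

# Load a pretrained network and drop its classification head, so the output
# is a 2048-dimensional feature vector rather than a set of class scores.
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval()

# Standard ImageNet preprocessing for the backbone above.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def image_features(path):
    """Return one image as a single point in 2048-dimensional feature space."""
    image = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return backbone(image).squeeze(0)  # shape: (2048,)
```

Running every image in the collection through the same extractor means that nearby points in that space correspond to visually similar images.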
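For the second point, one common way to turn free-text queries into vectors is the sentence-transformers library; the specific model named below is an assumption for illustration, not necessarily what the talk uses.

```python
from sentence_transformers import SentenceTransformer

# Each query (or caption) becomes a point in a shared text embedding space.
encoder = SentenceTransformer("all-MiniLM-L6-v2")

query_vectors = encoder.encode([
    "anatomical drawing of a skeleton",
    "psychedelic cat illustration",
])
print(query_vectors.shape)  # (2, 384) for this particular model
```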
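And for the third point, a rough sketch of bridging the two spaces: learn a mapping from text embeddings to image features using the items that already have captions, then search the whole collection by visual similarity. The ridge regression and the random placeholder arrays are illustrative assumptions; in practice the vectors would come from the two sketches above.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
captioned_text_vecs = rng.normal(size=(1000, 384))    # sentence embeddings of existing captions
captioned_image_vecs = rng.normal(size=(1000, 2048))  # image features for the same items
all_image_vecs = rng.normal(size=(5000, 2048))        # image features for the whole collection

# Learn a linear projection from text space into image-feature space.
mapping = Ridge(alpha=1.0).fit(captioned_text_vecs, captioned_image_vecs)

# Index every image by its visual features, whether or not it has a caption.
index = NearestNeighbors(n_neighbors=10, metric="cosine").fit(all_image_vecs)

def search(query_vector):
    """Project a text query into image space and return its nearest images."""
    projected = mapping.predict(query_vector.reshape(1, -1))
    _, neighbours = index.kneighbors(projected)
    return neighbours[0]  # indices of the ten closest images
```

The crucial point is that the nearest-neighbour search happens in image space, so uncaptioned images are just as findable as captioned ones.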
We work openly at Wellcome Collection, so all of the code and data behind this talk is available and openly licensed. If you’re interested in what’s covered, feel free to use it!