Wednesday 10:50 AM–11:30 AM in Central Park East (#6501a)

Word2Vec 4 GIFs

Anthony Johnson

Audience level:
Intermediate

Description

Word2Vec’s ability to map textual entities that share a common context into a high-dimensional space means that you can discover relationships and derive meaning across all sorts of things, even GIFs! In this talk, I’ll go over how GIPHY used word2vec to create better suggestions for searches and GIFs and, in turn, increased user engagement on our site.

Abstract

Word2Vec is one of the most popular neural-network architectures for producing word embeddings, and its ability to capture various degrees of similarity between words is well documented. In this talk, I’ll go over how GIPHY created novel word2vec models trained on custom corpora, such as our GIF metadata and the behavior of users on our website.
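
As a rough illustration of what a "custom corpus" can look like, the sketch below treats each user session as a "sentence" whose "words" are GIF IDs and generates the (target, context) pairs that a skip-gram word2vec model would train on. The session data, the GIF IDs, the window size, and the `skipgram_pairs` helper are all hypothetical; the abstract does not say how GIPHY actually built its corpora or which word2vec variant it used.

```python
from typing import List, Sequence, Tuple


def skipgram_pairs(
    sequences: Sequence[Sequence[str]], window: int = 2
) -> List[Tuple[str, str]]:
    """Return (target, context) pairs for every token in every sequence.

    Each token is paired with the tokens at most `window` positions
    before or after it in the same sequence, as in skip-gram training.
    """
    pairs: List[Tuple[str, str]] = []
    for seq in sequences:
        for i, target in enumerate(seq):
            start = max(0, i - window)
            stop = min(len(seq), i + window + 1)
            for j in range(start, stop):
                if j != i:  # a token is never its own context
                    pairs.append((target, seq[j]))
    return pairs


# Hypothetical data: GIFs viewed in order during two user sessions.
sessions = [
    ["gif_a", "gif_b", "gif_c"],
    ["gif_b", "gif_d"],
]

pairs = skipgram_pairs(sessions, window=1)
for target, context in pairs:
    print(target, "->", context)
```

GIFs that tend to appear near the same neighbors across many sessions end up with similar vectors once a model is trained on pairs like these, which is the property the rest of the talk relies on.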

Despite the idiosyncratic nature of our data, we were still able to create models that demonstrated the power of the word2vec algorithm, including a fun byproduct we call “GIF Arithmetic” (running dog GIF - dog GIF + cat GIF = running cat GIF).
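
The “GIF Arithmetic” idea can be sketched with plain vector math: subtract one embedding, add another, and look up the nearest remaining GIF by cosine similarity. The three-dimensional vectors below are hand-made toy values chosen so the example works, not output from a trained model, and the `gif_arithmetic` helper and GIF names are hypothetical.

```python
import math
from typing import Dict, List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def gif_arithmetic(
    vectors: Dict[str, List[float]], start: str, minus: str, plus: str
) -> str:
    """Return the GIF closest to vectors[start] - vectors[minus] + vectors[plus].

    The three input GIFs are excluded from the candidates, as is usual
    for word2vec analogy queries.
    """
    query = [
        s - m + p
        for s, m, p in zip(vectors[start], vectors[minus], vectors[plus])
    ]
    candidates = [name for name in vectors if name not in (start, minus, plus)]
    return max(candidates, key=lambda name: cosine(query, vectors[name]))


# Toy embeddings (assumed): axes are roughly [dog-ness, cat-ness, running-ness].
toy_vectors = {
    "dog_gif": [1.0, 0.0, 0.0],
    "cat_gif": [0.0, 1.0, 0.0],
    "running_dog_gif": [1.0, 0.0, 1.0],
    "running_cat_gif": [0.0, 1.0, 1.0],
    "sleeping_cat_gif": [0.0, 1.0, -1.0],
}

result = gif_arithmetic(toy_vectors, "running_dog_gif", "dog_gif", "cat_gif")
print(result)
```

With real embeddings the match is never this clean, so in practice one would look at the top few nearest neighbors rather than a single winner.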

We built services around these models that power the “recommended searches” and “related GIFs” features of our website, resulting in both qualitative and quantitative improvements that grow as we continue to train these models.
