Sunday 14:15–15:00 in LG7

Word Embeddings for fun and profit in Gensim

Lev Konstantinovskiy

Audience level:


Python has great open source libraries for extracting data from its rawest format: human-readable text. We will discuss a family of algorithms called word embeddings, of which Word2Vec is the most famous, and how they can be used in practice with the Gensim package.


A tour of word embeddings, their Python implementations and their use in industry.

We will start with the theory and academic results for Word2Vec, GloVe, Swivel and Word Mover's Distance. We will then move on to their open source Python implementations, mainly in the Gensim package.
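
As a rough illustration of what a word embedding gives you, the sketch below compares words by the cosine similarity of their vectors. The three-dimensional vectors and the helper names (`toy_vectors`, `cosine_similarity`, `most_similar`) are invented for this example; real embeddings are learned from a corpus and typically have a hundred or more dimensions. Gensim offers a comparable nearest-word query on trained vectors, but this sketch deliberately uses only the standard library.

```python
import math

# Toy vectors, made up for illustration only (not learned from any corpus).
toy_vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, vectors):
    """Return the other word whose vector is closest to `word`."""
    return max(
        (other for other in vectors if other != word),
        key=lambda other: cosine_similarity(vectors[word], vectors[other]),
    )


print(most_similar("king", toy_vectors))
```

With these toy values, "king" lands closer to "queen" than to "apple" simply because the first two vectors were chosen to point in a similar direction; a trained model arrives at such geometry from word co-occurrence in text.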