Sunday 11:45–12:30 in LG6

Find the text similarity you need with the next generation of word embeddings in Gensim

Lev Konstantinovskiy


There are many ways to find similar words and documents with Gensim, the open-source natural language processing library that I maintain. I will give an overview of modern word embeddings such as Google's Word2vec, Facebook's FastText, GloVe, WordRank and VarEmbed, and discuss which business tasks each of them fits best.


What is the most similar word to "king"? It depends on what you mean by similar. "King" can be interchanged with "Canute", but its attribute is "crown". We will discuss how to get these two kinds of similarity from word embeddings. We will also touch on how to deal with the common issues of rare, frequent and out-of-vocabulary words.
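
To make the "most similar word" question concrete, here is a minimal sketch of the nearest-neighbour lookup that sits underneath most word-embedding similarity queries: rank every other word by the cosine similarity of its vector to the query vector. It uses plain Python rather than Gensim, and the three-dimensional vectors and the function names `cosine` and `most_similar` are invented purely for illustration; they are not trained embeddings and not the library's API. Which neighbour ranks first in practice depends on how the embeddings were trained.

```python
import math

# Toy 3-dimensional vectors, made up for this example only. Real embeddings
# are learned from a corpus and usually have hundreds of dimensions.
TOY_VECTORS = {
    "king":   [0.9, 0.8, 0.1],
    "canute": [0.8, 0.9, 0.1],
    "crown":  [0.7, 0.3, 0.6],
    "banana": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, topn=3):
    """Return up to `topn` (word, similarity) pairs, most similar first."""
    if word not in vectors:
        # A plain lookup table has no answer for out-of-vocabulary words.
        raise KeyError(f"'{word}' is out of vocabulary")
    query = vectors[word]
    scored = [
        (other, cosine(query, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


if __name__ == "__main__":
    for neighbour, score in most_similar("king", TOY_VECTORS):
        print(f"{neighbour}: {score:.3f}")
```

With these invented numbers "canute" ranks above "crown", but that ordering is a property of the toy data, not a claim about any real model. The `KeyError` branch shows why out-of-vocabulary words need separate handling when vectors are stored one per known word.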
