Friday 1:00 PM–2:30 PM in Room #370B/C (3rd Floor)

Modern NLP in Python

Patrick Harrison

Audience level:
Intermediate

Description

Academic and industry research in Natural Language Processing (NLP) has progressed at an accelerating pace over the last several years. Members of the Python community have been hard at work moving cutting-edge research out of papers and into open-source, "batteries included" software libraries that can be applied to practical problems. We'll explore some of these tools for modern NLP in Python.

Abstract

Academic and industry research in Natural Language Processing (NLP) has progressed at an accelerating pace over the last several years. Members of the Python data science community have been hard at work moving cutting-edge research out of papers and into open-source, "batteries included" software libraries that can be applied to practical problems.

In this tutorial and live demo, we'll explore some of these tools for modern NLP in Python, including spaCy and gensim. Along the way, we'll learn about foundational advances in machine representations of natural language, such as topic modeling with Latent Dirichlet Allocation (LDA) and word vector embeddings with word2vec. Finally, we'll discover visualization tools that help us inspect and understand high-dimensional natural language models, including pyLDAvis and t-SNE.
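
To give a flavor of what word vector embeddings make possible, here is a minimal sketch in plain Python. It does not use word2vec or gensim: the three-dimensional vectors below are hand-written, hypothetical values chosen for illustration, and the helper names (`toy_vectors`, `cosine_similarity`, `most_similar`) are assumptions of this example. Real embeddings are learned from a corpus and usually have a hundred or more dimensions, but the idea of comparing words by the angle between their vectors is the same.

```python
import math

# Hypothetical, hand-written "word vectors" for illustration only.
# A trained word2vec model would learn such vectors from text.
toy_vectors = {
    "pizza": [0.9, 0.1, 0.0],
    "pasta": [0.8, 0.2, 0.1],
    "laptop": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, vectors):
    """Return the other word whose vector is closest to `word`."""
    candidates = (w for w in vectors if w != word)
    return max(candidates, key=lambda w: cosine_similarity(vectors[word], vectors[w]))


if __name__ == "__main__":
    # Words with similar vectors score close to 1; unrelated ones score near 0.
    print(most_similar("pizza", toy_vectors))
    print(round(cosine_similarity(toy_vectors["pizza"], toy_vectors["laptop"]), 3))
```

With learned embeddings in place of the toy values, the same nearest-neighbor lookup is what lets a model surface related terms for a given word.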