PyData New York City 2019 - Presentation: [SCHEDULE CHANGE 12:45PM - 2:15PM] Neural Networks for Natural Language Processing

Audience level:

Intermediate

Description

Over roughly the past two years, deep pretrained neural networks have become a staple of the NLP community. The driver for this development has been the success of transfer learning that builds on large pretrained language models. In this tutorial I will cover the main concepts behind these methods.

Abstract

Neural Networks for NLP

Over roughly the past two years, deep pretrained neural networks have become a staple of the NLP community. The driver for this development has been the success of language modelling as a pretraining task (transfer learning), as well as new and improved network architectures and training methods. State-of-the-art models like ULMFiT, BERT, GPT-2, XLM and AWD-LSTM all use some kind of language modelling task to first train a network, which can then be used to perform other downstream tasks.
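
As a rough illustration of what a language modelling task looks like, here is a minimal sketch. It is not taken from the tutorial notebooks, and it deliberately uses a count-based bigram model rather than a neural network. The function names and the toy corpus are made up for this example. The point is that the training targets (the next token) come from the raw text itself, so no manual labels are needed.

```python
from collections import Counter, defaultdict


def next_token_pairs(tokens):
    """Turn a token sequence into (previous token, next token) training pairs."""
    return list(zip(tokens[:-1], tokens[1:]))


def fit_bigram_model(pairs):
    """Count how often each token follows each context token."""
    counts = defaultdict(Counter)
    for prev, nxt in pairs:
        counts[prev][nxt] += 1
    return counts


def predict_next(model, prev):
    """Return the most frequent continuation of `prev`, or None if unseen."""
    if prev not in model:
        return None
    return model[prev].most_common(1)[0][0]


# Toy corpus (hypothetical); real pretraining uses very large text collections.
corpus = "the cat sat on the mat and the cat slept".split()
pairs = next_token_pairs(corpus)
model = fit_bigram_model(pairs)
print(predict_next(model, "the"))
```

A neural language model replaces the count table with a trained network, but the objective stays the same: predict the next token from its context. The pretrained network can then be reused for downstream tasks.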

Tutorial Overview

In this tutorial I will first cover the basics of neural networks as applied to NLP before moving on to the ideas behind these state-of-the-art methods. A rough outline of the tutorial is as follows:

Neural network basics (10-15 minutes)
Recurrent neural networks and language models (10-15 minutes)
Language model pretraining and transfer learning (30 minutes)
Attention and the Transformer model (30 minutes)

I aim to focus on practical applications (document classification, sequence labelling) with real-world examples. At the end of the tutorial you should be able to use some of the state-of-the-art methods in your own projects and have a better understanding of how the building blocks of those methods fit together.

Prerequisites

You'll get the most out of the tutorial if you are already familiar with Python. Basic familiarity with neural networks is assumed; it helps if you have at least a passing knowledge of loss functions, gradient descent and the like. Familiarity with specific neural network packages is not required, although familiarity with PyTorch doesn't hurt. The tutorial will feature lots of code, but all code is available for offline viewing as notebooks. The code itself is not the main point!

All associated code is available on GitHub at https://github.com/mattilyra/pydatanyc_2019

Wednesday 1:15 PM–2:45 PM in Radio City (6604)