Saturday 15:45–16:30 in Hörsaal 3

Understanding and Applying Self-Attention for NLP

Ivan Bilan

Audience level:


Understanding attention mechanisms, and in particular the self-attention introduced in Google's "Attention Is All You Need" paper, is a valuable skill for anyone working on complex NLP problems. In this talk, we will go over the main components of the Google Transformer self-attention model and the intuition behind it. We will then look at how this architecture can be applied to other NLP tasks, such as slot filling.



  1. Introduction to Attention for NLP: [ A brief overview of how attention is used to improve the performance of LSTMs. We will look into several approaches and how they can impact the quality of your NLP models. ]
  2. Neural Machine Translation (NMT) and seq2seq: [ A short account of how sequence-to-sequence (seq2seq) models were introduced and what they contributed to the field of NMT. ]
  3. Self-attention in Detail: [ We will look into the Encoder/Decoder architecture of the Google Transformer and discuss Multi-Head Attention in detail. ]
  4. Overview of Relation and Argument Extraction: [ A quick refresher on what typical Relation Extraction and Argument Extraction tasks look like, to get you up to speed on this field of research. ]
  5. Adapting Self-Attention for Relation Extraction: [ In this part, we will look into how self-attention can be adapted to work on relation extraction, which changes to the Encoder layer are most beneficial, and which parts of the architecture are most crucial. ]
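
As a rough illustration of the scaled dot-product attention at the core of item 3, the sketch below computes self-attention over a toy sequence using only the Python standard library. It is a minimal sketch under stated assumptions, not the talk's implementation: the function names (`softmax`, `scaled_dot_product_attention`) and the token vectors are made up for illustration, and the learned query/key/value projections of a real Transformer are left out.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) for lists of equal-length vectors.

    weights[i][j] is how strongly position i attends to position j.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        row = softmax(scores)
        # Each output is a weighted average of the value vectors.
        out = [
            sum(w * v[c] for w, v in zip(row, values))
            for c in range(len(values[0]))
        ]
        weights.append(row)
        outputs.append(out)
    return outputs, weights


# Toy "sentence" of three token vectors (made-up numbers, not real embeddings).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Self-attention: queries, keys, and values all come from the same sequence.
# Assumption: the learned linear projections are omitted for brevity.
outputs, weights = scaled_dot_product_attention(tokens, tokens, tokens)
```

Each row of `weights` sums to 1, so every output vector is a convex combination of the input vectors. Multi-Head Attention runs several such attention computations in parallel, each on a differently projected copy of the inputs, and concatenates the results.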
