Sunday 10:50–11:25 in Auditorium

Efficient Transfer Learning for Machine Translations @Booking.com

Karlijn Zaanen, Satendra Kumar

Audience level:
Intermediate

Description

At Booking.com, we want to automatically translate millions of guest reviews across 43 languages. Transfer Learning, both across domains and languages, is at the heart of our solution. We will present how we used the OpenNMT-tf library, leveraged various open source datasets, and trained these models efficiently at scale. We will end with a nice demo!

Abstract

At Booking.com's Machine Translation (MT) team, we want to empower people to experience the world without language barriers. Booking.com is available in 43 languages, but the reviews our users write used to be available only in the language the reviewer chose to write in. So that more users can understand and get value from these reviews, our ultimate goal is to translate them into all 43 of our languages.

We will present our workflow for training our MT models with the open source package OpenNMT-tf (multi-GPU training, built on TensorFlow, with Python bindings). We will show how transfer learning with a large dataset of open source translation data (up to 90 million parallel sentences) helped us advance our review-translation use case. Scaling our dataset up also introduced the challenge of keeping training time manageable. We will dive into more detail on how to monitor GPU usage, and explain some of the steps we took to reduce training time.
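As a flavor of the GPU-usage monitoring discussed above, here is a minimal sketch (not from the talk; the helper name, field choice, and sample values are our own) of parsing the CSV output of `nvidia-smi --query-gpu=index,utilization.gpu,memory.used --format=csv,noheader,nounits` to flag underutilized GPUs, which during multi-GPU training often points to an input-pipeline bottleneck:

```python
import csv
import io

def parse_gpu_stats(nvidia_smi_output: str):
    """Parse `nvidia-smi ... --format=csv,noheader,nounits` output into dicts.

    Hypothetical helper for illustration; in practice you would capture the
    text via subprocess and log these stats periodically during training.
    """
    stats = []
    for row in csv.reader(io.StringIO(nvidia_smi_output)):
        index, util, mem = (field.strip() for field in row)
        stats.append({"gpu": int(index), "util_pct": int(util), "mem_mib": int(mem)})
    return stats

# Sample output as it might look on a 2-GPU machine (values are made up):
sample = "0, 87, 10342\n1, 12, 9980\n"
for gpu in parse_gpu_stats(sample):
    if gpu["util_pct"] < 50:  # persistently low utilization suggests the GPUs are starved for data
        print(f"GPU {gpu['gpu']} underutilized at {gpu['util_pct']}%")
```

Watching these numbers while tuning batch size and data loading is one cheap way to keep large-scale training time in check.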

We would like to end with a more relaxed and fun demo.
