Saturday 16:45–17:30 in Small Room

Simulate your language. ish.

John Paton

Audience level:
Novice

Description

John will present a simple character-level Markov model for simulating language in Python. The goal is to generate text that demonstrates how English looks to non-English readers. The model generates text that is simultaneously totally foreign and yet weirdly familiar, using logic simple enough that anyone could replicate it.

Abstract

engl_ish is a Python model for text generation. It is based on a character-level Markov model augmented with some additional logic, with the aim of capturing the "feel" of a language. The goal is to generate text in, for example, English that contains no actual English meaning but nonetheless looks like English to someone who doesn't speak the language.
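
To make the idea concrete, here is a minimal sketch of a plain character-level Markov model. It is not the engl_ish code and does not reproduce the "additional logic" mentioned above; the function names `train` and `generate` and the `order` parameter (how many preceding characters the model looks at, assumed to be at least 1) are hypothetical choices made for illustration.

```python
import random
from collections import Counter, defaultdict


def train(text, order=2):
    """Count which character follows each run of `order` characters.

    Illustrative only: `order` must be >= 1. The text is padded with
    leading spaces so the first real character also has a context.
    """
    counts = defaultdict(Counter)
    padded = " " * order + text
    for i in range(len(text)):
        context = padded[i:i + order]
        counts[context][padded[i + order]] += 1
    return counts


def generate(counts, length, order=2, seed=None):
    """Sample `length` characters, one at a time, from the trained counts."""
    rng = random.Random(seed)
    start = " " * order
    context = start
    out = []
    for _ in range(length):
        followers = counts.get(context)
        if not followers:
            # Dead end (context only seen at the very end of the training
            # text): fall back to the start-of-text context.
            context = start
            followers = counts[context]
        chars = list(followers)
        weights = list(followers.values())
        # Pick the next character in proportion to how often it
        # followed this context in the training text.
        next_char = rng.choices(chars, weights=weights)[0]
        out.append(next_char)
        context = (context + next_char)[-order:]
    return "".join(out)


if __name__ == "__main__":
    sample = "the quick brown fox jumps over the lazy dog and then the dog naps"
    model = train(sample, order=2)
    print(generate(model, 60, order=2, seed=1))
```

A larger `order` makes the output resemble the training text more closely, while a smaller one gives looser, more foreign-looking strings; real use would need a much larger training text than the one-line sample here.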

In this talk, John will share his inspiration for creating the model, go into detail about its logic and how it was implemented in Python, share results from a variety of training sets and settings, and talk about issues and opportunities for improvement.
