Friday 14:35–16:10 in LG7

One workshop that data scientists don't want you to attend...

Oliver Laslett, Andraz Hribernik

Audience level:


With this one weird trick you can build a text processing pipeline!

We've all fallen for clickbait articles online. They pollute our news feeds and make it harder to find valuable information. In this workshop we'll stream news articles in real time and detect clickbait using simple machine learning techniques. You won't believe what happened next...


By the end of the workshop you'll have your very own Python app for streaming real-time news and detecting clickbait. In the workshop we'll cover:

- Streaming data from a REST API
- Preprocessing textual data
- Training a simple machine learning classifier for clickbait
- Putting everything together in a scikit-learn pipeline
- Analysing our results (which news source is the most clickbaity?)
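
To give a flavour of the preprocessing, classifier and pipeline steps, here is a minimal sketch of what such a pipeline might look like. It is not the workshop's actual code: the `clean_headline` helper, the choice of TF-IDF features with logistic regression, and the handful of hand-written headlines are all assumptions made purely for illustration. A real run would train on a labelled dataset and score articles arriving from the news API.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def clean_headline(text):
    """Hypothetical preprocessing step: lowercase and strip punctuation."""
    text = text.lower()
    # Replace anything that is not a letter, digit, apostrophe or space.
    text = re.sub(r"[^a-z0-9' ]+", " ", text)
    # Collapse runs of whitespace left behind by the substitution.
    return re.sub(r"\s+", " ", text).strip()


# Toy headlines written for this sketch only (1 = clickbait, 0 = not).
headlines = [
    "You won't believe what this cat did next",
    "10 weird tricks doctors don't want you to know",
    "This one simple trick will change your life",
    "What happened next will shock you",
    "Central bank raises interest rates by a quarter point",
    "Parliament votes on new transport budget",
    "Storm causes power cuts across the north",
    "Company reports quarterly earnings below forecast",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Chain feature extraction and the classifier so that raw text goes in
# and a clickbait prediction comes out.
clickbait_pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(preprocessor=clean_headline, ngram_range=(1, 2))),
    ("clf", LogisticRegression()),
])
clickbait_pipeline.fit(headlines, labels)

# Score unseen headlines; column 1 holds the estimated clickbait probability.
new_headlines = [
    "You won't believe this one weird trick",
    "Government publishes annual budget report",
]
predictions = clickbait_pipeline.predict(new_headlines)
probabilities = clickbait_pipeline.predict_proba(new_headlines)
for headline, p in zip(new_headlines, probabilities[:, 1]):
    print(f"{p:.2f}  {headline}")
```

Wrapping both steps in one `Pipeline` means the same preprocessing is applied at training and prediction time, which is convenient when headlines arrive one at a time from a stream.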