Friday October 29 9:30 AM – Friday October 29 11:00 AM in Workshop/Tutorial II

Getting Started with Text Classification: Predict if Tweets are about Real Disasters

Nabanita Roy

Prior knowledge:
Previous knowledge expected
Python and Machine Learning

Summary

Natural Language Processing is an essential skill set for Data Scientists. In this session, I will demonstrate how to work with textual data, derive insights from it, engineer features, and use those features to solve text classification tasks. This is a use-case-driven tutorial in which I will work through one text classification task: predicting whether a tweet is about a real disaster.

Description

The content includes:

  1. Introduction to NLP-based Python Libraries
  2. Exploratory Data Analysis
  3. Tweet Cleaning and Preprocessing
  4. Feature Engineering - Location Analysis | Sentiment Analysis | Tweet Wordcloud Analysis | Relative Feature Extractions
  5. Training a Classifier
  6. Conclusion and Future Work
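
To give a flavor of items 3 and 4, here is a minimal sketch of tweet cleaning and a few hand-crafted features, using only Python's standard `re` module. It is not the material presented in the session: the function names `clean_tweet` and `simple_features`, the cleaning rules, the feature set, and the example tweet are all assumptions made for illustration.

```python
import re

# Patterns for common tweet noise (illustrative choices, not from the tutorial).
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ALNUM_RE = re.compile(r"[^a-z0-9\s]")


def clean_tweet(text: str) -> str:
    """Lowercase a tweet and strip URLs, mentions, and punctuation."""
    text = text.lower()
    text = URL_RE.sub(" ", text)        # drop links
    text = MENTION_RE.sub(" ", text)    # drop @user mentions
    text = text.replace("#", " ")       # keep the hashtag word, drop the symbol
    text = NON_ALNUM_RE.sub(" ", text)  # drop remaining punctuation
    return " ".join(text.split())       # collapse repeated whitespace


def simple_features(text: str) -> dict:
    """Count a few surface features of the raw tweet (hypothetical feature set)."""
    return {
        "n_urls": len(URL_RE.findall(text)),
        "n_mentions": len(MENTION_RE.findall(text)),
        "n_hashtags": text.count("#"),
        "n_words": len(clean_tweet(text).split()),
    }


# Made-up example tweet, not taken from any dataset.
example = (
    "Wildfire spreading near Highway 5!! Evacuate NOW "
    "#wildfire @local_news https://example.com/alert"
)
print(clean_tweet(example))
print(simple_features(example))
```

The cleaned text could then be passed to a vectorizer and a classifier of your choice (item 5), while the counts from `simple_features` could serve as additional numeric features.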