Friday 12:15–13:50 in LG7

Building a ChatBot with Python, NLTK and scikit

Edward Bullen

Audience level:
Novice

Description

An introduction to the basics of Natural Language Processing, using the Python NLTK and machine learning packages to classify language and build a simple Q&A bot.

Abstract

Working code samples and a basic ChatBot framework (written in Python) will be provided and explained, so that attendees can create a simple Q&A bot that learns from previous experience and responds to questions with appropriate answers. In this talk we will:

  1. Build a basic ChatBot Framework using core Python and a SQL database.
  2. Demonstrate and experiment with a Learning-by-Example bot using ranking functions in Python and SQL to get some basic chat functionality working.
  3. Introduce the Python NLTK to extract features from the chat sentences and words stored in the chatbot database.
  4. Work through a feature engineering example using NLTK, scikit-learn and NumPy to show how we can classify sentences using supervised learning and estimate the accuracy of our classification model.
  5. Apply the sentence classification ML model to our chatbot engine to target responses more accurately.
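
To give a flavour of steps 1 and 2, here is a minimal sketch of a learning-by-example bot. It is not the framework from the talk: the table name `chat`, the helper functions and the word-overlap (Jaccard) ranking are all assumptions made for illustration, and it uses only the Python standard library with an in-memory SQLite database.

```python
import re
import sqlite3


def tokenize(sentence):
    """Lower-case a sentence and return its words as a set."""
    return set(re.findall(r"[a-z0-9']+", sentence.lower()))


def create_db(path=":memory:"):
    """Open the database and create the (hypothetical) chat table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chat ("
        "id INTEGER PRIMARY KEY, "
        "question TEXT NOT NULL, "
        "answer TEXT NOT NULL)"
    )
    return conn


def learn(conn, question, answer):
    """Store one example question together with its answer."""
    conn.execute(
        "INSERT INTO chat (question, answer) VALUES (?, ?)",
        (question, answer),
    )
    conn.commit()


def jaccard(a, b):
    """Rank two word sets by overlap: shared words / all words."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def respond(conn, sentence, threshold=0.3):
    """Return the answer of the best-ranked stored question.

    Returns None when no stored question scores at least `threshold`
    (an arbitrary cut-off chosen for this sketch).
    """
    words = tokenize(sentence)
    best_score, best_answer = 0.0, None
    for question, answer in conn.execute("SELECT question, answer FROM chat"):
        score = jaccard(words, tokenize(question))
        if score > best_score:
            best_score, best_answer = score, answer
    return best_answer if best_score >= threshold else None


if __name__ == "__main__":
    conn = create_db()
    learn(conn, "What is your name?", "I am a simple bot.")
    learn(conn, "How is the weather today?", "I cannot see outside.")
    print(respond(conn, "what is your name"))
    print(respond(conn, "completely unrelated words"))
```

When `respond` returns None, a bot of this kind could ask the user for a suitable reply and pass it to `learn`, which is one way to read "learns from previous experience". Word overlap is a crude ranking, which is part of the motivation for the NLTK features and sentence classification in steps 3 to 5.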

Prerequisites

Attendees will need:
+ Anaconda for Python 3.5 or 3.6
+ NLTK (Python Natural Language Toolkit - pip install nltk)
+ The Stanford Java CoreNLP Parser (https://stanfordnlp.github.io/CoreNLP/ or wget http://nlp.stanford.edu/software/stanford-corenlp-full-2016-10-31.zip and unzip it)
+ Java 8

All of this could be installed on the day, but preparing in advance will save time. Most of what I am demonstrating will probably work with Python 2.7, but it has not been tested with that version.