Deep inside the many functionalities and tools of TensorFlow lies a component named the TensorFlow Object Detection API, whose purpose, as the name says, is to train a system capable of recognizing objects in a frame. Usually, this technology is used to detect real-life objects such as cars, trees, and dogs. However, I decided to give it a personal touch and took on the challenge of building a model that detects everybody's favorite Pokemon, Pikachu.
In this talk, I will introduce the concept of object detection and explain how to train a custom model using the TensorFlow Object Detection API, with the goal of deploying it on several platforms, such as an Android device. More importantly, instead of just talking about fancy neural networks and good results, I would like to explain the whole process I followed to achieve this.
First, I will start with an introduction to the TensorFlow Object Detection API library by summarizing some of the details explained in the original paper, for instance, the features that make this library unique and the different kinds of object detection architectures it supports. After that bit of theory, I will continue with how I converted my Pikachu images into the right format so I could create the dataset. Afterwards, I will explain the training process and how to evaluate it using TensorFlow's web UI, TensorBoard. Lastly, I will describe how to use the model in a Python notebook, the (somewhat hard and painful) process of exporting it and using it on Android, and how to apply it to detect Pikachu in a video.
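To give a flavor of the dataset-preparation step mentioned above: the TensorFlow Object Detection API expects each labeled image as a TFRecord example whose bounding-box coordinates are normalized to [0, 1] relative to the image size. The sketch below is a simplified illustration of that normalization (the function name is hypothetical; the feature keys mirror the field names documented by the API), not the exact code from the talk.

```python
# Hedged sketch: normalize one labeled Pikachu bounding box into the
# feature layout the TF Object Detection API expects in a TFRecord.
# Keys mirror the API's documented TFRecord fields; make_box_features
# is an illustrative helper, not part of the library.

def make_box_features(width, height, xmin, ymin, xmax, ymax, label="pikachu"):
    """Return a dict of normalized box features for a single object."""
    return {
        "image/width": width,
        "image/height": height,
        # Box coordinates scaled to [0, 1] relative to image dimensions.
        "image/object/bbox/xmin": xmin / width,
        "image/object/bbox/xmax": xmax / width,
        "image/object/bbox/ymin": ymin / height,
        "image/object/bbox/ymax": ymax / height,
        "image/object/class/text": label,
        "image/object/class/label": 1,  # class ids start at 1; 0 is reserved
    }

features = make_box_features(800, 600, 200, 150, 600, 450)
print(features["image/object/bbox/xmin"])  # 0.25
```

In the real pipeline these values would be packed into a `tf.train.Example` and written with a `tf.python_io.TFRecordWriter`; the normalization shown here is the part that most often trips people up.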