In this talk I will introduce a Python-based, deep learning gesture recognition model. The model is deployed on an embedded system, works in real time, and can recognize 25 different hand gestures from a simple RGB webcam stream, such as a laptop camera. This allows for building playful user interfaces that are controlled with hand gestures. Lastly, I will show a demo of such an application.
In this talk I will introduce a Python-based, deep learning gesture recognition model. The model is deployed on an embedded system, works in real time, and can recognize 25 different hand gestures from a simple webcam stream. In contrast to traditional vision-based gesture controllers like the Microsoft Kinect, our system requires no depth information. The development of such an architecture is a complex process that requires careful consideration at each step. I will guide you through our process and share the lessons we learned. This includes: our large-scale crowd-acting operation to collect over 150,000 short video clips; our process for deciding which deep learning framework to use; the development of a network architecture that classifies video clips from RGB input frames alone; the iterations necessary to make the neural network run in real time on embedded devices; and lastly, the discovery and development of playful gesture-based applications.
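To give a feel for what real-time video classification from an RGB webcam stream involves, here is a minimal sketch of the inference loop. The model, clip length, input resolution, and framework shown here are hypothetical placeholders, not the architecture or framework choice presented in the talk; it assumes OpenCV and PyTorch are installed.

```python
# Minimal sketch of a real-time gesture classification loop over RGB webcam
# frames. The model, clip length, and resolution are placeholders; the actual
# architecture and framework choice are covered in the talk.
from collections import deque

import cv2            # webcam capture
import numpy as np
import torch          # any deep learning framework would work similarly

CLIP_LEN = 16         # consecutive frames fed to the network (assumed value)
NUM_GESTURES = 25     # the model distinguishes 25 gestures

# Placeholder stand-in for a trained video classification network that maps
# a clip of RGB frames to gesture class scores.
model = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.LazyLinear(NUM_GESTURES),
).eval()

frames = deque(maxlen=CLIP_LEN)  # sliding window of the most recent frames
cap = cv2.VideoCapture(0)        # default webcam, e.g. a laptop camera

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # Downscale and convert BGR (OpenCV's default) to RGB before buffering.
    rgb = cv2.cvtColor(cv2.resize(frame, (96, 96)), cv2.COLOR_BGR2RGB)
    frames.append(rgb)
    if len(frames) == CLIP_LEN:
        # Shape (1, CLIP_LEN, H, W, 3), scaled to [0, 1].
        clip = torch.from_numpy(np.stack(frames)).float().unsqueeze(0) / 255.0
        with torch.no_grad():
            scores = model(clip)
        gesture_id = int(scores.argmax())
        cv2.putText(frame, f"gesture {gesture_id}", (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
    cv2.imshow("gestures", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```

The key design point this illustrates is the sliding window: rather than classifying single images, consecutive frames are buffered and the network scores the whole clip, which is what makes recognition of dynamic gestures from RGB input alone possible.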