Building an end-to-end working prototype that detects sign language meanings in images/videos and generates an equivalent, realistic voicing of the signed words, in real time, isn't a day's work. Here, I explain how it came together and what I learned in the process.
I built an end-to-end working prototype that uses Computer Vision to detect sign language meanings in images and live video and generates an equivalent, realistic voicing of the signed words, while also using Machine Translation to map the returned text to local African languages (in progress), all in real time. It was fun. It was stressful too.
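To make the detect-then-speak loop concrete, here is a minimal sketch of that pipeline. The talk doesn't specify the actual stack, so this assumes MediaPipe Hands for landmark extraction, OpenCV for video capture, pyttsx3 for offline text-to-speech, and a hypothetical trained classifier `classify_landmarks` that maps hand landmarks to a word:

```python
# Minimal sketch: live video -> hand landmarks -> word -> spoken audio.
import cv2
import mediapipe as mp
import pyttsx3


def classify_landmarks(landmarks):
    """Hypothetical trained model: 21 (x, y, z) hand landmarks -> word label."""
    raise NotImplementedError("plug in a trained sign classifier here")


def main():
    hands = mp.solutions.hands.Hands(max_num_hands=1)
    tts = pyttsx3.init()
    cap = cv2.VideoCapture(0)  # live webcam feed
    last_word = None
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB input; OpenCV captures frames in BGR.
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            word = classify_landmarks(results.multi_hand_landmarks[0])
            if word and word != last_word:
                tts.say(word)  # speak only when the detected sign changes
                tts.runAndWait()
                last_word = word
        if cv2.waitKey(1) & 0xFF == ord("q"):  # press 'q' to quit
            break
    cap.release()


if __name__ == "__main__":
    main()
```

The translation step (mapping the recognized text to local African languages) would slot in between classification and speech synthesis; since that part of the project is still in progress, it is left out of the sketch.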
In this talk, I'll cover the highs and lows of working on this project, elaborating on the groundwork, the process, the results, and its future scope.