Friday 17:30–19:30 in Intermediate

Convolutional neural networks for audio processing: starting pack

Marius Miron, Olga Slizovskaia

Audience level:


In this tutorial we present our experience with adapting neural network frameworks to audio processing tasks. Specifically, we focus on pre-training routines such as data processing, and on post-training parameter visualization and debugging.


Neural networks are increasingly popular in audio signal processing for tasks such as speech recognition or denoising. Scientific papers are usually accompanied by code repositories that rely on libraries such as Theano or TensorFlow, which can be used from Python. However, adapting a system to different tasks and data requires a set of pre-training routines and parameter debugging, which we discuss in this tutorial. Starting from the audio signals, we introduce the data preparation steps that precede training (feature computation, batch generation, normalization) with examples in NumPy or SciPy. We summarize the core concepts of neural networks and code an architecture with the Keras library. Finally, we learn how to visualize and debug parameters with TensorBoard. The repository for this workshop can be found at the GitHub page.
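
The three preparation steps named above (feature computation, normalization, batch generation) can be sketched in a few lines of NumPy and SciPy. This is a minimal illustration and not the code from the workshop repository: the function names, the synthetic test signal, and all parameter values (frame size, hop size, context length, batch size) are assumptions chosen for the example.

```python
import numpy as np
from scipy import signal


def log_spectrogram(audio, sample_rate, frame_size=1024, hop_size=256):
    """Feature computation: log-magnitude STFT with shape (frames, bins)."""
    _, _, stft = signal.stft(
        audio, fs=sample_rate, nperseg=frame_size,
        noverlap=frame_size - hop_size,
    )
    # scipy returns (bins, frames); transpose so that time is the first axis.
    return np.log1p(np.abs(stft)).T


def normalize(features, eps=1e-8):
    """Normalization: zero mean and unit variance per frequency bin."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    return (features - mean) / (std + eps)


def batch_generator(features, context=16, batch_size=32, seed=0):
    """Batch generation: yield shuffled blocks of `context` consecutive frames."""
    rng = np.random.default_rng(seed)
    starts = np.arange(features.shape[0] - context + 1)
    rng.shuffle(starts)
    for i in range(0, len(starts), batch_size):
        chosen = starts[i:i + batch_size]
        yield np.stack([features[s:s + context] for s in chosen])


# Synthetic input (an assumption for this sketch): 1 s of a noisy 440 Hz tone.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
noise = 0.1 * np.random.default_rng(0).standard_normal(sample_rate)
audio = np.sin(2 * np.pi * 440 * t) + noise

features = normalize(log_spectrogram(audio, sample_rate))
first_batch = next(batch_generator(features))
```

Each batch has shape (batch, context, bins), which is a common input layout for a convolutional layer once a channel axis is added. In practice the normalization statistics would be computed on the training set only and reused for validation and test data.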
