This talk covers learning about neural networks through programming and experimentation. In particular, we will use the Keras library as a straightforward way to quickly implement popular neural network architectures such as feed-forward, convolutional, and recurrent networks. We will also focus on the data transformations that happen inside these networks as information passes from layer to layer.
Programming frameworks for implementing neural networks have become easy to use and now allow rapid prototyping and experimentation. These frameworks can also serve as teaching tools for those getting started with neural networks and deep learning. However, students and practitioners often start from textbooks and research papers to learn these powerful techniques, and get bogged down in mathematical notation and jargon. This talk proposes a different approach, in three high-level steps: 1. learn the basics of neural network architectures and applications, 2. experiment with these models through code examples, and 3. revisit the math and theory behind the models with a more practical understanding of how they work. We will focus mainly on the architectures of three popular types of neural networks (feed-forward, convolutional, and recurrent), setting aside the issue of optimizing these networks during training.
This talk assumes some familiarity with supervised machine learning and classification, but no prior knowledge of neural networks or deep learning. Familiarity with Python is beneficial, since the talk presents neural networks primarily from the perspective of programming with a high-level library; if you are familiar with another programming language or deep learning library, the concepts should still make sense. A detailed talk outline is given below:
1. Neural network architectures and applications (a minimal Keras sketch follows this group)
   a. Feed-forward networks
   b. Convolutional networks
   c. Recurrent networks
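To make the outline concrete, here is a minimal sketch of the first architecture, a feed-forward network, as it might look in Keras. The layer sizes (20 input features, 64 hidden units, 10 classes) are illustrative placeholders, not values from the talk, and the code assumes TensorFlow's bundled Keras:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(20,)),                # 20 input features (assumed)
    layers.Dense(64, activation="relu"),     # one hidden layer
    layers.Dense(10, activation="softmax"),  # 10 output classes (assumed)
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.summary()  # prints each layer with its output shape
```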
2. Building networks with Keras (illustrated in the sketch below)
   a. Layers, stacking, and data transformation
   b. Linking layers with activation functions
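The following sketch (hypothetical sizes, again assuming TensorFlow's Keras) illustrates this group: each layer is a transformation of the data it receives, and activation functions can either be attached inline or inserted as standalone Activation layers between transformations:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(100,)),            # 100-dimensional input (assumed)
    layers.Dense(32),                     # linear transformation: 100 -> 32
    layers.Activation("relu"),            # nonlinearity as a standalone layer
    layers.Dense(16, activation="tanh"),  # equivalent inline form: 32 -> 16
])
model.summary()  # each row shows how a layer reshapes the data
```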
3. Code examples (hedged sketches below)
   a. A convolutional network for object recognition
   b. An LSTM network for document classification
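A hedged sketch of what the object-recognition example might look like: a small convolutional classifier. The input shape (32x32 RGB images) and the 10 classes are assumptions in the spirit of CIFAR-10-style data, not the talk's exact example:

```python
from tensorflow import keras
from tensorflow.keras import layers

cnn = keras.Sequential([
    keras.Input(shape=(32, 32, 3)),                # 32x32 RGB images (assumed)
    layers.Conv2D(32, (3, 3), activation="relu"),  # feature extraction
    layers.MaxPooling2D((2, 2)),                   # spatial downsampling
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),                              # image features -> flat vector
    layers.Dense(10, activation="softmax"),        # 10 object classes (assumed)
])
cnn.compile(optimizer="adam",
            loss="sparse_categorical_crossentropy",
            metrics=["accuracy"])
```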
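And a corresponding sketch for the document-classification example: an LSTM reading token sequences. The vocabulary size (10,000), padded sequence length (200), and binary label are illustrative assumptions:

```python
from tensorflow import keras
from tensorflow.keras import layers

lstm = keras.Sequential([
    keras.Input(shape=(200,), dtype="int32"),          # padded token ids (assumed)
    layers.Embedding(input_dim=10000, output_dim=64),  # token ids -> vectors
    layers.LSTM(64),                                   # sequence -> fixed vector
    layers.Dense(1, activation="sigmoid"),             # binary document label
])
lstm.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```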
4. Networks as data transformations (see the sketch after this group)
   a. A general framework for thinking about network layers: input, feature-extraction, and classification layers
   b. Intermediate layers as "data transformers"
   c. Implementing "partial networks" (e.g., networks with feature extractors but no classification layers) to better understand data transformations and for debugging
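Finally, a sketch of the "partial network" idea from item 4c: build a full model, then a second model that shares its layers but stops before the classification layer, exposing the intermediate features for inspection or debugging. The model and shapes here are hypothetical, not the talk's exact code:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

full = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),                # grayscale images (assumed)
    layers.Conv2D(16, (3, 3), activation="relu"),  # feature extraction
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),        # classification layer
])

# Partial network: same input, but output taken before the classifier.
features_only = keras.Model(inputs=full.inputs,
                            outputs=full.layers[-2].output)

batch = np.random.rand(4, 28, 28, 1).astype("float32")  # 4 fake images
print(features_only.predict(batch).shape)  # (4, 10816): features, no class scores
```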