I present the application of a relatively new method in unsupervised deep learning, called the Variational Auto-encoder, to the creation of a latent feature space for image data. Using this method, I'll show how we can transform images of women's tops into a latent style space and then sample that space to generate new synthetic images of tops.
It's a harsh reality in the world of data science that some of the densest, most informative data is contained in unstructured sources like text and images. As humans, we have evolved both the senses and the neural machinery to interpret these sources with relative ease, so it's only natural that we rely on them to store and communicate information. Interestingly, the very ease with which we parse text and images becomes a monumental obstacle when we try to formulate algorithms to do the same. If we can perform these seemingly complex tasks without conscious effort, what hope do we have of describing such a process to a computer?
With recent advances in deep learning for image processing, tasks that once demanded highly specific feature engineering are now being generalized by deep-learning models. Most of these efforts have focused on supervised learning tasks such as image classification. At Stitch Fix, we have been experimenting with unsupervised deep-learning models, such as the Variational Auto-encoder, to build latent feature spaces that describe style quantitatively.
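For orientation (this is standard background on the method, not anything specific to our models): a Variational Auto-encoder pairs a probabilistic encoder $q_\phi(z|x)$, which maps an image $x$ to a distribution over latent codes $z$, with a decoder $p_\theta(x|z)$ that maps codes back to images. Both are trained jointly by maximizing a variational lower bound on the data likelihood, as introduced by Kingma and Welling:

$$
\mathcal{L}(\theta, \phi; x) = \mathbb{E}_{q_\phi(z|x)}\left[\log p_\theta(x|z)\right] - D_{\mathrm{KL}}\left(q_\phi(z|x) \,\|\, p(z)\right)
$$

The KL term pulls the encoded distribution toward a simple prior $p(z)$, typically a standard normal, and it is precisely this property that lets us draw random points from the prior and decode them into brand-new images.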
I will present my work on applying Variational Auto-encoders to images of women's clothing and use the generative portion of the resulting model to show how this space can be used to create new styles. I'll then walk through a simple example of how to use a new Python module I've developed, called fauxtograph, to train an auto-encoder and randomly generate new images of a similar kind.
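To make the encode-sample-decode structure concrete before the fauxtograph walkthrough later in the post, here is a minimal, self-contained sketch of a Variational Auto-encoder. PyTorch is an assumption of convenience here, purely for illustration; it is not the framework behind fauxtograph, and the toy model operates on flattened grayscale images rather than our actual data:

```python
# A minimal VAE sketch for illustration only. PyTorch is an assumption of
# convenience; the post's actual example uses the fauxtograph module, shown
# later. Images are treated as flattened 64x64 grayscale vectors in [0, 1].
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyVAE(nn.Module):
    def __init__(self, input_dim=64 * 64, hidden_dim=400, latent_dim=20):
        super().__init__()
        # Encoder: image -> hidden -> parameters (mean, log-variance) of q(z|x).
        self.enc = nn.Linear(input_dim, hidden_dim)
        self.to_mu = nn.Linear(hidden_dim, latent_dim)
        self.to_logvar = nn.Linear(hidden_dim, latent_dim)
        # Decoder: latent code -> hidden -> reconstructed image p(x|z).
        self.dec_hidden = nn.Linear(latent_dim, hidden_dim)
        self.dec_out = nn.Linear(hidden_dim, input_dim)

    def encode(self, x):
        h = F.relu(self.enc(x))
        return self.to_mu(h), self.to_logvar(h)

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps keeps sampling differentiable w.r.t. mu, logvar.
        eps = torch.randn_like(mu)
        return mu + torch.exp(0.5 * logvar) * eps

    def decode(self, z):
        return torch.sigmoid(self.dec_out(F.relu(self.dec_hidden(z))))

    def forward(self, x):
        mu, logvar = self.encode(x)
        return self.decode(self.reparameterize(mu, logvar)), mu, logvar

def vae_loss(recon, x, mu, logvar):
    # Negative ELBO: reconstruction error plus KL divergence to a N(0, I) prior.
    bce = F.binary_cross_entropy(recon, x, reduction='sum')
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return bce + kld

# Once trained, new synthetic images come from decoding draws from the prior:
model = ToyVAE()
samples = model.decode(torch.randn(10, 20))  # ten 64x64 images, flattened
```

The real model in this post is more elaborate, but the moving parts are the same: an encoder into the latent space, a reparameterized sampling step, and a decoder back out to image space.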