Friday 9:00–10:30 in Track 1

Using GANs to improve generalization in a semi-supervised setting - trying it in open datasets

Andreas Merentitis, Carmine Paolino, Vaibhav Singh

Audience level:


In many practical machine learning classification applications, the training data for one or all of the classes may be limited. We will examine how semi-supervised learning with Generative Adversarial Networks (GANs) can improve generalization in these settings. The full approach, from training to model deployment, will be demonstrated using AWS Lambda and/or AWS SageMaker.


In this tutorial we will briefly introduce GANs and discuss how they can be used in a semi-supervised learning setup to improve generalization when only a few labeled examples are available. Specifically, using open source datasets as a basis, we will demonstrate how to train the discriminator of the GAN on a combination of real labeled images, real unlabeled images, and images artificially generated by the GAN. By combining these three sources of data instead of just one, the final model will generalize to the test set much better than a traditional classifier trained on only one source of data. The tutorial will also cover deployment options for the developed model, including managed deployment with AWS SageMaker and serverless deployment with AWS Lambda.
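
To make the three-source training of the discriminator concrete, here is a minimal sketch of one commonly used loss formulation. It is an assumption, not necessarily the one used in the tutorial: the discriminator outputs K class logits, the "generated" class has an implicit logit of 0, and the probability that an image is real is D(x) = Z / (Z + 1) with Z = sum(exp(logits)). All function names are hypothetical, and plain Python lists stand in for the output of a real network.

```python
import math

def log_sum_exp(logits):
    """Numerically stable log(sum(exp(l) for l in logits))."""
    m = max(logits)
    return m + math.log(sum(math.exp(l - m) for l in logits))

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def supervised_loss(logits, label):
    """Cross-entropy over the K real classes for one labeled real image."""
    return log_sum_exp(logits) - logits[label]

def unlabeled_real_loss(logits):
    """-log D(x) for one real unlabeled image, with D(x) = Z / (Z + 1)."""
    lse = log_sum_exp(logits)
    return softplus(lse) - lse

def generated_loss(logits):
    """-log(1 - D(G(z))) for one generated image."""
    return softplus(log_sum_exp(logits))

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def discriminator_loss(labeled, unlabeled, generated):
    """Sum of the three per-source mean losses (assumed equal weighting).

    labeled:   list of (logits, label) pairs for real labeled images
    unlabeled: list of logits for real unlabeled images
    generated: list of logits for images produced by the generator
    """
    return (
        mean(supervised_loss(lg, y) for lg, y in labeled)
        + mean(unlabeled_real_loss(lg) for lg in unlabeled)
        + mean(generated_loss(lg) for lg in generated)
    )

# Toy usage with K = 2 classes and hand-written logits (illustrative values only).
example = discriminator_loss(
    labeled=[([2.0, -1.0], 0)],
    unlabeled=[[1.5, 0.5]],
    generated=[[-2.0, -3.0]],
)
```

In a real training loop the logits would come from the discriminator network and the loss would be minimized with an automatic differentiation framework; the equal weighting of the three terms is a design choice that can be tuned.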
