Thursday October 28 7:00 AM – Thursday October 28 7:30 AM in Talks II

Anyone GAN do this: Solving the Minority Class Imbalance problem once and for all

Dipam Paul, Alankrita Tewari

Prior knowledge:
No previous knowledge expected


In this talk, our mission is to highlight and try to solve one of the most pressing problems in training and deploying Neural Networks: the 'Minority Class Imbalance Problem'. Our proposed technique, built on Deep Generative Models, not only addresses the problem but also shows how one can attain state-of-the-art accuracy.


"If I were given a penny every time someone told me that their Neural Network model was performing poorly because of lack of data, I'd probably be a millionaire today." - Anonymous.

In this talk, our mission is to highlight and try to solve one of the most pressing problems in training and deploying Neural Networks: the 'Minority Class Imbalance Problem'. Today, almost every Machine Learning or Deep Learning based Computer Vision task requires us, at some point, to train, tune and deploy a functional Neural Network pipeline designed to achieve a particular goal - be it classification, segmentation, or detection. Our proposed technique, built on Deep Generative Models, not only addresses the Minority Class Imbalance problem but also shows how one can attain state-of-the-art accuracy even on heavily imbalanced datasets with the help of a simple yet effective tweak.

  • What exactly is the problem? What causes it?

Neural Networks learn from the examples we feed them, which for Computer Vision tasks are images. When we expect the model to learn several attributes, the data is organized into 'classes'. Each class carries information about a particular subject or feature, and that information is what lets the model identify the class at test time. In Deep Learning, and especially in multi-class studies, we often see that one or more of these classes have very few examples for the model to learn from compared to the rest of the classes in the dataset. This imbalance can have multiple causes, including but not limited to a genuine shortage of data for the particular class and the unavailability of gold-standard labels.
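
To make the imbalance concrete, the following minimal sketch counts the examples per class and reports how many extra samples each class would need to match the largest one. The class names and counts are hypothetical and are not taken from the datasets used in the talk.

```python
from collections import Counter


def imbalance_report(labels):
    """Summarise how far each class is from the largest class."""
    counts = Counter(labels)
    largest = max(counts.values())
    return {
        cls: {
            "count": n,
            "share_of_largest": n / largest,
            "needed_to_balance": largest - n,
        }
        for cls, n in counts.items()
    }


# Hypothetical three-class dataset with two minority classes.
labels = ["healthy"] * 900 + ["benign"] * 80 + ["malignant"] * 20

report = imbalance_report(labels)
for cls, stats in report.items():
    print(cls, stats)
```

The `needed_to_balance` figure is the number of additional images that would have to be collected or generated for a class to reach parity with the largest class.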

  • Now, why is this a problem?


More examples = more signal for the trainable parameters = the model learns better = all good. Fewer examples = too little signal for the trainable parameters = inconsistent results.

Now that we have described the problem in some detail, let's move on to the proposed solution.

We propose to use Conditional Deep Convolutional Generative Adversarial Networks (cDCGAN).

  • How exactly do we do this? Why cDCGAN and not something else?

Generative Adversarial Networks, or GANs, are an architecture for training generative models, for example, deep convolutional neural networks for generating images.

Although GAN models are capable of generating new, random, plausible examples for a given dataset, there is no way to control the kinds of images that are produced other than trying to work out the complex relationship between the latent space input to the generator and the generated images.

The network consists of a generator and a discriminator. The generator model is responsible for producing new plausible examples that are ideally indistinguishable from real examples in the dataset. The discriminator model is responsible for classifying a given image as either real (drawn from the dataset) or fake (generated).

The models are trained together in a zero-sum or adversarial manner, such that improvements in the discriminator come at the expense of a reduced capability of the generator, and vice versa. GANs are effective at image synthesis, that is, generating new examples of images for a target dataset.
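
The zero-sum relationship can be illustrated with the standard GAN value function, V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], which the discriminator tries to maximise and the generator tries to minimise. The sketch below assumes this textbook objective; the abstract does not state which loss the talk actually uses, and the discriminator outputs are made-up numbers.

```python
import math


def value_function(d_real, d_fake):
    """Standard GAN value V(D, G) from discriminator probabilities.

    d_real: D(x) for real images, each in (0, 1)
    d_fake: D(G(z)) for generated images, each in (0, 1)
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# Hypothetical discriminator outputs on real images.
d_real = [0.9, 0.8, 0.95]

# A weak generator is easily spotted: D assigns its images low probabilities.
weak_generator = [0.1, 0.2, 0.05]
# A stronger generator fools D more often: the probabilities rise.
strong_generator = [0.5, 0.6, 0.45]

v_weak = value_function(d_real, weak_generator)
v_strong = value_function(d_real, strong_generator)
print(v_weak > v_strong)  # True
```

As the generator improves, the value the discriminator is trying to maximise goes down, which is the sense in which one model's gain is the other's loss.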

  • Final Thoughts

Now, there are two primary motivations for making use of the class label information in a cDCGAN model:

  • Improve the Generative Network.
  • Targeted Image Generation - especially for minority classes.

We will discuss ways to encode and incorporate the class labels into the discriminator and generator models. We will dive deep into a best practice: using an embedding layer followed by a fully connected layer with a linear activation that scales the embedding to the size of the image, and then concatenating the result into the model as an additional channel or feature map.
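
The label-encoding idea described above can be sketched in plain NumPy. This is only an illustration of the data flow (embedding lookup, linear layer, reshape, channel concatenation), not the speakers' implementation; the sizes, the random weights, and the function names `label_to_channel` and `condition_images` are all assumptions made for the example. In a real model the embedding table and the dense weights would be learned.

```python
import numpy as np

rng = np.random.default_rng(0)

N_CLASSES = 10        # hypothetical number of classes
EMBED_DIM = 50        # hypothetical embedding size
H, W, C = 28, 28, 1   # hypothetical image height, width, channels

# Randomly initialised stand-ins for learned parameters.
embedding_table = rng.normal(size=(N_CLASSES, EMBED_DIM))
dense_weights = rng.normal(size=(EMBED_DIM, H * W))
dense_bias = np.zeros(H * W)


def label_to_channel(labels):
    """Turn integer class labels into one image-sized feature map each."""
    embedded = embedding_table[labels]               # (batch, EMBED_DIM)
    scaled = embedded @ dense_weights + dense_bias   # linear activation
    return scaled.reshape(-1, H, W, 1)               # (batch, H, W, 1)


def condition_images(images, labels):
    """Append the label feature map to the images as an extra channel."""
    return np.concatenate([images, label_to_channel(labels)], axis=-1)


images = rng.normal(size=(4, H, W, C))
labels = np.array([0, 3, 3, 9])

conditioned = condition_images(images, labels)
print(conditioned.shape)  # (4, 28, 28, 2)
```

The discriminator would then see the image and its label map together, so it can judge whether an image is plausible for that specific class. The same trick applies on the generator side, where the label map is concatenated with the feature maps derived from the latent vector.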

In this contribution, we try to solve the long-standing problem of Class Imbalance to the greatest extent possible WITHOUT employing techniques such as Data Augmentation. We will walk through how to produce high-quality images from very few examples in order to enlarge a Minority Class not just quantitatively but also qualitatively, in terms of the features it brings to the table. We also compare existing methods with the proposed one, highlighting ablation studies and recorded results that were validated on multiple landmark datasets.

(1) Who is this talk for?

The target audience includes a wide range of individuals. My talk would be relevant to Deep Learning professionals and enthusiasts, Machine Learning or Data Science practitioners with a focus on Computer Vision and the use of AI, and lastly any wide-eyed high school or college kid like me who wants to relentlessly find the answers to all the mysteries in this field of study.

(2) What background knowledge or experience do you expect the audience to have?

The audience does not need specialized knowledge of any subject matter. I firmly believe a little keenness towards Machine Learning, Computer Vision, Data Science, or Deep Learning is sufficient to follow every aspect of the talk. Furthermore, to make good use of the limited time, we will try to answer in advance the questions we anticipate from the audience, in addition to the ones they actually ask.

(3) What do you expect the audience to learn or do after watching the talk?

I believe those who choose to attend this talk will gain critical insights into a long-standing problem that many of us encounter on a daily basis but fall prey to, as there are no standard solutions to it yet. The main motivation behind this talk is to make the audience aware of how Generative Adversarial Networks (GANs) can be integrated into a Data Science pipeline and customized to solve the Class Imbalance problem and, more importantly, of the approach one should follow in order to do so. We will also cover the future course of action and the inherent trade-offs that come with this solution.

Declaration: This project was implemented entirely in Python 🐍, drawing on graduate-level Linear Algebra.