Ch:14 Generative Adversarial Networks (GANs) with Math.

I have been writing stories about a lot of different algorithms so far, and all of them were discriminative algorithms. This story is all about generative models, so let me first quickly explain the difference.


Discriminative vs Generative

We all love (x, y) pairs: x being the inputs/features (images, text, speech, etc.) and y being the targets/labels.

(x, y) → (features, labels) / (inputs, targets)

Let’s think about classification in a supervised way.

Discriminative

Given inputs, we want to build a model that classifies them into the corresponding targets as accurately as possible.

Eg → Given these features, is this mail SPAM or not?

It learns the conditional probability distribution.

p(y|x) → “the probability of y given x should be maximum.”

So the model learns to predict the labels from the data; in other words, it learns the decision boundary between classes.

It does not really care about “how the training data is generated/distributed.”

Ex: logistic regression, SVMs
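To make the discriminative idea concrete, here is a minimal logistic regression sketch in NumPy that learns p(y|x) directly from (x, y) pairs. The two-class toy data below is invented purely for illustration.

```python
import numpy as np

# Toy discriminative model: logistic regression learns p(y=1 | x) directly.
# The features and labels are made up (think "spam" vs "not spam").
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 2)),   # class 0 ("not spam")
               rng.normal(+2, 1, (50, 2))])  # class 1 ("spam")
y = np.array([0] * 50 + [1] * 50)

w = np.zeros(2)
b = 0.0
lr = 0.1
for _ in range(200):                          # gradient descent on cross-entropy
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # p(y=1 | x)
    w -= lr * (X.T @ (p - y)) / len(y)       # gradient of the loss w.r.t. w
    b -= lr * (p - y).mean()                 # gradient of the loss w.r.t. b

p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
acc = ((p > 0.5) == y).mean()
print(acc)
```

Note that the model only learns a decision boundary; nothing in it describes how x itself is distributed.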

Generative

Given inputs, we want to build a model that understands the input distribution well enough to generate similar inputs, and their labels, from the targets.

Eg → Assuming this mail is SPAM, how likely are these features?

It learns the joint probability distribution.

p(x, y) = p(y|x) · p(x)

The model has to learn p(x).

It cares about “how the training data is generated/distributed.” It cares about how to get x.

Ex: Naive Bayes
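As a sketch of the generative view, here is a tiny Bernoulli Naive Bayes in NumPy that models the joint p(x, y) = p(x|y) · p(y) and classifies by picking the class with the largest joint probability. The 6-mail "dataset" with 3 binary word features is invented for illustration.

```python
import numpy as np

# Toy generative model: Bernoulli Naive Bayes learns p(x, y) = p(x | y) p(y).
# Rows are mails, columns are binary word-occurrence features; label 1 = spam.
X = np.array([[1, 1, 0],
              [1, 0, 1],
              [1, 1, 1],
              [0, 0, 1],
              [0, 1, 0],
              [0, 0, 0]])
y = np.array([1, 1, 1, 0, 0, 0])

prior = np.array([(y == c).mean() for c in (0, 1)])          # p(y)
theta = np.array([X[y == c].mean(axis=0) for c in (0, 1)])   # p(x_j = 1 | y)
theta = np.clip(theta, 1e-3, 1 - 1e-3)                       # avoid log(0)

def joint_log_prob(x):
    # log p(x, y) for each class: log p(y) + sum_j log p(x_j | y)
    return np.log(prior) + (x * np.log(theta)
                            + (1 - x) * np.log(1 - theta)).sum(axis=1)

x_new = np.array([1, 1, 0])
pred = int(np.argmax(joint_log_prob(x_new)))  # classify via the joint
print(pred)
```

Because the model has p(x|y) for each class, it can also *sample* new feature vectors for a given label, which is exactly what a discriminative model cannot do.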

Okay, I hope you got some idea, so let’s move on.

GAN Concepts

GANs are generative models that try to learn to generate the input distribution as realistically as possible.

A GAN’s end goal is to predict features given a label, instead of predicting a label given features.

Eg: if we take cat images as x, then the GAN’s goal is to learn a model that can produce realistic or believable cat images from the training data x.

A generative adversarial network (GAN) consists of 2 neural networks.

A neural network called the “Generator” generates new data points from some random uniform distribution. Its goal is to produce fake results that resemble the real inputs.

Another neural network, called the “Discriminator”, learns to tell the fake data produced by the Generator apart from the real data.

The main idea of GANs is to train these 2 networks to compete with each other, each with its own objective function.

→ The generator G tries to fool the discriminator into believing that its samples are real.

→ While the discriminator D gives a slap to the generator by identifying that the sample is fake.

→ Then, after taking the slap from the discriminator D, the generator G learns to produce samples that look more like the training data.

→ And this process is repeated for a while, or until a Nash equilibrium is found.

1. Mode collapse

Mode collapse happens quite often, and there are some ways to prevent it from happening, #willdiscusshortly.

2. Vanishing gradient

This is a common problem in deep neural networks in general, and it gets worse here because the gradient at the discriminator not only flows back through the discriminator network but also flows back to the generator network as its feedback.
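A quick numeric sketch of why the generator’s signal can vanish: when the discriminator confidently rejects a fake sample, d(fake) ≈ 0, and the gradient of the original minimax generator loss log(1 − d) with respect to the discriminator’s logit is −d ≈ 0, so almost nothing flows back to G. The commonly used non-saturating loss −log d keeps the gradient large in exactly that regime. (The logit value below is just an illustrative number.)

```python
import numpy as np

sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

t = -8.0                      # discriminator logit: strongly rejects a fake
d = sigmoid(t)                # d(fake) is close to 0

# Gradient of the original minimax generator loss log(1 - d) w.r.t. t:
grad_saturating = -d          # ~ 0: almost no learning signal reaches G
# Gradient of the non-saturating loss -log d w.r.t. t:
grad_nonsaturating = d - 1.0  # ~ -1: strong signal even when D is confident

print(abs(grad_saturating), abs(grad_nonsaturating))
```

This is one reason GAN implementations usually train G with −log d(fake) rather than the loss from the original minimax formulation.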