Other Forms of Deep Learning Tutorial

Welcome to the ninth lesson, ‘Other Forms of Deep Learning’, of the Deep Learning Tutorial, which is a part of the Deep Learning (with TensorFlow) Certification Course offered by Simplilearn. This lesson explains Reinforcement Learning, the working of Generative Adversarial Networks, and the functionality of autoencoders.

Let us begin with the objectives of this lesson.

Objectives

After completing this lesson on Other forms of Deep Learning, you’ll be able to:

Explain the working principle and use cases of Reinforcement Learning.

Describe how Generative Adversarial Networks (GANs) work.

Elaborate on the functionality of an autoencoder and its various types.

Reinforcement Learning

Reinforcement Learning teaches desirable behavior by assigning rewards to the right actions. It trains a model to make decisions about future actions. Examples include robotics, gaming, and driverless cars.

Working Principle

The neural network is trained by awarding a positive incentive for achieving the desired behavior. The agent, in this case, learns to take actions that maximize the cumulative reward earned over a sequence of such actions.

The steps followed are:

Agent observes the environment

Takes actions such that certain long-term rewards are maximized

Adjusts actions based on rewards
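The observe–act–adjust loop above can be sketched with a simple two-action bandit. This is an illustrative example, not from the course material: the reward probabilities are made up, and the agent keeps a running value estimate per action that it adjusts after every reward.

```python
import random

random.seed(0)

# Two actions; action 1 pays a reward more often (hidden from the agent).
true_reward_prob = [0.2, 0.8]
values = [0.0, 0.0]   # agent's value estimate per action
counts = [0, 0]
epsilon = 0.1         # exploration rate

for step in range(5000):
    # Observe / choose: explore occasionally, otherwise act greedily.
    if random.random() < epsilon:
        a = random.randrange(2)
    else:
        a = 0 if values[0] > values[1] else 1
    # The environment returns a reward for the chosen action.
    r = 1.0 if random.random() < true_reward_prob[a] else 0.0
    # Adjust the estimate toward the observed reward (incremental mean).
    counts[a] += 1
    values[a] += (r - values[a]) / counts[a]

# After training, the agent prefers the action with the higher payoff.
```

Over many steps the value estimates approach the true reward probabilities, so the greedy choice settles on the better action.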

Training a dog to fetch a ball is an example of this. You cannot control the dog's behavior directly, but you can reward it with food every time it fetches the ball. Over time, the dog learns which behavior earns rewards.

It is used extensively in gaming applications as well as driverless cars.

The agent learns the strategy, or policy (choice of actions), which maximizes its rewards over time.

Simple reward feedback is provided to the agent to learn its behavior. This is called a reinforcement signal.

The agent interacts with the environment.

S = set of states the environment is in

A = set of actions that the agent can take

Each time the agent performs action “a” in the state “s”, it receives a real-valued reward “r”.

The agent’s task is to learn a control policy π: S → A that maximizes the expected sum of these rewards, with future rewards discounted exponentially by their delay.

The immediate rewards have a higher weight compared to the future rewards.
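A minimal sketch of this weighting, assuming a discount factor gamma of 0.9 (a common but arbitrary choice for this example):

```python
# Discounted return: a reward received sooner contributes more than the
# same reward received later, because it is multiplied by gamma**t.
def discounted_return(rewards, gamma=0.9):
    return sum(gamma ** t * r for t, r in enumerate(rewards))

early = discounted_return([10, 0, 0])   # reward arrives immediately
late = discounted_return([0, 0, 10])    # same reward, two steps later
```

Here `early` evaluates to 10.0 while `late` evaluates to 0.9² × 10 = 8.1, which is why immediate rewards carry more weight.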

USE CASES

Typical RL environments such as games have the advantage that huge amounts of data can be generated easily, unlike real-world supervised learning, which requires collecting large amounts of data from the real world.

Google used RL systems to achieve a 40% reduction in the amount of electricity used for cooling in their data centers.

In the next section, let us learn about generative adversarial networks.

Generative Adversarial Networks (GANs)

GANs are generative models that learn to produce new data. They are based on a game-theory scenario in which the generator network must compete against an adversary.

The idea of GANs is to produce output that contains new data similar to the training data. They do so by learning the underlying features of the training dataset. This gradual learning enables the network to produce output that looks similar to the training data.

GANs represent unsupervised learning, as there is no explicit desired output; they deal with unlabelled data.

Example: Millions of real images are taken as training data in a GAN to produce images similar to the training data, in other words, realistic images.

ADVANTAGES

The advantages of using GANs are:

The neural network learns what the real world looks like

Images produced by a GAN can be sharper than the training images

GANs can produce photorealistic samples for visualizing new interior or industrial designs, shoes, bags, and clothing items, or scenes for computer games

Let us discuss the working of GANs in the next section.

How Do Generative Adversarial Networks (GANs) Work?

The working of GANs can be explained in five steps as follows:

Step 1

In addition to the generator network (G), there is a second neural network, the discriminator network (D), which tries to classify whether an input image is real or generated.

For example, around 200 generated images and 200 real images are fed into the discriminator, which is trained to classify them as “real images” or “generated images”.

The discriminator network is usually a standard convolutional neural network.

Step 2

Backpropagation is done through both the discriminator and the generator to find how the generator's parameters can be changed to make its 200 images slightly more confusing for the discriminator.
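As a toy illustration of this alternating update (a sketch, not the course's implementation), the code below trains a one-parameter "generator" against a logistic "discriminator" on 1-D data. The data distribution, learning rate, and step count are all made up for the example; real GANs use deep networks and an autodiff framework instead of these hand-written gradients.

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Toy 1-D GAN: real data clusters around 4.0.
# Discriminator D(x) = sigmoid(w*x + b); the "generator" output is a
# single learned value theta, standing in for a full network.
w, b = 0.0, 0.0   # discriminator parameters
theta = 0.0       # generator parameter
lr = 0.05
history = []

for step in range(2000):
    x_real = random.gauss(4.0, 0.5)
    x_fake = theta

    # Train D to tell real from generated:
    # gradients of -log D(x_real) - log(1 - D(x_fake)).
    d_real = sigmoid(w * x_real + b)
    d_fake = sigmoid(w * x_fake + b)
    grad_w = -(1 - d_real) * x_real + d_fake * x_fake
    grad_b = -(1 - d_real) + d_fake
    w -= lr * grad_w
    b -= lr * grad_b

    # Backpropagate through D into the generator
    # (non-saturating loss -log D(G)); d(output)/d(theta) = 1.
    d_fake = sigmoid(w * x_fake + b)
    grad_theta = -(1 - d_fake) * w
    theta -= lr * grad_theta
    history.append(theta)

# The generator output drifts toward the real-data mean.
```

The key point mirrors Step 2: the generator's gradient flows through the discriminator, so each update nudges the generated value toward whatever the discriminator currently finds harder to reject.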

Step 3

In effect, the generator G tries to fool the discriminator D by generating fake images that are closer to real images, while the discriminator keeps trying to distinguish them.

Step 4

The training procedure for G is to maximize the probability of D making a mistake.
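Formally, this adversarial game is usually written as the minimax objective below (standard GAN notation, not spelled out in the lesson):

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
+ \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator D maximizes this value, while the generator G minimizes it by making D(G(z)) large, that is, by causing D to make mistakes.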

Step 5

In the end, the generator will output images that are indistinguishable from real images for the discriminator.

Summary

The generator produces samples which are close to real samples, and the discriminator tries to distinguish real samples from generated samples.

An autoencoder represents an identity function: the input is copied to the output, but the copy is approximate rather than precise. This forces the model to learn the most important features of the input.

Eventually, the generator learns to produce samples which look real to the discriminator. This helps in generating new art or new text.

Reinforcement Learning teaches agents what kind of actions are more desirable. This is achieved by incentivizing the agents with positive rewards for performing those actions.