

9 Answers

one epoch = one forward pass and one backward pass of all the training examples

batch size = the number of training examples in one forward/backward pass. The higher the batch size, the more memory space you'll need.

number of iterations = number of passes, each pass using [batch size] number of examples. To be clear, one pass = one forward pass + one backward pass (we do not count the forward pass and backward pass as two different passes).

Example: if you have 1000 training examples, and your batch size is 500, then it will take 2 iterations to complete 1 epoch.
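The arithmetic above can be sketched in a few lines of Python (a minimal illustration; the function name is made up for this example):

```python
import math

def iterations_per_epoch(num_examples, batch_size):
    """Number of forward/backward passes needed to see every training example once."""
    # ceil handles the case where the last batch is smaller than batch_size
    return math.ceil(num_examples / batch_size)

print(iterations_per_epoch(1000, 500))  # 2 iterations to complete 1 epoch
```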

The term "batch" is ambiguous: some people use it to designate the entire training set, and some people use it to refer to the number of training examples in one forward/backward pass (as I did in this answer). To avoid that ambiguity and make clear that batch corresponds to the number of training examples in one forward/backward pass, one can use the term mini-batch.

I'm confused. Why would you train for more than one epoch - on all the data more than once? Wouldn't that lead to overfitting?
– Soubriquet Oct 15 '16 at 13:35

@Soubriquet Neural networks are typically trained using an iterative optimization method (most of the time, gradient descent), which often needs to perform several passes on the training set to obtain good results.
– Franck Dernoncourt Oct 15 '16 at 15:54

Hmm...so is this the reason for using early stopping and a validation set when training?
– Soubriquet Oct 15 '16 at 16:03

But if there are a lot of training samples, say $1$ million, would just one epoch be enough? What do people typically do if the training set is very large? Just divide the training set into batches and perform one epoch?
– pikachuchameleon Jan 9 '17 at 16:45

@MaxPower - typically, the step is taken after each iteration, as Franck Dernoncourt's answer implied; that's what we do with the information from the backward pass. In mini-batch gradient descent with m iterations per epoch, we update the parameters m times per epoch.
– dan mackinlay Feb 17 '17 at 3:14

Epoch

An epoch describes the number of times the algorithm sees the entire data set. So, each time the algorithm has seen all samples in the dataset, an epoch has completed.

Iteration

An iteration describes the number of times a batch of data passes through the algorithm. In the case of neural networks, that means a forward pass and a backward pass. So, every time you pass a batch of data through the NN, you have completed an iteration.

Example

An example might make it clearer.

Say you have a dataset of 10 examples (or samples). You have a batch size of 2, and you've specified you want the algorithm to run for 3 epochs.

Therefore, in each epoch, you have 5 batches (10/2 = 5). Each batch gets passed through the algorithm, therefore you have 5 iterations per epoch.
Since you've specified 3 epochs, you have a total of 15 iterations (5*3 = 15) for training.
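The same bookkeeping can be written out directly (a minimal sketch of the arithmetic above; the variable names are just for illustration):

```python
num_samples = 10   # size of the dataset
batch_size = 2     # examples per forward/backward pass
num_epochs = 3     # full passes over the dataset

# batches (= iterations) per epoch: 10 / 2 = 5
batches_per_epoch = num_samples // batch_size

# total iterations across training: 5 * 3 = 15
total_iterations = batches_per_epoch * num_epochs

print(batches_per_epoch, total_iterations)  # 5 15
```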

Can you please explain if the weights are updated after every epoch or after every iteration?
– Inherited Geek Jul 8 '17 at 11:11

@InheritedGeek the weights are updated after each batch (i.e., after each iteration), not after each epoch.
– bhavin dhedhi Feb 3 at 14:31

@Bee No. Take for example 10000 training samples and 1000 samples per batch; then it will take 10 iterations to complete 1 epoch.
– bhavin dhedhi Feb 28 at 7:03

In addition to the previous comment, if your batch size is the same as the total number of training samples, then 1 epoch = 1 iteration.
– bhavin dhedhi Feb 28 at 7:54

@bhavindhedhi I think what Bee was asking is that in your example of 10000 total samples with 1000 per batch, you effectively have 10 total batches, which is equal to 10 iterations. I think that makes sense, but not sure if that's a proper way of interpreting it.
– Michael Du Apr 1 at 3:52

Many neural network training algorithms involve making multiple presentations of the entire data set to the neural network. Often, a single presentation of the entire data set is referred to as an "epoch". In contrast, some algorithms present data to the neural network a single case at a time.

"Iteration" is a much more general term, but since you asked about it together with "epoch", I assume that your source is referring to the presentation of a single case to a neural network.

Typically, you'll split your training set into small batches for the network to learn from, and let training proceed step by step over those batches, applying gradient descent to update the weights each time. All these small steps can be called iterations.

An epoch corresponds to the entire training set going through the entire network once. It can be useful to limit this, e.g. to fight overfitting.

You have a training data which you shuffle and pick mini-batches from it. When you adjust your weights and biases using one mini-batch, you have completed one iteration. Once you run out of your mini-batches, you have completed an epoch. Then you shuffle your training data again, pick your mini-batches again, and iterate through all of them again. That would be your second epoch.
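That shuffle/mini-batch/update cycle can be sketched as a bare-bones loop (a minimal illustration, not a full training implementation; `update_fn` is a hypothetical stand-in for the weight-and-bias adjustment):

```python
import random

def train(data, batch_size, num_epochs, update_fn):
    """Mini-batch training skeleton: each weight update is one iteration;
    one full pass over the (re)shuffled data is one epoch."""
    iterations = 0
    for epoch in range(num_epochs):
        random.shuffle(data)                       # reshuffle at every epoch
        for start in range(0, len(data), batch_size):
            mini_batch = data[start:start + batch_size]
            update_fn(mini_batch)                  # adjust weights and biases
            iterations += 1                        # one iteration completed
    return iterations

# 10 samples, batch size 2, 3 epochs -> 5 iterations/epoch, 15 in total
n = train(list(range(10)), 2, 3, lambda batch: None)
print(n)  # 15
```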

To my understanding, when you need to train a NN, you need a large dataset involving many data items. When the NN is being trained, data items go into the NN one batch at a time; each such pass is called an iteration. When the whole dataset has gone through, one epoch has been completed.
