with $X$ being the source sequence, $Y$ the true target sequence and $y_i$ the $i$-th target word. The numerator is the negative log likelihood, which is also the value of the loss function.

You want the perplexity to decrease during training and end up low: a low training perplexity means your model fits the training data well.
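As a rough illustration of the formula, the sketch below computes perplexity as the exponential of the negative log likelihood divided by the number of target words. It is not the toolkit's own code: the function name `perplexity` and the per-word probabilities are hypothetical, and natural logarithms are assumed.

```python
import math


def perplexity(token_log_probs):
    """Return exp(negative log likelihood / number of target words).

    token_log_probs holds the natural-log probability the model assigned
    to each true target word (hypothetical input for this sketch).
    """
    if not token_log_probs:
        raise ValueError("need at least one target word")
    # Numerator of the formula: the negative log likelihood.
    nll = -sum(token_log_probs)
    return math.exp(nll / len(token_log_probs))


# Made-up probabilities assigned to three true target words.
probs = [0.5, 0.25, 0.125]
ppl = perplexity([math.log(p) for p in probs])
print(ppl)
```

A model that assigns every target word a probability of 1/10 would get a perplexity of about 10 with this function, which is one way to read the value: roughly the number of equally likely words the model hesitates between at each step.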

At the end of each epoch, the logs report by default the validation perplexity, which is computed with the same formula but on the validation data. It shows how well your model fits unseen data. You can select other validation metrics with the -validation_metric option.