I have a time series regression prediction problem. So I divided the dataset into 3 parts:

training (first 70% of the time series data)

validation (from 70% to 85% of the time series data)

test set (last 15% of the data)
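The chronological 70/15/15 split described above can be sketched as follows (a minimal sketch; the fractions come from the question, everything else is illustrative):

```python
import numpy as np

def chronological_split(series, train_frac=0.70, val_frac=0.15):
    """Split a time series in temporal order: first 70% for training,
    the next 15% for validation, and the final 15% for testing."""
    n = len(series)
    train_end = int(n * train_frac)
    val_end = int(n * (train_frac + val_frac))
    return series[:train_end], series[train_end:val_end], series[val_end:]

data = np.arange(100)  # stand-in for the real time series
train, val, test = chronological_split(data)
```

Keeping the split chronological (no shuffling) matters here, since shuffling would let the model peek at future values.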

Then I trained the model for a number of epochs and used the `EarlyStopping` callback (a Keras callback) on the validation set. With early stopping, training is halted as soon as no improvement is detected on the validation set.
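The patience-based logic behind Keras's `EarlyStopping(monitor='val_loss', patience=...)` can be sketched in plain Python (the loss values and patience here are made up for illustration, not taken from the actual training run):

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the epoch index at which training would stop: the first
    epoch where validation loss has failed to improve by more than
    min_delta for `patience` consecutive epochs."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss   # improvement: remember it and reset the counter
            wait = 0
        else:
            wait += 1     # no improvement this epoch
            if wait >= patience:
                return epoch
    return len(val_losses) - 1  # never triggered: train all epochs

# Hypothetical validation-loss curve: improves for 3 epochs, then stalls.
stop = early_stopping_epoch([5.0, 4.0, 3.0, 3.1, 3.2, 3.3], patience=3)
```

In Keras itself, passing `restore_best_weights=True` additionally rolls the model back to the epoch with the best validation loss.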

Then calculated errors of the predictions on each dataset. Here are the results:

Training dataset Mean-Squared-Error = 921.4

Validation dataset Mean-Squared-Error = 1200.2

Test dataset Mean-Squared-Error = 300.0
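The per-split errors above are plain mean-squared errors; a minimal sketch of how such a value is computed (the targets and predictions here are made up):

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Hypothetical values, just to show the computation.
error = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])  # (0 + 0 + 4) / 3
```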

From this question I concluded that my model is behaving normally, because the training error is not higher than the validation error.

I know that my test dataset is easier to predict than the training and validation sets, which is the reason for the lower error. Does my model have a problem? Should the test and validation errors always be higher than the training error? Is my model good at generalizing?

My time-series-naive hunch is that the process is not stationary. If it isn't, the test set is not a representative subset, so the model may perform better or worse. I'm curious if this is the right intuition.
– Student, Nov 7 '19 at 0:53

As you can see, the validation error is not really improving as the epochs progress; the minimum validation error occurs at (iteration x).

If your training/validation graphs look something like this, then the model is probably doing well.

Additional Question:

Looking at your train/validation error values, it seems these are inverse-scaled errors that you calculated manually after training was complete, correct? If instead these are the raw error metrics reported by the model, then I would recommend scaling the data before feeding it to the network.
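The workflow suggested here (scale before training, then inverse-transform so errors are reported in the original units) can be sketched as follows. The arrays are made up; scikit-learn's `MinMaxScaler` is one common choice, and the key point is fitting it on the training portion only to avoid leakage:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical training and validation targets (original units).
train = np.array([[10.0], [20.0], [30.0], [40.0]])
val = np.array([[25.0], [35.0]])

scaler = MinMaxScaler()
train_scaled = scaler.fit_transform(train)  # fit on the training split only
val_scaled = scaler.transform(val)          # reuse the same scaling for val/test

# After the network predicts in scaled space, undo the scaling so the
# MSE is computed in the original units ("inverse-scaled errors").
val_back = scaler.inverse_transform(val_scaled)
```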

Answer to the question: Yes, exactly. I scaled the data before feeding it to the network and, as you guessed, I used inverse-scaled errors. I feel I should use some cross-validation on the dataset; by doing so, I'll make sure the best model is selected.
– hyTuev, May 15 '19 at 11:29
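For time series, ordinary shuffled k-fold cross-validation would leak future information into the training folds. scikit-learn's `TimeSeriesSplit` keeps every fold chronological; a minimal sketch (the data and number of splits are illustrative):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(20).reshape(-1, 1)  # stand-in for the real series
tscv = TimeSeriesSplit(n_splits=4)

for train_idx, test_idx in tscv.split(X):
    # Every training index precedes every test index, so each fold
    # trains on the past and validates on the future only.
    assert train_idx.max() < test_idx.min()
```

Each successive fold reuses all earlier data for training and validates on the next chronological block, which mirrors how the model would actually be deployed.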