
Suppose I want to build a time series in which each timestep is represented by a categorical array: the encoded sequences look like [[2, 0, 5], [3, 1, 4], ...], and each entry has a different number of possible values (categories).
For example, the first entry takes values 0-3, the second 0-1, and so on.

I want to train an LSTM model to predict the next timestep, so I one-hot encoded each entry, using the maximum number of classes as the width of every row.

For example, [2, 0, 5] becomes

[[0. 0. 1. 0. 0. 0.],
[1. 0. 0. 0. 0. 0.],
[0. 0. 0. 0. 0. 1.]]
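
To make the encoding concrete, here is a minimal sketch of how one timestep could be turned into the matrix above. The helper name `one_hot_timestep` and the width of 6 classes are assumptions for illustration (6 is simply the row width in the example); it uses plain Python lists rather than any particular library.

```python
def one_hot_timestep(values, num_classes):
    """One-hot encode each entry of a timestep into a row of width num_classes."""
    rows = []
    for v in values:
        if not 0 <= v < num_classes:
            raise ValueError(f"value {v} is outside [0, {num_classes})")
        row = [0.0] * num_classes  # every row shares the same (maximum) width
        row[v] = 1.0               # mark the position of the category
        rows.append(row)
    return rows


# Assumption: 6 classes, matching the row width of the example above.
encoded = one_hot_timestep([2, 0, 5], num_classes=6)
for row in encoded:
    print(row)
```

Each timestep therefore becomes a 3 x 6 matrix, so a batch of such targets is three-dimensional.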

Unfortunately, this representation raises the following error:

ValueError: Invalid shape for y: (1, 3, 5)

I have the following questions:

Is it possible to pass a 3d y target to Keras?

Should I instead define a single one-hot encoding that combines all possible triplets of categorical values? The problem is that I would then lose the correlation between occurrences of labels within the same category, because each possible combination of labels would become independent of the others.
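
To illustrate what the combined encoding in the second question would look like, here is a rough sketch that maps each triplet to a single class index (mixed-radix numbering). The function names and the cardinalities `(4, 2, 6)` are assumptions: 4 and 2 follow from the ranges 0-3 and 0-1 mentioned above, while 6 for the third entry is only a guess based on the example value 5.

```python
import math


def triplet_to_index(triplet, cardinalities):
    """Map a triplet of category values to one combined class index."""
    index = 0
    for value, size in zip(triplet, cardinalities):
        if not 0 <= value < size:
            raise ValueError(f"value {value} is outside [0, {size})")
        index = index * size + value  # mixed-radix positional encoding
    return index


def index_to_triplet(index, cardinalities):
    """Invert triplet_to_index, recovering the original category values."""
    values = []
    for size in reversed(cardinalities):
        index, value = divmod(index, size)
        values.append(value)
    return values[::-1]


# Assumed number of categories per entry (hypothetical for the third entry).
CARDINALITIES = (4, 2, 6)
num_combined_classes = math.prod(CARDINALITIES)

combined = triplet_to_index([2, 0, 5], CARDINALITIES)
restored = index_to_triplet(combined, CARDINALITIES)
print(num_combined_classes, combined, restored)
```

Under these assumptions the target becomes a single one-hot vector over all combinations, which is exactly where the concern above arises: the triplets [2, 0, 5] and [2, 0, 4] share two labels but end up as unrelated classes.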