How to implement Seq2Seq LSTM Model in Keras #ShortcutNLP

If you got stuck with a dimension problem, this is for you

Akira Takezawa · Mar 18

Keras: Deep Learning for Python

Why do you need to read this?

If you got stuck with seq2seq in Keras, I'm here to help you.

When I wanted to implement seq2seq for a chatbot task, I got stuck many times, especially on the dimensions of the input data and the input layers of the neural network architecture.

So here I will give a complete guide to seq2seq in Keras.

Let's get started!

Menu

1. What is the Seq2Seq Text Generation Model?
2. Task Definition and Seq2Seq Modeling
3. Dimensions of Each Layer from Seq2Seq
4. Preprocessing of Seq2Seq (in Chatbot Case)
5. Simplest preprocessing code, which you can use today!

1. What is the Seq2Seq Text Generation Model?

Seq2Seq is a type of Encoder-Decoder model using RNN.

It can be used as a model for machine interaction and machine translation.

By learning from a large number of sequence pairs, this model generates one sequence from the other.

Explained more simply, the definition of Seq2Seq is:

- Input: Text Data
- Output: Text Data as well

And here we have examples of business applications of seq2seq:

- Chatbot (you can find it on my GitHub)
- Machine Translation (you can find it on my GitHub)
- Question Answering
- Abstractive Text Summarization (you can find it on my GitHub)
- Text Generation (you can find it on my GitHub)

If you want more information about Seq2Seq, here I have a recommendation from Machine Learning at Microsoft on YouTube.

So let's take a look at the whole process!

— — — — —

2. LSTM layer of Encoder and Decoder (3D->3D)

The tricky arguments of the LSTM layer are these two:

1. return_state: whether to return the last state along with the output

2. return_sequences: whether to return only the last output of the output sequence, or the complete sequence

You can find a good explanation in Understand the Difference Between Return Sequences and Return States for LSTMs in Keras by Jason Brownlee.
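To make these two arguments concrete, here is a minimal sketch of how an encoder-decoder LSTM can be wired up in Keras. The sizes (VOCAB_SIZE, EMBEDDING_DIM, LATENT_DIM, MAX_LEN) are placeholder assumptions for the sketch, not values from this article:

```python
from keras.layers import Input, LSTM, Embedding, Dense
from keras.models import Model

VOCAB_SIZE = 10000    # assumed vocabulary size
EMBEDDING_DIM = 100   # assumed embedding (Word2Vec) dimension
LATENT_DIM = 256      # assumed LSTM hidden size
MAX_LEN = 20          # assumed unified sentence length

# Encoder: we only need the final hidden and cell states, so return_state=True
encoder_inputs = Input(shape=(MAX_LEN,))
enc_emb = Embedding(VOCAB_SIZE, EMBEDDING_DIM)(encoder_inputs)        # 2D -> 3D
_, state_h, state_c = LSTM(LATENT_DIM, return_state=True)(enc_emb)    # keep only the states
encoder_states = [state_h, state_c]

# Decoder: we need the output at every timestep, so return_sequences=True
decoder_inputs = Input(shape=(MAX_LEN,))
dec_emb = Embedding(VOCAB_SIZE, EMBEDDING_DIM)(decoder_inputs)
decoder_outputs, _, _ = LSTM(LATENT_DIM, return_sequences=True,
                             return_state=True)(dec_emb, initial_state=encoder_states)

# Fully connected layer: a softmax over the vocabulary at every timestep
decoder_outputs = Dense(VOCAB_SIZE, activation='softmax')(decoder_outputs)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
```

The encoder sets return_state=True because the decoder only needs its final states, while the decoder sets return_sequences=True because every timestep's output is fed into the softmax layer.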

0, 0, 1, 0, 1, 0, 0] ], dtype=int32)

After the data has passed through this fully connected layer, we use the Reversed Vocabulary, which I will explain later, to convert the one-hot vectors back into a word sequence.
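For reference, here is a minimal sketch of that reverse lookup; the tiny vocabulary below is a made-up toy example, not the one used in the article:

```python
import numpy as np

# Toy vocabulary built during preprocessing (word -> index)
word_to_index = {'<PAD>': 0, 'hi': 1, 'how': 2, 'are': 3, 'you': 4}

# Reversed Vocabulary: index -> word, used to turn predictions back into text
index_to_word = {idx: word for word, idx in word_to_index.items()}

def decode_sequence(softmax_outputs):
    """Convert a (timesteps, vocab_size) softmax/one-hot matrix into words."""
    token_ids = np.argmax(softmax_outputs, axis=-1)   # most likely word per timestep
    return [index_to_word[i] for i in token_ids if i != word_to_index['<PAD>']]
```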

— — — — —

4. Entire Preprocessing of Seq2Seq (in Chatbot Case)

Creating A Language Translation Model Using Sequence To Sequence Learning Approach

Before jumping into the preprocessing of Seq2Seq, I want to mention this: we need some variables to define the shape of our Seq2Seq neural network along the way of data preprocessing (a small sketch of how they are used follows below).

- MAX_LEN: to unify the length of the input sentences
- VOCAB_SIZE: to decide the dimension of the sentence's one-hot vector
- EMBEDDING_DIM: to decide the dimension of Word2Vec

— — — — —

Preprocessing for Seq2Seq

OK, please keep this information in mind; let's start to talk about preprocessing.
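As a first rough illustration of where these three variables appear, here is a small sketch; the values and the toy corpus below are assumptions for the example, not taken from the article:

```python
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences

MAX_LEN = 20          # assumed: unified length of the input sentences
VOCAB_SIZE = 10000    # assumed: dimension of the sentence's one-hot vector
EMBEDDING_DIM = 100   # assumed: dimension of Word2Vec

sentences = ['hi how are you', 'i am fine thanks']   # toy corpus

# Tokenize the corpus and cap the vocabulary at VOCAB_SIZE
tokenizer = Tokenizer(num_words=VOCAB_SIZE)
tokenizer.fit_on_texts(sentences)
sequences = tokenizer.texts_to_sequences(sentences)

# Pad every sentence to MAX_LEN so the encoder input has a fixed shape
padded = pad_sequences(sequences, maxlen=MAX_LEN, padding='post')
print(padded.shape)   # (number of sentences, MAX_LEN)
```

EMBEDDING_DIM only comes into play later, when the padded integer sequences are mapped into dense vectors by the Embedding layer.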