Data Science Stack Exchange is a question and answer site for Data science professionals, Machine Learning specialists, and those interested in learning more about the field. It only takes a minute to sign up.

3 Answers
3

RNN architectures like LSTM and BiLSTM are used in occasions where the learning problem is sequential, e.g. you have a video and you want to know what is that all about or you want an agent to read a line of document for you which is an image of text and is not in text format. I highly encourage you take a look at here.

LSTMs and their bidirectional variants are popular because they have tried to learn how and when to forget and when not to using gates in their architecture. In previous RNN architectures, vanishing gradients was a big problem and caused those nets not to learn so much.

Using Bidirectional LSTMs, you feed the learning algorithm with the original data once from beginning to the end and once from end to beginning. There are debates here but it usually learns faster than one-directional approach although it depends on the task.

Yes, you can use them in unsupervised learning too depending on your task. take a look at here and here.

Humans don’t start their thinking from scratch every second. As you read this essay, you understand each word based on your understanding of previous words. You don’t throw everything away and start thinking from scratch again. Your thoughts have persistence.

Traditional neural networks can’t do this, and it seems like a major shortcoming. For example, imagine you want to classify what kind of event is happening at every point in a movie. It’s unclear how a traditional neural network could use its reasoning about previous events in the film to inform later ones.

Recurrent neural networks address this issue. They are networks with loops in them, allowing information to persist.