Long Short-Term Memory Networks With Python

Develop Deep Learning Models for your Sequence Prediction Problems

The Long Short-Term Memory network, or LSTM for short, is a type of recurrent neural network that achieves state-of-the-art results on challenging prediction problems.

In this laser-focused Ebook written in the friendly Machine Learning Mastery style that you’re used to, finally cut through the math, research papers and patchwork descriptions about LSTMs.

Using clear explanations, standard Python libraries and step-by-step tutorial lessons you will discover what LSTMs are, and how to develop a suite of LSTM models to get the most out of the method on your sequence prediction problems.

Classical neural networks called Multilayer Perceptrons, or MLPs for short, can be applied to sequence prediction problems.

The application of MLPs to sequence prediction requires that the input sequence be divided into smaller overlapping subsequences called windows that are shown to the network in order to generate a prediction.

This can work well on some problems but suffers some critical limitations such as being stateless and having a fixed number of inputs and outputs.
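The windowing step described above can be sketched in a few lines of NumPy. This is only an illustration: the helper name `make_windows`, the toy sequence and the window size of 3 are assumptions made for the example, not code from the book.

```python
import numpy as np

def make_windows(series, window_size):
    """Split a 1-D sequence into overlapping input windows and next-step targets."""
    series = np.asarray(series)
    if window_size >= len(series):
        raise ValueError("window_size must be smaller than the sequence length")
    # Each row of X is one window; the matching entry of y is the value that follows it.
    n_windows = len(series) - window_size
    X = np.array([series[i:i + window_size] for i in range(n_windows)])
    y = series[window_size:]
    return X, y

sequence = [10, 20, 30, 40, 50, 60]  # toy data, chosen for illustration
X, y = make_windows(sequence, window_size=3)
print(X)  # three windows of three values each
print(y)  # the value that follows each window
```

Notice that the window size fixes the number of inputs the MLP sees, and nothing is carried over from one window to the next. These are the limitations mentioned above.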

Promise of Recurrent Neural Networks

The Long Short-Term Memory, or LSTM, network is a type of Recurrent Neural Network (RNN) designed for sequence problems.

Given a standard feedforward MLP network, an RNN can be thought of as the addition of loops to the architecture. The recurrent connections add state or memory to the network and allow it to learn and harness the ordered nature of observations within input sequences.

The internal memory means outputs of the network are conditional on the recent context in the input sequence, not only on what has just been presented as input to the network.
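To make the idea of a loop that carries state concrete, here is a minimal sketch of a plain recurrent update in NumPy. It is a simple RNN step, not an LSTM, and the weights are random and untrained; the names `W_in`, `W_rec` and `run_rnn` are hypothetical and used only for this example.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n_features, n_hidden = 1, 4

# Random, untrained weights used only to illustrate the mechanics.
W_in = rng.normal(size=(n_features, n_hidden))   # input-to-hidden weights
W_rec = rng.normal(size=(n_hidden, n_hidden))    # hidden-to-hidden weights (the loop)
bias = np.zeros(n_hidden)

def run_rnn(sequence):
    """Feed a sequence one time step at a time and return the final hidden state."""
    state = np.zeros(n_hidden)
    for x_t in sequence:
        x_t = np.atleast_1d(x_t)
        # The new state depends on the current input AND the previous state.
        state = np.tanh(x_t @ W_in + state @ W_rec + bias)
    return state

# Two sequences that end with the same values but start differently.
state_a = run_rnn([0.1, 0.2, 0.9])
state_b = run_rnn([0.7, 0.2, 0.9])
print(state_a)
print(state_b)
```

Both sequences end with the same input, yet the final states are expected to differ because the earlier inputs are remembered through the recurrent connection. A stateless network shown only the last value could not tell the two sequences apart.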

In a sense, this capability unlocks sequence prediction for neural networks and deep learning.

Impressive Applications of LSTMs

We are interested in LSTMs for the elegant solutions they can provide to challenging sequence prediction problems.

Let’s look at 3 examples to give you a snapshot of the results that LSTMs are capable of achieving.

Automatic Image Caption Generation

Automatic image captioning is the task where, given an image, the system must generate a caption that describes the contents of the image.

What would I teach if I had to get a machine learning practitioner proficient with LSTMs in two weeks?

I had been researching and applying LSTMs for some time and wanted to write something on the topic, but struggled for months on how exactly to present it. The above question crystallized it for me and this whole book came together.

The above motivating question for this book is clarifying. It means that the lessons that I teach are focused only on the topics that you need to know in order to understand (1) what LSTMs are, (2) why we need LSTMs and (3) how to develop LSTM models in Python.

I developed a program to take you on the critical path:

From… a practitioner interested in LSTMs (e.g. you right now).

To… a practitioner that can confidently apply LSTMs (e.g. you after reading the book).

I want you to get proficient with LSTMs as quickly as you can. I want you using LSTMs on your project.

This also means not covering some topics, even topics covered by “everyone else”, like LSTM math.

This book is not for everyone…so is this book right for YOU?

Let’s make sure you are in the right place.

This book is for developers that know some applied machine learning and need to get good at LSTMs fast.

Maybe you want or need to start using LSTMs on your research project or on a project at work. This guide was written to help you do that quickly and efficiently by compressing years’ worth of knowledge and experience into a laser-focused course of 14 lessons.

The lessons in this book assume a few things about you, such as:

You know your way around basic Python.

You know your way around basic NumPy.

You know your way around basic scikit-learn.

For some bonus points, perhaps some of the below points apply to you (don’t panic if they don’t).

You may know how to work through a predictive modeling problem.

You may know a little bit of deep learning.

You may know a little bit of Keras.

This guide was written in the top-down and results-first machine learning style that you’re used to from Machine Learning Mastery.

This book is not a panacea…so what will YOU know after reading it?

This book will teach you how to get results as a machine learning practitioner interested in using LSTMs on your project.

After reading and working through this book, you will know:

What LSTMs are.

Why LSTMs are important.

How LSTMs work.

How to develop a suite of LSTM architectures.

How to get the most out of your LSTM models.

This book will NOT teach you how to be a research scientist and all the theory behind why LSTMs work. For that, I would recommend good research papers and textbooks. See the Further Reading section at the end of the first lesson for a good starting point.

Exactly What You Need to Know…14 carefully designed lessons to take you from Beginner to Practitioner

This book was designed to be a 14-day crash course into LSTMs for machine learning practitioners.

There are a lot of things you could learn about LSTMs, from theory to applications to the Keras API. My goal is to take you straight to getting results with LSTMs in Keras with 14 laser-focused lessons.

I designed the lessons to focus on the LSTM models and their implementation in the Keras deep learning library. They give you the tools to both rapidly understand each model and apply them to your own sequence prediction problems.

Each of the 14 lessons is designed to take you about one hour to read through and complete, excluding the extensions and further reading.

You can choose to work through the lessons one per day, one per week, or at your own pace. I think momentum is critically important, and this book was intended to be read and used, not to sit idle. I would recommend picking a schedule and sticking to it.

Book Structure for Long Short-Term Memory Networks With Python

The lessons are divided into three parts:

Part 1: Foundations. The lessons in this section are designed to give you an understanding of how LSTMs work, how to prepare data, and the life-cycle of LSTM models in the Keras library.

Part 2: Models. The lessons in this section are designed to teach you about the different types of LSTM architectures and how to implement them in Keras.

Part 3: Advanced. The lessons in this section are designed to teach you how to get the most from your LSTM models.

You can see that these parts provide a theme for the lessons with focus on the different types of LSTM models.

Lessons

Here is an overview of the 14 step-by-step tutorial lessons you will complete:

Each lesson was designed to be completed in about 30-to-60 minutes by the average developer.

Part I. Foundations

Lesson 01: What are LSTMs.

Lesson 02: How to Train LSTMs.

Lesson 03: How to Prepare Data for LSTMs.

Lesson 04: How to Develop LSTMs in Keras.

Lesson 05: Models for Sequence Prediction.

Part II. Models

Lesson 06: How to Develop Vanilla LSTMs.

Lesson 07: How to Develop Stacked LSTMs.

Lesson 08: How to Develop CNN LSTMs.

Lesson 09: How to Develop Encoder-Decoder LSTMs.

Lesson 10: How to Develop Bidirectional LSTMs.

Lesson 11: How to Develop Generative LSTMs.

Part III. Advanced

Lesson 12: How to Diagnose and Tune LSTMs.

Lesson 13: How to Make Predictions with LSTMs.

Lesson 14: How to Update LSTM Models.

You can see that each lesson has a targeted learning outcome. This acts as a filter to ensure you are only focused on the things you need to know to get to a specific result and not get bogged down in the math or near-infinite number of configuration parameters.

These lessons were not designed to teach you everything there is to know about each of the LSTM models. They were designed to give you an understanding of how they work and how to use them on your projects the fastest way I know how: by doing.

Discover 4 Different Sequence Prediction Models

There are 4 main types of sequence prediction models that you need to know.

Each of these model types is presented in the book with code examples showing you how to implement them in Python.

1. One-to-One Model

One-to-One Sequence Prediction Model

2. One-to-Many Model

One-to-Many Sequence Prediction Model

3. Many-to-One Model

Many-to-One Sequence Prediction Model

4. Many-to-Many Model

Many-to-Many Sequence Prediction Model
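One way to picture the four model types is by the number of time steps on the input and output sides. The sketch below assumes that reading of the names and the common [samples, time steps, features] array layout; the sizes are arbitrary values chosen for illustration, not taken from the book.

```python
import numpy as np

n_samples, n_features = 8, 1
n_in, n_out = 5, 3  # hypothetical lengths for the "many" sides

# Map each model type to (input shape, output shape),
# with arrays laid out as [samples, time steps, features].
shapes = {
    "one-to-one":   ((n_samples, 1, n_features),    (n_samples, 1, n_features)),
    "one-to-many":  ((n_samples, 1, n_features),    (n_samples, n_out, n_features)),
    "many-to-one":  ((n_samples, n_in, n_features), (n_samples, 1, n_features)),
    "many-to-many": ((n_samples, n_in, n_features), (n_samples, n_out, n_features)),
}

for name, (in_shape, out_shape) in shapes.items():
    X = np.zeros(in_shape)   # placeholder input data
    y = np.zeros(out_shape)  # placeholder target data
    print(f"{name:13s} input steps={X.shape[1]}  output steps={y.shape[1]}")
```

In a many-to-many problem the input and output lengths need not match, which is why two separate lengths are used here.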

Discover 6 Different LSTM Architectures

The LSTM network is the starting point. What you are really interested in is how to use the LSTM to address sequence prediction problems.

What matters is the way that the LSTM network is used as layers in sophisticated network architectures. The way that you will get good at applying LSTMs is by knowing about the different useful LSTM networks and how to use them.

The whole middle section of this book focuses on teaching you about the different LSTM architectures.

1. Vanilla LSTM

Memory cells of a single LSTM layer are used in a simple network structure.

2. Stacked LSTM

LSTM layers are stacked one on top of another into deep recurrent neural networks.

3. CNN LSTM

A Convolutional Neural Network is used to learn features in spatial input and the LSTM is used to support a sequence of inputs (e.g. video of images).

4. Encoder-Decoder LSTM

One LSTM network encodes input sequences and a separate LSTM network decodes the encoding into an output sequence.

5. Bidirectional LSTM

Input sequences are presented and learned both forward and backward.

6. Generative LSTM

LSTMs learn the structural relationships in input sequences so well that they can generate new plausible sequences.

About The Author

Hi, I'm Jason Brownlee.

I live in Australia with my wife and son and love to write and code.

I have a computer science background as well as Masters and Ph.D. degrees in Artificial Intelligence.

I’ve written books on algorithms, won and ranked in the top 10% in machine learning competitions, consulted for startups and spent a long time working on systems for forecasting tropical cyclones (yes, I have written tons of code that runs operationally).

I get a lot of satisfaction helping developers get started and get really good at machine learning.

I teach an unconventional top-down and results-first approach to machine learning where we start by working through tutorials and problems, then later wade into theory as we need it.

I'm here to help if you ever have any questions. I want you to be awesome at machine learning.

Download Your Sample Chapter

Do you want to take a closer look at the book? Download a free sample chapter PDF.

Enter your email address and your sample chapter will be sent to your inbox.

Check Out What Customers Are Saying:

I loved the book.

Jason teaches advanced machine learning and deep learning topics in a way that makes even a novice able to run models quickly and effectively. This book I purchased outlined multiple LSTM model types, and I was able to use this information to quickly get usable results.

Michael Grant, Student

Excellent and clear explanation of LSTMs along with nice examples and start to end projects.

I really enjoyed reading all the books in the super bundle and going through different examples with working Python code. Great work. I would highly recommend anyone struggling to understand machine learning and the hands-on working examples, this is the perfect resource, right from basic machine learning concepts to advanced levels.

Dr Girija Chetty, Associate Professor

Congratulations on writing a book about LSTMs that is both sophisticated and idiot proof.

That is **exactly** the combination I needed. I applaud you for starting with simple topics, like normalizing, standardizing and shaping data, and then taking the discussion all the way to performance tuning and the more complicated LSTM models, providing examples at every step of the way.

I love this book.

John Strong, Business Owner

The book starts with the following thought, stating its main purpose:

“If I had to get a machine learning practitioner proficient with LSTMs in two weeks (e.g. capable of applying LSTMs to their own sequence prediction projects), what would I teach?”

Previous to reading this book I had no experience with RNNs at all. The book is well written, in a concise way with no unnecessary wording, which makes it a delight to read. The book delivers on its purpose, and you go from zero to hero in two weeks, as promised. Lots of practical, concise and well-thought examples are given, which help you master the practice of this art quickly.

The author wisely chose to leave the theory out, which I have now had the time to dive into, and understand better after having the practical knowledge under my fingers. I highly recommend this book to anyone wanting to deliver the power of LSTMs in their next project.

Marco Bertani-Økland, Researcher

I really like this book and topics are well informed with examples.

Biswajit Samal, Software Engineer

You're Not Alone in Choosing Machine Learning Mastery

Trusted by Over 10,000 Practitioners

Absolutely No Risk with... 100% Money Back Guarantee

Plus, as you should expect of any great product on the market, every Machine Learning Mastery Ebook comes with the surest sign of confidence: my gold-standard 100% money-back guarantee.

100% Money-Back Guarantee

If you're not happy with your purchase of any of the Machine Learning Mastery Ebooks, just email me within 90 days of buying, and I'll give you your money back ASAP.

Can I get an invoice for my purchase?

Email me with the details of your order (order number or email address used to make the purchase) and details you would like to appear on the invoice (your name, company name and address).

I will create a PDF invoice for you and email it back.

How long do books take to ship?

There are no physical books, therefore no shipping is required.

All books are EBooks that you can download immediately after you complete your purchase.

Do you ship to my country?

There are no physical books, therefore no shipping is required.

All books are EBooks that you can download immediately after you complete your purchase.

I support purchases from any country via PayPal or Credit Card.

Can I have a discount?

I do offer a discount to students, teachers, and retirees.

Note: I only offer discounts on individual books, not on the bundles. This is because the bundles are already heavily discounted.

If you are a student, teacher or a retiree please contact me and ask for the discount.

Do you have any sales, deals, or coupons?

No.

I generally don't do sales.

If I do have a special, such as around the launch of a new book, I only offer it to past customers and subscribers on my email list.

I do offer book bundles that offer a discount for a collection of related books.

Can I get a refund?

Yes.

I am sorry to hear that you want a refund.

Please contact me directly with your purchase details (order number or email address used to make the purchase) and I will organize a refund.

Will you help me if I have questions?

Yes.

Please contact me anytime with questions about machine learning or the books.

One question at a time please.

Also, each book has a final chapter on getting more help and further reading and points to resources that you can use to get more help.

Do I need to be a good programmer?

No.

Not at all.

My material requires that you have a programmer's mindset of thinking in procedures and learning by doing.

You do not need to be an excellent programmer to read and learn about machine learning algorithms.

How much math do I need to know?

No background in statistics, probability or linear algebra is required.

I teach using a top-down and results-first approach to machine learning. You will learn by doing, not learn by theory.

There are no derivations.

Any equations presented are explained in full and are only provided to make the explanation clearer, not more confusing.

How much machine learning do I need to know?

Only a little.

If you are a reader of my blog posts, then you know enough to get started.

I do my best to lead you through what you need to know, step-by-step.

How long will the book take me to complete?

I recommend reading one chapter per day.

Some students finish the book in a weekend.

Most students finish the book in a few weeks by working through it during nights and weekends.

How are your books different to other books?

My books are playbooks. Not textbooks.

They have no deep explanations of theory, just working examples that are laser-focused on the information that you need to know to bring machine learning to your project.

My books are not for everyone; they are carefully designed for practitioners that need to get results, fast.

How are your books different from the blog?

The books are a concentrated and more convenient version of what I put on the blog.

I design my books to be a combination of lessons and projects to teach you how to use a specific machine learning tool or library and then apply it to real predictive modeling problems.

The books get updated with bug fixes, updates for API changes and the addition of new chapters, and these updates are totally free.

I do put some of the book chapters on the blog as examples, but they are not tied to the surrounding chapters or the narrative that a book offers and do not offer the standalone code files.

With each book, you also get all of the source code files used in the book that you can use as recipes to jump-start your own predictive modeling problems.

How are the 2 algorithms books different?

The book “Master Machine Learning Algorithms” is for programmers and non-programmers alike that learn through worked examples. It teaches you how 10 top machine learning algorithms work, with worked examples in arithmetic and spreadsheets, not code, that show how each model learns and makes predictions.

The book “Machine Learning Algorithms From Scratch” is for programmers that learn by writing code to understand. It provides step-by-step tutorials on how to implement top algorithms as well as how to load data, evaluate models and more. It has less on how the algorithms work, instead focusing exclusively on how to implement each in code.

The two books can support each other.

Is there a team or company-wide license?

No.

Due to abuse of the privilege, I only support purchases by individuals.

Is there a license for libraries?

No.

Sorry, I only support purchases by individuals.

Do you have videos?

No.

I only have tutorial lessons and projects in text format.

This is by design. I used to have video content and I found the completion rate much lower.

I want you to put the material into practice. I have found that text-based tutorials are the best way of achieving this.

After reading and working through the tutorials you are far more likely to apply what you have learned.

What operating systems are supported?

Linux, Mac OS X and Windows.

Can you be my mentor or coach?

No.

Thanks for asking. I would love to help, but I just don't have the capacity.

I try to help as many people as possible through my blog and books.

Can I purchase from Amazon (or elsewhere)?

No.

My books can only be purchased from my website.

The reason is that I am a small business and I want a direct relationship with you, my customer, so that I can offer personal support and send out updates about your book and new stuff I am working on.

I hope you can understand my rationale.

What if my download link expires?

It is possible that your link to download your purchase will expire after a few days.

This is a security precaution.

Please contact me and I will resend your purchase receipt with an updated download link.

Can I use your code in my own project?

Yes.

But, understand that all code was developed and provided for educational purposes only and that I take no responsibility for it, what it might do or how you might use it.