Bootstrap Karpathy RNN

There are plenty of resources on Machine Learning and Neural networks, and many interactive playgrounds where you can learn and test out ML concepts. If you still need references before going further – check out the basics

Traditional (feed-forward) neural networks treat all inputs and outputs as independent pieces, which, most of the time, is not the case. Imagine predicting next character in a word without knowing the preceding one

That’s where RNNs come in place, with output of subsequent task directly dependent on the previous

This IO coupling allows RNNs to thrive in handwriting and speech recognition, in addition to being able to easily train character-level language models

The last part, in layman’s terms – if we feed RNN with bunch of text, it’ll model probability distribution of each character in a sequence, allowing us to create new text snippets, char by char

Karpathy RNN

Mr Andrej Karpathy nicely documented everything in his blog, shared code behind it and provided detailed guide on how to prepare and work with his RNN implementation

Karpathy’s blog post covers awesome examples of the effectiveness of his neural network, accompanied by general guidelines on backpropagation, operating on sequences of vectors, as well as speeding things up with CUDA on nVIDIA GPUs

Bootstrapping whole enchilada, however, is a bit time consuming and ain’t promptless, so let’s automate things a bit

Other distros, ones not using apt, yum or zypper, can still use boostrap script by simply adding a flag ./setup-rnn.sh --ommit-dependency-installation and the list of system packages required will be displayed on start of it’s run

In any case, after 40-ish lines bash script completes it’s run, Torch framework is installed in /opt/torch/, while the repo ends up in /opt/rnn-karpathy

Digits

In order to be consistent and have credible numbers, bootstrap was performed three times on each of the four default EC2 Linux flavors

upper left to lower right: Amazon, SUSE, RHEL, Ubuntu

Two things are interesting here – time to prepare everything and time to first processed sample, and here is how distributions stack up:

install

checkpoint

Amazon Linux

00:14

00:06

SUSE Linux Enterprise Server

00:04

03:13

Red Hat Enterprise Linux

00:05

03:15

Ubuntu Server

00:05

03:04

sherpa blue – bootstrap time; olive – processing first checkpoint

Given times are average of 3 VMs being bootstrapped and finalized processing 1000 cycles, when 1st checkpoint is created

While the bootstrap time varied between 4 and 14 minutes, initial checkpoint generation lasted between 6 minutes in case of Amazon Linux and ranged between 184 and 195 mins for the remaining three

All instances in test were, of course, identical. 12x t2.micro VMs with same amount of RAM, vCPUs and storage (1/1/EBS)

Checkpoint times are way off chart!

For some reason, Torch runs miraculously slow on all EC2 instance flavors (not including AWS Marketplace), except Amazon Linux, which is again based on RHEL, so something is a bit 🐟y here. Either optimizations Amazon made are astonishing, or they favor their distro over competition, resource-wise, under the hood

Any way you prefer, Amazon Linux, in spite of slower bootstrap, is the logical choice when it comes to char-rnn training, with cycles up to 10x faster than the rest

Playtime

Post bootstrap, all that remains is to aim a terminal to /opt/rnn-karpathy/, provide a text source in data/input.txt (remove tinyshakespeare/) and fire up sudo th train.lua -data_dir /opt/rnn-karpathy/data -gpuid -1

Console will display train loss, epoch and cycle duration and this is a great time to thank the screen command and step away from the server, since modeling will take some time

Processing duration differs based on the dataset size, but for a very rough estimate – input.txt [1.06 MB], from char-rnn repo, on Amazon Linux instance type t2.micro – takes about 24 hours to complete

Amid modeling, assuming you’re anxious to see preliminary results, start another console and examine latest checkpoint file, as soon as cycles round up to 1k. Head out to /opt/rnn-karpathy/ and sample with sudo th sample.lua cv/* -gpuid -1

If, however, you want to test pre-provided input right away and generate some of the Kind Richards paragraphs, Andrej neatly provided beforementioned data/tinyshakespeare/input.txt