Perhaps the pickup is more of a grace note, but it is clear that the 9/8 time signature is not correct. The key signature works, and the IV-V-I resolution is good with the octave jump down.
Here is another, named “Quirch cathp’3b (The Nille L’ theys Lags Bollue’s)”:

http://www.eecs.qmul.ac.uk/~sturm/software/quirch1.mp3
Now we have a tune in 6/8, but the last measure is missing one eighth note and has an unnecessary natural. Like “Lisl’s Stis”, “Quirch” begins and resolves to the tonic specified by the key signature. I like how it fiddles around in either ii or VI before resolving.

These are, of course, short tunes that I have hand-selected from the virtually unlimited output of sampling the RNN. Here is an example of what the raw output looks like:
T:Lat canny I. the dlas.
M:C
L:1/8
Q:1/2=100
K:D
A>A|:F>DEA|F2dF|A/F/A/B/ AF|G/E/E FD |DDDG|Edec|defd |eged|fdgd|dcd2||
e|g2ef gef(e/c/)|ddfe fdAA|F3 A c4|efef g{e}d4 |
gfga afgf|eggb ad'eg|fgdB edAB|BedA BABg|fdde ddd:|

This format of music notation is called ABC, and it provides an extremely economical and interpretable representation of music (typically monophonic, though polyphony is possible too). For instance, here is Volume 1 of “A Selection of Scotch, English, Irish and Foreign Airs adapted to the Fife, Violin, or German-Flute”, published by James Aird in 1778. To create the training data for the RNN, I just combined all 1180 tunes in Aird’s six volumes, digitised by the great Jack Campin. I then trained an RNN on my CPU (on my slow MacBook Air) with the default parameters set by Andrej Karpathy (a sketch of the commands appears after this list):
-rnn_size size of LSTM internal state [100]
-num_layers number of layers in the LSTM [2]
-learning_rate learning rate [0.002]
-decay_rate decay rate for rmsprop [0.95]
-dropout dropout to use just before classifier. 0 = no dropout [0]
-seq_length number of timesteps to unroll for [50]
-batch_size number of sequences to train on in parallel [100]
-max_epochs number of full passes through the training data [30]
-grad_clip clip gradients at [5]
-train_frac fraction of data that goes into train set [0.95]
-val_frac fraction of data that goes into validation set [0.05]
-seed torch manual random number generator seed [123]
-print_every how many steps/minibatches between printing out the loss [1]
-eval_val_every every how many iterations should we evaluate on validation data? [1000]
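
For the record, here is a minimal sketch of those two steps, assuming Karpathy’s char-rnn and hypothetical file names for Campin’s transcriptions:

# Concatenate the six digitised volumes into the single
# input file that char-rnn expects (file names are hypothetical).
cat aird_vol1.abc aird_vol2.abc aird_vol3.abc \
    aird_vol4.abc aird_vol5.abc aird_vol6.abc > data/airs/input.txt

# Train on the CPU (-gpuid -1), leaving all other
# parameters at the defaults listed above.
th train.lua -data_dir data/airs -gpuid -1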

I sample the trained system using the CPU, some random seed, and the default parameters (a concrete command follows the list):
-sample 0 to use max at each timestep, 1 to sample at each timestep [1]
-length number of characters to sample [2000]
-temperature temperature of sampling [1]
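
That is, something like the following, with a hypothetical checkpoint name (-gpuid -1 again keeps the sampling on the CPU):

th sample.lua cv/lm_lstm_epoch30.00_1.2345.t7 -gpuid -1 -seed 123 -sample 1 -length 2000 -temperature 1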

It is remarkable that the RNN has learned some of the rules of ABC. There are some errors, however. For instance, the RNN produces the ABC of “Lisl’s Stis” as
T:Lisl's Stis.
M:9/8
L:1/8
Q:3/8=120
K:D
g/|a>f) d2b |gfe dAB |G3 B2c|A2G FAc| d3 D3:|

Clearly, the time signature should be 6/8, not 9/8. The abc2midi tool fails gracefully, filling in what is missing. Anyhow, most of the output of the RNN begins with the preface material and ends with the music. Increasing the temperature beyond 1, or decreasing it below about 0.45, produces a lot of gibberish, though.
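
To hear such a transcription, I just run it through abc2midi; a minimal sketch, with hypothetical file names:

# abc2midi warns about the bad bar but still writes a
# playable MIDI file, filling in the missing time.
abc2midi lisls_stis.abc -o lisls_stis.mid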

Here is one piece it generates that is longer than those above, sampled at a temperature of 0.45. This one I will add to my repertoire immediately: