The goal of this environment is to keep the pole balanced for as long as possible by moving the cart left and right. It has three actions (left, right, and no-op) and four inputs: the position and velocity of the cart, and the angle and angular velocity of the pole.

Q-Learning

The first model I implemented was basic Q-learning. Q-learning works by estimating the expected value of each action in the current state, including both the reward from the immediate next time step and the expected reward from the resulting state. In my implementation, the Q function is simply a matrix mapping (state, action) -> value; at each time step, the agent selects the action with the highest value in the current state, with some small probability of choosing randomly instead (this helps it explore and makes it less likely to get trapped in local minima).
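The update behind this is the standard tabular Q-learning rule. Here’s a minimal sketch (the function names and hyperparameter values are illustrative, not taken from my original implementation):

```python
import random

def choose_action(q, state, n_actions, epsilon=0.1):
    """Epsilon-greedy: usually pick the best-known action, sometimes explore."""
    if random.random() < epsilon:
        return random.randrange(n_actions)
    values = [q.get((state, a), 0.0) for a in range(n_actions)]
    return max(range(n_actions), key=lambda a: values[a])

def q_update(q, state, action, reward, next_state, n_actions,
             alpha=0.1, gamma=0.99):
    """One Q-learning step: nudge Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = max(q.get((next_state, a), 0.0) for a in range(n_actions))
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
```

Using a dictionary keyed on (state, action) instead of a dense matrix means unvisited entries cost nothing until they’re first encountered.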

And it learns! Quite quickly, in fact.

A graph showing progress of a Q learning model (x-axis is training iteration, y-axis is how many frames it survived)

This is comparatively simple to implement, but it has problems. In order to index into the matrix with (state, action) pairs, the state space must be discrete, so I binned the parameters provided by the environment (8 bins for each of the cart’s parameters and 10 bins for each of the pole’s parameters, for 8 × 8 × 10 × 10 = 6400 possible states). There’s a trade-off here: more bins give a finer-grained model that is better able to distinguish states of the environment, but it takes longer to train because there are more entries to fill in the matrix, and each one is encountered less frequently (the state space becomes sparser).
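The binning itself can be sketched like this (the value ranges below are placeholders; the environment’s actual bounds differ, and some quantities, like cart velocity, are technically unbounded and need to be clipped):

```python
def make_bins(low, high, n_bins):
    """Interior edges that evenly divide [low, high] into n_bins bins."""
    step = (high - low) / n_bins
    return [low + step * i for i in range(1, n_bins)]

def discretize(value, edges):
    """Index of the bin that `value` falls into (0 .. len(edges))."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)

def state_id(obs, all_edges):
    """Flatten one bin index per observation dimension into a single integer."""
    sid = 0
    for value, edges in zip(obs, all_edges):
        sid = sid * (len(edges) + 1) + discretize(value, edges)
    return sid
```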

How do we solve this problem? We need to go deeper.

Deep-Q

The only difference between basic Q-learning and Deep-Q learning is that the Q function is a neural network instead of a matrix, which allows the state space to remain continuous. The network is trained to regress toward the same target value used in the basic Q-learning model above, so in theory this model should perform better.
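To make the parallel with the tabular version concrete, here’s a sketch of how the regression targets for a Q-network are typically computed (a generic formulation, not my exact training code):

```python
def td_targets(rewards, next_q_values, dones, gamma=0.99):
    """Regression targets for the Q-network: r + gamma * max_a Q(s', a),
    with no bootstrap term on terminal steps."""
    targets = []
    for r, next_q, done in zip(rewards, next_q_values, dones):
        bootstrap = 0.0 if done else gamma * max(next_q)
        targets.append(r + bootstrap)
    return targets
```

The network is then trained with an ordinary regression loss between its predicted Q(s, a) and these targets.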

Unfortunately, this model wasn’t very practical. My Q function estimated the value of each action separately, which means the network runs several times per iteration. This makes it very slow to train, which in turn makes it miserable to iterate on. It was learning, but so slowly that, for the sake of time and my sanity, I moved right on to my next experiment.

Asynchronous Actor-Critic

The Actor-Critic model improves over Q-learning in two significant ways. First, it’s asynchronous: rather than having a single network, it parallelizes learning across multiple workers, each with its own copy of the environment. Each worker’s experiences are used to update the shared network asynchronously, which results in a dramatic improvement in training performance.

Secondly, it doesn’t just output the value (V(s), below), which corresponds to the value of the current state (similar to the Q value above, but without considering actions); it also outputs the policy (π(s), below), which is a distribution over all possible actions. Importantly, the network’s own value judgements are used to train its policy: after taking an action, it compares the value of the new state against its estimate and updates the policy for the previous state to reflect the difference.

We can make one more improvement: instead of using the value estimates and rewards directly, we can calculate the advantage. This represents how much higher the actual return was than the estimated value, which gives larger weight to situations where the network’s predictions are off and lets it exploit sparse rewards more effectively.
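A minimal sketch of the advantage calculation over one chunk of experience (a simplified n-step version; the names are illustrative):

```python
def discounted_returns(rewards, gamma=0.99, bootstrap=0.0):
    """Discounted return R_t for each step, working backwards from the end.
    `bootstrap` is the critic's value estimate for the state after the chunk."""
    returns = []
    running = bootstrap
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns

def advantages(rewards, values, gamma=0.99, bootstrap=0.0):
    """A_t = R_t - V(s_t): how much better things went than the critic expected."""
    returns = discounted_returns(rewards, gamma, bootstrap)
    return [ret - v for ret, v in zip(returns, values)]
```

The policy is pushed toward actions with positive advantage and away from those with negative advantage.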

I used a different environment than before, moving out of the parameterized space into something more complex: Breakout, one of the Atari environments.

The Breakout environment

My specific architecture used two convolutional layers to parse the images, followed by a single LSTM layer to capture relationships over time, and then a set of fully-connected layers for the policy and value outputs. You can see an early example of it above! This model actually worked very well for me– here’s the TensorBoard output from 20 hours of training:

Graphs showing a steady increase in length, reward, and value

Interestingly, the gradient and variable norms also increased over the course of training, which suggests I need to regularize the model more strongly, but given the duration of the test, I’m not sure I want to cross-validate that.

Unfortunately, due to an error which would very rarely fail to terminate a training episode, the workers all died at the end, and my computer ran out of memory as the episodes dragged on forever (alas). However, the model still seemed to be improving when it died, and I still have the checkpoint, so I plan to continue training and see just how good it eventually gets. I also made a few improvements over that first try above, which you can see in the sample below:

A black-and-white, cropped version of Breakout

As you can probably see, the images have been pre-processed for efficiency: cropped to remove the score, resized, and converted to grayscale. Additionally, I moved the training and sampling of the network to my GPU, which freed up the CPU to run the environment simulations. This more than doubled the training speed and is what made it reasonable to train overnight as I did above.
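The pre-processing amounts to a crop, a grayscale conversion, and a downsample. A sketch in plain Python (the crop offsets and stride here are illustrative guesses, not my exact values; a real implementation would use NumPy or OpenCV for speed):

```python
def preprocess(frame, top=34, bottom=16, stride=2):
    """Crop the score area, convert RGB to grayscale, and downsample.
    `frame` is a nested list: height x width of (r, g, b) tuples, values 0-255.
    The crop offsets are assumptions for illustration only."""
    cropped = frame[top:len(frame) - bottom]
    out = []
    for row in cropped[::stride]:
        out.append([
            # standard luminance weights for RGB -> gray, scaled to [0, 1]
            (0.299 * r + 0.587 * g + 0.114 * b) / 255.0
            for (r, g, b) in row[::stride]
        ])
    return out
```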

Option-Critic

One feature of the Atari environments caught my attention, though: frame skip. In the Breakout environment, each action has a random chance of being repeated 2, 3, or 4 times, which adds noise that the agents have to account for. I began to wonder, however, how an agent might perform if it could choose how many frames to execute each action for. This has been done before with a Q-learning model, but that approach simply adds an additional copy of each action for each possible duration, which dramatically increases the cost of choosing an action and the number of outputs and hyperparameters, and makes the model more sparse in general. What would be ideal is a model that can select the duration directly. Fortuitously, the Actor-Critic model is ideal for this!

The architecture proposed by Lakshminarayanan et al. in the paper above

We can simply add another output to the model which selects the number of times to repeat the chosen action.

This is remarkably similar to a concept called “options.” Options generalize decisions over time by outputting not a single discrete decision but two components: a distribution over possible actions (we have that already) and a termination condition. Ideally, the termination condition would be some subset of the possible states, so that the option could continue until one of those states is reached; however, this is difficult to accomplish with a neural network interpreting the states. Instead, our termination condition is a simple scalar: the number of time steps for which to repeat the chosen action.
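A sketch of how such a repeat head could be sampled alongside the policy head (the set of available repeat durations is an illustrative choice, not the one from the paper or my code):

```python
import math
import random

def softmax(logits):
    """Convert raw network outputs into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_action_and_repeat(policy_logits, repeat_logits,
                             repeat_choices=(1, 2, 4)):
    """Sample an action from the policy head and a duration from the repeat
    head; the agent then holds the action for that many frames."""
    action = random.choices(range(len(policy_logits)),
                            weights=softmax(policy_logits))[0]
    repeat = random.choices(repeat_choices,
                            weights=softmax(repeat_logits))[0]
    return action, repeat
```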

This is what I’ve been working on, but unfortunately, learning the options seems to be much more difficult than learning one-frame actions. My best results have come from bootstrapping the network with the pre-trained checkpoint from the actor-critic model above, but it simply always chose a duration of one frame and then continued learning exactly as before. If I reward the model for using longer durations, it always chooses the maximum and ignores the ball completely to cash in on that easy reward. Reward engineering is tough!

In the near future, I’ll continue working on this problem and see what I can make of it, but unfortunately with this sort of reinforcement learning model, sometimes it’s nigh-impossible to identify the source of a bug, and iteration takes so long that it’s difficult to find them all in a reasonable amount of time.

Next steps

My next step is really to move out of the OpenAI Gym and into the wild a little bit– or rather, into Unity. Unity is a very popular game development engine which recently released a toolkit for training machine learning agents in environments created in the engine. This opens up some really interesting possibilities, as the environments can be anything you can create in Unity, including physics and collision simulations, realistic lighting, and, most importantly, human interaction– I’m excited to explore them.

If you’re into reading papers, definitely take a look at this one; if not, Károly Zsolnai-Fehér of Two Minute Papers did an excellent video about it. This is one of the things that really inspires my love of machine learning, and it’s great to see awareness of this really interesting stuff on the rise.

An additional note: if you haven’t yet, check out some other blogs hosting generative humor, including AI Weirdness and Botnik Studios (not neural networks, but still hilarious). There’s also a (very new) subreddit for Botnik with similar predictive-text humor, so check that out.

Finally, since I’ve been getting a ton of new subscribers (welcome!), an update on my current projects. The first, which I’ll be writing up much more substantially some time in the next month, is a GIMP plugin interface for Deep Dream. Here’s an example of the kind of stuff you might see more of:

Tübingen with the “honeycomb” class deep-dreamed onto it

I’m really excited about this project finally coming together– once this is set up, I may look into the feasibility of porting it to a Photoshop plugin and/or creating a plugin for Neural-Style, since these tools really need to be put into the hands of artists.

The second is just learning more about reinforcement learning using the OpenAI Gym and (eventually) the Unity ML Agent framework. But look! It learns:

Asynchronous Actor-Critic RL model playing Breakout

It’s not very smart yet, but it’s trying its best. (I’ve since switched to grayscale for better performance)

In the meantime, I’ve found a set of web scrapers that pull data from BoardGameGeek, so once I get that set up to get natural language descriptions instead of just ratings data, expect some AI-generated board games.

[last-minute edit]

I should also mention, I did run char-rnn on the database of Urban Dictionary results, but then realized belatedly that this website is on my resume which I hand out to employers, so I decided to do a bit more thinking about whether I wanted to dip into the NSFW on this blog. I’ll keep you updated on that.

Sentiment Translator

A very common problem in natural language processing is sentiment analysis: attempting to analytically determine whether a given input has an overall positive or negative tone. For example, a positive string might be “I really enjoyed this movie!” whereas a negative one might be “it was boring, vapid, and dull.”

My idea is essentially this: given recent advancements in machine translation, it should be possible to translate between sentiments (as opposed to between languages). The difficulty presented itself fairly immediately: there is no dataset available with a one-to-one mapping between negative and positive sentiments. There are models for unpaired machine translation, but they’re still relatively unproven.

My first implementation was a fairly simple rule-based approach: try to remove or add inverters (the word “not,” for example) and replace words with their antonyms until the sentiment changes appropriately. This worked well for very simple cases, but just wasn’t smart enough to capture more complex relationships (for example, it loved to translate “good” to “evil,” even when “bad” would make a lot more sense). My new implementation takes a different approach, using (and abusing) a model loosely adapted from Daniele Grattarola’s Twitter Sentiment CNN.

The Data

I used the aclImdb dataset, a set of reviews scraped from the Internet Movie Database, split into four parts of ~12,500 reviews each: positive training, negative training, positive test, and negative test. Movie reviews work very well for this problem because they are essentially already annotated with sentiment in the form of the user’s star rating for the film.

In pre-processing, I split the reviews into sentences to reduce the length of each input and convert each review into word vectors (in my experiments, I used the pretrained GoogleNews 300-dimensional vectors). Unfortunately, due to the size of the input when converted into 300-dimensional vectors, I frequently ran out of memory during training. To reduce this issue, I only load the million most common words from the GoogleNews vectors.

The Model

The model is based on a set of one-dimensional convolutions over the series of word vectors, followed by a max pooling layer, ReLU, and a fully-connected layer. This is trained as a standard sentiment classifier, learning to predict the sentiment of a given input sentence.
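The convolution-plus-max-pool core can be sketched in plain Python to show the shapes involved (a real model would use a deep learning framework; this is just the forward pass for a single filter, with made-up low-dimensional vectors):

```python
def conv1d(vectors, filt, bias=0.0):
    """Slide a filter of length k over the sequence of word vectors,
    producing one activation per window (valid convolution)."""
    k = len(filt)
    dim = len(vectors[0])
    out = []
    for i in range(len(vectors) - k + 1):
        total = bias
        for j in range(k):
            for d in range(dim):
                total += filt[j][d] * vectors[i + j][d]
        out.append(total)
    return out

def relu_max_pool(activations):
    """Max-over-time pooling followed by ReLU: one feature per filter."""
    return max(0.0, max(activations))
```

Each filter produces one pooled feature regardless of sentence length, which is what makes this architecture convenient for variable-length inputs.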

At sampling time, however, we do something different. We run the input sentence through the classifier as normal, but we give the classifier a different target sentiment. We then find the gradient of the classifier’s loss with respect to the input word vectors. This may be familiar to anyone who’s implemented Google’s Deep Dream algorithm or worked with adversarial images. In essence, it gives us the direction in which to perturb the input vectors to cause the largest change in the sentiment. Additionally, the magnitude of the gradient for a given word roughly corresponds to how much that word contributes to the sentiment (and therefore, how important it is to change).

We hit another problem here. The space of possible words is discrete, but the word-vector space is continuous (and very high-dimensional, and thus sparse). How can we be sure these gradients are moving towards an actual word? To be honest, I’m not entirely sure. My first approach was to take multiple gradient steps, but this appeared to find minima in the sentiment that didn’t correspond to actual words in the input set. My second approach was to extend the gradient out as a ray from the original word and find the word vectors closest to this line. This worked a good deal better (specifically, it captures the “hate” <-> “love” relationship), but it still isn’t perfect: we still need a heuristic method to select which of the proposed word replacements to use, which in the end makes this method little better than the rule-based approach from my initial implementation.
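The ray-projection step can be sketched as follows (a toy illustration with made-up two-dimensional vectors and a hypothetical vocabulary; real word vectors are 300-dimensional):

```python
def distance_to_ray(point, origin, direction):
    """Distance from `point` to the ray origin + t * direction, t >= 0."""
    diff = [p - o for p, o in zip(point, origin)]
    d_norm_sq = sum(d * d for d in direction)
    # project onto the ray, clamping t so we only move along the gradient
    t = max(0.0, sum(a * b for a, b in zip(diff, direction)) / d_norm_sq)
    closest = [o + t * d for o, d in zip(origin, direction)]
    return sum((p - c) ** 2 for p, c in zip(point, closest)) ** 0.5

def candidate_replacements(word_vec, gradient, vocab, n=3):
    """Rank vocabulary words by how close they sit to the ray extended
    from the original word along the sentiment gradient."""
    scored = sorted(vocab.items(),
                    key=lambda kv: distance_to_ray(kv[1], word_vec, gradient))
    return [word for word, _ in scored[:n]]
```

In practice the original word itself would be excluded from the candidates.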

Conclusion

The biggest realization I came to was that when mapping a discrete space to a continuous space, the meaning of the intermediate values is not always intuitive. This is what we see when we simply perform gradient descent on the word vectors– the sentiment converges very nicely, but the resulting vectors have barely changed from their original values. In the domain of computer vision this is interesting, as it typically produces an “adversarial image”: an image that fools the classifier into misclassifying it with very high confidence while remaining indistinguishable from the original to a human. However, since we are hoping for some of the words to converge to different discrete values in the word space, it is less than ideal here.

Additionally, one unanticipated disadvantage of the lack of paired data was the inability to mathematically verify the accuracy of the translations– there was no ground truth to translate to.

Future Work

One thought I’ve had is to try something similar to CycleGAN, which performs “image translation” in an unpaired fashion through a combination of GAN loss and “reconstruction loss.” However, this still presents problems, as we cannot easily propagate gradients of the sentiment loss through the discretization into the word space.

It’s a tricky problem, but if anyone has any ideas, I’m interested.

Automated topic extraction with Latent Dirichlet Allocation (January 8, 2018)

During last semester, I became aware of a really interesting NLP algorithm called LDA, short for “Latent Dirichlet Allocation.” The goal of the algorithm is to create a set of “topics,” each of which represents a specific human-interpretable concept.

The core assumption of LDA is that documents are generated as follows:

For each document:

1. Generate a distribution over topics based on the model parameters.
2. For each word in the document:
   a. Sample a topic from the document’s topic distribution.
   b. Sample a word from that topic’s distribution over words.

The idea being that there is some latent variable (the topics) that informs word choice.
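This assumed generative process is easy to simulate directly. In full LDA the per-document topic distribution is itself drawn from a Dirichlet prior; in this sketch it is simply passed in as a given (all names and distributions are illustrative):

```python
import random

def generate_document(topic_word_probs, doc_topic_probs, n_words, rng=random):
    """Simulate LDA's assumed generative process for one document:
    sample a topic per word, then sample a word from that topic."""
    words = []
    topics = list(topic_word_probs)
    for _ in range(n_words):
        topic = rng.choices(topics,
                            weights=[doc_topic_probs[t] for t in topics])[0]
        vocab, probs = zip(*topic_word_probs[topic].items())
        words.append(rng.choices(vocab, weights=probs)[0])
    return words
```

Inference runs this story in reverse: given only the words, recover plausible topic distributions.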

Unfortunately, the document->topic and topic->word distributions are impossible to calculate exactly, but it is possible to approximate them using Gibbs sampling or variational inference– approximation techniques which allow us to converge to something close to the true solution (insofar as such a thing exists). Unfortunately, these have the side effect of being very slow, so the algorithm is not exactly the most efficient. Compared to training a neural network, though, it’s not actually unreasonable.

Here are some results from running the algorithm on a dataset of news articles, where each line is a discrete topic. See if you can figure out what each topic is about:

And finally, the SCP foundation. One thing to note is that I didn’t do as much data cleaning or parameter selection as I did with the previous datasets, so quality could be better. I’ll fine-tune the results later.

These are just a few examples but you can see how easily interpretable the results are with basically no human intervention or annotation. I’m hoping to apply this to some other datasets in the near future to see what sort of results I get.

Making art with neural style and fractals (January 7, 2018)

I recently attempted to see whether I could create art with Neural Style using only photos I’ve taken and fractal flames I’ve created with Fractorium and Apophysis. I must say, I’m very happy with the results! I generated the outputs using the Neural Style GUI and fine-tuned them in Photoshop– the latter was necessary to mitigate the black splotches still present in some images.

I’m planning to write up a post soon about a really interesting NLP algorithm, and I’ve been having fun recently training char-RNN on a database of Urban Dictionary entries, so stay tuned for that.

A gallery of neural-style fractal outputs

Steam Games (August 5, 2017)

While I work on my next big project, I decided to generate some random Steam game names. All of these are games that don’t actually exist:

Happy Panic

Unraveled Land

Mad Sharkness

The Heart’s Medicine

formic innocence

Heroes Over Europe – Full Mojo

Lovely Ventures

The Gravesable Moon Visual Novel

Hotline Miami 3: Back under Begins

Redemption Cemetery: Secret Of The Angel

Nightman: Trigger Element Space

Hellfrosted

Princess Maker Expedition

Gorescripted Addiction: Possibility

Mars Indian

The Ember Sigil

Train Simulator: Eternal Gun

5-Bit Soundtrack

Best Force

Happy Fantasy

Jackal Journey

Signal Flyng

And the winner for the most probable steam game name is:

The Walking Dead: Inside The Walking Dead: “The Secret of the Magic Crystal”

and also:

Steam Dev Days: Steam Edition

Espresso is the marshmallow of coffee (fun with Word2Vec) (May 30, 2017)

In exploring the recipe dataset, I decided to have some fun with Word2vec, an algorithm originally created at Google. For the layperson: the algorithm looks at the contexts in which a given word appears and learns a vector to represent each word, such that words appearing in similar contexts have similar vectors. On the recipe dataset, this means that, for example, the vectors for vodka and cognac are very close together, wheat and rye are very close, chocolate and butterscotch are very close, etc.

What’s really neat about this, though, is that it enables us to do some very interesting things. One of the properties of the vectors created is the ability to perform vector arithmetic, adding and subtracting these semantic vectors to create word analogies. Here are a few examples: (read a – b + c = d as “b is to a as c is to d”)

pie – pizza + calzone = blintz

That makes sense! Never would have thought of that to be honest.

banana – plantain + apple = blueberry

I guess an apple is just a big blueberry. Who knew

candy – marshmallow + coffee = espresso

I guess that makes sense. Weird though.

fish – tuna + chocolate = candy

Ok, tuna is a type of fish, chocolate is a type of candy. I guess I’ll let that one slide.

coffee – tea + lemon = orange

tea – coffee + lemon = lime

That’s interesting. It seems to think coffee is sweeter than tea.

coffee – knife + spoon = expresso

Interesting. In addition to the marshmallow of candy, it’s also the spoon of cutlery.

rasin – grape + fish = offal

Wow, ok, I guess it doesn’t like raisins.

brie – cheese + candy = meringues (closely followed by “fondant”)

Makes sense. Fancy, soft, light.

ribbon – bar + dome = tented

Let’s try the classic word2vec analogy:

king – man + woman = bruce

What. The next closest option is “retired.”
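Under the hood, each of these queries is just vector arithmetic plus a nearest-neighbor search by cosine similarity. A toy sketch with made-up 2-D vectors (real word2vec vectors are learned and much higher-dimensional):

```python
def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv)

def analogy(a, b, c, vectors):
    """Solve 'b is to a as c is to ?' by finding the word whose vector is
    closest (by cosine) to vec(a) - vec(b) + vec(c)."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))
```

The query words themselves are excluded from the candidates, which is also what standard word2vec tooling does.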

I’m going to continue experimenting with this. I’ve also been getting some really good results with the chef-rnn, so I’ll get back to you with more of that soonish as well.

Chicken soup (for the robot soul?) (May 24, 2017)

So I did a little more training with the chef-rnn at a lower learning rate to fine-tune it a bit, and got some shockingly good results, with some weird quirks. I really just couldn’t resist posting more recipes. These are so much fun to read and try to imagine cooking/eating. This is the closest I’ve ever gotten to edible food, and I’m kind of tempted to try some of these at some point. As usual, I’m going to start

preheat oven to 350 degrees. in a small bowl, combine chicken,
cheese, parsley, mustard, salt and pepper. spread evenly over chicken
mixture. place a layer of chicken on top. bake uncovered at 350 for
20 minutes. sprinkle with parmesan cheese. bake uncovered for 10
minutes or until cheese melts and cheese is melted.

Wow… that actually sounds pretty good. Replace “chicken” with “sausage” at some point so it fits the name, and this could be a decent meat pie. A warning: this version of the network loves chicken.

mix the cheese and seasonings together in a bowl. combine the cheese
with the cheddar cheese and stir into the cheese. place in a lightly
greased 9-inch square baking dish. sprinkle with the cheese, and
sprinkle with cheese. bake for 20 minutes. remove from oven and allow
to stand for 10 minutes before serving. serves 6 to 8.

Oh jeez, another recipe with a full cup of mayo. Why do you need a full cup of mayo (or salad dressing) in a tortellini sandwich?

remove skin from chicken breasts and discard. remove skin and bones
from chicken; set aside. in a large skillet, heat oil over medium heat.
add chicken and cook, turning the chicken for about 10 minutes or
until cooked through and cooked through. set aside.

in a small bowl, whisk together the remaining ingredients. pour over
the chicken. serve immediately.

Wow, you could cook this with only fairly minimal changes. But also, why are the shrimp cubed? Why is that necessary?

Okay, I could write about these all day. Let’s look at some weird edge cases. What happens when I turn the temperature all the way down? (to 0.1, the lowest value it will let me use)

in a large saucepan, combine the carrots, onion, celery, carrots,
celery, carrots, celery, carrots, celery, parsley, bay leaf, pepper,
and salt. bring to a boil, reduce the heat and simmer, covered, for
15 minutes. strain the stock through a fine sieve into a bowl. add
the chicken stock and the remaining ingredients and toss to coat.
serve immediately.

cut chicken into small pieces. combine chicken broth, soy sauce,
vinegar, sugar, salt and pepper in a small bowl. add chicken and
cover with plastic wrap. microwave on high for 10 minutes. remove
chicken from broth and set aside. combine cornstarch and water in a
small bowl. add to skillet. cook and stir until thickened. serve over
rice.

[after this it just starts repeating “1 ts cayenne pepper” over and over]

I, uh, wow. That first one sounds really good, and the second is a weird way to cook a chicken (with a sauce that is just cornstarch and water). I think it would need some modification to be edible (e.g. actually cook the chicken), but it seems totally reasonable otherwise.

What happens if I let the network overfit a bit? To test this, I sample not from the best checkpoint but from the most recent one. Typically, the validation loss drops very rapidly, then declines much more gradually, and eventually bottoms out or starts going back up– the latter means the network has overfit to the data and is reproducing the inputs instead of generalizing to new data.

Yeah, that seems reasonable. Too reasonable. Fortunately, the ingredients are great- Why does it have popcorn? I guess that would have to be a topping or something? All said, this one could be pretty good.

cream butter and sugar until light and fluffy. blend in egg and
vanilla. sift flour with baking powder and salt and add to creamed
mixture. mix in another 1/2 cup of chocolate chips. pour into greased
and floured 9″ round cake pan. bake at 350 f for 50-55 minutes. cool
in pan and cut into squares.

Ok, there’s no way that’s not a real recipe with a few modifications. That’s too good. There were a couple like this, so I’ll skip over them for now.

preheat oven to 375f. lightly butter a 13 x 9 x 2 inch baking
dish. sprinkle the shrimp with the salt and pepper. arrange the
artichokes in a single layer on the carrots. add the onion and garlic,
and sprinkle with the garlic powder. place the chicken breasts on top
and bake for 15 minutes or until they are soft and crunchy.

place the chicken in a serving bowl and top with the basil sprigs.
sprinkle with parsley and serve immediately.

combine the chocolate chips, sugar, corn syrup and salt in a heavy
saucepan over medium-high heat. cook over moderate heat, stirring
constantly, until the sugar is dissolved. remove from the heat and
stir in the butter until dissolved. stir in the vanilla and coconut.
spoon into a 9-inch springform pan. using a rolling pin, score the
cake layers in the pan. bake the cake in the middle of a preheated
350f oven for 50 minutes, or until a toothpick inserted in the center
comes out clean. cool on a wire rack for 10 minutes, then remove the
cake from the pan and cool completely.

in a small saucepan, combine the chocolate and water. cook, stirring
constantly until the chocolate is melted. remove from the heat and
stir in the coconut. spread the chocolate mixture over the cream
cheese mixture, and spread the remaining cream over the cake. top
with the remaining chocolate truffle mixture. refrigerate until
chilled.

to serve, cut into squares and serve with a sprinkle of confectioners
sugar.

directions: place potatoes in a heavy pot over ham heat. add the
s&p. and bring to a boil over high heat, continuing to toss the last 2
minutes of cooking.

put the clam juice, wine and vinegar into a medium saucepan and
add the rice. bring it as the grain cooks. bring to its boil over
medium heat, then pierce it off with a knife; over low heat, simmer
for 15 min. until the flavors have blended. strain the fruits and
reserve the liquid.

cut the cauliflower into bite sized pieces. wash these and peel
them.

after the couscous has cooled, the next day, rinse under cold water
and place in a dipping bowl.

meanwhile, rinse the chicken with a mixture of warm water and 1/4 tsp
salt.

drying liquid 1: sprinkle the breadcrumbs evenly over the skin, each
one. lightly brown the spareribs in it in a little oil in a roasting
pan and add remaining ingredients. cover and cook over low fire for 1
minute per side.

in a large saucepan place 2 or 3 cups beef bouillon cubes; set aside.
cook sausage over medium heat until tender. remove, and drain pieces;
place in a greased 9-inch baking dish. sprinkle margarine on bottom
and sides. repeat two more or more. cover and bake at 350 degrees for
5 minutes. combine mayonnaise, mustard, horseradish, pepper, curry
powder and salt, using the metal blade. on a baking sheet, place a mixt of
the eggs, salt and pepper, and the beaten egg abert to the meat
mixture. fill each scallop mixture with the egg mixture and then top
with 1 t of parmesan cheese. bake at 500 f for 45 mins. or until
beginning to bubble.

Ok, I think we can ignore the recipe and just go from the instructions here. This will definitely make some irregular beef.

Just to compare, here’s the non-overfit network at a high-temperature:

divide egg whites equally among oiled serving dishes. place your
finger and chopped mint in a hot water bath and refrigerate overnight.
when it comes out clean, toss it over in a 900′. in order to melt the
caramel thermometer, pour in the banana sauce. serve in ice cream
refrigerator. the recipe was doubled…yeast! place the kebab in a deep
1 quart or tiled container and let stand at room temperature. when
firm it is done at its way folks, but not enough to within another.
pour into container and freeze up to one month.

Mmm, maple pineapple jam sounds good. But the instructions… You have to refrigerate your finger overnight and melt a thermometer.

Later on we get cool stuff like:

cream butter and sugar, heat to medium. add banana, eggs, vanilla,
cinnamon, nutmeg and lemon rind; beat at low speed until fluffy,
whirl in dry ingredients, no longer, add cream cheese and beat until
smooth. stir in currants. drop by rounded tablespoonfuls, 2″ apart,
3 inches apart. be careful not to knock some of the rest of the
cookies.

blend sugar and vermicelli til smooth. place in ice-water bath; mix well
and pour over salad.

A… thing.

steam the oranges for 4 minutes. into a blender, combine the flour,
salt, and corn meal. process for 30 seconds. add the butter and
margarine, and pulse for about 5 seconds. add the pureed apples
and the margarine. process on low the bowl until the mixture is
combined evenly. add the remaining 1/2 cup mashed bananas.

Some kind of… fruit cake?

In summary, I’ve gotten really astounding results and the number of actually somewhat cookable recipes has gone up immensely (I think?). I’m definitely putting a lot more thought into the idea of cooking some of these and making an RNN cookbook.

CycleGAN and Chef-RNN update (May 22, 2017)
https://machinedaydreams.com/2017/05/22/cyclegan-and-chef-rnn-update/

There has been a development in image generation that I find absolutely fascinating. It’s based on Generative Adversarial Networks, which are a very powerful and promising model for image generation. A recent modification to this architecture, the Conditional GAN, has been making some waves, as it allows for the translation of one type of image into another. But it has the drawback of requiring databases of before-and-after images with roughly 1:1 correspondence, which are difficult to find, dramatically limiting the applications. Very recently, however, the folks at UC Berkeley made a few additional modifications that remove this requirement, creating the CycleGAN architecture.

I’ve downloaded the source and have been using the Flickr API to run some experiments with it, including trees <=> flowers, summer <=> fall, summer <=> winter, and landscape <=> desert. Legal disclaimer: I don’t own the rights to any of the images here. Unfortunately I neglected to get the photographers’ info when I scraped the images, but if anyone knows the creator of any of the original images, please let me know. Here are a few example images I’ve gotten so far:

Trees<=>flowers doesn’t work very well, which isn’t too surprising, but what it does is sometimes pretty entertaining. It found pretty early on that it can do decently by just inverting the colors, but eventually the behavior got more complex and it started making gross brown flowers:

summer<=>fall works just… absurdly well. It’s a bit scary, and some of the results are really pretty. With some parameter tuning and more (and better sanitized) data, this could be really cool! I’m definitely going to do some more experimentation with this.

summer<=>winter also works, though I couldn’t get it looking as good as the authors of the paper did. These examples are a bit cherry-picked, though– it never really learned how to fully get rid of snow, but it’s really good at color balance adjustments that make it feel way more wintery/summery.

The “desertifier” was largely unsuccessful. It never learned how to make things into sand like I’d hoped, but I didn’t train it for nearly as long as the others, and the success cases give me hope that it could learn:

Essentially what I’ve found is that the network doesn’t like to totally get rid of anything or hallucinate new things, even when it would make sense to do so. For example, if it gets rid of some water to make a desert, it might not be able to put it back, because of how the CycleGAN works: it needs to be able to reconstruct the original image. What it is really good at is changing colors and patterns; it’s less good at structural stuff. I’d bet that you might be able to improve this behavior with skip connections between the initial transformation and the reconstruction pass. This would be similar to the “u-net” encoder-decoder architecture described in the paper, except the connections between the encoder and the decoder would also connect the first step of the titular cycle to the second. It might defeat the purpose a bit, but as long as there’s still an information bottleneck it might help.
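For context, the cycle-consistency constraint I keep blaming for this behavior can be sketched in a few lines. This is a toy numpy version, with `G` and `F` standing in for the two generators; it is not the actual CycleGAN training code, just the shape of the loss term:

```python
import numpy as np

def cycle_consistency_loss(G, F, x):
    """L1 penalty for how badly F(G(x)) reconstructs x.

    G maps domain A -> B and F maps B -> A. In the real CycleGAN these
    are convolutional networks; here they are stand-in functions.
    """
    reconstruction = F(G(x))
    return np.mean(np.abs(reconstruction - x))

# Toy "generators": G brightens an image, F darkens it back.
G = lambda img: np.clip(img + 0.2, 0.0, 1.0)
F = lambda img: np.clip(img - 0.2, 0.0, 1.0)

img = np.random.default_rng(0).uniform(0.3, 0.7, size=(8, 8, 3))
loss = cycle_consistency_loss(G, F, img)  # near zero: F undoes G
```

The point is the reconstruction term: anything the first generator destroys (like the water in my desert examples) gets penalized on the way back, so the network learns to recolor and hide things rather than delete them.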

Robo-Chef strikes again (barbecue sauce edition)

Finally, I discovered that I never experimented with the sequence length I used to train my robo-chef, which puts a hard limit on how far back it can remember. Here are some recipes from a network trained with a much longer memory (3.5x longer).
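For anyone wondering why sequence length caps the network’s memory: char-RNNs are usually trained with truncated backpropagation through time, so gradients never flow past a chunk boundary. A toy illustration of the chunking (the corpus snippet and lengths are made up, not my actual training setup):

```python
def make_training_chunks(text, seq_len):
    """Split a corpus into (input, target) pairs of length seq_len.

    The target is the input shifted by one character. Gradients never
    span a chunk boundary, so seq_len is an upper bound on how far
    back the network can learn to remember.
    """
    chunks = []
    for i in range(0, len(text) - seq_len, seq_len):
        chunks.append((text[i:i + seq_len], text[i + 1:i + seq_len + 1]))
    return chunks

corpus = "combine all ingredients in a large saucepan. bring to a boil."
short = make_training_chunks(corpus, 10)   # hypothetical old setting
long_ = make_training_chunks(corpus, 35)   # ~3.5x longer memory
```

With the short setting the network literally never sees an example where an ingredient list and the instructions that use it fit in one gradient window, which would explain the two being independent.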

source: canadian living magazine, apr 95 presented in article by diana
rosenberg

That’s a lot of nuts! Also it forgot the cranberries. Once again, the instructions seem to be totally independent of the ingredients, but would make… some kind of cinnamon crumb pie? That actually sounds kind of delicious. I bet you could probably make something really tasty out of this one.

combine all ingredients except salt and pepper in a large saucepan.
bring to a boil, reduce heat and simmer for 1 hour. add cornstarch
and cook until thickened. stir in chicken and cook until thickened.
serve over rice.

And here we go. Crunchy Barbecue Sauce. That full 1/4 cup of worcestershire sauce. The tons and tons of cornstarch. Oh. Man. What is going on? Weirdly enough, the instructions seem spot-on (except the last two sentences get a bit weird). Here’s a condensed list of the ingredients just to see if it makes sense:

cut the chicken into small pieces. heat the oil in a large skillet over
medium heat. add the chicken and cook, turning the chicken frequently,
until the chicken is cooked through, about 5 minutes. remove the chicken
to a plate. remove the chicken from the skillet and keep warm. add the
chicken to the pan and stir-fry for 1 minute. add the chicken and
stir-fry for 1 minute. add the chicken and cook, stirring, for 1
minute. add the chicken and continue to cook for another 2 minutes.
return the chicken to the pan. add the chicken and stir-fry for 1
minute. add the chicken broth and cook for 1 minute. add the chicken
and stir-fry for 2 minutes, then add the sesame oil and stir until
combined. add the chicken and stir-fry for 1 minute. add the chicken
broth and cook, stirring constantly, until the sauce thickens. stir in
the chicken broth and cook for another 2 minutes. stir in the cornstarch
mixture and cook for 1 minute. stir in the chicken broth and cook,
stirring, until the sauce thickens. serve immediately.

serves 4.

from the files of al rice, north pole alaska. feb 1994

Did you remember to cook the chicken? How about the sesame oil? Ok, good. What about the chicken? Also this recipe is from “north pole, Alaska.”

combine all ingredients in a large saucepan. bring to a boil, reduce heat
and simmer, uncovered, for 1 hour, stirring occasionally. stir in
cornstarch mixture and cook 3 minutes more. stir in cornstarch mixture
until smooth. add salt and pepper to taste. serve over rice.

Yeah, ok, that one makes more sense. Actually that seems to be a legit barbecue sauce. Cool! Unfortunately it’s also the most boring barbecue sauce ever because there are only actually four ingredients and one of them is water.

in a large pot, bring the water to a boil. add the garlic and stir
for 2 minutes. add the spices and reduce the heat. simmer, covered,
for 10 minutes. remove the pan from the heat and store in an
airtight container.

makes about 1 cup.

Hahaha, another barbecue sauce! Nein, mein sauce! Too bad the instructions are basically boiled garlic. At this point we’re done with the barbecue sauce (alas).

place the rice in a saucepan and bring to a boil. add the chicken broth
and cook under medium heat for 5 minutes. remove the chicken from
the pot. add the chicken and cook for another 10 minutes. remove
the chicken from the pan and set aside.

add the chicken and the remaining ingredients and simmer 15 minutes.
skim off the excess fat and return the chicken to the pot.

in large skillet over medium heat, heat oil over medium heat; cook
garlic until soft, but not browned, about 15 minutes. add remaining
ingredients except noodles; cook for 2 to 3 minutes or until thickened,
stirring after 5 minutes. stir in raisins and simmer for 1 minute.
serve over chicken.

source: taste of home mag, june 1996

It’s a what? I guess if you go to a Junk-Joint and order a Salet Burger this is what you get. Also those ingredients… substitute ground hamburger for spam and you might be able to make a decent, but weird, burger. What’s really neat is that the instructions remember that this is supposed to be pasta (which the ingredients conveniently forgot).

combine flour, salt & pepper in a medium bowl. cut in margarine
until particles and can leave from tip. pat the mixture into a baking dish
and sprinkle with the cheese. bake, basting every 15 minutes, until
crust is golden brown, turning the cheese over after 35 minutes.
meanwhile, mix the egg and water in a small bowl. stir in the remaining
ingredients. pour grated cheese into the pie shell and bake for
20 minutes. remove from oven. sprinkle with roquefort on top.

What… what is this? The name is weird, the ingredients are… confusing (tortillas and spaghetti?) and the instructions are for… some weird cheese pie. I don’t really know what to make of this. I think I may need to call a chef to reconcile some of these recipes for me. That said, if you did manage to actually make this it might not be half bad if you made some pretty liberal substitutions and improvisations.

thaw and drain chicken (roll up the sides of the chicken). trim and
cut the chicken into strips. combine the chicken with the pork mixture
with the salt, pepper and thyme. mix everything together gently and
add to the chicken mixture. cover and refrigerate for at least 4 hours
or overnight.

…and then what? Wait, do you serve this raw? It put SO MUCH WORK into those ingredients (look at all that vinegar) and then forgot to actually cook the meat (arguably the most important step).

put first 4 ingredients in a bowl, mix well and stir into corn mixture.
in a 2-quart saucepan, heat the butter and 2 tablespoons of the
frankfurter seasoning and add the cooked rice and stir until the sauce
thickens and serves 4 to 6. makes about 2 1/2 cups

recipe by : cooking live show #cl8726

This one is actually so close. If only it actually included portobello mushrooms! Also, I want to emphasize: 8 ounces of fresh plantain leaves, frozen, then thawed. WHAT.

in a large bowl, cream margarine. add sugar, flour, vanilla, and eggs.
mix thoroughly. pour into prepared pan. bake 45 minutes or until
oblong starts to pull away from sides of pan and a wooden pick inserted
into center comes out clean. cool in pan on wire rack for 5 minutes.
remove cake from pans to wire rack. remove from cookie sheet to wire
rack and cool.

break up cooked peas. sauté garlic in oil until soft. stir in flour
and stir until smooth. combine all ingredients. cook and stir over medium
heat until sauce boils and thickens. cool
1/4 hour before serving.

chop all of this liquids into separate bowls. put 2 mayonnaise into a
large bowl. stir in the garbanzo beans until pureed. add the
pork and mix thoroughly. toss the spatula and toss well with the
first mixture. set aside for a few hours before coating.

remove the skin to a dinnworm enough to act a lasagle, starting in the
rosette. roast, uncovered, in a hot 350 f. oven for 15 minutes.
meanwhile, wash the lettuce, well, tuver peel the green palm. hold the
carrots very finely. but do not rinse them. after they are cooked
to the texture, place the sauce in another hot skillet largeroune,
and add enough hot water to cover it.

cover and simmer the soup until the rice is done, about 4-6 hours,
date to see dowel up. pour into hot sterilized jars to make sure your
amber liquid has reheated. chiln quickly if the barbeque side is
chilled and stored in a storage tin, loosely probably one day, watch
until chiles are soupy, thoughly barbecued, about 37 hours, or in the
refrigerator to marinate the meat or your beurre but may be made up to 2
days, covered.

cornstarch mixture: this sauce manie sirfully begin to should
be approximately 3 cups of cooking your toothpicks.

Oh jeez, too weird, too weird. There’s so much going on in here. What is a dinnworm? For that matter, what’s a lasagle? The first paragraph is 100% gold. Also, this recipe takes a long time. First, you have to stir some beans until pureed. You have to puree beans with a spoon. Then you cook the… dinnworm… for 15 minutes, then simmer the soup for 4-6 hours, then watch the “chiles” until they are “soupy, thoughly barbecued,” which takes 37 hours or up to 2 days.

1. place remaining ingredients in each of a bl. plate, cover and microwave
on 300of until cheese melts (about 15 seconds). serve at once, with
salsa.

Somehow the network made an OK sounding chip dip. I love that the instructions are basically “throw everything in a bowl and microwave.”

In summary, holy cow! This is so much more coherent than my previous experiments, and with only one night of training! I definitely need to dig into this a bit more. If you’ve somehow made it to the end of this post and want MORE, I’ve found another blog that does similar things. Check it out!

Double Jeopardy (February 20, 2017)
https://machinedaydreams.com/2017/02/20/double-jeopardy/

Because I was gone for most of the weekend (and because I didn’t want that awful jokebot being the first thing people see on this blog), I retrained the jeopardy network, to some amusing results. Take a gander:

THE SPORTING LIFE,$400,’The first of these in the U.S. was the first to control the state of Maryland’,a statue of Martin Luther King, Jr.

THE OLD WEST,$800,’This country is the only one of the world’s largest countries’,Chile

THE BIBLE,$400,’This composer of the 1999 film The Sound of Music was a star of the 1995 film The Sound of Music’,John Steinbeck

THE SOUTHERN DANDY,$400,’The name of this country is a synonym for a state of a country’,South Africa

John Steinbeck composed and starred in a 1999 remake of The Sound of Music, apparently. Also a statue of MLK Jr. took over Maryland, and Chile is bigger than I previously thought. The more you know! Increasing the temperature a bit:

A WORLD OF BEER,$400,’The first of these in the U.S. was a company in 1999′,a balloon

THE CARIBBEAN,$400,’The sea is the capital of this country’,Chile

THE FIFTH,$400,’The name of this body part is from the Latin for to strain into a string’,a contract

THE STORY OF O,$400,’This country is the second largest city in the world’,Martinique

BEAR SCREEN,$200,’This author of The Secret Garden was based on a 1989 film about a stripper who was a little boy’,James Bond

FICTIONAL DETECTIVES,$1000,’This 1954 film is set in the 1997 film seen here’,The Man Who Shot The Rainier

ART & ARTISTS,$1000,’This Southern country was a colony of the New York City Company in 1968 & is now a capital city’,Berkeley

The network continues to fail at geography. In addition to Chile being the largest country in the world, its capital is the sea. We also get some of James Bond’s sordid origins, and a 1950s sci-fi detective film, The Man Who Shot The Rainier, which I kind of want to watch. Stepping up the temperature some more, we learn about American history:

THE CIVIL WAR,$600,’In 1990 the Confederacy allowed this country to the U.S. Constitution’,South Africa

THE 1980s,$400,’This son of a president was a senator from 1948 to 1972′,James Buchanan

THE CIVIL WAR,$800,’This secretary of the American Idol was buried in the first festival of the State Department’,Harry Truman

BOOKS OF THE ’60s,$400,’This Seattle children’s team was based on a 1986 movie based on a series of books’,Stevie Wonder

COLLEGES & UNIVERSITIES,$1000,’On April 1, 1998 this country became the first black president to control the U.S. Army’,Japan

BIBLICAL PEOPLE,$400,’This man who resigned as a lawyer in 1994 was the first female president of the Confederacy’,John Adams

THE 1980s,$300,’This man who died in 1978 was a president of the Senate from 1934 to 1990′,Benjamin Franklin

THE NEW YORK TIMES TECH BIZ,$300,’This country’s 1969 exploits were completed in 1936′,Australia

STATE CAPITALS,$1000,’This capital city was founded in 1939 by the El Capitan of New York City’,Columbus

Okay, I snuck some Australian time travel in there. I also got this absolute gem:

THE FIFTH,$400,’This president was the first president to serve as president’,John F. Kennedy

Let’s keep this going:

WORLD CAPITALS,$1000,’The company that contains the largest island in the U.S.’,Canada

THE END,$1000,’It’s the body of water in the Confederacy that shares its name with a former capital’,Barcelona

STATE CAPITALS,$200,’The name of this capital city is a 2-word name for a pope’,Beijing

THE ASTRONAUT HALL OF FAME,$200,’In 1969 he was called the last world championship to win the major series title’,Alexander Hamilton

FAMOUS COLLEGE DROPOUTS,$200,’In 1998 this president was a commander of the Confederate Army’,Adolf Hitler

THE OLD WEST,$200,’In 1991 this American became the first woman to be a consul on the Moon’,Britney Spears

THE SOUTH PACIFIC,$400,’It’s the only country that makes it to the Atlantic Ocean’,Australia

WOMEN AUTHORS,$400,’In 1987 this TV heroine was a spin for the No. 1 hit Heart of Darkness’,John Paul Jones

THE ELEMENTS,$400,’This compound is a major work of the subatomic particle that makes surreal & trick’,a sodict

Godwin’s law invoked! Also, it invented a subatomic particle. Canada is a US-based company, and Britney Spears is a consul on the moon. Things started to get a bit more dadaist from here:

MADE ON CHARACTER!,$2000,’An American author of The House of War, her first novel, The Man Who Loved Me Done, debuted in 1960′,Dennis Hopper

DO YOU BETTER A FACE!,$200,’This term for a condition is from the Latin for indeed’,a white broccoli

THE STATE OF CLASS,$200,’From 1935 to 1996, these U.S. planets abbreviated the Baltimore Order’,the California Signs

BIBLICAL WOMEN,$200,’In the 1996 film poem The Spy Who Does Will Ast Will Believe He in this play retrudes a bad baby back out with his own daughter’,The Sound of Music

YOUR 5-CLUE NEWSCAST,$2000,’This bridge is the southernmost point of the South Pole’,the River State

WOMEN BY THE NUMBERS,$800,’He was good man when he was more famous for his song’,Martin

A IN SCIENCE,$400,’A specialty of this mammal is retracted with plastic pouch & are sacred at its surface to get a beautiful species of bird or brown’,a narwhal

And so on. I do like “jin-Aak, Calamity, the Balthamar.” That’s just a really cool set of titles. Also, The Man Who Loved Me Done sounds like a really solid bodice-ripper and The Spy Who Does Will Ast Will Believe He, the “film poem” sounds adorably artsy. Let’s keep this ball rolling, if only to see where it stops (or what it runs over):

WEBSTER’S 2005 TOP WORK,$1600,’The lady called the Village of Birmingham’,Ler Desser

BE A FIREFIGHTING,$1200,’This brand of small color is also called members’,crhatobula

Okay, at this point, the questions are a bit silly, but the categories become excellent:

NOT A ROCKER

WHAT A COCKIN’?

EOGRAPHY

WE LOVE BROWN

WHAT HE WAS IN HOLE

1985: THE EVERYTHING WAR

LOOKER ONE OUT FOR KIDS

ANIMAL YOUNG ‘UNS

THE NEW YORK TIMES METAL

IT HAPPENED IN SPORTS

YOU’RE A BEACH I AM

And more. Just for fun, I took it up one more step:

OKLAHOMA!,$200,’Mark Twain debuted on Marvin Gabbary for this brand maker whose name includes his way to start’,Yellow Submarine

BROADWAY MUSICALS,$800,’Yes to Bag McKorw trades a book for this entertainment chase as a knight in Gilbert & Sullivan’s tokespaces’,Elle Fragg

LITERally ELEMENTS,$200,'(<a href=http://www.j-archive.com/media/2008-11-28_DJ_26.jpg target=_blank>Cheryl delivers the clue from the set of Halloween.</a>) Some people wourd appeal on Inuit inside the <a href=http://www.j-archive.com/media/2011-09-20_J_21a.jpg target=_blank>this</a> important reasonable edge- into the train, in Pennsylvania’,the Elke continent

B IN FASHION,$1000,’It’s the depth at <a href=http://www.j-archive.com/media/2005-12-02_DJ_04.jpg target=_blank>Edward J. Deimos’,Body is American Miss Vinnegas

ICK BIN APPLE,$1,700,’Together Bubble Down is the first earl’s only one’,the T.Orvertine

FAMILIAR PHRASES,$1600,’The straws of <a href=http://www.j-archive.com/media/2008-04-15_DJ_03.jpg target=_blank>this</a> lair found in the Smithsonian’,Herudge

20th CENTURY FASHION,$1600,'(<a href=http://www.j-archive.com/media/2009-09-12_J_15.jpg target=_blank>Kelly of the Clue Crew gives the clue from Iran in New York.</a>) Oliver stands for floppares for one of these; John Infords chose to fame for one title play’,154

GONE BUT NOT,$1600,'(<a href=http://www.j-archive.com/media/2005-01-28_DJ_12.jpg target=_blank>Alex reports from a cape at ’34.</a>) Along with a river on New York’s capital holiday, this capital of Luxor lacks cheese & vegetables planted by both Talmania & Haiti’,Trenton

DUSTIN COUNTRY HEIR STORIES,$2000,’Weird TV’s Whale’,Paddhe

OF DROPOVERS,$200,’Whatzer made the offland symphony did this oney; as Greek & Spider restored, he might wake <a href=http://www.j-archive.com/media/2005-03-18_DJ_19.jpg target=_blank>Emerson Clinton Treola</a> is testing the screens like Jewel to distinguish its world & kidnapped line’,Bulfinchav

To my surprise, a new behavior emerged: the network produces fully formed, syntactically correct hyperlinks to images stored in the Jeopardy archive website. This wasn’t present at any of the previous temperature levels, and the accuracy of these hyperlinks is somewhat astounding. I’m fairly sure none of the targets actually exist, though– I saved 100kb of output as HTML and all of the links gave a 404, which was disappointing. In theory it might eventually produce a real link, but that would probably take a while.
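All of the knob-turning in this post is just sampling temperature, so for reference, here’s roughly what that knob does under the hood. This is a toy numpy sketch of the standard trick, not the actual char-rnn sampling code:

```python
import numpy as np

def sample_with_temperature(logits, temperature, rng):
    """Sample a character index from logits softened by temperature.

    temperature < 1 sharpens the distribution (safer, more repetitive
    text); temperature > 1 flattens it (weirder, more surprising text).
    """
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()                # for numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return rng.choice(len(probs), p=probs)

rng = np.random.default_rng(0)
logits = [2.0, 1.0, 0.1]                  # fake network outputs
idx = sample_with_temperature(logits, temperature=1.5, rng=rng)
```

Dividing the logits by a large temperature squashes their differences toward zero, so the distribution approaches uniform; that’s exactly why the output drifts from plausible trivia into dada as I crank it up.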

This makes me want to return to something I tried a while ago, namely training on random files from the ubuntu source, but that’ll have to wait until after my current big project, which I can only say is a cool computer vision thing.