The site allows us to upload a 32x32 pixel image. Then it performs some sort of identification to determine which user we are. Finally, if it determines us to be “Mr. Deer” it will hopefully give us the flag.

Someone else on my team found that the robots.txt contained the following listings:

So we have several convolution/max-pooling layers followed by two fully connected (dense) layers. Presumably the final layer is a softmax output that generates probabilities for the image classes.
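For anyone who hasn't met softmax before, here is a minimal sketch in plain Python of what that final layer presumably computes. The `logits` values are made-up numbers for illustration, not outputs of the challenge network, and the choice of five classes is an assumption (all we know is that index 4 exists).

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for five classes (assumption, not real data).
logits = [0.5, -1.0, 2.0, 0.1, 3.0]
probs = softmax(logits)

# The predicted class is simply the index with the largest probability.
predicted = max(range(len(probs)), key=probs.__getitem__)
```

The class with the largest raw score always ends up with the largest probability, so "make the network predict index 4" really means "make score 4 beat all the others by a wide margin".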

(I’m not going to explain the actual model much more; this is a very common example of a deep convolutional neural network for image classification. Googling any of those terms will lead you to more in-depth resources.)

Our goal is to generate a crafted input for this network such that it predicts index 4 (corresponding to Mr. Deer) with high probability.

I adapted the technique presented here to generate this crafted example.

The generation script essentially starts with a random vector. We define the cost as the probability of predicting class 4. Then we take the gradient of the cost with respect to the input and add it to our original vector, so that the next time we run the image through the network it is more likely to pick 4.

This is kind of like standard gradient descent training on a network, except this time our network weights are fixed and we are updating the image itself.
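To make the idea concrete, here is a toy sketch of that update loop in plain Python. It is not the script used for the challenge: the "network" is a hypothetical stand-in (one fixed random linear layer plus softmax), the five classes and grayscale pixels are assumptions, and every name (`W`, `predict`, `grad_target_prob`, `STEP`) is made up for illustration. One more deviation to flag: the gradient is rescaled by its largest entry before being added, so the toy makes progress even when the starting probability is tiny.

```python
import math
import random

NUM_CLASSES = 5        # assumption: at least 5 classes, since the target is index 4
NUM_PIXELS = 32 * 32   # assumption: one grayscale value per pixel
TARGET = 4             # the class we want the network to predict
STEP = 0.05            # largest change to any pixel per iteration (hypothetical)

rng = random.Random(0)
# Fixed stand-in weights; these are never updated, only the image is.
W = [[rng.gauss(0, 0.1) for _ in range(NUM_PIXELS)] for _ in range(NUM_CLASSES)]

def predict(x):
    """Class probabilities for image x: softmax over linear scores."""
    logits = [sum(w * v for w, v in zip(row, x)) for row in W]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def grad_target_prob(probs):
    """Gradient of probs[TARGET] with respect to each input pixel."""
    p_t = probs[TARGET]
    grad = []
    for j in range(NUM_PIXELS):
        avg = sum(probs[k] * W[k][j] for k in range(NUM_CLASSES))
        grad.append(p_t * (W[TARGET][j] - avg))
    return grad

x = [rng.random() for _ in range(NUM_PIXELS)]  # start from a random image
start_prob = predict(x)[TARGET]

for _ in range(200):
    probs = predict(x)
    if probs[TARGET] > 0.99:
        break
    grad = grad_target_prob(probs)
    scale = max(abs(g) for g in grad) or 1.0  # avoid dividing by zero
    # Add the (rescaled) gradient to the image and keep pixels in [0, 1].
    x = [min(1.0, max(0.0, v + STEP * g / scale)) for v, g in zip(x, grad)]

final_prob = predict(x)[TARGET]
```

With a real convolutional network the gradient would come from the framework's automatic differentiation rather than a hand-written formula, but the loop has the same shape: predict, take the gradient with respect to the image, nudge the image, repeat.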

After letting the following script run for a few minutes, we obtain an image that can fool the network: