Jeff06 took first place on Coders Strike Back’s podium. Here is his explanation of how he thought through and coded his AI: the perfect opportunity to get inspired and learn more about genetic algorithms… Thanks a lot Jeff for sharing!

Intro

Wow, what a great contest, CodinGame! I had a lot of fun during the week, but didn’t sleep much, and when I did I dreamt of drones and collisions! After participating in the SF2442 challenge two weeks before, I was initially disappointed to see that drones were back, but the differences from SF2442 were more than enough to provide a fresh challenge. In SF2442, we had only one drone to race around the track, so strategy was much more limited. It was possible to block the opponent, but at the cost of stopping your own drone for a while. In Coders Strike Back, having a team of two drones led to extremely complex situations, where cooperation within the team and aggression towards the opponents were key. Since the physics engine was the same between the two contests, that was that much boilerplate code I didn’t have to rewrite, and I could dive straight into the complex part: coding the AI.

Generic code architecture

Before I explain genetic algorithms and evaluation functions, I want to write a bit about how I organized my code. I’ve done a few CodinGame competitions by now, so I try to reuse the same code base as much as possible. I separate the representation of the game state from the AI algorithms, so that I can easily switch AIs without recoding everything. I also try to keep all the “magic numbers” at the top of my code, so I can quickly test new variations. Normally these would go in a dedicated header, but we are limited to a single file…

For most (all?) multiplayer contests, it is necessary to simulate the game state several rounds into the future. For Coders Strike Back, the only interaction between the AI algorithm and the game state was through the drones. The AI sets the drones’ requested angles and requested thrusts, then asks the game state to update to the next round. There is no intelligence in the drone objects themselves; they are just plain objects that any AI can interact with. Check out the class diagram at the end of this document for more info!

Game simulation

Without an accurate game simulation, even the best AI won’t perform well. So the first step in this contest (well, actually in SF2442, since the environment was luckily the same!) was to write an exact simulation. I already had a few geometric classes I wrote for the Mars Lander puzzle (Points, Vectors, and Segments), so computing the correct speed and position given an angle and a thrust was an easy modification of my Mars Lander code. Using the debug logs, I verified that my expected game state matched what was given as input by CodinGame.

But collisions… I had no idea how to do that, and my Google-fu failed me. I wrote something with vectors, dot products, and cross products, which was complete nonsense. Well, I think this is where I say thanks to pb4608 for his explanations of the collision algorithm! He pointed me to the Poker Chip Race forum post, and explained the weird impulse rebound. I eventually found a fun article with the maths and physics behind it; you can read it here for all the details.

So I had a working simulation of the game, and could now plug in my AI.

The AI

I’m a huge fan of genetic algorithms, and use them everywhere when I don’t have a clear idea on how to solve the problem directly! So obviously, that’s the route I took. I won’t explain in depth how genetic algorithms work, but if you’re not familiar with them you can start with the Wikipedia page, and follow the links. Be careful, the rabbit hole goes very deep!

Genome description

In the case of Coders Strike Back, I considered each gene to be a set of instructions for a drone. So a typical genome could be read as:
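For instance, assuming one (angle, thrust, shield) instruction per simulated turn (the concrete values below are invented for illustration), one drone’s genome might read:

turn 1: rotate -18°, thrust 100, shield off
turn 2: rotate +7°, thrust 100, shield off
turn 3: rotate +18°, thrust 0, shield on
…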

In practice, I simulated the game state up to 6 turns in the future, so there would be 6 triplets of instructions in one drone’s genome. The genome itself is encoded as an array of doubles between 0.0 and 1.0. So a raw genome is actually closer to this:
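For illustration, one drone’s half of such a raw genome (6 turns × 3 genes, values invented) might look like:

0.03 0.98 0.51 | 0.61 0.99 0.12 | 0.88 0.04 0.47 | 0.25 0.75 0.90 | 0.50 0.33 0.02 | 0.71 0.66 0.18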

Clamping genes below 0.25 and above 0.75 to the extreme angle and thrust values makes those extremes get selected far more often, which intuitively seemed like the values with extra importance.
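As a sketch, decoding a single raw gene into an angle might look like this. The ±18° per-turn rotation limit comes from the game rules; the function name and the linear interpolation of the middle band are my assumptions:

```cpp
#include <cassert>

// Decode one raw gene in [0.0, 1.0] into an angle delta in degrees.
// Genes below 0.25 snap to the extreme left turn, genes above 0.75 to
// the extreme right turn, and the middle band is interpolated linearly.
double decodeAngle(double gene) {
    const double MIN_ANGLE = -18.0, MAX_ANGLE = 18.0;  // per-turn rotation limit
    if (gene < 0.25) return MIN_ANGLE;
    if (gene > 0.75) return MAX_ANGLE;
    double t = (gene - 0.25) / 0.5;  // rescale [0.25, 0.75] onto [0, 1]
    return MIN_ANGLE + t * (MAX_ANGLE - MIN_ANGLE);
}
```

Half of the [0, 1] range thus lands on one of the two extremes, matching the intuition that the extreme angles and thrusts matter most.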

Shared genome

Actually, I concatenated both drones’ genomes into a single shared genome, a series of 36 genes. The first 18 are for drone1, the last 18 for drone2. By considering both drones as a single entity, I could evolve cooperative behavior that would otherwise be impossible if each drone was evolved separately. That’s where all the “wow!” moments from the replays come from; it’s all emergent behavior. I was amazed when I first saw one of my drones collide with the other drone on purpose to give it a boost towards its next checkpoint! I sometimes felt like I was watching a pool game instead of a drone race…
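Assuming that layout (drone 1’s 18 genes followed by drone 2’s, three genes per turn), indexing into the shared genome is straightforward; the names here are illustrative, not the actual contest code:

```cpp
#include <cassert>

const int TURNS = 6;           // simulated turns per genome
const int GENES_PER_TURN = 3;  // angle, thrust, shield/meta

// Index of gene `slot` (0 = angle, 1 = thrust, 2 = shield) for `drone`
// (0 or 1) at `turn` in the shared 36-gene genome.
int geneIndex(int drone, int turn, int slot) {
    return drone * TURNS * GENES_PER_TURN + turn * GENES_PER_TURN + slot;
}
```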

Direct bot

Remember how I said I could switch AIs easily? Well, I coded a controlling bot (no genetic algorithms here, just basic if/then/else) that would target the next checkpoint for my first drone, and would target the opponent’s best drone for my second drone. By the way, when I write “first” and “second” drone, I mean the first and second with regard to the distance to the finish line, not the order in which they are given in the input.

Specifying this bot as my main bot allowed me to validate its behavior in the IDE. It would effectively turn to face its target (checkpoint or opponent drone) and go full thrust towards it. Not very effective, as it would overshoot the checkpoint and waste a lot of time turning around, but as a first approximation of the opponent’s behavior it was good enough.

So my first version of the genetic algorithm would evolve a series of controlling genes for both of my drones, trying to beat a virtual opponent controlled by this “direct bot.” This worked pretty well for a good part of the week, and I spent my time improving the performance and number of simulations I could run in the allotted timespan.

Evaluation function

I tried a lot of different evaluation functions, but eventually settled on something very simple. I believe the most important thing in this game is having the biggest difference in checkpoints passed between the two teams, and then having the biggest difference in distance to the finish line. The rest is just minor adjustments to achieve this goal. So to compare two genomes and select the better one, I compare the checkpoint differences. Only if they are equal do I compare the distance-to-finish-line differences. And if those are equal too, I choose the one where my second drone is closest to the opponent’s next checkpoint (or the one after that, if it can’t reach the next checkpoint before the opponent).
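This lexicographic comparison can be sketched as follows; the struct and its field names are invented for illustration, but the ordering of criteria is the one described above:

```cpp
#include <cassert>

struct Score {
    int checkpointDiff;   // my checkpoints passed minus the opponent's
    double distanceDiff;  // opponent's distance to finish minus mine
    double blockerDist;   // my 2nd drone's distance to the checkpoint it attacks
};

// Returns true if a is strictly better than b: compare checkpoint
// difference first, then distance difference, and only then how close
// the blocking drone is to its target (smaller is better).
bool better(const Score& a, const Score& b) {
    if (a.checkpointDiff != b.checkpointDiff)
        return a.checkpointDiff > b.checkpointDiff;
    if (a.distanceDiff != b.distanceDiff)
        return a.distanceDiff > b.distanceDiff;
    return a.blockerDist < b.blockerDist;
}
```

Since the comparison is purely ordinal, there is no weighted sum and hence no coefficients to tune.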

There were also a few lines of code in case of danger of timeout, to force both drones to target their respective next checkpoints, and in case the opponent was at risk of timeout, to prevent both opponent drones from reaching their respective next checkpoints.

I was quite happy with my discrete evaluation. There was no need for coefficients to apply to each parameter, keeping it simple and efficient.

First improvement: use direct bot in genome

I was dissatisfied with how long my genetic algorithm took to evolve a basic control scheme. For the first tens of milliseconds, the genomes were completely random, wasting precious time. I modified the “shield gene” to turn it into a “meta gene”. The rules for the angle and thrust stayed the same, but now the “shield gene” had much more meaning than just shield on or shield off, as it could switch the drone’s control over to the direct bot:

if (gene1 > 0.95)
    requestShield()
else if (gene1 < 0.1 && drone is second drone)
    use direct bot to target best opponent’s next next checkpoint
else if (gene1 < 0.2 && drone is second drone)
    use direct bot to target best opponent’s next checkpoint
else if (gene1 < 0.3 && drone is second drone)
    use direct bot to target best opponent
else if (gene1 < 0.3 && drone is first drone)
    use direct bot for control to next checkpoint
else
    use gene2 and gene3 to control angle and thrust

After the input phase, I also fill one of the genomes with the value 0.3 so that it effectively uses only the direct bot at each turn.

There was an immediate gain in performance. Now, even at generation 0, there was at least one genome encoding a coherent behavior. Printing the genome in the debug logs at the end of each turn showed that about a third of the genes used the direct bot.

Second improvement: evolve opponent

The second major improvement was using the genetic algorithm to evolve the opponent for some time. Effectively, for the first 30ms of a turn, I would pretend I was the opponent, and evolve a good strategy to counter the direct bot. I would then switch back to my own drones, and evolve them against the opponent drones controlled by their best genome.
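The per-turn time budget can be sketched as follows (the 30ms figure is from above; the exact split of the remaining time within the 100ms limit is an assumption):

at the start of each turn:
    while elapsed < 30ms:
        evolve opponent genomes against my direct bot
    freeze the best opponent genome found
    while elapsed < time limit minus a safety margin:
        evolve my own genomes against that frozen opponent
    play the first turn of my best genome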

That provided another huge performance gain. My leading drone would now often accurately predict that the opponent would try to block it, and go around the opponent instead of blindly slamming into it over and over again.

Optimization for speed and win ratio

Throughout the week, I was continuously trying to get more simulations out of my AI. I used Callgrind and KCachegrind extensively to identify bottlenecks in the simulation code. One of the improvements I implemented was a cache for the direct bot calculations, so as not to waste time recalculating the same output angle and thrust 10,000 times for the same input position, speed, and angle. In the end, I could evolve around 1,500 total generations per round (shared between the opponent’s drones and mine). So with a population of 10, that means 15,000 game simulations, each with a prediction of 6 turns into the future.
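The direct-bot cache can be sketched like this; the key packing and the `directBot` stand-in are assumptions for illustration, not the actual contest code. The point is simply that a state seen thousands of times per turn is only computed once:

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

struct Command { int angle; int thrust; };

int directBotCalls = 0;  // counts real computations, for illustration

// Hypothetical stand-in for the real direct-bot geometry, keyed on a
// (quantized) drone state packed into a 64-bit integer.
Command directBot(uint64_t stateKey) {
    ++directBotCalls;
    return Command{ static_cast<int>(stateKey % 37) - 18, 100 };
}

std::unordered_map<uint64_t, Command> cache;

Command cachedDirectBot(uint64_t stateKey) {
    auto it = cache.find(stateKey);
    if (it != cache.end()) return it->second;  // cache hit: no recomputation
    Command c = directBot(stateKey);
    cache.emplace(stateKey, c);
    return c;
}
```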

I also did many tests in the IDE against the top 5 players to select my best parameters. The number of simulation rounds, the 0.25 and 0.75 bounds for the genes, the genetic population size, the time allotted to evolve the opponent… were all set so as to maximize the win ratio.

Conclusion

I’ve written it in the intro, but I’ll write it again: it was a really fun challenge! It’s very easy to write the basic drone controller which simply goes straight towards the next checkpoint, and then the real work starts… However, it would be nice if all the details of the physics were given from the start, maybe in an “Expert” section of the rules so as not to scare everyone off!

See you all in April for the next contest!

Annex: class diagram

Congratulations again Jeff06 and thank you for sharing your strategy in detail!

Any recommendation for an existing CG puzzle / game well suited to practicing this kind of algorithm?

Jeff06

If you don't want to bother with the physics, you can apply a GA to the Mars Lander puzzles. It's good practice because it will force you to think of a good evaluation function, since you probably won't find an optimal solution in the first 100ms round.

I guess Poker Chip Race would be a good AI bot candidate, since it's quite similar to Coders Strike Back, but I haven't done it yet. It's also quite a bit more complex than Mars Lander!

A nice approach. My previous entries have all used some kind of tree search to try to find the best moves, but they are very limited by the 100ms search window. I've never considered using a GA, but I will have to give this a try next time!

I hugely agree that it would have been really useful (and perhaps fairer too) if the physics of the game was described up front. I spent hours and hours trying to figure out an accurate model for collisions, and as a result I had very little time left to actually code the AI.

Hey dude, I read this post and found it fascinating. I'm in my first year of university, and my question right now is: will I even be able to do this? xD Btw I'm learning C right now, not C++, so there were some things I didn't understand in the code examples you put up (mainly how you call functions and things like that).

Anyway, see you in April then; I will try to make my AI smarter next contest, rank 700 is bad :C

Hi Jeff, I really want to learn more about genetic algorithms and I absolutely appreciate the huge effort you put into this post. I must say this is the best demonstration of GAs I've read, since others seem to be pretty confusing. However, it's not quite clear to me as a beginner and I would like to see a sample of it. So, could you share the source code with me, 'cause I don't know how to implement it.

Thank you!

Konstantin Kharlamov

Nice write-up, yet it lacks the most important part: how does your genome interact with the inputs? It’s such an important thing that its absence actually defeats the whole post. Your post alone gives the impression that every simulation has a static shield/don’t-shield gene, and the car either holds the shield enabled the whole race until the timeout hook gets triggered, or never uses it whatsoever. Such a bot wouldn’t stand a chance in competition.

Jeff

Hi Konstantin!
I’m not sure I understand your question correctly, but I’ll try to answer the best I can.
Each gene is used for only a single turn, *not* the entire game. As I wrote in the “genome description” part: “In practice, I simulated the game state up to 6 turns in the future, so there would be 6 triplets of instructions in one drone’s genome.”
So every time the CodinGame servers send me a turn’s inputs, I generate a genome that encodes moves for the next 6 turns.
I hope it’s clearer, but don’t hesitate to ask more questions if my answer was not complete!

Konstantin Kharlamov

@Jeff thanks for reply, I don’t know why it doesn’t show up here, but I did see it in mail notification.

Before I clarify, I shall mention I got it a bit backwards: the usage of genes is a plain if-else table; what the inputs affect is rather the generation. But my question, though slightly modified, still holds: how do the inputs affect the generated genes? For example, you’ve got a generated shield gene of 0.95: how did that happen, did you multiply coordinates with something, or… I don’t know, I ran out of examples. It is absolutely unclear to me how it is supposed to work 😀

Thibaud Jobert

It ended up in spam box. I don’t know why. Just validated it.

Jeff06

I’m replying here, this comment thread is complicated to follow 😉

I don’t use inputs at all to generate genes. Genes are generated randomly every time, i.e. gene1 = randomDouble(0.0, 1.0), and the inputs are only used to simulate the game for the next few rounds.

Konstantin Kharlamov

Yeah, I did read that article, and I do have a basic understanding of how genetic algorithms work, but the understanding is limited to… I don’t know how to say it, let’s call it static data. Take for example this demo http://rednuht.org/genetic_cars_2/ : every new round the changes in the outside world are pretty small, just the geometry of the road. The reason why brute-forcing parameters to find the “best car” works is very clear: the cars don’t even need to get any input, they just blindly drive forward until they stop.

But in your game every turn there are a lot of changes: opponent locations, opponent angles, checkpoint location, “self” location, “self” angle, “friend” location, “friend” angle. My point is: you can’t just ignore the input data; otherwise, how could a blind car know why enabling the shield in the previous generation/turn worked positively, but in this generation/turn works negatively? It just cannot work this way: your genome is going to jump back and forth between decisions that “for some reason” worked in the previous generation, but “for some reason” don’t in the next one.

Konstantin Kharlamov

And by the way, if you don’t use inputs to generate genes, the situation you described, where one bot boosts another to the next checkpoint, is impossible: how could they know each other’s locations??