Skynet meets the Swarm: how the Berkeley Overmind won the 2010 StarCraft AI competition

StarCraft, one of the most popular games ever made, also serves as the perfect …

The concept of potential fields is straightforward, but with multiple fields active and interacting with each other, the challenge in making such a controller work is finding the right field strength parameters. Strong repulsion fields may keep the mutalisks safe longer but prevent them from concentrating their fire, and different enemies require different behaviors and thus different field strengths. Manually iterating through parameters and making adjustments would take far too long.
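To make the tuning problem concrete, here is a minimal sketch of how one attractive and several repulsive fields might be summed for a single unit. It is not the Overmind's actual controller: the function name `field_force`, the linear falloff, and the parameters `attract`, `repel`, and `repel_range` are all assumptions chosen for illustration.

```python
import math

def field_force(unit, target, threats, attract=1.0, repel=4.0, repel_range=5.0):
    """Return the (x, y) force on a unit from one target and several threats.

    Hypothetical sketch: the falloff shape and parameter names are assumed,
    not taken from the Overmind.
    """
    ux, uy = unit
    fx = fy = 0.0

    # Attraction: a unit-length pull toward the target, scaled by `attract`.
    dx, dy = target[0] - ux, target[1] - uy
    dist = math.hypot(dx, dy)
    if dist > 0:
        fx += attract * dx / dist
        fy += attract * dy / dist

    # Repulsion: each threat inside `repel_range` pushes the unit away,
    # and the push grows linearly as the unit gets closer.
    for tx, ty in threats:
        dx, dy = ux - tx, uy - ty
        d = math.hypot(dx, dy)
        if 0 < d < repel_range:
            push = repel * (repel_range - d) / repel_range
            fx += push * dx / d
            fy += push * dy / d

    return fx, fy
```

Even in this toy version, whether the unit advances on its target or backs away from a nearby threat depends entirely on the ratio of `attract` to `repel`, which is exactly the kind of number that would otherwise have to be adjusted by hand for every matchup.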

Instead, we let the Overmind learn to fight on its own.

In Norse mythology, Valhalla is a paradise where warriors’ souls engage in eternal battle. Using StarCraft’s map editor, we built Valhalla for the Overmind, where it could repeatedly and automatically run through different combat scenarios. By running repeated trials in Valhalla and varying the potential field strengths, the agent learned the best combination of parameters for each kind of engagement.
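The idea of repeated trials can be sketched as a simple search over candidate field strengths. This is only an illustration under stated assumptions: the text above does not say which learning method was used, so a plain grid search stands in for it here, and `run_trial` is a hypothetical placeholder for one automated battle that returns a noisy score.

```python
import itertools
import random

def run_trial(attract, repel, rng):
    """Hypothetical stand-in for one Valhalla battle; returns a noisy score.

    This toy score peaks at attract=1.0, repel=3.0. A real trial would run
    the scenario in the game and report something like damage dealt minus
    damage taken.
    """
    return -((attract - 1.0) ** 2 + (repel - 3.0) ** 2) + rng.gauss(0, 0.05)

def tune(attract_values, repel_values, trials=20, seed=0):
    """Try every parameter pair, average several trials, keep the best pair."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for attract, repel in itertools.product(attract_values, repel_values):
        # Averaging over repeated trials smooths out battle-to-battle noise.
        score = sum(run_trial(attract, repel, rng) for _ in range(trials)) / trials
        if score > best_score:
            best_params, best_score = (attract, repel), score
    return best_params
```

Running a separate search per scenario would yield one parameter set for each kind of engagement, which is the outcome described above.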

The video below shows our agent’s mutalisks engaging a Protoss high templar. High templars are a standard counter to mutalisks in human games: the templars’ psi storm attacks blanket an entire area, massacring tightly spaced mutalisks. With repulsive potentials properly tuned, however, the mutalisks scatter instantly to avoid the storms and then regroup to pick off the high templar with ease.

[Video: Overmind]

Smart targeting completed the mutalisk controller. Potential field control works well given a target to attack, but choosing the right targets was a headache for us. The mutalisks weren’t immune to the plague of fixed, hard-coded strategies: our early targeting schemes were based on simple threat hierarchies, making for generally reasonable behavior punctuated periodically by spectacular failure. Mutalisks would almost destroy a valuable command center, then rush off to attack a marine halfway across the map or bravely suicide against lines of enemy missile turrets.

In the end, the solution was to give the agent the ability to predict the results of its actions. The damage that each unit deals versus another unit is known, so it’s possible to calculate roughly how much damage the mutalisks will take in exchange for killing a target, and how long it will take. By assigning values to targets and to the mutalisks based roughly on their resource costs, the agent can predict the value of choosing a particular target, then decide which targets are worth attacking and which to prioritize.
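A rough version of that calculation might look like the sketch below. The names `Target`, `engagement_value`, and `choose_target` are hypothetical, and the formula (target value minus the share of the swarm's value expected to be lost while killing it) is one simple reading of the description above, not the Overmind's actual scoring.

```python
from dataclasses import dataclass

@dataclass
class Target:
    name: str
    hit_points: float
    value: float         # roughly the target's resource cost
    incoming_dps: float  # damage per second the swarm takes while attacking it

def engagement_value(target, swarm_dps, swarm_hp, swarm_value):
    """Estimate the net value of attacking `target` (assumed formula)."""
    time_to_kill = target.hit_points / swarm_dps
    # Damage taken cannot exceed the swarm's total hit points.
    damage_taken = min(swarm_hp, target.incoming_dps * time_to_kill)
    # Convert damage taken into lost value, in the same units as target.value.
    expected_loss = swarm_value * damage_taken / swarm_hp
    return target.value - expected_loss

def choose_target(targets, swarm_dps, swarm_hp, swarm_value):
    """Return the most valuable worthwhile target, or None if none pays off."""
    scored = [(engagement_value(t, swarm_dps, swarm_hp, swarm_value), t)
              for t in targets]
    worthwhile = [(v, t) for v, t in scored if v > 0]
    if not worthwhile:
        return None
    return max(worthwhile, key=lambda pair: pair[0])[1]
```

With made-up numbers, a lightly defended command center scores far higher than a lone marine, while a line of missile turrets scores negative and is skipped, which avoids both failure modes described earlier.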

The video below, from the Overmind’s victory against Oriol, shows the mutalisk swarm with this final improvement, attacking the Terran base and picking off high-value targets while avoiding stronger enemy forces.

[Video: Overmind]

The end result is a mutalisk swarm that intelligently chooses its engagements, picking off high-value targets and avoiding unwinnable fights until it possesses overwhelming force. Enemy bases and armies are simply disassembled one target at a time.

Dispelling the fog of war

By mid-summer, flexible build planning and a smart, Valhalla-trained mutalisk swarm were turning the Berkeley Overmind into a formidable opponent. As we continued testing, though, we discovered that our agent had a weakness, one that highlighted the critical importance of acquiring and managing information about the enemy.

Mutalisks are tier-two units, meaning that several prerequisite buildings and upgrades must be completed before they can be created in the game. Until this point is reached, the agent is vulnerable; this was precisely what we saw in testing. Once the mutalisk swarm reached a certain size it was nearly unstoppable, but opponents could win by attacking early and preventing the swarm from forming in the first place.

Our first response was to force the build planner to produce more static defenses and cheap ground units early in the game. This protected against early attacks, but slowed progress toward creating mutalisks. It was a reflection of the general difficulty of balancing build priorities like economic growth and military force. Building too many workers creates a vulnerability to early attacks, but building too few results in smaller armies and insufficient resources later on. Unnecessary static defenses drain resources from offensive forces.

Striking the right balance depends on reading the enemy’s forces and intentions; an early push must be met by sufficient defenses, but a slow enemy build-up means the agent should go all-out for mutalisks.

If the agent could see what its enemy was producing, the build planner could adjust production appropriately. Due to the fog of war, seeing a part of the map requires placing a unit near there, so gathering information on the enemy requires active reconnaissance in the face of enemy threats. In the earliest parts of the game, this means getting a fast worker or ground unit into the enemy base and surviving long enough to see what the opponent has built.

For the Zerg, much mid- and late-game scouting information comes from proper use of overlords. Overlords are slow-moving flying units with a long sight range that also provide supply, an in-game metric governing the number of units that can be built. Having overlords scattered around the map improves targeting and macro planning, but losing overlords both reduces vision of the map and hinders the ability to build more units. The benefit gained from observations has to be balanced against the risk of losing overlords, and the overlords need to be protected from blundering into enemy units.

Our solution to the problem of overlord control and scouting had an uninspired beginning. StarCraft’s built-in path planning for ground units is terrible, an irritant that has hindered players for over a decade. As development progressed, Dan decided that we weren’t going to put up with the indignity of watching units getting stuck on walls and chasing their own tails, so we implemented our own path planning.

Simply getting from point A to point B successfully was useful enough, but the real change came from combining path planning with an awareness of enemy units. The Overmind keeps a continuously updated record of every enemy unit it has ever seen, along with that unit’s last known location. Since the attack range and speed of units in StarCraft are known, the agent can combine this information with its record of enemy units to build a threat map. For each unit type, it can calculate a level of danger for that unit to be in an area. This threat map can be combined with the path planning algorithm by including the threat in a given map area as part of the cost of traversing it, so that short paths under high threat are less attractive than longer, safer ones. Modifying the algorithm this way required some technical tricks to run quickly enough to be useful, but once working it proved to be a valuable tool. Though we originally intended it for worker units, we were suddenly using it everywhere.
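The idea of folding threat into traversal cost can be sketched on a small grid. The details here are assumptions for illustration: the text does not name the path planning algorithm, so Dijkstra's algorithm stands in for it, and `build_threat_map`, `safest_path`, and `threat_weight` are hypothetical names. The sketch also ignores terrain and the speed tricks mentioned above.

```python
import heapq
import math

def build_threat_map(rows, cols, enemies):
    """Sum enemy damage over every cell within each enemy's attack range.

    `enemies` holds (row, col, attack_range, damage) tuples for the last
    known positions; this tuple layout is an assumption of the sketch.
    """
    threat = [[0.0] * cols for _ in range(rows)]
    for er, ec, attack_range, damage in enemies:
        for r in range(rows):
            for c in range(cols):
                if math.hypot(r - er, c - ec) <= attack_range:
                    threat[r][c] += damage
    return threat

def safest_path(threat, start, goal, threat_weight=5.0):
    """Dijkstra on a grid; entering a cell costs 1 + threat_weight * threat."""
    rows, cols = len(threat), len(threat[0])
    best = {start: 0.0}
    parent = {}
    frontier = [(0.0, start)]
    while frontier:
        cost, cell = heapq.heappop(frontier)
        if cell == goal:
            break
        if cost > best[cell]:
            continue  # stale queue entry, a cheaper route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_cost = cost + 1.0 + threat_weight * threat[nr][nc]
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    parent[(nr, nc)] = cell
                    heapq.heappush(frontier, (new_cost, (nr, nc)))
    if goal not in best:
        return None
    # Walk parent links back from the goal to recover the path.
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1]
```

Setting `threat_weight` to zero reduces this to ordinary shortest-path planning, while larger values make the planner accept longer detours to stay out of range, so a single knob controls how cautious a given unit is.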

Threat-aware path planning allowed for a number of improvements. Mutalisks could thread their way past enemy defenses to pick off targets. In the early game, the agent could infiltrate the enemy’s base with ground units and keep them alive, avoiding enemy combat units. The path planning also let the agent scatter overlords around the map with much less fear of losing them. Our agent’s overall “picture” of the game improved dramatically. With a better idea of the enemy’s forces, the build planner could react to enemy actions, creating defenses and optimizing the balance between economy and military forces.

The video below shows an example of this. One of our agent’s overlords enters the enemy base and discovers a Protoss stargate under construction. The stargate is a structure that builds air units, so its discovery triggers the Overmind to produce an anti-air defensive structure, which is completed before the opponent’s air unit arrives at the Overmind’s base.