Using SC2 to study collective intelligence

I am leading the human research unit of an IARPA-funded project investigating ways to improve the reasoning ability of teams. We are required to test our system on multiple problem types, and are considering using StarCraft 2 as one problem domain. Teams would be presented with a replay that is truncated at a certain point, and would be asked to predict the state of the game (numbers of units of each type, structures built, tech developed, bases taken, etc.) at a time in the future. They would also be asked to provide rationales for each kind of decision. In different conditions, teams might have two hours to produce a result, or 2-3 days.

I wonder if you think this could present an appropriate level of challenge. If it is too unpredictable, then all teams will fail, which would introduce floor effects. If it is too easy, teams will be at ceiling, which is equally problematic. Presumably the challenge would depend on the amount of time before truncation, and on how far into the future the teams are required to predict. I was thinking that we might truncate at 5 minutes and ask for the state a minute further into the game. What do you think?
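The floor and ceiling concern can be checked on pilot data before committing to a replay. The sketch below is only an illustration: it assumes each team's prediction and the true game state are recorded as dictionaries of counts, and the names `prediction_error` and `floor_or_ceiling`, the thresholds, and all the numbers are made up for the example rather than taken from the project.

```python
from statistics import mean

def prediction_error(predicted, actual):
    """Mean absolute error over every category in either dict (missing counts are 0)."""
    keys = set(predicted) | set(actual)
    return mean(abs(predicted.get(k, 0) - actual.get(k, 0)) for k in keys)

def floor_or_ceiling(errors, good=1.0, bad=10.0, share=0.8):
    """Label a pilot 'ceiling', 'floor', or 'ok' from per-team errors.

    good, bad, and share are illustrative thresholds, not calibrated values.
    """
    n = len(errors)
    if sum(e <= good for e in errors) / n >= share:
        return "ceiling"  # nearly every team is almost exactly right
    if sum(e >= bad for e in errors) / n >= share:
        return "floor"    # nearly every team is far off
    return "ok"

# Made-up example: true state at the prediction time and two team answers.
actual = {"workers": 44, "marines": 12, "bases": 2}
team_a = {"workers": 42, "marines": 10, "bases": 2}
team_b = {"workers": 30, "marines": 25, "bases": 3}

errors = [prediction_error(team, actual) for team in (team_a, team_b)]
print(errors, floor_or_ceiling(errors))
```

A replay where most pilot teams land in the "ok" band would be a candidate for the main study; the thresholds would need to be set from real pilot results.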

Another important variable is the level of the players. We would most likely not use professional players, as it is important that teams not be able to find the games online (and thus see what happened). What level do you think would be best?

I think 5 minutes is a good timestamp to truncate at since most of the decisions in builds should be relatively standard. Something to consider is the way the game was opened, since if nothing "major" happens, a macro Zerg vs Terran can be predicted fairly far into the future, but an aggressive opening throws a bunch of variables into the air, such as unit control etc.

As for the level, I agree that using professional players can result in warped results even besides finding the games online, so maybe use games from low Grandmaster or high Masters, since there will be no pros in that MMR range.

I like the idea and it gives me a new perspective for analyzing my own replays. I believe the only way to keep it consistent would be using the highest level players, because if not you could easily get a lower level game where maybe one player has ok'ish macro but very good micro against an opponent who has good macro but poor micro, and there is no telling what will happen. You could try messaging some random barcodes from KR Grandmaster and asking for replays.

Predicting a minute ahead after 5 minutes into a replay really depends on the replay for how difficult it is. If it's a long macro game then at 6 minutes probably not much will have happened. So predicting the change is just a matter of increase in workers/bases/buildings and tech. On the other hand if it is a cheesy game or a player does some kind of timing attack, predicting a minute ahead could be extremely difficult, not to say impossible.

I would say the question kind of depends on what you are interested in with your research: predicting predictable situations (e.g. how good are teams of people at estimating the growth of a player's economy, assuming no attacks take place), or predicting unpredictable situations (e.g. will this player's timing attack succeed).

Regardless, the difficulty of predicting will depend almost completely on the specific replay in question, and very little on the time mark. The more fighting there is in - or before - the period where people need to make predictions, the more difficult it gets.


I have a couple of thoughts. First, if you decide to do this with SC2, I would highly recommend sticking to games between Grandmasters. I suggest this because I feel that it is difficult to predict where an SC2 game will go even under the best conditions. Outside of GM there are more variables added, namely the types of builds used are going to vary more widely and there is suboptimal/slow play to consider. Furthermore, at certain levels (especially Diamond, I have heard) there is a ton of "cheese" and predicting what a game state will be in the future becomes absurdly impractical.

I think being able to successfully predict where a game state will be is highly dependent on a rather intimate knowledge of SC2, especially the current "metagame" (builds and strategies that are considered optimal/effective by top players). You might want to consider whether the teams will be given some background training in the current meta, assuming you are using recent replays. Or perhaps this will be part of their research over the 2-3 days? Even with this background there are no guarantees, as no one executes things exactly the same, and even at the very top many players eschew the consensus of what builds and strats are best and just do their own thing.

Secondly, as an above poster pointed out, predicting the outcome of battles is extremely complicated and effectively impossible unless you know the game as intimately as a pro does (at least for battles that aren't extremely lopsided). At a minimum, you must be totally familiar with the unit sets and tactics each side will employ as well as the skill level of each player. Even knowing these things, there are elements of randomness, as often a single mis-click on the part of one player can drastically change the outcome of a battle (which is part of the excitement of watching the pros play). Even for something far less chaotic than a full-scale battle, such as the various kinds of worker harass (which often come around the 5 minute mark), it is next to impossible to predict from a static moment in the game whether 0, 5, or 15 workers will go down, or whether the harassing units will survive intact. I think your endeavor would only be practical if you pre-screen the replays and select for ones where very little unit interaction takes place between opposing sides.
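The pre-screening suggested above could be automated roughly as follows. This is a minimal sketch, assuming the game-time timestamps (in seconds) of combat events such as unit deaths have already been extracted from each replay by some parser, which is not shown here. The function name `is_quiet`, the `lookback_s` parameter, and the example data are all hypothetical.

```python
def is_quiet(combat_times, cut_s, horizon_s, lookback_s=60):
    """True if no combat event falls shortly before the cut or inside the prediction window.

    combat_times: game-time seconds of combat events (assumed already extracted).
    cut_s:        where the replay is truncated.
    horizon_s:    how far past the cut teams must predict.
    lookback_s:   how long before the cut must also be free of fighting (assumption).
    """
    start = cut_s - lookback_s
    end = cut_s + horizon_s
    return not any(start <= t <= end for t in combat_times)

# Made-up replays: name -> combat event timestamps in seconds.
replays = {
    "macro_game": [510, 640],    # first fight well after the 6 minute mark
    "early_harass": [290, 335],  # worker harass right around the cut
}

quiet = [name for name, times in replays.items()
         if is_quiet(times, cut_s=300, horizon_s=60)]
print(quiet)
```

With a 5 minute cut and a 1 minute horizon, only the replay with no fighting between 4:00 and 6:00 is kept.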

Too unpredictable, I have to say. If there is any major fight which occurs during the period between the replay ending and the prediction, then it's unreasonable to expect much accuracy. Especially because in general, if a fight occurs and one person loses, it might as well not have occurred, as the would-be loser could have pulled back his army and waited for a bit. Or alternatively, if a Protoss is in the game, then the fight can depend critically on how good their micro is (which is so hard to judge that you see the best Protoss players in the world walking into big fights and getting smashed because they miscalculated how much mileage their skill would give them).

The only situation which I think could be good for your paper is after some unusual early game shenanigans where both players almost die and then you stop the replay and have them predict the game state a minute or two into the future. For instance, Serral vs Maru Game 1 at WESG 2017 at 7:45.

Problem is that you need high level players to play the games (since everyone else kind of sucks at this game to the point that predicting their behavior accurately is hopeless), and so it will be hard to find real examples.


Interesting idea, but I wonder how appropriate this challenge would be. Generally speaking, how an SC2 match goes down is a function of the participating players' skill, general knowledge and immediate decisions. Basically, it is possible for a player to make different decisions the next game starting from the same position, and thus the whole game will go down differently. Actually, as most people improve they will continuously try out new ideas in the game or optimize their previous execution to fit the situation better. So trying to predict how the next few minutes of the game go down sounds largely like trying to predict the skill levels / styles / current mood of the participating players, on top of needing to deal with significant noise coming from unpredictable engagement mistakes in battles. It just feels a little bit like trying to predict what Bob Ross will put on the canvas next (most likely friend-trees). But maybe it can be a good problem depending on the setup of conditions.

Well since you are leaving specifics open-ended of course this is possible. That said, it wouldn't be reasonable to predict non-professional caliber players because those players are too unpredictable and prone to nonsensical mistakes.

Truncating at 5 minutes and asking about the state one minute later is very possible. We can tell this because professional casters and commentators often do exactly this, with a reasonable degree of accuracy.

The problem with this experiment, however, is that you are essentially predicting:

1.) Are these two players going to follow the meta for the next 60 seconds, and

2.) Is there something about the current state of this game at 5 minutes that makes it a rare exception to #1?

I like this idea! However, as has been pointed out, it is better if the subjects are aware of the players' league, considering players' reactions are heavily dependent on their overall knowledge/skill level. Overall, pro ladder games should be used.

Yeah, OK. So trying to come up with the right time probably isn't the way to think about it. I probably need to pick a replay that isn't too crazy at the start and where there aren't any major battles in the minute after the replay stops, so that there aren't so many random factors (e.g. missed clicks) in play.

Thanks.

On February 18 2019 00:54 solidbebe wrote: Predicting a minute ahead after 5 minutes into a replay really depends on the replay for how difficult it is. If it's a long macro game then at 6 minutes probably not much will have happened. So predicting the change is just a matter of increase in workers/bases/buildings and tech. On the other hand if it is a cheesy game or a player does some kind of timing attack, predicting a minute ahead could be extremely difficult, not to say impossible.

I would say the question kind of depends on what you are interested in with your research: predicting predictable situations (e.g. how good are teams of people at estimating the growth of a player's economy, assuming no attacks take place), or predicting unpredictable situations (e.g. will this player's timing attack succeed).

Regardless, the difficulty of predicting will depend almost completely on the specific replay in question, and very little on the time mark. The more fighting there is in - or before - the period where people need to make predictions, the more difficult it gets.

Having read the comments (which have been great, by the way), I think I might pilot some Grandmaster, Master and maybe some Gold games and see what works best. I'll stay clear of games further down the spectrum, which may have more variability in their play.

Thanks.

On February 17 2019 22:04 WeakOwl wrote: I like the idea and it gives me a new perspective for analyzing my own replays. I believe the only way to keep it consistent would be using the highest level players, because if not you could easily get a lower level game where maybe one player has ok'ish macro but very good micro against an opponent who has good macro but poor micro, and there is no telling what will happen. You could try messaging some random barcodes from KR Grandmaster and asking for replays.

On February 18 2019 05:47 stilt wrote: I like this idea! However, as has been pointed out, it is better if the subjects are aware of the players' league, considering players' reactions are heavily dependent on their overall knowledge/skill level. Overall, pro ladder games should be used.

On February 18 2019 15:23 SimonDennis wrote: Having read the comments (which have been great, by the way), I think I might pilot some Grandmaster, Master and maybe some Gold games and see what works best. I'll stay clear of games further down the spectrum, which may have more variability in their play.


Fully agree that you should use high-level games. Probably just stick to grandmaster (or pro) games. It should be possible to contact a couple of pros and have them send you replays from their ladder games, which will not be publicly available for participants to view.

FYI, Gold league is not what comes after Masters. Gold is a low level. It goes like this:

Grandmaster
Master
Diamond
Platinum
Gold
Silver
Bronze


I'm not sure, but the person to contact about replay handling would be Breath20 from sc2replaystats.com or the people running Scelight. I know Breath20 is on the sc2replaystats Discord (https://discord.gg/FnBCzPn) quite often, so that may be the best way to contact him.

Given a snapshot of the state of the game at a given time, some reflection may be needed to predict a future state. How much it helps to know how the game developed (the replay) depends on whether there are still upcoming decisions or not. I agree that if the game contains harass/micro battles in the prediction window, it is hard to predict how much they will slow things down.

If there is still an upcoming decision in the tree, the main problem would be to identify the build. That would mostly be dependent on the meta, which changes from season to season. The process would probably be:

1.) identify the map; given the map, a date range is known,

2.) given the date range, a list of builds that were/are used can be found,

3.) match the current status to the list of statuses corresponding to those builds to get a set of probable decisions.

There would be a number of possible choices, so the answer would be a list of probabilities on the decision tree. (In a game, if there is no common response to the main choices, the player either scouts for additional information or gambles.)
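The process above can be sketched in a few lines. Everything here is a hypothetical illustration: the map name, the dates, the build names, and their expected states are invented, and the inverse-distance scoring is just one simple way to turn "how close is the current status to each build" into probabilities, not an established method.

```python
from datetime import date

# Hypothetical lookup: map -> date range in which it was in the ladder pool.
MAP_DATES = {"example_map": (date(2018, 11, 20), date(2019, 3, 18))}

# Hypothetical catalog: when each build was in use and the state it
# typically shows at the truncation point.
BUILDS = {
    "two_base_timing": {"used": (date(2018, 6, 1), date(2019, 6, 1)),
                        "state": {"bases": 2, "workers": 38, "barracks": 3}},
    "three_base_macro": {"used": (date(2018, 1, 1), date(2019, 12, 1)),
                         "state": {"bases": 3, "workers": 44, "barracks": 1}},
    "old_all_in": {"used": (date(2016, 1, 1), date(2017, 1, 1)),
                   "state": {"bases": 1, "workers": 20, "barracks": 4}},
}

def build_probabilities(map_name, observed):
    """Return {build: probability} for builds in use during the map's date range."""
    map_start, map_end = MAP_DATES[map_name]
    scores = {}
    for name, build in BUILDS.items():
        used_start, used_end = build["used"]
        if used_end < map_start or used_start > map_end:
            continue  # build was not in use while this map was played
        keys = set(observed) | set(build["state"])
        distance = sum(abs(observed.get(k, 0) - build["state"].get(k, 0)) for k in keys)
        scores[name] = 1.0 / (1.0 + distance)  # closer match -> higher score
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}

observed = {"bases": 2, "workers": 37, "barracks": 3}
probs = build_probabilities("example_map", observed)
print(probs)
```

The output is a probability for each candidate build rather than a single answer, which matches the idea of answering with a list of probabilities on the decision tree.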

If all basic decisions have been made and it is just a matter of adding the workers/units produced down the line, it is probably too easy. Moreover, if the input is available as a replay, anyone should be able to come up with a pretty close result by simply playing out the next few minutes using the "resume from replay" function.

Taking TvZ 2 base vs 4 base, so many factors come into play leading up to this moment: worker damage and harassment, tech delays, army engagement, worker cutting, creep spread, map control. On paper you'd say an undamaged 4 base Zerg player with all gases is definitively ahead and should win the game. But that's the perfect world. From there you need to start breaking down in what ways that advantage is mitigated.