Monthly Archives: April 2014

In my mind, the heart of Spawning Tool is the extracted, readable build orders. To get to that point, however, there are a lot of replays to sort through, and I think that’s where the tagging system becomes valuable. Many of the tags are auto-generated from replay data or extrapolated from past data. The biggest area still requiring human analysis, however, is labeling build orders, which, judging from the front page, hasn’t been well-distributed in the community.

And I admit that the experience so far has sucked. You had to find a replay on your own, and it took at least 4 or 5 clicks to punch in a build order. Hopefully, however, it’s a lot easier now with the new system for labeling build orders. There are a few parts to this.

First, you can now approve and reject suggested tags with one click. Previously, it was hard to know what the taxonomy of build orders was, and it took too many steps. Now, there are a few thousand procedurally-generated suggested tags for you to approve. You can take a look at the build order and hit “yes” or “no” to say whether the tag is appropriate.

Second, the interface to tag replays is up top on this page. Previously, it was hidden at the bottom of the page on the sidebar and took a click to open up. Now, you’re automatically focused into the box so you can add tags immediately on page load without having to click or scroll anywhere. Hopefully, you can pair up the action of approving or rejecting a suggested tag with more detailed tags on top of that.

Finally, you now see isolated build orders and can browse replay-to-replay. Previously, you had to bounce back and forth from browse (or open 10 tabs at once like me) to tag several replays in a row. In the build order labeling pages, you can jump from random build to build and stay on a roll.

The link to the build order labeler is on the front page, so you can hop straight into that and check out the world of actual in-game builds. Remember to log in as well so that your tags are associated with you and counted in the leaderboard.

One more thing: my hope is that labeling build orders can become more and more automatic (though still with some human intervention). Machine learning is in the works and can generate suggestions, but it will only improve with more hand-labeled training data. I’m not sure where the tipping point is, but I’m excited to get to a point where that can take off on its own!

For each replay, the map is divided into a 3×3 grid, and each of the eight outer cells is assigned a clock position (11, 12, 1, 3, 5, 6, 7, 9). The position of each player’s starting building (CC, Hatch, Nexus) is recorded. Given those, cross positions are all locations that share neither a column nor a row, leaving 3 cross positions for each corner starting location. For example, 11 is cross from 3, 5, and 6.
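To make the grid rule concrete, here is a minimal sketch in Python. It is not the actual Spawning Tool code: the function names, the map dimensions passed in, and the assumption that the origin is at the bottom-left with y increasing upward are all mine, for illustration only.

```python
# Clock positions laid out on the outer cells of a 3x3 grid.
# The center cell has no clock position, so it is None.
CLOCK_GRID = [
    [11, 12, 1],
    [9, None, 3],
    [7, 6, 5],
]


def clock_position(x, y, width, height):
    """Map a start location to a clock position.

    Assumption: (0, 0) is the bottom-left corner of the map and
    y grows upward, so a large y lands in the top row of the grid.
    """
    col = min(int(3 * x / width), 2)
    row_from_bottom = min(int(3 * y / height), 2)
    row = 2 - row_from_bottom
    return CLOCK_GRID[row][col]


def cross_positions(clock):
    """Return every clock position that shares neither a row nor a column."""
    for r, row in enumerate(CLOCK_GRID):
        if clock in row:
            c = row.index(clock)
            break
    else:
        raise ValueError(f"unknown clock position: {clock}")
    return sorted(
        CLOCK_GRID[r2][c2]
        for r2 in range(3)
        for c2 in range(3)
        if r2 != r and c2 != c and CLOCK_GRID[r2][c2] is not None
    )


def is_cross(a, b):
    """True if the two clock positions count as cross positions."""
    return b in cross_positions(a)
```

With this sketch, `cross_positions(11)` gives `[3, 5, 6]`, which matches the example above, and a pair like 11 and 1 is not cross because the two share the top row.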

The replays are mostly from released tournament replay packs uploaded to Spawning Tool. Unfortunately, the biggest source of professional games is WCS, and they haven’t released packs for 2013 season 3 or 2014 season 4 (though I’m excited to redo these numbers after they do!). Because of that, we don’t have as many examples from newer maps.

Maps are collapsed across the different versions (e.g. Frost and Frost LE are counted together). Star Station was changed to a 2 player map at some point, and Alterzim Stronghold is relatively new. For the other maps, close positions are twice as likely as cross positions, so that’s the difference in counts.
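The two-to-one ratio can be checked by counting pairings. The sketch below assumes a four-player map with one start location in each corner and every pairing equally likely; neither assumption is stated in the data itself, and the names are made up for illustration.

```python
from itertools import combinations

# Hypothetical four-corner map: clock position -> (row, col) in the 3x3 grid.
CORNERS = {11: (0, 0), 1: (0, 2), 7: (2, 0), 5: (2, 2)}


def count_pairings(cells):
    """Count (close, cross) pairings among the given start locations."""
    close = cross = 0
    for a, b in combinations(cells, 2):
        (r1, c1), (r2, c2) = cells[a], cells[b]
        if r1 != r2 and c1 != c2:
            cross += 1  # shares neither row nor column
        else:
            close += 1
    return close, cross
```

Calling `count_pairings(CORNERS)` returns `(4, 2)`: of the six possible pairings, four are close and two are cross.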

A confounding factor here is bans. Since players in tournaments can ban maps that they don’t have favorable matchups in, we have a biased sample on these maps. I don’t really have any thoughts here.

The cross/close position data is available on Spawning Tool (though you do need to append &tag=1173 or &tag=1172 to the URL to make it work in the research tool), so I welcome you to poke around with the data there to see if you can find anything else. Also let me know if there is anything else you’re interested in that you think can be informed by replay analysis!

A few weeks ago, I sent out a survey (still open here if you want to fill it out) about how users use Spawning Tool and what they were interested in seeing in future development. Thanks to the feedback there, I have made quite a few changes recently. There are a few big ones I want to talk about in more detail in future posts, but here’s a list of some of the smaller ones.

First, the browse replays page now shows the names of tagged players. This happens to be one of the most important pieces of information to see at a glance, and it doesn’t clutter the interface. I would have liked to show the map name as well, but the poor standardization in map names would make it messy, and you’re better off using the hierarchy from the tag filters.

Second, I slapped race icons around on the site. One totally valid criticism of Spawning Tool is that it lacks any visuals. I’m not great with either visuals or data visualization, so I largely depend on text and numbers to convey things. I’m open to other suggestions on visuals as well.

Third, I opened up tag pages for all users. I was previously using this just as an administrator tool, but it’s a handy dashboard around a player or build order. Currently, it contains the list of replays tagged and the parents and children of the tag so you can see the hierarchy that exists behind the scenes. I’m a little scared of fleshing out the page too much since generating content is time-consuming and would probably look a lot like Liquipedia content, but if you have any ideas on useful things for this page, I’m open to suggestions.

Fourth, there have been various tweaks to the research pages, which were largely inspired by my own annoyances in using them. You can now filter by build orders for each player, and the View Win Rates page has more data so you can read things off more easily. I think I buffed out the advanced research page as well, but you should still consider that “under construction”.

Fifth, you can now drag-and-drop .SC2Replay files onto any page (other than the upload page) to instantly upload your replays. A common use case I see for replay sharing is getting feedback from others, and I wanted to make it as painless as possible for someone to share a replay and the build orders.

Those are the minor but not trivial updates. Look for updates soon on other features, and send along any feedback on these or other proposed changes for Spawning Tool.

It’s past 3AM here, and over the past 6 hours or so, I have been cranking on a few minor features for Spawning Tool, but primarily machine learning to learn to label build orders. It’s not very well-trained at the moment, but it got to 61% on Reaper Expands, so it was above 50-50. More importantly, the code ran to completion! I’ll write more about that soon.

In the meantime, however, I think I might be sleeping in tomorrow, so I thought I would publish stats before heading to bed. Enjoy the semifinals tomorrow!

2. HyuN can be deadly in the early game. He’s 9-0 before 12 minutes and 13-2 before 16 minutes (ref)

Methodology
I used the data from Spawning Tool to generate all of these statistics. Notably, replays for WCS 2013 season 3 and the current WCS season have not been released, so those are not included in the current sample. Some players have more data from other recent tournaments, whereas others may be based on older data with different play styles.

If you get a chance, please poke around with the data on Spawning Tool and share any other interesting trends you find!