Valve plugged the hole, but important data has already escaped.

A recently discovered hole in Valve's API allowed observers to generate extremely precise and publicly accessible data for the total number of players for thousands of Steam games. While Valve has now closed this inadvertent data leak, Ars can still provide the data it revealed as a historical record of the aggregate popularity of a large portion of the Steam library.

The new data derivation method, as ably explained in a Medium post from The End Is Nigh developer Tyler Glaiel, centers on the percentage of players who have accomplished developer-defined Achievements associated with many games on the service. On the Steam web site, that data appears rounded to two decimal places. In the Steam API, however, the Achievement percentages were, until recently, provided to an extremely precise 16 decimal places.

This added precision means that many Achievement percentages can only be factored into specific whole numbers. (This is useful since each game's player count must be a whole number.) With multiple Achievements to check against, it's possible to find a common denominator that works for all the percentages with high reliability. This process allows for extremely accurate reverse engineering of the denominator representing the total player base for an Achievement percentage.

As Glaiel points out, for instance, an Achievement earned by 0.012782207690179348 percent of players on his game translates precisely to 8 players out of 62,587 without any rounding necessary (once some vagaries of floating point representation are ironed out).
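Glaiel's worked example can be sketched in a few lines of Python. This is an illustration only, not Glaiel's or Steam Spy's actual code: the function name `smallest_fraction`, the relative tolerance of 1e-7 (an assumption chosen to absorb roughly single-precision rounding noise), and the cap on the numerator are all hypothetical choices made for the example. It searches for the smallest whole-number player count whose ratio reproduces the published percentage.

```python
def smallest_fraction(percent, rel_tol=1e-7, max_numerator=10_000):
    """Find the smallest (earned, total) pair matching an Achievement percentage.

    rel_tol and max_numerator are assumed values for this sketch, not
    anything documented by Valve.
    """
    target = percent / 100.0
    for earned in range(1, max_numerator + 1):
        # The only whole-number total that could pair with this numerator.
        total = round(earned / target)
        if total > 0 and abs(earned / total - target) <= rel_tol * target:
            return earned, total
    return None  # no match within the search limit


print(smallest_fraction(0.012782207690179348))  # (8, 62587)
```

A single percentage only yields the fraction in lowest terms, which is why checking several Achievements against one another matters for pinning down the full player count.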

Caveats

Because this data is derived directly from Steam's API for each game, it ends up much more precise than the old Steam Gauge/Steam Spy estimation methods, which relied on random sampling of a small portion of the Steam player base. But this method only works for games with developer-defined Achievements, so it covers about 13,000 of the roughly 23,000 games now on Steam.

It's not exactly clear how Valve defines this "Achievement denominator," which approaches but doesn't precisely match up with the "players" statistics provided to individual developers. The new data also gives no indication of how many people own the game without having played it. And, in very rare cases, this method could come up with a denominator that's off by a factor of two, thanks to common factors (though this chance becomes vanishingly small in games with more than a few Achievements).
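The factor-of-two caveat can be illustrated with made-up numbers. In the sketch below, the total of 1,000 players and the per-Achievement counts are hypothetical, and `common_denominator` is a name invented for this example; it also assumes Python 3.9 or later for the multi-argument `math.lcm`. Each Achievement only reveals its fraction in lowest terms, so the recovered total is the least common multiple of those reduced denominators. If every count happens to be even, that multiple comes out at half the true figure, and a single odd count repairs it.

```python
from fractions import Fraction
from math import lcm  # multi-argument lcm requires Python 3.9+


def common_denominator(counts, total):
    """LCM of the reduced denominators of counts[i] / total."""
    reduced = [Fraction(count, total) for count in counts]
    return lcm(*(frac.denominator for frac in reduced))


TOTAL = 1000  # hypothetical true player count

# All counts even: fractions reduce to 1/500, 1/250, 3/500.
print(common_denominator([2, 4, 6], TOTAL))     # 500, off by a factor of two

# One odd count (7/1000 is already in lowest terms) restores the true total.
print(common_denominator([2, 4, 6, 7], TOTAL))  # 1000
```

The more Achievements a game has, the less likely it is that every count shares a common factor with the total, which matches the article's point that the error becomes vanishingly rare.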

The numbers

Before the Achievement data hole could be plugged, Sergey Galyonkin was able to integrate the method into the machine learning algorithm used for Steam Spy, where the data was briefly displayed on individual game pages earlier in the week. As a public service, and with Galyonkin's permission, we're able to share the Achievement-derived player numbers he collected in this handy CSV file. (Ars was able to confirm the reliability of this data through API-based spot checks earlier in the week.) The top 1,000 games by player numbers are also listed on the following pages, for convenience.

This snapshot, accurate as of July 1, will surely grow less useful as time goes on, and it isn't useful at all for the significant portion of the Steam library that doesn't use Achievements. (Such games aren't included in the data set.) Despite that, and the other caveats listed above, we're happy to share what is probably the most robust and precise data currently available regarding the relative popularity of a large proportion of the Steam library.

Besides satisfying the curiosity of fans, this kind of data can help increase our understanding of the shape of the PC games market. This is the kind of data that other entertainment industries take for granted—in the form of regular reports on box office receipts and TV ratings, for instance—but which remains frustratingly opaque for the game industry at large.

As Valve's Ewert told developers recently, "The only way we make money is if you make good decisions in bringing the right games to the platform and finding your audience." For now, at least, this data leak can help us all better understand many of the games that are "finding an audience" of Steam players.

Kyle Orland
Kyle is the Senior Gaming Editor at Ars Technica, specializing in video game hardware and software. He has journalism and computer science degrees from the University of Maryland. He is based in the Washington, DC area. Email: kyle.orland@arstechnica.com // Twitter: @KyleOrl

I think that means overall players with said game in their account. Considering TF2 has been F2P for about 7 years and also has a major cosmetics market for which people constantly alt and farm, I'm surprised the number isn't higher.

CS: GO is in the 2nd spot but that game has been dead since Valve made changes to how long you have to wait to sell an item on the market. All the skin/crate people lost their incentive to come back. I think this number relates to valid licenses of said game, not current players.

The disheartening thing about this list, to me, is that it shows that Valve likely made hundreds of millions of dollars on Portal 2, but apparently that wasn't enough of a representation of what they're missing out on by abandoning the Half Life series, because I have to imagine HL3 would sell just as many copies as Portal 2 did.

There are a few on this list that I am genuinely surprised are as populated as they are. Spiral Knights, haven't played that in a minute. Didn't think it was that popular of a game. I thought Wildstar died after major population issues.

If you're interested in multiplayer, the preferred client is voobly (see aoezone for how to install) as it's way less laggy. But there's still the awesome single player and a relatively large Steam community if you stick with the Steam version.

There are a few on this list that I am genuinely surprised are as populated as they are. Spiral Knights, haven't played that in a minute. Didn't think it was that popular of a game. I thought Wildstar died after major population issues.

Wildstar had a F2P conversion, so poor kids who wanted to play an MMO probably swarmed it.

I was in the Wildstar open beta, actually, and I remember giving up on it pretty quickly. It was another WoW-derivative, and it didn't really have any cool ideas. Of course, that doesn't mean that the developer behind it hasn't been adding patches and salvaging the title, but it still was so bland and utterly devoid of anything to make it stand out, that I have to agree with you. I don't see how it has an audience.

The disheartening thing about this list, to me, is that it shows that Valve likely made hundreds of millions of dollars on Portal 2, but apparently that wasn't enough of a representation of what they're missing out on by abandoning the Half Life series, because I have to imagine HL3 would sell just as many copies as Portal 2 did.

Portal 2 came out around the same time DOTA2 was still in beta and users were selling beta invites for copies of Skyrim. While Portal 2 sold pretty big, DOTA represented a massive meal ticket in waiting, and Valve decided to double down on it. While I would like to see HL3 or Episode 3 just for closure, it's pretty clear that a full-on sequel is lower on their priority list than getting cosmetics into DOTA/CSGO/TF2.

I think that means overall players with said game in their account. Considering TF2 has been F2P for about 7 years and also has a major cosmetics market for which people constantly alt and farm, I'm surprised the number isn't higher.

As far as we can tell, the number is actually closest to "totaly number of Steam users who have played the game at least once." People with the game in their account but who have never touched it don't count.

There are a few on this list that I am genuinely surprised are as populated as they are. Spiral Knights, haven't played that in a minute. Didn't think it was that popular of a game. I thought Wildstar died after major population issues.

Remember, these are not current player numbers, but the number of Steam users that have ever played the game...

Since its launch in 2013, “GTA V” has sold 90 million units, putting its total haul for publisher Take-Two Interactive Inc. in the neighborhood of $6 billion, far above the success of blockbuster movies like “Star Wars” or “Gone With The Wind,” which both collected more than $3 billion, adjusted for inflation. Even taking into account DVD and streaming sales would not put the biggest movie blockbusters in GTA V’s neighborhood, said Cowen analyst Doug Creutz, estimating those sales might add up to $1 billion to the films’ totals.

By the same token, it's been released on 5 different platforms, each time retailing for $60, plus it has microtransactions up the wazoo, so that it's financially successful is little surprise. Also, Rockstar offers the game through their own client, and I imagine there's a few million who solely own the game through Social Club.

I think that means overall players with said game in their account. Considering TF2 has been F2P for about 7 years and also has a major cosmetics market for which people constantly alt and farm, I'm surprised the number isn't higher.

As far as we can tell, the number is actually closest to "totaly number of Steam users who have played the game at least once." People with the game in their account but who have never touched it don't count.

Can you mention this paragraph in the original article (without total spelled correctly)?

Good points, everyone. I guess I expected more than 12 million on Grand Theft Auto 5 for PC, that sounds low for the most successful media product of all time. I suppose it is probably right.

What you're missing is that the Steam version of GTAV is not the only PC version. If you bought GTAV on PC from somewhere other than Steam (like I did), you would not be part of the number Steam shows.

It's always interesting to see all the ways that data can leak. This is definitely a case that, as a programmer, would never have crossed my mind. (And now I'm wondering if I've ever done anything similar...)

My theory on that would be that HL2 came out in 2004 and achievements were added to Steam in like 2007 or 2008. So if this API hack was based on reading out who had, let's say, a single achievement in a game, or maybe even who has played the game since achievements were added, then a lot of people were already finished and beyond HL2 by the time Valve would have been capturing that data.

I am surprised that games are issued an ID with 3's in them at all seeing as how the number 3 is taboo at Valve. If they are so afraid of the number 3 why not just skip it and release Half-Life 4? It worked for Microsoft.