
The State of Competitive Prismata: Ladder, Ratings, Arena, and Unit Updates

My apologies for the late post. I have been a bit sick, and I have been spending a lot of time on the road with PAX and now SXSW. It’s finally up; you can put your pitchforks away.

This post is actually a bit overdue. I wanted to talk about a number of changes that we’ll be making in the upcoming weeks as a direct result of the feedback we have been receiving from our players. This post is pretty long, containing a lot of info, comments, pseudo-announcements, and actual announcements, and will cover:

Unit balance tweaking

The Prismata leaderboard (and its future)

Ratings

Our policies regarding smurfing

Competitive game modes

We’ll also be giving a few sneak peeks of some upcoming new units.

Unit Balance Tweaking

The most important of today’s announcements is a new unit balance survey, featuring questions about Scorchilla, Electrovore, and all the other units we’ve added or changed in the last few weeks:

We also made a big change recently to one of the most contentious Prismata units—Deadeye Operative.

ANNOUNCEMENT 2: Deadeye Operative’s cost has been changed from 11BB to 4BB + Consume a Steelsplitter

Here’s my explanation of the changes, copied from the related reddit thread (if you have any comments on the new Deadeye, feel free to post them there!)

Balancing a unit with two abilities is always difficult. A 2/4 that cannot block isn’t worth much more than 8BB (consider Doomed Mech, which has an excellent blocking ability), and the old drone-sniping deadeye was priced fine at 7BB (though it had other problems). So a unit that can choose one of these abilities OR the other should cost at least 8BB, but how much more? The answer tends to depend on the option value afforded by having the choice of two different abilities.

The old Deadeye had the problem of this option value being worth A TON when the unit was rushed, because the traditional drawback of getting the first attacker (that it would deal no injury to the opponent if the opponent responded with a Wall) is completely negated by the sniping ability. 11BB adds 3 gold worth of cost, effectively balancing “one sniped Drone”, but certain Deadeye rush builds (e.g. the P2 DD/DD/DBB into Deadeye) get even more value out of the unit.

Of course, this leads to a problem. Increasing Deadeye’s price wouldn’t really nerf this rush that much unless we pushed it to 13BB, in which case it would be almost unbuyable in the late game (two Steelsplitters would be better in most cases).

We did some analysis ourselves and didn’t find insta-win/unbeatable rushes for either P1 or P2 (though some seemed VERY strong, especially alongside other BB units like Energy Matrix, Grenademech, and Doomed Mech). But we did find that Deadeye openings were often so good that in most Deadeye sets, it was almost always correct to buy Deadeye as the first attacker.

So the question became… how can we weaken Deadeye rushes without affecting the balance of the unit? We thought about increasing its tech costs to BBB, adding other tech colours, adding costs to its ability (e.g. adding a B cost so you couldn’t spam Deadeyes in the opening), making it have only one supply, adding arbitrary rules like “cannot buy until turn 8”, and even removing the unit entirely.

Credit goes to Alex for thinking of the solution we went with: Deadeye is now an UPGRADE for Steelsplitter. This means:

You can’t get a Deadeye straight away off of two Blastforges. You have to get a Steelsplitter the turn before, which means you have to get at least one Blastforge TWO TURNS before you get the Deadeye.

Consequently, rushes aimed at getting a Deadeye out on turn 4 are substantially weakened.

You cannot consistently pump Deadeyes out of two Blastforges. You have to also get Steelsplitters, which means you need a third Blastforge, or you need to take a break from pumping Deadeyes for a turn.

The Steelsplitter’s attack is not consumed when a Deadeye is constructed. However, it will be wasted during a rush if the opponent creates a Wall (which further decreases the value of rushing out a Deadeye instead of buying it later in the game).
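To make the timing argument concrete, here is a minimal sketch of the tech constraint only (gold is ignored). The function names are hypothetical, and the model rests on assumptions that are simplifications rather than official rules text: a Blastforge bought on turn t supplies B from turn t + 1, a Steelsplitter needs one B, a Deadeye needs BB, and the Steelsplitter has to be bought at least one turn before it is consumed.

```python
def earliest_deadeye_turn(forge_turns):
    """Earliest turn a Deadeye can be bought under the new upgrade rule.

    forge_turns: turns on which Blastforges were bought (at least two).
    Assumed model (illustrative only): a Blastforge bought on turn t
    supplies B from turn t + 1; Steelsplitter needs B; Deadeye needs BB
    plus a Steelsplitter bought on an earlier turn. Gold is ignored.
    """
    if len(forge_turns) < 2:
        raise ValueError("Deadeye needs BB, so two Blastforges are required")
    first, second = sorted(forge_turns)[:2]
    steelsplitter_turn = first + 1   # first turn with one B available
    two_b_turn = second + 1          # first turn with BB available
    return max(steelsplitter_turn + 1, two_b_turn)


def earliest_deadeye_turn_old(forge_turns):
    """Same model for the old 11BB Deadeye, which only needed BB."""
    if len(forge_turns) < 2:
        raise ValueError("Deadeye needs BB, so two Blastforges are required")
    _, second = sorted(forge_turns)[:2]
    return second + 1


# Two Blastforges both bought on turn 3:
print(earliest_deadeye_turn_old([3, 3]))  # 4
print(earliest_deadeye_turn([3, 3]))      # 5
```

Under these assumptions, a player who buys both Blastforges on the same turn gets the Deadeye one turn later than before, while a player who staggers the Blastforges (say turns 2 and 3) loses no tempo on tech but has to pay for the Steelsplitter first.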

We’re still experimenting with the unit. As before, it’s tricky to price; the correct cost might be anywhere from 3BB to 6BB, so we’re keeping an eye on things, with the idea of tweaking knobs as necessary to arrive at the best possible version of Deadeye we can find. Let me know if you have any feedback.

The Prismata Leaderboard

Ever since we first added it, the Prismata leaderboard has been a source of contention as top players battled over the highest spots. Many players love the idea of being able to climb a ladder to prove that they are among the top Prismata players. But the leaderboard can also have a number of negative effects, including:

Ladder anxiety: players may not want to play games for fear of losing a position on the leaderboard.

Cheating: players may trade wins or use smurfs to gain high leaderboard positions.

Unfairness: inactive players might sit on high ratings, claiming a higher leaderboard rank than they truly deserve as the remaining community improves at the game.

The current highest rated Prismata players

To provide some context, let me emphasize one thing: this leaderboard is JUST FOR FUN. We never plan on basing invitations to tournaments or world championships on ratings (either peak ratings, or ratings at midnight on a particular day). Instead, we are working on developing a point system, tentatively called Grandmaster Points, that will be awarded to top players based on their best tournament performances in a sanctioned series of regularly run tournaments. However, this tournament series (tentatively called the Grand Prix) will likely not start for several months.

(Incidentally, if you have any ideas or suggestions on how the Grand Prix system should work, or know of similar systems that have been successful for other games in the past, please let us know! One important goal for us is to be inclusive of players of all skill levels.)

As for the leaderboard, some players have suggested replacing the current top ratings with a ranking that sorts players by their highest rating ever achieved. Though this can help reduce some ladder anxiety (players never have to worry about bad performance affecting their position on the ladder), it also has its own problems (namely, it heavily rewards luck, and is quite vulnerable to cheating).

Of course, we know that a number of players enjoy working their way up the ranks, so it seems likely that some version of the current leaderboard should remain in Prismata. However, our new user interface (coming in April) will display other top player rankings, which may include arena mode scores, Grandmaster Points, and daily or weekly performance ratings (e.g. the average of the ratings of the best 20 players that you beat this week).
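As a rough illustration of the weekly performance rating idea, here is a small sketch. The function name is hypothetical, and two details are assumptions that haven't been decided: a player with fewer than 20 wins is averaged over the wins they do have, and the same opponent can count more than once.

```python
def weekly_performance(beaten_opponent_ratings, top_n=20):
    """Average rating of the top_n highest-rated opponents beaten this week.

    Returns None when the player has no wins. With fewer than top_n wins,
    the average is taken over the available wins (an assumption).
    """
    if not beaten_opponent_ratings:
        return None
    best = sorted(beaten_opponent_ratings, reverse=True)[:top_n]
    return sum(best) / len(best)


# Only the two strongest beaten opponents count when top_n=2.
print(weekly_performance([1500, 1700, 1600], top_n=2))  # 1650.0
```

A measure like this only ever goes up during the week, so a loss never hurts it.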

For the “top rated players” ladder, we do plan on making a couple of changes to the current leaderboard in the upcoming few days. The first is visibility decay for inactive players, which we first discussed on reddit several weeks ago. We’re going to try the 15/15/15 scheme for now, but we’re open to changing this if you folks think we are being too harsh (or too lenient).

ANNOUNCEMENT 3: An inactivity penalty will soon be applied to the leaderboard-displayed ratings of players who have played fewer than 15 ladder games in the past 15 days. The penalty is 15 points for each game not played (max of 225 points). This penalty is progressively removed as soon as each missing ladder game has been played.
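For anyone who wants to check the arithmetic, here is a minimal sketch of the 15/15/15 scheme as described above. The function and constant names are hypothetical, and this is not the server code; it only reproduces the stated rule.

```python
REQUIRED_GAMES = 15            # ladder games expected in the window
WINDOW_DAYS = 15               # length of the window, for reference
POINTS_PER_MISSING_GAME = 15   # penalty per game not played


def inactivity_penalty(games_in_window):
    """Penalty applied to the leaderboard-displayed rating.

    games_in_window: ladder games played in the past WINDOW_DAYS days.
    """
    if games_in_window < 0:
        raise ValueError("games_in_window cannot be negative")
    missing = max(0, REQUIRED_GAMES - games_in_window)
    return missing * POINTS_PER_MISSING_GAME


def displayed_rating(rating, games_in_window):
    """Rating shown on the leaderboard; the underlying rating is unchanged."""
    return rating - inactivity_penalty(games_in_window)


print(inactivity_penalty(0))       # 225, the maximum
print(inactivity_penalty(12))      # 45
print(displayed_rating(2000, 15))  # 2000, no penalty
```

Each missing game that gets played removes 15 points of penalty, so a returning player is back to their full displayed rating after at most 15 ladder games.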

The goal of this policy is threefold: to encourage top players to play more games, to increase the legitimacy of the ladder by preventing players from sitting on high ratings, and to give a bit of a gentler experience to players who might be coming back after a few weeks away from the game. This change will go live whenever the next version of our server is deployed (likely next week).

The second change is explained below.

Prismata’s Rating System

Prismata’s rating system doesn’t use Elo or Trueskill; it actually uses a homebrewed system based on Bayesian learning, designed by Shalev—our resident rating expert (and MIT PhD student nerd). The system is pretty flexible and lets us tweak things until we’re perfectly happy with the results. And tweaks will be happening soon!

Before I get to the tweaks, I want to explain a bit about why changes are necessary. It might not be obvious that a rating system (e.g. Elo) could possibly mispredict the probability that one correctly-rated player beats another (after all, isn’t that what a rating system is supposed to do?). However, not all rating systems are created equal. Here’s a quote of mine from a reddit discussion on the topic.

Consider the following thought experiment: Suppose A beats B 1/4 of the time and B beats C 1/4 of the time. How often is A expected to win against C?

Different rating models give vastly different answers to this question, and different competitive games offer different answers as well. The Elo system would estimate A beats C about 10% of the time. But there are games where this could be higher or lower (anywhere from 0% to 25% might be valid). See these examples:

Example 1 (25%): the game is “take two people’s ages and flip two coins; if they are both heads, the older person wins. Otherwise, the younger person wins.”

Example 2 (0%): the game is “Flip 2 coins with the older player winning 3 points for each heads and the younger player winning 3 points for each tails. Then add 1 point for each year of each player’s age.” and the players A/B/C have ages 10, 14, and 18.

Obviously these are stupid games, but if you replace age with “skill”, then suddenly it becomes apparent that different kinds of games can have vastly different win probabilities depending on the gap in skill.
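Both toy games are small enough to check by brute force. The sketch below enumerates the coin flips exactly; the function names are hypothetical, the ages used for Example 1 are arbitrary (only the age order matters there), and distinct ages are assumed throughout.

```python
from fractions import Fraction
from itertools import product


def game1_win_prob(age_x, age_y):
    """P(x beats y) in Example 1: the older player wins only on two heads."""
    older_wins = Fraction(1, 4)
    return older_wins if age_x > age_y else 1 - older_wins


def game2_win_prob(age_x, age_y):
    """P(x beats y) in Example 2, enumerating the four coin outcomes."""
    wins = 0
    for flips in product("HT", repeat=2):
        heads = flips.count("H")
        tails = 2 - heads
        older_bonus, younger_bonus = 3 * heads, 3 * tails
        if age_x > age_y:
            score_x, score_y = age_x + older_bonus, age_y + younger_bonus
        else:
            score_x, score_y = age_x + younger_bonus, age_y + older_bonus
        wins += score_x > score_y
    return Fraction(wins, 4)


# Example 1: A is the oldest, so A beats B and B beats C 1/4 of the time.
print(game1_win_prob(30, 20), game1_win_prob(20, 10), game1_win_prob(30, 10))
# prints: 1/4 1/4 1/4

# Example 2: A, B, C have ages 10, 14, 18.
print(game2_win_prob(10, 14), game2_win_prob(14, 18), game2_win_prob(10, 18))
# prints: 1/4 1/4 0
```

In both games A beats B and B beats C exactly a quarter of the time, yet A's chance against C is 25% in one game and 0% in the other.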

The point is, our rating model implicitly contains a “curve” for this “skill gap vs win percentage” relationship. It’s different from the curves used by Elo, Trueskill, and other algorithms. It’s a little bit off at some skill levels, and we’ll be making some adjustments to correct this.
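For comparison, here is what one such curve looks like in code. This is the standard Elo logistic curve on the usual 400-point scale, not Prismata's own model (whose curve isn't shown in this post); the function names are just for illustration.

```python
import math


def elo_expected(rating_gap):
    """Elo expected score for a player who is rating_gap points ahead."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


def elo_gap_for(win_prob):
    """Rating gap (negative if behind) at which Elo predicts win_prob."""
    return 400.0 * math.log10(win_prob / (1.0 - win_prob))


gap_ab = elo_gap_for(0.25)   # A sits roughly 191 points below B
gap_ac = 2 * gap_ab          # rating gaps add along the chain A, B, C
print(round(gap_ab, 1))                # -190.8
print(round(elo_expected(gap_ac), 3))  # 0.1
```

Under Elo, the odds multiply along the chain (1:3 times 1:3 gives 1:9), which is where the 10% figure in the thought experiment comes from. A game with a differently shaped curve would give a different answer for the same two 25% matchups.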

ANNOUNCEMENT 4: The rating system will receive some minor tweaks. These will result in the following differences:

Games between players of distant ratings will result in a smaller change if an upset occurs.

High-ranked players’ ratings will adjust more slowly when they win or lose games (it will take more wins/losses to climb or drop the same amount).

Like the previous change, this change will go live whenever the next version of our server is deployed (likely next week).

A few other rating-related changes are under consideration for inclusion later this year. They include:

More rating transparency; for example, displays of how many rating points (or Tier %) can be won or lost before each game.

Rating history: a display of how many points were gained or lost in past games.

Separate ratings for blitz time controls.

Though these ideas are under consideration, I don’t wish to make any hard promises on them, because we’re very cautious about presenting information that might annoy players or drive them away from the game. When the new user interface is launched next month, you may see some of these features added.

Smurf Accounts

There has been a ton of discussion in the Prismata chat and subreddit regarding this, so I want to clarify our position on smurf accounts.

A smurf account is a secondary Prismata account used for laddering by a person who already owns a primary Prismata account.

Smurfing might seem harmless, but having smurfs on the Prismata ladder can damage the experience for players in a number of ways. We want our top 200 ranking list to reflect the 200 best Prismata players, and the legitimacy of the list can be called into question if the same player is represented in multiple spots because they’ve laddered multiple accounts up to the top. Players just outside the top 200 can feel cheated out of a spot that they might otherwise deserve. And of course, smurfing can lead to all kinds of fraud and shenanigans when it comes to tourneys.

It is for reasons like these that smurfing has always been banned in our ToS. You can read about that in section 4 here. It doesn’t mean we necessarily will disable smurf accounts, but we have full authority to do so. If we abruptly decide to nuke your smurf account from orbit, you can’t whine about it (because we told you so!)

That said, I want to clarify our “practical” policy on the matter, which essentially amounts to the following:

ANNOUNCEMENT 5: For the time being, Lunarch Studios does not intend to remove or delete smurf accounts unless asked to do so.

Of course, this may change at any time (especially if we notice somebody using a smurf account to do something particularly nefarious or annoying.) But for now, you can keep your smurfs. There are two reasons why we’re being a bit soft on smurf accounts for now:

A ban on smurfing is fundamentally unenforceable. We can check the IP addresses of our players to see if multiple accounts played games from the same location, and there are several other tools (e.g. web fingerprints) we can use to identify multiple accounts played from the same computer, but this can falsely flag roommates or two friends sharing a computer. It’s effectively impossible to distinguish a smurf from a legitimate pair of friends playing two accounts on the same computer without additional evidence.

We want to eliminate smurfing at the source: by removing the incentives to smurf.

We talked to a number of players with known smurfs and asked them why they had created the accounts. Their answers usually were related to the ladder; they wanted to play in a different or experimental style, but didn’t want to risk their high position. Several planned features will help reduce the incentives to smurf. They are:

Unranked games. A new unranked game option (to be debuted when Arena mode is launched) will allow players to play practice games against strangers without risking their ratings.

Inactivity penalties. The inactivity penalties described above eliminate the temptation to sit on one’s rating, which reduces the need for a smurf.

Rewards. Once players start collecting skins and emotes, they will be incentivized to keep their entire collection on a single account.

We hope that these changes (all coming in the next month or so) will be enough to discourage players from laddering with smurfs.

All that said, we do want to remove obvious smurfs from the list of top Prismata players, so that the top 200 represents 200 unique Prismata-playing people. If you’re in possession of a smurf account that is high on the leaderboard, send me an email and I’ll hide it from the top 200. This is entirely voluntary, but I’ll be nagging those of you with known smurf accounts!

Game Modes and Teasers

One final announcement concerning Prismata’s game modes:

ANNOUNCEMENT 6: The “GM set” option in the queue settings will be removed.

The queues for the GM set were almost always empty. With the new “make your own unit set” feature, many of you have found other ways to enjoy specific Prismata unit sets over and over again. It’s not clear that the place for those matches is on our 1v1 ladder, so we’ll be removing the option.

The arena and reward systems are mostly specified and implemented, but we’re still testing a few things. Expect an announcement on how they will work in the upcoming weeks (likely 1-2 Tuesdays away on this blog).

Finally, I wanted to post some teasers for a few of the new units that we’ll be releasing next week. Here are the images (full specs to come next week!)

About Elyot Grant

A former gold medalist in national competitions in both mathematics and computer science, Elyot has long refused to enjoy anything except video games. Elyot took more pride in winning the Reddit Starcraft Tournament than he did in earning the Computing Research Association's most prestigious research award in North America. Decried for wasting his talents, Elyot founded Lunarch Studios to pursue his true passion.