Testing Preordain: Qualitative Results

Once again, it is time to start rolling out my results from the latest Banlist Test. As usual, I will start with the experimental setup and the unquantifiable results. I know that what most readers care about are the hard numbers, but I’m not done gathering the data yet. That will be coming sometime in September—probably. I’m done with Storm and about halfway through the UW testing. Completion date will depend on how the PPTQ season goes, as I’m splitting my testing time between that and Preordain.

For those who are new to the series, I take a card from the Modern Banned List, put it back into the deck that got it banned (or as close as possible), and see how it fares in the current metagame. My goal is to bring hard data and scientific inquiry into the discussion instead of more opinion and baseless speculation. Therefore, I play a lot of matches with the deck (normally 250 with the banned card, 250 without it) to build a sufficient data set for analysis. I take the test data, compare it to the control data, and from that I hypothesize about the safety of the test card. I laid all this out in more detail in a previous piece. The card that readers voted for me to test this time was Preordain.

This test was very different from the last several. With both Stoneforge Mystic and Jace, the Mind Sculptor, I just tested a single deck against the gauntlet. While this often took a while, the testing was fairly straightforward: I took the deck, learned it well enough to be passable, and ran the gauntlet. The decks I was using certainly helped. Yes, they were midrange decks, but their gameplan was clear and their decision trees relatively simple and comprehensible.

This time, for reasons explained here, I tested Gifts Storm and UW Control. This complicated things. To get a decent data set for both, I’d have to play a lot more games; doing the usual 500 matches would yield only half the data for each deck. I made this harder for myself by playing hard decks against hard matchups. These decks require a lot of experience to navigate, and Storm is very vulnerable to its own mistakes in the face of pressure. I’m not claiming to have played these decks perfectly, but I was at least average with Storm and good enough with UW that I took an updated version to a PPTQ. So if you see issues with the results or my data, consider that I am just one man with a few volunteers; in an undertaking this large, exhaustion and deck difficulty are bound to play a part.

Experimental Setup

As always, I would be piloting the test decks against (semi-) willing opponents wielding decks that they are reasonably good with. We’d play match after match at a stretch, with me alternating between the test and control deck to even out the experience and skills I was developing during the tests. Prior to data collection, we always played at least a few practice games to get a feel for things and determine the correct sideboard plans. Previously, my team has used a variety of methods to actually play the games, including MTGO. We did not use MTGO at all this time. This prevented us from losing matches to misclicks and ruining the data set. It was also significantly cheaper. I don’t own most of the digital pieces for Storm, couldn’t get them, and already dislike MTGO. Playing paper in person or over Skype was much easier. And free. I like free.

As I mentioned above, my data set is normally 500 matches. That is too small a set for two decks, but it was logistically implausible to just double it. It takes months to get all the data together as is; doubling would push completion into October at the earliest. I’m just not going to put that kind of time into this project. Therefore, this data set is 640 total matches (160 per deck, and 32 per matchup). Why 640? I didn’t have a set target when I started, but I knew that 150 was the bare minimum. Of course, I was testing both decks simultaneously to save time, and I was burning out. I decided I’d had enough at 27 matches per matchup, but that was an ugly-looking number and felt like too big a cop-out, so I kept going to 30. And then did two more so we’d get nicer aggregate numbers.
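For a sense of what these sample sizes buy (my own back-of-the-envelope math, not part of the test methodology), the uncertainty on a measured win rate shrinks only with the square root of the match count. A quick sketch using a standard normal-approximation interval:

```python
import math

def win_rate_ci(wins: int, matches: int, z: float = 1.96) -> tuple:
    """95% normal-approximation (Wald) interval on a match win rate."""
    p = wins / matches
    half_width = z * math.sqrt(p * (1 - p) / matches)
    return (p - half_width, p + half_width)

# A 55% win rate measured over 160 matches vs. the usual 250:
print(win_rate_ci(88, 160))   # roughly 55% +/- 7.7 points
print(win_rate_ci(137, 250))  # roughly 55% +/- 6.2 points
```

Dropping from 250 matches per deck to 160 widens the interval by about a point and a half either way, which is one reason small win-rate gaps between the test and control versions may not be statistically meaningful.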

The Test Decks

All of the decks were chosen in mid-May. They are as close to “average” lists as my team could find. Several members were irritated, as they wanted to try out their personal tech during testing, but the whole point is to see how these cards work against a representative metagame. Thus we used the most average build of every deck possible.

Choosing the test decks was harder than actually fitting in Preordain. In previous tests, I actually had to build decks around the test card: Stoneforge Mystic requires six slots minimum, and Jace, the Mind Sculptor rewards decks that play lots of very cheap spells. This required actual deckbuilding. This time I’m testing a cantrip in decks that already play cantrips, so I just replaced the weaker one with Preordain. There was some consideration of adding more, like a Legacy deck would, but we couldn’t agree on how to do that and the clock was ticking. I went with the quick and easy option.

The core combo of the deck is very well established, and it’s just as powerful and fragile now as it was in 2013. Swapping Pyromancer Ascension for Baral and the banned Gitaxian Probe for Gifts Ungiven is the only new innovation. I saw some lists running Merchant Scroll, but that was very much a fringe choice and didn’t make the cut.

The most common sideboards at the time were Gifts packages. I’m not sure they’re actually better than more focused boards, particularly because there are no Blood Moons, but this was what saw the most play at the time. I don’t know that it made much of a difference. My experience showed that sideboarding was a very delicate thing, and I sideboarded as minimally as possible to preserve the combo. I doubt that the exact composition of my sideboard would have changed that plan. There was some consideration of the transformative Madcap Experiment/Platinum Emperion combo, but everyone I asked said it was worse than extra Empty the Warrens.

The Spell Queller plan was popular at the time, though it has gone away recently. I didn’t really like it, but it also didn’t have much opportunity to shine.

The Gauntlet

As usual, I chose five decks from all corners of the metagame, giving preference to Tier 1 decks. Again, the point is to test the power of these boosted decks; it makes the most sense to test against the best. This was both easier and harder than before. Every type of deck was represented in Tier 1 in May, but the control deck was UW Control. Which I was already testing by virtue of it being the… erm, control deck.

I needed to use the same gauntlet for both decks so the results were comparable. As such I fudged it to use a Jeskai list. This is not unusual now, with Jeskai ticking up in popularity, but it was unheard of at the time. I’m also fudging a bit by using Counters Company as my combo deck. It’s far more combo than Abzan Company was, but it’s still not a true combo deck.

Preordain, Qualitatively

The initial results are actually very disappointing. At this point I’ve played over 500 matches (~140 to go!) and I don’t have a strong opinion on Preordain. This shouldn’t be surprising: it’s a cantrip. Cantrips don’t have that much impact on a game (unless you play a lot of them), hence the name (it’s a D&D reference). They’re like the oil in an engine: you notice when it’s not there, but otherwise you just don’t see the impact. Upgrading your cantrip is like buying higher-quality oil. Yes, your engine will run smoother and your mechanic may notice the difference, but you are unlikely to actually notice any change in normal operation.

In a way, that is my answer. It didn’t really feel special to play with Preordain. It was a definite improvement over the replaced cantrip, but not enough for me to feel strongly about the card. Its value swung wildly based on the situation and stage of the game, but so does that of any cantrip. Part of that may be how I played it, and it is very possible that decks would be built very differently with Preordain in the format. But players may also find that the lengths you have to go to just aren’t worthwhile, like putting high-octane gas and racing lubricant in a Civic.

In Storm

I barely noticed any difference between Preordain and Sleight of Hand. This is probably because most of the time Preordain was Sleight of Hand. I will include the actual numbers when I circle back to this, but most of the time I kept one card and bottomed the other. You do get extra value from having options, but I didn’t use them very often. It is entirely possible that I was wrong about that, but it certainly didn’t seem that way to me or my team.

Preordain was swept up in the post-Pro Tour Philadelphia 2011 crackdown on combo. At the time it made sense: not all the combo decks used fast mana, but they all used cantrips. Subsequent bannings have further weakened combo, and based on what I experienced, those later bannings made cantrips worse in combo. Games where I had a cost-reducer into Gifts Ungiven were far better than ones where I strung cantrips together. Preordain just didn’t feel important to Storm.

In Control

Of course, it really doesn’t feel special in UW either. It is unequivocally better than Serum Visions after turn four, but on turns 1-2 it’s worse. In the mid- to late game, you’re looking for specific answers, and Preordain delivers them right away instead of setting you up for next turn. Early on, however, you’re just looking to get deeper into your deck, and Visions will always show you three cards: you get a random card that you won’t play anyway, then set up the next two turns. As a result, it’s normally correct to cast Visions at the first opportunity. Preordain doesn’t work that way, so you don’t play it early, saving it to find specific cards when you need them. I suspect that I should have played both, but hindsight is 20/20. I believe that I’m doing better as the game goes long but losing to mana screw early more often. We’ll see what happens when the data comes in.
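That intuition can be made concrete with a little hypergeometric arithmetic (my own illustration; the library and copy counts below are made up): the chance a cantrip finds one of your live cards on the turn you need it depends on how many fresh cards it effectively sees, from one for the immediate draw off Serum Visions, to two for Sleight of Hand, to up to three for Preordain if you bottom both scried cards.

```python
from math import comb

def p_find(library: int, live: int, depth: int) -> float:
    """P(at least one live card among the top `depth` cards of the library)."""
    if depth > library - live:
        return 1.0
    return 1 - comb(library - live, depth) / comb(library, depth)

# Illustrative numbers: 45 cards left in the library, 4 copies of the
# answer you need right now.
for name, depth in [("Serum Visions (draw 1 now)", 1),
                    ("Sleight of Hand (look 2)", 2),
                    ("Preordain (scry 2, draw 1)", 3)]:
    print(f"{name}: {p_find(45, 4, depth):.0%}")
```

Finding the answer roughly a quarter of the time instead of under a tenth on the critical turn is a real edge, which matches the feeling that Preordain outclasses Visions late even though Visions shows you more total cards over several turns.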

Coming Soon

So that’s it for now; I’ll be back with the data sometime relatively soon. Next week, we’ll see whether anything interesting happens to the banlist on Monday.

David began playing Magic during Odyssey block, quit playing Magic when Caw Blade ruled the world, and returned to Modern shortly before Deathrite was banned. He’s made an appearance at the Pro Tour, made money at GP Denver, and is constantly grinding and brewing in Modern.

5 thoughts on “Testing Preordain: Qualitative Results”

I think that the card you should take out for Preordain is Serum Visions, not Sleight of Hand. Serum Visions is awful on the combo turn (Preordain is awesome), while Sleight is awesome because it’s pure card selection. As a setup card, Visions and Preordain are very similar, while Sleight is a little worse, but Storm being a turn-3 deck that isn’t as dependent on continuing to draw cards as the old versions (you just need a creature and Gifts to go off) mitigates this issue

And, on an empirical level: I play a lot of combo (mostly Ad Nauseam), and I’ve lost a lot of games to Visions not being Preordain (A LOT), but I’ve never felt bad casting Sleight

In Storm, Preordain (digs up to 3 cards deep* on the combo turn) is better than Serum Visions (always 1 card deep). Sleight of Hand (always 2 cards deep) is also better than Serum Visions in a combo deck of this kind.

As others have mentioned, Serum Visions was the card to cut for Preordain, not Sleight of Hand.

Also, the power (and potential problem) of having more “good” cantrips in a format like Modern is that you get to play more of them in your deck! If you’re just subbing out like-for-like and basically building the same deck again with one marginally better cantrip instead of another, I’m not surprised you didn’t see much improvement.

That said, 3 cards deep* is a lot better than 1 card deep. I realise you’ve done the testing now and put the hours in, but it might be worth consulting your audience (or at least a focus group) before making cuts or replacing cards like you did with Storm here.

Thanks for all the work though bud. You’re putting a lot of effort into this.

It may well be true that Serum Visions was the card to cut, but that wasn’t what my preliminary testing and consultations showed. I do ask around and try out decks before starting the testing, and while it was close, Sleight was the pick. The problem I had wasn’t fizzling during the combo (I think that happened ten times total) but getting to the point where I *could* combo. The majority of my losses came because I just couldn’t find all the pieces I needed to actually go off before I died. Having Merchant Scroll for Gifts may have solved that problem, but I can’t say for certain. The fact that Sleight was better mid-combo was largely irrelevant as far as I could tell; I needed the extra looks from Visions. That’s why I kept it.

As for just playing more cantrips, I’m not certain there’s room for that. Storm is a pretty tight list, and I don’t think Gifts is cuttable. It appeared better to me to keep that package and just run eight cantrips.
