Projections are central to fantasy sports, and RotoValue integrates projections into the site. Pages like Search or Projected Standings have a Source selection, where you can choose which set of statistics to display: the current or past seasons, preseason data where available, and of course projections.

But with RotoValue Analyst, you not only can see projections like Steamer, Marcel, or RotoValue’s own, but you can upload and edit your own projections, too. How do you do this? RotoValue Analyst customers can see links to custom projections under the Settings menu. From a PlayerDetail page, there is a link to edit projections for that particular player:

Click on this link to go to a page that lets you set custom projections for the player:

If you haven’t set any projections at all, this defaults to use RotoValue’s projections for the player, but here you can save whatever values you like for the player’s statistics. You can also multiply all projected values by a constant, so if you expect a player to play 50% more than the current projections, you’d enter 1.5 in the “Multiply all values by” field.

Once you save, the player’s projections will now appear on their PlayerDetail page as My Projection:

The “My Projection” text links back to the page to update projections for the player, so you can easily change them.

You can also update projections in bulk. When you’re not on a PlayerDetail page, Settings has a link to MyProjections, which lets you update defaults for your projections.

Click on that link, and you come to the MyProjections page. This page lets you download your current projections to a spreadsheet, upload a spreadsheet to change projections at once, remove all projections, or change the default projection source to use for players whom you don’t project:

Here you can click “Download” to download a .csv file of your projections, upload a new file of projections, remove all custom projections, or change the default source for projections. The default source is used for players for whom you have not set a projection.

On the PlayerDetail page, your projections will always appear exactly as entered, but on other pages they may be prorated up or down based on injury data. When a player appears on an injury report and has a target return date, his projections will be scaled down to reflect the games missed prior to that target return date, so you shouldn’t have to manually change projections to reflect known injuries. Injury-adjusted projections are used in player Search pages, Projected Standings, and also the Auction and Draft applications. You can even change your projections during your draft: either edit a particular player’s projections, or upload a new projections file, and then reload the Draft (or Auction) page to see your most recent projections reflected in the application.
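A minimal Python sketch of this kind of injury proration, assuming simple linear scaling by the share of remaining games a player is expected to miss. The function name, its arguments, and the linear rule are illustrative assumptions, not a description of the site's actual formula.

```python
def prorate_projection(counting_stats, games_remaining, games_missed):
    """Scale counting stats by the share of remaining games the player should play.

    Assumes simple linear scaling (hypothetical rule). Rate stats such as
    AVG or ERA should not be passed in, since they do not scale with games.
    """
    if games_remaining <= 0:
        return {name: 0.0 for name in counting_stats}
    # Clamp so a return date past the end of the season cannot go negative.
    missed = min(max(games_missed, 0), games_remaining)
    factor = (games_remaining - missed) / games_remaining
    return {name: value * factor for name, value in counting_stats.items()}


# Made-up example: 30 HR projected over 150 remaining games, 30 games missed.
adjusted = prorate_projection({"HR": 30, "RBI": 90}, 150, 30)
print(adjusted)
```

Under these assumptions a player expected to miss a fifth of the remaining games keeps four fifths of his projected counting stats.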

RotoValue Analyst doesn’t just compute prices customized to your league’s settings from some set of statistics. It also lets you create your own projections, and/or modify projections from any of the sources the site displays.

RotoValue Search pages used to support filtering by a single pro team or fantasy team, but now that has been extended to support searches filtering on multiple values of these types.

Previously you’d need two different windows or browser tabs to see a search matching only players on two different teams, but now these can be combined in a single search.

Where before the RotoTeam and ProTeam filters were a single selection dropdown, now there are checkboxes to hide or show the selections. By default the team choices are hidden, but they will appear when you check the relevant box.

The team filters take effect whether they’re currently visible or not, so you can hide the team lines.

So if you want to look at players on your team and those of a potential trading partner in a single tab, you can filter on those two teams. Or, you can search for players who are either on your team or are free agents, which is helpful when considering whom to pick up from the waiver wire.

Like other filters, these are cumulative, so you can add filters on name, position, and/or pro team, and (where appropriate) minimum statistics values (e.g. AB, IP, FGA). If you have RotoValue Analyst, you can also add filters for arbitrary values in statistics, by checking the “Other Statistics” box. This shows a row where you can add statistics and limit values (either >= or <=), and the Search will only show players who match the additional limits.

So this page shows NBA players on one of four teams (Spurs, Suns, Thunder, and Timberwolves) who have 200 or more rebounds and 100 or more assists in the 2018-19 season:

These filters take effect whether the lines where you select them are visible or not, with the corresponding checkbox in the top line toggling which additional filter criteria to display.
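The way these filters combine can be sketched in a few lines of Python. This is only an illustration of cumulative filtering: the `search` function, the record layout, and the sample rows are made up, and the site's real implementation is not shown here.

```python
import operator

# The two comparisons offered by the "Other Statistics" row.
COMPARISONS = {">=": operator.ge, "<=": operator.le}


def search(players, teams=None, limits=()):
    """Return players on any of `teams` who satisfy every (stat, op, value) limit."""
    matches = []
    for player in players:
        # An empty or missing team filter means "all teams".
        if teams and player["team"] not in teams:
            continue
        # Filters are cumulative: every limit must hold. A missing stat counts as 0.
        if all(COMPARISONS[op](player["stats"].get(stat, 0), value)
               for stat, op, value in limits):
            matches.append(player)
    return matches


# Made-up rows for illustration only.
players = [
    {"name": "Player A", "team": "SAS", "stats": {"REB": 410, "AST": 150}},
    {"name": "Player B", "team": "OKC", "stats": {"REB": 520, "AST": 80}},
    {"name": "Player C", "team": "BOS", "stats": {"REB": 300, "AST": 300}},
]
found = search(players, teams={"SAS", "PHX", "OKC", "MIN"},
               limits=[("REB", ">=", 200), ("AST", ">=", 100)])
print([p["name"] for p in found])
```

Selecting several teams widens the search (a player on any selected team matches), while each statistic limit narrows it.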

Nate Silver’s fivethirtyeight.com published forecasts of the 2018 mid-term elections, predicting the outcome of races for House, Senate, and Governor. The site also made it easy to download their projection data, and so I’d like to look at their forecasts and see how their models did. Silver made three variants of his model, dubbed “Lite”, “Classic”, and “Deluxe”. Lite basically uses only polling (and, where polling is scarce or non-existent, comparisons to similar districts which have polling). Classic adds in other fundamental data, like candidate fund raising and historical voting patterns. Finally, Deluxe factors in expert ratings from the Cook Political Report, Inside Elections, and University of Virginia political scientist Larry Sabato’s Crystal Ball. Silver’s expectation is that while all three models should be good, adding additional complexity to the models should improve their accuracy.

The top line results are quite good, as all three models came very close to the number of seats won in each case:

This table allocates the 9 uncalled races to the current vote total leader, and so counts the Mississippi Senate special election which is headed for a run-off as a Republican seat. To compute the expected number of seats for a party, I added the chance the model gave that party’s candidate of winning the race.
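The expected-seat calculation is just a sum of win probabilities. A small Python sketch, with made-up races and a hypothetical record layout (one dict of party to win probability per race):

```python
def expected_seats(races, party):
    """Sum a party's win probabilities across races to get its expected seat count."""
    return sum(race.get(party, 0.0) for race in races)


# Made-up probabilities for three races.
races = [
    {"D": 0.9, "R": 0.1},
    {"D": 0.4, "R": 0.6},
    {"D": 0.5, "R": 0.5},
]
print(expected_seats(races, "D"), expected_seats(races, "R"))
```

Because each race's probabilities sum to 1, the parties' expected seats sum to the number of races.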

The models gave probabilistic forecasts for each race, showing not only a chance of each candidate winning, but also expected vote share, as well as 10th and 90th percentile vote shares, to give a sense of the possible range of outcomes. So next I thought I’d check how well the model did at setting these percentiles:

So for the 1233 candidates for whom they made projections, just over half of them in each model had a vote share under their projection, about what I’d expect. Interestingly, the results tended not to contain as many surprises as the model expected. Just over 8% of the Lite projections fell above or below the 80% confidence interval, and for Deluxe and Classic those totals fell below 7%. So the models were projecting more uncertainty than we actually saw this year.
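A check like this amounts to counting how many actual vote shares fall outside each candidate's 10th to 90th percentile band. A hedged Python sketch, where the field names (`projected`, `p10`, `p90`, `actual`) are assumptions about how the downloaded data might be arranged:

```python
def tail_rates(candidates):
    """Fractions of candidates under their projection and outside the 10th-90th band."""
    n = len(candidates)
    if n == 0:
        raise ValueError("need at least one candidate")
    below = sum(c["actual"] < c["p10"] for c in candidates)
    above = sum(c["actual"] > c["p90"] for c in candidates)
    under = sum(c["actual"] < c["projected"] for c in candidates)
    return {
        "below_p10": below / n,         # well calibrated: about 0.10
        "above_p90": above / n,         # well calibrated: about 0.10
        "under_projection": under / n,  # well calibrated: about 0.50
    }


# Made-up vote shares, in percentage points.
sample = [
    {"projected": 50, "p10": 45, "p90": 55, "actual": 44},
    {"projected": 50, "p10": 45, "p90": 55, "actual": 49},
    {"projected": 50, "p10": 45, "p90": 55, "actual": 52},
    {"projected": 50, "p10": 45, "p90": 55, "actual": 56},
]
print(tail_rates(sample))
```

If the two tail fractions together come in well under 0.20, as they did here, the forecast bands were wider than the results required.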

Another way to look at this is to count how often the model’s favorite did not win, and compare that with the model’s expected number of upsets. Expected upsets is the sum of the odds of the non-favorite candidate winning each race. If the model were perfectly calibrated, and results normally distributed around expectations, we should see expected upsets match actual upsets.

All three models predicted more upsets than we actually saw, which is consistent with the model being not confident enough. As Silver expected, the frequency of both projected and actual upsets for each model decreases as the complexity increases. In simpler English, adding more to a model makes it better at prediction: Lite had the most upsets, Deluxe the fewest, and Classic was in the middle.

I also broke down the above table into the four categories the site was showing on election night: Toss Up (no candidate has a 60% or better chance to win), Lean (favorite is 60-75% likely), Likely (favorite is 75-95% to win), and Solid (favorite expected to win more than 95%):

None of the solid favorites lost in any of the models, although with well over 300 races, the probabilities given would have suggested 1-2 longshot upsets. Toss Up races were good, with the favorite expected to lose about 45% of the time, and, albeit in small sample sizes, they did lose about that often, or sometimes more. It’s the Lean and especially Likely categories where we see the big gap in upset races. Rather than winning about 1 in 7 or 1 in 8 Likely races, the underdog won only about 1 in 20, only a third as often. For Lean races we’d expect about 1 in 3 to be an upset, but the favorites won 3 in 4 or better.
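The upset comparison can be sketched in Python as follows. The category boundaries follow the definitions above, but which bucket an exact 60%, 75%, or 95% favorite falls into is an assumption, as is the input layout of (favorite's win probability, whether the favorite won) pairs.

```python
def category(fav_prob):
    """Bucket a race by the favorite's win probability (boundary handling assumed)."""
    if fav_prob < 0.60:
        return "Toss Up"
    if fav_prob < 0.75:
        return "Lean"
    if fav_prob <= 0.95:
        return "Likely"
    return "Solid"


def upset_table(races):
    """Tally races, expected upsets, and actual upsets per category.

    `races` is an iterable of (favorite_win_prob, favorite_won) pairs.
    Expected upsets add up the underdog's chance, 1 - favorite_win_prob.
    """
    table = {}
    for fav_prob, fav_won in races:
        row = table.setdefault(category(fav_prob),
                               {"races": 0, "expected": 0.0, "actual": 0})
        row["races"] += 1
        row["expected"] += 1.0 - fav_prob
        row["actual"] += 0 if fav_won else 1
    return table


# Made-up races for illustration.
demo = [(0.55, False), (0.70, True), (0.90, True), (0.99, True)]
for name, row in upset_table(demo).items():
    print(name, row)
```

Comparing the `expected` and `actual` columns per category is what shows where a model is under- or over-confident.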

So in terms of both extremity of vote share, and frequency of upsets, I found that the models predicted much more uncertainty than election results showed.

Does that mean they were poorly calibrated? You can’t really tell from a single election. Often polling in retrospect underestimates one party or the other across the board (which one it favors is basically a coin flip), and in such environments you’d see more upsets. Before the election Silver talked about the downside risk for Republicans being much larger than for Democrats – that is, more GOP-held seats were Likely or Lean, and so if results proved more Blue, we would have seen many more seat gains for the Democrats, but if results were more Red, Democrats were still likely to gain House seats, just not so many.

I tweaked my analysis code to allow a parallel shift in vote share: that is, I take the actual results, and then take, say, 2 points from the Democrat and give them to the Republican, for an effective 4-point swing towards Republicans. Here’s what would have happened in that scenario:

Now the Deluxe model almost exactly nails the number of upsets, while Classic is close, and only Lite is markedly low. In this more red environment, the GOP narrowly holds the House, with 220 seats, and it wins 14 Senate seats, for a 4 seat gain there. For all three models, the number of candidates with vote shares outside the 80% range is a little more than you’d expect in each direction.

If the error were in the opposite direction, and we saw Democrats do 2 points better across the board and Republicans 2 points worse, this is what I get:

This environment also produces more upsets, although each of the models still expected a few more upsets than occurred in this scenario.

Now instead of Republicans narrowly holding the House, Democrats flip more than 50 House seats, and also barely take control of the Senate by 1 seat. Interestingly, I still see fewer than 10% of candidates getting above the 90th percentile or below the 10th percentile expected vote share.

These scenarios also show how seat total projections would be impacted by a systematic error. Instead of virtually nailing the number of House seats, the models would be off by 15-20 or so, and they would do worse at calling Governor and Senate races.
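The parallel shift used in these scenarios can be sketched in Python. The function names and the dict-per-race layout are illustrative, and leaving races without both major parties untouched is an assumption rather than something stated above.

```python
def shift_shares(shares, points, toward="R", away_from="D"):
    """Move `points` of vote share from one party's candidate to the other's."""
    shifted = dict(shares)
    # Assumption: races without both parties on the ballot are left untouched.
    if toward in shifted and away_from in shifted:
        shifted[toward] += points
        shifted[away_from] -= points
    return shifted


def winner(shares):
    """Party with the largest vote share."""
    return max(shares, key=shares.get)


def count_flips(results, points, toward="R", away_from="D"):
    """Count races whose winner changes under the shift."""
    return sum(
        winner(r) != winner(shift_shares(r, points, toward, away_from))
        for r in results
    )


# Made-up results in percentage points; the last race is uncontested.
results = [
    {"D": 51.0, "R": 49.0},
    {"D": 55.0, "R": 45.0},
    {"D": 40.0, "R": 60.0},
    {"D": 100.0},
]
print(count_flips(results, 2.0))
```

Taking 2 points from one side and giving them to the other changes the margin by 4, so only races decided by less than 4 points can flip.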

This year was quite good overall for polling accuracy, and so the models overstated the uncertainty for this year’s election. Even in a different environment, the models only get to about the number of upsets predicted. In order to start seeing more upsets than the models predicted, I need to shift each race at least 3-4 points towards one party or the other, and that would be a *very* large polling miss. So while the models did quite well in projecting the actual outcome, I think they were likely overstating uncertainty.

Tom Tango asked someone to see what players with at least a 30-point wOBA reverse platoon split in their first 2500 plate appearances did for the rest of their careers, and Sean Foreman kindly provided split data from his database.

So I hacked together a Perl script to read his data and analyze it. The script lets you tweak filters, setting a minimum reverse split over some minimum number of plate appearances versus both left and right handed pitchers. Taking Tango’s suggested criteria, I found just 7 matching players, with an aggregate 40 point negative wOBA split:

Tango expected the group would be neutral, and in aggregate they showed a 9-point “normal” split in wOBA. Only two of the 7 still showed a reverse platoon split, and in both cases it was much narrower than their start of career split.

But just 7 batters making his cutoff is a rather small sample. So I reran, this time allowing batters with at least a 20 point wOBA reverse split, and now I get 25 players whose aggregate split is 30 points:

Now we finally find a player, Don Zimmer (yes, the former Red Sox manager and Yankees coach!) who showed a greater reverse split over the rest of his career than the start, but again just barely, increasing to 22 points from 21. Just 6 other players showed any reverse split at all over the rest of their careers, and two just by a single wOBA point.

When I set the filter back at 30 wOBA points, but require fewer PA (250 against LHP, and 1000 against RHP), I find a total of 57 players, with this aggregate:

Total 0.308 25543 0.348 67831 0.040 0.349 37399 0.331 93823 -0.018

Interestingly, this group in aggregate shows an 18-point normal platoon split over the rest of their careers, after a 40 point reverse split from the start of their careers.

So overall, it seems that even players who show a pronounced reverse platoon split tend to show a more normal platoon split over the rest of their careers.

If you want to tinker with different filters, you can download the raw data Sean Foreman made available, and pass it to my Perl script with different arguments.
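The Perl script itself is not reproduced here. As a rough Python sketch of the same kind of filter, assume each player record carries start-of-career wOBA and PA against left- and right-handed pitchers plus a `bats` field. All field names are hypothetical, and the PA-weighted average only approximates a true aggregate wOBA, whose denominator is not exactly PA.

```python
def reverse_split(player):
    """wOBA vs same-handed pitchers minus wOBA vs opposite-handed pitchers.

    Positive values are reverse splits. Switch hitters return None.
    Rounded to three decimals so .350 - .320 compares equal to .030.
    """
    if player["bats"] == "R":
        return round(player["woba_rhp"] - player["woba_lhp"], 3)
    if player["bats"] == "L":
        return round(player["woba_lhp"] - player["woba_rhp"], 3)
    return None


def select(players, min_split=0.030, min_pa_lhp=0, min_pa_rhp=0):
    """Players with at least `min_split` of reverse split and enough PA vs each hand."""
    chosen = []
    for p in players:
        split = reverse_split(p)
        if (split is not None and split >= min_split
                and p["pa_lhp"] >= min_pa_lhp and p["pa_rhp"] >= min_pa_rhp):
            chosen.append(p)
    return chosen


def aggregate_woba(players, hand):
    """PA-weighted wOBA of the group against pitchers of `hand` ("lhp" or "rhp")."""
    total_pa = sum(p["pa_" + hand] for p in players)
    return sum(p["woba_" + hand] * p["pa_" + hand] for p in players) / total_pa


# Made-up players for illustration.
sample = [
    {"bats": "R", "woba_lhp": 0.300, "pa_lhp": 300, "woba_rhp": 0.340, "pa_rhp": 1200},
    {"bats": "L", "woba_lhp": 0.310, "pa_lhp": 200, "woba_rhp": 0.350, "pa_rhp": 1500},
    {"bats": "R", "woba_lhp": 0.320, "pa_lhp": 100, "woba_rhp": 0.350, "pa_rhp": 900},
]
print(len(select(sample, 0.030, min_pa_lhp=250, min_pa_rhp=1000)))
```

Lowering `min_split` or the PA minimums trades a purer group for a larger sample, which is the tradeoff explored above.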

RotoValue now also shows professional standings for both MLB and NBA. And, like the fantasy standings pages, you can customize them to any date range you like. So if you want to see how your team has done since a particular date, or over the past month, or since they called up Gleyber Torres or Ronald Acuna, RotoValue’s ProStandings page can show you. The Braves were two games behind the Nationals between Acuna’s call-up and his going on the DL on May 28th, while the Yankees have gained 7 games on the Red Sox since Torres was recalled through yesterday, June 9th.

Like the rest of RotoValue, this page is advertising-free, but it also adds some data that may be useful to fantasy sports players. In addition to wins, losses, and percentage, the page also shows runs (points for the NBA) scored and allowed per game. It’s a nice quick way to see how teams are doing, and if your pitcher is facing a tough offense next week, or your player is going up against harder defenses.

In addition, I show the “Pythagorean” records for each team, computing the winning percentages estimated from scored and allowed data. The basic model is:
RS^X / (RS^X + RA^X), where RS is runs (or points) scored, RA is runs (or points) allowed, and X varies depending on the sport. When Bill James first proposed the concept, he used X=2 for MLB, but subsequent research has found a better fit with a somewhat smaller exponent. I’m using 1.83 for MLB, and 13.91 for NBA as proposed by Daryl Morey. Teams whose records are much better (or worse) than their expected records have typically been (un)lucky, and are more likely to revert closer to their expected winning percentage. That can give you insight on which teams may be more (or less) likely to contend for a playoff spot or (often more important for single-league fantasy purposes) trade away talent.
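The Pythagorean estimate is a one-line formula. A minimal Python sketch using the exponents quoted above (the function and constant names are illustrative, and the team totals are made up):

```python
MLB_EXPONENT = 1.83   # exponent quoted above for MLB
NBA_EXPONENT = 13.91  # exponent quoted above for the NBA


def pythagorean_pct(scored, allowed, exponent):
    """Expected winning percentage from runs/points scored and allowed."""
    s = scored ** exponent
    a = allowed ** exponent
    return s / (s + a)


# Made-up team: 800 runs scored, 700 allowed.
pct = pythagorean_pct(800, 700, MLB_EXPONENT)
print(round(pct, 3), round(pct * 162, 1))
```

Only the ratio of scored to allowed matters, so per-game averages and season totals give the same percentage.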

A link to this standings page now appears on the toolbar under the, uh, Standings menu:

In addition to ProStandings, there is also a Scoreboard page, which shows games, and, where appropriate, scores, for a given date.

This year the Angels signed Shohei Ohtani, the young Japanese star who played both as a pitcher and outfielder for the Hokkaido Nippon Ham Fighters of the Japanese Pacific League. As in Japan, Ohtani is expected to continue to play extensively both as a pitcher and hitter.

Some players have played significantly at the major league level in both capacities in different seasons (most recently Rick Ankiel, and more famously Babe Ruth), but in the modern game it is unheard of for one person to contribute both as a batter and a pitcher in the same season. Until, probably, 2018, when Ohtani seems poised to do so.

This raises the question of how to handle Ohtani for fantasy purposes.

Yahoo! is handling Ohtani by turning him into two different players, a batter and a pitcher, and they’re allowing different fantasy teams to own the different stats of Ohtani. While I can see how that might be an easier technical fix in some platforms, it’s ugly, and it loses the benefit of flexibility a two-way player gives a real MLB team. Allowing different fantasy teams to own Ohtani’s batting and pitching stats is simply wrong, and making a team use two roster spots to be able to get all of Ohtani’s stats doesn’t match the true flexibility the Angels will have by having a single player capable of contributing both in the batter’s box and on the pitching mound.

CBS and ESPN are a little better: they let you put Ohtani either at P or in a batting slot, but they only count the pitching stats when he is in your lineup as a pitcher, and just the batting stats when he’s in your lineup as a batter. While that’s not terrible for a daily transaction league (you can switch Ohtani in your lineup based on where he’ll play for the Angels to get most of his stats), you’ll still lose out on Ohtani’s offensive stats on days he’s pitching in an NL park and has to bat for himself. If you’re in a league with weekly transactions, though, this solution forces you to decide whether you want Ohtani’s batting or pitching numbers for an entire week.

RotoValue lets you have the best of both worlds: by default, it will count all Ohtani’s stats, no matter where you play him.

Because RotoValue already supports an option to count pitchers’ batting stats (and/or batters’ pitching stats), it was actually quite easy to enhance it to support counting both types of statistics for a player who qualifies at both an offensive and pitching position, while only counting the primary statistics for players who just qualify at one type. So by default, RotoValue will count both batting and pitching stats for Ohtani whenever he plays, but will ignore other pitchers’ batting statistics and other batters’ pitching statistics. Strictly speaking, this is not special treatment for Ohtani, but would apply to any player who qualifies both as a pitcher and a batter. It’s just that for now, Ohtani is likely to be the only one to do so: RotoValue is currently listing him as SP/DH, although it’s possible we may give him OF eligibility instead of (or in addition to) DH eligibility.

Suppose you really want Ohtani to count only as a pitcher. You can simply set his custom position to be SP only, and his batting statistics will no longer count. Or you could make him eligible at DH or OF only, and then only his batting statistics would count. Such changes would take effect immediately and retroactively: reload a standings or team stats page after changing the setting, and the full season totals will reflect the new settings.
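The eligibility rule described above can be sketched as a small Python function. The position codes, the function name, and the two option flags are illustrative assumptions, not the site's actual code.

```python
# Assumed set of position codes that count as pitching positions.
PITCHER_POSITIONS = {"SP", "RP", "P"}


def counted_stat_types(positions, count_pitcher_batting=False,
                       count_batter_pitching=False):
    """Return which stat types count for a player, given the positions he qualifies at."""
    positions = set(positions)
    is_pitcher = bool(positions & PITCHER_POSITIONS)
    is_batter = bool(positions - PITCHER_POSITIONS)
    counted = set()
    # Batting counts for batters, or for pitchers when the league option is on.
    if is_batter or (is_pitcher and count_pitcher_batting):
        counted.add("batting")
    # Pitching counts for pitchers, or for batters when the league option is on.
    if is_pitcher or (is_batter and count_batter_pitching):
        counted.add("pitching")
    return counted


print(sorted(counted_stat_types(["SP", "DH"])))  # two-way player
print(sorted(counted_stat_types(["SP"])))        # custom position: pitcher only
```

Overriding a two-way player's position to SP only, or to DH or OF only, simply changes the input to this rule, which is why the change can apply retroactively to the full season.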

So – how do you change a player’s position?

If you have administrator rights for your league, on the Player Detail page there’s a menu item under Settings: “Set custom position for…”.

That menu option gives you a page where you can override the position for a player:

If you simply want all of Ohtani’s stats to count, you won’t have to make any of these changes, but if you want him to be a pitcher only (or, less likely, a batter only), you’d need to override his default position. So RotoValue gives you the choice to handle Ohtani however you think makes the most sense for your league.

Alternatively, if you want to count all pitchers’ batting (and/or batters’ pitching) stats, you can choose those options directly from the bottom of the main league Settings page:

This week’s Riddler at FiveThirtyEight reruns a puzzle initially run back in February. The game is a “war” between two warlords fighting over 10 castles. Each warlord has 100 soldiers, and the 10 castles are worth from 1 to 10 points each. If you send more soldiers to a given castle than your opponent, you win it and its point value; if you both send exactly the same number of soldiers, the point value for the castle is shared between the two. The winner of the war is the one with more total points, and so ties are possible overall only if you’ve shared at least one castle with an odd point value. The goal of the first Riddler challenge was to have the most success against all other entries in the contest.

They’ve made the entries from the first contest public, and so that gives one a chance to see not only the winners, but all the entrants, and to play around with the data. I quickly hacked up a Perl script to take their data and compute the results of the first contest. That ran pretty slowly (doing head-to-head battles for all the nearly 1400 entrants took about 12 seconds on my laptop), but accurately. I then rewrote the code in C, and saw the same thing run in about a tenth of a second, two orders of magnitude faster.
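The scoring rules are simple enough to sketch in Python. This is an illustration of the rules as described, not the Perl or C code mentioned above; the function names are made up, and counting a tied war as half a win is an assumption.

```python
def battle(a, b):
    """Return (points_a, points_b) for two allocations over castles worth 1..10."""
    points_a = points_b = 0.0
    for value, (x, y) in enumerate(zip(a, b), start=1):
        if x > y:
            points_a += value
        elif y > x:
            points_b += value
        else:
            # Equal soldiers: the castle's points are shared.
            points_a += value / 2
            points_b += value / 2
    return points_a, points_b


def win_rate(entry, field):
    """Share of head-to-head wars `entry` wins against every allocation in `field`.

    Assumption: a tied war counts as half a win.
    """
    score = 0.0
    for other in field:
        mine, theirs = battle(entry, other)
        if mine > theirs:
            score += 1.0
        elif mine == theirs:
            score += 0.5
    return score / len(field)


uniform = [10] * 10                         # ten soldiers at every castle
entry = [0, 0, 2, 2, 11, 21, 3, 31, 26, 4]  # the allocation quoted later in this post
print(battle(entry, uniform))
```

A full round-robin over N entries needs about N squared battles, which is why nearly 1400 entrants already took noticeable time in an interpreted language.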

This made me wonder whether brute force might be useful to attack the problem, so first I wrote a C program to count all the possible allocations of the 100 armies across 10 castles. This took, uh, a while to run – about 8 hours – but I got my answer – a little over 4.25 trillion combinations.
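The count can also be checked without enumerating anything. Splitting 100 identical soldiers among 10 castles is a standard "stars and bars" problem, so the total is the binomial coefficient C(109, 9). The Python sketch below computes that closed form and cross-checks it with a small dynamic program; it is an independent check, not the C program described above.

```python
from math import comb


def count_allocations(soldiers, castles):
    """Closed form via stars and bars: C(soldiers + castles - 1, castles - 1)."""
    return comb(soldiers + castles - 1, castles - 1)


def count_by_dp(soldiers, castles):
    """Same count by dynamic programming, adding one castle at a time."""
    # ways[s] = number of ways to place exactly s soldiers in the castles so far.
    ways = [1] + [0] * soldiers
    for _ in range(castles):
        # In-place prefix sum: the new castle may take any number from 0 to s.
        for s in range(1, soldiers + 1):
            ways[s] += ways[s - 1]
    return ways[soldiers]


print(count_allocations(100, 10))
```

Both functions return 4,263,421,511,271 for 100 soldiers and 10 castles, which agrees with the figure of a little over 4.25 trillion.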

That didn’t sound too good, but I quickly modified code so I could start trying combinations in order against the prior entries, and print out any one that did as well as or better than the best combination tried so far. Last Sunday night I kicked that program off on my laptop. A week later, the program is still running, and has only gone through about 6 billion combinations, but it did find a few which did better than the best performing entry last time, so I’m entering the best of those today.

I wanted to see if I could speed things up, and one way to do that was to test, not against nearly 1400 entries, but a subset. I figured that people might look at which ones did well last time, and try to do something similar, so my subset was the 179 entries that, by my calculations, won 70% or more of their head-to-head matchups in the prior contest. I found many combinations that beat all 179 of those, so I then tried to compare lots of those against the larger dataset. So far, that has not given me a better result against the whole set from the last contest, but I’m entering one of those also, on the theory that this week’s entries will tend to look more like the very good ones from the last time than the whole set.

So my first entry is:

0 0 2 2 11 21 3 31 26 4

I’ll update this post later with links to the code used to generate these.

Also, I realized that when I first built the C programs, I did not add any optimization switches. The count.c program which took several hours to run before runs in under 10 minutes on my laptop now that I’ve added -O2 optimization with gcc. So if I’d done this in the first place, I’d have been able to search a lot further!

A few years back Brian Kenny introduced the hashtag #KillTheWin, which still lives on. Baseball fans point out egregious cases of a pitcher getting a win in a game despite pitching poorly. While the win might have been a useful metric for pitchers in the dead-ball era, when starters completed more than half their starts in aggregate, as baseball has evolved and bullpens have risen in prominence, the definition certainly has its flaws.

But it also has its history. Baseball fans still marvel at Cy Young’s total of 511, a seemingly impossible goal. While it’s become less common in the 21st century, a 20-game winner was once a standard of excellence among pitchers. Killing the win loses touch with that history. So perhaps we should consider trying to improve it instead. In 2014, Tom Tango proposed a complete redefinition, a simple points system that would usually determine the pitcher most deserving of a win (or loss) in a given game. I wrote about the idea, and created a page to compute wins and losses under that proposal. Recently Tango e-mailed me with a different idea: suppose we keep the basic structure of the current rule, but instead of giving the win to the pitcher who recorded the last out before a team takes its lead for the final time, we give it to the first pitcher whose line for the game closes in a position to win the game, so long as that pitcher’s team still wins.

The odds of the projected HR leader actually leading the league is an interesting question. I’ve been doing projections since 2011, so I thought I’d sweep my database for the RotoValue projections and see what that history was. That gives me just five years, but it turns out my projection model did correctly name the MLB home run leader once in those five years, or 20% of the time. Chris Davis hit 47 HR in 2015, leading MLB, while my model projected him to hit 35 HR. Note that when you’re leading the league, you’re not only very likely beating your own projection, you’re also probably beating all the projections, because the projected totals can be considered a weighted average of all possible outcomes for that player, and the possibility of a bad year or injury will pull that average down from what a peak player will produce when healthy. Also it’s not unusual for the league leader to be a player having a breakout year, well above what his past performance suggested.

Last year, Mark Trumbo’s career-best 47 HR topped MLB, despite my model projecting him for just 21.3 HR, the 46th best total. My projected 2016 leader was Chris Davis again, now projected to hit 37.9, and he actually slightly edged that out, with 38 HR. But I was not projecting the overall home run surge, and Davis’s 38 HR ranked only 12th best.

The actual MLB HR leader has regularly surprised my projections model. Only one other year, 2012, when Miguel Cabrera’s 44 HR led, was the actual leader among my preseason projected leaders (my model projected 34.5 HR for Cabrera, the 5th best projection). In the other years, the leader was projected 36th (Nelson Cruz moving to Baltimore in 2014), 90th (Chris Davis in 2013) and 73rd (Jose Bautista in 2011) by my model.

Over the 6 years, of the 60 players projected to finish in the top 10, 22 of them did so. Also, 24 of the 60 equaled or bettered their projected total, while 36 failed to reach the projected value.

Below the jump I’ve put tables for each year showing players projected to be in the top 10 in HR in my model, along with any players who actually finished in the top 10, along with their projected values.

I periodically attempt the FiveThirtyEight Riddler, edited by Oliver Roeder. This week he’s actually presenting two, a shorter “Riddler Express”, and a more time consuming “Riddler Classic”.

The Sesame Street character Count von Count likes to, well, count, and he now has his own Twitter feed! For those who don’t recall the character from the show, he’s a purple Muppet dressed in black, as a sort of kindly Dracula, who would count up in an eastern European accent.

Well, the Twitter feed is simply the Count counting, albeit in words written out describing the number. As I type this, his latest tweet is “Eight Hundred Seventy Nine!”

So the Riddler Express is to find out how high Count von Count can count on Twitter in this way before hitting its 140 character limit. Because he is enthusiastic, all his tweets must end with an exclamation mark.
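A natural first step is a helper that writes out a number the way the Count does and measures the tweet. The Python sketch below assumes the style of the example tweet (title case, no hyphens, no "and") and stops at the trillions; it builds and measures tweets but does not search for the answer.

```python
ONES = ["", "One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight",
        "Nine", "Ten", "Eleven", "Twelve", "Thirteen", "Fourteen", "Fifteen",
        "Sixteen", "Seventeen", "Eighteen", "Nineteen"]
TENS = ["", "", "Twenty", "Thirty", "Forty", "Fifty", "Sixty", "Seventy",
        "Eighty", "Ninety"]
SCALES = ["", "Thousand", "Million", "Billion", "Trillion"]


def three_digits(n):
    """Words for a number from 1 to 999, as a list."""
    words = []
    if n >= 100:
        words += [ONES[n // 100], "Hundred"]
        n %= 100
    if n >= 20:
        words.append(TENS[n // 10])
        n %= 10
    if n:
        words.append(ONES[n])
    return words


def count_tweet(n):
    """Spell out n in the assumed style of the Count's tweets, ending with '!'."""
    if n == 0:
        return "Zero!"
    if n < 0 or n >= 1000 ** len(SCALES):
        raise ValueError("number out of supported range")
    groups = []
    scale = 0
    while n:
        n, chunk = divmod(n, 1000)
        if chunk:
            words = three_digits(chunk)
            if SCALES[scale]:
                words.append(SCALES[scale])
            groups.append(" ".join(words))
        scale += 1
    return " ".join(reversed(groups)) + "!"


def fits_in_tweet(n, limit=140):
    """True if the Count's tweet for n is at most `limit` characters long."""
    return len(count_tweet(n)) <= limit


print(count_tweet(879), len(count_tweet(879)))
```

Since the length of these tweets does not grow steadily with the number, finding the first number that no longer fits takes more care than simply looking for the longest spelling.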