The Power of Goals.

Monday, 20 April 2020

There's been a huge increase in football related scatter plots recently. So as the guy who produced the first such plots, I thought I'd quickly run through why I thought this simple plot was useful and then try to expand the idea to provide additional usefulness.

The initial plots were designed to both inform and characterise playing style.

I still think the most successful plots use related metrics, for example expected assists and expected goals per 90 for individual players.

These "makers and takers" plots easily split players into those whose predominant talent is to create chances, those who get onto the end of opportunities and those rare players who excel at both disciplines.

Here's one for Arsenal 2019/20.

It's got sample size issues, but it's fairly evident that the creative players are towards the top left and the goal poachers are to be found in the bottom right.

Another quite neat aspect of this type of plot is that you can run a line through a player to the origin and anyone with a similar ratio of xG and xA will lie close to that line.
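As a rough sketch of that idea, the snippet below turns each player's (xG, xA) pair into the angle of the line joining it to the origin, then lists anyone whose angle sits within a few degrees of a chosen player. The player names, the per 90 figures and the five degree tolerance are all made up for illustration, and the function names are my own.

```python
import math


def ratio_angle(xg_per90, xa_per90):
    """Angle (degrees) of the line from the origin to the point (xG, xA)."""
    return math.degrees(math.atan2(xa_per90, xg_per90))


def similar_profiles(players, target, tolerance_deg=5.0):
    """Names of players whose xG/xA mix lies close to the target's line."""
    target_angle = ratio_angle(*players[target])
    return sorted(
        name
        for name, (xg, xa) in players.items()
        if name != target
        and abs(ratio_angle(xg, xa) - target_angle) <= tolerance_deg
    )


# Hypothetical per 90 figures: (xG, xA)
players = {
    "Striker A": (0.60, 0.10),
    "Striker B": (0.30, 0.05),
    "Creator C": (0.10, 0.35),
    "All-rounder D": (0.40, 0.38),
}

print(similar_profiles(players, "Striker A"))
```

Striker B produces half the volume of Striker A but has the same mix of shooting and creating, so he falls on the same line. That is the property that lets you spot emerging players with an established star's profile.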

In league-wide samples, therefore, you can find emerging players with similar qualities to the established stars.

There's a lot of data swilling around today. These plots are simple to make, three minutes tops, and with some thought about what you're trying to illustrate, they inform pretty well.

Over the weekend I came back to the idea, to see if I could add information that tells you a little bit more than just the raw connection between two metrics.

Here's what I came up with. It's again just a simple scatter plot, but I've used bubble size to introduce a third variable (metric volume per 90).

In addition I've used a single performance metric (NS xG added from ball carries) along the x axis and instead of plotting a complementary metric on the vertical axis, I've used a number to denote how diverse the x axis metrics are for each player.

This just plots the top 20 NS xG added by players through their ability to successfully carry the ball forward and move their team into a more dangerous pitch position.

It's a good one to choose because you know that Adama Traore will top the list (and he does).

Rather than a sterile scatter, you've now got a chart that not only tells you about a performance metric, it also instantly adds another layer (success volume) from which you can draw additional information about the characteristics of a player.

In short, those towards the right of the plot add more NS xG per 90 than others.
Larger bubble size indicates more successful progressive carries per 90.
And higher up the chart indicates more disorder and unpredictability in what a player will positively achieve for his team when on the ball.
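The post doesn't spell out how the vertical axis number is calculated. One plausible stand-in for "disorder" is Shannon entropy over the binned NS xG added values of a player's successful carries, and that is what the sketch below assumes. The bin width and the sample values are invented purely for illustration.

```python
import math
from collections import Counter


def shannon_entropy(values, bin_width=0.01):
    """Entropy (in bits) of values grouped into bins of equal width.

    ASSUMPTION: entropy is only a stand-in for the post's unspecified
    diversity measure. Higher values mean a less predictable spread.
    """
    if not values:
        return 0.0
    bins = Counter(math.floor(v / bin_width) for v in values)
    total = len(values)
    return -sum(
        (count / total) * math.log2(count / total) for count in bins.values()
    )


# Hypothetical NS xG added per successful carry
steady_carrier = [0.015, 0.015, 0.015, 0.015]   # always the same gain
varied_carrier = [0.005, 0.015, 0.025, 0.035]   # a different gain each time

print(shannon_entropy(steady_carrier))
print(shannon_entropy(varied_carrier))
```

A player who always adds the same small amount scores zero, while one whose carries are spread evenly over several outcomes scores higher and so would sit further up the chart.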

I've annotated players with the additional information you can draw from these plots.

Thursday, 26 December 2019

Liverpool’s
bilingual mastermind behind the team’s meteoric rise to dominate club, domestic,
European and now world football is gradually gaining a higher media profile.

Not Jurgen
Klopp, although he has played a part in the Reds’ success, but Dr Ian Graham,
their current director of research.

Ian’s
recent appearances in both the spoken and written media have not only
highlighted the importance of an integrated approach to squad building that
uses data alongside more traditional methods, they have
also given a small glimpse into the analytical methods employed.

The latest
profile landed courtesy of Liverpool.com and described some fundamentals of
Liverpool’s analytical philosophy.

One
particularly resonated with Infogol’s
approach of quantifying every footballing action in the same currency of goals
or more specifically x goals.

The idea
that every action, be it a pass, tackle or long throw changes the likelihood
that a side will ultimately score isn’t a new concept.

It was
probably first introduced into the public analytical domain by Dan Altman in his whistle stop
OptaPro presentation in 2015 and hints of such models have been recently
emerging from Opta itself and Twelve football.

Such a
non-shot xG model also powers Infogol’s “Team of the Week”.

The gradual
migration, at least inside the industry, from a purely chance based evaluation
to a more holistic one somewhat mirrors the earlier transition from merely
counting shots, as exemplified by total shot ratios from 2008 to a more
informative, location based xG model, subsequently.

However,
creating such non-shot models that quantify every on-field action is not a
simple task. The granular data required to build non-shot models dwarfs what
was needed to create TSR, which itself was rudimentary compared
to that required to create a proficient xG model.

These leaps
in data driven evaluation present a dilemma for the aspirations of public and
hobbyist analysts, an area that provided much of the driving force behind the early
explosion in football analytics.

Latterly,
monetization of ideas and a larger appetite for quantitative metrics to
supplement opinion driven insight in the media and at clubs have swept many of
those same hobbyists behind a non-disclosure paywall.

Less
co-operation, dwindling numbers, availability of adequate data and the need for
diverse technical skills to process that raw data appear to have stifled the
growth of football metrics in the purely public arena.

At the risk
of falling victim to one of Twitter’s sloganized insults, “back in the day,
metrics didn’t last long before they were improved upon or supplanted
altogether”.

Liverpool.com
suggested that Ian’s weapons grade model might be broadly replicated by current,
readily available and much quoted metrics, such as xG Chain (I’ll let you
google the definition).

Succinctly,
the metric rewards every participant in a move that ends in a goal attempt with
that chance’s entire xG.
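A minimal sketch of that definition follows. It assumes each possession is given as a list of the players involved plus the xG of the shot that ended it, with 0.0 when no shot was taken. The data layout, names and numbers are hypothetical.

```python
from collections import defaultdict


def xg_chain(possessions):
    """Credit every player involved in a possession with the shot's full xG."""
    totals = defaultdict(float)
    for players_involved, shot_xg in possessions:
        # set() so a player who touches the ball twice is credited only once
        for player in set(players_involved):
            totals[player] += shot_xg
    return dict(totals)


# Hypothetical possessions: (players involved, xG of the final shot)
possessions = [
    (["A", "B", "C"], 0.05),   # long move, low probability shot
    (["A", "D"], 0.60),        # short move, tap in
    (["B", "E"], 0.0),         # good progression, but no shot at the end
]

print(xg_chain(possessions))
```

Player D picks up 0.60 for a single touch in the second move, while E gets nothing for the third because it never produced a chance.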

The
distribution of goodies can seem churlish, for example, by giving far less
individual credit to the three Middlesbrough players who swept nearly the
length of Stoke’s defensive transition to score a low probability winner on
Friday night, than it would to a marginally involved square ball en route to a
multiple passing move that ends with a tap in from six yards.

More
crucially it completely omits actions that aren’t concluded by a created
chance.

To avoid
confusion over units, I’ve simply ranked the xG Chain and the non-shot ball
progression for each player in the recent Merseyside derby and then compared a
player’s rank in one metric with his rank in the other.

It starts
off quite well. Sadio Mane ranks top in both, he was outstanding on the night.
But then, much like Stoke’s trip to Middlesbrough, things take a turn for the
worse.

Shaqiri
ranked an impressive 2nd overall in ball progression, but a lowly 16th
in xG Chain, whereas Origi rates highly by the latter, but much less so in the
former.

Overall, a
third of the players have double digit ranking differences between their pecking
order in both metrics. There are some agreements, but the relationship between
the two metrics is generally weak.
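For anyone wanting to repeat the comparison, here's a small sketch that ranks players on two metrics, reports each player's rank difference and summarises the agreement with Spearman's rank correlation. It assumes both metrics cover the same players with no tied values, and the figures shown are placeholders rather than the derby numbers.

```python
def ranks(scores):
    """Rank players from 1 (highest score) downwards. Assumes no ties."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}


def compare_rankings(metric_a, metric_b):
    """Return per-player rank differences and Spearman's rho."""
    rank_a, rank_b = ranks(metric_a), ranks(metric_b)
    diffs = {name: rank_a[name] - rank_b[name] for name in rank_a}
    n = len(diffs)
    rho = 1 - 6 * sum(d * d for d in diffs.values()) / (n * (n * n - 1))
    return diffs, rho


# Placeholder values, not real match data
xg_chain_scores = {"P1": 0.9, "P2": 0.7, "P3": 0.4, "P4": 0.1}
progression_scores = {"P1": 0.30, "P2": 0.05, "P3": 0.25, "P4": 0.10}

diffs, rho = compare_rankings(xg_chain_scores, progression_scores)
print(diffs, rho)
```

A rho near 1 would mean the two metrics order players almost identically. A value close to zero is what a generally weak relationship looks like.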

Extend the
study to every game played last season and this tenuous correlation between the
two metrics remains.

One of the
strengths of the early analytics movement was the ability to sift mere
statistical trivia (team Y has recorded X when player Z plays, immediately
springs to mind) from useful, if imperfect evaluations that convey insight and
can be used to both evaluate and project future performance.

A great
example of the latter is Dan Kennett’s
recent Alisson tweet, which used big chances to highlight the keeper’s
importance to Liverpool, both in the past and possibly in the future.

Save rates
when faced with Opta’s Big Chances can be framed to be a very good proxy for a
more exhaustive and granular, post shot xG2 modelling of a keeper’s saves and
goals allowed.

Dan’s tweet
was selective, but also carefully constructed enough to capture the keeper’s
core attributes. Current retweets are approaching around 10 billion!

That should
be the benchmark for widely used metrics, and player contribution figures such
as xG Chain fail that test on numerous counts.

It fails to
differentiate individual contribution, omits large swaths of creditable
actions and thus fails to correlate well with more exhaustive modelling of a
similar player process.

The
challenge for the public arena as we enter the roaring 20s is to come up with
constant improvements to substandard and potentially misleading measures... and
be more like Dan.

Tuesday, 29 October 2019

Old style
goals based analysis hardly gets a run out nowadays with everyone arguing xG
strawmen. So, let’s go the goals route to see if Liverpool’s record in single
goal margin wins is “knowing how to win”, “unsustainable” or “about what you’d
expect”.

Liverpool
won 10 games by a single goal margin last season. That’s a lot, but well below
the single season record held by Manchester United of 16 in 2012/13 and 2008/09.

United’s
number of single goal wins in the subsequent seasons fell to five and eight
respectively (although something more impactful may have also occurred in
2013/14). Their points tally fell as well, by 25 points in 2013/14 and by 5 in
2009/10.

To dilute
the Fergie/Moyes effect, let’s look at the average record in the next season of
teams who won 10 or more games by a single goal margin.

There are
over 90 of them during the 20 team history of the Premier League and 80% of those
had fewer wins by the narrowest possible of margins during their next Premier
League season. 74% also saw their points total fall.

These teams
who edged lots of close matches one season shed around 10% of their points in
the next season.

Single goal
wins, on average account for 41% of a side’s Premier League points total, but
in our sample of 90+ teams who won 10 or more, 80% of them accrued more than
41% of their points from such victories.

Everton won
76% of their 59 points in 2002/03 from single goal wins and then tried their
very best to get relegated in 2003/04 as their “luck” in narrow games returned
to earth and they won just 39 points.

In
Liverpool’s case in 2018/19, one goal margin wins only accounted for 31% of
their 97 points. Therefore, their ten such wins places them in a group of sides
who typically regress, but the percentage of total points they win in this
manner is entirely atypical of that group.
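The percentages here are simple arithmetic: three points per single goal win, divided by the season's points total. The sketch below is just that calculation, with function names of my own choosing. The win count implied for Everton is backed out from the quoted 76% rather than taken from a results table.

```python
def single_goal_share(single_goal_wins, total_points):
    """Fraction of a season's points earned from single goal wins."""
    return 3 * single_goal_wins / total_points


def implied_single_goal_wins(share, total_points):
    """Number of single goal wins implied by a quoted share of points."""
    return round(share * total_points / 3)


# Liverpool 2018/19: ten single goal wins, 97 points
print(single_goal_share(10, 97))

# Everton 2002/03: 76% of 59 points, so how many wins does that imply?
print(implied_single_goal_wins(0.76, 59))
```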

To see
where Liverpool stand as being adept at winning single goal margin games, we
need to look at their underlying goals record.

In 2018/19
they scored 89 and conceded 22. Taking the Poisson route, that’s consistent
with winning nine games by a single goal over 38 games. They won, as we’ve seen,
ten, hardly a worryingly large over-performance.
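Here's one way to get to a figure like that. It is a sketch under simplifying assumptions and not necessarily the exact calculation used above. Goals for and against are treated as independent Poisson variables with the season average rates in every match, which ignores opponent strength and home advantage.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the average is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def expected_single_goal_wins(goals_for, goals_against, games=38, max_goals=15):
    """Expected number of wins by exactly one goal over a season.

    ASSUMPTION: the same independent Poisson scoring rates in every game.
    """
    lam_for = goals_for / games
    lam_against = goals_against / games
    # Sum the chances of scorelines 1-0, 2-1, 3-2, ...
    p_win_by_one = sum(
        poisson_pmf(k + 1, lam_for) * poisson_pmf(k, lam_against)
        for k in range(max_goals)
    )
    return games * p_win_by_one


# Liverpool 2018/19: 89 scored, 22 conceded
print(expected_single_goal_wins(89, 22))
```

With those inputs the expectation comes out close to nine, in line with the figure quoted above.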

You can
lump Liverpool in with a group of teams who have achieved good things, partly
as a result of “knowing how to win” (Leicester 2015/16 spring to mind, 14
single goal wins where nine would have been a more equitable return), but
unlike most of these sides, the Reds have the underlying numbers to deserve
their record.