The winning formula: data analytics has become the latest tool keeping football teams one step ahead

This article was taken from the January 2014 issue of Wired
magazine.

When Simon Wilson first arrived at Southampton football club he
was a consultant for a technology startup called Prozone. Prozone
had developed proprietary player-tracking software which, fed by
eight cameras around the pitch, would output a two-dimensional
bird's-eye animation of a football match. The machine could track
each player's movement every 0.1 seconds, registering an average of
3,000 touches of the ball per game, and provide an answer to a
range of statistical questions. Southampton adopted Prozone and
later hired Wilson to work as a performance analyst for the
first-team manager.

"Prozone wasn't part of the culture of the game and most
managers weren't used to it," Wilson says. "I was naïve but
I couldn't understand why they didn't want this kind of
information." Once, just before an August 2005 Football League
Championship game between Luton Town and Southampton, Wilson gave a
pre-match briefing to the team and the manager at the time, Harry
Redknapp. "Harry was more intuitive than analytical," says Wilson.
"He was nervous about overloading the players with information."
Southampton lost 3-2. On the team bus, Redknapp turned to Wilson
and said, "I'll tell you what, next week, why don't we get your
computer to play against their computer and see who wins?"

Some managers, however, did get it -- and one in particular was
Clive Woodward. He had been the coach of England's World
Cup-winning rugby team in 2003, and in 2005 had been offered a
one-year contract to serve as Southampton's director of
football. He had been the first coach to adapt Prozone to rugby,
installing it at Twickenham four years before the World Cup, which
allowed him to collect data on how England and its opponents
played. "When I first saw it I was fascinated because I'd never
seen a game where you're looking down and just see dots and data
and movement," Woodward says. "It removed a lot of the preconceived
notions we had about how other teams played. It made a big
difference when we started to see them as data, as opposed to teams
we had never beaten before." Once, after his players insisted that
there was no space on the field to run into, Woodward took a
printout of a Prozone freeze-frame taken 24 seconds into a match
against France. It showed both teams around the ball in a small
area on the pitch and acres of unoccupied space everywhere else. He
stuck it on the board with the message: "The space is the green
stuff".

"Clive would challenge me at every level," says Wilson about
Woodward's time at Southampton. "He would ask questions about every
aspect of the game: why do we spend so much time working out how to
score goals and not how to stop them? I would try to explain to him
what they're doing and he'd just keep asking why." Woodward and
Wilson tried things such as filming players striking the ball, to
study technique from a biomechanical perspective. Those
initiatives, however, never had much impact. Redknapp left before
the end of the year and Woodward departed at the end of his
contract. Wilson had left the club shortly before Woodward,
convinced that there was a better way of running a club. "Woodward
believed that evidence, be it video or statistics or any kind
of data, was fundamental to how you prepare a team," Wilson says.
Woodward remains his biggest influence. "He taught me that we
didn't have to do things just because they had always been done in
a certain way."

Today, 19 of the 20 Premier League teams use Prozone. Each has
its own team of performance analysts and data scientists
looking for the indicators that quantify player performance, the
events that determine matches and trends that characterise seasons.
They are scientists dissecting the world's most popular game,
looking at data from Prozone and other sources to understand what
dictates the difference between winning and losing. In the
environment of the multimillion-pound Premier League, clubs don't
just want a competitive advantage, they need it.

Simon Wilson, manager of strategic performance at Manchester City FC

Liam Sharp

At 3.50pm on 19 March, 1950, a Royal Air Force accountant called
Charles Reep went to watch Swindon Town versus Bristol Rovers.
During the game, he took out a pencil and a notepad and started
scribbling down observations using a system of symbols he had
invented for annotating events on the pitch. Today, sports
scientists call such a system notational analysis. Over the years,
Reep annotated more than 2,200 matches. The data analysis for each
game would typically take him 80 hours. The 1958 World Cup final
alone took him three months to analyse.

Reep showed that football, a dynamic and unpredictable game, had
constant and predictable patterns. He found, for instance, that
teams would, on average, score once every nine shots; that 80
percent of goals were scored from movements of fewer than four
passes; and that 50 percent of goals came from balls recovered
within 30 metres of the goal line, the last third of the pitch.
Reep concluded that teams would be more efficient if they spent
less time trying to string together passes and more time lobbing
the ball into their opponent's area. This strategy became known as
the long-ball game.

There are two problems with the long-ball game. Firstly, it's
not great to watch. Secondly, Reep's data analysis supporting it
was too simplistic. In 2005, Ian Franks, a professor at the
University of British Columbia, and Mike Hughes, a mathematician
who pioneered computational-notational analysis, looked at the data
from two World Cup tournaments. At first, Franks and Hughes found
data compatible with Reep's analysis but, after closer scrutiny,
they showed that most goals came from movements of fewer than four
passes simply because most movements in football are that short,
not because the odds were better. In other words, the frequency of
goals is not the same as the odds of a goal being scored. What
Hughes and Franks found was that teams that completed more passes
had a better chance of scoring. "Of course, you need skilled
players to sustain long-passing moves," says Hughes. "Up until
then, everybody was ignoring the blatant fact that teams who
weren't using a long-ball strategy, like Brazil, were winning the
World Cup."
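The difference between frequency and odds is easy to see with numbers. The sketch below uses invented counts, not Reep's data or Franks and Hughes's, and the move-length labels are assumptions made for illustration. Short moves supply most of the goals only because there are so many of them, while longer moves convert at a higher rate.

```python
# Hypothetical counts, invented for illustration only (not Reep's or
# Franks and Hughes's data): possessions and goals by move length.
moves = {
    "0-3 passes": {"possessions": 9000, "goals": 80},
    "4+ passes": {"possessions": 1000, "goals": 20},
}

total_goals = sum(m["goals"] for m in moves.values())


def share_of_goals(label):
    """Frequency: the fraction of all goals that came from this kind of move."""
    return moves[label]["goals"] / total_goals


def conversion_rate(label):
    """Odds: the fraction of this kind of move that ended in a goal."""
    return moves[label]["goals"] / moves[label]["possessions"]


for label in moves:
    print(f"{label}: {share_of_goals(label):.0%} of goals, "
          f"{conversion_rate(label):.2%} of moves score")
```

With these made-up figures, short moves account for four goals in five, yet a long move is more than twice as likely to end in a goal.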

"Collecting data is always the first step and Reep was a great
accountant," says Chris Anderson, a political economist from
Cornell University, New York, who's been studying football
statistics for three years. "But he wasn't a great analyst and had
a limited understanding of what the numbers were telling." Reep,
according to Anderson, had very strong preconceived notions and
when he found what he was looking for -- a chance to play the game
with minimum input for maximum output -- he didn't investigate
other hypotheses like another analyst would do. "He was welcomed
into football by people who wanted to play a long-ball game and
just wanted to know how to do it without considering how wrong this
approach could be." In their book, The Numbers Game, Anderson and
co-author David Sally write: "Reep's quest to use the numbers to
inform strategy fell short because he was an absolutist, determined
to use his data to prove his beliefs. He needed to abandon his idea
that he was looking for the one general rule, a winning formula,
and learn to seek the multiple truths and falsehoods in the numbers
themselves." Still, Reep's assertion that statistics offer us a chance
to see things we'd otherwise miss was absolutely correct.

Clive Woodward, who first used Prozone in 1999

Liam Sharp

When Wilson joined Manchester City in 2006 in order to start a
new department of football analytics, he hired the best
analysts he knew and set himself the goal of changing how the
football team used data. "After a game there wasn't any kind of
analysis," Wilson says. "Emotionally, the manager and the coaching
staff would just draw a line and move on. It was part of the
culture. They wouldn't ask themselves if the game plan had been
right or even well executed. My team of analysts had to fight that
habit and create a continuous loop between what happened in games,
why it happened and what we are going to do next time."

At the time, City were a mid-table club that struggled to win
away games. In September 2008, when the club was acquired by the
Abu Dhabi United Group for Development and Investment, a
private-equity outfit owned by a member of the Abu Dhabi royal
family, the team suddenly found itself with the resources necessary
to mount a challenge for the Premier League. Today, Wilson is
Manchester City's manager of strategic performance analysis. He
co-ordinates five departments, including the performance-analysis
team, which is now led by a sports scientist named Ed Sulley.
After each match, they compile exhaustive reports about the team's
performance data, focusing on the statistics that they consider most
relevant. The list is extensive. They analyse, for instance, the
number of line breaks, a term borrowed from rugby which means a
forward pass that goes through the opposition's midfielders or,
more crucially, its line of defenders. They look at what happens in
the 20 seconds after the team wins or loses the ball. They pay
attention to City's ball possession in the last third of the pitch,
a measure that they found to be strongly correlated with winning
matches. "When we studied the profile of the top teams against
average teams, the thing we saw was that the best teams dominate
the possession of the ball in that part of the pitch," says Wilson.
"The success rate of the passes was very high, particularly forward
passing. So now, when we recruit players, we pay special attention
to individuals with high pass-completion rates."

Statistics such as line breaks and possession in the last third
are important for City but would probably be irrelevant to a team
with a different style: football analytics is a discipline in which
the way a team plays dictates which statistics are significant. The
challenge is to find out which. "Instead of looking at a list of 50
variables we want to find five, say, that really matter for our
style of play," says Pedro Marques, a match analyst at Manchester
City. Marques and his colleagues are currently using data-analysis
techniques, such as principal-component analysis, to home in on the
match-related variables that matter most for winning. "With the right data feeds,
the algorithms will output the statistics that have a strong
relationship with winning and losing." Wilson recalls one
particular period when Manchester City hadn't scored from corners
in over 22 games, so his team decided to analyse over 400 goals
that were scored from corners. They noticed that about 75 percent
resulted from so-called in-swinging corners, the type where the
ball curves towards the goal. "In the next 12 games of the next
season we scored nine goals from corners," Wilson says. "You
usually get six coaches and they'll have different experiences and
they'll throw their opinions in, whereas we had objective evidence
to suggest that this was a pattern."
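One simple way to shortlist variables is to rank them by how strongly they track results. The sketch below is a plain correlation ranking, a much simpler stand-in for principal-component analysis and not City's method. The match figures, variable names and the `rank_variables` helper are all invented for illustration.

```python
from math import sqrt

# Hypothetical per-match figures, invented for illustration. Each list has
# one entry per match; "points" is 3 for a win, 1 for a draw, 0 for a loss.
matches = {
    "final_third_possession": [62, 48, 55, 40, 66, 51],
    "line_breaks": [14, 9, 12, 7, 15, 8],
    "total_distance_km": [111, 110, 108, 109, 109, 110],
}
points = [3, 0, 3, 0, 3, 1]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_variables(data, outcome):
    """Sort variables by the strength of their correlation with the outcome."""
    scores = {name: pearson(values, outcome) for name, values in data.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


for name, r in rank_variables(matches, points):
    print(f"{name}: r = {r:+.2f}")
```

A ranking like this says nothing about cause, and six matches is far too small a sample. It only illustrates the idea of cutting a long list of variables down to the few that move with results.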

When Wilson was working at Prozone as a consultant for
Southampton, he would capture the information from the Prozone
machine on a removable hard disk drive, commute back to Leeds at
2am, process the data overnight and head back to Southampton to
deliver the analysis. Sometimes he would work 20-hour days. He
lived in Leeds with ten other Prozone consultants. The office was
essentially a warehouse filled with computers. The CEO was Ram
Mylvaganam, an engineer who had been a marketing director for Mars.
Mylvaganam didn't know much about football. On the wall of the
Prozone office he hung a picture of a chalk drawing by the pavement
artist Julian Beever, an artwork that, viewed from the right angle,
creates an illusion of 3D. To Mylvaganam, data was like a Beever
drawing -- everything they needed to make sense of the data was
probably right in front of them. "But if you stand in the wrong
place, the data looks like shit," Mylvaganam says.

Mylvaganam first had the idea for Prozone in 1996 when he was
working for a management consultancy and had a contract with Derby
County, a contact he got via Neil Ramsay, a former football agent.
The first version of Prozone was a Portakabin filled with 22
massage chairs, developed by Mylvaganam with a local manufacturer,
that emitted electrical pulses and, supposedly, relaxed the
players' muscles and increased their flexibility. Prozone was a
contraction of "Professional Zone". At 10.30 every morning, Derby's
players reported to Prozone and sat on the chairs for 15 minutes
while the assistant manager, a young coach called Steve McClaren,
gave presentations on a video screen about their game plan.
Feedback, McClaren used to say, was the breakfast of champions.

"McClaren worked in this glorified garden shed, where he would
stay after all the players had gone home, with two video recorders
and a screen. Just editing video," Mylvaganam recalls. "I asked
him, 'Why don't you get one of your monkeys to do it?' and he
replied 'How do they know what is a good move and what's bad? I want
to show them how you win matches.'" Mylvaganam thought he could do
better. He knew a small company in France called Video Sports that
had developed pixel-tracking software. He bought 25 percent of the
company and installed eight cameras around Pride Park stadium. "The
camera technology was bad. Sometimes we'd get the analysis back and
there'd be players missing, so we had to redesign the software in
Leeds," Mylvaganam says. "Still, it was revolutionary. We were
defining statistically what was a game of football."

In 1999, Steve McClaren was headhunted by Manchester United's
manager, Alex Ferguson. McClaren requested Prozone. The firm had
been working gratis for Derby and had no paying customers, so
Mylvaganam insisted on a financial deal. United agreed to pay
Prozone £50,000 if the club won a trophy that year. That season,
United won the treble -- the Champions League, the Premier League
and the FA Cup -- and Prozone earned its first cheque. In May 1999,
Prozone had two clients and no revenue. By August 2000, six Premier
League clubs were paying customers. Mylvaganam and Ramsay sent
their sports scientists to football clubs to act as Prozone
consultants -- what some of them found wasn't what they were
expecting from multimillion-pound businesses. "At Aston Villa,
there was an old-school bucket-and-sponge physio who didn't really
speak to the manager and a manager who didn't really speak to the
players," says the current managing director of Prozone, Barry
McNeill. "There were few coaching meetings, no preparation
meetings and I was just a 22-year-old guy in a suit who had to
explain how the software I had in my PC could add value."

In 2000, Mylvaganam got a call from Sam Allardyce, the Bolton
manager. Allardyce had played in the North American Soccer League
with the Tampa Bay Rowdies, who shared a training ground with the
Tampa Bay Buccaneers, an American football team. He'd been
impressed with the Buccaneers' use of technology. Mylvaganam didn't
think a lower-division team such as Bolton could afford Prozone,
but also knew that if they performed well, they could prove even
better publicity than United's treble. Bolton became the first
lower-division team to use the system. They beat Preston in the
Championship playoffs 3-0 and were promoted to the Premier
League.

At Bolton, Allardyce conceptualised a rigid game plan around
data. His backroom staff included David Fallows, a former Prozone
analyst, Gavin Fleig, who had studied under Mike Hughes, and Ed
Sulley. Allardyce and his performance analysts had a model he
called "the Fantastic Four": four areas that dictated success. Out
of 38 games, they knew that a team had to prevent the opposition
scoring in at least 16 games to avoid relegation. They knew that if
they scored first they would have a 70 percent chance of winning
the game. They knew that set pieces such as free kicks and corners
accounted for nearly a third of the goals scored, and in-swinging
crosses were more successful than out-swinging, so they practised
not only those types of crosses but also defending against them.
They also discovered that they would have an 80 percent chance of
not losing if the players outworked their opposition by covering
more distance at speeds above 5.5m/s. Allardyce insisted on players
using long throw-ins, deep into the opponent's area -- if a player
failed to follow that simple command he'd go crazy because he knew
the odds of scoring had been reduced. Bolton's performance analysts
studied a huge number of throw-ins and Allardyce would organise
players in the places on the pitch where the ball had the highest
probability of landing, the so-called positions of maximum
opportunity, or "pomos", to increase the odds of scoring. "Pomos
weren't just relevant to throw-ins. In training, he would shout to
the players to attack their pomos when trying to score," says
Sulley. Between 2003 and 2007, Bolton recorded consecutive
top-eight finishes in the Premiership, a record of
consistency bettered only by the top four. They qualified for
the UEFA Cup for the first time in 2005 and again in 2006. When
Allardyce left in 2007, they had an impressive 39 points after
21 games.
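The work-rate measure mentioned above, distance covered at speeds above 5.5m/s, can be derived directly from tracking data sampled every 0.1 seconds. The sketch below is a minimal illustration, not Prozone's implementation. The function name, the coordinate format and the sample trace are assumptions.

```python
from math import hypot

SAMPLE_INTERVAL_S = 0.1  # sampling interval quoted in the article
HIGH_SPEED_MS = 5.5      # speed threshold quoted for Bolton's measure


def high_speed_distance(positions, dt=SAMPLE_INTERVAL_S, threshold=HIGH_SPEED_MS):
    """Sum the distance covered in steps where speed exceeds the threshold.

    positions: list of (x, y) pitch coordinates in metres, one per sample
    (an assumed format, chosen for illustration).
    """
    total = 0.0
    for (x0, y0), (x1, y1) in zip(positions, positions[1:]):
        step = hypot(x1 - x0, y1 - y0)
        if step / dt > threshold:
            total += step
    return total


# Hypothetical trace: two steps at 3 m/s, then three steps at 7 m/s.
trace = [(0.0, 0.0), (0.3, 0.0), (0.6, 0.0), (1.3, 0.0), (2.0, 0.0), (2.7, 0.0)]
print(round(high_speed_distance(trace), 2))
```

Only the three faster steps count towards the total. Comparing that figure for both teams over a full match gives the kind of "outworked the opposition" measure described above.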

Barry McNeill, current managing director at Prozone

Liam Sharp

Sally and Anderson argue in The Numbers Game that football is as
much a matter of skill as it is a question of luck. Goals are rare
and studies show that 44 percent are fortuitous. In any match, the
favourite team wins only 55 percent of the time. Football, they
conclude, is a game dominated by randomness. That, however, doesn't
mean that nothing can be done to influence its outcome. Football's
inherent randomness makes analysis even more impactful. "What makes
a difference isn't data," Anderson says, "but the brain cells that
can translate data into a theory of how you win football
matches."

On 11 October, 2013, England played a World Cup qualifier against Montenegro at Wembley Stadium. Here are some insights on the game from Prozone's analysis.

Heat Mapping: the red shows Everton left-back Leighton Baines’s territory during the game (both halves are superimposed here). Baines's corner-kicks are shown at the lower-right.

Analysts used to believe, for instance, that the distance run by
a player was a good indicator of individual performance and that a
team's ball possession had a positive correlation with winning.
Those numbers, however, turned out to be meaningless. Analysts now
know that it is the distance run by a player when sprinting that
indicates good performance, and that it is ball possession within
the last third of the pitch that correlates with success. Better
metrics imply a more refined understanding of the game. "Sometimes
we look only at the individuals and forget the context," says Blake
Wooster, a former director at Prozone, who now runs a sports
startup called 21st Club. "For instance, Barcelona's [Lionel] Messi
is one of the best players ever, but what would happen if you took
him out of that context and put him in another team? You can't
assess talent in a vacuum." An example of that type of contextual
statistics is a model recently developed by Prozone called "goal
expectation".

This assigns each shot a probability related to its position,
and thus determines how well a goalscorer is performing. The
statistic filters out the quality of the opposition and the
quality of the player's team. Last year, for instance,
Tottenham's Gareth Bale had 161 shots and 21 goals, when, according
to the goal-expectation model, he was due to score only 11. "Bale
would regularly shoot from situations with a low probability of
success, such as from a distance of 30 yards, and score," says Paul
Boanas, Prozone's senior account manager and a former performance
analyst. "This type of contextual information helps to explain
why he's worth so much."
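The arithmetic behind a model of this kind is straightforward: give every shot a probability of scoring, add the probabilities up and compare the total with the goals actually scored. The sketch below is an illustration only. Prozone's model is not described here, so the distance bands and probabilities are invented, and the sketch makes no attempt to account for the quality of the opposition or of the player's team.

```python
# Hypothetical scoring probabilities by shot distance, invented for
# illustration; they are not taken from Prozone's model.
def shot_probability(distance_yards):
    if distance_yards <= 6:
        return 0.35
    if distance_yards <= 18:
        return 0.12
    return 0.03


def expected_goals(shot_distances):
    """Sum the per-shot probabilities to get the goals a player was 'due'."""
    return sum(shot_probability(d) for d in shot_distances)


# A made-up season: mostly long-range efforts, plus a few closer chances.
shots = [30] * 20 + [15] * 10 + [5] * 4
actual_goals = 6

xg = expected_goals(shots)
print(f"expected {xg:.1f}, scored {actual_goals}, difference {actual_goals - xg:+.1f}")
```

A player who keeps scoring more than the model expects, as Bale did, is converting chances that the average shooter would miss.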

Visualised Passes: this web depicts England’s passes during the first half of a game. The blue arrows indicate successful passes and their direction. Red indicates the failed attempts.

Some of the most important elements of football remain very hard
to quantify and it's difficult to understand what we can't measure.
Consider defence. Using data from the last ten seasons of the
Premier League, Anderson and Sally compared the value of a goal
scored and the value of a goal conceded. They found that scoring a
goal, on average, is worth slightly more than one point, whereas
not conceding produces, on average, 2.5 points per match. "Goals
that don't happen are more valuable than goals that do happen,"
Anderson says. "It's counterintuitive. The question is: how do we
measure something that doesn't happen? The challenge is to see the
unseen."

Evaluating an attack consists of measuring what happens with the
ball: shots, passes, crosses, sprints. Although actions such as
tackles, clearances and saves give you a measure of defensive
performance, the essence of defence lies in collective behaviour
that happens off the ball -- marking, cutting off passing channels,
the positioning of defenders. This is a thorny problem that the
analysts at Manchester City are beginning to study. Using Prozone's
tracking data, they are quantifying variables such as the area
occupied by a team and the dispersion of the players. "We are
trying to understand how the individual players co-operate and
develop synergy as a team," Marques says. "Most analysis still
focuses on discrete variables and actions, but most important for
us is to understand the interactions."
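Both variables can be computed from a single frame of tracking data. The sketch below measures the area a team occupies as the convex hull around its players, and dispersion as the players' mean distance from the team's centroid. These definitions are assumptions chosen for illustration, not necessarily the ones City's analysts use, and the player positions are invented.

```python
from math import hypot


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def team_area(points):
    """Area of the convex hull around the players (shoelace formula)."""
    hull = convex_hull(points)
    n = len(hull)
    twice_area = sum(
        hull[i][0] * hull[(i + 1) % n][1] - hull[(i + 1) % n][0] * hull[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2


def dispersion(points):
    """Mean distance of the players from the team's centroid."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return sum(hypot(x - cx, y - cy) for x, y in points) / len(points)


# Hypothetical player positions in metres for a single frame.
frame = [(0, 0), (40, 0), (40, 30), (0, 30), (20, 15), (10, 10)]
print(team_area(frame), round(dispersion(frame), 1))
```

Tracked frame by frame, the two numbers show how compact a team stays when it loses the ball and how quickly it spreads out when it wins it back.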

Speed Profiling: the arrows show Tottenham Hotspur winger Andros Townsend’s runs during the second half of the game. Yellow arrows are 4-5.5m/s; orange 5.5-7m/s; and red 7m/s+.

Every week during the 2011-2012 season, Manchester City's
captain, Vincent Kompany, sat down with the other defenders and a
performance analyst, and examined their performance. "They would
look at videos and statistics and ask questions," Wilson says. "Was
the pressure effective? How many forced errors did they commit?
What would happen in the ten seconds after losing the ball? On the
basis of that analysis they would design their own defensive
tactics for the game. You can have a fantastic analytics team but
you can never win a game with data if you're not influencing the
behaviour of the players." At the end of the season, Manchester
City had conceded the fewest goals in the Premier League.
"We beat a lot of records that season," Wilson says. "Most of the
credit is due to the talent on the pitch. But I believe that about
30 percent of that success is down to how well we prepared and
maximised that talent."

Wilson missed the last game of that season, when City played
Queens Park Rangers. City were level on points with Manchester
United, but had a superior goal difference. "I had a flight but it
was delayed, so I ended up only watching the first half on TV,"
recalls Wilson. "By then we were winning 1-0, so I was
confident." In the second half, QPR scored twice. Two minutes into
stoppage time, City's striker Edin Džeko equalised. By then, United
were winning their match and, if nothing changed, would be the
champions.

Movement Tracing: this is Wayne Rooney’s movement profile during the second half. With Daniel Sturridge and Danny Welbeck in attack, England’s star forward is able to roam freely.

Two minutes later, City's attacker Sergio Aguero received the
ball on the edge of the box, in a position to shoot. According to
Prozone's goal-expectation model, he had a 12 percent chance of
scoring. Instead of shooting, he went around a defender to a corner
of the penalty area and, from a spot where he had a 19 percent
chance of scoring, slotted the ball past the keeper. By the time
Wilson landed at Gatwick, the news ticker running across the TV
screens was saying that Manchester City were the new champions.

Other teams in the field

Opta Sports
Prozone's main rival, London-based Opta distributes live sports
data not only to football clubs but also to media and
betting companies.

StatDNA
This US-based video analytics company not only captures game
events, but also provides statistics on defensive pressure and
pass difficulty.

Apollo MIS
Founded by Ram Mylvaganam after he left Prozone in 2005, Apollo
offers a platform that brings together training, medical and
performance data.