
UNDERSTANDING Statistics and Probability

Statistics and probability help us analyse present conditions and predict the future. Let’s apply them to the UEFA Champions League 2012-13 (yeah, the one going on) and predict who’ll be the winner at Wembley Stadium on 25th May 2013, according to the present conditions! If you’re not a supporter of the winning team, don’t fret: probability never rules out an upset. You never know when that corner might produce a winning goal!

STATISTICS

Tallying and grouping data helps you consolidate the given information. Once the data is put together, it becomes easy to compare different items.

FREQUENCY DISTRIBUTION BAR GRAPH

Measures of Central Tendency

Why would one calculate quantities like the mean, mode and median? Each is a single value that summarises the data and helps us estimate the chances of an event happening. Statisticians use the mean, mode or median as a standard against which other values are compared to judge the probability of an event.

MEAN

The mean is the average of the points scored by the teams. Let’s understand it in terms of soccer! It tells you how many points a team in the league has scored on average. Teams with more points than the average have a greater chance of winning. So the next time you boast that a team’s going to win, don’t just back it up with your gut but with the math too.

Mean = Sum of all points/No. of teams

= (12+10+12+8+14+11+12+10+13+13+13+13+10+12+10+15)/16

= 11.75
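The same calculation can be sketched in a few lines of Python. This is only an illustration: the list `points` copies the sixteen values from the sum above, and the helper name `mean` is chosen here for convenience.

```python
# Group-stage points of the 16 teams, copied from the sum above.
points = [12, 10, 12, 8, 14, 11, 12, 10, 13, 13, 13, 13, 10, 12, 10, 15]

def mean(values):
    """Arithmetic mean: the sum of all values divided by how many there are."""
    return sum(values) / len(values)

average = mean(points)
print(average)  # 11.75

# Count how many teams sit above and below the average.
above = [p for p in points if p > average]
below = [p for p in points if p < average]
print(len(above), len(below))  # 10 6
```

So ten of the sixteen teams are above the average and six are below it.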

| Teams Below Average Points (< 11.75) | Teams Above Average Points (> 11.75) |
| --- | --- |
| Arsenal FC | Paris Saint-Germain FC |
| AC Milan | FC Porto |
| Real Madrid CF | FC Schalke 04 |
| Chelsea FC | Malaga CF |
| Celtic FC | Borussia Dortmund |
| Galatasaray AS | Juventus |
| | FC Bayern Munchen |
| | Valencia CF |
| | FC Barcelona |
| | Manchester United |

MODE

The mode of the data is the value that occurs most frequently. In soccer language, it is the points total that the largest number of teams have managed to score. It might not necessarily help us predict the winner, but it helps us compare different soccer leagues (La Liga, Bundesliga, Europa League etc.).

The mode can be read off the frequency table we drew earlier. However, the data may have more than one mode, as it does here: 10, 12 and 13 each occur four times, more often than any other score. In that case we draw another frequency table and introduce what we call a MODAL CLASS. A class is just a range over which the data values are distributed. Here, one can be drawn as follows.

| Class Range (No. of Points) | Frequency |
| --- | --- |
| 8-9 | 1 |
| 10-11 | 5 |
| 12-13 | 8 |
| 14-15 | 2 |

Here we observe that most of the teams have scored points in the range of 12-13. Hence, 12-13 is the MODAL CLASS.

Statisticians using the mode as a standard for predicting chances will say that teams scoring 12-13 points have a higher chance of winning.

New Predicted Possible Winners

FC Porto

FC Schalke 04

Malaga CF

Juventus

FC Bayern Munchen

Valencia CF

FC Barcelona

Manchester United
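As a rough check, the sketch below finds the modes of the raw points and then groups them into classes. The non-overlapping two-point classes starting at 8 (8-9, 10-11, 12-13, 14-15) and the helper names `modes` and `modal_class` are assumptions made for this illustration.

```python
from collections import Counter

# Group-stage points of the 16 teams, as used in the mean calculation.
points = [12, 10, 12, 8, 14, 11, 12, 10, 13, 13, 13, 13, 10, 12, 10, 15]

def modes(values):
    """Return every value that occurs the maximum number of times, sorted."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

def modal_class(values, start, width):
    """Group values into classes of the given width, beginning at `start`,
    and return (low, high, frequency) for the most populated class.
    If two classes tie, the one encountered first in the data is returned."""
    class_counts = Counter((v - start) // width for v in values)
    index, freq = max(class_counts.items(), key=lambda item: item[1])
    low = start + index * width
    return low, low + width - 1, freq

print(modes(points))              # [10, 12, 13]
print(modal_class(points, 8, 2))  # (12, 13, 8)
```

Because each score falls into exactly one class, the class frequencies add up to the 16 teams.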

MEDIAN

The median gives us the middle value of the data. Take care to find the median only after the entire data set has been arranged in order (it is not the middle of randomly presented data).

If the number of observations (here the number of teams) is odd, the ((n+1)/2)th item is the median.

If the number of observations is even, the median is the average of the (n/2)th and (n/2 + 1)th items.

Here, 16 teams have been chosen.

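| Team Number | Points |
| --- | --- |
| 1 | 8 |
| 2 | 10 |
| 3 | 10 |
| 4 | 10 |
| 5 | 10 |
| 6 | 11 |
| 7 | 12 |
| 8 | 12 |
| 9 | 12 |
| 10 | 12 |
| 11 | 13 |
| 12 | 13 |
| 13 | 13 |
| 14 | 13 |
| 15 | 14 |
| 16 | 15 |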

Since there are 16 teams, the median is the average of the scores of teams 8 and 9.

Hence, the median = (12 + 12)/2 = 12

If the median is chosen as the standard for comparison, teams scoring more than 12 points have a greater chance of winning.
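The odd and even rules can be sketched as a short Python function. The name `median` and the list `points` are chosen here for illustration; the function sorts the data first, so the input may be given in any order.

```python
def median(values):
    """Median of a list: sort first, then take the middle item(s)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the ((n+1)/2)th item (1-based) is index `mid` (0-based).
        return ordered[mid]
    # Even count: average the (n/2)th and (n/2 + 1)th items (1-based).
    return (ordered[mid - 1] + ordered[mid]) / 2

# Group-stage points of the 16 teams, deliberately left unsorted.
points = [12, 10, 12, 8, 14, 11, 12, 10, 13, 13, 13, 13, 10, 12, 10, 15]
print(median(points))  # 12.0
```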

Representing Data

There are numerous ways to present your data, and the bar graph is only one of them. Let’s explore a new way to show the scoreboard: the pie chart.

| Points Scored | Number of Teams |
| --- | --- |
| 8 | 1 |
| 10 | 4 |
| 11 | 1 |
| 12 | 4 |
| 13 | 4 |
| 14 | 1 |
| 15 | 1 |

To draw a pie chart, we need to represent each part of the data as a proportion of 360, because there are 360 degrees in a circle.

For example, if 4 of the 16 teams have scored 10 points, we represent this on the circle as a sector with an angle of (4/16) x 360 = 90 degrees.

| Points Scored | No. of Teams | Calculation | Degrees |
| --- | --- | --- | --- |
| Eight | 1 | 1/16 x 360 | 22.5 |
| Ten | 4 | 4/16 x 360 | 90 |
| Eleven | 1 | 1/16 x 360 | 22.5 |
| Twelve | 4 | 4/16 x 360 | 90 |
| Thirteen | 4 | 4/16 x 360 | 90 |
| Fourteen | 1 | 1/16 x 360 | 22.5 |
| Fifteen | 1 | 1/16 x 360 | 22.5 |
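The angle calculation can be reproduced with a small Python sketch. The helper name `pie_angles` is an assumption made for this example; it applies the rule (teams with that score / total teams) x 360 to every distinct score.

```python
from collections import Counter

# Group-stage points of the 16 teams.
points = [12, 10, 12, 8, 14, 11, 12, 10, 13, 13, 13, 13, 10, 12, 10, 15]

def pie_angles(values):
    """Map each distinct value to the angle of its pie-chart sector in degrees."""
    counts = Counter(values)
    total = len(values)
    return {v: counts[v] / total * 360 for v in sorted(counts)}

angles = pie_angles(points)
for score, angle in angles.items():
    print(score, angle)

# The sectors together must fill the whole circle.
print(sum(angles.values()))  # 360.0
```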

A scatter plot is yet another way of depicting your data. It is quite similar to a bar graph, with a point marking each value instead of a bar.

PROBABILITY

Probability tells us how likely a certain outcome is.

To see which teams have a greater chance of winning, we will look at the winners of all the UEFA Champions League finals so far. The following probability has been calculated on the assumption that each club remains well funded and that only a few player trades have occurred (few enough not to affect the running of the club as a whole).

We are looking only at the data for the teams shortlisted in the 2012-13 UEFA Champions League (the top 16).

Total UEFA Finals Played so far = 34
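An estimate of this kind is a relative frequency: a team’s past wins divided by the total number of finals counted. Below is a minimal sketch of that idea. The team names and title counts are made up for illustration and are not real records; only the total of 34 comes from this post, and the helper name `win_probabilities` is likewise an assumption.

```python
def win_probabilities(titles):
    """Relative frequency: each team's titles divided by the total counted."""
    total = sum(titles.values())
    return {team: count / total for team, count in titles.items()}

# Hypothetical counts for illustration only; they are NOT real records.
# They are chosen so that they add up to 34, the total quoted in the post.
example_titles = {"Team A": 17, "Team B": 10, "Team C": 7}

probabilities = win_probabilities(example_titles)
for team, p in probabilities.items():
    print(team, round(p, 3))
```

The probabilities always add up to 1, and the team with the most past titles gets the highest estimate.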

We observe that Real Madrid has a high probability of winning over the years. Now that we have the predictions in hand,