Measures Of Central Tendency

10.2 Measures of central tendency (EMA6Y)

Mean (EMA6Z)

Mean

The mean is the sum of a set of values, divided by the number of values in the set. The notation for the mean of a set of values is a horizontal bar over the variable used to represent the set, for example \(\bar{x}\). The formula for the mean of a data set \(\left\{{x}_{1}; {x}_{2};...;{x}_{n}\right\}\) is:

\[\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \cdots + x_n}{n}\]

Worked example 3: Calculating the mean

What is the mean of the data set \(\left\{10; 20; 30; 40; 50\right\}\)?

Calculate the sum of the data

\[10 + 20 + 30 + 40 + 50 = 150\]

Divide by the number of values in the data set to get the mean

Since there are \(\text{5}\) values in the data set, the mean is:

\[\text{mean} = \frac{150}{5} = 30\]
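
The same calculation can be sketched in a few lines of Python. This is only an illustration: the function name `mean` and the variable `data` are chosen here and do not come from the text.

```python
def mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


data = [10, 20, 30, 40, 50]
print(mean(data))  # 150 / 5 = 30.0
```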

Median (EMA72)

Median

The median of a data set is the value in the central position, when the data set has been arranged from the lowest to the highest value.

Note that at most half of the values in the data set are less than the median and at most half are greater than the median.

To calculate the median of a quantitative data set, first sort the data from the smallest to the largest value and then find the value in the middle. If there is an odd number of values in the data set, the median will be equal to one of the values in the data set. If there is an even number of values in the data set, the median will lie halfway between two values in the data set.

Worked example 4: Median for an odd number of values

What is the median of \(\left\{10; 14; 86; 2; 68; 99; 1\right\}\)?

Sort the values

The values in the data set, arranged from the smallest to the largest, are

\[1; 2; 10; 14; 68; 86; 99\]

Find the number in the middle

There are \(\text{7}\) values in the data set. Since there is an odd number of values, the median is equal to the value in the middle, namely the value in the fourth position. Therefore the median of the data set is \(\text{14}\).

Worked example 5: Median for an even number of values

What is the median of \(\left\{11; 10; 14; 86; 2; 68; 99; 1\right\}\)?

Sort the values

The values in the data set, arranged from the smallest to the largest, are

\[1; 2; 10; 11; 14; 68; 86; 99\]

Find the number in the middle

There are \(\text{8}\) values in the data set. Since there is an even number of values, the median lies halfway between the two values in the middle, namely between the fourth and fifth positions. The value in the fourth position is \(\text{11}\) and the value in the fifth position is \(\text{14}\). The median is therefore

\[\text{median} = \frac{11 + 14}{2} = \text{12,5}\]
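
Both cases, an odd and an even number of values, can be handled by one short Python function. The sketch below is one possible way to write it; the name `median` is an assumption made for this illustration, and Python writes the decimal comma as a point.

```python
def median(values):
    """Return the middle value of the sorted data set."""
    ordered = sorted(values)  # arrange from smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd number of values: the median is the value in the middle.
        return ordered[middle]
    # Even number of values: halfway between the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([10, 14, 86, 2, 68, 99, 1]))      # 14
print(median([11, 10, 14, 86, 2, 68, 99, 1]))  # 12.5
```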

Mode (EMA73)

Mode

The mode of a data set is the value that occurs most often in the set. The mode can also be described as the most frequent or most common value in the data set.

To calculate the mode, we simply count the number of times that each value appears in the data set and then find the value that appears most often.

A data set can have more than one mode if there is more than one value with the highest count. For example, both \(\text{2}\) and \(\text{3}\) are modes in the data set \(\left\{1; 2; 2; 3; 3\right\}\). If all points in a data set occur with equal frequency, it is equally accurate to describe the data set as having many modes or no mode.
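
The counting procedure can be sketched in Python as follows. The function name `modes` is chosen for this illustration; it returns a list because a data set can have more than one mode.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)  # how many times each value appears
    if not counts:
        return []
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([1, 2, 2, 3, 3]))  # [2, 3]
```

If every value occurs equally often, this function returns all of the values, which matches the "many modes" description above.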

Count the number of times that each value appears in the data set

Find the value that appears most often

From the table above we can see that \(\text{4}\) is the only value that appears \(\text{3}\) times. All the other values appear fewer than \(\text{3}\) times. Therefore the mode of the data set is \(\text{4}\).

One problem with using the mode as a measure of central tendency is that we usually cannot compute the mode of a continuous data set. Since continuous values can lie anywhere on the real line, any particular value will almost never repeat. This means that the frequency of each value in the data set will usually be \(\text{1}\) and that there will be no mode. We will look at one way of addressing this problem in the section on grouping data.

Worked example 7: Comparison of measures of central tendency

There are regulations in South Africa related to bread production to protect consumers. By law, if a loaf of bread is not labelled, it must weigh \(\text{800}\) \(\text{g}\), with a leeway of \(\text{5}\) percent under or \(\text{10}\) percent over. Vishnu is interested in how a well-known, national retailer measures up to this standard. He visited his local branch of the supplier and recorded the masses of \(\text{10}\) different loaves of bread each day for one week. The results, in grams, are given below:

\[\begin{array}{|c|c|c|c|c|c|c|}
\hline
\text{Monday} & \text{Tuesday} & \text{Wednesday} & \text{Thursday} & \text{Friday} & \text{Saturday} & \text{Sunday} \\
\hline
\text{802,4} & \text{787,8} & \text{815,7} & \text{807,4} & \text{801,5} & \text{786,6} & \text{799,0} \\
\text{796,8} & \text{798,9} & \text{809,7} & \text{798,7} & \text{818,3} & \text{789,1} & \text{806,0} \\
\text{802,5} & \text{793,6} & \text{785,4} & \text{809,3} & \text{787,7} & \text{801,5} & \text{799,4} \\
\text{819,6} & \text{812,6} & \text{809,1} & \text{791,1} & \text{805,3} & \text{817,8} & \text{801,0} \\
\text{801,2} & \text{795,9} & \text{795,2} & \text{820,4} & \text{806,6} & \text{819,5} & \text{796,7} \\
\text{789,0} & \text{796,3} & \text{787,9} & \text{799,8} & \text{789,5} & \text{802,1} & \text{802,2} \\
\text{789,0} & \text{797,7} & \text{776,7} & \text{790,7} & \text{803,2} & \text{801,2} & \text{807,3} \\
\text{808,8} & \text{780,4} & \text{812,6} & \text{801,8} & \text{784,7} & \text{792,2} & \text{809,8} \\
\text{802,4} & \text{790,8} & \text{792,4} & \text{789,2} & \text{815,6} & \text{799,4} & \text{791,2} \\
\text{796,2} & \text{817,6} & \text{799,1} & \text{826,0} & \text{807,9} & \text{806,7} & \text{780,2} \\
\hline
\end{array}\]

Is this data set qualitative or quantitative? Explain your answer.

Determine the mean, median and mode of the mass of a loaf of bread for each day of the week. Give your answer correct to 1 decimal place.

Based on the data, do you think that this supplier is providing bread within the South African regulations?

Qualitative or quantitative?

Since each mass can be represented by a number, the data set is quantitative. Furthermore, since a mass can be any real number, the data are continuous.

Calculate the mean

In each column (for each day of the week), we add up the measurements and divide by the number of measurements, \(\text{10}\).

For Monday, the sum of the measured values is \(\text{8 007,9}\) and so the mean for Monday is

\[\frac{\text{8 007,9}}{10} \approx \text{800,8}\text{ g}\]

In the same way, we can compute the mean for each day of the week. See the table below for the results.

Calculate the median

In each column we sort the numbers from lowest to highest and find the value in the middle. Since there is an even number of measurements (\(\text{10}\)), the median is halfway between the two numbers in the middle.

For Monday, the two numbers in the middle are \(\text{801,2}\) and \(\text{802,4}\) and so the median is

\[\frac{\text{801,2} + \text{802,4}}{2} = \text{801,8}\text{ g}\]

In the same way, we can compute the median for each day of the week:

\[\begin{array}{|l|c|c|}
\hline
\text{Day} & \text{Mean} & \text{Median} \\
\hline
\text{Monday} & \text{800,8 g} & \text{801,8 g} \\
\text{Tuesday} & \text{797,2 g} & \text{796,1 g} \\
\text{Wednesday} & \text{798,4 g} & \text{797,2 g} \\
\text{Thursday} & \text{803,4 g} & \text{800,8 g} \\
\text{Friday} & \text{802,0 g} & \text{804,3 g} \\
\text{Saturday} & \text{801,6 g} & \text{801,4 g} \\
\text{Sunday} & \text{799,3 g} & \text{800,2 g} \\
\hline
\end{array}\]

From the above calculations we can see that the means and medians are close to one another, but not quite equal. In the next worked example we will see that the mean and median are not always close to each other.
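
Monday's results can be checked with Python's built-in `statistics` module. The sketch below uses only Monday's ten measurements; the variable names are chosen for this illustration, and decimal commas are written as points because that is how Python writes decimals.

```python
from statistics import mean, median

# Monday's ten measurements in grams (802,4 g is written as 802.4).
monday = [802.4, 796.8, 802.5, 819.6, 801.2, 789.0, 789.0, 808.8, 802.4, 796.2]

# Round to 1 decimal place, as the question asks.
monday_mean = round(mean(monday), 1)
monday_median = round(median(monday), 1)

print(monday_mean)    # 800.8
print(monday_median)  # 801.8
```

The other days can be checked in the same way by replacing the list with that day's measurements.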

Determine the mode

Since the data are continuous we cannot compute the mode. In the next section we will see how we can group data in order to make it possible to compute an approximation for the mode.

Conclusion: Is the supplier reliable?

From the question, the requirements are that the mass of a loaf of bread be between \(\text{800}\) \(\text{g}\) minus \(\text{5}\%\), which is \(\text{760}\) \(\text{g}\), and plus \(\text{10}\%\), which is \(\text{880}\) \(\text{g}\). Since every one of the measurements made by Vishnu lies within this range and since the means and medians are all close to \(\text{800}\) \(\text{g}\), we can conclude that the supplier is reliable.
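
The allowed range can also be worked out with a short Python sketch. The function `allowed_range` and its parameter names are hypothetical, introduced only for this illustration; the smallest and largest measurements are read off from Vishnu's data.

```python
def allowed_range(nominal, percent_under, percent_over):
    """Return the lowest and highest mass allowed around the nominal mass."""
    lowest = nominal - nominal * percent_under / 100
    highest = nominal + nominal * percent_over / 100
    return lowest, highest


lowest, highest = allowed_range(800, 5, 10)
print(lowest, highest)  # 760.0 880.0

# Smallest and largest of the 70 measurements, in grams.
smallest_measured = 776.7
largest_measured = 826.0
print(lowest <= smallest_measured and largest_measured <= highest)  # True
```

If the smallest and largest measurements are inside the range, then every measurement is inside the range.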

Outlier

An outlier is a value in the data set that is not typical of the rest of the set. It is usually a value that is much greater or much less than all the other values in the data set.

Worked example 8: Effect of outliers on mean and median

The heights of \(\text{10}\) learners are measured in centimetres to obtain the following data set:

\[\left\{150; 172; 153; 156; 146; 157; 157; 143; 168; 157\right\}\]

Afterwards, we include one more learner in the group, who is exceptionally tall at \(\text{181}\) \(\text{cm}\).

Compare the mean and median of the heights of the learners before and after the eleventh learner was included.

Calculate the mean of the first \(\text{10}\) learners

\[\text{mean} = \frac{150 + 172 + 153 + 156 + 146 + 157 + 157 + 143 + 168 + 157}{10} = \frac{\text{1 559}}{10} = \text{155,9}\text{ cm}\]

Calculate the mean of all \(\text{11}\) learners

\[\text{mean} = \frac{\text{1 559} + 181}{11} = \frac{\text{1 740}}{11} \approx \text{158,2}\text{ cm}\]

The mean therefore increases by about \(\text{2,3}\) \(\text{cm}\) when we include the eleventh learner.

Calculate the median of the first \(\text{10}\) learners

To find the median, we first sort the heights:

\[\left\{143; 146; 150; 153; 156; 157; 157; 157; 168; 172\right\}\]

Since there is an even number of values, the median lies halfway between the fifth and sixth values:

\[\text{median} = \frac{156 + 157}{2} = \text{156,5}\text{ cm}\]

Calculate the median of all \(\text{11}\) learners

Now, with \(\text{11}\) values, the median is the sixth value: \(\text{157}\) \(\text{cm}\). So, the median changes by only \(\text{0,5}\) \(\text{cm}\) when we add the outlier value to the data set.

In general, the median is less affected by the addition of outliers to a data set than the mean is. This is important because it is quite common that outliers are measured during an experiment, because of problems with the equipment or unexpected interference.
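
The effect of the outlier in this worked example can be checked with Python's built-in `statistics` module. This is a small sketch; the variable names are chosen for this illustration.

```python
from statistics import mean, median

heights = [150, 172, 153, 156, 146, 157, 157, 143, 168, 157]
with_outlier = heights + [181]  # include the exceptionally tall learner

# The mean moves by more than 2 cm ...
print(round(mean(heights), 1), round(mean(with_outlier), 1))  # 155.9 158.2

# ... while the median moves by only 0,5 cm.
print(median(heights), median(with_outlier))  # 156.5 157
```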

A group of 10 friends each have some stones. They work out that the mean number of stones they have is 6. Then 7 friends leave with an unknown number (\(x\)) of stones. The remaining 3 friends work out that the mean number of stones they have left is \(\text{12,33}\).

When the 7 friends left, how many stones did they take with them?

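If the mean number of stones the group originally had was 6 then the total number of stones must have been:

\[10 \times 6 = 60\]

The remaining 3 friends have \(3 \times \text{12,33} \approx 37\) stones, so the 7 friends who left took \(x = 60 - 37 = 23\) stones with them.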

A group of 9 friends each have some coins. They work out that the mean number of coins they have is 4. Then 5 friends leave with an unknown number (\(x\)) of coins. The remaining 4 friends work out that the mean number of coins they have left is \(\text{2,5}\).

When the 5 friends left, how many coins did they take with them?

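If the mean number of coins the group originally had was 4 then the total number of coins must have been:

\[9 \times 4 = 36\]

The remaining 4 friends have \(4 \times \text{2,5} = 10\) coins, so the 5 friends who left took \(x = 36 - 10 = 26\) coins with them.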

A group of 9 friends each have some marbles. They work out that the mean number of marbles they have is 3. Then 3 friends leave with an unknown number (\(x\)) of marbles. The remaining 6 friends work out that the mean number of marbles they have left is \(\text{1,17}\).

When the 3 friends left, how many marbles did they take with them?

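If the mean number of marbles the group originally had was 3 then the total number of marbles must have been:

\[9 \times 3 = 27\]

The remaining 6 friends have \(6 \times \text{1,17} \approx 7\) marbles, so the 3 friends who left took \(x = 27 - 7 = 20\) marbles with them.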
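
The three problems above follow the same pattern: multiply each mean by the number of friends to get a total, then subtract. A minimal Python sketch of that pattern is given below. The function `taken_by_leavers` and its parameter names are hypothetical, and the result is rounded to a whole number on the assumption that the given means were rounded to two decimal places.

```python
def taken_by_leavers(friends, mean_before, remaining, mean_after):
    """Return how many items the friends who left took with them."""
    total_before = friends * mean_before  # total = number of values x mean
    total_after = remaining * mean_after
    # The given means are rounded, so round the difference to a whole number.
    return round(total_before - total_after)


print(taken_by_leavers(10, 6, 3, 12.33))  # stones: 23
print(taken_by_leavers(9, 4, 4, 2.5))     # coins: 26
print(taken_by_leavers(9, 3, 6, 1.17))    # marbles: 20
```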

The mode is the age that occurs most often. We have 5 ages to work with and we know one of the ages is 3 (from the median). So the ordered data set is: \(\{x_1 ; x_2 ; 3; x_4 ; x_5\}\) (remember that we always calculate mean, mode and median using the ordered data set). We are told that the mode is 2. Looking at the ordered data set we see that either \(x_1\) or \(x_2\) must be 2 (\(x_4\) and \(x_5\) cannot be 2 as that would make the data set unordered). However, if only one of these values is 2 then the mode will not be 2. Therefore \(x_1 = x_2 = 2\).

\(x_4\) and \(x_5\) can be any two different numbers greater than \(\text{3}\) that add up to \(\text{18}\) (if they were the same, or if one of them were \(\text{3}\), then \(\text{2}\) would not be the only mode), for example \(\text{6}\) and \(\text{12}\), or \(\text{8}\) and \(\text{10}\), or \(\text{4}\) and \(\text{14}\).

Note that the set of ages must be ordered, the median value must be \(\text{3}\) and there must be \(\text{2}\) ages of \(\text{2}\).

Four friends each have some marbles. They work out that the mean number of marbles they have is \(\text{10}\). One friend leaves with \(\text{4}\) marbles. How many marbles do the remaining friends have together?

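Let the number of marbles per friend be \(x_1, ~x_2, ~x_3\) and \(x_4\). Since the mean is \(\text{10}\), the friends have \(x_1 + x_2 + x_3 + x_4 = 4 \times 10 = 40\) marbles in total. One friend leaves with \(\text{4}\) marbles, so the remaining friends have \(40 - 4 = 36\) marbles together.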

All Siyavula textbook content made available on this site is released under the terms of a
Creative Commons Attribution License.
Embedded videos, simulations and presentations from external sources are not necessarily covered
by this license.