The 3 defects of the median

To calculate a mean, you add up all the values and then divide the sum by the number of items on your list.

The median is the half-way point of your list of numbers: half of the sample have values higher than the median and half have values lower than the median.
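The two definitions above can be written out by hand. This is only a minimal sketch: the function names and the sample numbers are made up for illustration and do not come from any real data.

```python
def mean(values):
    """Add up all the values, then divide by how many there are."""
    return sum(values) / len(values)


def median(values):
    """Return the half-way point of the sorted values."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd count: the middle item splits the list in half.
        return ordered[middle]
    # Even count: take the midpoint of the two middle items.
    return (ordered[middle - 1] + ordered[middle]) / 2


sample = [7, 1, 3, 9, 5]  # made-up numbers, for illustration only
print("mean:", mean(sample))
print("median:", median(sample))
```

In everyday work, Python's built-in statistics module offers mean and median functions that do the same job.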

The median is known for giving a fairer picture than the mean when the data contain outliers or when the population has a skewed distribution.

For example, suppose you want to know the average annual income of the following population.

The mean annual income of the population is 135,000 USD. But this figure is higher than what 80% of the population earns.

The median income is 90,000 USD. 50% of the population earns less than this amount, and 50% earns more. So, here the median represents the concept of “average” better than the mean does.
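To see how such a gap between mean and median can arise, here is a small sketch. The ten incomes below are hypothetical: they were chosen only so that the mean, the median and the 80% share match the figures quoted above, and they are not the original data.

```python
from statistics import mean, median

# Hypothetical annual incomes in USD, picked to reproduce the figures above.
incomes = [
    40_000, 50_000, 60_000, 70_000, 80_000,
    100_000, 110_000, 120_000, 220_000, 500_000,
]

mean_income = mean(incomes)
median_income = median(incomes)

# Share of people who earn less than the mean.
share_below_mean = sum(x < mean_income for x in incomes) / len(incomes)

print("mean income:", mean_income)
print("median income:", median_income)
print("share earning less than the mean:", share_below_mean)
```

Two very high incomes are enough to pull the mean above what most people in this made-up list earn, while the median stays in the middle.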

However, the median is not an impeccable statistic. There are several things that we should consider when using it for communicating statistical information.

Let me tell you about 3 “sins” of the median.

1/ Median does not convey the min and max values

Suppose that the median annual income of Yabada’s citizens is 120,000 USD (you will only find the name Yabada on a map of Wonderland). What is the highest salary?

You do not know the answer, right?

The highest salary could be 250,000, or 350,000, or even 1M USD. It could be whatever. Simply because, by itself, the median does not provide you with that information.

And even if you knew the highest value, you still could not tell the lowest value.

You will need more than just a simple median.
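A quick sketch makes the point concrete. Both income lists below are invented for illustration (Yabada has no real data): they share the same median of 120,000 USD, yet their lowest and highest values are very different.

```python
from statistics import median

# Two hypothetical towns with the same median income (USD).
town_a = [100_000, 110_000, 120_000, 130_000, 140_000]
town_b = [20_000, 60_000, 120_000, 400_000, 1_000_000]

for name, incomes in [("Town A", town_a), ("Town B", town_b)]:
    print(
        name,
        "median:", median(incomes),
        "min:", min(incomes),
        "max:", max(incomes),
    )
```

Reporting the minimum and maximum (or a few percentiles) next to the median is one way to fill the gap.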

2/ Median may lead to a false impression

“By the age of 7 months, your son should learn to sit without help,” says a doctor. He is trying to communicate the median concept (that is, the 50th percentile point) to parents.

Unfortunately, parents may not fully understand what the median means. So, translated into lay language, the doctor’s words may be misinterpreted as: “Normally, your son should learn to sit without help at the age of 7 months. If your son cannot sit at 7 months, he is developmentally delayed.”

In this case, 50% of parents could become anxious because their babies seem to be labeled “subnormal”. And the majority of those worries are unnecessary.

So, which data should pediatricians communicate to parents?

In an article, Laura Sices proposed 3 figures that doctors should bring into the consultation [1].

The first figure makes parents aware of when their babies may begin to acquire a given developmental skill. This matters because some skills put infants and young children at risk of physical injury. For example, parents should know the age at which their babies may start to explore the world by mouth, and they should hide tiny things that could be dangerous to their offspring before this time. The 10th percentile is chosen for this purpose (i.e. the point at which 10% of children will have acquired the relevant skill).

The second figure provides information about what is typical at a given age. The 50th percentile (that is, the median) is helpful in this case.

The third figure serves as a red flag. If a child is not yet demonstrating an important skill observed in most children of that age, the child requires further screening and assessment. The 90th percentile (or another threshold derived from clinical research) is chosen for this purpose (i.e. the point at which 90% of children will have acquired the skill, so a ‘red flag’ would be raised if a child had not acquired the skill by this point).
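The three figures described above can be read off a sample with Python's statistics.quantiles function. This is only a sketch: the ages below are made-up numbers, not clinical data, and the variable names are chosen for illustration.

```python
from statistics import median, quantiles

# Hypothetical ages (in months) at which 19 babies first sat without help.
ages_months = [
    5, 5.5, 6, 6, 6.5, 6.5, 7, 7, 7, 7,
    7.5, 7.5, 8, 8, 8.5, 9, 9, 9.5, 10,
]

# n=10 splits the data into ten equal groups and returns the nine cut points
# (the 10th, 20th, ..., 90th percentiles).
deciles = quantiles(ages_months, n=10)
p10, p50, p90 = deciles[0], deciles[4], deciles[8]

print("10th percentile (some babies start around here):", p10)
print("50th percentile (the median, what is typical):", p50)
print("90th percentile (possible red flag after this):", p90)
```

The 50th percentile computed this way is the same value that the median function returns for the sample.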

The median survival time provides another example of where a false impression could be created.

If the median survival time in a given disease is eight months, the chance of surviving eight months or longer after starting treatment is 50%. However, in lay language, it is frequently misinterpreted as “I will probably be dead in eight months” [3]. My goodness!

Just keep in mind that 50% of patients live longer than that point.

3/ Median is not good for planning

You are planning for a picnic. And you are wondering how many pizzas you should prepare for 10 people.

In this case, an arithmetic average (that is, the mean value) will work. Remember that mean = sum (the total number of pizzas needed) / the number of participants. So, just multiply the mean by 10 people. For example, if the mean is one pizza per person, you will know that you should prepare 10 pizzas for the trip.
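Here is a small sketch of why the mean is the figure to plan with. The appetites below are invented, and the sketch assumes the 10 guests will eat like the people in this made-up list.

```python
from statistics import mean, median

# Hypothetical number of pizzas each of 10 people ate at a past picnic.
pizzas_eaten = [0.5, 0.5, 0.5, 0.5, 0.5, 1, 1, 1.5, 2, 2]
guests = 10

plan_from_mean = mean(pizzas_eaten) * guests
plan_from_median = median(pizzas_eaten) * guests

print("pizzas actually eaten:", sum(pizzas_eaten))
print("plan based on the mean:", plan_from_mean)
print("plan based on the median:", plan_from_median)
```

Multiplying the mean by the number of people gives back the total, because the mean is defined as the total divided by that number. The median has no such link to the total, so in this made-up case a plan based on it would leave the group short of pizza.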

Nice question, Freya. As I noted in the article above, the median reflects the average value in a better way when the data is skewed. And that’s the case with survival data. Imagine that in a study with 100 cancer patients, 99 die before 5 months and 1 survives 5 years. If we use the mean, the average survival time can be > 5 months => longer than the survival time of 99% of patients. The median gives us a better estimate.
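The survival example in this reply can be checked with a few lines of code. The exact survival times are an assumption made for the sketch: 99 patients are each given 4.9 months (just under 5 months), and 1 patient is given 60 months (5 years).

```python
from statistics import mean, median

# Assumed survival times in months: 99 patients at 4.9, one patient at 60.
survival_months = [4.9] * 99 + [60]

mean_survival = mean(survival_months)
median_survival = median(survival_months)

# How many patients lived for less time than the mean suggests?
below_mean = sum(t < mean_survival for t in survival_months)

print("mean survival (months):", round(mean_survival, 2))
print("median survival (months):", median_survival)
print("patients who lived less than the mean:", below_mean, "of", len(survival_months))
```

With these assumed numbers, a single long survivor lifts the mean above 5 months, while the median stays at the survival time of a typical patient. If the 99 patients died much earlier, the mean would be lower, but it would still sit above what 99% of patients experienced.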
