Inspired by "real-life examples of common distributions", I wonder what pedagogical examples people use to demonstrate negative skewness? There are many "canonical" examples of symmetric or normal distributions used in teaching - even if ones like height and weight don't survive closer biological scrutiny! Blood pressure might be nearer normality. I like astronomical measurement errors - of historic interest, they are intuitively no more likely to lie in one direction than another, with small errors more likely than large.

Common pedagogical examples for positive skewness include people's incomes; mileage on used cars for sale; reaction times in a psychology experiment; house prices; number of accident claims by an insurance customer; number of children in a family. Their physical reasonableness often stems from being bounded below (usually by zero), with low values being plausible, even common, yet very large (sometimes orders of magnitude higher) values are well-known to occur.

For negative skew, I find it harder to give unambiguous and vivid examples that a younger audience (high schoolers) can intuitively grasp, perhaps because fewer real-life distributions have a clear upper bound. A bad-taste example I was taught at school was "number of fingers". Most folk have ten, but some lose one or more in accidents. The upshot was "99% of people have a higher-than-average number of fingers"! Polydactyly complicates the issue, as ten is not a strict upper bound; since both missing and extra fingers are rare events, it may be unclear to students which effect predominates.

I usually use a binomial distribution with high $p$. But students often find "number of satisfactory components in a batch is negatively skewed" less intuitive than the complementary fact that "number of faulty components in a batch is positively skewed". (The textbook is industrially themed; I prefer cracked and intact eggs in a box of twelve.) Maybe students feel that "success" should be rare.
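The egg-box version can be checked numerically. The following is a minimal sketch, assuming a box of twelve and an illustrative $p = 0.9$ for an egg being intact (the value of $p$ is an assumption, not a figure from the textbook). It computes the skewness directly from the binomial pmf and compares it with the closed form $(1-2p)/\sqrt{np(1-p)}$.

```python
from math import comb, sqrt

def binomial_skewness(n, p):
    """Skewness of Binomial(n, p), computed directly from the pmf."""
    pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    mean = sum(k * w for k, w in enumerate(pmf))
    var = sum((k - mean) ** 2 * w for k, w in enumerate(pmf))
    third = sum((k - mean) ** 3 * w for k, w in enumerate(pmf))
    return third / var ** 1.5

n, p = 12, 0.9                            # assumed box size and P(intact)
intact = binomial_skewness(n, p)          # number of intact eggs per box
cracked = binomial_skewness(n, 1 - p)     # number of cracked eggs per box
closed_form = (1 - 2 * p) / sqrt(n * p * (1 - p))

print(f"intact: {intact:.4f}, cracked: {cracked:.4f}, formula: {closed_form:.4f}")
```

The two counts are mirror images of each other, so their skewnesses have equal size and opposite sign.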

Another option is to point out that if $X$ is positively skewed then $-X$ is negatively skewed, but to place this in a practical context ("negative house prices are negatively skewed") seems doomed to pedagogical failure. While there are benefits to teaching the effects of data transformations, it seems wise to give a concrete example first. I would prefer one that does not seem artificial, where the negative skew is quite unambiguous, and for which students' life-experience should give them an awareness of the shape of the distribution.

It is not apparent that negating a variable will be a "pedagogical failure," because there is the option of adding a constant without changing the shape of the distribution. Many skewed distributions involve proportions $X$, for instance, and the complementary proportions $1-X$ are usually just as natural and easy to interpret as the original proportions. Even with house prices $X$, the values $C-X$, where $C$ is a maximum house price in the area, could be of interest and are not difficult to understand. Also consider using logs and negative power transformations to create negative skew.
– whuber♦ Mar 7 '14 at 17:30
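The point about complementary proportions can be illustrated with a rough sketch. The proportions below are invented purely for illustration, and `skewness` is a helper defined here (the standardised third moment), not something from the thread.

```python
def skewness(values):
    """Moment coefficient of skewness (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical proportions X: mostly small, with a few large values.
x = [0.02, 0.03, 0.05, 0.05, 0.08, 0.10, 0.15, 0.30, 0.60]
complement = [1 - v for v in x]          # the complementary proportions 1 - X

skew_x = skewness(x)
skew_complement = skewness(complement)
print(f"skew(X) = {skew_x:.3f}, skew(1 - X) = {skew_complement:.3f}")
```

Reflecting and shifting changes only the sign of the skewness, so any positively skewed proportion has a negatively skewed complement.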

I agree that $C-X$ in the case of house prices would be a little contrived. But $1/X$ would not: it would be "amount of house you can buy per dollar." I suspect that in any reasonably homogeneous area this would have a strong negative skew. Such examples could teach the deeper lesson that skewness is a function of how we express the data.
– whuber♦ Mar 7 '14 at 20:13

@whuber It wouldn't be contrived at all. Maximum and minimum potential prices in a market arise naturally as those reflecting different evaluations by market participants. Among the buyers, there is conceivably one that would pay maximum price for a given house. And among the sellers there is one that would conceivably accept minimum price. But this information is not public and so actual observed transaction prices are affected by the existence of incomplete information. (CONT'D)
– Alecos Papadopoulos Mar 8 '14 at 14:04

For an introductory stats class I think this example works well pedagogically - it is something students are likely to have real-life experience of, can reason about intuitively, and can confirm against widely available data sets.
– Silverfish Mar 11 '14 at 0:17

Nick Cox accurately commented that "age at death is negatively skewed in developed countries" which I thought was a great example.

The most convenient figures I could lay my hands on came from the Australian Bureau of Statistics (in particular, I used this Excel sheet), since their age bins went up to 100-year-olds and the oldest Australian male was 111, so I felt comfortable cutting off the final bin at 110 years. Other national statistical agencies often seemed to stop at 95, which made the final bin uncomfortably wide. The resulting histogram shows a very clear negative skew, as well as some other interesting features, such as a small peak in death rate among young children, which would be well suited to class discussion and interpretation.

R code with raw data follows; the HistogramTools package proved very useful for plotting based on aggregated data! Thanks to this StackOverflow question for flagging it up.
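As a rough Python sketch of how skewness can be estimated from aggregated (binned) counts of this kind: the bin edges mirror the ten-year bins with a final cut-off at 110 described above, but the counts are invented for illustration only and are not the ABS figures. `binned_skewness` is a hypothetical helper that treats every observation as sitting at its bin midpoint.

```python
def binned_skewness(edges, counts):
    """Moment skewness of grouped data, treating each bin as its midpoint."""
    mids = [(lo + hi) / 2 for lo, hi in zip(edges[:-1], edges[1:])]
    total = sum(counts)
    mean = sum(m * c for m, c in zip(mids, counts)) / total
    m2 = sum(c * (m - mean) ** 2 for m, c in zip(mids, counts)) / total
    m3 = sum(c * (m - mean) ** 3 for m, c in zip(mids, counts)) / total
    return m3 / m2 ** 1.5

# INVENTED counts per age-at-death bin (NOT the ABS data): a small bump in
# early childhood, then a long left tail rising to a peak in old age.
edges = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110]
counts = [3, 1, 2, 3, 5, 10, 18, 28, 22, 7, 1]

age_skew = binned_skewness(edges, counts)
print(f"skewness from binned counts: {age_skew:.2f}")
```

The midpoint approximation is crude for wide bins, which is one reason a narrow final bin is preferable.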

Somewhat related to this post, I have heard that retirement ages have negative skewness: most people retire around the nominal age (say, 65 or 67 in many countries) but some (say, workers in coal mines) retire much earlier.
– Christoph Hanck Feb 25 at 5:29

In Stochastic Frontier Analysis, and specifically in its historically initial focus, production, the production function of a firm (or of a production unit in general) is specified stochastically as

$$q = f(\mathbf x) + u-w$$

where $q$ is the actual output produced by the firm, and $f(\mathbf x)$ is its production function (which is understood more as an input-output relation than as a mathematical expression reflecting "engineering" relations), with $\mathbf x$ being a vector of production inputs (capital, labor, energy, materials, etc.). The production function in Economic Theory represents maximum output, given technology and inputs, i.e. it embodies full efficiency. Then $u$ is a zero-mean normal disturbance on the production process, and $w$ is a non-negative random variable representing deviation from full efficiency due to reasons that the econometrician may not know, but can measure through this setup. This random variable is usually assumed to follow a half-normal or exponential distribution. Assuming the half normal (for a reason), we have

This is a skew-normal density, with location parameter $0$, scale parameter $s_2$ and skew parameter $(-\frac {\sigma_2}{\sigma_u})$, where $\phi$ and $\Phi$ are the standard normal pdf and cdf respectively. For $\sigma_u =1, \;\; \sigma_2 = 3$, the density looks like this:
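A minimal numerical sketch of this density, under stated assumptions: $u \sim N(0, \sigma_u^2)$, $w$ is half-normal with scale $\sigma_2$, and the composite error $\varepsilon = u - w$ has the standard normal/half-normal form $f(\varepsilon) = \frac{2}{s_2}\,\phi(\varepsilon/s_2)\,\Phi\!\left(-\frac{\sigma_2}{\sigma_u}\cdot\frac{\varepsilon}{s_2}\right)$ with $s_2^2 = \sigma_u^2 + \sigma_2^2$. The definition of $s_2$ is an assumption here, taken from the usual stochastic-frontier result rather than from the text. The code checks that the density integrates to about one and that simulated values of $u - w$ are negatively skewed.

```python
import math
import random

def skew_normal_pdf(e, sigma_u=1.0, sigma_2=3.0):
    """Density of e = u - w, u ~ N(0, sigma_u^2), w ~ |N(0, sigma_2^2)|."""
    s = math.sqrt(sigma_u ** 2 + sigma_2 ** 2)   # assumed scale s_2
    lam = -sigma_2 / sigma_u                     # skew parameter from the text
    z = e / s
    phi = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    big_phi = 0.5 * (1 + math.erf(lam * z / math.sqrt(2)))
    return 2 / s * phi * big_phi

def skewness(values):
    """Moment coefficient of skewness (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Simulate the composite error directly: symmetric noise minus inefficiency.
rng = random.Random(0)
draws = [rng.gauss(0, 1.0) - abs(rng.gauss(0, 3.0)) for _ in range(50_000)]
sample_skew = skewness(draws)

# Riemann-sum check that the density integrates to roughly 1 on [-30, 30].
step = 0.01
area = sum(skew_normal_pdf(-30 + i * step) * step for i in range(6000))

print(f"sample skewness: {sample_skew:.3f}, area under density: {area:.4f}")
```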

So negative skewness is, I'd say, the most natural modelling of the efforts of the human race itself: always deviating from its imagined ideal, in most cases lagging behind it (the negative part of the density), while in relatively fewer cases transcending its perceived limits (the positive part of the density). Students themselves can be modeled as such a production function. It is straightforward to map the symmetric disturbance and the one-sided error to aspects of real life. I cannot imagine how one could get more intuitive about it.

This answer seems to echo @Glen_b's suggestion of grad GPA. Highly motivated human behavior aimed at an elusive ideal certainly fits that scenario! Efficiency in general is a great example.
– Nick Stauner Mar 8 '14 at 10:03

@Nick Stauner The important point here is that we consider "actual minus target" signed, not the "distance" in absolute values. We keep the sign in order to know whether we are above or below the target. The intuition here is, exactly as you write, that "highly motivated" behavior will push "actual" closer to "target", creating asymmetry.
– Alecos Papadopoulos Mar 8 '14 at 11:12

@NickStauner Indeed, Silverfish's own post of long jump qualifying results also relates to 'highly motivated behavior' (considering limits of what humans can presently achieve as a kind of informal 'elusive ideal')
– Glen_b♦ Jan 28 at 2:42

Negative skewness is common in flood hydrology. Below is an example of a flood frequency curve (South Creek at Mulgoa Rd, lat -33.8783, lon 150.7683), which I've taken from 'Australian Rainfall and Runoff' (ARR), the guide to flood estimation developed by Engineers Australia.

There is a comment in ARR:

With negative skew, which is common with logarithmic values of floods
in Australia, the log Pearson III distribution has an upper bound.
This gives an upper limit to floods that can be drawn from the
distribution. In some cases this can cause problems in estimating
floods of low AEP, but often causes no problems in practice.
[Extracted from Australian Rainfall and Runoff - Volume 1, Book IV
Section 2.]

Often floods, at a particular location, are considered to have an upper bound called the 'Probable Maximum Flood' (PMF). There are standard ways of calculating a PMF.

+1 This example nicely shows how arbitrary the question actually is: when you measure floods in terms of peak discharge, they will be positively skewed, but measured in log discharge, they (apparently) are negatively skewed. Similarly, any positive variable can be re-expressed in a simple way that skews its distribution negatively (merely by taking a suitably negative Box-Cox parameter). It all comes down to what is meant by "easily grasped," I suppose--but that's a question about the students, not about statistics.
– whuber♦ Mar 11 '14 at 23:08
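The effect of re-expression can be sketched numerically. This is a rough illustration, not flood data: the values are simulated from a lognormal distribution (an arbitrary choice of a positive, right-skewed variable), and `box_cox` and `skewness` are helpers defined here.

```python
import math
import random

def skewness(values):
    """Moment coefficient of skewness (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def box_cox(x, lam):
    """Box-Cox transform of a positive value; lam = 0 is the natural log."""
    return math.log(x) if lam == 0 else (x ** lam - 1) / lam

# Simulated positive, right-skewed values (illustrative, not real discharges).
rng = random.Random(1)
flows = [rng.lognormvariate(0.0, 0.5) for _ in range(20_000)]

# lam = 1 leaves the shape unchanged, lam = 0 is the log, lam = -1 is 1 - 1/x.
skew_by_lambda = {
    lam: skewness([box_cox(x, lam) for x in flows]) for lam in (1, 0, -1)
}
for lam, value in skew_by_lambda.items():
    print(f"lambda = {lam:>2}: skewness = {value:.3f}")
```

For this simulated variable the raw values are positively skewed, the logs are roughly symmetric, and the transform with $\lambda = -1$ is negatively skewed.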

Here are the results for the forty athletes who successfully completed a legal jump in the qualifying round of the 2012 Olympic men's long jump, presented in a kernel density plot with rug plot underneath.

It seems to be much easier to be a metre behind the main group of competitors than to be a metre ahead, which would explain the negative skewness.

I suspect some of the bunching at the top end is due to the athletes targeting qualification (which required a top twelve finish or a result of 8.10 metres or above) rather than achieving the longest possible distance. The fact that the top two results were 8.11 metres, just above the automatic qualifying mark, is strongly suggestive, as is the way the medal-winning jumps in the Final were both longer and more spread out at 8.31, 8.16 and 8.12 metres. Results in the Final had a slight, non-significant, negative skew.

For comparison, results for the Olympic Heptathlon at Seoul 1988 are available in the heptathlon data set in the R package HSAUR. In that competition there was no qualifying round but each event contributed points towards the final classification; the female competitors showed pronounced negative skewness in the high jump results and somewhat negative skew in the long jump. Interestingly this was not replicated in the throwing events (shot and javelin) even though they are also events in which a higher number corresponds to a better result. The final points scores were also somewhat negatively skewed.

Asset price changes (returns) typically have negative skew: many small price increases with a few large price drops. The skew seems to hold for almost all types of assets: stock prices, commodity prices, etc. The negative skew can be observed in monthly price changes but is much more evident when you start looking at daily or hourly price changes. I think this would be a good example because you can show the effects of frequency on skew.

I like this example a lot! Is there an intuitive way of explaining it - essentially, "downside shocks are more likely (or at least, likely to be more severe) than upside shocks"?
– Silverfish Mar 7 '14 at 19:17

@Silverfish I would phrase it as: extreme negative market outcomes are more likely than extreme positive market outcomes. Markets also have asymmetric volatility. Market volatility generally increases more following negative returns than positive returns. This is often modeled with GARCH models, such as GJR-GARCH (see the ARCH Wikipedia entry).
– John Mar 7 '14 at 19:21

I also saw an explanation that bad news is released in bunches. I have not used GJR-GARCH. I attempted to use multifractal Brownian motion (Mandelbrot) to model asymmetry, but was unable to make it work.
– wcampbell Mar 7 '14 at 21:15

This is at best simplistic. For example, I just took a data set of daily returns on 31 equity indexes. More than half of them have positive skew (using Pearson's skewness) and over 70% are positive on the measure 3 * (mean - median) / stdev. For commodities you tend to see even more positive skew, as supply and demand shocks can both drive prices up rapidly (e.g. oil, gas and corn in recent years).
– Chris Taylor Mar 8 '14 at 13:01
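The two measures mentioned in this comment can be sketched as follows, assuming that "Pearson's skewness" refers to the standardised third moment. The returns are invented for illustration and are not real market data; both function names are defined here.

```python
import statistics

def moment_skewness(values):
    """Standardised third central moment (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def median_skewness(values):
    """3 * (mean - median) / stdev, the second measure quoted above."""
    centre_gap = statistics.mean(values) - statistics.median(values)
    return 3 * centre_gap / statistics.stdev(values)

# Hypothetical daily returns in percent: many small gains, one large drop.
returns = [0.4, 0.3, 0.5, 0.2, 0.6, 0.1, 0.4, 0.3, -0.2, -3.5]

third_moment = moment_skewness(returns)
mean_vs_median = median_skewness(returns)
print(f"moment skewness: {third_moment:.3f}, median-based: {mean_vs_median:.3f}")
```

On real data the two measures need not agree in sign, since the third moment is dominated by the tails while the median-based measure only compares two measures of centre.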

@ChrisTaylor Thanks, this was also a very useful comment. Since younger students are less likely to have an intuition for which effect predominates, and there is a degree of ambiguity depending on the sector, I think I will avoid using this example with introductory students. It would still make a nice discussion point for students taking stats as part of an economics or finance degree, I think.
– Silverfish Jan 28 at 10:17

In fisheries there are often examples of negative skew because of regulatory requirements. One instance is the length distribution of fish released in a recreational fishery. There is sometimes a minimum length that a fish must reach in order to be retained, so all fish under the limit are discarded. But because people fish where legal-length fish tend to be, the distribution tends to have negative skew, with a mode towards the upper legal limit. The legal length does not represent a hard cut-off, though: because of bag limits (limits on the number of fish that can be brought back to the dock), people will still discard legal-size fish when they have caught larger ones.

e.g., Sauls, B. 2012. A Summary of Data on the Size Distribution and Release Condition of Red Snapper Discards from Recreational Fishery Surveys in the Gulf of Mexico. SEDAR31-DW11. SEDAR, North Charleston, SC. 29 pp.

"Skew towards large sizes" would ordinarily be interpreted as positive skew, not "negative." Perhaps you could clarify this answer with an illustration of a typical distribution? The mechanisms you describe--a regulatory upper limit and some tendency to exceed it--could lead either to negative or positive skew, depending on the truncated distribution of the small-size fish (and depending on how the fish are measured: the skewness of their mass distribution would not be the same as the skewness of their length distribution).
– whuber♦ Jun 12 at 16:28

Some great suggestions have been made on this thread. On the theme of age-related mortality, machine failure rates are frequently a function of machine age and would fall into this class of distributions. In addition to the financial factors already noted, financial loss functions and distributions typically resemble these shapes, particularly in the case of extreme-valued losses, e.g., as found in BIS III (Bank for International Settlements) estimates of expected shortfall (ES), or in BIS II the value at risk (VaR) as inputs to regulatory requirements for capital reserve allocations.