Arvind Narayanan's journal

StyleCrave has a post on the supermodels who made the most money last year. Three thoughts flashed into my head when I saw it: 1. Hawt! 2. Actually, more anorexic than hot. Ugh. 3. Zipf's law! Of course, I had no choice but to fire up a spreadsheet and see for myself :-)

I did expect what you see in the graph, but it's still creepy how well it fits. Pretty much the only deviants from the predicted curve (rank * income = $30M) are #1, Gisele Bundchen, and #3, Kate Moss. According to the article, Bundchen has a separate footwear line outside of her modeling career, which netted her $6M -- the amount by which her income exceeded the prediction. As for Kate Moss, the article says she "did lose a few big deals last year."
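For anyone who wants to repeat the check without a spreadsheet, here is a minimal sketch of one way to estimate the exponent: fit a straight line to log(income) against log(rank) by least squares. The function name `zipf_exponent` is made up for illustration, and the numbers below are placeholders generated from the curve rank * income = $30M, not the figures from the article.

```python
import math

def zipf_exponent(incomes):
    """Estimate s in income ~ C / rank**s via a least-squares fit in log-log space."""
    ranked = sorted(incomes, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(value) for value in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    # The fitted slope is negative for a decaying curve; report its magnitude.
    return -sxy / sxx

# PLACEHOLDER data (in $M), generated from rank * income = 30.
# These are NOT the incomes reported in the article.
synthetic_incomes = [30.0 / rank for rank in range(1, 11)]
exponent = zipf_exponent(synthetic_incomes)
print(f"estimated exponent: {exponent:.3f}")
```

An exponent near 1 is what "rank * income = constant" amounts to; real data would scatter around the line rather than sit on it.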

Anyone care to come up with an explanation for why the power law (with exponent 1) applies here? It's not obvious -- after all, the list of richest people by net worth doesn't look anything like this, with Buffett, Slim and Gates all virtually tied at the top.

It _is_ obvious -- rich get richer. It doesn't apply so well to business wealth because far more variables and much more volatility are involved, but for supermodels, I guess the safest model (pun unintended) is to pay them proportional to their ranking. In other words, it's a fairly simple popularity contest. I doubt the distribution after rank 30 or so is anything Zipfian.

i agree with you qualitatively. it is certainly a popularity contest, and indeed, it would be absurd to suggest that bundchen is twice as beautiful as heidi klum who is at #2!

but quantitatively? note that income isn't proportional to ranking, but inversely proportional. i just don't see why it should be that, and not say, 1/sqrt(rank). the difference is huge and would easily show up in the chart.
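To see how far apart the two candidates are, here is a small sketch that tabulates both curves. It assumes a top value of $30M taken from the fitted curve (rank * income = $30M); the function name `predicted_income` and the choice of ranks are only for illustration.

```python
def predicted_income(rank, top=30.0, exponent=1.0):
    """Income in $M predicted by top / rank**exponent."""
    return top / rank ** exponent

# Compare the 1/rank curve with the 1/sqrt(rank) alternative at a few ranks.
print("rank  1/rank  1/sqrt(rank)")
for rank in (1, 2, 5, 10):
    inverse = predicted_income(rank)
    inverse_sqrt = predicted_income(rank, exponent=0.5)
    print(f"{rank:>4}  {inverse:>6.2f}  {inverse_sqrt:>12.2f}")
```

By rank 10 the two predictions differ by a factor of sqrt(10), roughly 3, which is the kind of gap that would be visible in the chart.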

the beauty of rich-get-richer explanations for power laws is that the rank doesn't even appear in the formulation of the hypothesis, and then there's a proof showing how the hypothesis translates into a certain distribution for the top-ranked items. for example, the standard argument for social networks goes like this: "assume that a newcomer to the network makes friends via a random walk on the existing nodes. we prove that the above random process produces the following degree distribution..."
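As a rough illustration of a hypothesis in which rank never appears, here is a sketch of plain preferential attachment: each newcomer links to one existing node chosen with probability proportional to that node's current degree. This is a simpler stand-in for the random-walk process described above, not that process itself, and the function name and parameters are invented for the example.

```python
import random

def preferential_attachment(n_nodes, seed=0):
    """Grow a tree where each new node attaches to an existing node
    with probability proportional to that node's degree."""
    rng = random.Random(seed)
    degree = [1, 1]        # start with two nodes joined by one edge
    endpoints = [0, 1]     # every node appears once per edge it touches
    for new_node in range(2, n_nodes):
        # A uniform pick from the endpoint list is a degree-weighted pick of a node.
        target = rng.choice(endpoints)
        degree.append(1)
        degree[target] += 1
        endpoints.extend((new_node, target))
    return degree

degrees = sorted(preferential_attachment(5000), reverse=True)
print("top five degrees:", degrees[:5])
print("median degree:", degrees[len(degrees) // 2])
```

The rule only talks about current degree, yet sorting the result by rank typically shows a few very well-connected nodes and a long tail of nodes with a single link.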

your explanation misses that magic by making the popularity hypothesis explicitly dependent on the rank. it is thus less elegant as well as harder to swallow. that doesn't make it wrong, but it does mean you need to find more evidence for your hypothesis. does that make sense?

zipf's original explanation in linguistics involved similar magic -- a word introduction/coalescence process, iirc. you would know that better :-) note that you can't just say, "every word/model gets used/paid in proportion to its/her popularity," because every distribution would be an eigenvector for that hypothesis!

so i do think this stuff is somewhat subtle and needs delicate argument.

You're right, I didn't pay enough attention to the wording -- of course they're not being paid in proportion to their rank. As far as quantitatively figuring it out, it shouldn't be too hard if you can get a sense of how the supermodeling industry works.