Posts categorized "Music"

My friend John R. sent me this excellent Buzzfeed feature on music playlists.

Here are some choice quotes to whet your appetite:

In 2014, when Tim Cook explained Apple’s stunning $3 billion purchase of Beats by repeatedly invoking its “very rare and hard to find” team of music experts, he was talking about these guys. And their efforts since, which have pointed toward curated playlists (specifically, an industrial-scale trove of 14,000 and counting) as the format of the future, have helped turn what was once a humble labor of love for music fans into an increasingly high-stakes contest between some of the richest companies in the world.

The algorithm that can judge the merits of new Gucci Mane, or intuit that you want to sing “A Thousand Miles” by Vanessa Carlton in the shower, has yet to be written... the job has fallen to an elite class of veteran music nerds — fewer than 100 working full-time at either Apple, Google, or Spotify — who are responsible for assembling, naming, and updating nearly every commute, dinner party, or TGIF playlist on your phone.

Spotify says 50% of its more than 100 million users globally are listening to its human-curated playlists (not counting those in the popular, algorithmically personalized “Discover Weekly”), which cumulatively generate more than a billion plays per week.

Machines have always been great at repetitive tasks that follow set rules. But many problems do not fall into that category.

We’ve come to expect that virtually all of our problems can be solved with code, so much so that we summon it unthinkingly before doing almost anything...But what if music is somehow different? What if there’s something immeasurable but essential in the space between what is now called “discovery” and, you know, that old stupidly human ritual of finding and falling in love with a song?

It's the revenge of the humans. Recommendation engines are not good enough. This doesn't mean "science" is not important. The article later explains:

Hypotheses, of course, are meant to be tested, and Spotify curators regularly make adjustments to playlists based on data that shows how people are actually interacting with them.

One frequently used application is a performance tracker called “PUMA,” or Playlist Usage Monitoring and Analysis, which breaks down each song on a playlist by things like number of plays, number of skips, and number of saves.

This is really the way forward for "machines". Machines and humans are both needed: the sum ought to be greater than its parts. Forget the idea that one replaces the other.

Some time ago, there was a lot of hype about how new tech would demolish the superstar effect in entertainment sales because all the little titles in the long tail would be exposed to consumers. I recall Amazon being labeled the shining example of a company that made profits off the long tail (as opposed to the boring top of the distribution). I still remember this graphic from Wired (link):

A reader Patrick S. pointed me to a study of music services that pronounces "the death of the long tail" (Warning: they want your email address in order to read the full report. The gist of the report was written up in this other blog.). Reading these pieces, one wonders whether this long-tail miracle even existed in the first place. The main thrust of the argument is that the new digital subscription/music services have not changed the allocation of spoils amongst artists. The little guys out in the long tail are still earning much less of a (shrinking) pie.

The long tail is an example of those intuitive, elegant scientific concepts that are much less impactful in the real world than claimed. Here is what I think caught some smart people on the wrong foot:

The distribution of profits has always been much more extreme than ballpark graphics (like the Wired chart above) show. The new study, for example, suggests that the top 1 percent earned 77 percent of all the money. This is much more extreme than the 80/20 rule. From the graphical perspective, you can think of the distribution as one very tall spike and a very flat, very long tail.

The cumulative weight of the very flat, very long tail is still not that heavy compared to the one spike. Even if you manage to increase the size of the tail by 10 percent, it still amounts to a small number.

The above assumes you can increase the size of the tail. But it is quite hard to do. One reason is that the tail consists of millions of little pieces, which don't necessarily move in sync.

The second, and more important reason, is that titles or artists don't randomly end up in the tail. If a title is in the tail, it's an indicator that the artist or title is not appealing to the mass audience.

We fell prey to the romantic notion that there are some unjustly neglected artists, and rejoiced in the idea that the long-tail effect may allow a few of these to reverse their fortunes. But a few outliers do not change the overall distribution.
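The arithmetic behind the "tall spike, flat tail" argument can be sketched in a few lines. This is only an illustration: the 77 percent figure is the one quoted from the study, the 10 percent tail growth is the hypothetical from the text, and treating everyone outside the top 1 percent as "the tail" is a simplifying assumption. The function name is made up for this example.

```python
def tail_after_growth(top_share, tail_growth):
    """Return (extra revenue as a fraction of the original total,
    the tail's share of the new total) after the tail grows.

    Assumes everything outside the top group counts as the tail and
    that the top group's revenue stays unchanged.
    """
    tail_share = 1 - top_share
    extra = tail_share * tail_growth
    new_tail_share = (tail_share + extra) / (1 + extra)
    return extra, new_tail_share


# 0.77 is the study's figure; 0.10 is the hypothetical lift in the tail.
extra, new_tail_share = tail_after_growth(top_share=0.77, tail_growth=0.10)
print(f"extra revenue vs. original total: {extra:.1%}")
print(f"tail share after growth: {new_tail_share:.1%}")
```

With these inputs, a 10 percent lift across the entire tail adds only about 2.3 percent to total revenue, and the tail still holds less than a quarter of the money.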

***

The report's authors also make this observation:

Ultimately it is the relatively niche group of engaged music aficionados that have most interest in discovering as diverse a range of music as possible. Most mainstream consumers want leading by the hand to the very top slither of music catalogue. This is why radio has held its own for so long and why curated and programmed music services are so important for engaging the masses with digital.

While I believe this story, I should note that there is no quantitative evidence provided (at least not in the summary). If this is true, it has important implications for anyone in the business of "personalizing" marketing to consumers.

When I think about Facebook's latest "innovation", about who might gain from this forced sharing of personal choices, I think it may be that they want to appease academic researchers.

Here is CNN's description of how this so-called "real-time app" behaves:

because I've logged in to Spotify with my Facebook identity, every song I listen to is automatically shared to Facebook.

Spotify is the recently launched online music service. A number of other big-brand Internet companies are apparently on board, such as Netflix, Hulu, and Yahoo! News. I agree with Farhad Manjoo at Slate who said:

it's somehow eluded Zuckerberg that sharing is fundamentally about choosing. You experience a huge number of things every day, but you choose to tell your friends about only a fraction of them, because most of what you do isn't worth mentioning.

This opinion captures a law of data which hasn't been explored enough: that as quantity goes up, quality of data goes down. Before this change, users voluntarily disclosed their preferences via Like buttons or similar setups; this means that users performed a valuable act of filtering the data before exposing them (free public service!). With this new application, there is surely more data but I doubt there is more useful data.

Perhaps this is a way to force the non-sharers to share their personal choices. In the past, only those who liked to share their likes and dislikes did so, but now many users will find their choices exposed regardless of their wishes. This expansion of the base of data collection is useful for researchers and analysts; however, Facebook has the option of collecting the data without exposing them to people's networks. Facebook simply needs to make deals with Spotify, Netflix, Hulu, etc. to share data in the background. In fact, by encouraging users to use the same logon account, they probably have this behind-the-scenes sharing in place already.

Previously, academic researchers did not have access to such data, as businesses either did not collect them or would not make them available publicly for competitive reasons. So I think at least one group can benefit from this rather alarming situation.

Felix Salmon laments that the standard of statistical literacy among journalists is appalling. He brings us this example about people who download music illegally:

About 95 percent of music downloads in 2010 were unlicensed and illegal, with no money flowing back to artists, songwriters or record producers, according to Alex Jacob, a spokesman for the International Federation of the Phonographic Industry. So riches could await a company that persuades some of these Internet scofflaws to change their ways.

He argues, rightly, that the source (an organization fighting music piracy) makes the statistic hard to take seriously. He also feels that the number fails the "sniff test". Could it really be true that only 5% of music is paid for? If the music market were, say, $10 billion today, then they are suggesting that the real market could be as high as 20 times that number, or $200 billion.

I left a comment on his blog to point out the craziness of the last sentence in the quote above, the delusion that if Bittorrent and other illegal download methods suddenly vanished, the online music revenues would jump 5-fold, 10-fold, etc. overnight. This is the same delusion that makes politicians/economists claim that we can solve our unemployment problem by giving our workforce more college degrees (in effect, shifting people from one bucket to a different bucket). I debunked that claim here.
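The arithmetic behind the $200 billion figure, and why it overstates the opportunity, can be sketched as follows. The $10 billion market size is the hypothetical used above, and the conversion rates (the fraction of unlicensed downloads that would become purchases if piracy vanished) are made-up values for illustration, not estimates. The function names are also invented for this example.

```python
def implied_total_market(paid_revenue, paid_share):
    """Market size if every download, paid or not, were valued at the paid price."""
    return paid_revenue / paid_share


def revenue_after_conversion(paid_revenue, paid_share, conversion_rate):
    """Revenue if only a fraction of unlicensed downloads turned into purchases."""
    unpaid_value = implied_total_market(paid_revenue, paid_share) - paid_revenue
    return paid_revenue + conversion_rate * unpaid_value


paid = 10e9    # hypothetical $10 billion paid market
share = 0.05   # 5 percent of downloads are paid, per the quoted claim

# Hypothetical conversion rates: all, one in ten, one in a hundred.
for rate in (1.0, 0.10, 0.01):
    total = revenue_after_conversion(paid, share, rate)
    print(f"conversion {rate:.0%}: ${total / 1e9:.1f} billion")
```

Only the 100 percent conversion case reproduces the $200 billion figure. If one in ten unlicensed downloads became a purchase, the same arithmetic gives about $29 billion.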

***

The other point Felix made is well worth repeating: 95 percent of downloads is not the same as 95 percent of people doing downloads, because a small number of people account for an outsized proportion of total downloads. Although Felix didn't state this directly (he assumed it in an example), it is most likely true that illegal downloaders download, on average, many more songs than legal downloaders, for price is no barrier to the former group.

What this means is that if 95 percent of downloads were illegal, then the proportion of people who are illegal downloaders is likely to be considerably lower than 95 percent.
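A small calculation makes the gap between "share of downloads" and "share of people" concrete. It uses a simple two-group split of downloaders, and the intensity ratios (how many more songs the average illegal downloader grabs compared with the average legal one) are hypothetical; neither Felix nor the quoted source supplies them.

```python
def share_of_people_illegal(illegal_download_share, intensity_ratio):
    """Fraction of downloaders who are illegal downloaders.

    illegal_download_share: fraction of all downloads that are illegal.
    intensity_ratio: average downloads per illegal downloader divided by
        average downloads per legal downloader (a hypothetical input).
    """
    s = illegal_download_share
    return s / (s + (1 - s) * intensity_ratio)


# Hypothetical ratios: equal intensity, 10x, and 19x.
for ratio in (1, 10, 19):
    people = share_of_people_illegal(0.95, ratio)
    print(f"intensity ratio {ratio}: {people:.1%} of downloaders")
```

When both groups download at the same rate, 95 percent of downloads does mean 95 percent of people. If illegal downloaders average 19 times as many songs, the same 95 percent of downloads comes from only half of the people.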

One final point: it is also foolhardy to bluntly divide the world into Illegals and Legals. In statistics, we like to think there is a continuum with most people having done at least one illegal download, while perhaps most so-called Illegals have paid at least once for music downloads. So, if we want to do a proper analysis of this phenomenon, we should put a probability of downloading illegally on each individual, rather than assuming that each person is either an Illegal or a Legal, and not both.

***

Yes, this makes data analysis sound complicated. It's easy to fall into the many traps. But there is only a small number of fundamental concepts, and once you understand those, you'll find them popping up everywhere.

The iPhone version of the music app Pandora sent information to eight trackers. It sent location data to seven of these, a unique phone ID to three, and demographic data to two.

A few things to note when reading this piece: first, the suggestion that a "written privacy notice" is a solution is laughable; second, there is a difference between a company analyzing its own customers to provide better service and a company selling our data to other companies for profit; third, "free apps" are not "free", and before you chastise app developers, think how advertising is the only source of revenue for them, and what that entails.

There is a constant refrain in the article: developers claim that they "do not" tie the phone's unique, undeletable ID to a person, or that they "are not" doing so. Nobody said they "cannot". That's because they can. I don't know about other platforms, but on the iPhone you must sign in to get any app from the App Store, whether or not the app is free. Your App Store account is tied to your credit card, so your phone ID can definitely be tied to your credit card, and sorry to say, the credit card leads very far indeed. Neither Apple nor Google is a neutral player in this game, because both run large advertising businesses.

Thanks WSJ for running such a wonderful series. When are they giving out Pulitzers?