Can You Teach AI to Dance?

In the same way a dog wags its tail or a flower explodes in full bloom, we express who we are through dance. But what qualities in music make us want to get up and move? It’s an instinct that feels profoundly personal and distinctly human. Besides, what motivates one person to put their hands up isn’t necessarily going to have the same effect on another.

This makes quantifying “danceability,” or the likelihood that a song will urge us onto the dance floor, seem like an impossible challenge. So when Spotify developers decided to construct an algorithm (a set of predefined steps) to decipher which songs are the best candidates for a good jam, they really had their work cut out for them.

Artificial Intelligence*, or AI, is any system that mimics human intelligence by recognizing past patterns in human behavior and making decisions that follow these patterns. The goal is to have an algorithm that makes decisions as a human would, an especially complicated task when applied to something as intimate, dynamic, diverse and culturally specific as taste in music.

The specifics of the AI algorithms powering Spotify’s danceability rating remain shrouded in some mystery, shielded from view by corporate non-disclosure agreements. (Spotify employees declined our interview requests.) However, through a series of blog posts from former interns and employees, it’s known that back in 2014, Spotify announced its acquisition of a small Somerville, Massachusetts-based startup called The Echo Nest. This startup was one of the first to make use of physical audio attributes like the tempo (measured in beats per minute, or BPM) and timbre of a music file to predict certain characteristics of a song, such as “danceability.” It’s likely that this was part of the foundation for Spotify’s current danceability score, which, based on elements including “rhythm stability, beat strength, and overall regularity,” rates tracks on a scale of least to most danceable.

But how reliable is this algorithm? Can a computer really determine something as fluidly defined as danceability? The developers at YR weren’t so sure, so they designed a tool to help you compare your own danceability ratings to the AI-powered scores from Spotify. Our awesome six-song playlist is courtesy of the up-and-coming producer Edel, from YR’s music team. Rate all six, and you’ll find out how your tastes line up with the algorithm, or don’t.
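For the curious, here is one simple way such a comparison could be scored. This is only a sketch, not the code behind the YR tool: the function name, the six ratings and the 0-to-1 scale are all assumptions made up for illustration, and none of the numbers are real Spotify scores.

```python
def mean_absolute_gap(your_ratings, algorithm_scores):
    """Average distance between your ratings and the algorithm's, song by song."""
    if len(your_ratings) != len(algorithm_scores):
        raise ValueError("need one rating per song from each side")
    if not your_ratings:
        raise ValueError("need at least one song")
    gaps = [abs(mine - theirs) for mine, theirs in zip(your_ratings, algorithm_scores)]
    return sum(gaps) / len(gaps)


# Made-up numbers for six songs, all on a 0-to-1 scale (assumed, not real data).
your_ratings = [0.9, 0.2, 0.6, 0.8, 0.4, 0.7]
algorithm_scores = [0.7, 0.5, 0.6, 0.9, 0.1, 0.8]

gap = mean_absolute_gap(your_ratings, algorithm_scores)
print(f"On average, you and the algorithm differ by {gap:.2f}")
```

A gap near 0 would mean you and the algorithm mostly agree. The bigger the gap, the more your taste goes its own way.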

What’s Danceable? Your Turn!

*But wait, how does AI actually work?

Here’s the deal. AI is any method that automates a decision-making process in order to improve efficiency and result in more data-driven outcomes. Machine learning is a particular kind of AI that happens when computers are trained to “learn” to make certain decisions by observing many past examples of that decision being made either well or poorly. Say, for example, we want an algorithm to be able to decide whether or not an image is of a dog. It will first need to get educated on what a dog looks like by being shown many labeled examples of dog and non-dog photos. After seeing enough examples, it begins to set up some fixed rules about which features of an image are good clues that it is looking at a dog. For instance, it may notice that a floppy ear, a fur-like texture or a brown color in the image usually means it is looking at a dog.
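To make that training step concrete, here is a deliberately tiny sketch. It assumes each photo has already been boiled down to a handful of named features, which real image systems do not get for free, and it uses a simple counting rule chosen for illustration rather than any method a real product is known to use. The function names and the six example photos are made up.

```python
from collections import Counter


def train(examples):
    """Learn one weight per feature from labeled examples.

    examples: list of (set_of_features, label) pairs, label is "dog" or "not dog".
    A feature's weight is how much more often it appears in dog photos than in
    non-dog photos, so a positive weight is a clue that points toward "dog".
    """
    dog_counts, other_counts = Counter(), Counter()
    n_dogs = n_others = 0
    for features, label in examples:
        if label == "dog":
            n_dogs += 1
            dog_counts.update(features)
        else:
            n_others += 1
            other_counts.update(features)
    if n_dogs == 0 or n_others == 0:
        raise ValueError("need at least one dog and one non-dog example")
    all_features = set(dog_counts) | set(other_counts)
    return {f: dog_counts[f] / n_dogs - other_counts[f] / n_others for f in all_features}


def looks_like_dog(weights, features):
    """Add up the learned clues; features never seen in training count as 0."""
    return sum(weights.get(f, 0.0) for f in features) > 0


# Six made-up labeled "photos", described by hand-picked features.
training_photos = [
    ({"floppy ear", "fur", "brown"}, "dog"),
    ({"fur", "brown"}, "dog"),
    ({"floppy ear", "fur"}, "dog"),
    ({"fur", "whiskers"}, "not dog"),
    ({"wheels", "brown"}, "not dog"),
    ({"petals"}, "not dog"),
]

weights = train(training_photos)
print(looks_like_dog(weights, {"floppy ear", "fur"}))  # a photo it has never seen
print(looks_like_dog(weights, {"petals", "green"}))
```

The learned weights play the part of the "fixed rules" described above: once training ends, they are just numbers the algorithm looks up.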

Once the algorithm graduates from training, it is deployed in the real world, where it needs to make decisions on images it’s never seen before, using the assumptions it learned in training to classify new, unfamiliar images. Note that if it learned that all dogs are brown and is presented with only images of black Labradors after training, the algorithm won’t be able to adapt its assumptions to that change. This means the assumptions about how features influence its labels, or outcomes, are fixed and do not change over time. That is, unless we decide to re-train the algorithm, sending it back to learn from new examples with the desired label and feature pairing. The examples we use to train AI algorithms are thus incredibly important and really influence the decisions the algorithm will make in the real world.
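The brown-dog problem can be shown in a few lines. This is an oversimplified sketch under a big assumption: the "model" here only remembers which coat colors it saw on dogs, which is far cruder than any real image classifier. The function names and training examples are invented for illustration.

```python
def train(examples):
    """Return the coat colors seen on photos labeled "dog".

    A frozenset cannot be modified, mirroring the idea that the rules
    stay fixed once training is over.
    """
    return frozenset(color for color, label in examples if label == "dog")


def predict(dog_colors, color):
    """Call it a dog only if that color showed up on a dog during training."""
    return "dog" if color in dog_colors else "not dog"


# Training data where every dog happens to be brown (made-up examples).
training = [("brown", "dog"), ("brown", "dog"), ("green", "not dog")]
model = train(training)

# Deployed: a black Labrador arrives, and the fixed rules cannot adapt.
print(predict(model, "black"))

# Re-training with a new labeled example is the only way to update the rules.
retrained_model = train(training + [("black", "dog")])
print(predict(retrained_model, "black"))
```

Nothing about the first model changes when it meets the black Labrador. Only the second round of training, with a new labeled example, produces a model that gets it right.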

That is why, even for a playful concept like danceability, it’s good to pay attention to how any piece of AI is trained. In the case of Spotify, The Echo Nest technology was initially developed at least in part based on ratings provided by a “passionate group of musicians and music lovers,” some associated with Berklee College of Music, the elite conservatory in Boston, Massachusetts. Their passions and opinions may have played an early role in establishing the standards used to train a platform that covers tens of millions of songs.

“We built this project because we wanted to question the opinion of Spotify’s algorithm,” said 16-year-old Mila Sutphin, one of the developers who created the YR interactive. “Spotify has immense influence on people’s music taste, and if Spotify is telling people what is danceable, it starts to limit the range of music taste.”

YR producer Kuya Rodriguez waxed philosophical on this point. “Music to me is a conversation,” he said. “Whether it be with vocals, with instruments, with sounds [or] atmosphere. Dancing is a response to that conversation, and I believe one of the purest forms of expression.”

It’s difficult to imagine an algorithm that could truly understand that all-too-human expression. And that, perhaps, is a good thing.