Riding the wind, I go farther than the clouds and higher than the mountains.

The World of Music: SDP layout of high dimensional data

The World of Music, by researchers at Stanford, MIT and Yahoo!, aims to render the music space in an unprecedented way. The visualization shows 9,276 artists and how they relate to each other. The artist relation data is mined from user ratings of artists in the Yahoo! Music service. To lay out and cluster the data, the researchers used semidefinite programming (SDP), in an approach sometimes called semidefinite embedding. Semidefinite embedding is a method for mapping high-dimensional data into a lower-dimensional Euclidean vector space.
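For readers who have not met semidefinite embedding before, here is its textbook formulation (also known as maximum variance unfolding). This is only a sketch of the general method. Whether the World of Music layout solves exactly this program is an assumption, since the post does not give the researchers' formulation. The symbols are illustrative: x_i are the high-dimensional input points, K is the Gram matrix of the unknown low-dimensional points, and "neighbor pairs" are points whose distances should be preserved.

```latex
\begin{aligned}
\max_{K}\quad & \operatorname{tr}(K) \\
\text{s.t.}\quad & K \succeq 0, \\
& \sum_{i,j} K_{ij} = 0, \\
& K_{ii} - 2K_{ij} + K_{jj} = \lVert x_i - x_j \rVert^2
  \quad \text{for all neighbor pairs } (i, j).
\end{aligned}
```

The low-dimensional coordinates are then read off from the top eigenvectors of K, each scaled by the square root of its eigenvalue. For a 2D map like this one, only the two largest eigenvalues would be kept.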

The dataset consists of all the ratings made by users on the Yahoo! Music service during a 30-day period. The full dataset contains 250 million ratings on 100,000 artists from 4 million users. The ratings are on a scale from 1 (dislike) to 100 (like). The researchers describe their pre-processing as follows: “We pre-processed the data by eliminating all ratings below 75 and considered only users and artists with at least 100 ratings. After these modifications, the new dataset contains 9,276 artists and 150,000 users with 2.5 million ratings.”
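The pre-processing described in the quote could be sketched roughly as below. This is not the researchers' code. The `(user, artist, rating)` tuple layout and the `preprocess` function name are hypothetical. The quote also does not say whether the 100-rating threshold is counted before or after low ratings are dropped, or whether the filter is repeated, so this sketch assumes a single pass with counts taken after the low ratings are removed.

```python
from collections import Counter

MIN_RATING = 75   # from the quote: ratings below 75 are eliminated
MIN_COUNT = 100   # from the quote: users and artists need at least 100 ratings


def preprocess(ratings, min_rating=MIN_RATING, min_count=MIN_COUNT):
    """Filter (user, artist, rating) tuples in a single pass.

    Assumption: counts are taken after dropping low ratings, and the
    filter is applied once rather than repeated until stable.
    """
    # Step 1: drop every rating below the threshold.
    kept = [r for r in ratings if r[2] >= min_rating]

    # Step 2: count the remaining ratings per user and per artist.
    user_counts = Counter(user for user, _, _ in kept)
    artist_counts = Counter(artist for _, artist, _ in kept)

    # Step 3: keep only ratings whose user and artist both have enough ratings.
    return [
        (user, artist, score)
        for user, artist, score in kept
        if user_counts[user] >= min_count and artist_counts[artist] >= min_count
    ]


# Toy example with a threshold of 2 ratings instead of 100.
sample = [
    ("u1", "a1", 90), ("u1", "a2", 80), ("u2", "a1", 95),
    ("u2", "a2", 60), ("u3", "a1", 100),
]
print(preprocess(sample, min_rating=75, min_count=2))
```

In the toy example only the `("u1", "a1", 90)` rating survives: `u1` is the only user with two remaining ratings, and `a1` is the only artist with at least two.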

About me

I am a researcher in the field of computational social science, using online data to investigate social issues.

I dump and parse Wikipedia datasets in different languages and apply data mining and data visualization methods as part of the collective research project ‘Mapping a World in Wikipedia’, in an interdisciplinary laboratory at EPFL.

Aside from research work, I am a street photographer and swing dancer.