Making Wikipedia Knowledge Visible

November 23, 2016 — John Moore, Marketing and Technical Content Team Lead

Over the past few months, Wolfram Community members have been exploring ways of visualizing the known universe of Wikipedia knowledge. From Bob Dylan’s networks to the persistence of “philosophy” as a category, Wolfram Community has been asking: “What does knowledge actually look like in the digital age?”

Mathematician Marco Thiel explored this question by modeling the “Getting to Philosophy” phenomenon on Wikipedia. “If you start at a random Wikipedia page, click on the first link in the main body of the article and then iterate, you will (with a probability of over 95%) end up at the Wikipedia article on philosophy,” Thiel explains. Using WikipediaData, he demonstrates how you can generate networks that describe this phenomenon.

He documents that about 94% of all Wikipedia articles lead to the “Philosophy” page when one follows the links as instructed, and in the process generates some mesmerizing and elegant visualizations of the way that we categorize information.
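The first-link walk Thiel describes can be sketched in a few lines. This is not his Wolfram Language code, which uses WikipediaData against live articles. It is a minimal Python illustration on a small invented table, where `FIRST_LINK`, `follow_first_links`, `reaches_target`, and `fraction_reaching` are all hypothetical names, and the link data is made up rather than taken from Wikipedia.

```python
from typing import Dict, List, Optional

# Invented toy data (an assumption, not real Wikipedia links):
# each article maps to the first link in its main body, or None if it has none.
FIRST_LINK: Dict[str, Optional[str]] = {
    "Guitar": "Music",
    "Music": "Art",
    "Art": "Knowledge",
    "Knowledge": "Philosophy",
    "Loop A": "Loop B",
    "Loop B": "Loop A",
    "Dead end": None,
}


def follow_first_links(
    start: str,
    first_link: Dict[str, Optional[str]],
    target: str = "Philosophy",
    max_steps: int = 100,
) -> List[str]:
    """Follow first links from `start` until the target, a dead end, or a loop."""
    path = [start]
    seen = {start}
    current = start
    while current != target and len(path) <= max_steps:
        nxt = first_link.get(current)
        if nxt is None or nxt in seen:
            break  # no outgoing link, or we are about to revisit an article
        path.append(nxt)
        seen.add(nxt)
        current = nxt
    return path


def reaches_target(
    start: str, first_link: Dict[str, Optional[str]], target: str = "Philosophy"
) -> bool:
    """True if the first-link walk from `start` ends at the target article."""
    return follow_first_links(start, first_link, target)[-1] == target


def fraction_reaching(
    first_link: Dict[str, Optional[str]], target: str = "Philosophy"
) -> float:
    """Share of starting articles whose walk ends at the target."""
    starts = [article for article in first_link if article != target]
    hits = sum(reaches_target(article, first_link, target) for article in starts)
    return hits / len(starts)


if __name__ == "__main__":
    print(follow_first_links("Guitar", FIRST_LINK))
    print(f"{fraction_reaching(FIRST_LINK):.0%} of toy articles reach Philosophy")
```

On real data, the table would be replaced by a lookup of each article's first body link, and the fraction would be estimated from a random sample of starting pages rather than from every key in the table.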

University student Andres Aramburo also touched on the theme of Wikipedia categories by developing a method for clustering Wikipedia articles by topic. He began by taking a random sample of Wikipedia articles using a Wolfram Language function that he created for this specific task. He then used the links in and out of these articles to generate a graph of the relationships between them. “It’s not a trivial task” to determine if two articles are related to one another, he notes, since “there are several things that can affect the meaning of a sentence, semantics, synonyms, etc.” His visualizations include radial plots of the relationships between articles and word clouds listing shared words for related articles.
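To make the idea of grouping articles by their links concrete, here is a rough Python sketch. It is not Aramburo's method, which the post does not spell out. It swaps in a much simpler technique: treat every link between two sampled articles as an undirected edge and report the connected components as clusters. The `LINKS` table and the function names are hypothetical and the data is invented.

```python
from typing import Dict, List, Set

# Invented toy sample (an assumption, not real Wikipedia data):
# each article maps to the articles it links out to.
LINKS: Dict[str, List[str]] = {
    "Guitar": ["Music", "Bob Dylan"],
    "Bob Dylan": ["Music"],
    "Music": [],
    "Graph theory": ["Mathematics"],
    "Mathematics": [],
    "Topology": ["Mathematics"],
    "Volcano": [],
}


def build_undirected_graph(links: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Connect two sampled articles whenever either one links to the other."""
    graph: Dict[str, Set[str]] = {article: set() for article in links}
    for source, targets in links.items():
        for target in targets:
            # Ignore self-links and links that leave the sample.
            if target in links and target != source:
                graph[source].add(target)
                graph[target].add(source)
    return graph


def connected_components(graph: Dict[str, Set[str]]) -> List[List[str]]:
    """Group articles into clusters using a depth-first search."""
    seen: Set[str] = set()
    clusters: List[List[str]] = []
    for start in graph:
        if start in seen:
            continue
        stack = [start]
        seen.add(start)
        component: List[str] = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(sorted(component))
    return clusters


if __name__ == "__main__":
    for cluster in connected_components(build_undirected_graph(LINKS)):
        print(cluster)
```

Connected components are the bluntest possible clustering: a single stray link merges two topics. A real analysis would likely weight edges, for example by the number of shared links or shared words, before grouping.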

One final thread worth highlighting is Community’s celebration of the decision to award Bob Dylan the Nobel Prize in Literature. Wolfram’s own Vitaliy Kaurov created the visualization of the “Universe of Bob Dylan” featured at the top of this post. Alan Joyce (Wolfram|Alpha) generated a graph that plots the lengths of Dylan’s songs (in seconds) against the years in which they were recorded.

And first-time Wolfram Community participant Amy Friedman uploaded her submission from the 2016 Wolfram One-Liner Competition, an amusing word cloud of the poet’s songs in the shape of a guitar.

What new ways of visualizing Wikipedia knowledge can you dream up? With built-in functions like WikipediaData and WikipediaSearch, the Wolfram Language is the perfect tool for exploring Wikipedia data. Show us what you can do with those functions and more on Wolfram Community. We can’t wait to see what you create!