Talking with... Salar al Khafaji

In the past couple of years, we’ve seen the number of free data visualization tools increase almost exponentially, opening up a whole new market niche outside what was once the exclusive domain of BI applications and software. But, as happens in most markets, only a handful of players stand out from the growing crowd of DIY infographic services.

Overall, this proliferation of free services is a good thing. More free tools for visualization, data mining, scraping and publishing mean more opportunities to fight data and visual illiteracy, for example. Also, “the great thing is that the wealth of free tools has changed the relationship between readers and journalists forever. The data belongs to everyone now and everyone has the power to visualize it,” as Simon Rogers, the Data Editor of Twitter, told us in an interview back in 2013. As for survival in such an intense corporate environment, that’s not that different from any other segment of activity or industry. The data visualization tools ecosystem will continue to evolve, as companies try to cope with all the challenges that most startups face. Many will fall, some will survive.

One of those “players” that managed to stay up to date with a more demanding (visually educated?) audience of journalists, educators and online publishers is Silk.co. The company, created in 2011 as “silkapp.com”, is based in Amsterdam and San Francisco, and is funded by New Enterprise Associates, Atomico Ventures and several other angel investors. We recently highlighted a couple of announcements by Silk.co, and had the opportunity to ask Salar al Khafaji, the company’s CEO and co-founder, a few questions about the platform’s new features and the state of visualization, big data and storytelling.

Visualoop (VL) – Salar, you once described Silk as the “equivalent of MongoDB, or in other words, information that may already be in some database-style format, combined with data that is not.” Can you elaborate, please?

Salar al Khafaji (SK) – It’s mostly an explanation for techies: MongoDB is a very flexible way to mash up all different kinds of data. It’s probably the easiest database to get up and running. Silk is MongoDB for non-techies – the fastest and easiest way to build an online database from a spreadsheet. It takes only a minute to go from a CSV to a Silk.

Unlike MongoDB, Silk has a full-fledged and really beautiful front-end with data visualizations, automated mapping, filters, and a lot more. There is no need to think about UI, UX or coding up pages because we’ve built that for you. You can also take different sets of data and mash them up in a Silk. This is somewhat Mongo-like, although it doesn’t require the user to think about schema and field mapping. It’s all very intuitive. The technology is transparent and GUI driven.

VL – And can you tell us how it all began for you guys?

SK – Silk started out of my interest in structured, machine-readable data and the lack of it on the Web.

The biggest frustration behind this was probably Wikipedia. It’s an amazing resource where many people have contributed a great amount of knowledge, but all the structure has been lost due to the underlying platform. So even though someone spent time to add “Population: 200 million” to the Brazil page, this meaning was lost because of the way wikis work – so you can’t filter on pages with a specific population and view them on a map, for instance.

Of course, the first Silk that we created actually contained this type of data, and in Silk you can easily do what I just described.

VL – In that same article, you revealed some discomfort around the term ‘big data’, or at least with the way the term has been applied to a lot of situations. One year later, have you seen any changes in the way people ‘throw around’ the term ‘big data’?

SK – If anything, it’s thrown around even more often. That’s not a terrible thing. But I believe “Big Data” is both overused and needlessly intimidating. I mean, scientists were solving “Big Data” problems 30 years ago. So for me it’s mostly marketing. I believe that data is for everyone. The tools to manage and visualize data at all scales are becoming more and more accessible. Look at the landscape of DIY visualization and infographics tools. There are dozens now, most of them targeting consumers. Where previously you had a handful of analytics and BI software like Tableau, there is an explosion of companies providing products to analyze data.

Silk is about publishing data, visualizing data, sharing data, analyzing data and creating a social data community around our platform. Big or small is irrelevant. Far more important than whether data is big or small is whether the data is accessible and easy to understand or query. That’s where Silk is really focused and it’s what makes Silk a platform that appeals equally to techies and people who have barely used a spreadsheet.

VL – Another ‘buzzword’ that has been all over the place more recently is “Storytelling” – which seems to be closer to what Silk aims to be, instead of a big data company, right?

SK – We do use that term inside Silk quite a bit. Data can tell a story, often many stories. Querying data in a Silk can help you understand what the story actually is. This is different than just an infographic or other data visualization tools. Silk doesn’t tell you what the story is. Silk lets you find your own story by querying data. Yes, a Silk creator can suggest a story and often that may be the most interesting one.

But Silk is designed to let anyone ask questions with data and their answers might tell a completely different story. Here’s an example. We published data on which public schools in the state of California have very low vaccination rates. A Silk user can look at the map and see the whole state and get some analysis statewide. They can also choose to build a map of just their own county or their own school district. Or look at a scatter plot showing the percentage of poor households in a school district or county versus the percentage of families that have had their children vaccinated. So a Silk visualization can be relevant at a national, at a state, at a county, or at a school level, depending on how you display the data.

Another thing that makes Silk a unique data storytelling platform is that it’s not just about the data – you can add videos, Twitter streams, images, text blocks, audio – all on the same page. So the visualizations are just one element of the data communication and the story.

VL – In addition to these two all-present terms, and based on what you have seen in San Francisco and Europe, what other trends should we be paying attention to, in terms of data communication?

SK – Here are a few that we are watching. First, we are seeing a trend in journalism that everyone who is working at a publication feels the need to learn data communication and data journalism. They want to be able to take data and use it to support their story with a visualization. And they want to do it themselves in 30 minutes or less. This is great because we will have many more literate data journalists.

In a similar vein, we’re seeing how data communication is becoming thoroughly democratized. Just as Blogger made it easy to be a publisher and YouTube made it easy to be a producer, data publishing and data communication are becoming something that a wider and wider number of people participate in. This is wonderful. The world is a better place when more of us understand how to use and communicate with data.

Another observation – to date, most data communication has tended to have a linear flow. I think that flow is artificial and often constrains the way we might want to ask questions. I’m hopeful that the newer data communication and visualization platforms are a bit less structured and encourage people to ask questions rather than just display information.

VL – Now, overall, millions of “Silks” have been published since you guys launched, and in your website’s gallery there are several high-quality examples of data narratives and analysis built with your platform. Is there any particular one that you find especially interesting – one you’d recommend to someone who has never tried or heard of your platform?

SK – One of our very favorite users is Human Rights Watch. This is a wonderful non-governmental advocacy group that tries to protect the rights of humans anywhere on Earth. They were struggling to track the votes of the UN Human Rights Council. They wanted to track votes by country, by resolution, by session and several other facets. The composition of the council changed regularly. They also wanted to combine the vote tracking with qualitative analysis and commentary. They built votescount.hrw.org on Silk, and it is now their primary mechanism to count votes at the HRC, build visualizations and track results. They also add copy to pages where they want to place analysis. They transformed something like 30 separate spreadsheets and Word documents into a single Silk. They were very happy.

Another favorite Silk is the World Startup Wiki. This is an ambitious effort by Bowei Gai. He sold his first startup, CardMunch, to LinkedIn. Bowei was traveling around and thought how cool it would be to build a series of guides that would give entrepreneurs a 15-minute primer on everything they needed to know about launching a startup in each country of the world. He tried some legacy wiki platforms but decided instead to build using Silk. He programmatically built thousands of pages that captured huge amounts of data about the startup particulars of each country – most influential investors, best funded startups, sectors that had few entrants, etc. They built some very nice interactive visualizations to tell their story. We love what Bowei and his team are doing with Silk. Those are just two but there are many, many more.

VL – Finally, Salar, can you share some of Silk’s plans for 2015, and beyond?

SK – While we can’t discuss specific features, the general direction of Silk is to make it easier for people to publish data online and to find, interact with and use the published Silks. In the last few weeks, we’ve released a better way to import CSV data into Silk and renewed our visualizations – and we will continue to work on and improve those features. For the next year, we will also start to make it easier for people to find existing Silks that they might find interesting and then follow those Silks and the people behind them.

VL – Thank you so much, Salar!

SK – Thank you for this opportunity and your excellent blog, Tiago!

We thank Salar for answering our questions in such detail. You can connect with him on Twitter (@salar) and LinkedIn, and don’t forget to try out Silk.co.

Written by Tiago Veloso

Tiago Veloso is the founder and editor of Visualoop and Visualoop Brasil. He is Portuguese, currently based in Bonito, Brazil.