big data

Over the past two years I have kept trying to explain to people that Google is appropriating the commons and is building value on something that isn’t theirs. Another way of saying the same thing is that Google is using you as a sensor to build knowledge about the world as a whole. The example I usually bring up is how any action you take on Google Maps furthers Google’s knowledge about what the world looks like and how people (want to) operate in it.

Zuboff has written an article that I wish I had been capable of writing. She makes the mechanism far more explicit and has given it a name: surveillance capitalism. I can’t wait to read her forthcoming book.

On the problem:

The equation: First, the push for more users and more channels, services, devices, places, and spaces is imperative for access to an ever-expanding range of behavioral surplus. Users are the human nature-al resource that provides this free raw material. Second, the application of machine learning, artificial intelligence, and data science for continuous algorithmic improvement constitutes an immensely expensive, sophisticated, and exclusive twenty-first century “means of production.” Third, the new manufacturing process converts behavioral surplus into prediction products designed to predict behavior now and soon. Fourth, these prediction products are sold into a new kind of meta-market that trades exclusively in future behavior. The better (more predictive) the product, the lower the risks for buyers, and the greater the volume of sales. Surveillance capitalism’s profits derive primarily, if not entirely, from such markets for future behavior.

On the solution:

In undertaking this challenge we must be mindful that contesting Google, or any other surveillance capitalist, on the grounds of monopoly is a 20th century solution to a 20th century problem that, while still vitally important, does not necessarily disrupt surveillance capitalism’s commercial equation. We need new interventions that interrupt, outlaw, or regulate 1) the initial capture of behavioral surplus, 2) the use of behavioral surplus as free raw material, 3) excessive and exclusive concentrations of the new means of production, 4) the manufacture of prediction products, 5) the sale of prediction products, 6) the use of prediction products for third-order operations of modification, influence, and control, and 7) the monetization of the results of these operations. This is necessary for society, for people, for the future, and it is also necessary to restore the healthy evolution of capitalism itself.

This afternoon I attended a session at info.nl in Amsterdam with Brewster Kahle, who wants to create “Universal Access to All Knowledge”. He founded The Internet Archive, a non-profit library with about 150 people. It is best known for its Wayback Machine (collecting about 5 billion web pages a month, amazingly still fitting in a container).

They are convinced that it is feasible to store all the world’s knowledge. Texts are being digitized (i.e. scanned) for representation on the screen (see Open Library for examples) and are openly available. The Internet Archive have made their own scanners pushing the costs per scanned page (mostly labour) down to about 10 cents per page. Their scanning centers now have 3,000,000 free ebooks available online (incl. 500,000 for the blind/dyslexic and 250,000 modern books available for lending) and they have about 8 million more to go. They have made a book mobile that can download and print a book for about one dollar.

Book Mobile

They are also focusing on archiving all audio, offering unlimited storage and unlimited bandwidth for free and forever to bands who want to store their tapes online. They have over 1 million audio items in over 100 collections. They are doing similar things with moving images, making permanent archives of video sites that have gone out of business, home movies and even television (do check that one out, it makes TV news quotable and even includes a lending model for physical DVDs of TV news).

They store their 10 Petabytes of data in a redundant fashion and also store 600,000 books in a physical archive (growing fast of course).

Brewster also talked a little bit about his case against the US government: he received a national security letter from the FBI, which was deemed unconstitutional with a bit of help from the EFF and from the fact that the Internet Archive is a library.

Daniel Erasmus from Digital Thinking Network (DTN) did a short presentation on NewsConsole, which uses a big data approach and aims to collect all the world’s news and put it in an interface that makes it easy to interact with. I’ve been using it for a while to find news in the field of learning technology. I particularly liked his key lessons from working with big data, like:

SQL won’t cut it

Big data is messy, a lot of effort goes into cleaning it up

Moving a petabyte of data is very expensive and difficult, so store it correctly the first time

Testing on small subsets doesn’t work, because you get unexpected bottlenecks when you scale

I am attending Ars Electronica: The Big Picture in Linz, Austria. This is a festival for Art, Technology and Society. If you are interested in knowing a bit more about the festival, then don’t watch their useless trailer.

Look at the program instead or check out their newly launched and fabulous archive.

One of the symposia I attended was titled Sensing Place/Placing Sense II. It consisted of three panels: Collect, Communicate and Compel. The session was introduced as follows:

A growing part of the general public is concerned that cities are planned and governed in a responsible way. In the contemporary information society, however, the democratic obligation of the citizens to rigorously inform themselves so that they can participate in public affairs has become impossible to fulfill. Rather than submitting to the opinions of self-proclaimed experts, citizens need new ways to make sense of what is going on around them. Accountability technologies stand for new innovative approaches to bottom-up governance: technologies to monitor those in power to make sure that they are held accountable for their actions. Accountability technologies are designed to support coordinated data collection, analysis and communication to achieve social change. The past years saw many examples dedicated to this concern: citizen sensing of traffic noise or congestion, pollution; monitoring of mobility infrastructures and urban energy consumption; whistleblowers revealing corruption and misuse of power. We are interested in such projects and technologies that have succeeded in making an impact on the reality of the city. We are interested in the motivations, strategies and tactics of the people who create and use these technologies. We are also interested in the role of representation – does it make a difference how information is presented? How can data generated by citizens interface with official structures and put into action?

Below are my notes and quick thoughts on what was discussed.

Collect. Data from the top-down and bottom-up – reflecting on truth, trust and politics

His talk was full of examples of citizens collecting data and using it to hold companies and governments accountable.

Another exciting project is their attempt to make an open source and cheap spectrometer. You can support their idea on Kickstarter. Also check out the Spectruino if you are interested in this type of technology.

Amber Frid-Jimenez talked about a cool artistic project titled Data is Political. I obviously love this idea and like the fact that it is diametrically opposite to one of Google’s innovation principles (number 7) that I wrote about earlier. The question they ask themselves in the project is: How does the scale of expanding databases affect artists and designers and how they work?

She played a couple of videos. The one I liked best was Benjamin Mako Hill talking about who should control our technology (read his appeal on the site of the Free Software Foundation). Note the irony in the fact that this video was played from an Apple computer.

Communicate. Data journalism and information activism – communicating data to the public

Michael Kreil is from OpenDataCity and wears a “There is no place like 127.0.0.1” t-shirt. He talked about a few of their projects. One example is their train monitor, Zugmonitor, where they publish live data on German trains, including an API to access the data.

Kreil is also the creator of the Malte Spitz phone usage visualization I wrote about earlier. He made an interesting point questioning whether it makes sense to have a private company (i.e. Deutsche Telekom) have a lot of personal data about politicians, lawyers or doctors. Think about it.

Sami Ben Gharbia has been living in exile in the Netherlands for the last 13 years. He is one of the founding directors of Global Voices Online and has started Nawaat. He talked about information visualisation in Tunisia. He gave a few examples, like the 2007 project tracking the use of the presidential airplane or their collaboration with WikiLeaks: Tunileaks. They are now focusing on open data from the government.

Marek Tuszynski is one of the founders and creative director of one of my favourite NGOs: the Tactical Technology Collective which helps human rights advocates use information, communications and digital technologies to maximise the impact of their advocacy work. They make beautiful and useful resources for activists. See Drawing by Numbers for an example.

Marek talked about what he calls the spectrum of evidence. There are a few steps: first you have to find it (often it is hidden), then you need to collect it (often there is a lot of data in all kinds of forms) and then you need to curate it (show who is talking, who is listening/looking). Now that you have the evidence, you can do three things: expose to get the idea, understand to get the picture or explore to get the detail.

He talked about how the ubiquity of data visualisation tools is making data a very abstract thing: a Google Map with some pointers on it can show locations of road-salt depots in England or casualties in Baghdad and look completely the same. One person tried to battle this by tattooing the data on his skin, a physical manifestation of the evidence.

Image from an NPR article

Compel. From data to action – strategies for achieving change in the public sphere

Michel Reimon is a politician and a writer. He finds it hard to make a distinction between these two jobs. He is writing a book with the working title: “in _ formation, how we coordinate society”.

Reimon referenced systems theory by Niklas Luhmann. According to Luhmann there are five generalized media: love, power, money, art and truth. These are translated by Reimon into four ways that people can influence each other in a political way: through relationship, physical force, compensation and information. Reimon says that we have shifted from relationship towards physical force and then towards compensation (i.e. money, about 500 years ago). Right now we are shifting again into an age where information is the main thing used to organize ourselves. Is this the next big shift in the organization of society? Is information becoming more important than compensation to coordinate “the people”?

According to him corruption affects all aspects of life: health, safety, education, water. One in four people in the world have to pay a bribe when they interact with one of these services. 50% of the people in the 80 countries that they research think that politics and law enforcement are corrupt and two-thirds of the people think things are getting worse. What makes it very bad is that there is a “trickle-up” effect.

There is now a surge in enthusiasm around the concepts of accountability and transparency. He talked about social accountability and participatory budgeting. There are some big challenges in this: it is difficult to encourage engagement and sustain it, there is the problem of free-riding and overkill, and there is circumvention and sometimes even citizen capture.

Ambient accountability is the systematic use of the built environment and physical public space, in order to further transparency, accountability and the integrity of public services. Dieter showed some examples of billboards battling corruption. They are quite ineffectual, but an example of a taxicab passenger bill of rights is already more relevant:

Taxi Bill of Rights (from Beck Taxi)

Another example is the famous “How’s my Driving” bumper sticker which reduces accidents by as much as 40% for trucking companies that decide to use it.

He has created a first possible typology for ambient accountability. It should facilitate three types of things:

Making it clear what ought to happen

Facilitating monitoring and tracking to show what is actually going on

Giving an overview of who is responsible and how to complain

Ambient accountability can overcome some of the issues mentioned above: it complements ex-ante and ex-post measures of social accountability, it scales and persists, it limits the collective action and capture problems, mixes preventative and corrective effects, is open to bottom up, top down and mixed interventions and there is no need to invent from scratch. Dieter thinks it might work quite well because it is norm-promoting, aligned with a realistic view of citizenship and completely just in time and just in the right place.

Thomas Diez is the director of Fablab Barcelona. He is developing a project with the city of Barcelona exploring the relationship between machines building things and urbanism. According to Thomas we are now in a second renaissance (or a third industrial revolution) driven by personal computers and personal production. If you look at the history of Barcelona you can see the different stages of industrialisation. What will the future city look like if we take personal production as the basis? To find out they will set up a network of fablabs in the city each deeply connected to local communities (in collaboration with the IAAC?). The car was the last technology to change the way our cities work. Diez thinks the Internet will now change the way we configure our cities. He also criticized the concept of smart cities that you see everywhere and has decided to put out an alternative: the smart citizen: a platform to generate participatory processes of the people in the cities. Connecting data, people and knowledge, the objective of the platform is to serve as a node for building productive open indicators and distributed tools, and thereafter the collective construction of the city for its own inhabitants.

I was at a full day about innovation at Mediaplaza in Utrecht today. We used a room that had a stage in the center and chairs on four sides around it. This is a bit weird as the speaker has to look in four directions to be able to connect with the audience. The funny thing is that it actually works (also because there are four screens on each wall): each of the speakers could do nothing but be dynamic on the stage.

Below are my public notes on a few of the presentations:

Gijs van der Hulst, Business Development Manager at Google

The Wall Street Journal has done some research and found out that there has been an increase of 65% in how often top 500 companies mention the word “innovation” in their public documents in the last five years. Unfortunately the business practices of these companies have not really changed. How can you really effect change?

Google has nine “rules for innovation”:

Innovation, not instant perfection. Another way of saying this is “launch and iterate”: first push it to the market and then see if it is working.

Ideas come from everywhere. They can come from employees, but also from acquisitions or from outsiders.

A licence to pursue your dreams. An example of a 20% project that was very successful is Gmail. This was started by somebody who didn’t like how email was working at the time.

Share as much information as you can. This is very different from most companies. The default for documents within the company is to share with everyone.

Users, users, users. At Google they innovate on the basis of what users want, not on profit.

Data is apolitical. Opinions are less important than the data that supports them. They always seek evidence in the data to support their ideas. Personal note from me: Really? Really?? You cannot be serious!

Creativity loves constraints. Their obsession with speed (with hard criteria for how quickly the interface has to react to user input) is an example of an enabler for many of their innovations.

You’re brilliant? We’re hiring. In the end it is about people and Google puts a lot of effort into making sure they have the right people on board.

Larger companies are more bureaucratic than smaller companies. Google is now more bureaucratic than it used to be. One of the ways this can be battled is by reorganizing, which is exactly what Google has done recently.

Sean Gourley, Co-founder and CTO of Quid

Sean talked about our eye as an incredible machine with an incredible range. We enhanced our sight through microscopy and telescopy which opened up views towards the very small and the very big. We have yet to develop something that helps us see the very complex. He calls that “macroscopy”. For macroscopy you need:

big data

algorithms

visualization

He used this framing for his PhD work on understanding war. His team used publicly available information to analyze the war. When WikiLeaks leaked the US sig event database they could validate their data set and found that they had 81% coverage. His work was published in Science and in Nature. He decided to take it further, though, as he really wanted to understand complex systems. They needed to go from 300K in funding and 6 people towards an ambition level of about $100M and 1,000 people. He sought venture capital and had Peter Thiel as his first funder for Quid.

Sean then demoed the Quid software analyzing the term “big data”. Quid allows you to interactively play with the information. They extract entities from the information. So for example there are about 1500 companies involved in the big data space, which can be put into different themes, allowing you to see the connections between them while also sizing them for influence. Next was a fractal zoom into American Express where they looked at their patent portfolio and explored their IP, creating a cognitive map of what it is that American Express does.

In 1997 Deep Blue changed the way we discussed artificial intelligence. We were beaten in chess by brute horsepower. As a reaction Kasparov started a new way of playing chess where you are allowed to bring anything you want to the chess table. The combination of human and machine turned out to be the best one. Gourley sees that as a metaphor for what he is trying to do with Quid: enhancing human cognitive capacity with machines, augmenting our ability to perceive this complex world.

Sean also talked about the adjacent possible: the way that the world could be if we used the pieces that are on the table right in front of us (e.g. the Apollo 13 air filter and duct tape).

His research on insurgents has taught him that some of them are successful and when they are, it is because of the following reasons:

Many groups

Internal Competition

Long Distance Connections

Reinforce Success

Fail

Shatter

Redistribute

Polly Sumner, Chief Adoption Officer at Salesforce

Salesforce was recently recognized by Forbes as the most innovative company in the world. According to Polly the tech industry has significant innovations every 10 years. For each of these ten-year cycles the industry has 10 times more users.

Polly talked about how she used their social platform called Chatter to collaborate in a completely “flat” way. They now even use Chatter as a means to make the worldwide management offsite meeting radically transparent. The next step in the Chatter platform is to “gamify” it and let the individual contributors rise and recognize their contributions (they’ve acquired Rypple for example).

Agile is about maintaining innovation velocity and delivering at speed. The “prioritize, create, deliver, get feedback, iterate”-cycle needs to be sped up. One way of doing this is by listening to your customers as they are all a natural source for ideas. She showed a couple of examples from Starbucks and KLM.

Polly then shared an example of where Salesforce made a mistake: they announced a premium service that they wanted to charge extra for. Customers complained loudly on social media and within 24 hours they reversed their decision.

In 2000 they asked themselves the question: Why isn’t all enterprise software like Amazon.com? In 2011 they asked themselves a different question: Why isn’t all enterprise software like Facebook? She would consider 2011 the year of the Social Revolution. Salesforce’s vision is that of a social enterprise: allowing the employee social network and the customer social network to connect (preferably in a single social profile).

On Fortune 500 Statoil rates first on social responsibility and seventh on Innovation.

Bjarte discussed the problems with traditional management. He used my favourite metaphor, traffic, comparing traffic lights to roundabouts. Roundabouts are more efficient, but also more difficult to navigate. A roundabout is values-based and a traffic light is rules-based. Roundabouts are self-regulating and this is what we need in management models too. He then touched on Theory X and Theory Y.

When you combine Theory X with a perception of a stable business environment you get traditional management (rigid, detailed and annual, rules-based micromanagement, centralised command and control, secrecy, sticks and carrots). If you perceive the business environment as stable and you have Theory Y your management is based on values, autonomy, transparency (can be an alternative control mechanism) and internal motivation. If you combine Theory X with a dynamic business environment you get relative and directional goals, dynamic planning, forecasting and resource allocation and holistic performance evaluation.

Finally, if you combine Theory Y with a dynamic business environment you get Beyond Budgeting.
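
The four combinations above amount to a small lookup table. Here is a minimal sketch of it in Python. The function name and the short labels for the two middle quadrants are my own shorthand for what is described in these notes, not Bjarte's terms:

```python
# Two-by-two matrix from the notes above:
# (view of people, perceived business environment) -> management style.
# The labels for ("Y", "stable") and ("X", "dynamic") are shorthand
# summaries of the notes, not official Beyond Budgeting terminology.
MANAGEMENT_MODELS = {
    ("X", "stable"): "traditional management",
    ("Y", "stable"): "values-based management",
    ("X", "dynamic"): "dynamic processes",
    ("Y", "dynamic"): "Beyond Budgeting",
}


def management_model(theory: str, environment: str) -> str:
    """Return the management style for a Theory X/Y view and an environment."""
    key = (theory.strip().upper(), environment.strip().lower())
    if key not in MANAGEMENT_MODELS:
        raise ValueError(f"unknown combination: {theory!r}, {environment!r}")
    return MANAGEMENT_MODELS[key]


if __name__ == "__main__":
    for (theory, environment), model in MANAGEMENT_MODELS.items():
        print(f"Theory {theory} + {environment} environment: {model}")
```

Writing it down this way makes it obvious that only one of the four quadrants is Beyond Budgeting: you need both the Theory Y view of people and the perception of a dynamic environment.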

Beyond Budgeting has a set of twelve principles (it isn’t a recipe, but more of an idea or a philosophy):

Governance and transparency

Values: Bind people to a common cause; not a central plan

Governance: Govern through shared values and sound judgement; not detailed rules and regulations

Transparency: Make information open and transparent; don’t restrict and control it

Accountable teams

Teams: Organize around a seamless network of accountable teams; not centralized functions

Trust: Trust teams to regulate their performance; don’t micro-manage them

Accountability: Base accountability on holistic criteria and peer reviews; not on hierarchical relationships

When we combine three things (the target, the forecast and the resource allocation) in a single budget number, we run into their conflicting purposes. So the first step towards Beyond Budgeting is separating these three things. For example, the target is what you want to happen and the forecast is what you think will happen. The next step is to become more event driven rather than calendar driven.

Statoil has a programme called “Ambition to Action”:

Performance is ultimately about performing better than those we compare ourselves with.

Do the right thing in the actual situation, guided by the Statoil book, your Ambition to action, decision criteria & authorities and sound business judgement.

Within this framework, resources are made available or allocated case-by-case.

Business follow-up is forward looking and action oriented.

Performance evaluation is a holistic assessment of delivery and behaviour.

From strategic ambitions to KPIs (“Nothing happens just because you measure: you don’t lose weight by weighing yourself.”) and then into actions/forecasts and finally into individual or team goals.

Today and tomorrow I will be attending and speaking at the e-Learning Event in Den Bosch in the Netherlands. This should be one of the biggest learning technology events in the Netherlands. For some reason I have never been before, so I am curious to see how much I enjoy the event.

Theo Rinsema, General Manager Microsoft Netherlands

Rinsema talked about new ways of working (“het nieuwe werken”), a concept that in the Netherlands has been appropriated by Microsoft. His first point was that current times have accelerated the amount of change and that this means that we will have to learn continuously. Learning and change are very much related. The causes for this speed of change can be found in a couple of trends that drive change in the virtual world: cloud computing, data explosion, social computing, apps, natural interfaces, connections, computing ecosystems and mobile workplaces. Cloud computing, for example, lowers the barrier of entry in a market. This creates more competition and this accelerates development.

Microsoft in the Netherlands went through a change process (1100 people work for Microsoft in the Netherlands). They focused on productivity (can we really become more productive every year or are we just working more hours?), talent (how can we attract more women to our mostly male organization?) and the boundaries between work life and private life (how do we solve the puzzle where our offices are only utilised 24% of the time, people like the flexibility, but don’t like their private/work mix?). They were on a multi-year journey in which one of the key elements was creating trust between employees and creating real conversations between staff (I wonder whether he has read the Cluetrain Manifesto).

They created a few things:

“Ruimte voor groei-dagen” (“room for growth days”): an event where the whole organization gets together and works on personal growth.

“Raad van Anders”: they have about 50,000 visitors a year coming to check out their offices to see how they are working. Rinsema thought that Microsoft was starting to believe too much in itself. They instituted a “board of others”, inviting non-Microsoft people (young people, government workers, women, disabled people) to come into their offices, have open doors everywhere and then get feedback on what Microsoft does (with the press present). This enables Microsoft to “see with different eyes” (Proust would have said: “see with new eyes”).

“Silverlight Society a.k.a. project Crowley”: an alternate reality game in which Microsoft staff thought they were in a pilot from Microsoft Research about collaborating in a virtual world. Members of this elite group of beta testers had to solve more and more complex problems day by day, forcing them to collaborate with each other and use social networks. 290 people participated.

I appreciated Rinsema’s talk for sounding authentic and for not mentioning SharePoint as an enabler for these new ways of working. This means he is smarter than 95% of the collaboration consultants in this space.

Erwin Blom on the Social Media Revolution

Erwin Blom from Fast Moving Targets is a journalist who got addicted to the Internet in 1994 when he was working for Dutch media outfit VPRO. He produced a music program for the radio and found out that he suddenly wasn’t the expert anymore: his community of listeners knew more than him. He later became head of new media for the VPRO and now works for himself looking at how the net changes many aspects of society.

He showed Draw Something as an example of where people learn very naturally: his children play the game to learn English and learn how to visualize. It is incredible how quickly that game grew and for how much the creators were bought by Zynga. Another example of using game-based things is Codecademy. Another example is Foodzy. It teaches you about your own behaviours around food and teaches you a lot about food. Blom considers YouTube the largest collection of lessons in the world. In general these things work for one person, but they work even better if there are multiple people doing the same thing.

With social media everybody now is a publisher. We have endless means to tell each other stories. We underutilize the potential of storytelling (an important skill). We are now all connected and can ask each other questions and can have good conversations with people that were out of our reach (in many dimensions) before. Knowledge is now available everywhere, so we need to learn how to find and select the information. Network building skills and “personal branding” skills are important for future proofing. You have to be present on these platforms and create narratives about yourself.

He showed a nice example of what his daughter learns from her blog. She is learning about how to tell a story, about how to write headlines and about dealing with commentary, and she learns discipline (blogging twice a week). His son writes at Game Testers United and learns similar lessons. Blom asks himself why this isn’t a part of their school education. Can’t we make schools media production companies?