On September 20-21, IBM is hosting The Big Data Governance Summit at the Ritz-Carlton Bachelor's Gulch in Vail, Colorado.

This event is about Metadata, Stewardship, Security, Privacy, Data
Quality, and Big Data. We can reach for the skies, pull in petabytes of
relational tables, Twitter feeds, video, audio, and documents, but it's
all garbage in and garbage out without Data Governance.

Velocity, Volume, and Variety without Veracity create Vulnerability.

Everyone knows this, and it's our task to do something about it. We have to show
how it can be done – how anyone can build vibrant, dynamic, Big Data
Ecosystems that use common standards, ontologies, and methods to tag
huge volumes of data, index its value and context at high Velocity, and
search across its variety to discover trends with large clusters of
computational power that deliver high Veracity and low Vulnerability.

This is the promise of Big Data Solutions, uniting disparate data sources
across our organizations, our cities, and our planet; leveraging data
sets based on purpose specification; searching for meaning and value
with brute force speed.

I can see this promise. It's within our
grasp. We can bridge our stovepipes of data and non-standard behaviors
into lean, mean, transformation machines that yield incredible insights
and informational power.

But this promise is only in reach with
Data Governance Solutions to provide metadata tagging, standards,
ontologies, purpose-based access protocols, audits, security &
privacy, data quality, discrete retention rules, and new tools and
technologies to automate how we do it.

The purpose of this event
is to explore how we can bring these ideas forward to help the world
adopt Big Data Ecosystems more rapidly, more successfully, more
fruitfully.

We are meeting at the Ritz-Carlton Bachelor's Gulch,
which is the wonderful venue where we first shared the IBM Data
Governance Council Maturity Model with the world in 2007. We will look
at real life examples of firms using Big Data, exploring ecosystems, and
developing standards to model and simulate them.

This meeting is hosted by the IBM Data Governance Council, but it is open to all.

This morning, General Motors announced that it would no longer advertise its cars on Facebook. This announcement comes a day before the Facebook IPO, and casts a shadow over the business model of Facebook. GM said that it will continue to support its page and user community on Facebook, but that ads just weren't effective in helping consumers make car-buying decisions. Ford jumped on this announcement to say it would continue to buy ads on Facebook and that Social Media requires a consistent commitment to innovation and community development.

Maybe. But I think GM's decision does illustrate a key problem for Facebook and Twitter - the revenue model. Social Media grew up without dependencies on ad-based revenue. On Facebook, you aren't a customer. You are a product, and it's your likes, dislikes, friends, photos, videos, and content that generate value. Selling products to products via advertising is hard. Members don't use Social Media to go shopping. There's no commerce platform there. They use it to be social. There are so many other outlets that are more effective for advertising than Social Media.

So how should Facebook and Twitter make money? My idea: make it collective. The value is in the data.

1. Make terms and conditions explicit that every member owns their own data via copyright. This does two positive things.

A. It indemnifies Facebook and Twitter for the crazy, infringing, and potentially libelous posts of their members by allowing them to claim that they are conduits of content rather than publishers or distributors.

B. Copyright establishes the rights to royalties for content created and posted on their networks, which enables the next step.

2. Allow members to opt in to Big Data analysis by Social Media partners and intermediaries.

3. Charge Social Media partners for Big Data searches by data volume.

4. Pay members royalties every time their data is used in Big Data Searches.

This simple model creates powerful incentives that transform members from products into mutual social network content providers with an economic interest in posting content that will be used in Big Data searches. It establishes data property rights that insulate Facebook and Twitter from vouching for the content on their networks. Members will also discover that providing high-quality data that companies want to search for means more royalties, so the system will encourage better behavior. And it creates a 2-tier royalty distribution model that will also pay Facebook and Twitter handsome revenue, change online advertising, and force every other content aggregator to change too.
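To make the arithmetic of the 2-tier model concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the names (Post, settle_search), the per-byte price, the platform's share, and the rule that members split the remainder in proportion to how much of their data was searched. Neither Facebook nor Twitter offers anything like this today.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    """A hypothetical unit of member-owned content."""
    member: str       # the member who holds the copyright (step 1)
    size_bytes: int   # how much data the post contributes to a search
    opted_in: bool    # the member agreed to Big Data analysis (step 2)


def settle_search(posts, price_per_byte, platform_share):
    """Split the fee for one Big Data search between the network and its members.

    price_per_byte and platform_share are made-up parameters.
    Returns (platform_revenue, {member: royalty}).
    """
    # Only content from members who opted in is searched and billed.
    used = [p for p in posts if p.opted_in]
    total_bytes = sum(p.size_bytes for p in used)
    if total_bytes == 0:
        return 0.0, {}

    fee = total_bytes * price_per_byte          # step 3: charge by data volume
    platform_revenue = fee * platform_share     # tier 1: the network's cut
    member_pool = fee - platform_revenue        # tier 2: royalties for members

    royalties = defaultdict(float)
    for p in used:
        # step 4: pay each member in proportion to the data that was used
        royalties[p.member] += member_pool * p.size_bytes / total_bytes
    return platform_revenue, dict(royalties)


posts = [
    Post("alice", 600, True),
    Post("bob", 400, True),
    Post("carol", 1000, False),  # did not opt in, so never searched or paid
]
platform_revenue, royalties = settle_search(posts, price_per_byte=0.01, platform_share=0.3)
```

With these invented numbers, the partner is billed for 1,000 bytes, the network keeps 30 percent of the fee, and alice and bob split the rest 60/40. A real scheme would need its own pricing and its own definition of what counts as "used".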

Of course, Facebook and Twitter will have to sort out who's a person and who's a bot, and will have to provide content creation tutorials that help users/customers create content of value by sharing the top 100 Big Data queries and sample results.

But this Business Model has something for everyone and is a true win-win. It benefits customers by establishing data property rights and royalties for content. It benefits organizations that want to do Big Data searches by providing ever richer data streams of high quality and availability. And it benefits Facebook, Twitter, and their investors by providing an enormous profit-making engine selling Data.

The Data is the Value. The more there is, the more valuable it becomes. Pay your customers to create higher quality data and charge your partners to use it. It's a simple Business Model.

I use Big Data every day. I don't have Hadoop, a Data Warehouse, ETL, or a big analytical engine. But I use search engines, which are indexes of web pages from around the world, to discover related and unrelated facts. I use Twitter and LinkedIn, which aggregate the ideas of millions of people, to understand the sentiments of the people I follow. And I make decisions, and mistakes, with this information every day.

We all do. And in that context, we are all Big Data users and abusers, and we can identify with larger enterprises that are also confronting vast streams of information from every corner of the globe, created by individuals, communities, corporations, and governments. We as individuals never had industrial data management applications. We never had Data Governance Councils, Stewards, or Data Management professionals. So we've been selecting data streams first and using the ultimate analytical engine - our brains - to integrate that information, glean trends, and make decisions.

What's new about Big Data is that large enterprises are copying the information processes that We The People use every day. They are selecting streams first, aggregating them second, determining application third, making decisions fourth. Judging consequences of decisions... later, if at all. Organizations around the world are deciding to retain information much longer because they believe that latent, slow-developing trends may lie dormant in that information and can be discovered much later.

But with vast volumes of information and long retention cycles, high-velocity decision-making has the potential to do as much enormous damage as enormous good. And we know from experience that decision-making is often influenced by cyclical trends, personal prejudice, and national dogma. Counter-cyclical views can be marginalized. Whistle-blowers can be fired.

But Big Data also offers an historic opportunity for Data Management. For too long, the people in this industry have been seen as back-office archivists recording the deeds and attributes of heroic business leadership in dingy databases in large glass-house mainframes and data warehouses. They have taken a back seat to application developers and business analysts, who first and foremost collect the requirements of business users for new applications, features, and functions.

But Big Data changes all of that. It makes information sources and streams more important than applications, features, and functions. It changes the emphasis in value creation and puts the onus on Information Management to produce better sources and streams, easier aggregation and integration, manufacturing information products any user can leverage in any application they wish.

It's large enterprises automating the way We The People use online information every day, and the power and consequences of this paradigm shift are profound and potentially quite scary.

We need Information Governance over every part of Big Data to assure that organizations can answer these fundamental questions:

1. Can we trust our sources?

2. Do we know where they came from?

3. How do we verify the authenticity of the information?

4. Can we verify how the information will be used?

5. What decision options do we have?

6. What is the context for each decision?

7. Can we simulate the decisions and understand the consequences?

8. Will we record the consequences and use that information to improve our Big Data information gathering, context, analysis, and decision-making processes?

9. How will we protect all of our sources, our processes, and our decisions from theft and corruption?

This morning, the Information Governance Community began discussing these issues in a global teleconference moderated by IDC. We have just scratched the surface of these issues and have much more to discuss. We have agreed to create a new category - Big Data - in our Maturity Model to provide organizations with new methods to benchmark their Big Data Governance maturity. But we also agreed that our existing Maturity Model categories still apply and that we need to update them to include Big Data issues and questions.

I believe this is critical work. Big Data is an enormous opportunity to make information the arbiter of value creation in the Information Age. But it is also an enormous risk because the same solutions can be used to make dangerous and destructive decision-making a high volume, high velocity science.

Every new technology can be used for both good and evil. Join the Information Governance Community to help ensure Big Data serves the best possible uses.