In the 1980s, John Naisbitt wrote, “We have for the first
time an economy based on a key resource [information] that is not only
renewable, but self-generating. Running out of it is not a problem,
but drowning in it is.”[i] Little did Naisbitt know how much information
we’d be creating 30 years later. By some estimates we are generating
over 1 zettabyte (1x10^21 bytes) per year.[ii] How do you avoid drowning
in all that data and gain insights? That is the realm of
Big Data Solutions.

The IBM Analytics Solutions Center recently ran a seminar on Big Data.
We started off talking about the ‘big data conundrum.’ The volume of
data is growing so rapidly that the fraction of data that an enterprise
can analyze is decreasing. Because of this gap, we’re getting ‘dumber’
about our organizations and jobs over time. This is driving the need
for improved analytics and platform technology that can help us process
this large volume of data.

What do customers want to do with big data? Popular requests we’ve
heard include: IT log analytics, RFID tracking and analytics, fraud
detection and modeling, risk modeling, a 360° view of a
person/place/thing, call center record analysis, and fusion of multiple
unstructured objects (e.g., pictures, audio). Since we now collect so
much data, the possibilities are limited only by your imagination and
by our ability to extract insights from the data.

To process these large volumes of data, special systems and
applications are being deployed. Many of these are based on the Apache
Hadoop middleware, which supports a distributed file system and
processing environment for scalability, flexibility, and fault
tolerance. IBM’s big data platform includes offerings based on Apache
Hadoop with enhancements to improve workload optimization, security,
and cluster hardening. The IBM offering (BigInsights) also comes
packaged with advanced analytical capabilities for data visualization,
text analysis, and machine learning analytics. One interesting item was
the announcement that the enhancements would be packaged to allow them
to work with other Hadoop distributions, such as the Cloudera™ Hadoop
distribution. Another offering discussed in the seminar was the stream
computing offering, designed to efficiently process “data in motion,”
such as stock ticker streams and social media feeds.

One of the biggest challenges, given the huge volume of information, is
finding the right information. Governments, utilities, and financial
companies have this problem in particular because of the huge volumes
they deal with. A recent IBM acquisition, Vivisimo, has developed a
next-generation search engine to provide search across multiple big
data and traditional platforms. Vivisimo provides a scalable search
application framework that can perform a federated search across many
different data sources including the web, social media, content stores,
and more traditional structured database systems. One feature that may
be particularly appealing to government agencies and corporate
environments is its ability to map the individual access permissions of
each data item, authenticate users against each target system, and
limit access to only the information a user would be entitled to view
if they were directly logged into the target system.
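
The permission-aware federated search described above can be illustrated with a small sketch. This is not Vivisimo's API; the `Document`, `Source`, and `federated_search` names and the sample data are all hypothetical, and the sketch assumes each data item carries a simple list of users allowed to view it.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Document:
    doc_id: str
    text: str
    allowed_users: Set[str]  # hypothetical per-item access list from the source system


@dataclass
class Source:
    name: str
    users: Set[str]  # users able to authenticate against this (hypothetical) system
    documents: List[Document] = field(default_factory=list)

    def search(self, user: str, keyword: str) -> List[Document]:
        # Users who cannot log into the target system see nothing from it.
        if user not in self.users:
            return []
        kw = keyword.lower()
        # Return only items that match AND that this user may view.
        return [d for d in self.documents
                if kw in d.text.lower() and user in d.allowed_users]


def federated_search(sources: List[Source], user: str,
                     keyword: str) -> List[Tuple[str, str]]:
    """Query every source and merge results as (source name, document id)."""
    results = []
    for source in sources:
        for doc in source.search(user, keyword):
            results.append((source.name, doc.doc_id))
    return results


# Made-up sample data for illustration only.
content_store = Source("content_store", {"alice", "bob"}, [
    Document("c1", "Quarterly fraud report", {"alice"}),
    Document("c2", "Fraud hotline FAQ", {"alice", "bob"}),
])
database = Source("database", {"alice"}, [
    Document("d1", "Fraud case records", {"alice"}),
])
sources = [content_store, database]
```

Calling `federated_search(sources, "bob", "fraud")` returns only the item bob could see if he logged into the content store directly, while alice's results span both systems. A real deployment would delegate authentication and permission checks to each target system rather than keep them in memory.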

Vivisimo offers a clever search tool that provides easy
navigation and discovery, using both structured metadata (faceted search) and
keywords that the program dynamically discovers based on analysis of
unstructured content. It also provides an agile development layer that
allows users to quickly create applications and dashboards to discover,
navigate, and visualize information.

The seminar also featured a customer case study of using big data for
cybersecurity mission operations. IP traffic is growing at a 29% CAGR,
and with it, the cyber-threats the customer faces. Unfortunately, the
customer’s headcount isn’t growing, so more automated ways are needed
to detect and respond to threats. For this application, timeliness is
key: threats must be dealt with in real time. To identify potential
threats, the customer wants to compare current threat and traffic data
to norms from the recent past and from similar periods in the past.
Their solution uses the Netezza data warehouse appliance for near
real-time data and IBM BigInsights for long-term storage. The solution
eliminates as many mundane “data retrieval” tasks as possible for the
analysts, and provides them with those datasets that have a high
probability of being “interesting.” In this way, the solution helps the
analysts deal with the extreme data volumes, and yet remains flexible
to the changing threat environment.
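
The idea of comparing current traffic to norms from the recent past and from similar past periods can be sketched in a few lines. This is not the customer's solution, which the seminar did not describe at this level; it assumes a simple z-score test, and the function names, the threshold of 3.0, and the sample numbers are all hypothetical.

```python
from statistics import mean, pstdev
from typing import Sequence


def deviation(current: float, history: Sequence[float]) -> float:
    """Absolute z-score of `current` against a non-empty history of values."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # A flat history: any difference at all counts as an infinite deviation.
        return 0.0 if current == mu else float("inf")
    return abs(current - mu) / sigma


def is_interesting(current: float,
                   recent: Sequence[float],
                   same_period: Sequence[float],
                   threshold: float = 3.0) -> bool:
    """Flag a reading that is unusual against BOTH baselines.

    `recent` holds readings from the last few intervals; `same_period`
    holds readings from comparable periods (e.g. the same hour on
    earlier days). Requiring both is an assumption made in this sketch.
    """
    return (deviation(current, recent) > threshold
            and deviation(current, same_period) > threshold)


# Made-up traffic counts for illustration only.
overnight = [100, 110, 90, 105, 95]          # the last few hours
morning_peaks = [300, 310, 290, 305, 295]    # the same hour on earlier days
```

With these numbers, a reading of 302 is far above the overnight values but normal for the morning peak, so it is not flagged; a reading of 500 is unusual against both baselines and is. Requiring both conditions reduces alerts caused by ordinary daily cycles, and replacing `and` with `or` would make the check more sensitive at the cost of more alerts for the analyst to review.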

Do you have an opportunity to use massive amounts of data to
accomplish a business/mission objective that couldn’t be achieved when
we were limited to small volumes of data? Do you have an innovative
solution? We’d like to hear your stories about big data.

For more on the Big Data seminar, see our ASC website under past events.

Does your government agency monitor social media for information relevant to your mission? Should it?

IBM's Analytics Solution Center recently held a seminar to explore
how agencies and companies can obtain value and insight using social
media analysis.

Pat Fiorenza discussed how agencies can develop an ROI Model (Return
on Influence Model) for social media. Agencies use social media
analytics to help inform their decision making by gathering
information and research, and by learning what other agencies and
citizens are saying. Interesting examples from CDC and GovLoop were provided.
Learn more here.

Ed Burek, IBM, talked about how savvy companies are now tapping into
customer-generated content, and how government agencies could do the
same to learn how taxpayers feel about government actions and
messaging. He gave examples of how regulatory agencies could receive
the unvarnished comments of those impacted by regulations, as well as
how they could stay on top of "negative chatter." IBM has created a
framework to derive business insight from the vast amounts of social
media content now being transmitted. Called Cognos Consumer Insight, it
provides real-time information on trends and sentiment.

Rick Lawrence, IBM Manager for Machine Learning at Watson Research
Center, next talked about the leading edge of social media analytics. He
provided examples from the research portfolio on discovering who the
key influencers are, identifying emerging topics of discussion, and
mapping billions of tweets to concepts that we really care about.

All of the presentations are available on the ASC website under Past Events (May 10, 2012).

Does your agency care about what its constituents are saying about it
on social media? Does your agency need real-time intelligence
on events within its mission space? With 340 million tweets per day, 2
million blog posts, and 500 million Facebook updates, how can you find
the important information? Social media analytics may be an idea
whose time has come.