We assembled a panel of experts to explore how big data changes the
status quo for architecting the enterprise. We'll learn how large
enterprises should anticipate the effects and impacts of big data, as
well as the simultaneous impacts of cloud computing and mobile.

It’s
been an interesting thread throughout the conference for me to factor
where big data begins and plain old data, if you will, ends. Of course,
it's going to vary quite a bit from organization to organization.

When an enterprise architect and the business architect
looked at data a few years ago, they might not have been as aware of
these boundaries and the importance of data. They perhaps were thinking
that the database administrators and the business intelligence (BI)
folks would take care of that, and they just had to manage the fruits
of the data vis-à-vis applications and integration points.

I
don’t think that’s the case anymore, and one of the points we're going
to get into now is where the enterprise architect needs to be factoring the impacts of big data.

Furthermore, there seems to be
the need to do things differently, not just to manage the velocity and
the volume and the variety of the data, but to really think about data in a fundamentally different way. For many companies, data is now a product
itself. That data can be monetized.

The analysis from
the data becomes important to more and more people in the company, so
that your employees, your partners, and those in your supply chain will
be interacting with your data -- and the analysis from your data -- more
than before.

So I think we need to also think about data differently. And we need to think about security, risk, and governance.
If it's a "boundaryless organization" when it comes to your data, either as a
product or service or a resource, then the control and management of which
data should be exposed, which should be opened, and which should be very closely guarded all need to be factored in, determined, and implemented.

Chris,
let’s start with you. You mentioned that big data to you is not a
factor of the size, because NASA's dealing with so much. It’s when you
run out of steam, as it were, with the methodologies. Maybe you could
explain more. When do you know that you've actually run out of steam
with the methodologies?

Chris Gerty: When we
collect data, we have some sort of goal in mind of what we might get
out of it. When we put the pieces from the data together, either it
doesn't fit as well as we thought, or we are successful and
continue to do the same thing, gathering archives of information.

At that point, where you realize there might even be
something else that you want to do with the data, different than what
you planned originally, that’s when we have to pivot a little bit and
say, "Now I need to treat this as a living archive. It's an 'it may live
beyond me' type of thing." At that point, I think you treat it as
setting up the infrastructure for being used later, whether it be by
you or someone else. That's an important transition to make and might be
what one could define as big data.

Gardner:
Andras, does that square with where you are in your government
interactions -- that data now becomes a different type of resource, and
when you are not able to execute or avail yourself of its value, then
you know you need to do things differently?

Andras Szakal: The importance of data hasn’t changed. The data itself, the veracity of the data, is still important. Transactional data will always need to exist. The difference is that you have certainly the three or four Vs,
depending on how you look at it, but the importance of data is in its
veracity, and your ability to understand or to be able to use that data
before the data's shelf life runs out.

Some data has a shelf life that's long lived. Other
data has very little shelf life, and you would use different approaches
to being able to utilize that information. It's ultimately not about the
data itself, but it’s about gaining deep insight into that data. So
it’s not storing data or manipulating data, but applying those
analytical capabilities to data.

Gardner: Bob,
we've seen the price points on storage go down so dramatically. We've
seen people just decide to hold on to data that they wouldn’t have
before, simply because they can and they can afford to do so. That means
we need to try to extract value and use that data. From the perspective
of an enterprise architect, how are things different now, vis-à-vis
this much larger set of data and variety of data, when it comes to
planning and executing as architects?

Robert Weisman:
One of the major issues is that normally organizations are holding two
orders of magnitude more data than they need. It’s a huge overhead,
both in terms of the applications architecture, which has a code base
larger than it should be, and also from the technology architecture that
is supporting a horrendous number of servers and a whole bunch of technology stuff that they don't need.

The issue for the architect is to figure out what data is useful and institute a governance process, so that you can have data lifecycle management,
have a proper disposition, focus the organization on the information, data,
and knowledge that is basically going to provide business value to the
organization, and help them innovate and have a competitive advantage.

Can't afford it

And
in terms of government, just improve service delivery, because there's
waste right now on information infrastructure, and we can’t afford it
anymore.

Gardner: I suppose big data is part of
the problem, dealing with so much in redundancy and duplication through
the lifecycle of data and what have you, but the data is also part of
the solution in terms of getting the knowledge about what you should or
shouldn't be doing as a business. So it's difficult to know what to keep
and what not to keep.

I've actually spoken to a few
people lately who want to keep everything, just because they want to
mine it, and they are willing to spend the money and effort to do that.
Jim Hietala, when people do get to this point of trying to decide what
to keep, what not to keep, and how to architect properly for that, they
also need to factor in security. It shouldn't come later in the
process. It should come early. What are some of the precepts that you
think are important in applying good security practices to big data?

Jim Hietala: Planning the architecture, looking at bringing in
third-party controls to give you the security mechanisms that you are
used to in your older platforms, is something that organizations are
going to have to do. It’s really an evolving and emerging thing at this
point.

Gardner: There are a lot of unknown
unknowns out there, as we discovered with our tweet chat last month. Some
people think that the data is just data, and you apply the same
security to it. Do you think that’s the case with big data? Is it just
another follow-through of what you always did with data in the first
place?

Hietala: I would say yes, at a conceptual level, but it's like what we saw with virtualization.
When there was a mad rush to virtualize everything, many of those
traditional security controls didn't translate directly into the
virtualized world. The same thing is true with big data.

When
you're talking about those volumes of data, applying encryption,
applying various security controls, you have to think about how those
things are going to scale. That may require new solutions from new
technologies and that sort of thing.

Gardner:
Chris Gerty, back to your experiences at NASA. You've taken the approach
of keeping as much of that data and information as open as you can,
fostering more research and the ability for people to do things with the
data that you may never have envisioned yourselves. When it comes to
that governance, security, and access control, are there any lessons
that you've learned that you are aware of in terms of the best of
openness, but also with the ability to manage the spigot?

Gerty:
Spigot is probably a dangerous term to use, because it implies that all
data is treated the same. The sooner that you can tag the data as
either sensitive or not, mostly coming from the person or team that's
developed or originated the data, the better.

Kicking the can

Once
you have it on a hard drive, once you get crazy about storing
everything, if you don't know where it came from, you're forced to put
it into a secure environment. And that's just kicking the can down the
road. It’s really a disservice to people who might use the data in a
useful way to address their problems.

We
constantly have satellites that are made for one purpose. They send all
the data down. It’s controlled either for security or for intellectual property (IP),
so someone can write a paper. Then, after the project doesn’t get
funded or it just comes to a nice graceful close, there is that extra
step, which is almost a responsibility of the originators, to make it
useful to the rest of the world.

Gardner: Let’s
look at big data through the lens of some other major trends right now.
Let’s start with cloud. You mentioned that at NASA, you have your own private cloud that you're using a lot, of course, but you're also now dabbling in commercial and public clouds. Frankly, the price points that these cloud providers are offering for storage and data services are pretty compelling.

So
we should expect more data to go to the cloud. Bob, from your
perspective, as organizations and architects have to think about data in
this hybrid cloud, on-premises and
off-premises, moving back and forth, what do you think
enterprise architects need to start thinking about in terms of managing
that, planning for the right destination of data, based on the right mix
of other requirements?

Weisman: It's a good
question. As you said, the price point is compelling, but the security
and privacy of the information is something else that has to be taken
into account. Where is that information going to reside? You have to
have very stringent service-level agreements (SLAs),
and in certain cases, you might say it's a price point that’s
compelling, but the risk analysis that I have done means that I'm going
to have to set up my own private cloud.

Right now, everybody's saying the public cloud is
going to be the way to go. Vendors are going to have to be very
sensitive to that and many are, at this point in time, addressing a lot
of the needs of some of the large client bases. So it’s not
one-size-fits-all and it’s more than just a price for service.
Architecture can bring down the price pretty dramatically, even within
an enterprise.

Gardner: Andras, there's this
mashup of cloud and big-data trends, the in-memory approaches, where we
are no longer taking batches of data, cleansing it, and deduping it and
bringing it into a warehouse, going through batch. We're still doing
that, of course, but it seems that for a number of different
applications of data and analytics,
in-memory technology particularly, if you can control that in a cloud
environment, private cloud or otherwise, is starting to change the
game for that fast, real-time feedback loop benefit.

It's a roundabout way of asking if the cloud and big data come together in a way that’s intriguing to you and in what ways?

Szakal: Actually it’s a great question. We could take the rest of the 22 minutes talking on this one question. I helped lead the President’s Commission on big data that Steve Mills
from IBM and -- I forget the name of the executive from SAP -- led. We
intentionally tried to separate cloud from big data architecture,
primarily because we don't believe that, in all cases, cloud is the
answer to all things big data. You have to define the architecture
that's appropriate for your business needs.

However, it
also depends on where the data is born. Take many of the investments
IBM has made into enterprise market management, for example, Coremetrics,
several of these services that we now offer for helping customers
gain deep insight into how their retail market or supply chain
behaves.

Born in the cloud

All
of that information is born in the cloud. But if you're talking about
actually using cloud as infrastructure and moving around huge sums of
data or constructing some of these solutions on your own, then some of
the ideas that Bob conveyed are absolutely applicable.

I
think it becomes prohibitive to do that and easier to stand up a hybrid
environment for managing the amount of data. But I think that you have
to think about whether your data is real-time data, whether it's data
that you could apply some of these new technologies like Hadoop to, Hadoop MapReduce-type solutions, or whether it's traditional data warehousing.

Data
warehouses are going to continue to exist and they're going to continue
to evolve technologically. You're always going to use a subset of data
in those data warehouses, and it's going to be an applicable technology
for many years to come.

Gardner: So suffice it
to say, an enterprise architect who is well versed in both cloud
infrastructure requirements, technologies, and methods, as well as big
data, will probably be in quite high demand. That specialization in one
or the other isn’t as valuable as being able to cross-pollinate between
them as it were.

Szakal: Absolutely. It's
enabling our architects and finding deep individuals who have this
unique set of skills, analytics, mathematics, and business. Those
individuals are going to be the future architects of the IT world,
because analytics and big data are going to be integrated into
everything that we do and become part of the business processing.

Gardner:
Well, that’s a great segue to the next topic that I am interested in,
and it's around mobility as a trend and also application development.
The reason I lump them together is that I increasingly see developers
being tasked with mobile first.

When you create a new
app, you have to remember that this is going to run in the mobile tier
and you want to make sure that the requirements, the UI,
and the complexity of that app don’t go beyond the ability of the
mobile app and the mobile user. This is interesting to me, because data
now has a different relationship with apps.

We used to
think of apps as creating data and then the data would be stored and it
might be used or integrated. Now, we have applications that are simply
there in order to present the data and we have the ability now to
present it to those mobile devices in the mobile tier, which means it
goes anywhere, everywhere all the time.

Let me start
with you Jim, because it’s security and risk, but it's also just
rethinking the way we use data in a mobile tier. If we can do it safely,
and that’s a big IF, how important should it be for organizations to
start thinking about making this data available to all of these devices
and just pour it out into that mobile tier as much as possible?

Hietala:
In terms of enabling the business, it’s very important. There are a lot
of benefits that accrue from accessing your data from whatever device
you happen to be on. To me, it is that question of "if," because now
there’s a whole lot of problems to be solved relative to the data
floating around anywhere on Android, iOS,
whatever the platform is, and the organization being able to lock down
their data on those devices, forgetting about whether it’s the
organization device or my device. There’s a set of issues around that
that the security industry is just starting to get their arms around
today.

Mobile ability

Gardner:
Chris, any thoughts about this mobile ability that the data gets more
valuable the more you can use it and apply it, and then the more you can
apply it, the more data you generate that makes the data more valuable,
and we start getting into that positive feedback loop?

Gerty:
Absolutely. It's almost an appreciation of what more people could do
and get to the problem. We're getting to the point where, if it's
available on your desktop, you’re going to find a way to make it
available on your device.

Those same security questions
probably need to be answered anyway, but making it mobile compatible is
almost an acknowledgment that there will be someone who wants to use it.
So let me go that extra step to make it compatible and see what I get
from them. It's more of a cultural benefit that you get from making
things compatible with mobile.

Gardner: Any
thoughts about what developers should be thinking by trying to bring the
fruits of big data through these analytics to more users rather than
just the BI folks or those that are good at SQL
queries? Does this change the game by actually making an application on
a mobile device simple and powerful, but accessing this real-time, updated
treasure trove of data?

Gerty: I always think of
the astronaut on the moon. He's got a big, bulky glove and he might
have a heads-up display in front of him, but he really needs to know
exactly a certain piece of information at the right moment, dealing with
bandwidth issues, dealing with the environment, foggy helmet, whatever.

It's
very analogous to what the day-to-day professional will use trying to
find out that quick e-mail he needs to know or which meeting to go to --
which one is more important -- and it all comes down to putting your
developer in the shoes of the user. So anytime you can get interaction
between the two, that’s valuable.

Gardner: Bob?

Weisman: From an enterprise architecture
point of view, my background is mainly defense and government, and in
defense, mobile computing has been around for decades. So you've always
been dealing with that.

The main thing is that in many
cases, if they're coming up with information, the whole presentation
layer is turning into another architecture domain with information
visualization and also with your security controls, with an integrated
identity management capability.

It's like you were
saying about the astronaut getting it right. He doesn't need to know
everything that’s happening in the world. He needs to know about his
heads-up display, the stuff that's relevant to him.

So
it's getting the right information to the right person in an authorized manner, in
a way that he can visualize and make sense of that information, be it
straight data, analytics, or whatever. The presentation layer, ergonomics,
visual communication are going to become very important in the future
for that. There are also a lot of problems. Rather than doing it at the
application level, you're doing it entirely in one layer.

Governance and security

Gardner:
So clearly the implications of data are cutting across how we think
about security, how we think about UI, how we factor in mobility. What
we now think about in terms of governance and security, we have to do
differently than we did with older data models.

Jim Hietala, what about the impact on spurring people towards more virtualized desktop
delivery, if you don't want to have the data on that end device, if you
want to solve some of the issues about control and governance, and if you
want to be able to manage just how much data gets into that UI, not too
much, not too little?

Do you think that some of these
concerns that we’re addressing will push people to look even harder,
and maybe be more aggressive in how they go to desktop and application
virtualization, as they say, keep it on the server, deliver out just the
deltas?

Hietala: That’s an interesting point.
I’ve run across a startup in the last month or two that is doing just
that. The whole value proposition is to virtualize the environment. You
get virtual gold images. You don't have to worry about what's actually
happening on the physical device and you know when the devices connect.
The security threat goes away. So we may see more of that as a solution
to that.

Gardner: Andras, do you see that
some of the implications of big data, far-fetched as it may be, are
propelling people to cultivate their servers more and virtualize their
apps, their data, and their desktop right up to the end devices?

Szakal:
Yeah, I do. I see IBM providing solutions for virtual desktop, but I
think it was really a security question you were asking. You're
certainly going to see an additional number of virtualized desktop
environments.

Ultimately, our network still is not
stable enough or at a high enough bandwidth to really make that a useful
exercise for all but the most menial users in the enterprise. From a
security point of view, there is a lot to be still solved.

And part of the challenge in the cloud environment that we see today is the proliferation of virtual machines (VMs)
and the inability to actually contain the security controls within
those machines and across these machines from an enterprise perspective.
So we're going to see more solutions proliferate in this area
to try to solve some of the management issues, as well as the security
issues, but we're a long way away from that.

Gardner:
Okay, I am going to put you on the spot a little bit, because I want
you to provide to us some examples of how you think big data is being
used in a way that's fundamentally different than traditional data.

If
you don't have permission to name these people, don't, but you can just
describe the use case. Let's just start with you, Chris. You probably
have quite a few in your own organization, but are there any ways that
you're aware of that people are using big data that illustrate how
fundamentally different and powerful this is going to be?

Most compelling

Gerty: We have several small projects that have come out of the events that we’ve worked on, such as the International Space Apps Challenge
I mentioned before. These are mostly in the visualization realm, but
it's the problems that go beyond those events that are really the most
compelling. I’ll briefly touch on one.

A challenge
that we’ve put out in the last Space Apps Challenge was to write an app
that would allow someone to use NASA data to allow a farmer anywhere in
the world to have an iPhone app or iPad app and say, "I live here. What should I grow? What could make me the most money and help my village the most?"

The
team that worked on it quickly realized that even great satellite data
didn't work for their application. There are too many other factors.
There was the local economy, the runoff levels, and things that they
just didn't have access to from the NASA data. So they decided that this
was more than just a weekend project and they wanted to build that data
set that they needed, so that they could finally make the product.

They
found other collaboration mechanisms to continue the project after the
Space Apps Challenge. They’ll be returning this year to the second one
that we do in April with an entirely different view on the world,
because they actually have some data sets now that they've been building
up. They made some mechanism to capture it from the local environment.

Gardner:
So that’s a great reminder that we’re not just talking about big data,
but we’re talking about multiple big data and which ones you can pull
together -- joined or otherwise -- to collate and produce big-data
analysis results for something very, very interesting.

Gerty:
Big data, by itself, isn't magical. It doesn't have the answers just by
being big. If you need more, you need to pry deeper into it. That’s the
example. They realized early enough that they were able to make
something good.

Gardner: Chris, that’s a very
good cause, but in a purely commercial sense, as we see more companies
doing cloud ecosystem and partnership activities, when they start to
share their data with that big "if" of secured and provisioned properly
with other people in their markets, in their businesses, very powerful
and interesting things can happen. Jim Hietala, any thoughts about
examples that illustrate where we’re going and why this is so important?

Hietala:
Being a security guy, I tend to talk about scare stories, horror
stories. One example from last year struck me. One of the major
retailers here in the U.S. hit the news for having predicted, through
customer purchase behavior, when people were pregnant.

They
could look and see, based upon buying 20 things, that if you're buying
15 of these and your purchase behavior has changed, they can tell that.
The privacy implications of that are somewhat concerning.

An
example was that this retailer was sending out coupons related to
somebody being pregnant. A teenage girl who was pregnant hadn't told
her family yet. The father found it. There was alarm in the household
and at the local retailer store, when the father went and confronted
them.

Privacy implications

There
are privacy implications from the use of big data. When you get
powerful new technology in marketing people's hands, things sometimes go
awry. So I'd throw that out just as a cautionary tale that there is
that aspect to this. When you can see across people's buying
transactions, things like that, there are privacy considerations that
we’ll have to think about, and that we really need to think about as an
industry and a society.

Gardner: Just because you can do something, doesn't necessarily mean you should.

Allen Brown:
Can I put some of the questions in and see how you can do with them?
The first one is more of a bit of a security question, but also concerns
things like thoughts on self-protecting data, like the Jericho Forum
issues, and another one that says, in terms of security, that big data
may not have strong confidentiality and availability requirements, but
for collaboration, doesn't integrity nearly always need to be considered?
Other examples are that there is no integrity requirement.

Gardner: Jim, I think it’s best directed to you to start. These are issues about control and management. Any thoughts?

Hietala:
I'll get straight to the integrity piece. The integrity of the data,
whether it’s on older platforms or big data, is certainly an issue. When
folks are using big data, that data has to have integrity, and there
have to be adequate controls to protect the data. So I think that is kind of
a fundamental thing for big data as well.

Gardner: Anyone else on these issues of protection?

Gerty: It’s not only a matter of data protection.
It's what we do with the data. Big data is a term that is kind of
heading towards the end of its usefulness, because it's not the data and
how large it is that's useful. It's actually how we apply these deep
analytics solutions, for example Watson.
You saw the Watson win on Jeopardy, but now Watson is a product that’s
being used to help some customers diagnose disease and work with the
insurance companies.

How you actually utilize that data
to derive value through this deep analytics solution is through a new
set of artificial-intelligence applications called cognitive computing.
So cognitive computing, how you drive all of this information, and how
you apply it in the context of its usefulness to privacy and security is
going to be huge in the following years.

Gardner: Allen, other questions from the audience or online?

Brown: Interoperability is the focus of a couple of questions.
One is asking if you can address the expected interoperability issues
across semantics of big data. The other part of it asks what
unique challenges or problems unstructured big data from Twitter, Facebook, and so on present.

Gardner:
This might be an area where the concepts work for traditional data, and
it might still be the case that we have to pull all these different
data types, structured and unstructured, together to work in some
holistic fashion. Bob, any thoughts about big data and the correlating of
different data? Is that different from the past? Is there something new?

Weisman:
I'm looking at techniques that were pioneered 20-30 years ago on the
artificial intelligence, knowledge-base system side, and they are still
relevant today. As a matter of fact they're more relevant than they've
ever been. There is a lot of opportunity, but it doesn’t forego having a good
interoperability architecture, understanding where your contacts are,
and being able to integrate data. Right now most analytics work is
kiboshed, because people spend all their time doing data integration,
versus analytics, and it’s a great waste of a lot of people's time.

So if you architect this from the get-go, get the proper metadata,
which will address some of the integrity, and understand the concept of
data quality which is what’s coming through, that will go a long way to
resolving some of these issues, but the architecture is going to be
key, as is rigorous planning.

More usable

Gardner: Andras, same question. Is there something new or different about treating data in order to make it more usable?

Szakal:
Big data is coming to us in all sorts of forms and formats. It’s coming
from different sources. We don't really know the validity. The validity
is determined by the application of the analytics solution. You'll have
to have some internal process, some governance process, to determine
whether you're getting the validity of the data that you expect.

When I was working as a graduate student for the psychology department as the SPSS
programmer, people would bring their work to me. They would try to
apply analytics to make any point they possibly could. It's the old
story about making statistics mean anything you want. But you have to be
very careful about how you do that, because it’s going to have a huge
impact on your business.

Gardner: Jim, in the
realm of privacy and security, any thoughts about what types of
unstructured content you may or may not want to bring in? Is this
something now that you need to consider, picking and choosing of data
types with an overview or lens towards security and privacy issues?

Hietala:
In terms of unstructured content, there’s a whole lot of work to be
done there to understand the growth of that stuff in the average enterprise
and what's really in unstructured content stores. A lot of that is
ending up in collaboration platforms today, and most organizations don’t
have a great understanding of what’s really in there.

Is there
regulated data in there, sensitive data in there? That’s an area
where there’s work to be done by most enterprises to understand that
unstructured content and the risk that it represents to the business.

Gardner:
We haven’t got into it, but another factor is the whole social sphere
of data, and information that is being generated constantly.

Brown: The next question is a concern about whether it's causing a disruption to object orientation.
Object-oriented data is encapsulated by the application, and making big
data shared seems to break this approach. What are your thoughts on
that?

Gardner: All right, from an architectural
standpoint we're treating data a little bit differently, separating it
entirely from an application or service.

Hietala:
We just did a study of this exact same question and problem. We
found that there's no official programming model of the big-data world
or in the cloud, although it is all about the client and integration
with services. But there are all sorts of programming models out there. I
would say that you apply the one that’s got the best and most
appropriate approach.

Information centric

Weisman:
It’s starting to put the emphasis back on the "information" in
information technology. Object orientation was meant to basically
support an information-centric approach, and now it’s being used much
more as a service-centric approach. Now we’re going to go back to a much
more information-centric, information-engineering approach and a lot of
the architecture enabled by big data.

Gardner:
Maybe you could just expand that a little bit for me? Does that mean we
have a different type of application? That is to say that data is the
application? What were the implications of what you just said?

Weisman:
When object orientation first came out, the idea was to take the data
and build services around it. Now, we have services that pass data back
and forth. Most organizations have hundreds of applications with
encapsulated data within them, and they can’t share it. Often the same
information is found in hundreds of applications, which causes a huge
security headache. Now we should be looking at getting much more
information-centric, which is the core of information technology,
information-related technology.

Gardner: So
really it's a flip architecturally, when you think about maintaining a
pooled resource of information, and applications are either newly built to
expose and leverage it, or your existing applications have to be brought
in, connected to it, and integrated. Fair enough?

Weisman:
I think it’s a separation between process-centric services and
information-centric services and harmonizing those. That will probably
be the best bang for the buck.

Gardner: So now
we're into IT transformation and business transformation, and you have
to rethink your data center and your entire apparatus for supporting
your storage. People are going to get into that anyway for some of the
reasons we talked about, but again, we could look at big data and say
this is an accelerator to some of those transformation efforts.

Brown:
Something that has been troubling me is around the data architecture.
Mike Walker, now at Dell, on the live stream, is asking what specific
guidance and best practices can you give to enterprise data architects
to properly architect their information architectures.

Weisman:
We're talking about that this afternoon. There’s going to be an entire
track, or two tracks, on data architecture, which will be providing the
guidance, and it’s big-data centric.

Gerty: You're still
going to be able to identify the service that provides the
authoritative source for a set of data and marry that with other
information, as necessary, whether it be sentiment analysis or what not,
but you're always going to have to be able to point to that
authoritative source.

Brown: Well, data architectures can be highly structured and big data can be somewhat unstructured. How do you marry the two?

Authoritative records

Gerty:
How do you marry the two? Transactional systems are still very
important. You have to be able to identify the authoritative records.
Big data usually comes from multiple sources and multiple, different
venues. The best example of the use of big data is around sentiment
analysis, taking feeds from Twitter, Facebook, and these multiple
sources, and then being able to analyze the information in the context
of the authoritative sources. So your analytics have to take all of this
into consideration.

Brown: Okay, we are just out of time. I just want to get a quick
comment on these two other live streams. How are companies dealing with
the shortage of big data scientists? Are they training current
employees?

Gardner: A key question is who is
actually spearheading this? Who is in the best position to be qualified?
Under whose auspices do these big data initiatives fall? Let’s start
with you, Chris. Any insight as to how you've done it at NASA?

Gerty:
I would draw a parallel from when I was in Mission Control and pretty
highly trained. They wipe your brain and fill it up with everything you
need to know, but we weren't really enabled to make those decisions,
until we went through the data, page by page, and looked at each
individual blip. If you can automate those, then you need fewer of
whoever is doing the job.

Automation there
would have helped us immensely to make those decisions on the fly,
rather than going over pages and pages of data from our batteries
charging. It's not maybe that you need more data scientists, but you
need the right data scientists. Then you need to be able to leverage off
of other people’s data scientists. That's why open source is so
attractive to us. You only need to do it once and then you can go off of
it.

Gardner: Jim Hietala, the people that should be doing this, their qualifications, certification, organizational structure, any thoughts?

Hietala:
It's way too early to certify people in this category right now. We
really need individuals who went to graduate school to understand the
proper application of analytics and mathematics. Those individuals would
be highly valuable and prized, especially as they learn how to apply
that knowledge to your business.

Gardner: It’s tough to find people who have both deep and wide expertise. Last word to you, Bob?

Weisman:
We have to take a look at career development within the CIO ranks.
Making sense of data requires good business knowledge, and too many
people are being isolated within the CIO ranks. They should be
circulating throughout the company, so they know what the company is
doing, and then come back in. It's much more valuable.

There
are some programs now that are joint ventures between the computer
science departments and the business schools, and I think those are at
the graduate level. As Andras was saying, they could provide people in
their early 30s who can really do a fantastic job, and we can really start
taking advantage of this.

Brown: That's all we have time for. I think you've done a marvelous job, thank you very much.

Gardner:
We’ve been talking with a panel of experts on how big data changes the
status quo for architecting the enterprise. We've heard how large
enterprises should better anticipate and prepare for the effects and
impacts of big data, as well as the simultaneous impacts of cloud computing
and mobile.

This special BriefingsDirect discussion
comes to you in conjunction with The Open Group Conference in Newport
Beach, California. I'd like to thank our panel: Robert Weisman, CEO and
Chief Enterprise Architect at Build The Vision; Andras Szakal, Vice
President and CTO of IBM's Federal Division; Jim Hietala, Vice President
for Security at The Open Group; and Chris Gerty, Deputy Program Manager
at the Open Innovation Program at NASA.

This is Dana
Gardner, Principal Analyst at Interarbor Solutions, your host and
moderator through these thought leadership interviews. Thanks again for
listening, and come back next time.

Transcript
of a BriefingsDirect podcast from The Open Group Conference in January
on how big data forces changes in architecting the enterprise. Copyright
The Open Group and Interarbor Solutions, LLC, 2005-2013. All rights
reserved.