Opinion
Took Me a While, Sorry I Was in the Cloud

My apologies to my readers. I have been quite busy: speaking at Enterprise Data World in San Francisco, then attending the Gartner conference in Vegas, then accepting a position as principal practice consultant in the enterprise information management practice at EMC Corporation. Yes, that's right, I said EMC Corporation! Even the world's premier storage company realizes that the value of information is paramount to its success as a storage vendor, and this means providing EIM services to its loyal customer base.

I am very proud to join the organization that is reinventing cloud computing. But this blog is not about my employers, nor is it about any vendor in particular – it's a forum for "us" to express thoughts and ideas about the state of Information Technology [IT] and enterprise information management [EIM]. I say "us" because a blog is only a one-sided conversation unless there is input from readers, and that is why I devoted the last entry to comments. Keep the comments coming and make this a dialog for industry awareness.

I started out by noting that I was "lost in the cloud", so let's talk about the idea of the "cloud" – what is it anyway? It's a virtualized information infrastructure on which the organization can rely, no matter what happens. Stated another way, our abstracted information (we neither know nor care where it resides) will be safe (secure), it will be reliable (trusted) and it will without question always be available when needed (delivered expeditiously, or timely). This idea that the "right people see the right data at the right time" is something that I have harped on often enough.

Think about it: what would happen if your organization could rely on its applications to get data quickly, no matter where the users physically were? The data would follow them. How about following the sun (or clock), so that data would be fastest wherever the processing was taking place? Now that's a concept whose time has come (it's almost here now).

The idea of a cloud is not unique to information; it is really the concept of virtualized resources, which is the foundation of layered abstraction. As I have written previously, layered abstractions allow for the simplification of complex problems.

One solution to extremely complex problems is to divide and conquer. The agent-based architecture is one way to solve an extremely complex, interconnected series of processes or events. It works on the premise that an agent, with minimal interaction with other agents, can solve a microscopic problem and then relay this "result" to an agent coordinator, which collates the results to create the solution. Global IDs, a very innovative data governance toolkit that provides for data discovery, profiling, mapping, analysis and the like, implements this architecture and allows for the wide-scale capture of enterprise information assets. In this, the age of information, it is surprising that corporations do not have a handle on their most important assets – all of their information. Capturing all of a corporation's assets is one of the most imperative needs of businesses today. Getting a handle on these assets is what governance is all about. Not to mention the need to secure information – most information loss/theft is actually caused by the employees of companies!
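To make the agent and coordinator idea concrete, here is a minimal sketch in Python. It is not how Global IDs or any other product works; the source names, the sample records and the functions `profile_agent` and `coordinate` are all made up for illustration. Each agent profiles one small source on its own, and the coordinator collates the results into an enterprise-wide view.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample data: each "source" is a tiny table of records.
SOURCES = {
    "crm": [{"email": "a@example.com"}, {"email": None}],
    "billing": [{"email": "b@example.com"}, {"email": "a@example.com"}],
}


def profile_agent(name, rows):
    """One agent: profile a single source without talking to other agents."""
    values = [row["email"] for row in rows]
    return {
        "source": name,
        "rows": len(values),
        "nulls": sum(v is None for v in values),
        "values": Counter(v for v in values if v is not None),
    }


def coordinate(sources):
    """Coordinator: run one agent per source, then collate their results."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda item: profile_agent(*item), sources.items()))

    # Count in how many sources each distinct value appears.
    seen_in = Counter()
    for result in results:
        seen_in.update(result["values"].keys())

    return {
        "total_rows": sum(r["rows"] for r in results),
        "total_nulls": sum(r["nulls"] for r in results),
        "shared_values": sorted(v for v, n in seen_in.items() if n > 1),
    }


summary = coordinate(SOURCES)
print(summary)
```

The point of the design is that no agent needs to know about any other agent; only the coordinator sees the whole picture, which is what lets the approach scale out across many sources.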

Once you have captured all of your assets, you need to determine their contents, their quality and where other copies reside. Lineage is also a necessity, and this is one of the major advantages of a tool like Global IDs – it is especially required in master data management (like CDI) but more generally required for all corporate assets. So the needs are: discovery, profiling, classification, verification, data quality analysis, mapping and lineage, movement (ETL), integration, stewardship and monitoring/analysis. This is one very complex problem, solved using a stack of integrated agents that share common metadata, so that each layer gains more insight into your assets.

Another solution to the complex problem is to allow for traceability across abstractions by utilizing a model-driven development paradigm (top-down), where the business architecture models are interconnected with the application, information and infrastructure architectures (i.e., services-based architectures – SOA, EDA, CEP, etc.) and metadata provides the glue between all of the layers (note this common theme of metadata being the link between abstractions). This would be nirvana, as a business user could select a process and see the complete impact on the enterprise's IT (applications, databases, infrastructure) with complete traceability in both directions (bottom-up and top-down). Conversely, it would show that turning off a server (in the infrastructure layer) affects these data elements, those applications and these specific business processes.
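Traceability in both directions can be pictured as walking a small graph of metadata links. The sketch below assumes a hypothetical set of links between business processes, applications, databases and servers; the element names and the functions `depends_on` and `impacted_by` are invented for illustration and do not come from any particular modeling tool.

```python
from collections import defaultdict, deque

# Hypothetical metadata links: (upper-layer element, lower-layer element it relies on).
LINKS = [
    ("process:year-end-close", "app:general-ledger"),
    ("process:invoicing", "app:billing"),
    ("app:general-ledger", "db:finance"),
    ("app:billing", "db:finance"),
    ("db:finance", "server:srv-042"),
    ("app:billing", "server:srv-007"),
]


def build_index(links):
    """Index the links in both directions so either kind of trace is cheap."""
    down, up = defaultdict(set), defaultdict(set)
    for upper, lower in links:
        down[upper].add(lower)
        up[lower].add(upper)
    return down, up


def reachable(start, edges):
    """Breadth-first walk returning every element reachable from start."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


DOWN, UP = build_index(LINKS)


def depends_on(element):
    """Top-down: everything this element relies on in the lower layers."""
    return sorted(reachable(element, DOWN))


def impacted_by(element):
    """Bottom-up: everything that is affected if this element is turned off."""
    return sorted(reachable(element, UP))


print(depends_on("process:year-end-close"))
print(impacted_by("server:srv-042"))
```

With links like these in place, the question "what happens if I turn off this server?" becomes a lookup rather than an experiment.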

Let me give an example of the above. At a major electronics vendor, I was brought in to decommission servers that were not being utilized to their fullest – average server utilization is about 14 percent for most enterprises. Since the enterprise was not documented, we would go to each server and capture information (TCP/IP, Server O/S, Make, Model, Applications Loaded, …). We were able to combine a number of under-utilized assets, thereby reducing the licensing cost of the enterprise enormously while reducing the number of machines to support, the data center footprint … We came across one server that showed no utilization and held data but no applications. We decided the only thing to do would be to "turn the server off" and see who screamed – one way of determining the impact! No one complained for months, so we threw the unit away. Almost a year later, the team learned that the CFO was frantically looking for his "year-end accounting server"!

Had this environment been in a cloud (or virtualized), its utilization would have been near 60 percent or more, and it would have been highly scalable: just add more servers, and load balancing automatically makes use of the additional computing capacity. The virtualized servers would abstract away the question of "which server is my application running on?"
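A quick back-of-the-envelope calculation shows why those utilization numbers matter. The sketch below uses the 14 percent and 60 percent figures from the text; the fleet of 100 servers is a made-up number, and the function `hosts_needed` assumes that all machines have equal capacity and that load can be pooled freely, which real environments only approximate.

```python
def hosts_needed(server_count, avg_utilization_pct, target_utilization_pct):
    """Estimate how many hosts carry the same load at a higher utilization.

    Percentages are whole numbers so the arithmetic stays exact.
    Assumes equal-capacity machines and freely poolable load.
    """
    if not 0 < target_utilization_pct <= 100:
        raise ValueError("target utilization must be between 1 and 100 percent")
    total_load = server_count * avg_utilization_pct
    # Integer ceiling division: round up to a whole host.
    return -(-total_load // target_utilization_pct)


fleet = 100  # hypothetical fleet size
consolidated = hosts_needed(fleet, 14, 60)
print(f"{fleet} servers at 14% need about {consolidated} hosts at 60%")
```

Under those assumptions, 100 lightly loaded servers shrink to roughly 24 hosts, which is where the savings in licensing, support and data center footprint come from.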

So, cloud infrastructures allow us to solve a number of problems while saving corporations big $$$ by making the support team faster, more efficient and more effective, and by making the environment auto-provisioning, highly reliable and secure, all while allowing for centralized management and safety (backup, recovery) of all assets in real time. The idea of virtualization is really catching on, and you will want to keep your "head in the clouds", so to speak.

Again, keep those comments coming so our dialog becomes a solution to the challenges we all face in Information Technology and EIM. Feel free to ask questions, and I will try to find the answers for you…