Since the birth of Hadoop in 2005-06, the way we think about storing and processing information has evolved considerably. The term “Big Data” has become synonymous with this evolution. But still, many of our customers continue to ask, “What is Big Data?”, “What are its use cases?”, and “What is its business value?”. The Internet is overloaded with definitions, characteristics, and benefits; however, few discussions synthesize all three of these topics in one place. This paper answers these questions and proposes a total cost calculation framework for CTOs and CIOs who are evaluating solutions for their organization’s use case(s). In the text below, I examine an on-premises Hadoop ecosystem as a general purpose Big Data solution in relation to alternative commercial purpose-built storage technologies (e.g., Oracle, Teradata, IBM, SAP, Microsoft, EMC). It may be difficult to determine the exact point at which you should leverage one over the other. It is my contention that when the total cost of using all your data exceeds what you are able to spend using purpose-built technologies, it is time to consider using a general purpose solution like Hadoop for process offloading.

What is Big Data?

According to Edd Dumbill, a well-respected thought leader and VP of Strategy for Silicon Valley Data Science, a big data and data science consulting company, Big Data is “data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or does not fit the structures of your database architectures. To gain value from this data, you must choose an alternative way to process it.” This definition was published in an article entitled “What is Big Data?” in Big Data Now: 2012 Edition by O’Reilly Media, and touches on the three primary characteristics of data [1]:

Volume: The size of your data. Data has mass, and there is a cost to moving it around the network.

Velocity: The speed with which the data either arrives or is created, and how quickly it needs to be consumed in order to make use of it.

Variety: The differences in structure between all the types of data within an enterprise.

Scour the Internet and you will find other, less commonly discussed but relevant characteristics associated with Big Data, including Veracity and Volatility [2]. Veracity refers to the truthfulness of the data, or your degree of trust in what it is conveying. Volatility is how often your existing data changes or is updated by the new data you are receiving or creating. There are certainly other characteristics as well. I recently began using the term Viscosity to describe the degree of data fragmentation in client environments, and the level of effort required to reassemble it into a coherent view. In this context, organizations with high viscosity have significant fragmentation and duplication throughout their enterprise.

The term “Big Data” has come to focus on these characteristics and to imply that traditional database architectures, such as purpose-built online transaction processing (OLTP) and online analytical processing (OLAP) technologies, simply will not scale to meet your data capacity needs. However, massively parallel processing (MPP) database architectures are one example of purpose-built technologies that have been developed to support both OLTP and OLAP data structures at enormous scales, up into the petabytes (PB). For example, there is a 50 PB MPP cluster at eBay [3]. Certainly this size conforms to any logical definition of Big Data.

Depending on your use case, it is possible that a purpose-built technology may suit your needs at scale. While MPP systems remain most effective with structured, tabular and transactional data sets, it is possible to store almost everything except massive files in relational structures. However, this may not be the best fit for your use case(s) in terms of appropriateness or cost. There is limited published pricing data for commercial offerings, but MPP systems are notably expensive. When including the cost of software, hardware, and licensing/support, the cost per terabyte (TB) of an MPP system is estimated at tens of thousands of dollars [4]. At these prices, a one PB system can cost tens of millions of dollars. In contrast, the equivalent cost of Hadoop is roughly $2,000 per TB, which puts a one PB Hadoop cluster at roughly $2 million. That is a significant initial cost savings; however, use case(s) will always drive the total cost of any solution.
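The per-terabyte arithmetic above can be sketched in a few lines of Python. The $2,000 per TB Hadoop figure comes from the text; the MPP figure of $20,000 per TB is only an assumed placeholder for "tens of thousands of dollars", and the helper name `cluster_cost` is hypothetical.

```python
# Decimal convention: 1 PB = 1,000 TB.
TB_PER_PB = 1_000

HADOOP_COST_PER_TB = 2_000   # figure quoted in the text
MPP_COST_PER_TB = 20_000     # ASSUMPTION: placeholder for "tens of thousands"


def cluster_cost(capacity_tb: float, cost_per_tb: float) -> float:
    """Return the up-front cost of a cluster of the given capacity."""
    return capacity_tb * cost_per_tb


hadoop_cost = cluster_cost(TB_PER_PB, HADOOP_COST_PER_TB)
mpp_cost = cluster_cost(TB_PER_PB, MPP_COST_PER_TB)

print(f"Hadoop, 1 PB: ${hadoop_cost:,.0f}")   # $2,000,000
print(f"MPP, 1 PB:    ${mpp_cost:,.0f}")      # $20,000,000 under the assumed rate
```

Substituting a quoted vendor price for the placeholder rate turns this into a first-pass comparison; it covers initial cost only, not the use-case-driven total cost discussed later.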

Coincidentally, eBay has also released information on their production 50 PB Hadoop cluster, one of the largest such clusters in the world [5]. The fact that eBay uses both types of systems demonstrates that there is a place for each, and that the difference may come down to price and purpose. Given the relatively lower cost of Hadoop, I submit that it is easier to identify Big Data if we add cost to our definition. Therefore, Big Data is the result when (a) the sum of all your data’s characteristics coupled with (b) the resources required to achieve your use case exceeds (c) the cost you are willing/able to spend using traditional approaches. When that inflection point is reached, it is clearly time to consider other, non-traditional approaches for process offloading. Each unique situation warrants a cost/benefit analysis to determine if a general-purpose solution like Hadoop is right for your use case.

What are the Use Cases for Big Data?

Process offloading refers to the act of moving workloads from one implementation to another to achieve better suitability, performance, availability, etc., at a lower price point. Both traditional and non-traditional solutions have advantages and disadvantages given a particular workload, and they should be leveraged accordingly to maximize cost efficiencies. Let us examine Hadoop’s use cases for process offloading.

Hadoop is composed of two major components: the Hadoop Distributed File System (HDFS) and MapReduce, a framework for writing applications to process large amounts of content over multiple nodes (servers). Hadoop is often referred to as a schemaless system because data is not forced into a schema upon ingest. Ultimately, there is a structure known as the key/value pair, in which data is expressed as a collection of [key]->[value] tuples or records. This is one of the most fundamental data structures in computer science. Hadoop uses the key/value pair because nearly any data can be expressed, stored, processed and retrieved using this minimal structure. Because key/value is so rudimentary, a schema can be applied at query time based on the question being asked. This adds tremendous flexibility and differs significantly from traditional approaches like OLTP and OLAP, which require you to define the data model up front and to understand the questions you intend to ask. Figure 1 illustrates these different process flow models. Having to know what questions you intend to ask, and constructing a pre-defined schema, adds artificial constraints to the answers you are able to get from the data.
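To make the idea of applying a schema at query time concrete, here is a minimal sketch in plain Python. It does not use the Hadoop API; the record layout, the sample values, and the function names (`parse_event`, `map_phase`, `reduce_phase`) are all assumptions chosen for illustration. Records are kept as raw key/value pairs at ingest, and the structure is imposed only when a question is asked.

```python
from collections import defaultdict

# Raw records stored as (key, value) pairs; the value is left unparsed at ingest.
# ASSUMPTION: a hypothetical comma-separated event log.
raw_records = [
    ("log-0001", "2014-03-01,alice,purchase,19.99"),
    ("log-0002", "2014-03-01,bob,refund,5.00"),
    ("log-0003", "2014-03-02,alice,purchase,42.50"),
]


def parse_event(value):
    """Schema applied at read time: interpret the raw value for this question."""
    date, user, event, amount = value.split(",")
    return {"date": date, "user": user, "event": event, "amount": float(amount)}


def map_phase(records):
    """Emit (user, amount) pairs for purchases only."""
    for _key, value in records:
        row = parse_event(value)
        if row["event"] == "purchase":
            yield row["user"], row["amount"]


def reduce_phase(pairs):
    """Sum the amounts emitted for each user."""
    totals = defaultdict(float)
    for user, amount in pairs:
        totals[user] += amount
    return dict(totals)


purchase_totals = reduce_phase(map_phase(raw_records))
print(purchase_totals)
```

A different question, such as refunds per day, would need only a different parse and map step over the same stored pairs; nothing about the stored data has to change.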

Another issue with schema-based systems is scalability. Traditional relational architectures scale vertically with ease, but are difficult to design for horizontal scaling because their rigid data structures (tables, table relationships, rows, columns, indices) must be sharded, or split, across multiple nodes. The integrity of these structures must be maintained while offering near-real-time (online) create, read, update and delete (CRUD) operations on data. This is not trivial, and it requires commercial companies to make significant financial investments to do it well, which drives up the cost of those solutions. As a schemaless system, the latest release of Hadoop (2.x) scales horizontally to 10,000+ nodes without the added complexity inherent to traditional MPP systems [6].
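One reason independent key/value records are easier to spread across nodes is that each record can be placed using nothing but its own key. The sketch below shows simple hash partitioning in plain Python; it is an illustration of the general technique, not Hadoop's internal placement logic, and the function name `node_for_key` and the sample keys are hypothetical.

```python
import hashlib


def node_for_key(key: str, num_nodes: int) -> int:
    """Map a record key to a node index using a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes


# ASSUMPTION: hypothetical record keys and a four-node cluster.
keys = [f"record-{i}" for i in range(1000)]
placement = {key: node_for_key(key, 4) for key in keys}

# Count how many records land on each node.
load = [0, 0, 0, 0]
for node in placement.values():
    load[node] += 1
print(load)
```

No record needs to know about any other record to be placed, which is the property that relational tables with foreign keys and shared indices lack when they are sharded.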

Many organizations have purpose-built solutions for asking business intelligence questions, providing disaster recovery/backup, etc., but scaling these solutions beyond an initial, narrowly defined usage for structured data usually involves significant cost increases. As a schemaless computational file system, Hadoop can be applied to an almost endless set of challenges at a lower cost. Below we walk through six higher-order use cases to illustrate how these savings can be realized:

1. Raw Storage/Data Lake: Backing up all the data your enterprise collects and creates daily, to include its historical holdings, for continuity of operations (COOP) and disaster recovery (DR) has previously been too expensive, and therefore unfeasible. Instead, businesses make difficult tradeoffs as to what will and will not be recoverable should disaster strike. Imagine the possibilities if you were able to economically store everything in your enterprise for the price of traditional commodity hard disks. Fortunately, Hadoop makes this dream a reality with its internally redundant data structure that by default makes three copies of all data written to HDFS. This scalable, schemaless raw storage lends itself conceptually to what is now being called a “data lake”. A data lake is based on the notion that data can be tagged with metadata about its source, contents, structure and other characteristics. These properties stay with the data as it is minimized into key-value pairs and written to the Hadoop file system. To process the data, all one needs to know is what data they wish to process leveraging these properties. This allows many different types of data to exist side-by-side within the simple structures of the Data Lake. The amount of pre-processing is minimal, as data is no longer fit into specific schemas up-front, making the data accessible to a wider variety of purposes. This would not be cost effective using traditional commercial systems.

2. Multi-Format Data Analysis: There are many different types of data beyond structured and unstructured text, to include audio, video, and images. Analyzing structured and unstructured text at scale can be an expensive and difficult challenge, but analyzing large collections of digital media is not even possible using traditional relational systems. Many businesses have previously been unable to unlock the potential of their data holdings due to an inability to process digital content, such as the ability to analyze and track objects in video, or to identify and extract biomarkers in healthcare images. HDFS accepts all these formats for analysis without the need for a schema. Hadoop’s ability to work with unstructured text and binary data (audio, video, imagery) extends well beyond the native capabilities offered by existing storage solutions, providing an enormous capability advantage.

3. Data Cleansing/Transformation: Businesses often contend with multiple relational data models, unstructured text and streaming data. You likely need to correlate, cleanse, de-duplicate, synchronize and normalize/de-normalize these data sets as they move between databases and tools to create a complete, clean operating picture for downstream analysis. The vast majority of work in conducting analytics is often preparing the data for use. In addition, new initiatives to leverage autonomous self-reporting devices and sensors provide continuous streams of data, creating explosions in the amount of information if used in their raw form. Purpose-built technologies present challenges when attempting these types of tasks due to their reliance on schemas. General purpose solutions, like the Hadoop ecosystem, deliver an economical way of storing, pre-processing and/or summarizing these data sets and streams, thereby containing the growth in commercial licensing investments within your enterprise.

4. Data Exploration: When new questions arise, the relevant variables and their relationships must be identified from within your data before you can begin to calculate definitive answers. However, these elements are not always understood, nor are the best algorithms for analyzing the data. Exploration is often required in order to build a model that will answer the questions being asked. Traditional relational architectures with pre-defined schemas are not likely to provide a platform for discovery. In these cases, identifying key variables and useful analytic methods is a trial/error process. Hadoop provides a flexible, schemaless environment that reduces the friction associated with the iterative process of exploring and analyzing data when the model is unclear. Hadoop provides a sandbox for exploring data without having to increase commercial capacity or spend the time building new schemas.

5. Data Science & Personalization: Data science leverages tools and techniques from many different areas of study, to include statistics, machine learning, mathematics, probability/uncertainty modeling, etc., to surface meaning from data and generate data-driven products. This is essentially the art of making data actionable, either by a user or a machine. Data science is not exclusive to Big Data, but there is tremendous knowledge potential in large data sets. One use of data science is personalization, the act of exhaustively analyzing large quantities of related data, such as the online behaviors of millions of Internet users, in order to calculate recommendations for a specific individual. The results are then presented in the form of “you might also like” books, movies, and other targeted advertisements. These techniques are also being applied to healthcare, where symptoms, genetics, treatments, and outcomes are being analyzed to optimize treatment for specific individuals. Hadoop is a well-suited platform for collecting, synthesizing, munging, cleaning and joining disparate data sets for analysis to achieve decision-relevant insight.

6. Data Anonymization: Certain industries, perhaps healthcare more than any other, require anonymized data for research. Rules governing the release of such data to the public generally require that the information contain no personally identifiable information (PII). In the case of healthcare specifically, the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule, with which most covered entities have had to comply since 2003, governs the use and disclosure of protected health information (PHI). The Privacy Rule does not give a specific algorithm for achieving the level of de-identification required, and there are many ways to approach anonymization in general, which depend greatly on how the data will be used. If a portion of your business model relies on providing anonymized data to internal or external groups for analysis, you want that process to be as clear, efficient, and repeatable as possible. Hadoop provides a solution for codifying and institutionalizing these algorithms for your enterprise. This increases the speed and effectiveness of all groups depending on anonymized data, providing them with an approved, documented process, and an authoritative source from which to receive data.
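As an illustration of what codifying an anonymization step might look like, the sketch below drops direct identifiers, replaces the record identifier with a keyed pseudonym, and generalizes age into ten-year bands. It is plain Python rather than a Hadoop job, and every field name, the key, and the banding rule are assumptions; it makes no claim to satisfy the Privacy Rule or any other regulation.

```python
import hashlib
import hmac

# ASSUMPTION: a secret held by the data owner; in practice it would come
# from a managed secret store, never from source code.
SECRET_KEY = b"replace-with-a-managed-secret"

# ASSUMPTION: hypothetical field names treated as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "ssn", "phone"}


def pseudonym(value: str) -> str:
    """Return a stable keyed pseudonym so records can still be joined."""
    mac = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return mac.hexdigest()[:16]


def anonymize(record: dict) -> dict:
    """Drop direct identifiers, pseudonymize the id, and band the age."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["patient_id"] = pseudonym(str(record["patient_id"]))
    decade = (record["age"] // 10) * 10
    cleaned["age"] = f"{decade}-{decade + 9}"
    return cleaned


sample = {
    "patient_id": 1001,
    "name": "Jane Example",
    "ssn": "000-00-0000",
    "phone": "555-0100",
    "age": 47,
    "diagnosis": "hypertension",
}
print(anonymize(sample))
```

Keeping the rules in one reviewed function is what makes the process repeatable: every group receiving data gets the output of the same approved transformation.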

This list is not meant to be exhaustive, and there are definitely other use cases. Each use case is applicable to a wide variety of domains, to include finance, cyber, healthcare, defense, and scientific research.

What is the Business Value of Big Data?

Our new definition of Big Data (when the cost of using all your data for your use case exceeds what you are able to spend using purpose-built technologies) lends itself to a cost/benefit analysis. Figure 2 establishes a rubric through which to express the decision calculus of Big Data for process offloading. This framework illustrates the components of cost, discussed below, that every CIO and CTO should take into account when evaluating solutions for their use cases.

Projects should always start with gathering and analyzing requirements. In an analytic context, these are the questions you want to ask of your data or, more generally, how you intend to use the data you would like to store in Hadoop. These requirements determine which data assets are relevant.

The Data / Characteristics (AS-IS) corner of the triangle refers to all data related to your requirements, and all the attributes discussed earlier, to include the amount, how quickly it grows/changes, differences in type/structure, where it resides on the network, etc.

Once the associated data has been identified, a solution is identified and designed. In the case of data analysis, the solution often involves models and techniques to change and analyze the data to find answers. Overall, this step includes any processes, human or machine, that are necessary to get the results you are looking for.

The Purpose / Answers (TO-BE) corner is your end-state vision, which is sometimes expressed in terms of success criteria and/or key performance indicators. In the case of data science, this corner represents the answers you want from your data, in addition to how users should expect to access those answers, and how frequently the answers need to be updated (real-time, hourly, daily, monthly, etc.).

Lastly, there are often numerous ways for this solution to be physically implemented. Each possible implementation requires specific people, intellect (expertise, experience), technology (licenses, support), time, and physical capital (power, space, cooling) to assemble and extend (write algorithms, or build solutions on top of) the desired end-state. There are many factors here to consider. For example, certain software licenses charge by the number of users, which may limit your derived business value (in terms of productivity) if that cost prevents your entire team from leveraging the software. In addition, the more data you have, the more physical or virtual compute resources you may need.

Together, these elements influence the total cost of the solution. Ultimately, cost is the tipping point that can cause you to change the scope of your requirements and timeline, the data you use, the models/techniques you employ, the answers you are able to achieve and the algorithms/technologies you implement. Often, it is necessary to find an affordable balance to achieve the organization’s goals and objectives. However, these trade-offs may cause you to compromise certain business objectives, and reduce the business value derived from the solution.
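The cost components named above (people, intellect, technology, time, physical capital) can be summed per candidate implementation and compared against a budget. The sketch below is one hypothetical way to lay that out in Python; the class name, the field names, and every dollar figure are assumptions for illustration, not data from this paper.

```python
from dataclasses import dataclass


@dataclass
class Implementation:
    """Cost components of one candidate implementation, in dollars."""
    name: str
    people: float
    expertise: float
    technology: float         # licenses, support
    time: float               # cost attributed to schedule
    physical_capital: float   # power, space, cooling

    def total_cost(self) -> float:
        return (self.people + self.expertise + self.technology
                + self.time + self.physical_capital)


def within_budget(options, budget):
    """Return the names of options whose total cost fits the budget."""
    return [o.name for o in options if o.total_cost() <= budget]


# ASSUMPTION: invented figures, used only to exercise the calculation.
options = [
    Implementation("purpose-built", 400_000, 100_000, 3_000_000, 200_000, 300_000),
    Implementation("general-purpose", 600_000, 300_000, 500_000, 400_000, 300_000),
]

for option in options:
    print(f"{option.name}: ${option.total_cost():,.0f}")
print(within_budget(options, budget=2_500_000))
```

When no option fits the budget, the remaining levers are the ones the text lists: narrow the requirements, reduce the data in scope, or change the techniques and technologies.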

The business value of Hadoop is the result of overcoming the functional limitations established by the cost of scaling purpose-built technologies, and having to make fewer compromises to achieve your data-driven business objectives. This relationship between cost and business value is illustrated in Figure 2. By managing (containing or reducing) cost, it becomes possible to maintain or broaden your scope and implement the solution that is right for you. Hadoop may allow you to get more from your data, with a significantly lower cost investment, resulting in tangible economic value. If Hadoop is able to satisfy your use case, then it is likely you will benefit from cost containment (and possibly savings) by preventing or reducing the expansion of more expensive purpose-built technologies.

Conclusion

It is important to choose the right technology for your particular use case. Hadoop continues to mature as a widely supported open source solution nearing its ten-year anniversary. It is also supported by several commercial vendors offering on-site support. Depending on your particular use case(s), Hadoop may or may not be the best solution. Some Big Data is consistent, known, structured, and aligns well to the use cases best served by purpose-built technologies. However, when you do not have that, or cost constraints limit your business value, it is time to consider using a general purpose solution like the Hadoop ecosystem for process offloading. The framework presented in this paper provides a lens for CIOs/CTOs to examine potential solutions, business objectives, and cost constraints. Hadoop’s low cost and broad applicability are definitely worth exploring. I recommend you conduct your own cost/benefit analysis to determine if Hadoop is right for you and your use case(s). You may find that relative to commercial products, Hadoop will allow you to achieve greater business value and substantial cost savings.
The cloud promises new levels of agility and cost-savings for Big Data, data warehousing and analytics. But it’s challenging to understand all the options – from IaaS and PaaS to newer services like HaaS (Hadoop as a Service) and BDaaS (Big Data as a Service). In her session at @BigDataExpo at @ThingsExpo, Hannah Smalltree, a director at Cazena, provided an educational overview of emerging “as-a-service” options for Big Data in the cloud. This is critical background for IT and data professionals...

Internet of @ThingsExpo has announced today that Chris Matthieu has been named tech chair of Internet of @ThingsExpo 2017 New York
The 7th Internet of @ThingsExpo will take place on June 6-8, 2017, at the Javits Center in New York City, New York.
Chris Matthieu is the co-founder and CTO of Octoblu, a revolutionary real-time IoT platform recently acquired by Citrix. Octoblu connects things, systems, people and clouds to a global mesh network allowing users to automate and control design flo...

The WebRTC Summit New York, to be held June 6-8, 2017, at the Javits Center in New York City, NY, announces that its Call for Papers is now open. Topics include all aspects of improving IT delivery by eliminating waste through automated business models leveraging cloud technologies. WebRTC Summit is co-located with 20th International Cloud Expo and @ThingsExpo. WebRTC is the future of browser-to-browser communications, and continues to make inroads into the traditional, difficult, plug-in web co...

Amazon has gradually rolled out parts of its IoT offerings, but these are just the tip of the iceberg. In addition to optimizing their backend AWS offerings, Amazon is laying the ground work to be a major force in IoT - especially in the connected home and office.
In his session at @ThingsExpo, Chris Kocher, founder and managing director of Grey Heron, explained how Amazon is extending its reach to become a major force in IoT by building on its dominant cloud IoT platform, its Dash Button strat...

Complete Internet of Things (IoT) embedded device security is not just about the device but involves the entire product’s identity, data and control integrity, and services traversing the cloud. A device can no longer be looked at as an island; it is a part of a system. In fact, given the cross-domain interactions enabled by IoT it could be a part of many systems. Also, depending on where the device is deployed, for example, in the office building versus a factory floor or oil field, security ha...

In addition to all the benefits, IoT is also bringing new kind of customer experience challenges - cars that unlock themselves, thermostats turning houses into saunas and baby video monitors broadcasting over the internet. This list can only increase because while IoT services should be intuitive and simple to use, the delivery ecosystem is a myriad of potential problems as IoT explodes complexity. So finding a performance issue is like finding the proverbial needle in the haystack.

The idea of comparing data in motion (at the sensor level) to data at rest (in a Big Data server warehouse) with predictive analytics in the cloud is very appealing to the industrial IoT sector. The problem Big Data vendors have, however, is access to that data in motion at the sensor location.
In his session at @ThingsExpo, Scott Allen, CMO of FreeWave, discussed how as IoT is increasingly adopted by industrial markets, there is going to be an increased demand for sensor data from the outermos...

Data is the fuel that drives the machine learning algorithmic engines and ultimately provides the business value.
In his session at 20th Cloud Expo, Ed Featherston, director/senior enterprise architect at Collaborative Consulting, will discuss the key considerations around quality, volume, timeliness, and pedigree that must be dealt with in order to properly fuel that engine.

In his general session at 19th Cloud Expo, Manish Dixit, VP of Product and Engineering at Dice, discussed how Dice leverages data insights and tools to help both tech professionals and recruiters better understand how skills relate to each other and which skills are in high demand using interactive visualizations and salary indicator tools to maximize earning potential.
Manish Dixit is VP of Product and Engineering at Dice. As the leader of the Product, Engineering and Data Sciences team at D...

SYS-CON Events has announced today that Roger Strukhoff has been named conference chair of Cloud Expo and @ThingsExpo 2017 New York.
The 20th Cloud Expo and 7th @ThingsExpo will take place on June 6-8, 2017, at the Javits Center in New York City, NY.
"The Internet of Things brings trillions of dollars of opportunity to developers and enterprise IT, no matter how you measure it," stated Roger Strukhoff. "More importantly, it leverages the power of devices and the Internet to enable us all to im...

Whether your IoT service is connecting cars, homes, appliances, wearable, cameras or other devices, one question hangs in the balance – how do you actually make money from this service? The ability to turn your IoT service into profit requires the ability to create a monetization strategy that is flexible, scalable and working for you in real-time. It must be a transparent, smoothly implemented strategy that all stakeholders – from customers to the board – will be able to understand and comprehe...

"Once customers get a year into their IoT deployments, they start to realize that they may have been shortsighted in the ways they built out their deployment and the key thing I see a lot of people looking at is - how can I take equipment data, pull it back in an IoT solution and show it in a dashboard," stated Dave McCarthy, Director of Products at Bsquare Corporation, in this SYS-CON.tv interview at @ThingsExpo, held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA.

What happens when the different parts of a vehicle become smarter than the vehicle itself? As we move toward the era of smart everything, hundreds of entities in a vehicle that communicate with each other, the vehicle and external systems create a need for identity orchestration so that all entities work as a conglomerate. Much like an orchestra without a conductor, without the ability to secure, control, and connect the link between a vehicle’s head unit, devices, and systems and to manage the ...

Everyone knows that truly innovative companies learn as they go along, pushing boundaries in response to market changes and demands. What's more of a mystery is how to balance innovation on a fresh platform built from scratch with the legacy tech stack, product suite and customers that continue to serve as the business' foundation.
In his General Session at 19th Cloud Expo, Michael Chambliss, Head of Engineering at ReadyTalk, discussed why and how ReadyTalk diverted from healthy revenue and mor...

As data explodes in quantity, importance and from new sources, the need for managing and protecting data residing across physical, virtual, and cloud environments grow with it. Managing data includes protecting it, indexing and classifying it for true, long-term management, compliance and E-Discovery. Commvault can ensure this with a single pane of glass solution – whether in a private cloud, a Service Provider delivered public cloud or a hybrid cloud environment – across the heterogeneous enter...

You have great SaaS business app ideas. You want to turn your idea quickly into a functional and engaging proof of concept. You need to be able to modify it to meet customers' needs, and you need to deliver a complete and secure SaaS application. How could you achieve all the above and yet avoid unforeseen IT requirements that add unnecessary cost and complexity? You also want your app to be responsive in any device at any time.
In his session at 19th Cloud Expo, Mark Allen, General Manager of...

Financial Technology has become a topic of intense interest throughout the cloud developer and enterprise IT communities.
Accordingly, attendees at the upcoming 20th Cloud Expo at the Javits Center in New York, June 6-8, 2017, will find fresh new content in a new track called FinTech.

The 20th International Cloud Expo has announced that its Call for Papers is open. Cloud Expo, to be held June 6-8, 2017, at the Javits Center in New York City, brings together Cloud Computing, Big Data, Internet of Things, DevOps, Containers, Microservices and WebRTC to one location.
With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding business opportunity. Submit your speaking proposal ...

Bert Loomis was a visionary. This general session will highlight how Bert Loomis and people like him inspire us to build great things with small inventions. In their general session at 19th Cloud Expo, Harold Hannon, Architect at IBM Bluemix, and Michael O'Neill, Strategic Business Development at Nvidia, discussed the accelerating pace of AI development and how IBM Cloud and NVIDIA are partnering to bring AI capabilities to "every day," on-demand. They also reviewed two "free infrastructure" pr...

The holiday season is nearly upon us (I’ve already heard Christmas songs being played…really?) and retailers are usually the big winners during the holiday season. However, leading retailers are already thinking beyond the current holiday season, and not just from marketing and merchandising perspectives. These leading retailers are considering how this holiday season – and the resulting wealth of customer, product and operational data – can be converted into new analytic insights that can be used to optimize key business processes, uncover new monetization opportunities and create a more comp...

I was on a high-rise construction site 34-floors above the city. I was talking to the construction crew when a fight broke out. There was an explosion and the floor collapsed. I removed the virtual reality (VR) goggles and laughed. It was so real. The VR solutions provided an incredible experience, almost like being there. As good as my experience was, it was not reality. It was a controlled pre-programmed experience - a notional idea. Today, however, VR and sensor technologies enable a notional idea to become reality – a Real-Reality.

The cloud promises new levels of agility and cost-savings for Big Data, data warehousing and analytics. But it’s challenging to understand all the options – from IaaS and PaaS to newer services like HaaS (Hadoop as a Service) and BDaaS (Big Data as a Service). In her session at @BigDataExpo at @ThingsExpo, Hannah Smalltree, a director at Cazena, provided an educational overview of emerging “as-a-service” options for Big Data in the cloud. This is critical background for IT and data professionals, as experts estimate that “as-a-service” cloud sourcing will increase from today’s 15% to 35% by 20...

Internet of @ThingsExpo has announced today that Chris Matthieu has been named tech chair of Internet of @ThingsExpo 2017 New York
The 7th Internet of @ThingsExpo will take place on June 6-8, 2017, at the Javits Center in New York City, New York.
Chris Matthieu is the co-founder and CTO of Octoblu, a revolutionary real-time IoT platform recently acquired by Citrix. Octoblu connects things, systems, people and clouds to a global mesh network allowing users to automate and control design flows, processes and sensor data, and analyze/react to real-time events and messages as well as big dat...

As we enter the final week before the 19th International Cloud Expo | @ThingsExpo in Santa Clara, CA, it's time for me to reflect on six big topics that will be important during the show. Hybrid Cloud: This general-purpose term seems to provide a comfort zone for many enterprise IT managers. It sounds reassuring to be able to work with one of the major public-cloud providers like AWS or Microsoft Azure while still maintaining an on-site presence.

2016 brought about more cyberattacks than we thought possible, especially involving ransomware, and we definitely won't see that trend breaking stride in 2017. By next year, we expect every single adult in the US will know a blood relative who has had their identity stolen. The Internal Revenue Service reported that 2.7 million people had their identities stolen in 2014, and according to TransUnion, 19 people fall victim to identity theft every minute.

For large enterprise organizations, it can be next-to-impossible to identify attacks and act to mitigate them in good time. That’s one of the reasons executives often discover security breaches when an external researcher — or worse, a journalist — gets in touch to ask why hundreds of millions of logins for their company’s services are freely available on hacker forums.
The huge volume of incoming connections, the heterogeneity of services, and the desire to avoid false positives leave enterprise security teams in a difficult spot. Finding potential security breaches is like finding a tiny ne...

Monitoring Docker environments is challenging. Why? Because each container typically runs a single process, has its own environment, utilizes virtual networks, and has various methods of managing storage. Traditional monitoring solutions take metrics from each server and the applications it runs. These servers and the applications running on them are typically very static, with very long uptimes. Docker deployments are different: a set of containers may run many applications, all sharing the resources of one or more underlying hosts. It's not uncommon for Docker servers to run thousands of short-te...

The IoT continued its toddler-like growth and stumbles in 2016. Here are five trends to look for in 2017 as the IoT enters its adolescence and how to benefit from them.
1. Ecosystems begin to determine winners and losers
Previously these were nice in-the-future concerns; now they will really count. Filling out a whole product value proposition through partnerships has repeatedly proven its importance across B2B and enterprise software sectors. In the IoT, they will be even more critical.

My daughter called with a frantic message. She was driving my car (why she was driving my car when she has her own is the subject for another time) and a warning message appeared on the car console: “Engine overheated! Stop engine and allow to cool down” (see Figure 1).
Fortunately, my daughter was nearly home, so she got the car home, shut it down and called me immediately (I was on the road somewhere…Washington DC, Philadelphia, Knoxville, Chicago, Toronto…I don’t even remember where anymore). I called my trusty mechanic (Chuck) and he was able to work my car into the schedule when I got ba...

There’s a funny thing about digital transformation: we are simultaneously over-hyping it and understating it. On the one hand, every tech company in the world is talking about it. It doesn’t matter how mundane the technology; every company is somehow relating their products to digital transformation.
On the other, many people are failing to grasp the import and impact of what digital transformation really means. In far too many cases, business and IT leaders are dismissing it as nothing more than a marketing ploy. The unfortunate result is that the over-hypedness of digital transformation i...

I recently recovered from ACDF surgery, in which a herniated or degenerative disc in the neck is removed and the cervical bones above and below the disc are fused. My body had a huge vulnerability where one good shove or fender bender could have ruptured my spinal cord. I had some items removed and added some hardware and now my risk of injury is greatly reduced.
Breaches are occurring at a record pace, botnets are consuming IoT devices and bandwidth, and the cloud is becoming a de-facto standard for many companies. Vulnerabilities are often found at the intersection of all three of these trends, so ...

Okay, let me get this out there: I find the term “Citizen Data Scientist” confusing. Gartner defines a “citizen data scientist” as “a person who creates or generates models that leverage predictive or prescriptive analytics but whose primary job function is outside of the field of statistics and analytics.” While we teach business users to “think like a data scientist” in their ability to identify those variables and metrics that might be better predictors of performance, I do not expect that the business stakeholders are going to be able to create and generate analytic models. I do not believe...

In his keynote at 18th Cloud Expo, Andrew Keys, Co-Founder of ConsenSys Enterprise, provided an overview of the evolution of the Internet and the Database and the future of their combination – the Blockchain.
Andrew Keys is Co-Founder of ConsenSys Enterprise. He comes to ConsenSys Enterprise with capital markets, technology and entrepreneurial experience. Previously, he worked for UBS investment bank in equities analysis. Later, he was responsible for the creation and distribution of life settlement products to hedge funds and investment banks. After, he co-founded a revenue cycle management...

A leaner, more streamlined Hewlett Packard Enterprise (HPE) advanced across several fronts at HPE Discover 2016 in London, making inroads into hybrid IT, Internet of Things (IoT), and on to the latest advances in memory-based computer architecture. All the innovations are designed to help customers address the age of digital disruption with speed, agility, and efficiency.

With major technology companies and startups seriously embracing IoT strategies, now is the perfect time to attend @ThingsExpo 2016 in New York. Learn what is going on, contribute to the discussions, and ensure that your enterprise is as "IoT-Ready" as it can be! Internet of @ThingsExpo, taking place June 6-8, 2017, at the Javits Center in New York City, New York, is co-located with 20th Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The Internet of Things (IoT) is the most profound change in personal and enterp...

We have been seeing a sudden rise in the deployment of Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL). It looks like the long “AI winter” is finally over. It is interesting to note that AI was mentioned by Alan Turing in a paper he wrote back in 1950 to suggest that it is possible to build machines with true intelligence. Then in 1956, John McCarthy organized a conference at Dartmouth and coined the phrase Artificial Intelligence. Much of the next three decades did not see much activity and hence the phrase “AI Winter” was coined. Around 1997, IBM’s Deep Blu...

Unless your company can spend a lot of money on new technology, re-engineering your environment and hiring a comprehensive cybersecurity team, you will most likely move to the cloud or seek external service partnerships. In his session at 18th Cloud Expo, Darren Guccione, CEO of Keeper Security, revealed what you need to know when it comes to encryption in the cloud.

As cloud computing simultaneously transforms multiple industries, many have wondered how this trend will affect manufacturing. Often characterized as “staid”, this vertical is rarely cited when leading-edge technological change is the topic. This view, however, fails to address the revolutionary nexus of cloud computing and the manufacturing industry. Referred to as Digital Thread and Digital Twin, these cloud-driven concepts are now driving this vertical’s future.

Almost a year ago, I wrote these words, "Technology has reached the tipping point for me, it moved from a help to a hindrance." The plethora of adrenaline- and endorphin-inducing mobile apps, 24x7 news, notifications, alerts and updates drip-fed my brain and hindered my "deep work and deep thoughts." In his new book, Deep Work, Cal Newport posits that most knowledge workers need concentration and substantial time, dedicated and uninterrupted, to produce their best work. He argues that a lot of technologies and open office layouts today inhibit creativity, "deep work" and "deep thoughts...

Cloud computing budgets worldwide are reaching into the hundreds of billions of dollars, and no organization can survive long without some sort of cloud migration strategy. Each month brings new announcements, use cases, and success stories.