Over the last decade, it has been nearly impossible to discuss big data in the context of manufacturing without touching on Industry 4.0 (also referred to as the 4th Industrial Revolution). But what exactly is Industry 4.0, and why is it so important to manufacturing companies? And why should open-source data management technologies be an integral part of this discussion? Read on for answers!

What is Industry 4.0?

While competing definitions of Industry 4.0 exist in the literature, I believe McKinsey provides the clearest explanation, defining Industry 4.0 as the next phase in the digitization of the manufacturing sector, driven by four disruptions: 1) the astonishing rise in data volumes, computational power, and connectivity, especially new low-power wide-area networks; 2) the emergence of analytics and business-intelligence capabilities; 3) new forms of human-machine interaction, such as touch interfaces and augmented-reality systems; and 4) improvements in transferring digital instructions to the physical world, such as advanced robotics and 3-D printing.

Why Does This Matter to Manufacturers?

Often slow to adopt new information technologies, manufacturers are eagerly implementing Industry 4.0 initiatives to solve age-old manufacturing problems. To understand why, consider this: despite decades of continuous efforts to improve manufacturing operations, the total cost of poor quality to manufacturers amounts to a staggering 20 percent of sales revenues (American Society for Quality), while unplanned downtime costs amount to approximately $50 billion per year (Deloitte). Clearly, process improvements derived from Industry 4.0 are sure to get the attention of manufacturers.

What's the Connection Between Industry 4.0 and Big Data?

Stated as concisely as possible, Industry 4.0 is intrinsically a big data problem! Consider the fact that digitalization, a central tenet of Industry 4.0, must be underpinned by digital data. This digital data, often referred to as the "digital thread" or "digital twin," must be defined, captured, and managed across the entire product lifecycle – from how a product is engineered (design data), to how it is produced (manufacturing sensor data), to how it is monitored and serviced in the field (connected device data). The net-net? Big data is foundational to Industry 4.0.

How Big is Big?

Data volumes associated with Industry 4.0 are huge. Consider the fact that a major source of Industry 4.0 data is manufacturing sensors on the shop floor. According to Wikibon, this type of “time series” data is projected to grow at twice the rate of any other big data source (including social media). Once this data is consolidated into a centralized manufacturing data lake, volumes in the petabyte range are not uncommon.

What Value Have Companies Achieved?

Not surprisingly, leading companies are moving aggressively to establish data lakes and analyze this treasure trove of information. As a result, organizations are accruing significant value from their Industry 4.0 analytics initiatives. According to McKinsey, big data-enabled use cases, such as predictive maintenance, can reduce factory equipment maintenance costs by 10 to 40 percent, reduce equipment downtime by up to 50 percent, and reduce equipment capital investment by 3 to 5 percent by extending the useful life of machinery. Similar performance improvements have been noted by manufacturers using big data analytics to improve manufacturing process quality and yield performance by 20 to 50 percent.
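
To make these percentages concrete, here is a minimal back-of-the-envelope sketch in Python. The percentage ranges are the McKinsey figures quoted above; the plant's baseline costs and the helper name `savings_range` are hypothetical, chosen purely for illustration, so treat the output as a rough way to frame the opportunity rather than a forecast.

```python
# Reduction ranges quoted from McKinsey above, as (low, high) whole percentages.
MAINTENANCE_COST_REDUCTION_PCT = (10, 40)
DOWNTIME_REDUCTION_PCT = (0, 50)            # "up to 50 percent"
CAPITAL_INVESTMENT_REDUCTION_PCT = (3, 5)


def savings_range(baseline, pct_range):
    """Return (low, high) annual savings for a baseline cost and a percentage range."""
    low_pct, high_pct = pct_range
    return (baseline * low_pct / 100, baseline * high_pct / 100)


# Hypothetical annual figures for a single plant (assumptions, not source data).
annual_maintenance_cost = 2_000_000
annual_downtime_cost = 1_500_000
annual_capital_investment = 10_000_000

maintenance = savings_range(annual_maintenance_cost, MAINTENANCE_COST_REDUCTION_PCT)
downtime = savings_range(annual_downtime_cost, DOWNTIME_REDUCTION_PCT)
capital = savings_range(annual_capital_investment, CAPITAL_INVESTMENT_REDUCTION_PCT)

for label, (low, high) in [
    ("Maintenance", maintenance),
    ("Downtime", downtime),
    ("Capital investment", capital),
]:
    print(f"{label}: ${low:,.0f} to ${high:,.0f} per year")
```

Swapping in a plant's own baseline costs gives a quick first estimate of what a predictive maintenance initiative might be worth before any detailed business case is built.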

Why Open-Source for Industry 4.0?

Given the critical importance of Industry 4.0 to future manufacturing competitiveness, open-source data management makes a convincing case for itself. First, with big data analytics technology evolving at such a rapid pace, open-source communities provide innovation that no single company can sustain. Second, given the volume and growth associated with Industry 4.0 data, open-source data management provides a significantly lower cost of ownership to manufacturers. Finally, the very nature of open-source software eliminates the risk of vendor “lock-in.”