Climbing the IoT data mountain

Businesses are in the midst of a digital transformation. To transform, they must become software companies, they must bring their products and services online, and they must provide more intelligent solutions. This new and connected world has companies turning to the Internet of Things (IoT) to lead them towards new business opportunities.

The Internet of Things has been making headlines for years now, but according to Bart Schouw, director of IoT at Software AG, the area is still new and unknown. Businesses are still trying to navigate and understand how they can implement this, and how they can then monetize it.

It is important to recognize that the Internet and the Internet of Things are two very different things. Bryan Hughes, CTO of IoT at SpaceTime Insight, an advanced analytics solution provider, explained that the Internet is the digital representation of digital things such as web pages, while IoT is the digital representation of physical things.

These physical things or devices provide the ability to better understand and engage with customers, anticipate failures, and avoid costly downtimes; but to do that, businesses need to be able to collect, analyze, store and comprehend data. Data is the key to gaining valuable insights. The problem in the IoT world is that you can now attach inexpensive sensors to solutions and collect all kinds of different information. Suddenly, you end up with a mountain of data you can easily get buried in.

“We collect all this data, and as a result the data is growing like crazy,” said Svetlana Sicular, research vice president for data and analytics at Gartner. “There is a big challenge to understand which data is useful, and how to make sense of it. So far, there aren’t too many companies that are successful analyzing this data.”

Treading through deep data waters

The biggest change enabling the IoT age is in how data is processed and stored, according to Jack Norris, SVP of data and applications at MapR Technologies. Norris explains the first phase of IoT was focused on deploying devices and getting access to data. Today, we are in the second phase of IoT, where we are focused on the data itself: How do we collect it appropriately, how do we analyze it, and how do we act on it?

“You need an approach that can handle high scale, high speed and reliability all at the same time because it is really about understanding the context of the data as fast as possible and being able to act in real time,” he said.

The number one best practice to gain valuable insight from IoT solutions is to utilize data analytics, according to Gartner’s Sicular. “It is not only about installing sensors and creating alerts, but it’s about understanding the long-term value of analytics,” she said. “Analytics is what brings the value of an existing IoT project to the next level.”

Sicular explains there are four ways to approach analytics: through a platform with analytical capabilities; a general-purpose analytics tool; a do-it-yourself, custom-developed solution; and a packaged application that solves a particular use case.

But before you deal with analytics, you need to handle the data itself, and that includes the speed of data, the amount of data, and new users of the data. According to Sicular, the first question you need to answer is “How are you going to store the data?”

Software AG’s Schouw explained more IoT platforms are actually living in the cloud today because the cloud provides the ability to scale up and scale out as well as the space necessary to store information.

Suraj Kumar, vice president and general manager of PaaS at Axway, added “The reason companies turn to a public or private cloud is because IoT devices and the amount of data need a place that provides high levels of scalability and elasticity. A cloud-oriented solution enables both from a compute perspective and a data storage perspective.”

When deciding on a cloud provider, companies need to understand their business requirements. According to Kumar, there are basic security requirements and compliance requirements that need to be considered. For instance, healthcare providers need to ensure cloud providers have HIPAA compliance, financial companies need to have compliance policies as well, and government entities need to have federal compliance. “Businesses need to ask who has access to the data and data center, and what kind of security and controls do they have in place,” he said.

In addition to understanding the business requirements from a cloud perspective, knowing the business outcomes will help businesses handle all the data as well as generate insight. Donna Prlich, chief product officer at Pentaho, a Hitachi Group Company, explained when there are so many different varieties of data and information coming in from different data sources, it can be overwhelming to know what to look at. When you focus on the business use case, you are not trying to take every single data source in. Instead, you are looking at the places the data applies to the business outcome, according to Prlich.

“Focusing on business outcomes and what you are trying to accomplish, starting small, and growing is going to be super important to be successful,” she said.

Rather than trying to provide every single piece of information, Axway’s Kumar suggests looking at what the customer wants and tailoring a solution to a particular customer experience.

SpaceTime Insight’s Hughes believes an IoT/data analytics approach should consist of four things: the collection of data, edge computing, processing in real time, and security. For collection, it is about building a system that can withstand the real world, meaning building systems that are designed for failure. Edge computing allows you to go from the end point to the cloud. Then you need to be able to process everything in real time somehow, and reduce as many attack vectors as possible, Hughes explained.

“The challenge is that as we move towards the future, more and more things will be operating in very remote locations, or traveling through the air or across the country on the roads and rails. In most cases, connecting through cellular networks. In these cases, the amount of data generated cannot be transmitted feasibly to the cloud for processing. Instead, data collection and analytics needs to move to the edge, improving latency, reliability, and cost,” Hughes said.
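As a rough illustration of what moving analytics to the edge can mean in practice, the sketch below reduces windows of raw sensor readings to small summary records, so that only the summaries would need to travel over a constrained link. It is a minimal sketch under stated assumptions, not a method described by Hughes: the function names, the window size, and the sample readings are all hypothetical.

```python
from statistics import mean


def summarize_window(readings):
    """Reduce one window of raw readings to a compact summary record."""
    if not readings:
        raise ValueError("readings must not be empty")
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": mean(readings),
    }


def summarize_stream(readings, window_size):
    """Split readings into fixed-size windows and summarize each one.

    The last window may be shorter than window_size.
    """
    return [
        summarize_window(readings[i:i + window_size])
        for i in range(0, len(readings), window_size)
    ]


# Hypothetical temperature readings collected on a device.
raw = [20.0, 20.5, 21.0, 35.0, 20.0, 19.5]

# Six raw readings become two summary records to send upstream.
summaries = summarize_stream(raw, window_size=3)
for record in summaries:
    print(record)
```

Keeping the minimum and maximum alongside the mean is a deliberate choice in this sketch: a short spike, such as the 35.0 reading, stays visible in the summary even though the raw values are never transmitted.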

In the end, to provide a successful solution, Software AG’s Schouw says top management needs to have close ties with operations because IoT has a huge impact on the business. “If top management hasn’t bought into it and doesn’t understand why it is going to change, operations will never be able to make those painful decisions to reorganize and realign the organization along the new business models because organizations are built to resist change,” he said.

Applying AI and machine learning

Even taking into account all of the best practices and putting the necessary tools and platforms in place, it still is almost impossible to sift through all the data, especially in real time. Humans aren’t always capable of understanding the right questions to ask, Hughes explained.

“The growth of analytics and business intelligence has always been around knowing the question to ask, and then being able to ask the question to get the answer,” he said. “In the mountains of data, it is not about knowing the question to ask. It is about discovering patterns in the data.”

To discover the patterns, businesses need to leverage machine learning. According to Schouw, machine learning learns from the data, discovers patterns that might not be clear to the human eye, and deciphers whether or not the user should act on the data or ignore it. Additionally, it helps make data analytics more of an automated process.

“Even if you have predictive analytics in place to spot a pattern and alert an operator to it, there might be so many alerts and data going on that a human doesn’t want to be confronted with that continuously. Being able to have artificial intelligence or machine learning take automated action on it is something you want to do. That is a big efficiency gain,” Schouw said.

According to Schouw, artificial intelligence is actually becoming the new UI. Schouw provided Amazon Alexa as an example. A user might be able to tell Alexa to lock the door, but Alexa has to figure out which door to lock: the front door, the back door, or the bathroom door? And if you want to lock the front door and someone is still inside, Alexa can ask if you still want to lock it. “In those complex environments, the question is how do you want to interact, and AI will be the way humans want to interact with it,” he said.

While machine learning and artificial intelligence are providing many productivity benefits to IoT organizations, Axway’s Kumar believes this area is still new and expects there will be many improvements in the future to have the technology automate certain decision-making and provide even deeper insights. “Machine learning and artificial intelligence can drive further improvement in getting insight, helping with decisions and automating certain decisions when it comes to IoT analysis,” Kumar said.

Software AG’s Schouw notes machine learning is not a magic bullet. You can’t just apply machine learning out of the box, connect to things, and expect it to tell you when things will fail. You need to have an understanding, and you need to invest in data scientists who can help you build up that knowledge so you can start applying machine learning to things like predictive maintenance, he explained.

Having tools support your strategy

Once you find a place for the data to live and come up with a strategy, you need a solution to execute on that strategy and apply techniques like machine learning and artificial intelligence. There is no right answer or single tool to help you magically handle the Internet of Things landscape, but there are some features you can look for in a tool to help make life easier.

When looking for a tool, the first thing to consider is your data pipeline, according to Pentaho’s Prlich. How are you going to manage the pipeline, and how are you going to address the different types of users in your organization? For instance, you might have data scientists, ETL engineers, analysts, and developers all working with the data in some shape or form, Prlich explained.

“You want to ask ‘Is this something that can help me solve the end-to-end problem all the way from data engineering through data preparation to data analytics, or is it a siloed event or set of tools?’” said Prlich.

In addition, Prlich explained the solution should be open where team members can bring in a set of tools or platforms that can coexist with one another. “You want to think about what is coming in the future, the tools you are choosing, and if things shift and change, are you prepared to manage that,” she said. This helps companies “future proof” themselves for what is coming next.

MapR’s Norris suggests having a distributed data fabric that can extend to the edge and intelligently process data. The IoT landscape requires businesses to collect data, aggregate, and learn across a whole population of devices to understand events and situations. At the same time, businesses need to inject intelligence to the edge so they can react to those events very quickly. According to Norris, enterprises need to be able to converge the different data cycles, harness data flows and provide agility. Having a common data fabric can help handle all of the data in the same way, control access to the data, and apply intelligence in a high performance and fast way.

Security and privacy aspects

Security continues to be a huge challenge with the Internet of Things. As devices become more widely used and spread out, a bigger surface area of attack is created. According to SpaceTime Insight’s Hughes, machine learning comes into play here as well because businesses need to be able to perform intrusion detection.

“You can’t fully secure anything. You need to be able to understand as quickly as possible when there has been a breach. Machine learning comes into play for that and can do anomaly detection to determine whether or not a system has been breached, and then respond to it quickly,” he said.
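To make the idea of anomaly detection concrete, here is a minimal sketch that flags a value when it deviates sharply from the readings just before it. It uses a simple rolling z-score, which is a basic statistical stand-in and not the machine learning approach Hughes refers to; the function name, the baseline size, the threshold, and the traffic numbers are all assumptions made for illustration.

```python
from statistics import mean, stdev


def find_anomalies(values, baseline_size=5, threshold=3.0):
    """Return the indexes of values that deviate strongly from their baseline.

    Each value is compared with the mean and standard deviation of the
    baseline_size values immediately before it (a rolling z-score).
    """
    anomalies = []
    for i in range(baseline_size, len(values)):
        baseline = values[i - baseline_size:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # A perfectly flat baseline: treat any change as an anomaly.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical requests per minute seen by a device gateway.
requests_per_minute = [100, 102, 98, 101, 99, 100, 103, 400, 101, 99]

flagged = find_anomalies(requests_per_minute)
print(flagged)  # index 7, the sudden jump to 400
```

One limitation of this sketch is that a flagged value still enters the baseline for the readings that follow it, which inflates the standard deviation and can hide a second anomaly shortly afterwards. A production system would need to handle that, for example by excluding flagged values from the baseline.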

In addition, security has to be granular, according to MapR’s Norris. Any time data is moving, it has to be encrypted so it is not easily accessed. There also has to be some intelligence to how data flows, where the data moves, and where it is processed.

Axway’s Kumar believes API management plays a big role in IoT and data analytics because most devices leverage APIs in some way or another. An API management solution can help ensure the data being passed is securely opened up and transmitted. However, an API management solution will only take you so far when it comes to security. In addition, businesses need to ensure the policies implemented for API management are solid, provide governance, and enforce a set of corporate security policies that don’t enable data to be accessed by people who don’t need to access it, Kumar explained.

“API management combined with best practices, policy management, and governance help essentially both securing and putting the data into places that it needs to for further analysis or storage,” he said.

As far as the privacy aspect of all of this, there are two pieces of it, according to Kumar. There is the user aspect and the company aspect. From the user perspective, we typically just go through and click okay into disclosures when we sign up for a service and connect our devices online. Users trust the business to protect them, or look out for their best interest. “People are either open or don’t have the full knowledge of privacy, so on the business side there are stricter rules they need to follow,” said Kumar.

While businesses typically share data with third parties for further analysis, there are strict data privacy laws on how the data is stored and who has access to it that businesses need to adhere to in most cases, Kumar explained.

“People are realizing it is not just about the devices, and not about the fast proliferation,” said MapR’s Norris. “It is about being able to deploy and leverage IoT effectively.”

How a manufacturing company became an IoT company

When you think of the Internet of Things, typically smartwatches, connected home devices, and smart cars come to mind. But the Internet of Things extends to all different kinds of industries. For instance, Caterpillar Marine, a subsidiary of the construction and mining equipment provider Caterpillar, recently turned to the Internet of Things to gain real-time insight into their fleets and ships, and provide better customer experience.

Through the company’s data analytics service developed by ESRG Technologies, Caterpillar is able to collect information from sensors on their ships to manage their fleets. The type of information collected predicts machinery failure, allowing Caterpillar to schedule necessary maintenance. This type of service can provide massive savings and have a huge impact on a business. “We identify trends toward failure before they become alerts,” Jim Stascavage, marine asset intelligence technology manager at Caterpillar Marine, said in a case study. “The deviation won’t trigger an alarm, but it should, because the trend is starting to go in the wrong direction.”
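The idea of catching a trend before it becomes an alert can be sketched in a few lines. The example below fits a straight line to recent readings and reports whether the fitted trend would reach an alarm limit within a given number of further readings, even though no single reading has crossed the limit yet. This is only an illustrative sketch, not the method used by Caterpillar or ESRG Technologies: the function names, the limit, the horizon, and the temperature values are hypothetical.

```python
def fitted_slope(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    return numerator / denominator


def is_trending_toward_limit(values, limit, horizon):
    """Return True if the trend would reach limit within horizon more readings.

    Returns False if a reading has already reached the limit, since that
    case belongs to the ordinary alarm, not to early trend detection.
    """
    if len(values) < 2:
        raise ValueError("need at least two readings to fit a trend")
    if max(values) >= limit:
        return False
    slope = fitted_slope(values)
    if slope <= 0:
        return False  # flat or falling: not heading toward the limit
    readings_until_limit = (limit - values[-1]) / slope
    return readings_until_limit <= horizon


# Hypothetical bearing temperatures; the alarm limit is assumed to be 80.0.
temperatures = [70.0, 71.0, 72.0, 73.0, 74.0]

print(is_trending_toward_limit(temperatures, limit=80.0, horizon=10))
print(is_trending_toward_limit(temperatures, limit=80.0, horizon=3))
```

With these values the fitted slope is one degree per reading, so the limit is about six readings away. The first call therefore reports a trend worth scheduling maintenance for, while the second, with a shorter horizon, does not.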

Caterpillar Marine wanted to go further with their analytical capabilities, and uncover trends that could potentially provide them with the biggest cost savings or payoffs. Caterpillar chose a data integration and business analytics solution from Pentaho to help it combine its sensor data with operational data and find meaningful patterns in the equipment and solutions. The data Pentaho was able to collect included things like temperature, pressure, geographical coordinates and geometric angles. “We’re mashing all this data together and trying to figure out what it means for the performance of the ship,” Stascavage said in the case study. “It’s not simple for even one ship, so you can imagine how complex it is across an enterprise. There are literally trillions of data points that need to be evaluated every year.”

According to Caterpillar, this new IoT analytical approach was able to provide better insight into equipment performance, strengthen customer relationships with ROI, and even provide savings in fuel efficiency, unscheduled downtime and environmental compliance.

“We see this convergence of the machine-generated data being able to cut into organizations. Applying the other data sources for context is really what is driving these great business outcomes. That is what we see in the early IoT market. It is moving quickly and there is a lot of opportunity to take advantage of,” said Pentaho’s Prlich.

About Christina Cardoza

Christina Cardoza is the News Editor of SD Times. She is responsible for the oversight of the daily news published to the website as well as the company's weekly newsletter, News on Monday. She covers agile, DevOps, AI, machine learning, mixed reality and software security. She is an undeniable nerd who loves Marvel comics and Star Wars. Follow her on Twitter at @chriscatdoza!