Distributed Analytics in IoT – Why Positioning is Key

The current global focus on the “Internet of Things (IoT)” has highlighted the extreme importance of sensor-based, intelligent and ubiquitous systems in improving our lives and making them more efficient. There is a natural challenge in this, as the data load on our networks and cloud infrastructures continues to increase. Velocity, variety and volume are attributes to consider when designing your IoT solution, and it is then necessary to decide where and when the analytical algorithms should execute on the data sets.

Apart from classical data centers, there is huge potential in the various compute sources across the IoT landscape. We live in a world where compute is at every juncture, from us to our mobile phones, our sensor devices and gateways to our cars. Leveraging this normally idle compute is important in meeting the data analytics requirements in IoT. Future research will attempt to consider these challenges. There are three main classical architecture principles that can be applied to analytics: (1) centralized, (2) decentralized and (3) distributed.

The first, centralized, is the best known and understood today. The concept is simple: centralized compute across clusters of physical nodes is the landing zone (ingestion) for data coming from multiple locations, so the data is in one place for analytics. By contrast, a decentralized architecture uses multiple large distributed clusters that are arranged hierarchically in a tree-like architecture. Consider the analogy of a tree: the leaves, being close to the sources, can compute on the data earlier or distribute it more efficiently for analysis. Some form of grouping can be applied, for example per geographical location, or some form of hierarchy can be set up to distribute the jobs.
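To make the tree analogy concrete, here is a minimal sketch in Python of how leaf clusters might compute partial results that a parent node then merges. The names Partial, summarize and merge are hypothetical and chosen only for illustration; the sketch assumes each leaf holds a non-empty list of numeric readings and that the analysis only needs a count, sum, minimum and maximum.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Partial:
    """Partial aggregate computed near the data source (hypothetical type)."""
    count: int
    total: float
    minimum: float
    maximum: float

    @property
    def mean(self) -> float:
        return self.total / self.count


def summarize(readings: List[float]) -> Partial:
    """Leaf step: reduce raw readings to a small aggregate.

    Assumes `readings` is non-empty; min() and max() raise ValueError otherwise.
    """
    return Partial(len(readings), sum(readings), min(readings), max(readings))


def merge(partials: Iterable[Partial]) -> Partial:
    """Parent step: combine child aggregates without seeing the raw data."""
    items = list(partials)
    return Partial(
        count=sum(p.count for p in items),
        total=sum(p.total for p in items),
        minimum=min(p.minimum for p in items),
        maximum=max(p.maximum for p in items),
    )


if __name__ == "__main__":
    # Two leaves (for example, two sites in one region) summarize locally.
    leaf_a = summarize([1.0, 2.0, 3.0])
    leaf_b = summarize([10.0, 20.0])
    # The regional node merges the small aggregates, not the raw readings.
    region = merge([leaf_a, leaf_b])
    print(region.count, region.minimum, region.maximum, region.mean)
```

Only the small aggregates travel up the hierarchy in this sketch, which is one way the leaves can reduce the data load on the levels above them.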

Lastly, in a distributed architecture, which is the most suitable for devices in IoT, the compute is everywhere. Generally speaking, the further from the center, the smaller the compute, right down to the silicon on the devices themselves. Therefore, it should be possible to push analytics tasks closer to the device. In that way, these analytics jobs can act as a sort of data filter and decision maker, determining whether quick insight can be gained from smaller data sets at the edge or beyond, and whether to push the data to the cloud or discard it. Naturally, this type of architecture brings more constraints and requirements for effective network management, security and monitoring, not only of the devices but of the traffic itself. Even so, it makes more sense to bring the computation power to the data than to bring the data to a centralized processing location.
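As an illustration of an analytics job acting as a data filter and decision maker, the following Python sketch decides at the edge whether a window of sensor readings is worth pushing to the cloud. Everything here is an assumption made for the example: the function name edge_decide, the idea of comparing the window mean against a known baseline, and the default threshold of three baseline standard deviations. A real deployment would choose its own rule.

```python
from statistics import mean
from typing import Dict, List, Tuple


def edge_decide(
    window: List[float],
    baseline_mean: float,
    baseline_std: float,
    z_threshold: float = 3.0,
) -> Tuple[str, Dict[str, float]]:
    """Return ("forward", summary) or ("discard", summary) for one window.

    Hypothetical rule: forward the window only if its mean deviates from
    the baseline by at least `z_threshold` baseline standard deviations.
    """
    if not window:
        # Nothing was measured, so there is nothing to send.
        return "discard", {}

    window_mean = mean(window)
    # Guard against a zero baseline spread to avoid division by zero.
    spread = baseline_std if baseline_std > 0 else 1e-9
    z_score = abs(window_mean - baseline_mean) / spread

    summary = {
        "count": float(len(window)),
        "mean": window_mean,
        "min": min(window),
        "max": max(window),
        "z_score": z_score,
    }
    decision = "forward" if z_score >= z_threshold else "discard"
    return decision, summary


if __name__ == "__main__":
    # A temperature sensor with an assumed baseline of 20.0 +/- 0.5.
    print(edge_decide([20.1, 19.9, 20.0], 20.0, 0.5)[0])
    print(edge_decide([30.0, 31.0, 29.0], 20.0, 0.5)[0])
```

In this sketch the device forwards only the windows that look unusual, so routine readings never leave the edge. The trade-off is that the filtering rule itself now has to be managed and monitored on every device.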

There is a direct relationship between the smartness of the devices and the selection and effectiveness of these three architectures. As our silicon gets smarter, more powerful and more efficient, more and more compute will become available, which should result in less strain on the cloud. As we distribute the compute, our solutions should also become more resilient, as there is no single point of failure.

In summary, “Intelligent Infrastructures” now form the crux of the IoT paradigm. This means that IoT practitioners will have more choice in where they place their analytics jobs, so that they make the best use of the compute that is available and control latency for faster response, meeting the real-time requirements of the business metamorphosis that is ongoing.