Traffic accidents in the UK, 1979-2004.

Whether you are a journalist, a researcher or a data geek, starting to work with large data sets usually means laborious groundwork: setting up infrastructure, configuring an environment, learning unfamiliar tools and coding complicated apps. With DC/OS you can start crunching those numbers within minutes.

From 27 February to 2 March, DataArt exhibited with Canonical at Mobile World Congress in Barcelona. The sheer scope of the world’s biggest mobile industry event was mind-boggling: 100,000 attendees and 2,200 exhibitors spanned nine halls and a dozen outdoor spaces at Fira Gran Via and Fira Montjuïc.
DataArt demoed an enterprise predictive maintenance IoT solution that enables preventative, condition-based monitoring of a piece of manufacturing equipment. We used accelerometer-based sensors and an IoT gateway running Snappy Ubuntu Core to capture the vibration profile of a fan, then analyzed it in AWS to determine whether it falls within the range of normally operating equipment and, if not, to trigger a maintenance alert.
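To make the "within the range of normal operation" check concrete, here is a minimal sketch in Python. It is not the code used in the demo. It assumes the vibration profile is summarized as the RMS of a window of accelerometer samples, and that "normal" means within `k` standard deviations of a baseline learned from healthy recordings. The function names and the threshold `k` are hypothetical.

```python
import math
from statistics import mean, stdev


def rms(samples):
    """Root-mean-square level of one window of accelerometer samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def learn_baseline(normal_windows):
    """Summarize RMS levels of windows recorded during normal operation.

    Needs at least two windows, because stdev is undefined for one value.
    """
    levels = [rms(w) for w in normal_windows]
    return mean(levels), stdev(levels)


def needs_maintenance(window, baseline, k=3.0):
    """Return True when the window's RMS is more than k standard
    deviations away from the baseline mean (assumed alert rule)."""
    mu, sigma = baseline
    return abs(rms(window) - mu) > k * sigma


# Hypothetical readings: three healthy windows, then two new ones.
baseline = learn_baseline([
    [1.0, -1.0, 1.0, -1.0],
    [1.1, -1.1, 1.1, -1.1],
    [0.9, -0.9, 0.9, -0.9],
])
healthy = needs_maintenance([1.05, -1.05, 1.05, -1.05], baseline)
unbalanced = needs_maintenance([2.0, -2.0, 2.0, -2.0], baseline)
```

A production system would more likely compare frequency spectra than a single RMS number, but the shape is the same: learn a baseline, measure the deviation, raise an alert.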

DataArt, the maker of DeviceHive, and Canonical, the maker of Snappy, Ubuntu and Juju, present Open IoT Solutions on Azure Events.

DataArt and Canonical are demonstrating industrial preventive maintenance and home IoT scenarios that can be prototyped, scaled, and deployed. DataArt’s DeviceHive, running on Canonical’s Ubuntu VM, is available on the Microsoft Azure Marketplace, providing access to a flexible IoT platform. New bundled IoT solutions and examples, including DeviceHive on Snappy (RPii), a Data Analytics stack deployed by Juju, and Microsoft Azure services, will be discussed and demonstrated.

Machine learning, cloud, visualization, Hadoop, Spark, data science, scalability, analytics, terabytes, petabytes, faster, bigger, more secure, simply better. It is the kind of merry-go-round that keeps spinning in your head after you spend three days on the exhibit floor at the Strata+Hadoop conference. And lots of elephants, of course.

Not only did we attend Strata with colleagues from DataArt and DeviceHive, we also helped our friends at Canonical and brought our demo to their booth. Canonical was showing Juju, a cloud infrastructure and service management tool. We brought our favorite demo: an industrial equipment monitoring rig. No PowerPoint slides, only real stuff. The accelerometer of a Texas Instruments SensorTag was attached to a fan to monitor its vibration. To simulate a fault, we stuck a piece of duct tape on one of the blades to set the whole thing off balance. Sensor data was streamed through DeviceHive as time series data, aggregated by Spark Streaming and displayed on a nice dashboard. Everything was deployed using Juju and worked nicely in AWS.
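The aggregation step can be sketched without a cluster. The snippet below is plain Python, not Spark Streaming, and is not the demo's code. It only illustrates the windowed aggregation idea, assuming readings arrive as `(timestamp_seconds, value)` pairs and are grouped into fixed, non-overlapping windows. The function name and the chosen statistics are hypothetical.

```python
from collections import defaultdict


def aggregate_windows(readings, window_seconds=10):
    """Group (timestamp, value) readings into fixed windows and
    return count, mean and peak absolute value for each window."""
    buckets = defaultdict(list)
    for ts, value in readings:
        # Start of the window this reading falls into.
        window_start = int(ts // window_seconds) * window_seconds
        buckets[window_start].append(value)
    return {
        start: {
            "count": len(vals),
            "mean": sum(vals) / len(vals),
            "peak": max(abs(v) for v in vals),
        }
        for start, vals in sorted(buckets.items())
    }


# Hypothetical accelerometer readings spanning two 10-second windows.
summary = aggregate_windows([(0.0, 1.0), (4.0, -3.0), (12.5, 2.0)])
```

In a streaming engine the same grouping happens continuously over micro-batches, and a dashboard would read the per-window summaries as they are produced.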
While the exhibition floor had a lot of great companies pitching their awesome products, I think the main highlight of this year’s event was Spark. Learning Spark, running Spark, managing Spark, using Spark for this and using Spark for that. Almost everyone, big or small, was talking Spark, integrating it into their solutions or making their data accessible through Spark. In just a few years Spark has proven to be a great platform for data discovery, machine learning and cluster computing in general. The Spark ecosystem will keep expanding, changing the way we work with our data and increasing the velocity of data-related projects. Next-generation analytics tools will surely interface with Spark or rely on Spark, allowing enterprises to push the envelope of what can be derived from their data. Next-generation parallel computing tools will bring business, engineers, data scientists and devops closer together.
Databricks, a company commercially supporting Spark, was demonstrating its data analytics product, which lets users create research notebooks, interactively write Spark jobs, run them on an AWS cluster, create queries and visualize data. Add Spark Streaming on top of that and you can execute your models on a live stream of data. While Databricks hosts the landing page for the UI, your data and the machines that host the infrastructure to run Spark reside in your AWS environment. I’m curious to know how it will compare with Amazon’s Space Needle, which they are unveiling at re:Invent 2015 in Las Vegas.
Besides Spark, it is also becoming apparent that working with data at large is no longer about choosing the right database or distributed file system. Data platforms are coming. The world is starting to think in terms of data platforms: a set of technologies and architecture patterns designed to work together to solve a variety of data-related problems. A data platform largely defines how we access, store, stream, compute and search structured, unstructured and sensor-generated data. A solid example of such a platform is Basho Data Platform, where Basho is taking its Riak database and making it part of something much bigger than a key-value store.
Personal improvement takeaways:

DataArt will be showcasing Big Data, IoT and predictive maintenance solutions at Strata+Hadoop World NYC 2015, September 30 to October 1. Powered by Canonical's Ubuntu Snappy Core and orchestrated by Juju, our demo will show how to deploy DeviceHive's lambda architecture and evolve your industrial IoT solution from proof of concept to a scalable production system.

Stop by Canonical/Ubuntu Booth #358. If you would like to connect at the show, please leave your contact information here. Looking forward to seeing you at Strata+Hadoop World NYC.

DataArt’s Big Data Competence Center announces the launch of a new beta computer application. The app analyses U.S. and U.K. media news flow and converts it into easy-to-understand charts and infographics.