01: AI - The dawn of the data age

26 February 2019 | Nordea Markets and Nordea Corporate & Investment Banking

AI progress – avoiding "artificial stupidity"

Claudio Simao, Chief Technology Officer and Head of Innovation Hub at listed measuring and information technology group Hexagon, shares his thoughts on why current AI capabilities are domain- and application-specific rather than holistic, and why extracting quality data from great volumes of raw data, and avoiding biases in that data, will be crucial for further adoption of and value creation from AI.

JT: There are many almost mythical beliefs about artificial intelligence out there today. In your view, what are the capabilities of AI at present, and what uses of AI are corporates looking for (and which should they be looking for)?

CS: To begin with, I don't believe in a holistic approach to AI at the stage of development we are at today. Today's AI frameworks are numerous and fragmented, and they depend heavily on domain knowledge and on the applications for which they are used and to which they are tuned.

The applications used in the consumer and professional markets differ. In the consumer area, text and voice processing have become quite mature. Image recognition and process-optimisation applications are much more effective in the professional arena, and these are the areas in which Hexagon is investing most now.

To give some examples of applications and use cases: in voice, we see a lot of voice bots, first on smartphones and now moving to professional devices as hands-free interfaces. In text, we see legal and tax applications, where AI is used to scan large volumes of legal documents for patterns, irregularities and contextualisation of meaning. Image and process optimisation are less mature areas but offer plenty of professional use cases, including object tracking, object change detection for security or quality control, customised fertiliser and pesticide management in agriculture, fleet optimisation in transportation and blast optimisation in mining, among a multitude of potential applications.

JT: The exponential growth in available data from growing human connectivity has been a critical driver of the growing sophistication and usefulness of AI. What do you think is needed for further refinement and improvement of AI applications – more data, better data, or both? Can Hexagon contribute?

CS: Considering what architectures for AI solutions should look like, the first thing we need is to connect sensors and other dynamic data sources, which are increasingly gravitating towards real-time connections. The data gathered then needs to be linked to historical data lakes in a contextualised manner, even pre-tagged or pre-analysed. Then we need real-time analysis of the data, mining it for anomalies, differences and changes, and making forward predictions. We train the system to optimise this analysis, and we can then create subsystems that retrain pre-trained models in real time. Finally, this enables autonomous systems built from the AI modules.
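To make the pipeline Simao describes concrete, here is a minimal sketch in Python, assuming a hypothetical single-value sensor feed: historical data seeds a simple rolling statistical model, live readings are mined for anomalies, and the model keeps updating itself on readings judged normal. It is an illustration of the pattern, not Hexagon's implementation; the names and parameters (RollingAnomalyDetector, threshold, window) are invented for the example.

```python
# Minimal sketch (not Hexagon's implementation) of the architecture described
# above: sensor readings stream in, are checked against a model seeded from
# historical data, anomalies are flagged, and the model is updated
# ("retrained") in real time on the readings judged normal.

from collections import deque
import math
import random

class RollingAnomalyDetector:
    """Rolling z-score detector: a stand-in for a pre-trained model that
    keeps learning from the live stream."""

    def __init__(self, history, threshold=3.0, window=500):
        # Historical data lake stand-in: seed the model with past readings.
        self.window = deque(history[-window:], maxlen=window)
        self.threshold = threshold

    def _stats(self):
        n = len(self.window)
        mean = sum(self.window) / n
        var = sum((x - mean) ** 2 for x in self.window) / n
        return mean, math.sqrt(var)

    def observe(self, value):
        """Score one reading; absorb it into the model only if it looks normal."""
        mean, std = self._stats()
        z = abs(value - mean) / std if std > 0 else 0.0
        is_anomaly = z > self.threshold
        if not is_anomaly:
            self.window.append(value)  # real-time "retraining" step
        return is_anomaly, z

# Usage: a hypothetical temperature sensor with one injected fault.
random.seed(0)
history = [random.gauss(20.0, 0.5) for _ in range(500)]
detector = RollingAnomalyDetector(history)

stream = [random.gauss(20.0, 0.5) for _ in range(50)] + [35.0]  # fault at end
for t, reading in enumerate(stream):
    anomalous, score = detector.observe(reading)
    if anomalous:
        print(f"t={t}: reading {reading:.1f} flagged (z={score:.1f})")
```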
In the advanced, mature stages of the past industrialisation cycle, all automation systems were deterministic, using expert systems for automation and optimisation. We are now moving from deterministic to heuristic and resemblance-driven optimisation systems, with much more context-driven predictive algorithms in the system architecture. Systems can learn and adapt, not just follow predetermined instructions.

We train AI systems on correlation, but it is important to remember that correlation is not necessarily causation. Depending on the data used to train the AI models or algorithms, we can introduce biases through unbalanced data. For sure, more data is better than less data, as it improves the model and better captures the system's behaviour. So we need more and more data. But we also need quality data.

The data is available; now it is critical to extract quality data from the raw data

Nowadays we can virtualise almost any data, so Big Data is available, and the fact that data is sometimes "siloed" is now being addressed by the different industries. The next critical issue is how we extract quality data from all of that connected raw data. You know the saying – garbage in, garbage out. This area is developing very fast, with new techniques and methodologies for turning bad data into quality data, for example data sourced from social media. One such technique
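As one generic, hypothetical illustration of "turning bad data into quality data" (not the specific technique the interview refers to), the sketch below cleans a raw stream of (timestamp, value) pairs by rejecting duplicates, missing values and physically implausible readings. The field layout and the plausibility bounds are assumptions made for the example.

```python
# Hypothetical illustration of extracting quality data from raw data:
# drop exact duplicates, discard readings outside plausible bounds, and
# keep a record of what was rejected so the "garbage" is visible.

def clean(raw_records, lo=-40.0, hi=60.0):
    """Return (clean, rejected) from raw (timestamp, value) pairs.

    lo/hi are assumed plausibility limits, e.g. for an outdoor
    temperature sensor in degrees Celsius.
    """
    seen = set()
    clean_records, rejected = [], []
    for ts, value in raw_records:
        if (ts, value) in seen:            # exact duplicate
            rejected.append(((ts, value), "duplicate"))
        elif value is None:                # missing measurement
            rejected.append(((ts, value), "missing"))
        elif not (lo <= value <= hi):      # implausible reading
            rejected.append(((ts, value), "out of range"))
        else:
            seen.add((ts, value))
            clean_records.append((ts, value))
    return clean_records, rejected

# Usage with a small raw sample containing all three defect types.
raw = [(0, 21.5), (1, 21.7), (1, 21.7), (2, None), (3, 999.0), (4, 22.0)]
good, bad = clean(raw)
print(f"kept {len(good)}, rejected {len(bad)}: {[reason for _, reason in bad]}")
```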