Sunday, 7 January 2018

Automatic speech transcription, self-driving cars, a computer program beating the world champion Go player, and computers learning to play video games and achieving better results than humans. Astonishing results that make you wonder what Artificial Intelligence (AI) can achieve now and in the future. Futurist Ray Kurzweil predicts that by 2029 computers will have human-level intelligence and that by 2045 computers will be smarter than humans, the so-called "Singularity". Some of us are looking forward to that; others think of it as their worst nightmare. In 2015 several top scientists and entrepreneurs called for caution over AI, as it could be used to create something that cannot be controlled. Scenarios envisioned in movies like 2001: A Space Odyssey or The Terminator, in which AI turns against humans, violating Asimov's first law of robotics, are not the ones we're looking forward to. The question is whether these predictions and worries about the capabilities of AI, now or in the future, are realistic or just fairy tales.

What is AI?

AI is usually defined as the science of making computers do things that require intelligence when done by humans. To get a computer to do anything it needs software; to get it to do smart things it needs algorithms. Today the most common approaches used in AI are supervised learning, transfer learning, unsupervised learning and reinforcement learning. Note that the nowadays popular term Deep Learning is just a form of supervised learning using (special forms of) neural nets. Supervised learning takes both input and output data (labelled data) and uses algorithms to create computer models that are able to predict the correct label for new input data. Typical applications are image recognition, facial recognition, automatic transcription of audio (speech to text) and automatic translation. Supervised learning takes a lot of data: roughly 50,000 hours of audio are required to train a speech transcription system with human-like performance. Transfer learning is similar to supervised learning, but stores knowledge gained while solving one problem and applies it to a different but related problem, for example applying knowledge gained while learning to recognise cars to recognising trucks. Unsupervised learning doesn't use labelled data and tries to find patterns in data; there are few successful practical applications of unsupervised learning, however. Reinforcement learning also doesn't use labelled data, but uses feedback mechanisms to let the computer programme "learn" how to improve its behaviour. Reinforcement learning is used in AlphaGo (the programme that beat the Go world champion) and in teaching computers to play video games. Reinforcement learning is even more data hungry than the other AI techniques. Besides playing (video) games, there are no practical applications of reinforcement learning yet.
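As a concrete (toy) illustration of supervised learning, here is a minimal sketch: a one-nearest-neighbour classifier that "learns" from labelled (input, output) pairs and predicts the label of new input data. The feature values and labels below are invented purely for illustration.

```python
# Minimal supervised learning sketch: a 1-nearest-neighbour classifier
# trained on labelled (features, label) pairs, then used to predict the
# label of new input data. All data is made up for illustration.

def predict(train, point):
    """Return the label of the training example closest to `point`."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    features, label = min(train, key=lambda ex: dist(ex[0], point))
    return label

# Labelled data: (features, label) -- think of crude image descriptors.
train = [((1.0, 1.0), "cat"), ((1.2, 0.8), "cat"),
         ((4.0, 4.2), "dog"), ((3.8, 4.0), "dog")]

print(predict(train, (1.1, 0.9)))  # -> cat
print(predict(train, (4.1, 3.9)))  # -> dog
```

Real supervised learning systems use far richer models (deep neural nets among them), but the contract is the same: labelled examples in, a predictor for new inputs out.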

What makes AI successful?

As Andrew Ng, Coursera founder and Adjunct Professor at Stanford University, indicates, the most successful applications of AI in practice use supervised learning. He estimates that 99% of the economic value created today with AI uses this approach. The AI-supported optimisation of ad placements on webpages is by far the most successful in terms of the additional revenue it generates for its users. Very little economic value is created with the remaining techniques, despite the high level of attention these have had in the media. Today's "rise" of AI may have come as a surprise. A couple of years ago we were not even aware of the practical usability of AI, let alone imagined that we would have AI on our phone (Siri) or in our house (Alexa) supporting us with everyday tasks. However, AI is nothing new; it has been researched since the 1960s. The current leading algorithm used to estimate Deep Learning neural networks, backpropagation, was popularised by Geoffrey Hinton in 1986 but has its roots in the 1960s. Lack of data and computational power made the algorithm impractical. This has changed as the availability of (labelled) data has grown tremendously and, more importantly, computing power has increased significantly with the introduction of GPU computing. These two factors are the key reasons AI is successful today. So it's not research-driven progress, but engineering-driven progress. Still, for the best-performing supervised learning applications, supercomputers or High Performance Computing (HPC) systems are required because huge neural nets need to be constructed and estimated. To illustrate, Google's AlphaGo programme ran on special hardware with 1,202 CPUs and 176 GPUs when playing against Go champion Lee Sedol. Many experts, among them roboticist and AI researcher Rodney Brooks, question whether much progress can be expected, as computational power is not expected to increase much further. Therefore, it could be that we're not at the beginning of an AI revolution, but at the end of one.

What can we expect from AI in the future?

Browsing through the newspapers and other media, the number of stories on the achievements of AI and how it will impact the world is huge. Futurist predictions about what AI will allow us to do in the future are mind-boggling. Will we really be able to upload our mind to a computer and live forever, or learn Kung Fu like Neo in The Matrix? Most of these predictions state that AI will increase in power quickly, assuming it is driven by an exponential law of progress similar to Moore's law. This is doubtful, as for AI to acquire the predicted powers it not only requires faster computers, it also requires smarter and more capable software and algorithms. Trouble is, research progress doesn't follow a law or pattern and therefore can't be predicted. Deep Learning took 30 years to deliver value, and many AI researchers see it as an isolated event. As Rodney Brooks says, there is no "law" that dictates when the next breakthrough in AI will happen. It can be tomorrow, but it can also take 100 years. I think most futurists make the same prediction mistake as many of us do: we tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run (Roy Amara's law). Take computers, for example. When they were introduced in the 1950s there was widespread fear that they would take over all jobs. Now, 60 years later, most jobs are still there, new jobs have been created due to the introduction of computers, and we have applications of computers we never even imagined.


As Niels Bohr said many years ago, "Prediction is very difficult, especially about the future." This also applies to predicting how Artificial Intelligence will develop in the next years. AI today is capable of performing very narrow tasks well, but the success is very brittle: change the rules of the task slightly and it needs to be retrained and tuned all over again. For sure there will be progress, and more of the activities we do will get automated. Andrew Ng has a nice rule of thumb for it: any mental activity that takes a human about a second of thought will get automated with AI. This will impact jobs, but at a much slower rate than many predict. This will give us the time to learn how to safely design and use this technology, similar to the way we learned to use computers. So, when we are realistic about what AI can do in the future, there is no need to get too excited or upset; sit back and enjoy Hollywood's AI doomsday movies and other fairy tales about AI. If you have the time, I recommend reading some of the work AI researchers publish, for example Rodney Brooks, Andrew Ng and John Holland, or scholars like Jaron Lanier and Daniel Dennett.

Sunday, 5 November 2017

To assure public safety, regulatory agencies like the ACM in the Netherlands or Ofgem in the UK monitor the performance of gas and electricity grid operators in areas like costs, safety and the quality of their networks. The regulator compares the performance of the grid operators and decides on incentives to stimulate improvements in these areas. The difficulty with these comparisons is that grid operators use different definitions and/or methodologies to calculate performance, which complicates a like-for-like comparison across grid operators on, for example, asset health, criticality or risk. In the UK this has led to a new concept for risk calculations: monetised risk. In calculating monetised risk not only the probability of failure of the asset is used; the probability of the consequence of a failure and its financial impact are also taken into account. The question is whether this new method delivers more insightful risk estimations that allow for a better comparison among grid operators. Also, will it support fair risk trading among asset groups or the development of improved risk mitigation strategies?

The cost-risk trade-off that grid operators need to make is complex. Costly risk-reducing adjustments to the grid need to be weighed against the rise in the cost of operating the network and therefore the rates consumers pay for using the grid. To make the trade-off, an estimate of the probability of failure of an asset is required. In most cases, specific analytical models are developed to estimate these probabilities. Using pipeline characteristics like type of material and age, together with data on the environment the pipeline is in (e.g. soil type, temperature and humidity), pipeline-specific failure rate models can be created. Results from inspections of the pipeline can be used to further calibrate the model. Due to the increased analytics maturity of grid operators, these models are becoming more common. Grid operators are also starting to incorporate these failure rate models in the creation of their maintenance plans.

Averaging the Risk

As you can probably imagine, there are many ways of constructing failure rate models. This makes it difficult for a regulator to compare reported asset conditions from the grid operators, as these estimates could have been based on different assumptions and modelling techniques. That is why, in the UK at least, it was agreed between the four major gas distribution networks (GDNs) to standardise the approach. In short, the method can be described as follows.

1. Identify the failure modes of each asset category/sub-group in the asset base and estimate the probability of failure for each identified failure mode.

2. For each failure mode, identify the consequences of the failure, including the probability of the consequence occurring.

3. For each consequence, estimate the monetary impact.

4. By summing over all failure modes and consequences, a probability-weighted estimate of monetised risk for an asset category/sub-group is calculated. Summing over all asset categories/sub-groups gives a total level of monetised risk for the grid.
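The roll-up in the steps above can be sketched in a few lines of Python. The asset groups, probabilities and impacts below are invented for illustration, not taken from any GDN's actual submission.

```python
# Sketch of the monetised-risk roll-up: for each failure mode,
# P(failure) * P(consequence | failure) * monetary impact, summed over
# all modes and consequences, then over all asset groups.
# All numbers below are made up for illustration.

asset_groups = {
    "steel mains": [
        # (P(failure), [(P(consequence | failure), impact in EUR), ...])
        (0.02, [(0.5, 100_000), (0.1, 1_000_000)]),
        (0.01, [(0.8, 50_000)]),
    ],
    "PE services": [
        (0.005, [(0.2, 250_000)]),
    ],
}

def monetised_risk(modes):
    return sum(p_fail * p_cons * impact
               for p_fail, consequences in modes
               for p_cons, impact in consequences)

per_group = {g: monetised_risk(m) for g, m in asset_groups.items()}
total = sum(per_group.values())
print(per_group, total)
```

Note that every term in the sum is an expected value, which is precisely the property questioned later in this post.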

This new standardised way of calculating risks makes the performance evaluation much easier; it also allows for a more in-depth comparison. See the official documentation for more details on the method.

An interesting part of this new way of reporting risk is the explicit and standardised way of modelling asset failure, the consequence of asset failure and the cost of the consequence. This is similar to how a consolidated financial statement of a firm is created; you could therefore interpret it as a consolidated risk statement. But can risks of individual assets or asset groups be aggregated in the described way and provide a meaningful estimate of the total actual risk? The approach described above sums the estimated (or weighted average) risk for each asset category/sub-group, so it's an estimate of the average risk for the complete asset base. However, risk management is not about looking at the average risk; it's about extreme values. Those who have read Sam Savage's The Flaw of Averages or Nassim Taleb's The Black Swan know what I'm talking about.

Risking the Average

Risks are characterised by extreme outcomes, not averages. To be able to analyse extreme values, a probability distribution of the outcome you're interested in is required. Averaging reduces the distribution of all possible outcomes to a point estimate, hiding the spread and likelihood of all possible outcomes. Averaging risks also ignores the dependence between each of the identified modes of failure or consequence. To illustrate, let's assume that we have 5 pipelines, each with a probability of failure of 20%. There is only one consequence (probability = 1) with a monetary impact of €1,000,000. The monetised risk per pipeline then becomes €200,000 (= 0.20 × €1,000,000); for the total grid it is €1,000,000. If we take dependence of the failures into account, there will be a 20% probability of all pipes failing when these are fully correlated events, but only a 0.032% chance of all pipes failing if they are fully independent. The estimated financial impact of that event then ranges from €1,000,000 in the fully correlated case to €1,600 in the fully independent case. That's quite a range, and it isn't visible in the monetised risk approach.
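The numbers in this example can be checked with a few lines of Python. The "tail" figures below are the expected loss contributed by the all-pipes-fail event under each dependence assumption; note that the monetised-risk (expected value) figure is identical in both cases, which is exactly why it hides the range.

```python
# The dependence example worked out: 5 pipelines, each with a 20%
# failure probability and a EUR 1,000,000 impact per failure.

p, n, impact = 0.20, 5, 1_000_000

# Monetised-risk (expected value) view: the same under any dependence.
expected_loss = n * p * impact                      # EUR 1,000,000

# Probability that ALL pipelines fail:
p_all_correlated = p           # fully correlated: one joint event, 20%
p_all_independent = p ** n     # fully independent: 0.2^5 = 0.032%

# Expected loss of the "all fail" event (5 x EUR 1,000,000 at stake):
tail_correlated = p_all_correlated * n * impact     # EUR 1,000,000
tail_independent = p_all_independent * n * impact   # EUR 1,600

print(expected_loss, tail_correlated, tail_independent)
```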

Regulators must assess risk in many different areas. Banking has been top of mind in the past years, but industries like pharma and utilities have also had a lot of attention. How a regulator decides to measure and assess risk is very important. If risks are underestimated, this could impact society (like a banking crisis, deaths due to the approval of unsafe drugs, or an increase in injuries due to pipeline failures). If risks are overestimated, costly mitigation might be imposed, again impacting society with high costs. The above example shows that the monetised risk approach is insufficient, as it estimates risk with averages, whereas in risk mitigation the extreme values are much more important. What, then, is a better way of aggregating these uncertainties and risks than just averaging them?

Monte Carlo Simulation

The best way to understand the financial impact of asset failure is to construct a probability distribution of all possible outcomes using Monte Carlo simulation, and to make the trade-off between costs and risk based on that distribution. Monte Carlo simulation has proven its value in many industries and in this case will provide what we need. Using the free tools of Sam Savage's probabilitymanagement.org, the above hypothetical example of 5 pipelines can be modelled and the distribution of financial impact analysed. In just a few minutes the below cumulative distribution (CDF) of the financial impact for the 5-pipelines case can be created. Remember that the monetised risk calculation resulted in a risk level equal to the average, €1,000,000.
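The same distribution can be approximated with a short Monte Carlo sketch in Python (assuming fully independent failures, which is my assumption, not part of the original example). Reading the simulated distribution strictly below the €1,000,000 monetised-risk figure gives roughly the 33% that the chart shows.

```python
import random

# Monte Carlo sketch of the 5-pipeline example, assuming independent
# failures: simulate the number of failing pipelines many times and
# read off the probability that the financial impact stays below the
# EUR 1,000,000 monetised-risk figure.

random.seed(42)
P_FAIL, N_PIPES, IMPACT = 0.20, 5, 1_000_000
TRIALS = 200_000

losses = []
for _ in range(TRIALS):
    failures = sum(random.random() < P_FAIL for _ in range(N_PIPES))
    losses.append(failures * IMPACT)

monetised_risk = 1_000_000
p_below = sum(loss < monetised_risk for loss in losses) / TRIALS
print(f"P(financial impact < monetised risk) ~ {p_below:.2f}")
```

Under independence the exact figure is P(no failures) = 0.8^5 ≈ 32.8%; the strict inequality matters because the impact distribution is discrete, with a jump exactly at €1,000,000 (one failure).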

From the graph it immediately follows that P(Financial Impact ≤ Monetised Risk) = 33%, which implies that P(Financial Impact > Monetised Risk) = 1 - 33% = 67%. So there is a 67% chance that the financial impact of pipe failures will be higher than the calculated monetised risk. We are therefore taking a serious risk by using the averaged asset risks. Given the objective of better comparison of grid operator performance and enabling risk trading between asset groups, the monetised risk method is too simple, I would say. By averaging the risks, the distribution of financial impact is rolled up into one number, leaving you no clue what the actual distribution looks like (see also Sam Savage's The Flaw of Averages). A better way would be to set an acceptable "risk threshold" (say 95%) and use the estimated CDF to determine the corresponding financial impact.

This
approach would also allow for better comparison of grid operators by creating a
cumulative distribution for all of them and plotting them together into one
graph (See example below). In a similar way risk mitigations can be evaluated and
comparisons made between different asset groups, allowing for better informed
risk trading.

Standardising the way in which asset failures and the consequences of failures are estimated and monetised is definitely a good step towards a comparable way of measuring risk. But risks should not be averaged in the way the monetised risk approach suggests. There are better ways, which provide insight into the whole distribution of risk. Given the available tools and computing power, there is no reason not to use them. They will improve our insight into the risks we face and help us find the best mitigation strategies for reducing public risk.

Friday, 27 January 2017

What is the latest data science success story you have read? The one from Walmart? Maybe a fascinating result from a Kaggle competition? I'm always interested in these stories, wanting to understand what was achieved, why it was important and what the drivers for success were. Although the buzz around the potential of data science is very strong, the number of stories on impactful practical applications of data science is still not very large. The Harvard Business Review recently published an article explaining why organisations are not getting value from their data science initiatives. Although there are many more reasons than those mentioned in the article, one key reason many initiatives fail is a disconnect between the business goals and the data science efforts. The article also states that data scientists tend to keep fine-tuning their models instead of taking on new business questions, causing delays in the speed at which business problems are analysed and solved.

Seduced by inflated promises, organisations have started to mine their data with state-of-the-art algorithms, expecting it to be turned into gold instantly. This expectation that technology will act as a philosopher's stone makes data science comparable to alchemy: it looks like science, but it isn't. Most of the algorithms fail to deliver value, as they can't provide an explanation as to why things are happening, nor provide actionable insights or guidance for influencing the phenomena being investigated. To illustrate, take the London riots in 2011. Since the 2009 G20 summit, the UK police had been gathering and analysing a lot of social media data, but still they were not able to prevent the 2011 riots from happening, nor track and arrest the rioters. Did the police have too little data, or a lack of computing or algorithmic power? No, millions had been spent. Despite all the available technology, the police were unable to make sense of it all. I see other organisations struggle with the same problem trying to make sense of their data. Although I'm a strong proponent of using data and mathematics (and as such data science) for answering business questions, I do believe that technology alone can never be sufficient to provide an answer. The same goes for the amount, diversity and speed of the data.

Inference vs Prediction

Let's investigate the disconnect between the business goals and the data science efforts mentioned in the HBR article. Many of today's data science initiatives result in predictive models. In a B2C context these models are used to predict whether you're going to click on an ad, buy a suggested product, churn, commit fraud or default on a loan. Although a lot of effort goes into creating highly accurate predictions, the question is whether these predictions really create business value. Most organisations require a way to influence the phenomenon being predicted, not just the prediction itself; this allows them to decide on the appropriate actions to take. Therefore, understanding what makes you click, buy, churn, default or commit fraud is the real objective. Understanding what influences human behaviour requires another approach than creating predictions: it requires inference. Inference is a statistical, hypothesis-driven approach to modelling that focusses on understanding the causality of a relationship. Computer science, the core of most data science methods, focusses on finding the best model to fit the data and doesn't focus on understanding why. Inferential models provide the decision maker with guidance on how to influence customer behaviour, and thus value can be created. This might better explain the disconnect between business goals and the analytics efforts reported in the HBR article. For example, knowing that a call positively influences customer experience and prevents churn for a specific type of customer gives the decision maker the opportunity to plan such a call. Prediction models can't provide these insights; they will provide the expected number of churners or who is most likely to churn, but how to react to these predictions is left to the decision maker.
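To make the inference idea concrete, here is a hypothetical sketch: synthetic customer data in which a retention call lowers churn, a logistic regression fitted by plain gradient descent, and the fitted coefficient read off as the quantity of interest. The data, effect sizes and learning settings are all invented; the point is that the coefficient (and its odds ratio), not the churn scores, is what tells the decision maker that calling helps.

```python
import math
import random

# Inference sketch on hypothetical data: regress churn on "received a
# retention call" and read the coefficient. A negative coefficient
# (odds ratio < 1) is the inferential insight: calling is associated
# with less churn.

random.seed(1)
# Simulate 2,000 customers; a call lowers churn probability from 40% to 20%.
data = [(called, 1 if random.random() < (0.2 if called else 0.4) else 0)
        for called in [0, 1] * 1_000]

b0 = b1 = 0.0                 # intercept and coefficient for "called"
lr = 0.1
for _ in range(1_000):        # plain batch gradient descent
    g0 = g1 = 0.0
    for x, y in data:
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        g0 += p - y
        g1 += (p - y) * x
    b0 -= lr * g0 / len(data)
    b1 -= lr * g1 / len(data)

print(f"coefficient for 'called': {b1:.2f} (odds ratio {math.exp(b1):.2f})")
```

In practice you would use a statistics package (and worry about confounding: who got called is rarely random), but the contrast with a pure prediction pipeline, where the coefficients are never inspected, should be clear.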

Keep it simple!

A second reason for failure mentioned in the HBR article is that data scientists put a lot of effort into improving the predictive accuracy of their models instead of taking on new business questions. The reason given for this behaviour is the huge effort required to get the data ready for analysis and modelling. A consequence of this tendency is that it increases model complexity. Is this complexity really required? From a user's perspective, complex models are more difficult to understand and therefore also more difficult to adopt, trust and use. For easy acceptance and deployment, it is better to have understandable models. Sometimes this is even a legal requirement, for example in credit scoring. A best practice I apply in my work as a consultant is to balance the model accuracy against the accuracy required for the decision to be made, the analytics maturity of the decision maker and the accuracy of the data. This also applies to data science projects. For example, targeting the receivers of your next marketing campaign requires less accuracy than having a self-driving car find its way to its destination. Also, you can't make more accurate predictions than the accuracy of your data allows. Most data are uncertain, biased, incomplete and contain errors, and when you have a lot of data this becomes even worse. This will negatively influence the quality and applicability of the model based on these data. In addition, research shows that the added value of more complex methods is marginal compared to what can be achieved with simple methods. Simple models already catch most of the signal in the data, enough in most practical situations to base a decision on. So, instead of creating a very complex and highly accurate model, it is better to test various simple ones. They will capture the essence of what is in the data and speed up the analysis. From a business perspective, this is exactly what you should ask your data scientists to do: come up with simple models fast and, if required for the decision, use the insights from these simple models to direct the construction of more advanced ones.
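The "simple models catch most of the signal" point can be illustrated with a toy experiment on made-up data: fit a simple degree-1 and a needlessly complex degree-9 polynomial to the same noisy, essentially linear data, and compare the error against the true relationship on held-out points.

```python
import numpy as np

# Toy illustration: on noisy but essentially linear data, a simple
# model generalises while a complex one overfits the noise.
# All data below is synthetic.

rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(0, 10, 10))
y_train = 2 * x_train + 1 + rng.normal(0, 2, 10)   # linear signal + noise
x_test = np.linspace(0.5, 9.5, 200)
y_true = 2 * x_test + 1                            # noiseless truth

mse = {}
for degree in (1, 9):                              # simple vs complex fit
    coefs = np.polyfit(x_train, y_train, degree)
    mse[degree] = float(np.mean((np.polyval(coefs, x_test) - y_true) ** 2))
    print(f"degree {degree}: test MSE = {mse[degree]:.1f}")
```

The degree-9 fit passes (almost) exactly through the noisy training points and oscillates wildly in between, so its held-out error is far worse than the simple fit's, despite its perfect training accuracy.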

The question "How do I get value from my data science initiative?" has no simple answer. There are many reasons why data science projects succeed or fail; the HBR article mentions only a few. I'm confident that the above considerations and recommendations will increase the chances of your next data science initiative being successful. I can't promise you gold, however; I'm no alchemist.

Thursday, 13 October 2016

We are all well aware of the predictive analytical capabilities of companies like Netflix, Amazon and Google. Netflix predicts the next film you are going to watch, Amazon shortens delivery times by predicting what you are going to buy next, and Google even lets you use their algorithms to build your own prediction models. Following the predictive successes of Netflix, Google and Amazon, companies in telecom, finance, insurance and retail have started to use predictive analytical models and developed the analytical capabilities to improve their business. Predictive analytics can be applied to a wide range of business questions and has been a key technique in search, advertising and recommendations. Many of today's applications of predictive analytics are in the commercial arena, focusing on predicting customer behaviour. First steps in other businesses are being taken: organisations in healthcare, industry and utilities are investigating what value predictive analytics can bring. In these first steps much can be learned from the experience the front-running industries have in building and using predictive analytical models. However, care must be taken, as the context in which predictive analytics has been used is quite different from the new application areas, especially when it comes to the impact of prediction errors.

Leveraging the data

It goes without saying that the success of Amazon comes from, besides the infinite shelf space, its recommendation engine. Similarly for Netflix. According to McKinsey, 35 percent of what consumers purchase on Amazon and 75 percent of what they watch on Netflix comes from algorithmic product recommendations. Recommendation engines work well because there is a lot of data available on customers, products and transactions, especially online. This abundance of data is why there are so many predictive analytics initiatives in sales & marketing. The main objective of these initiatives is to predict customer behaviour: which customer is likely to churn or to buy a specific product or service, which ads will be clicked on, or what marketing channel to use to reach a certain type of customer. In these types of applications, predictive models are created using either statistical techniques (like regression, probit or logit) or machine learning techniques (like random forests or deep learning). With the insights gained from using these predictive models, many organisations have been able to increase their revenues.

Predictions always contain errors!

Predictive analytics has many applications; the above-mentioned examples are just the tip of the iceberg. Many of them will add value, but it remains important to stress that the outcome of a prediction model will always contain an error, and decision makers need to know how big that error is. To illustrate: in using historic data to predict the future you assume that the future will have the same dynamics as the past, an assumption which history has proven to be dangerous. The 2008 financial crisis is proof of that. Even though there is no shortage of data nowadays, there will be factors that influence the phenomenon you're predicting (like churn) that are not included in your data. Also, the data itself will contain errors, as measurements always include some kind of error. Last but not least, models are always an abstraction of reality and can't contain every detail, so something is always left out. All of this will impact the accuracy and precision of your predictive model. Decision makers should be aware of these errors and the impact they may have on their decisions.

When statistical techniques are used to build a predictive model, the model error can be estimated; it is usually provided in the form of confidence intervals. Any statistical package will provide them, helping you assess the model quality and its prediction errors. In the past few years other techniques have become popular for building predictive models, for example algorithms like deep learning and random forests. Although these techniques are powerful and able to provide accurate predictive models, they typically do not provide confidence intervals (or error bars) for their predictions, so there is no way of telling how accurate or precise the predictions are. In marketing and sales this may be less of an issue; the consequence might be that you call the wrong people or show an ad to the wrong audience. The consequences can however be more severe. You might remember the offensive auto-tagging by Flickr, labelling images of people with tags like "ape" or "animal", or the racial bias in predictive policing algorithms.
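One generic way to attach error bars to a model that only returns point estimates is the bootstrap: refit the model on resamples of the training data and use the spread of the refitted predictions as an interval. A sketch with made-up data and a simple linear fit follows; the same recipe applies to random forests and similar black-box models, at the cost of many refits.

```python
import numpy as np

# Bootstrap error bars for a point prediction: refit on resamples of
# the (synthetic) training data and take percentiles of the refitted
# predictions at the point of interest.

rng = np.random.default_rng(7)
x = rng.uniform(0, 10, 100)
y = 3 * x + 5 + rng.normal(0, 3, 100)     # made-up training data
x_new = 6.0                               # input we want a prediction for

preds = []
for _ in range(1_000):
    idx = rng.integers(0, len(x), len(x))           # resample with replacement
    coefs = np.polyfit(x[idx], y[idx], 1)           # refit the model
    preds.append(float(np.polyval(coefs, x_new)))

lo, hi = np.percentile(preds, [2.5, 97.5])
print(f"prediction at x=6: {np.mean(preds):.1f}, "
      f"95% interval [{lo:.1f}, {hi:.1f}]")
```

A narrow interval tells the decision maker the prediction is stable under resampling of the data; a wide one is exactly the warning sign a bare point estimate hides.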

Where is the error bar?

The point that I would like to make is that when adopting predictive modelling, be sure to have a way of estimating the error in your predictions, both in accuracy and in precision. In statistics this is common practice and helps improve models and decision making. Models constructed with machine learning techniques usually only provide point estimates (for example, the probability of churn for a customer is some percentage), which provides little insight into the accuracy or precision of the prediction. When using machine learning it is possible to construct error estimates (see for example the research of Michael I. Jordan), but it is not common practice yet; many analytical practitioners are not even aware of the possibility. Especially now that predictive modelling is getting used in environments where errors can have a large impact, this should be top of mind for both the analytics professional and the decision maker. Just imagine your doctor concluding that your liver needs to be taken out because a predictive model estimates a high probability of a very nasty disease. Wouldn't your first question be how certain he or she is about that prediction? So, my advice to decision makers: only use outcomes of predictive models if accuracy and precision measures are provided. If they are not there, ask for them. Without them, a decision based on these predictions comes close to a blind leap of faith.

Wednesday, 3 August 2016

One of the main news items of the past few days is the increased level of security at Amsterdam Schiphol Airport and the additional delays it has caused travellers, both incoming and outgoing. Extra security checks on the roads around the airport are being conducted, and additional checks are being performed in the airport as well. Security has been intensified after the authorities received reports of a possible threat. We are in the peak of the holiday season, when around 170,000 passengers per day arrive, depart or transfer at Schiphol Airport. With these numbers of people, the authorities understandably want to do their utmost to keep us safe, as always. This intensified security puts the military police (MP) and security officers under stress, however, as more needs to be done with the same number of people. It will be difficult for them to keep up the increased number of checks for long, and additional resources will be required, for example from the military. The question is whether security really improves with these additional checks, or whether a more differentiated approach could offer more security (lower risk) with less effort.

How has airport security evolved?

If I take a plane to my holiday destination, I need to take off my coat, my shoes and my belt, get my laptop and other electronic equipment out of my bag, separate the chargers and batteries, hand in my excess liquids, empty my pockets, and step through a security scanner. This takes time, and with an increasing number of passengers, waiting times will increase. We all know these measures are necessary to keep us safe, but it means a trip abroad doesn't start very enjoyably. These measures have been adopted to prevent the same attack from happening again and have resulted in the current rule-based system of security checks. Over the years the number of security measures has increased enormously (see for example the timeline on the TSA website), making it a resource-heavy activity which can't be continued in the same way in the near future. A smarter way is needed.

Risk Based Screening

At present most airports apply the same security measures to all passengers, a
one size fits all approach. This means that low risk passengers are subject to
the same checks as high risk passengers, and that changes to the security checks
can have an enormous impact on resource requirements. Introducing a one minute
additional check by a security officer for all passengers at Schiphol requires
about 354 additional security officers to check 170,000 passengers a day. A
smarter way would be to apply different measures to different passenger types:
high risk measures to high risk passengers and low risk measures to low risk
passengers. This risk based approach is the foundation of SURE! (Smart
Unpredictable Risk Based Entry), a concept introduced by the NCTV (the National
Coordinator for Security and Counterterrorism). Consider this: what is more
threatening, a non-threat passenger carrying banned items (a pocket knife, a
water bottle) or a threat passenger with bad intentions (and no banned items)?
I guess you will agree that the latter is the more threatening one, and this is
exactly what risk based screening focuses on. A key component of risk based
security is deciding which security measures to apply to which passenger, taking
into account that attackers will adapt their plans when additional security
measures are installed.
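The staffing figure above follows from simple arithmetic. A quick sketch (the 8-hour shift length, and the assumption that the 170,000 passengers pass through within a single day, are my own illustrative assumptions):

```python
# Back-of-the-envelope staffing estimate for one extra minute of checking
# per passenger. The 8-hour shift and the one-day window for handling
# 170,000 passengers are illustrative assumptions.
passengers_per_day = 170_000
extra_minutes_per_passenger = 1
shift_hours = 8

extra_officer_hours = passengers_per_day * extra_minutes_per_passenger / 60
officers_needed = extra_officer_hours / shift_hours
print(round(officers_needed))  # prints 354
```

A seemingly harmless one minute check thus translates into hundreds of extra full-time officers.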

Operations Research helps safeguard us

The concept of risk based screening makes sense, as scarce resources like
security officers, military police and scanners are utilised better. In the one
size fits all approach many of these resources are used to screen low risk
passengers, and as a consequence fewer resources are available for detecting
high risk passengers. Still, even with risk based screening trade-offs must be
made, as resources will remain scarce. Decisions also need to be made in an
uncertain and continuously changing environment, with little, false or no
information. Sound familiar? This is exactly the area where Operations Research
shines. Decision making under uncertainty can be supported by, for example,
simulation, Bayesian belief networks, Markov decision processes and control
theory models. Using game theoretic concepts the behaviour of attackers can be
modelled and incorporated, leading to the identification of new and robust
countermeasures. Queuing theory and waiting line models can be used to analyse
various security check configurations (for example centralised versus
decentralised, and yes, centralised is better!) including the required staffing.
This will help airports develop efficient and effective security checks,
limiting the impact on passengers while achieving the highest possible risk
reduction. These are but a few examples of where OR can help; there are many
more.
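As a rough illustration of why centralised (pooled) checkpoints outperform decentralised ones, the classic Erlang C formula for an M/M/c queue can be used to compare one pooled checkpoint with four lanes against four separate single-lane checkpoints. The arrival and service rates below are illustrative assumptions, not airport data:

```python
from math import factorial

def erlang_c(c: int, a: float) -> float:
    """Probability an arrival must wait in an M/M/c queue (Erlang C),
    with offered load a = lam / mu (requires a < c for stability)."""
    term = a**c / factorial(c) * (c / (c - a))
    return term / (sum(a**k / factorial(k) for k in range(c)) + term)

def mean_wait(lam: float, mu: float, c: int) -> float:
    """Mean time in queue (excluding service) for an M/M/c queue."""
    return erlang_c(c, lam / mu) / (c * mu - lam)

lam, mu = 3.2, 1.0  # passengers/min arriving; each lane serves 1/min

central = mean_wait(lam, mu, 4)        # one pooled checkpoint, 4 lanes
decentral = mean_wait(lam / 4, mu, 1)  # four separate single-lane checkpoints

# Pooling the lanes cuts the mean wait from 4.00 to roughly 0.75 minutes.
print(f"centralised: {central:.2f} min, decentralised: {decentral:.2f} min")
```

Same total capacity, same total demand, yet the pooled configuration waits far less: an idle lane can serve anyone in a common queue, while an idle decentralised lane helps nobody queued elsewhere.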

Some of the concepts of risk based security checks resulting from the SURE!
programme have already been put into practice. Schiphol is working towards
centralised security and recently opened the security checkpoint of the future
for passengers travelling within Europe. It's good to know that the decision
making rigour comes from Operations Research, resulting in effective, efficient
and passenger friendly security checks.

Thursday, 21 July 2016

Every utility deploys capital assets to serve its customers. During the asset
life cycle an asset manager must repeatedly make complex decisions with the
objective of minimising asset life cycle cost while maintaining high
availability and reliability of the assets and networks. Avoiding unexpected
outages, managing risk and maintaining assets before failure are critical goals
for improving customer satisfaction. To better manage asset and network
performance, utilities are starting to adopt a data driven approach. With
analytics they expect to lower asset life cycle cost while maintaining high
availability and reliability of their networks. Using actual performance data,
asset condition models are created which provide insight into how assets
deteriorate over time and what the driving factors of deterioration are. With
these insights, forecasts can be made of future asset and network performance.
These models are useful, but lack the ability to effectively support the asset
manager in designing a robust and cost effective maintenance strategy.

Asset condition models allow for the ranking of assets based on their expected
time to failure. Within utilities it is common practice to use this ranking to
decide which assets to maintain. Starting with the assets with the shortest time
to failure, assets are selected for maintenance until the available maintenance
budget is exhausted. This prioritisation approach ensures that the assets most
prone to failure are selected for maintenance, but it will not deliver the
maintenance strategy with the highest overall reduction of risk. The approach
also can't effectively handle constraints beyond the budget constraint, for
example constraints on manpower availability, precedence constraints between
maintenance projects, or required materials and equipment. Therefore a better
way of determining a maintenance strategy is required, one that takes all these
decision dimensions into account. More advanced analytical methods, like
mathematical optimisation (prescriptive analytics), provide the asset manager
with the required decision support.

In finding the best maintenance strategy the asset manager could, instead of
making a ranking, list all possible subsets of maintenance projects that are
within budget and calculate the total risk reduction of each subset. The best
subset of projects to select would be the one with the highest overall risk
reduction (or any other measure). This way of selecting projects also allows
additional constraints, like required manpower, required equipment or spare
parts, or time dependent budget limits, to be taken into account: subsets that
do not fulfil these requirements are simply left out. Subsets could also be
constructed in such a manner that mandatory maintenance projects are always
included. With a small number of projects this way of selecting projects is
feasible; 10 projects lead to 1,024 (=2^10) possible subsets. But with large
numbers it is not: a set of 100 potential projects leads to about 1.27*10^30
(=2^100) possible subsets, which would take far too much time, if it were
possible at all, to construct and evaluate. This is exactly where mathematical
optimisation proves its value, because it allows you to implicitly construct and
evaluate all feasible subsets of projects, fulfilling not only the budget
constraint but any other constraint that needs to be included. Selecting the
best subset is achieved by using an objective function which expresses how you
value each subset. Using mathematical optimisation assures that the best
possible solution will be found. Mathematical optimisation has proven its value
many times in many industries, including utilities, and in disciplines like
maintenance. The Midwest ISO, for example, uses optimisation techniques to
continuously balance energy production with energy consumption, including the
distribution of electricity in their networks. Other asset heavy industries like
petrochemicals use optimisation modelling to identify cost effective, reliable
and safe maintenance strategies.
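The gap between the ranking approach and evaluating subsets can be shown with a toy example. The project costs and risk reductions below are made up purely for illustration, and the exhaustive loop stands in for what an optimisation solver does implicitly on realistic problem sizes:

```python
from itertools import combinations

# Hypothetical maintenance projects as (cost, risk_reduction) pairs;
# the numbers are invented for illustration only.
projects = [(4, 7), (3, 6), (5, 8), (2, 3), (6, 9), (1, 2)]
budget = 10

# Ranking approach: take projects in order of risk reduction until
# the budget runs out.
greedy_risk, spent = 0, 0
for cost, risk in sorted(projects, key=lambda p: -p[1]):
    if spent + cost <= budget:
        spent += cost
        greedy_risk += risk

# Exhaustive approach: score every feasible subset (2^n of them, so
# only viable for small n; a solver handles this implicitly).
best_risk = max(
    sum(r for _, r in subset)
    for n in range(len(projects) + 1)
    for subset in combinations(projects, n)
    if sum(c for c, _ in subset) <= budget
)

print(greedy_risk, best_risk)  # prints 16 18
```

The ranking picks the two biggest risk reducers and stops at 16, while the best feasible subset combines four cheaper projects for a total of 18: exactly the kind of trade-off a budget-constrained ranking cannot see.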

In improving their asset maintenance strategies, utilities' best next step is to
adopt mathematical optimisation. It allows them to leverage the insights from
their asset condition models and turn these insights into value adding
maintenance decisions. Compared to their current rule based selection of
maintenance projects, in which they can only evaluate a limited number of
alternatives, they can improve significantly, as mathematical optimisation lets
them evaluate trillions (possibly all) of the alternative maintenance strategies
within seconds. Although "rules of thumb", "politics" and "intuition" will
always provide a solution that is "good", mathematical optimisation assures that
the best solution will be found.

Tuesday, 19 July 2016

Data driven decision making has proven to be key to organisational performance
improvements. This stimulates organisations to gather data, analyse it and use
decision support models to improve both the speed and quality of their decision
making. With the rapid decline in the cost of both storage and computing power,
there are hardly any limits to what you can store or analyse. As a result
organisations have started building data lakes and have invested in big data
analytics platforms to store and analyse as much data as possible. This is
especially true in the consumer goods and services sector, where big data
technology can be transformative as it enables a very granular analysis of human
activity (down to the personal level). With these granular insights companies
can personalise their offerings, potentially increasing revenue by selling
additional products or services. This allows new business models to emerge and
is changing the way of doing business completely. As the potential of all this
data is huge, many organisations are investing in big data technology expecting
plug and play inference to support their decision making. Big data practice,
however, is something different, and is full of rude awakenings and headaches.

That big data technology can create value is proven by the fact that companies
like Google, Facebook and Amazon exist and do well. Surveys from Gartner and IDC
show that the number of companies adopting big data technology is increasing
fast. Many of them want to use this technology to improve their business and
start using it in an exploratory manner. When asked about the results they get
from their analyses, many respond that they experience difficulty getting
results due to data issues; others report difficulty getting insights that go
beyond preaching to the choir. Some even report disappointment, as their
outcomes turn out to be wrong when put into practice. The lack of experienced
analytical talent is often mentioned as a reason for this, but there is more to
it. Although big data has the potential to be transformative, it also comes with
fundamental challenges which, when not acknowledged, can cause unrealistic
expectations and disappointing results. Some of these challenges are even
unsolvable at this time.

Even if there is a lot of data, it can't be used properly

To
illustrate some of these fundamental challenges, let’s take an example of an online
retailer. The retailer has data on its customers and uses it to identify generic
customer preferences. Based on the identified preferences, offers are generated
and customers are targeted. The retailer wants to increase revenue and starts to
collect more data on the individual customer level. The retailer wants to use
the additional data to create personalised offerings (the right product, at the
right time, for the right customer, at the right price) and to make predictions
about future preferences (so the retailer can restructure its product portfolio
continuously). In order to do so the retailer needs to find out what the
preferences of its customers are and the drivers of their buying behaviour. This
requires constructing and testing hypotheses based on the customer attributes
gathered. In the old situation the number of available attributes (like address,
gender, past transactions) was small, so only a small number of hypotheses (for
example "women living in a certain part of the city are inclined to buy a
specific brand of white wine") needed to be tested to cover all possible
combinations. However, as the number of attributes increases, the number of
combinations of attributes to be investigated grows exponentially. If in the old
situation the retailer had 10 attributes per customer, a total of 1,024 (=2^10)
possible combinations needed to be evaluated. When the number of attributes
increases to, say, 500 (which in practice is still quite small), the number of
possible combinations of attributes increases to about 3.27*10^150 (=2^500).
This exponential growth causes computational issues, as it becomes impossible to
test all possible hypotheses even with the fastest available computers. The
practical way around this is to significantly reduce the number of attributes
taken into account. This leaves much of the data unused and many possible
combinations of attributes untested, thereby reducing the potential to improve.
It may also cause much of the big data analysis results to be too obvious.

The larger the data set, the stronger the noise

There is another problem with analysing large amounts of data. As the size of a
data set increases, all kinds of patterns will be found, but most of them are
going to be just noise. Recent research has provided proof that as data sets
grow larger they must contain arbitrary correlations. These correlations appear
due to the size, not the nature, of the data, which indicates that most of the
correlations will be spurious. Without proper practical testing of the findings,
this could cause you to act upon a phantom correlation. Testing all the detected
patterns in practice is impossible, as the number of detected correlations
increases exponentially with the data set size. So even though you have more
data available, you're worse off, as too much information behaves like very
little information. Besides the increase in arbitrary correlations in big data
sets, testing the huge number of possible hypotheses is also going to be a
problem. To illustrate: using a significance level of 0.05, testing 50
hypotheses on the same data will give at least one significant result with a
probability of about 92%.
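The 92% figure can be checked directly: with m independent tests at significance level alpha, the chance of at least one false positive is 1 - (1 - alpha)^m. A quick sketch:

```python
# Chance of at least one spuriously "significant" result when running
# m independent hypothesis tests at significance level alpha.
alpha = 0.05

def p_false_positive(m: int, alpha: float = alpha) -> float:
    return 1 - (1 - alpha) ** m

for m in (1, 10, 50, 100):
    print(f"{m:3d} tests -> {p_false_positive(m):.0%} chance of a false positive")
# 50 tests already give a 92% chance of at least one false positive.
```

At 100 tests the probability exceeds 99%, which is why untargeted pattern hunting across thousands of attribute combinations is almost guaranteed to "find" something.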

This implies that we will find an increasing number of statistically significant
results due to chance alone. As a result the number of false positives will
rise, potentially causing you to act upon phantom findings. Note that this is
not only a big data issue, but a small data issue as well: in the above example
we already needed to test 1,024 hypotheses with just 10 attributes.

Data driven decision making has nothing to do with the size of your data

So, should the above challenges stop you from adopting data driven decision
making? No, but be aware that it requires more than just some hardware and a lot
of data. Sure, with a lot of data and enough computing power significant
patterns will be detected, even if you can't identify all the patterns that are
in the data. However, not many of these patterns will be of any interest, as
spurious patterns will vastly outnumber the meaningful ones. Therefore, as the
size of the available data grows, the skill level for analysing the data needs
to grow with it. In my opinion, data and technology (even a lot of it) are no
substitute for brains. The smart way to deal with big data is to extract and
analyse the key information embedded in "mountains of data" and to ignore most
of the rest. You could say that you first need to trim down the haystack to
better locate where the needle is. What remains are collections of small amounts
of data that can be analysed much better. This approach will prevent you from
getting a big headache from your big data initiatives and will improve both the
speed and quality of data driven decision making within your organisation.