Artificial Intelligence Needs To Reset

I had the chance to interview my colleague at ArCompany, Karen Bennet, a seasoned engineering executive in platform technology, open and closed source systems and artificial intelligence technology. A former engineering lead at Yahoo!, and part of the original team that brought Red Hat to success, Karen evolved with the technological revolution, utilizing AI in expert systems in her early IBM days, and is currently bearing witness to the rapid experimentation in machine learning and deep learning. Our discussions about the current state of AI have culminated in this article.

It’s difficult to navigate AI amidst all the hype. The promises of AI, for the most part, have not come to fruition. AI is still emerging and has not become the pervasive force that has been promised. Consider the compelling stats that feed the excitement behind the AI hype:

14X increase in the number of active AI startups since 2000

Investment into AI start-ups by VCs has increased 6X since 2000

The share of jobs requiring AI skills has grown 4.5X since 2013

Statista put out these findings for 2017: only 5% of businesses worldwide had incorporated AI extensively into their processes and offerings, 32% had not yet adopted it, and 22% had no plans to.

Statista: Adoption level of artificial intelligence (AI) in business organizations worldwide, as of 2017

Filip Piekniewski confirmed this in his recent post on VentureBeat, “The AI winter is well on its way”:

We are now in the middle of 2018 and things have changed. Not on the surface yet — the NIPS conference is still oversold, corporate PR still has AI all over its press releases, Elon Musk still keeps promising self-driving cars, and Google keeps pushing Andrew Ng’s line that AI is bigger than electricity. But this narrative is beginning to crack.

We touted the claims of the autonomous driving car. Earlier this spring, the death of a pedestrian struck by a self-driving vehicle raised alarms that went beyond the technology and called into question the ethics, or lack thereof, behind the decisions of an automated system. The trolley problem is not a simple binary choice between sacrificing one life to save five; rather, it evolves into a debate of conscience, emotion and perception that complicates the path by which a machine can reach a reasonable decision. The conclusion from this article states:

But the dream of a fully autonomous car may be further than we realize. There’s growing concern among AI experts that it may be years, if not decades, before self-driving systems can reliably avoid accidents.

To use history as a predictor, both cloud and the dot net industries took about 5 years before they started impacting people in a significant way, and almost 10 years before these industries influenced major shifts in the market. We are envisioning a similar timeline for Artificial Intelligence. As Karen explains,

To enable adoption by everyone, a product needs to be in place, one that is scalable and one that can be used by everyone–not just data scientists. This product will need to take into account the data lifecycle of capturing data, preparing it, training models and predicting. With data being stored in the cloud, data pipelines can continuously extract and prepare them to train the models which will make the predictions. The models need to continuously improve from new training data, which, in turn, will keep the models relevant and transparent. That is the objective and the promise.
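The lifecycle Karen describes (capture, prepare, train, predict, then retrain as new data arrives) can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the record fields `value` and `label`, the function names, and the toy "midpoint between class means" model are all hypothetical and stand in for whatever a real product would use.

```python
from statistics import mean


def prepare(raw_records):
    """Prepare step: drop incomplete records and coerce field types."""
    cleaned = []
    for rec in raw_records:
        if rec.get("value") is None or rec.get("label") is None:
            continue  # skip records with a missing field
        cleaned.append((float(rec["value"]), int(rec["label"])))
    return cleaned


def train(examples):
    """Train step: a toy model, the midpoint between the two class means."""
    positives = [v for v, y in examples if y == 1]
    negatives = [v for v, y in examples if y == 0]
    return (mean(positives) + mean(negatives)) / 2


def predict(threshold, value):
    """Predict step: label 1 when the value reaches the learned threshold."""
    return 1 if value >= threshold else 0


# Capture step: hypothetical records standing in for data pulled from cloud storage.
batch_1 = [
    {"value": 1.0, "label": 0},
    {"value": 2.0, "label": 0},
    {"value": 8.0, "label": 1},
    {"value": None, "label": 1},  # incomplete record, removed by prepare()
    {"value": 10.0, "label": 1},
]
history = prepare(batch_1)
model = train(history)

# New data arrives later; the model is retrained on everything seen so far.
batch_2 = [{"value": 4.0, "label": 1}, {"value": 3.0, "label": 0}]
history = history + prepare(batch_2)
model_v2 = train(history)
```

In this toy run the retrained threshold moves, so a borderline input can be classified differently before and after the new batch, which is the "continuously improve from new training data" loop in miniature.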

Building AI Proof of Concepts with No Significant Use Cases

Both Karen and I have come from technology and AI start-ups. What we’ve witnessed, and what we’ve realized in discussions with peers in the AI community, is widespread experimentation across a multitude of business issues that tends to stay in the labs. This recent article substantiates how common AI pilots are today:

Vendors of AI technology are often incentivized to make their technology sound more capable than it is – but hint at more real-world traction than they actually have… Most AI applications in the enterprise are little more than ‘pilots.’ Primarily, vendor companies that sell marketing solutions, healthcare solutions and finance solutions in artificial intelligence are simply test-driving the technology. In any given industry, we find that of the hundreds of vendor companies selling AI software and technology, only about one in three will actually have the requisite skills to do artificial intelligence in the first place.

VCs are realizing they may not see a return on their investments for some time. However, ubiquitous experimentation with very few models seeing daylight is just one of the reasons why AI is not ready for prime time.

Can Algorithms be Accountable?

We’ve heard of the AI “black box,” a current approach that offers no visibility into how decisions are made. This practice flies in the face of banks and large institutions that have compliance standards and policies mandating accountability. With systems operating as black boxes, there may be an inherent trust put in algorithms as long as their creation has been reviewed by critical stakeholders and has met some standards. This notion has been quickly disputed given the overwhelming evidence of faulty algorithms in production and the unexpected and harmful outcomes that result from them. Many of our simple systems operate as black boxes beyond the scope of any meaningful scrutiny because of intentional corporate secrecy and the lack of adequate education and understanding of how to critically examine the inputs, the outcomes and, most importantly, why these results occurred. Karen concurs,

The AI industry today is at a very early stage of being enterprise-ready. AI is very useful and ready for discovery and aiding in parsing through significant amounts of data, however it still requires human intervention as a guide to evaluate and act on the data and their outcomes.

Karen clarifies that machine learning techniques today enable data to be labeled to identify insights. However, as part of this process, if some of the data are erroneously labeled, if there is not enough data representation, or if there are problematic data signifying bias, bad decision-making results are likely to occur. She also attests that current processes continue to be refined:

Currently, AI is all about decision support, to provide insights into a form for which business can draw conclusions. In the next phase of AI, which automates actions from the data, there are additional issues that need to be addressed like bias, explainability, privacy, diversity, ethics, and continuous model learning.

Karen offers image captioning as an example of an AI model making mistakes: the captions expose the knowledge learned by training on images labeled with the objects they contain. This suggests that having a common sense world model of objects and people is required for an AI product to truly understand. A model exposed only to a limited number of labeled objects and limited variety in the training set will have a limited common sense world model. Research into determining how a model treats its inputs and reaches its conclusions, in human understandable terms, is needed for enterprise use. Amazon’s release of Rekognition, its facial recognition technology, is an example of a technology currently in production and licensed for use while noticeable gaps exist in its effectiveness. According to a study released by the ACLU:

...the technology managed to confuse photos of 28 members of Congress with publicly available mug shots. Given that Amazon actively markets Rekognition to law enforcement agencies across the US, that’s simply not good enough.

Joy Buolamwini, an MIT graduate and founder of the Algorithmic Justice League, called in this latest interview for a moratorium on this technology, stating that it was ineffective and needed more oversight, and has appealed for more government standards for these types of systems before they are publicly released.

AI’s Major Impediments: Mindset, Culture and Legacy

Having to transform legacy systems is the top barrier to implementing AI in many organizations today. Mindset and culture are elements of these legacy systems: they offer a systemic view into the established processes, values, and business rules that dictate how organizations operate, and they explain why these ingrained elements create significant hurdles for business, especially when things are currently humming nicely. There is therefore no real incentive to dismantle infrastructure at the moment. AI is a component of business transformation, and while that topic has gained as much buzz as the AI hype, the investment and commitment required to make significant changes are met with hesitation. We’ve heard from companies that are willing to experiment on specific use cases but are unprepared for the requirements to train, re-engineer processes, and revamp governance and corporate policies. For larger organizations that are compelled to make these significant investments, the question shouldn’t be one of return on investment but rather of sustainable competitive advantage.

The Problems with Data Integrity

AI today needs massive amounts of data to produce meaningful results but is unable to leverage experience from another application. While Karen argued that there is work in progress to overcome these limitations, the transfer of learning is needed before models can be applied in a scalable way. There are scenarios, however, where AI can be used effectively today, such as revealing insights in images, voice and video, and translating languages. Companies are learning that the focus should be on: 1) diversity in the data, which includes proper representation across populations; 2) ensuring diverse experiences, perspectives and thinking in the creation of algorithms; and 3) prioritizing the quality of the data over quantity. These are especially important as bias is introduced and trust and confidence in data degrade. For example, Turkish is a gender-neutral language, but the AI model in Google Translate incorrectly predicts the gender when translating to English. As well, cancer-spotting AI image recognition is only trained on fair-skinned people. From the computer vision example above, Joy Buolamwini tested these AI technologies and realized they worked more effectively on men than on women, and on lighter than on darker skin. The “error rates were as low as 1% on males and as high as 35% on dark females.” These issues occur because of the failure to use diverse training data. Karen concedes,

The concept of AI is simple but the algorithms get smarter by ingesting more and more real world data, however being able to explain the decisions becomes extremely difficult. The data may be continuously changing and AI models require filters to prevent incorrect labeling such as an image of a black man being labeled as a gorilla or a panda becoming labelled as a gibbon. Enterprises relying on faulty data to make decisions will lead to ill-informed results.
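Gaps like the ones Buolamwini measured only become visible when error rates are broken out by group instead of being averaged over the whole test set. The sketch below shows one way that breakdown could be computed. The function name, the group names and the records are made up for illustration; they are not data from her study or from any real system.

```python
from collections import defaultdict


def error_rate_by_group(records):
    """Return the fraction of wrong predictions for each group.

    Each record is a (group, predicted_label, true_label) tuple.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, predicted, actual in records:
        totals[group] += 1
        if predicted != actual:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


# Hypothetical toy evaluation set: (group, predicted label, true label).
records = [
    ("group_a", 1, 1),
    ("group_a", 0, 0),
    ("group_a", 1, 1),
    ("group_a", 0, 0),
    ("group_b", 1, 0),  # wrong
    ("group_b", 0, 0),
    ("group_b", 0, 1),  # wrong
    ("group_b", 1, 1),
]

rates = error_rate_by_group(records)
overall = sum(1 for _, p, a in records if p != a) / len(records)
```

On this toy set the overall error rate looks moderate while one group carries all of the errors, which is the pattern an aggregate accuracy figure would hide.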

Fortunately, given AI’s nascency, very few organizations are making significant business decisions from the data today. From what we’ve witnessed, most solutions mainly produce product recommendations and personalized marketing communications. Any wrong conclusions that result from these have less societal impact… at least for now. Using data to make business decisions is not new, but what has changed is the exponential increase in the volume and mix of structured and unstructured data being used. AI enables us to use data from their source continuously and obtain insight much faster. This is an opportunity for businesses that have the capacity and structure to handle data volume from diverse sources. However, for other organizations, the masses of data can represent a risk because the divergent sources and formats make it more difficult to transform the information: emails, system logs, web pages, customer transcripts, documents, slides, informal chats, social networks and exploding rich media like images and video. Data transformation continues to be a stumbling block to developing clean data sets and, hence, effective models.

Bias is More Prevalent than We Realize

Bias exists in many business models built to minimize risk and optimize targeting opportunities, and while these models may produce profitable business results, they have been known to result in unintended consequences that cause individual harm and deepen economic disparities. Insurance companies may use location information or credit score data to issue higher premiums to poorer customers. Banks may approve prospects with lower credit scores who are already debt-ridden and may be unable to afford the higher lending rates. There is heightened caution surrounding bias because the introduction of AI will not only perpetuate existing biases; the results from these learning models may generalize to the point that they deepen the economic and societal divide. Bias presents itself in current algorithms that determine the likelihood of recidivism (the likelihood to re-offend), like COMPAS. The Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) was created by a company known as Northpointe. The goal of COMPAS was to assess risk and predict criminality for defendants in pre-trial hearings. The types of questions used in the initial COMPAS research revealed enough human bias that the system perpetuated recommendations that unintentionally treated Black defendants who would never go on to re-offend more harshly than white defendants who would go on to re-offend and who were treated more leniently at the time of sentencing. With no public standard available, Northpointe was able to create its own definition of fairness and develop an algorithm without third-party assessment… until recently. This article confirmed, "A Popular Algorithm Is No Better at Predicting Crimes Than Random People" …

If this software is only as accurate as untrained people responding to an online survey, I think the courts should consider that when trying to decide how much weight to put on them in making decisions

Karen stipulated,

While we try to fix existing systems to minimize this bias, it is critical that models train on diverse sets of data to prevent future harms.

Despite the potential risks of faulty models pervading business and society, businesses do not have governance mechanisms to police for unfair or immoral decisions that will inadvertently impact the end consumer. This is discussed below under ethics.

The Increasing Demand for Privacy

Karen and I came from Yahoo! We worked with strong research and data teams that were able to contextualize behavior from users across our platform. We continuously studied user behavior and understood their propensities across our multitude of properties from Music, to Homepage, to Lifestyle, News etc. At that time, there were no strict standards or regulation for data use. Privacy was relegated to users’ passive agreement to the platform’s terms and conditions, similar to today. The recent Cambridge Analytica/Facebook scandal has brought personal data privacy front and center. Frequent data breaches at major credit institutions like Equifax and, most recently, at Facebook and Google+ continue to compound this issue. The issues of ownership, consent and erroneous contextualization make this a ripe topic as AI continues to iron out its kinks. The European General Data Protection Regulation (GDPR), which came into effect on May 25, 2018, will change the game for organizations, particularly those that collect, store and analyze personal user information. It will change the rules under which business has operated for many years. The unbridled use of personal information has come to a head, as businesses now come to the realization that there will be significant limitations on data use and, more importantly, ownership. We are seeing the early effects of this in location-based advertising. This $75 billion industry, which is slated to grow at a five-year CAGR of 21% through 2021, continues to be impeded by the oligopoly of Facebook and Google, which secure the bulk of revenues. And now, the GDPR raises the stakes to make these ad-tech companies more accountable:

The stakes are high enough that [advertisers] have to have a very high degree of confidence that what you're being told is actually in compliance. It seems like there is enough general confusion about what will ultimately constitute a violation that people are taking a broad approach to this until you can get precise about what compliance looks like.

While regulation will eventually cripple revenues, at least for the moment, the mobile and ad platform industries are also facing increasing scrutiny from the very subjects they have been monetizing for many years: consumers. This, coupled with the examination of established practices, will force the industry to shift its practices in the collection, aggregation, analysis, and sharing of user information. Operationalizing privacy will take time, significant investment (a topic that needs to be afforded more attention), and a change in mindset that will impact organizational policy, process, and culture.

The Inevitable Coupling of AI & Ethics

The prevailing promise of AI is societal benefit: streamlining processes, increasing convenience, improving products and services, and detecting potential harms through automation. Delivering on the last of these means readily measuring inputs and outputs against outcomes in manufacturing processes, service and assessment solutions, production, and product quality. As discussions and news about AI persist, the term “AI” coupled with “ethics” reveals increasingly grave concerns that AI technology can inflict societal damage that will test human conscience and values.

CB Insights: Tech Cos Confront Ethics of AI

Beyond individual privacy concerns, today we are seeing examples of innovation that border on the unconscionable. As stated previously, Rekognition and Face++ are being used in law enforcement and citizen surveillance while the technology is deemed to be faulty. Employees walked out in protest of Google’s decision to provide artificial intelligence to the Defense Department for the analysis of drone footage, with the goal of creating a sophisticated system to surveil cities, in a project known as Project Maven. The same tech giant is also building Project Dragonfly for China, a censored search engine that also has the ability to map individual searches to identity. Decision-makers and regulators will need to instill new processes and policies to properly assess how AI technologies are being used, for what purpose, and whether there may be unintended fallout in the process. Karen pointed to new questions in determining the use of data in AI algorithms that will need to be considered:

How do we detect sensitive data fields and anonymize them while preserving the important features of a dataset? Can we train on synthetic data as an alternative in the short term? The question we need to ask ourselves when creating the algorithm: What fields do we require to deliver the outcomes we want? In addition, what parameters should we create to define “fairness” in the models, meaning does this treat two individuals differently? And if so, why? How do we continuously monitor for this within our systems?
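Two of Karen's questions can be made concrete with a short Python sketch: masking sensitive fields while keeping the features a model needs, and checking whether a model treats two otherwise identical individuals differently. Everything here is an assumption for illustration: the field names, the `SENSITIVE_FIELDS` set, the salt, and both toy models are hypothetical. Note also that salted hashing is pseudonymization rather than full anonymization, since anyone holding the salt can re-link known values.

```python
import hashlib

# Hypothetical list of fields treated as directly identifying.
SENSITIVE_FIELDS = {"name", "email"}


def pseudonymize(record, salt):
    """Replace sensitive fields with a salted hash; keep other features as-is."""
    masked = {}
    for key, value in record.items():
        if key in SENSITIVE_FIELDS:
            digest = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
            masked[key] = digest[:12]  # shortened token, stable for the same input
        else:
            masked[key] = value
    return masked


def treats_differently(model, record, field, alternative):
    """Counterfactual check: does changing only `field` change the decision?"""
    counterfactual = {**record, field: alternative}
    return model(record) != model(counterfactual)


# Two toy decision rules, both hypothetical.
def income_only_model(rec):
    return rec["income"] > 50000


def gender_sensitive_model(rec):
    return rec["income"] > 50000 and rec["gender"] == "m"


applicant = {"name": "Alex Doe", "email": "alex@example.com",
             "gender": "f", "income": 72000}

masked = pseudonymize(applicant, salt="demo-salt")
flag_fair = treats_differently(income_only_model, applicant, "gender", "m")
flag_biased = treats_differently(gender_sensitive_model, applicant, "gender", "m")
```

Running the counterfactual check over a stream of real decisions, rather than a single toy applicant, is one possible way to approach the continuous monitoring Karen asks about.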

An AI Winter is a Serendipitous Opportunity to Get AI Ready

AI has come a long way, but still needs more time to mature. In a world of increasing automation and deliberate progress towards increasing cognitive computing capabilities, the impending AI winter affords businesses the necessary time to determine how AI fits into their organizations and the problems they want to solve. The impending casualties of AI need to be addressed in policy, in governance and in its impact on individuals and society. Its impact is far greater in this next industrial revolution as its ubiquity will become more nuanced in our lives. Leading voices in AI, including Geoff Hinton, Fei-Fei Li and Andrew Ng, have called for an AI reset because Deep Learning has not yet proven to scale. The promise of AI is not waning; rather, the expectations for its real arrival are pushed further out, perhaps 5-10 years. We have time to work on these issues in Deep Learning, other AI methods, and the processes to effectively extract value from data. This culmination of business readiness, regulation, education, and research is necessary to bring both business and consumers up to speed and to ensure a regulatory system is in place that properly constrains technology and leaves humanity at the helm a little while longer.

About Karen Bennet

Karen Bennet, a Principal Consultant at ArCompany, is an experienced senior engineering leader with more than 25 years in the software development business in both open and closed source solutions. More recently, Karen's efforts have focused on Artificial Intelligence, enabling enterprises, particularly in the banking and automotive sectors, to experiment with AI/ML. She has held engineering leadership positions at Cerebri AI, IBM, Yahoo! and Trapeze, and was an early leader who helped grow Cygnus and Red Hat into sustainable businesses.