Building an Artificial General Intelligence

Wednesday, 25 February 2015

I found this video interview with Eric Horvitz, the head of Microsoft Research, to be a really interesting window into their work and vision - a large part of which is AI. As Eric states, there is a "growing consensus that the next, if not last, enduring competitive battlefield, among major IT companies will be AI". Very well said.

Saturday, 21 February 2015

Just a few days ago, Jay posted here about the recent public attention to AI and the potential future dangers that it presents.

Discussion on the topic continues, but it's not all talk. There are at least two institutes focussed on practical steps to mitigate the risks.

The Machine Intelligence Research Institute (MIRI) exists solely to "ensure smarter-than-human intelligence has a positive impact". Founded in 2000, it is a non-profit with many very high profile people associated with it, including Eliezer Yudkowsky, Nick Bostrom and entrepreneur Peter Thiel. It publishes and holds workshops regularly.

Also, the Future of Life Institute is currently focussed on AI. It recently received a well-publicised, hefty donation from Elon Musk, founder of SpaceX and Tesla. As mentioned in Jay's article, Musk is in the camp that believes AI could pose an existential threat to mankind.

More of the discussion is set to become practical work, and these organisations are leading by example. Usually, consideration of the ethical issues lags behind the science. In this case, however, due to science fiction and misunderstandings of the technology, the perceived dangers are way ahead of the actual threat. That's a healthy situation to be in, to ensure we are prepared.

Tuesday, 17 February 2015

In the past year or so, there has been a spate of high-profile
pronouncements by respected scientists and engineers, cautioning the
world about the potential ill-effects of AI. Stephen Hawking, one of
the foremost experts in theoretical physics and cosmology, told the
BBC that "The development of full artificial intelligence could spell
the end of the human race" [1], and Elon Musk, serial entrepreneur and
founder of Tesla Motors and SpaceX, told an audience at MIT that AI
represented an existential threat to mankind [2].

AI has been around since the 1950s and Hollywood movies like
the Terminator have fired the public imagination about rogue AI, at least in
the west, where the general public views AI with much suspicion. But the
scientific community has always been very cautious about its ability to do
anything beyond narrowly defined problem statements, like defeating a
grandmaster at chess, as IBM's Deep Blue did in the 1990s. Just a few
years ago, AI was hopeless at tasks that a six-year-old could easily
do, like identify people and objects in a family photograph,
communicate with her elderly grandfather, or answer abstract questions
about what she wants to become when she grows up.

What is it about AI now that is spurring these alarmist statements by
respected scientists and engineers?

In the past couple of years, a family of techniques called Deep
Learning has taken the world of AI in general, and Computer Vision and
Natural Language Processing (NLP) in particular, by storm. Computer
Vision and NLP are both sub-fields of AI: the first deals with making
computers understand the world around them through vision, and the
second with making computers understand human language.

Deep Learning is a learning algorithm that uses a
many-layered (hence the word deep) Artificial Neural Network to learn
representations of objects or faces (in images) or words, phrases and sentences
(in speech) from many known examples, called training data. Given a new
image, it is then able to find and label objects in that image, or understand
the meaning of a spoken sentence.
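The core idea — stacked layers of simple units, trained on labelled examples by propagating errors backwards — can be sketched in a few lines. This is a toy illustration only (a tiny network learning the XOR function with NumPy), not any production Deep Learning system:

```python
import numpy as np

# Toy training data: the XOR function, four labelled examples.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
# One hidden layer of 8 units; "deep" networks simply stack more layers.
W1 = rng.normal(0.0, 1.0, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0.0, 1.0, (8, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for _ in range(10000):
    # Forward pass: each layer transforms the previous layer's output.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass (backpropagation): push the error back through layers.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(0)

print(np.round(out.ravel(), 2))  # predictions approach [0, 1, 1, 0]
```

The "learning" is nothing more than nudging the weight matrices downhill on the prediction error, over and over, until the network's outputs match the training labels.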

Deep Learning techniques are excelling at pretty much every AI-related
application that engineers throw at them, and incremental advances
that used to take years or decades to achieve are now happening in
months. The rapid progress can be illustrated with the example of
ImageNet. Computer Vision engineers test the competence of each
other's algorithms using large databases of images in which the
positions and labels of objects have been painstakingly annotated by
humans, and the biggest of these is ImageNet [3]. ImageNet contains
over 14 million images spanning more than 20,000 object classes, and
every year, in a challenge run on a subset of roughly 1.2 million
images and 1,000 classes, Computer Vision algorithms are compared on
whether or not they can detect the presence and location of one or
more objects in those images.

The sort of labelling a computer vision algorithm is expected to produce, on an example image from ImageNet [4]

The Error Rate (the fraction of labels an algorithm gets wrong on
these images, and hence a measure of its competence) has dropped
precipitously since the introduction of Deep Learning algorithms into
the competition: from about 40% in 2010 to 6.7% in 2014, achieved by
Google's Deep Learning algorithm GoogLeNet [5], named in homage to
LeNet, the pioneering network of Deep Learning researcher Yann LeCun.
This is a big deal in the field. In the previous decade, the
improvement per year on ImageNet and similar databases had been about
1-2%. Engineers were using what are known as hand-engineered features
and making very small, incremental advances. Image features are
statistics of what objects, and parts thereof, look like at the pixel
level; they are supposed to be invariant to changes in viewpoint,
lighting, etc., and engineers had been designing them by hand for the
better part of a decade.
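The error rate itself is simple to compute. In the ImageNet classification challenge, the standard figure is the "top-5" error: a prediction counts as correct if the true label appears among the algorithm's five most confident guesses. A sketch, with invented labels and rankings:

```python
def top5_error(predictions, truths):
    """predictions: one score-ranked label list per image;
    truths: the human-annotated label for each image."""
    wrong = sum(1 for ranked, truth in zip(predictions, truths)
                if truth not in ranked[:5])
    return wrong / len(truths)

# Two invented images: the first guessed correctly, the second not.
preds = [
    ["cat", "dog", "fox", "car", "cup"],     # truth "cat"   -> correct
    ["bus", "car", "van", "bike", "tram"],   # truth "truck" -> wrong
]
print(top5_error(preds, ["cat", "truck"]))   # 0.5, i.e. 50% error
```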

A Convolutional Neural Network or CNN (a particular kind of Neural Network whose first few layers are convolutional and pooling (averaging) layers) learns to identify a face using a series of representations whose level of abstraction increases through the layers of the network: from edge-like segments, to face parts - ears, noses, mouths, etc. - to full faces [6]

Then, in 2012, Geoff Hinton and his team from the University of
Toronto used a Convolutional Neural Network or CNN (a particular kind
of Artificial Neural Network trained with Deep Learning techniques)
that learnt its own features, and achieved an error rate of about 15%,
far ahead of the 26% managed by the second-best team, which used
conventional, hand-engineered features [7]. By 2014, almost every
competing algorithm was a Deep Learning-based CNN, and the latest
error rates of about 6-7% are closing in on the human error rate of
identifying objects in these images, of about 5% [8].
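The "convolutional" layers in these networks are, at heart, small filters slid across the image; the remarkable thing is that CNNs learn their filters, with edge detectors emerging in the early layers. A hand-rolled sketch of one such filter (the classic Sobel vertical-edge kernel, chosen here purely for illustration — a CNN would learn something similar from data):

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' 2-D convolution (strictly, cross-correlation, as in most CNNs):
    slide the kernel over the image and record the response at each position."""
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# A tiny image: dark left half, bright right half -> one vertical edge.
img = np.zeros((5, 6))
img[:, 3:] = 1.0

sobel_v = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

response = conv2d(img, sobel_v)
print(response)  # strong response only where the window straddles the edge
```

The filter responds strongly at the boundary between dark and bright and stays silent everywhere else; stacks of such learnt filters, feeding into one another, build up the edges-to-parts-to-faces hierarchy described in the caption above.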

Not only can Deep Learning algorithms determine the type and location
of objects in a scene, they have also been shown to infer the
relationships between those objects, i.e. a semantic understanding of
the scene [10].

These algorithms are remarkably versatile. Instead of
designing specialist algorithms, one for each class of AI problem, as had been
done throughout the 90s and the first decade of this century, engineers are
using the same Deep Learning algorithm for other problems like medical diagnosis [12, 13] and Natural
Language Processing (NLP), alongside Computer Vision. Speech
recognition using these techniques has shown similar improvements (of
about 30% in the error rate), and an astonishing demo of Microsoft's
Chief Research Officer speaking to a Chinese audience in English,
while his speech is instantaneously machine-translated into a Chinese
voice, is available on YouTube [14].

Some of the factors driving this exponential improvement in
the capabilities of Deep Learning algorithms now (Neural Networks, on
which they are based, have been around since the early 80s) are:

- More data to learn from: billions of images and videos, some
  labelled but mostly unlabelled, from the internet.

- Unsupervised learning: previously, image recognition algorithms
  learnt from large amounts of labelled data, but versions of Deep
  Learning algorithms (auto-encoders [15] and Restricted Boltzmann
  Machines or RBMs [16]) figure out structure from the data itself.
  Google's "cat classifier" [17], for example, learnt what a cat was
  by looking at millions of YouTube videos; it wasn't told which
  videos had cats in them. Labelling data is a slow, error-prone and
  painstaking process, so the ability to learn from unlabelled data is
  critical.

- Significantly more computing power than in the 80s and 90s, when
  Neural Networks were previously in vogue, making it possible to
  train networks with many more (deep) layers. Moreover, unlike in a
  regular Neural Network, the neurons in one layer of a CNN (and other
  modern Neural Network topologies) are not all connected to all the
  neurons in the next layer, which makes training faster and easier.
  Clusters with tens of thousands of processors are now being used for
  Deep Learning - Google's cat classifier had 1 billion neuronal
  connections.

- The ability to harness Graphics Processing Units or GPUs, whose
  development has been accelerated by the computer gaming industry. A
  GPU is traditionally used by a computer to render images and has
  only recently been co-opted for parallelizing Deep Learning
  algorithms. GPUs, with thousands of processor cores (as opposed to a
  handful in a CPU), can process the thousands of pixels or
  pixel-groups in an image in parallel - a CPU would have to do that
  sequentially - drastically speeding up the learning process. The
  Google learning system required 1,000 computers and cost 1 million
  dollars to build, whereas a similar system was recently built for
  20,000 dollars, using GPUs on just 16 computers [20].
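The unsupervised-learning idea in the list above — learning a useful representation with no labels at all — can be caricatured with a tiny linear autoencoder. This is an invented toy, nothing like Google's system: the network is trained to reconstruct its own input through a narrow bottleneck, so the data itself serves as the label.

```python
import numpy as np

rng = np.random.default_rng(1)

# Unlabelled "data": 200 points in 3-D that secretly lie near a 1-D line.
t = rng.uniform(-1, 1, (200, 1))
X = np.hstack([t, 2 * t, -t]) + rng.normal(0, 0.05, (200, 3))

# Linear autoencoder: compress 3 numbers down to 1, then reconstruct.
# The training target is the input itself -- no labels are needed.
W_enc = rng.normal(0, 0.1, (3, 1))   # 3 inputs -> 1 hidden "code" unit
W_dec = rng.normal(0, 0.1, (1, 3))   # 1 code unit -> 3 outputs

def recon_error():
    return float(np.mean((X @ W_enc @ W_dec - X) ** 2))

before = recon_error()
lr = 0.05
for _ in range(5000):
    code = X @ W_enc                  # the learnt 1-D representation
    err = code @ W_dec - X            # reconstruction error
    W_dec -= lr * code.T @ err / len(X)
    W_enc -= lr * X.T @ (err @ W_dec.T) / len(X)

after = recon_error()
print(before, after)  # reconstruction error shrinks without any labels
```

The single hidden unit is forced to discover the one direction that explains the data — a representation learnt purely from the structure of the unlabelled examples.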

GPU growth (in terms of the number of floating point operations per second - FLOPS) has surpassed CPU growth in the last decade [21]

Quantum Computing, a field that is slowly taking off, is another
reason why this growth in the performance of these algorithms may
continue. Regular computers store information in the form of binary
bits, which can be either 0 or 1. Quantum computers, on the other
hand, take advantage of the strange laws of Quantum Physics and store
information in qubits, which can be both 1 and 0 at the same time.
This is especially useful for one specific type of problem:
optimization. Most learning algorithms, including Deep Learning
algorithms, need to perform optimization - the task of finding the
minimum in a "landscape" whose peaks and troughs represent a cost
function describing the learning problem, which could be learning to
recognize a particular object in an image, or a word in an audio
recording. Typically, this landscape is big and multi-dimensional, and
it is not possible to search every part of it exhaustively in a finite
amount of time. Instead, algorithms explore a few candidate hills and
"walk" down to the valleys closest to them, and the lowest point
amongst these valleys is taken as an acceptable compromise.

A classical computer running a sequential algorithm can explore only
one part of this optimization landscape at a time. Because there is no
time to search the entire landscape exhaustively, such "compromise"
solutions often get stuck in a local minimum - a local valley that
might not be the absolute lowest point on the landscape. Thanks to
superposition - the possibility of qubits being both 1 and 0 at the
same time - an algorithm running on a quantum computer can, in
principle, explore multiple points in the optimization landscape
simultaneously and find the true minimum much more quickly than a
classical computer.
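The classical "walk down to the nearest valley" strategy described above is gradient descent. A sketch on an invented one-dimensional landscape with two valleys shows both the strategy and its weakness — a single run finds only the valley nearest its starting point, so classical algorithms restart from several random points and keep the best:

```python
import random

# An invented 1-D "landscape": a shallow valley near x = 1.4
# and a deeper (global) valley near x = -1.7.
def cost(x):
    return 0.1 * x**4 - 0.5 * x**2 + 0.3 * x

def grad(x):  # derivative of cost
    return 0.4 * x**3 - x + 0.3

def descend(x, lr=0.01, steps=2000):
    """Walk downhill from a starting point to the nearest valley floor."""
    for _ in range(steps):
        x -= lr * grad(x)
    return x

random.seed(0)
# A single descent can get stuck in whichever valley is nearest, so we
# restart from several random points and keep the lowest valley found --
# the "acceptable compromise" of the text above.
starts = [random.uniform(-3, 3) for _ in range(10)]
best = min((descend(s) for s in starts), key=cost)
print(round(best, 2))  # about -1.71, the deeper of the two valleys
```

With enough restarts the global valley is usually found on a landscape this small; in the huge, multi-dimensional landscapes of real learning problems, exhaustive restarting is exactly what there is no time for.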

A quantum computer from D-Wave [22], reminiscent of the black "monolith" from 2001: A Space Odyssey

It is this rapid advance in the capability of AI, likely to be
sustained in the coming years by GPUs and quantum computing, that has
eminent scientists worried. However, long before any of the
humanity-ending doomsday scenarios being bandied about have a chance
of becoming possible (scientists disagree on how long it will take for
AI to become self-aware and surpass human intelligence - estimates
vary from decades to centuries), there is a much more immediate
problem: job losses, and the subsequent social upheaval, brought about
by AI algorithms and machines that are very good at doing some of the
jobs done by humans today.

The most common jobs in the U.S. in 2014 - truck driving is the most common job in a majority of states [23]

Like driving. Google's autonomous cars have been zipping around the
laneways and highways of the Bay Area for some time now, and Google is
looking to team up with major players in the auto industry to bring
its autonomous cars to market within the next 5 years [24]. Uber, the
taxi company that pioneered real-time ride-sharing (users hail cabs
and share rides with their GPS-equipped smartphones), has just
announced a collaboration with Carnegie Mellon University to bring
autonomous taxis to market [25]. You would use your app to hail (and
possibly share) a ride in an autonomous taxi, just as you do today
with human-driven Uber taxis.

Self-driving cars, driven by never-tiring robots with eyes
in the back of their heads (actually the cars ARE the robots - equipped with
sensors and AI software) have the potential to completely disrupt the taxi and
trucking industries (truck driving is the most common job in the U.S. today -
see above figure). Highway driving is easier than city driving (in
spite of the higher speeds, highways are constrained environments with
fewer things to look out for); autonomous driving on highways was in
fact demonstrated much earlier, in the mid 90s [26], and is likely to
clear the legal hurdles more quickly.

The services sector (blue), that occupies a major chunk of the employment pie in developed economies is most at risk from AI [27]

Drivers are part of the services sector, and it is their jobs, along
with others in the service industry, that occupy a disproportionately
large slice (80%) of the employment pie in advanced economies (dark
blue in the above figure). It is at these jobs that AI is currently
getting better than humans, and where the potential for disruption is
biggest [28].

This upcoming upheaval in the labour market is different
from the changes in productivity (and resulting fall in employment) during the
industrial revolution in the 18th and 19th centuries. Change during that
revolution was more gradual, allowing workers 50 years or more to transition
from hard, manual labour to less labour-intensive jobs in mills and factories,
whereas current trends indicate an exponentially increasing capability for AI
to take over the services jobs - jobs that form the backbone of western
economies - within the next 10 years.

Employment trends in the US since 1975, divided between routine and non-routine jobs [29].

Even before these latest advances in AI, the job market in advanced
economies had been increasingly polarized since the 1970s, with job
growth at two ends of the spectrum: highly skilled, non-routine jobs
for engineers, managers and medical practitioners, and low-skilled,
non-routine jobs that require physical activity, like those of
waiters, mechanics and security guards. This has come at the expense
of jobs that are routine and repetitive, involving following a
pre-defined set of instructions - a lot of assembly-line work, for
example, has either been replaced by automation or moved to
lower-income countries.

Economics researchers at the New York Federal Reserve have
broken down the job market into the following matrix, with routine and
non-routine jobs that require cognitive vs manual skills.

They’ve analyzed data that demonstrates that non-routine
jobs which require cognitive skills have been increasing compared to
non-routine jobs that require manual skills, and routine jobs that require
cognitive and manual skills have both been decreasing.

I would further divide routine, manual jobs into ones that
involve dextrous manipulation and some skill, like that of a technician or a
mechanic, vs the manual jobs that don’t require these things - picking and
placing and sorting of parts on an assembly line in a factory. With robots yet
to achieve the levels of dexterity of humans, the latter class of job is
disappearing much sooner than the former.

Now, let’s look at the top 10 occupations in Australia (a
typical advanced Western economy, and a country I’m familiar with, given half a
lifetime spent there), sorted according to the number of people employed in
decreasing order [31], and classify
them according to the above matrix.

What will these people do, and how will they earn a
livelihood? Thankfully, it is not all bad news for employment in the developed
world, and there are a number of ways in which AI is also likely to generate
new employment, and the industry where this is most likely to happen is
healthcare.

As populations in industrialized economies age, a far greater share of
their GDP will go towards ensuring the well-being of their people (the
U.S. spent 18% of its GDP on healthcare in 2012 [33]). However, the
number of physicians either in practice today or in training for the
future is nowhere near what is required to take care of an ageing
population. Part of the solution could be to use AI systems, like
IBM's Watson, to help medical assistants take over some of the primary
health-care duties of physicians.

Watson, a DeepQA (Question Answering) system named after IBM's first
CEO, Thomas J. Watson, can understand a natural language query (a
question you would pose to a person, as opposed to one tailored to a
search engine), search very quickly through a vast database, and give
you a response, also in natural language. Watson shot to fame in 2011,
when it defeated two former champions of the TV game show Jeopardy!,
in which participants are quizzed on a variety of topics with
questions "posed in every nuance of natural language, including puns,
synonyms and homonyms, slang and jargon" [34]. Watson is designed to
scan documents from a vast corpus of knowledge (medical journals, for
example) very quickly - currently at the rate of 60 million documents
per second - and come up with a range of hypotheses for a question,
which are then ranked based on confidence and merged to give a final
answer, itself with an associated confidence level.
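The generate-score-merge-rank pipeline just described can be caricatured in a few lines. Everything here — the candidate answers, the confidence scores, the way evidence is combined — is invented for illustration and bears no relation to Watson's actual internals:

```python
from collections import defaultdict

def merge_and_rank(hypotheses):
    """hypotheses: (answer, confidence) pairs produced by many scoring
    passes. Merge the evidence for each distinct answer, then rank the
    answers by combined confidence."""
    miss_prob = defaultdict(lambda: 1.0)
    for answer, conf in hypotheses:
        # Treat scorers as independent: multiply the "miss" probabilities,
        # so two moderate votes for the same answer beat one strong vote.
        miss_prob[answer] *= (1.0 - conf)
    ranked = sorted(((1.0 - miss, ans) for ans, miss in miss_prob.items()),
                    reverse=True)
    return [(ans, round(conf, 3)) for conf, ans in ranked]

# Invented candidate diagnoses, each scored by a different "evidence pass".
evidence = [("influenza", 0.6), ("common cold", 0.5),
            ("influenza", 0.4), ("pneumonia", 0.2)]
result = merge_and_rank(evidence)
print(result)  # influenza first: its two votes combine to 0.76
```

The key idea the sketch captures is that the system never commits to one answer early; it keeps every hypothesis alive, lets independent evidence accumulate, and only then ranks.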

Given a set of symptoms, a system like Watson would be able to list
the top diagnoses, ranked by confidence. These diagnoses, relying on
information gleaned from a far larger database than a human mind could
ever absorb in a lifetime, and improving all the time (Watson, like
Deep Learning systems, is a continually learning system that improves
with feedback), would be free from the anchoring bias - the human
tendency to rely on a limited number of pieces of evidence that
account for the symptoms and discount everything else [35]. It would
enable an army of physician assistants, nurses and nurse practitioners
to administer primary health care, and free up physicians to do other
things that require a higher level of skill.

So, the large numbers of middle class job losses in the
services industry could potentially be replaced with jobs in healthcare.

AI and robotics are also, ironically, likely to create jobs in
manufacturing, after previously having taken away some of those manufacturing
jobs. The previous generation of robots that replaced humans, in automobile
manufacturing for example, were huge and unwieldy, and unsafe to be around -
they had to be installed in cages to prevent injury to their human co-workers.
Requiring experts to program, they were efficient in one specific type of task,
like spot-welding or spray-painting, but once programmed, were not able to
deviate from their pre-programmed routines. A new generation of manufacturing
robots like Rethink
Robotics’ Baxter is smarter, smaller, and safer to be around,
because it can sense and interact with humans around it. Baxter can be trained
interactively by someone who does not need to be a robotics engineer - the
robot is hand-guided to perform a series of movements in response to sensor
inputs - and can be quickly reconfigured to perform different sets of tasks.

Repetitive assembly line work like putting together the
components that go into making an iPad, or stitching together fabrics to make a
garment, work that has largely gone to the developing world in the past 30
years, is likely to come back to the West, thanks to robots like Baxter. Small
to medium enterprises, with a smaller, but smarter workforce, armed with their
robotic assistants, will be able to turn out custom-made consumer electronics
and compete with devices mass-produced in China.

AI is also likely to create new jobs that we are unable to
predict today, just as the rise of Information Technology gave rise to software
developers and data analysts, jobs that didn’t exist just 30 years ago.

Over the next decade, it is very likely that AI will result
in an unprecedented churn in the job market, and potentially threaten the
livelihoods of hundreds of millions of people in the developed world.
Protectionism or government regulation against AI will only delay the
inevitable, because ultimately, the economic gains to be had from AI will
outweigh the negatives.

The increased productivity and revenue from increasing
levels of automation brought about by AI in developed economies could be used
to provide Basic Income,
a guaranteed minimum income to all citizens. Such an income would help people
weather the storm of unemployment and would have the added benefit of allowing
people to pursue activities out of interest, and not out of economic necessity.

However, governments and policy makers need to be informed
and cognizant of the disruptive power of AI and craft policy to help their
people deal with it. Otherwise, they will be left confronting a situation of
mass-scale unemployment and social unrest that will make the Global Financial
Crisis look like a walk in the park.