WIRED open their Innovation Awards series by celebrating British leaders in AI, including the founders of Jukedeck, SwiftKey (acq. Microsoft), VocalIQ (acq. Apple), Blippar and Google DeepMind. I was fortunate to be involved in this project!

Researchers at NYU trained a recurrent neural network on scripts from movies including Ghostbusters, Interstellar and The Fifth Element, and asked the network to generate a novel screenplay. Watch the result acted out here!

Work from MIT’s computer science and AI lab shows how to predict sound from a silent video. The algorithm takes a video sequence as input and predicts a corresponding sequence of sound features. Next, it synthesises a waveform by matching these features against a database of impact sounds and transferring the best matches. Watching someone strike objects with a drumstick has never been so interesting :)
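The matching step is essentially a nearest-neighbour lookup in feature space. A minimal sketch of that step (the toy feature vectors and file names are my own invention, not the lab's actual code):

```python
import math

def nearest_sound(predicted, database):
    """Return the waveform whose stored feature vector is closest
    (Euclidean distance) to the features predicted from the video."""
    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return min(database, key=lambda entry: dist(entry[0], predicted))[1]

# Toy database of (feature vector, waveform) pairs.
db = [
    ([0.9, 0.1], "thud_on_wood.wav"),
    ([0.2, 0.8], "clink_on_metal.wav"),
]
best = nearest_sound([0.85, 0.15], db)
```

In the paper the "features" are learned sound representations predicted by a recurrent network, and the transferred match is a real recorded waveform, which is why the results sound plausible.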

A designer’s guide to AI. Leveraging user centered design principles, the author rightly states that AI will enable designers to create bespoke experiences right out of the box for each user. Importantly, these experiences need to a) create emotionally-aware relationships with the user, b) respond to needs that haven’t yet been explicitly expressed, c) prevent negative emotional responses when a user is upset with an AI-caused result and d) be sensitive to sociology. A list of further reading resources is included. Thanks to Joe Thornton for sharing!

An MIT student explains the ups/downs of AI through the lens of gradient descent, where a decreasing slope to the function that defines progress in AI implies that we’re reducing the difference between artificial and biological intelligence. He argues that deep learning can be a plausible substrate for future AI systems and that we should focus on solving tough problems to avoid another Winter.
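The metaphor maps directly onto the algorithm itself. A minimal gradient descent sketch, using f(x) = x² as a stand-in for the "gap" between artificial and biological intelligence:

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly step against the gradient to minimise a function."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# f(x) = x^2 has gradient 2x; descent drives the "gap" towards zero.
gap = gradient_descent(lambda x: 2 * x, x0=5.0)
```

Each step shrinks the gap by a constant factor (here 0.8), which is the decreasing-slope picture the essay leans on.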

Apple plays feature catch-up with Google, namely on its photo tagging/search capabilities and predictive keyboard, but takes a view that privacy should come first. The company is using on-device machine learning and differential privacy, a method to protect against deanonymizing efforts when querying statistical information about a sensitive dataset. This compares to Google’s stance of using cloud-based computation (deemed less safe) to power its machine learning features.
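Apple hasn’t published its exact mechanism, but the classic textbook illustration of differential privacy is randomised response: each individual answer is deniable, yet the population statistic is still recoverable. A toy sketch (parameters illustrative, not Apple’s implementation):

```python
import random

def randomized_response(true_answer, rng):
    """Flip a coin: heads -> answer truthfully; tails -> answer at random.
    Any single 'yes' is deniable, yet the aggregate rate is recoverable."""
    if rng.random() < 0.5:
        return true_answer
    return rng.random() < 0.5

rng = random.Random(0)
true_rate, n = 0.3, 10_000
answers = [randomized_response(rng.random() < true_rate, rng) for _ in range(n)]
observed = sum(answers) / n
# P(yes) = 0.5 * true_rate + 0.25, so invert to recover the population rate:
estimated_rate = (observed - 0.25) / 0.5
```

The estimate converges on the true rate as responses accumulate, even though no single response reveals anything with certainty.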

Microsoft CEO, Satya Nadella, outlines three key tenets of his vision for developing AI: 1) Augment human abilities and experiences instead of replace us, 2) Work to earn a user’s trust by solving privacy, transparency and security, 3) Technology should be inclusive and respectful of all users.

Following from this point, the New York Times features a piece on how algorithms perpetuate intrinsic biases in their training data, drawing on examples from the police force, image classification tasks and gender discrimination. While no solutions are presented, techniques that expose how algorithmic outputs were reached, combined with careful curation of the training dataset, would help greatly.

🆓 Open source, FB and GOOG

Google release an open source implementation of what they call Wide and Deep Learning, a two-pronged model that seeks to emulate our ability as humans to memorise concepts and subsequently generalise this knowledge to new situations. This approach is suited for classification and regression tasks where input data points are sparse (e.g. food recommendation engine or search results). Paper here.
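Conceptually, the wide part memorises sparse cross-features while the deep part generalises via embeddings, and both feed a single logistic output trained jointly. A toy forward pass (feature names, weights and dimensions all invented for illustration, not Google's released code):

```python
import math

def relu(vec):
    return [max(0.0, v) for v in vec]

def wide_and_deep(active, wide_w, embed, hidden_w, out_w):
    # Wide part: a linear model over sparse cross-features (memorisation).
    wide_score = sum(wide_w.get(f, 0.0) for f in active)
    # Deep part: concatenate embeddings and run a small MLP (generalisation).
    dense = [v for f in active for v in embed.get(f, [0.0, 0.0])]
    hidden = relu([sum(w * v for w, v in zip(row, dense)) for row in hidden_w])
    deep_score = sum(w * h for w, h in zip(out_w, hidden))
    # Both parts feed one logistic output.
    return 1.0 / (1.0 + math.exp(-(wide_score + deep_score)))

p = wide_and_deep(
    active=["user_likes=thai", "item=pad_thai AND user_likes=thai"],
    wide_w={"item=pad_thai AND user_likes=thai": 1.5},
    embed={"user_likes=thai": [0.3, -0.1],
           "item=pad_thai AND user_likes=thai": [0.2, 0.4]},
    hidden_w=[[0.5, 0.5, 0.5, 0.5], [-0.2, 0.1, 0.0, 0.3]],
    out_w=[1.0, 0.5],
)
```

The crossed feature in the wide part lets the model memorise a specific, known-good pairing, while the embeddings let it score pairings it has never seen.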

Facebook’s Language Technology team, which forms part of Applied ML, was the subject of a recent exposé by Forbes diving into its various initiatives. The team recently published their text understanding engine, DeepText, which is able to understand sentiment, intent and entities across more than 20 languages. They’ve also built a new multilingual composer to enable authors of posts on Facebook Pages to reach audiences speaking other languages using automatic machine translation.

Probably the cutest, most advanced and largely autonomous fleet of delivery robots in town, Starship Technologies, announced their launch partners, including Just Eat, Hermes and Pronto. I’ve seen one in action and can’t wait to cross paths with more of them on the sidewalk soon!

On the subject of driverless cars, the ethical framework for defining algorithms that make moral decisions is an outstanding question. Here, researchers show that study participants want to be passengers in vehicles that protect their riders at all cost while preferring that others purchase vehicles controlled by utilitarian ethics, i.e. sacrificing its passengers for the greater good.

Research, development and resources

The 33rd International Conference on Machine Learning, a major event on the AI calendar, took place last month. I encourage you to skim through two sets of summary notes of the best talks and papers: this one and that one. David Silver of Google DeepMind/AlphaGo fame also ran a brilliant session on reinforcement learning, which you can learn more about in his recent blog post.

Here’s my selection of exciting applied AI research:

Concrete problems in AI safety, Google Brain, Stanford, UC Berkeley and OpenAI. Here, the authors define 5 practical research problems in AI safety (relating to reinforcement learning), where safety means preventing unintended and harmful behaviour that may emerge from poor design of real-world AI systems. These areas can be summarised as: 1) Safe exploration of an environment by an agent, 2) Robustness to changes in the distribution of input data, 3) The avoidance of unwanted side effects on the environment as a result of learning optimal policies, 4) Avoiding behaviour that games the reward function, 5) Learning policies when feedback is expensive to give (and thus sparse).

Safely interruptible agents, Google DeepMind and The Future of Humanity Institute/Oxford. The authors focus on the question of defining the optimal interruptible policy for a reinforcement learning agent. Of note, they show that their Q-learning algorithm, which played Atari games, is safely interruptible. Moreover, they make the case for ensuring that interruptions do not jeopardise the agent’s goal of behaving optimally in its environment. The agent should therefore behave optimally under the assumption that there won’t be future interruptions to its exploration of the environment.
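For context, the Q-learning update is off-policy: it bootstraps from the best next action rather than the action actually taken, which is the property underlying the interruptibility result. A generic tabular sketch on a toy corridor (my own minimal example, not DeepMind's Atari setup):

```python
import random

# Toy corridor MDP: states 0..3, move left/right, reward 1 on reaching state 3.
rng = random.Random(42)
n_states = 4
Q = [[0.0, 0.0] for _ in range(n_states)]   # Q[state][action], 0 = left, 1 = right
alpha, gamma, epsilon = 0.5, 0.9, 0.3

for _ in range(300):
    s = 0
    for _ in range(20):
        if rng.random() < epsilon:
            act = rng.choice([0, 1])               # explore
        else:
            act = 0 if Q[s][0] > Q[s][1] else 1    # exploit
        s2 = max(0, s - 1) if act == 0 else min(n_states - 1, s + 1)
        r = 1.0 if s2 == n_states - 1 else 0.0
        # Off-policy update: bootstraps from the *best* next action, so the
        # learned values don't depend on how exploration is (or isn't) interrupted.
        Q[s][act] += alpha * (r + gamma * max(Q[s2]) - Q[s][act])
        s = s2
        if s == n_states - 1:
            break
```

Because the update target uses `max(Q[s2])` regardless of what the agent actually does next, pausing or overriding the agent mid-episode doesn't corrupt the values it learns.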

OpenAI publish new work (papers + explanation) extending generative adversarial networks (GANs), a neural network architecture proposed by Ian Goodfellow in 2014. A GAN pits two networks against each other: a generator produces data (e.g. an artificial image) from a random input, while a discriminator receives an input from either the generator (the artificial image) or a real dataset (a real image) and is tasked with calling out artificial data. The discriminator must maximise the probability of assigning the correct label to the input (1 = real image, 0 = artificial image), while the generator optimises to fool the discriminator. The other important feature of the adversarial training method is that the network learns its own cost function instead of requiring one to be crafted by the engineer, meaning that the network learns its own rules for which outputs are good or bad. As such, we’re able to create a model capable of generating data that approaches reality (i.e. images that look real but are actually computer-generated) by learning hierarchical feature representations without supervision. The OpenAI work introduces techniques for making GAN training more stable. They also present a semi-supervised learning approach in which the discriminator also outputs an image label, allowing the network to perform well on image classification tasks with only 10 labeled examples per class (99.41% accuracy on the MNIST handwritten digit dataset).
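To make the minimax game concrete, here is a deliberately tiny 1-D GAN: the generator is a linear map, the discriminator a single logistic unit, and the two alternate gradient steps on hand-derived gradients. All architecture and hyperparameter choices here are my own toy simplifications, not OpenAI's:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

rng = random.Random(0)
a, c = 0.1, 0.0    # discriminator D(x) = sigmoid(a*x + c)
w, b = 1.0, 0.0    # generator G(z) = w*z + b, z ~ N(0, 1)
lr, batch = 0.05, 8

for _ in range(2000):
    real = [rng.gauss(4.0, 1.0) for _ in range(batch)]       # "real" data: N(4, 1)
    fake = [w * rng.gauss(0.0, 1.0) + b for _ in range(batch)]
    # Discriminator step: push D(real) towards 1 and D(fake) towards 0.
    ga = sum(-(1 - sigmoid(a * x + c)) * x for x in real) + \
         sum(sigmoid(a * x + c) * x for x in fake)
    gc = sum(-(1 - sigmoid(a * x + c)) for x in real) + \
         sum(sigmoid(a * x + c) for x in fake)
    a -= lr * ga / batch
    c -= lr * gc / batch
    # Generator step: push D(G(z)) towards 1 (non-saturating -log D(G(z)) loss).
    zs = [rng.gauss(0.0, 1.0) for _ in range(batch)]
    gw = sum(-(1 - sigmoid(a * (w * z + b) + c)) * a * z for z in zs)
    gb = sum(-(1 - sigmoid(a * (w * z + b) + c)) * a for z in zs)
    w -= lr * gw / batch
    b -= lr * gb / batch
```

After training, the generator's output has shifted from its initial mean of 0 towards the real data's mean of 4: it learned what "real" looks like purely from the discriminator's feedback, with no hand-crafted cost function, which is the essence of adversarial training.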

Synthesizing the preferred inputs for neurons in neural networks via deep generator networks, University of Wyoming, Geometric Intelligence and University of Freiburg. With few exceptions, neural network models are currently black boxes that obfuscate the reasoning underlying their outputs. However, being able to explain why a network classified an image in a certain way, or elected not to approve a financial decision, is key to their adoption in fault-intolerant domains. One route to shedding light on the inner workings of deep neural networks is to discover the input data that most highly activates a specific neuron in the network, a process called activation maximisation. Here, the authors use a deep generator network to perform activation maximisation, whereby a generative model outputs a synthetic image that looks as close as possible to images from the ImageNet dataset. Thus, one can understand why a neural network classifies a dog as a dog: the synthetic input image that most highly activates the neurons responsible for the classification is itself that of a dog.
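Stripped to its core, activation maximisation is gradient ascent on the input rather than on the weights. A toy sketch with a made-up "neuron" over a two-pixel image, using finite differences in place of backpropagation (no generator network, for brevity):

```python
def activation(img):
    """Stand-in for one neuron's response to a 2-pixel 'image'.
    It fires most strongly when the input matches the pattern (3.0, -1.0)."""
    return -((img[0] - 3.0) ** 2 + (img[1] + 1.0) ** 2)

def activation_maximisation(f, x, lr=0.1, steps=200, eps=1e-4):
    """Gradient-ascend the input so the neuron fires as strongly as possible."""
    x = list(x)
    for _ in range(steps):
        for i in range(len(x)):
            up, down = x[:], x[:]
            up[i] += eps
            down[i] -= eps
            x[i] += lr * (f(up) - f(down)) / (2 * eps)   # numerical gradient
    return x

preferred = activation_maximisation(activation, [0.0, 0.0])
```

The recovered input is the neuron's "preferred" stimulus; the paper's contribution is constraining this ascent with a deep generator so the result looks like a natural image instead of adversarial noise.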

Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network, Magic Pony Technology and Imperial College London. Curious about the technology Twitter paid $150m for? In this work, the London startup addresses the problem of upscaling a single image or video from low resolution (LR, which is blurry, downsampled and noisy) to high resolution (HR, which is sharp). This is topical for high-definition television streaming and for medical and satellite imaging, all of which are bandwidth- and compute-intensive. Existing methods for super-resolution (SR) rely on having multiple LR images of the same scene, or instead use computationally expensive convolutional neural network (CNN)-based methods that increase resolution before enhancement. Here, the authors show that by upscaling from LR to HR only at the end of the network, feature extraction occurs in LR space, from which the HR data is super-resolved. This gives rise to a 10x gain in speed and performance over state-of-the-art CNN approaches, making it possible to run super-resolution on HD video in real time on a single GPU. Awesome!
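The trick of staying in LR space until the very end relies on a periodic shuffling layer that rearranges r² low-resolution feature maps into one image r times larger in each dimension. A minimal sketch of that rearrangement (a common depth-to-space formulation, written from scratch rather than taken from the paper's code):

```python
def pixel_shuffle(channels, r):
    """Rearrange r*r low-resolution feature maps (each H x W) into one
    high-resolution map of size (H*r) x (W*r): the 'periodic shuffling'
    at the end of the sub-pixel network."""
    h, w = len(channels[0]), len(channels[0][0])
    out = [[0.0] * (w * r) for _ in range(h * r)]
    for dy in range(r):
        for dx in range(r):
            ch = channels[dy * r + dx]      # each channel fills one sub-pixel offset
            for y in range(h):
                for x in range(w):
                    out[y * r + dy][x * r + dx] = ch[y][x]
    return out

# Four 1x1 feature maps interleave into one 2x2 high-resolution output:
hr = pixel_shuffle([[[1]], [[2]], [[3]], [[4]]], r=2)
```

Because this is a pure rearrangement, all convolutions happen on the small LR grid, which is where the speedup comes from.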

Venture capital financings and exits

57 financing events (55 VC and 2 PE) totalling $500m, including:

Zoox, the stealth startup developing both hardware and software for an autonomous taxi fleet, raised a $200m Series A at a $1bn valuation from Aid Partners in Hong Kong, Lux Capital and DFJ.

Orbital Insight, which creates computer vision-based geospatial-analysis software, raised a $20m Series B led by Google Ventures and including Bloomberg Beta, Lux Capital and Sequoia. Very neat work here - check out this recent talk by the founder.

Darktrace, the ever-growing London-based cybersecurity company using machine learning to detect novel threats, raised $65m led by KKR with participation from Softbank and Summit Partners. This values the three-year-old company at $400m post-money.

7 exit events worth $183m in total announced consideration, including:

Magic Pony Technology, the London-based startup developing machine learning technologies for visual processing, was acquired by Twitter for $150m. The startup raised $6.3m in two rounds from Octopus Ventures, Balderton Capital, Entrepreneur First and angels. It employed a dozen staff and filed 20 patents in its two year life, marking a significant exit for the European AI ecosystem. Congrats all!