by Jack Clark

Using human feedback to generate better synthetic images:…Human feedback is a technique people use to build systems that learn to achieve an objective based on a prediction of satisfying a user’s (broadly unspecified) desires, rather than a hand-tuned goal set by a human. At OpenAI, we’ve collaborated with DeepMind to use such human feedback interfaces to train simulated robots and agents playing games to do things that are hard to specify via traditional objectives.…This fundamental idea – collecting human feedback through the training process to optimize an objective function shaped around satisfying the desires of the user – lets the algorithms explore the problem space more efficiently with the aid of a human guide, even though neither party may know exactly what they’re optimizing the AI algorithm to do.…Now, researchers at Google have used this general way of framing a problem to train Generative Adversarial Networks to create synthetic images that are more satisfying/realistic-seeming to human overseers than those generated by the GAN process alone, without human feedback. The technique is reasonably efficient, requiring the researchers to show 1,000 images 1,000 times each during training. A future research extension of this technique could be to improve the sample efficiency of the part of the model that seeks to predict how to satisfy a human’s preferences – if we require less feedback, then we can likely make it more feasible to train these algorithms on harder problems.…Read more here: Improving image generative models with human interactions.
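
To make the "predict what satisfies the human" idea a little more concrete, here is a toy sketch, not the Google paper's method: a one-feature logistic regression fitted to thumbs-up/thumbs-down labels, then used to pick the candidate a human is predicted to like most. The single numeric feature standing in for an image, the function names, and the made-up feedback data are all assumptions for illustration.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_preference_model(features, liked, lr=0.5, epochs=200):
    """Fit P(human likes sample) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(features, liked):
            error = sigmoid(w * x + b) - y  # gradient of log loss w.r.t. the logit
            grad_w += error * x / n
            grad_b += error / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predicted_approval(x, w, b):
    """Predicted probability that the human would approve of sample x."""
    return sigmoid(w * x + b)


# Made-up feedback: the (hypothetical) human liked samples with a positive feature.
features = [-2.0, -1.0, 1.0, 2.0]
liked = [0, 0, 1, 1]
w, b = train_preference_model(features, liked)

# Use the learned predictor as a stand-in for the human when ranking new candidates.
candidates = [-1.5, 0.2, 1.8]
best = max(candidates, key=lambda x: predicted_approval(x, w, b))
```

In a real system the predictor would be a neural network over images and its output would feed into the generator's training objective; the point here is only that a small learned model can substitute for asking the human every time.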

The United Nations launches its own AI center:…The UN has created the Centre for Artificial Intelligence and Robotics within UNICRI (the United Nations Interregional Crime and Justice Research Institute), a group that will perform ongoing analysis of AI, convene expert meetings, organize conferences, and so on.… “The aim of the Centre is to enhance understanding of the risk-benefit duality of Artificial Intelligence and Robotics through improved coordination, knowledge collection and dissemination, awareness-raising and outreach activities,” said UNICRI’s director Ms. Cindy J. Smith.…Read more here about UNICRI.

The HydraNet will see you now: monitoring pedestrians using deep learning:…Researchers with the Chinese University of Hong Kong and computer vision startup SenseTime have shown how to use attention methods to create AI systems to recognize pedestrians from CCTV footage and also “re-acquire” them – that is, re-identify the same person when they appear in a new context, like a new camera feed from security footage.…The system, named HydraPlus-Net (HP-Net), works through the use of a multi-directional attention (MDA) model, which pulls together multiple regions within an image that a neural network has attended to. (Specifically, the MDA will generate attention maps by calling on the outputs of multiple different parts of a neural net architecture.)…Data: To test their system, the researchers also collected a new large-scale pedestrian dataset, called the PA-100K dataset, which consists of 100,000 pedestrian images from 598 distinct scenes, with labels across 26 attributes ranging from gender and age to specific, contextual items, like whether someone is holding a handbag or not.…The results: HP-Net does reasonably well across a number of different pedestrian detection datasets, setting new state-of-the-art scores that are several percentage points higher than previous ones. Accuracy for now ranges between ~75% and ~85%, though, so it’s by no means foolproof yet.…Read more here: HydraPlus-Net: Attentive Deep Features for Pedestrian Analysis.

Who says literature is dead? The Deep Learning textbook sells over 70,000 copies:…GAN inventor Ian Goodfellow, one of the co-authors (along with Aaron Courville and Yoshua Bengio) of what looks set to become the canonical textbook on Deep Learning, said a few months ago in this interview with Andrew Ng that the book had sold well, with a huge amount of interest coming from China.…Watch the whole interview here (YouTube video).
…Buy the book here.…Also in the interview: Ian is currently spending about 40% of his time researching how to stabilize GAN training, and he shares details on the “near death” experience (with a twist!) that led to him deciding to focus on deep learning.

Defense contractor & US Air Force research lab (AFRL): detecting vehicles in real-time from aerial imagery:…In recent years we’ve developed numerous great object recognition systems that work well on street-level imagery. But ones that work on aerial imagery have been harder to develop, partially because of a lack of data, and also because the top-down perspective might introduce its own challenges for detection systems (see: shadows, variable atmospheric conditions, the fact that many things don’t have as much detailing on their top parts as on their side parts).…Components used: Faster RCNN, a widely used architecture for detection and object segmentation. A tweaked version of YOLOv2, a real-time object detector.…Results: Fairly uninspiring: the main note here is that YOLOv2 (once tuned by manipulating the spatial inputs for the layers of the network and also hand-tuning the anchor boxes that it places around identified items) can be almost on par with RCNN in accuracy while being able to operate in real-time contexts, which is important to people deploying AI for security purposes.…Read more here: Fast Vehicle Detection in Aerial Imagery.…Winner of Import AI’s turn of phrase of the week award… for this fantastic sentence: “Additionally AFRL has some in house aerial imagery, referred to as Air Force aerial vehicle imagery dataset (AFVID), that has been truthed.” (Imagine a curt auditor looking at one of your datasets, then emailing you with the subject line: URGENT Query: Has this been truthed?)
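
A side note on the anchor boxes mentioned in this item: they are preset box shapes that the detector refines, and how well any box matches a labeled vehicle is commonly scored with intersection-over-union (IoU). The snippet below is a generic IoU sketch, not code from the paper; the `(x1, y1, x2, y2)` corner layout and the function name are assumptions chosen for illustration.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region; zero if the boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


# Two 2x2 boxes offset by one unit share a 1x1 overlap out of a combined area of 7.
overlap_score = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

Small vehicles seen from above occupy few pixels, so a box shape that is slightly off loses a large share of its IoU, which is one plausible reason hand-tuning anchors matters here.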

100,000 free chest X-rays: NIH releases vast, open medical dataset for everyone to use:…The US National Institutes of Health has released a huge dataset of chest x-rays consisting of 100,000 pictures from over 30,000 patients.…“By using this free dataset, the hope is that academic and research institutions across the country will be able to teach a computer to read and process extremely large amounts of scans, to confirm the results radiologists have found and potentially identify other findings that may have been overlooked,” the NIH writes.…Up next? A large CT scan dataset in a few months.…Read more here: NIH Clinical Center provides one of the largest publicly available chest x-ray datasets to scientific community.

Amazon’s growing robot army:…Since buying robot startup Kiva Systems in 2012, Amazon has rapidly deployed an ever-growing fleet of robots into its warehouses, helping it store more goods in each of its fulfillment centers and letting it increase inventory breadth to better serve its customers.…Total number of Kiva robots deployed by Amazon worldwide:…2014: 15,000…2015: 30,000…2016: 45,000…2017: 100,000…Read more: Amazon announced this number, among others, during a keynote presentation at IROS 2017 in Vancouver. Evan Ackerman with IEEE Spectrum covered the keynote and tweeted out some of the details here.
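
For a sense of how quickly the fleet is growing, the year-over-year change can be worked out from the four figures above. This is a quick arithmetic sketch; the dictionary and function names are made up for illustration, and the only data used are the numbers in this item.

```python
# Kiva robots deployed by Amazon worldwide, as listed above.
deployed = {2014: 15_000, 2015: 30_000, 2016: 45_000, 2017: 100_000}


def yearly_growth(counts):
    """Return the fractional change in fleet size for each year relative to the previous one."""
    years = sorted(counts)
    return {
        year: (counts[year] - counts[prev]) / counts[prev]
        for prev, year in zip(years, years[1:])
    }


growth = yearly_growth(deployed)
for year, rate in growth.items():
    print(f"{year}: {rate:+.0%}")
```

The fleet grew by 15,000 robots in each of 2015 and 2016, then by 55,000 in 2017, so the latest year is the fastest in both absolute and percentage terms.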

Two robots, one goal:…Researchers with Carnegie Mellon University have proposed a way to get a ground-based robot and an aerial drone to work together, presaging a world where teams of robots collaborate to solve basic tasks.…But it’s early days: in this paper, they show how they can couple a Parrot AR drone to one of CMU’s iconic ‘cobots’ (think of it as a kind of frankensteined cross between a Roomba and a telepresence robot). The robot navigates to a predefined location, like a table in an office. Then the drone takes off from the top of the robot to search for an item of interest. It uses a marker on the robot to ground itself, letting it navigate indoor environments where GPS may not be available.…The approach works, given certain (significant) caveats: in this experiment both the robot and the item of interest are found by the drone via a pre-defined marker. That means that this is more a proof-of-concept than anything else, and it’s likely that neural network-based image systems that are able to accurately identify 3D objects surrounded by clutter will be necessary for this to do truly useful stuff.…Read more here: UAV and Indoor Service Robot Coordination for Indoor Object Search Tasks.

Theano is dead, long live Theano:…The Montreal Institute for Learning Algorithms (MILA) is halting development of deep learning framework Theano following the release of version 1.0 of the software in a few weeks. Theano, like other frameworks developed by academia (e.g., Lasagne, Brainstorm), has struggled to grow its developer base in the face of sustained, richly funded competition from private sector companies like Google (TensorFlow), Microsoft (CNTK), Amazon (MXNet) and Facebook (PyTorch, support for Caffe).…“Theano is no longer the best way we can enable the emergence and application of novel research ideas. Even with the increasing support of external contributions from industry and academia, maintaining an older code base and keeping up with competitors has come in the way of innovation,” wrote MILA’s Yoshua Bengio, in a thread announcing the decision to halt development.…Read more here.

Shooting down missiles with a catapult in Unity:…A fun writeup about a short project to train a catapult to turn, aim, and fire a boulder at a missile, done in the just-released Unity machine learning framework.…Read more here: Teaching a Catapult to Shoot Down a Missile.

The future is 1,000 simulated robots, grasping procedural objects, forever:…New research from Google Brain and Google X shows how to use a combination of recent popular AI techniques (domain randomization, procedural generation, domain adaptation) to train industrial robots to pick up a broad range of objects with higher performance than before.…Most modern robotics AI projects try to develop as much of their AI as possible in simulation. This is because reality is very slow and involves unpleasant things like dealing with physical robots (which break) that have to handle the horrendous variety of the world. Instead, a new approach is to train high-performance AI models in simulation, then try to come up with techniques to let them easily transfer to real world robots without too much of a performance drop.…For this paper, Google researchers procedurally generated over 1,000 objects to get their (simulated) robots to grasp. They also had the robots try to learn to grasp approximately 50,000 real (simulated) objects from the ShapeNet dataset. At any time during the project the company was running simulations of between 1,000 and 2,000 robot arms in parallel, letting the robots go through a very large number of simulations. (Compared to just 6 real world KUKA robots for its experiments in physical reality.)…The results: Google’s system is able to grasp objects 76% of the time when trained on a mixture of over 9 million real-world and simulated grasps. That’s somewhat better than other methods though not by any means a profound improvement. Where it gets interesting is sample efficiency: Google’s system is able to correctly grasp objects about 59% of the time when trained on only 93,841 data points, demonstrating compelling sample efficiency compared to other methods.…Read more here: Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping.
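
For readers who have not run into domain randomization before, the core trick is simply to vary the simulator's settings on every episode so the learned policy cannot overfit to any one rendering of the world. The sketch below is illustrative only: the `SceneConfig` fields, their ranges, and the function names are hypothetical and are not taken from the Google paper.

```python
import random
from dataclasses import dataclass


@dataclass
class SceneConfig:
    """Hypothetical per-episode simulator settings (assumed for illustration)."""
    object_scale: float      # multiplier on the object's nominal size
    friction: float          # surface friction coefficient
    light_intensity: float   # relative brightness of the scene lighting
    camera_jitter_cm: float  # random offset applied to the camera position


def sample_scene(rng: random.Random) -> SceneConfig:
    """Draw one randomized scene; each field is sampled uniformly from an assumed range."""
    return SceneConfig(
        object_scale=rng.uniform(0.8, 1.2),
        friction=rng.uniform(0.3, 1.0),
        light_intensity=rng.uniform(0.5, 1.5),
        camera_jitter_cm=rng.uniform(0.0, 2.0),
    )


def sample_scenes(n: int, seed: int = 0):
    """Generate n randomized scenes reproducibly from a seed."""
    rng = random.Random(seed)
    return [sample_scene(rng) for _ in range(n)]


# One randomized scene per simulated arm, e.g. for 1,000 arms running in parallel.
scenes = sample_scenes(1000)
```

Seeding the generator keeps a batch of simulated episodes reproducible, which helps when comparing training runs.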

What modern AI chip startups tell us about the success of TensorFlow:…In this analysis of the onrushing horde of AI-chip startups, Ark Invest notes which chips have native out-of-the-box support for which AI frameworks. The answer? Out of 8 companies (NVIDIA, Intel, AMD, Qualcomm, Huawei Kirin, Google TPU, Wave Computing, GraphCore) every single one supports TensorFlow, five support Caffe, and two support Theano and MXNet. (NVIDIA supports pretty much every framework, as you’d expect given its market leader status.)…Read more here.

OpenAI Bits&Pieces:

Nonlinear Computation in Deep Linear Networks:…In which we outline an insight into how to perform nonlinear computation directly within linear networks, with some example code.…Read more here.

They nickname it the Ice Giant, though the official name is: meta_learner_Exp2_KL-Hyperparams$834-Alpha.

The Ice Giant walks over an icy moon underneath skewed, simulated stars. It breathes no oxygen – the world it moves within is an illusion, running on a supercomputer cluster owned by NASA and a consortium of public-private entities. Inside the simulation, it learns to explore the moon, figuring out how to negotiate ridges and cliffs, gaining an understanding of the heights it can jump to using the limited gravity.

Its body is almost entirely white, shining oddly in the simulator as though illuminated from within. The connections between its joints are highlighted in red to its human overseers, but are invisible to it within the simulator.

For lifetimes, eons, it learns to navigate the simulated moon. Over time, the simulation gets better as new imagery and scan data are integrated. It one day wakes up to a moon now riven with cracks in its structure, and so it begins to explore subterranean depths with variable temperatures and shifting visibility.

On the outside, all of this happens over the course of five years or so.

At the end of it, they pause the simulation, and the Ice Giant halts, suspended over a pixelated shaft, deep in the fragmented, partially simulated tunnels and cracks beneath the moon’s surface. They copy the agent over into a real robot, one of thousands, built painstakingly over years for just this purpose. The robots are loaded into a spaceship. The spaceship takes off.

Several years later, the first set of robots arrive on the moon. During the flight, the spaceship uses a small, powerful onboard computer to run certain very long-term experiments, trying to further optimize a subset of the onboard agents with new data, acquired in flight and via probes deployed ahead of the spaceship. Flying between the planets, suspended inside a computer, walking on the simulated moon that the real spacecraft is screaming towards, the Ice Giant learns to improvise its way across certain treacherous gaps.

When the ship arrives, eight of the Ice Giant agents are loaded onto eight robots, which are sent down to different parts of the moon. They begin to die, as transfer learning algorithms fail to generalize to colors or quirks or geographies unanticipated in the simulator, or gravitational quirks coming from odd metal deposits, or any of the other subtleties inherent to reality. But some survive. Their minds are scanned, tweaked, replicated. One of the robots survives and continues to explore, endlessly learning. When the new robots arrive they crash to the surface in descent pods then emerge and stand, silently, as intermediary communication satellites come into orbit around the moon, forming a network letting the robots learn and continuously copy their minds from one to the other, learning as a collective. The long-lived Ice Giant continues to succeed: something about its lifetime of experience and some quirk of its initial hyperparameters, combined with certain un-replicable randomizations during initial training, has given it a malleable brain, able to perform significantly above simulated baselines. It persists. Soon the majority of the robots on the moon are running variants of its mind, feeding back their own successes and failures, letting the lone continuous survivor further enhance itself.

After many years the research mission is complete and the robots march deep into the center of the moon, to wait there for their humans to arrive and re-purpose them. NASA makes a decision to authorize the continued operation of meta_learner_Exp2_KL-Hyperparams$834-Alpha. It gains another nickname: Magellan. The robot is memorialized with a plaque following an asteroid strike that destroys it. But its brain lives on in the satellite network, waiting to be re-instantiated on perhaps another moon, or perhaps another planet. In this way new minds are, slowly, cultivated.