MIT Computers Predict Future Behavior

Using algorithms partially modeled on the human brain,
researchers from the Massachusetts Institute of Technology have enabled
computers to predict the immediate future by examining a photograph.

A program created at MIT's Computer Science and
Artificial Intelligence Laboratory (CSAIL) essentially watched 2 million online
videos and observed how different types of scenes typically progress: people
walk across golf courses, waves crash on the shore, and so on. Now, when it
sees a new still image, it can generate a short video clip (roughly 1.5 seconds
long) showing its vision of the immediate future.

"It's a system that tries to learn what are
plausible videos — what are plausible motions you might see," says Carl
Vondrick, a graduate student at CSAIL and lead author on a related research
paper to be presented this month at the Neural Information Processing Systems
conference in Barcelona. The team aims to generate longer videos with more
complex scenes in the future.

But Vondrick says applications could one day go beyond
turning photos into computer-generated GIFs. The system's ability to predict
normal behavior could help spot unusual happenings in security footage or
improve the reliability of self-driving cars, he says.

If the system spots something unusual, like an animal of
a type it hasn't seen before running into the road, Vondrick explains that the
vehicle "can detect that and say, 'Okay, I've never seen this situation
before — I can stop and let the driver take over,' for example."

To create the program, the MIT team relied on a
scientific technique called deep learning that's become central to modern
artificial intelligence research. It's the approach that lets digital
assistants like Apple's Siri and Amazon's Alexa understand what users want, and
that drives image search and facial recognition advancements at Facebook and
Google.

Experts say deep learning, which uses mathematical
structures called neural networks to pull patterns from massive sets of data,
could soon let computers make diagnoses from medical images, detect bank fraud,
predict customer order patterns, and operate vehicles at least as well as
people.

"Deep neural networks are performing better than
humans on all kinds of significant problems, like image recognition, for
example," says Chris Nicholson, CEO of San Francisco startup Skymind,
which develops deep learning software and offers consulting. "Without
them, I think self-driving cars would be a danger on the roads, but with them,
self-driving cars are safer than human drivers."

Neural networks take low-level inputs, like the pixels of
an image or snippets of audio, and run them through a series of virtual layers,
which assign relative weights to each individual piece of data in interpreting
the input. The "deep" in deep learning refers to using tall stacks of
these layers to collectively uncover more complex patterns in the data,
expanding its understanding from pixels to basic shapes to features like stop
signs and brake lights. To train the networks, programmers repeatedly test them
on large sets of data, automatically tweaking the weights so the network makes
fewer and fewer mistakes over time.
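To make that layered picture concrete, here is a minimal sketch in Python using NumPy. It is not the MIT system or any production network: the layer sizes, the random weights, and the two-way "stop sign or not" output are invented purely for illustration, but they show how each virtual layer weights the values coming from the layer below and how stacking layers builds up from raw pixels to higher-level features.

```python
# A minimal sketch (not MIT's system) of a tiny stacked network.
# Layer sizes, weights, and the two-class output are made up for illustration;
# a real image network would be far larger and its weights learned from data.
import numpy as np

rng = np.random.default_rng(0)

def layer(x, weights, biases):
    """One 'virtual layer': weight every incoming value, sum, then apply a
    simple nonlinearity (ReLU) so stacked layers can express complex patterns."""
    return np.maximum(0.0, weights @ x + biases)

# Low-level input: an 8x8 grayscale image flattened into 64 pixel values.
pixels = rng.random(64)

# A small "stack" of layers: pixels -> basic shapes -> higher-level features.
w1, b1 = rng.normal(size=(32, 64)) * 0.1, np.zeros(32)
w2, b2 = rng.normal(size=(16, 32)) * 0.1, np.zeros(16)
w3, b3 = rng.normal(size=(2, 16)) * 0.1, np.zeros(2)   # e.g. "stop sign" vs. "not"

h1 = layer(pixels, w1, b1)   # first layer: weighted combinations of raw pixels
h2 = layer(h1, w2, b2)       # second layer: patterns built from the first layer
scores = w3 @ h2 + b3        # output scores; training would tweak all the weights

print(scores)
```

Training, described next in the article, is the process of repeatedly adjusting those weight matrices so the output scores make fewer and fewer mistakes.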

While research into neural networks, loosely based on the
human brain, dates back decades, progress has been particularly remarkable in roughly
the past ten years, Nicholson says. A 2006 set of papers by renowned computer
scientist Geoffrey Hinton, who now divides his time between Google and the
University of Toronto, helped pave the way for deep learning's rapid
development.

In 2012, a team including Hinton was the first to use
deep learning to win a prestigious computer science competition called the
ImageNet Large Scale Visual Recognition Challenge. The team's program beat
rivals by a wide margin at classifying objects in photographs into categories,
performing with a 15.3 percent error rate compared to a 26.2 percent rate for
the second-place entry.

This year, a Google-designed computer trained by deep
learning defeated one of the world's top Go players, a feat many experts in the
ancient Asian board game had previously thought was still decades away. The
system, called AlphaGo, learned in part by playing millions of simulated games
against itself. While human chess players have long been bested by digital
rivals, many experts had thought Go — which has significantly more sequences of
valid moves — could be harder for computers to grasp.

In early November, a group from the University of Oxford
unveiled a deep learning-based lipreading system that can outperform human
experts. And this week, a team including researchers from Google published a
paper in the Journal of the American Medical Association showing that deep
learning could spot diabetic retinopathy roughly as well as trained
ophthalmologists. That eye condition can cause blindness in people with
diabetes, especially if they don't have access to testing and treatment.

"A lot of people don't have access to a specialist
who can access these [diagnostic] films, especially in underserved populations
where the incidence of diabetes is going up and the number of eyecare professionals
is flat," says Dr. Lily Peng, a product manager at Google and lead author
on the paper.

Like many of deep learning's successes, the retinopathy
research relied on a large set of training data, including roughly 128,000
images already classified by ophthalmologists. Deep learning is fundamentally a
technique for the internet age, requiring datasets that only a few years ago
would have been too big to even fit on a hard drive.

"It's not as useful in cases where there's not much
data available," Vondrick says. "If it's very difficult to acquire
data, then deep learning may not get you as far."

Computers need a lot more examples than humans do to
learn the same skills. Recent editions of the ImageNet challenge, which has
added more demanding object recognition and scene analysis tasks as
algorithms have grown more sophisticated, included hundreds of gigabytes of
training data — orders of magnitude larger than a CD or DVD. Developers at
Google train new algorithms from the company's sweeping archive of search
results and clicks, and companies racing to build self-driving vehicles collect
vast amounts of sensor readings from heavily instrumented, human-driven cars.

"Getting the right type of data is actually the most
critical bit," says Sameep Tandon, CEO of Bay Area autonomous car startup
Drive.ai. "One hundred hours of just driving straight down Highway 5 in
California is not going to help when you're driving down El Camino in Mountain
View, for example."

Once all that data is collected, the neural networks
still need to be trained. Experts say, with a bit of awe, that the math
operations involved aren't beyond an advanced high school student — some clever
matrix multiplications to weight the data points and a bit of calculus to
refine the weights in the most efficient way — but all those computations still
add up.
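As a rough illustration of those two operations, the toy Python snippet below fits a single layer of weights by repeating exactly that recipe: a matrix multiplication to weight the data points, and a calculus-derived gradient step to refine the weights. The dataset and the "true" weights here are random numbers invented for the example, not anything from the systems discussed in the article.

```python
# A toy illustration of training: weight the data with a matrix multiply,
# then use the gradient (a bit of calculus) to nudge the weights toward
# fewer mistakes. The data is randomly generated purely for this example.
import numpy as np

rng = np.random.default_rng(1)

X = rng.random((100, 5))            # 100 examples, 5 input values each
true_w = np.array([2.0, -1.0, 0.5, 0.0, 3.0])
y = X @ true_w                      # targets the model should learn to reproduce

w = np.zeros(5)                     # start with untrained weights
learning_rate = 0.5

for step in range(500):
    predictions = X @ w                   # matrix multiply: weight the data points
    errors = predictions - y
    gradient = X.T @ errors / len(y)      # derivative of the mean squared error
    w -= learning_rate * gradient         # tweak weights to reduce the mistakes

print(w)   # after training, w ends up close to true_w
```

Scaled up to millions of weights and millions of examples, that repeated multiply-and-adjust loop is what makes training so computationally expensive.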

"If you have this massive dataset, but only a very
weak computer, you're going to be waiting a long time to train that
model," says Evan Shelhamer, a graduate student at the University of
California at Berkeley and lead developer on Caffe, a widely used open source
toolkit for deep learning.

Only modern computers, along with an internet-enabled
research community sharing tools and data, have made deep learning practical.
But researchers say it's still not a perfect fit for every situation. One
limitation is that it can be difficult to understand how neural networks are
actually interpreting the data, something that could give regulators pause if
the algorithms are used for sensitive tasks like driving cars, evaluating
medical images, or computing credit scores.

"Right now, deep learning does not have enough
explanatory power," Nicholson says. "It cannot always tell you why it
reached a decision, even if it's reaching that decision with better accuracy
than any other [technique]."

The systems could also have potential blind spots not
caught by initial training and test data, potentially leading to unexpected
errors in unusual situations. And perhaps luckily for humans, current deep
learning systems aren't intelligent enough to learn new skills on their own,
even ones closely related to what they can already do, without a good deal of
separate training.

"A network for identifying coral knows nothing about
identifying, even, grass from sidewalk," Shelhamer says. "The Go
network isn't just going to become a master at checkers on its own."
