Scientists Develop Computer Program That Looks Five Minutes Into the Future

6/17/2018 5:01:01 PM

Computer scientists from the University of Bonn have developed software that can look a few minutes into the future.

The scientists developed software that first learns the typical sequence of actions in an activity, such as cooking, from video sequences. Based on this knowledge, it can then predict in new situations what the chef will do and at which point in time, Science Daily reports.

The researchers will present their findings at the Conference on Computer Vision and Pattern Recognition, the world's largest conference of its kind, which will be held June 19-21 in Salt Lake City, USA.

The training data used by the scientists included 40 videos in which performers prepare different salads. Each recording was around 6 minutes long and contained an average of 20 different actions. The videos were also annotated with precise details of when each action started and how long it took.

The computer "watched" these salad videos, which total around four hours. In this way, the algorithm learned which actions typically follow each other during this task and how long they last. This is by no means trivial: after all, every chef has his own approach. In addition, the sequence may vary depending on the recipe.
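To make the idea of learning "which actions typically follow each other and how long they last" concrete, here is a minimal sketch in Python. It is not the Bonn researchers' method, which the article does not describe in detail; it swaps in simple transition counts and average durations. The annotation format, the action names, and the functions `learn` and `predict` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical annotations: one list per video of (action, duration in seconds).
videos = [
    [("cut tomato", 30), ("cut cucumber", 25), ("mix salad", 20)],
    [("cut tomato", 40), ("cut cucumber", 35), ("mix salad", 10)],
    [("cut cucumber", 30), ("cut tomato", 20), ("mix salad", 30)],
]


def learn(videos):
    """Count which action follows which, and average each action's duration."""
    transitions = defaultdict(Counter)
    durations = defaultdict(list)
    for segments in videos:
        for action, seconds in segments:
            durations[action].append(seconds)
        # Pair every segment with the one that comes right after it.
        for (current, _), (following, _) in zip(segments, segments[1:]):
            transitions[current][following] += 1
    mean_duration = {a: sum(d) / len(d) for a, d in durations.items()}
    return transitions, mean_duration


def predict(last_action, transitions, mean_duration, steps):
    """Greedily chain the most frequent follow-up action, up to `steps` times."""
    forecast = []
    current = last_action
    for _ in range(steps):
        followers = transitions.get(current)
        if not followers:
            break  # no follow-up was ever observed after this action
        current = followers.most_common(1)[0][0]
        forecast.append((current, mean_duration[current]))
    return forecast


transitions, mean_duration = learn(videos)
forecast = predict("cut tomato", transitions, mean_duration, steps=3)
print(forecast)
```

With this toy data, the forecast after "cut tomato" is "cut cucumber" followed by "mix salad", each paired with its average duration. A real system would have to cope with far more variation between chefs and recipes than such counts can capture.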

"Then we tested how successful the learning process was," explains Prof. Dr. Jürgen Gall. "For this we confronted the software with videos that it had not seen before." The new videos did at least fit the context: they also showed the preparation of a salad. For the test, the computer was told what is shown in the first 20 or 30 percent of one of the new videos. On this basis, it then had to predict what would happen during the rest of the video.

That worked amazingly well. Gall: "Accuracy was over 40 percent for short forecast periods, but then dropped the more the algorithm had to look into the future." For activities that were more than three minutes in the future, the computer was still right in 15 percent of cases. However, a prediction was only considered correct if both the activity and its timing were correctly predicted.
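One simple way to score a forecast so that both the activity and its timing must match is to compare predicted and actual labels second by second. The sketch below assumes this scoring scheme for illustration; the article does not say exactly how the researchers computed accuracy. The function names and the example segments are hypothetical.

```python
def expand(segments):
    """Turn (action, whole seconds) segments into one label per second."""
    labels = []
    for action, seconds in segments:
        labels.extend([action] * seconds)
    return labels


def timed_accuracy(predicted, actual):
    """Fraction of actual seconds whose predicted label matches."""
    predicted_labels = expand(predicted)
    actual_labels = expand(actual)
    if not actual_labels:
        return 0.0
    # A second counts as a hit only if the right action is predicted
    # at the right time; seconds with no prediction count as misses.
    hits = sum(p == a for p, a in zip(predicted_labels, actual_labels))
    return hits / len(actual_labels)


# Hypothetical future part of a video and a forecast for it.
actual = [("cut cucumber", 30), ("mix salad", 20)]
predicted = [("cut cucumber", 20), ("mix salad", 30)]

score = timed_accuracy(predicted, actual)
print(score)
```

Here the forecast names both actions in the right order but switches to "mix salad" ten seconds too early, so 40 of the 50 seconds match and the score is 0.8. Under a scheme like this, getting the order right is not enough: timing errors lower the score too.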

Gall and his colleagues want the study to be understood only as a first step into the new field of activity prediction, especially since the algorithm performs noticeably worse if it has to recognize on its own what happens in the first part of the video, instead of being told. This automatic analysis is never 100 percent correct, so the predictions have to build on what Gall calls "noisy" data. "Our process does work with it," he says. "But unfortunately nowhere near as well."