As video technology has progressed, so too have the potential use cases, not least in terms of security. I've written previously about technologies that aim to spot criminals from CCTV footage, and there is a growing number of projects aiming to speed up security checks at airports using only video footage of travellers as they pass through.

New research from A*STAR highlights the latest development: technology that can understand what humans are doing in video footage. Whereas previous technologies were developed for use in areas such as security, the Singapore-based team believes its work will have the biggest impact in autonomous transportation, where vehicles will need to detect people such as police officers and then interpret their actions rapidly and accurately.

Understanding Intentions

Computers are already fairly good at detecting objects in static images, and such technology is widely used today, but doing the same in moving images is significantly harder.

"Understanding human actions in videos is a necessary step to build smarter and friendlier machines," the researchers explain.

The team uses deep learning to build upon previous attempts at this, which were largely slow and error-prone. The new method combines two kinds of neural network: a static neural network working alongside a recurrent neural network. The static network has already proven good at processing still images, whilst the recurrent network is more commonly deployed to process data that changes over time, such as in speech recognition. The team believes its approach is the first to bring detection and tracking together in one system.
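To make the idea of pairing the two network types more concrete, here is a minimal Python sketch of the general pattern: one component looks at each frame on its own, and a second component carries a running state from frame to frame. It is not the researchers' system. The names `frame_features`, `RecurrentCell` and `summarise_video` are hypothetical, the weights are random rather than trained, and the "static network" is replaced by two hand-picked statistics purely for illustration.

```python
import math
import random


def frame_features(frame):
    """Stand-in for the static network (an assumption, not the real model):
    summarise one frame, given as a 2D list of pixel intensities, as a
    two-number feature vector."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    spread = max(pixels) - min(pixels)
    return [mean, spread]


class RecurrentCell:
    """Stand-in for the recurrent network: a plain Elman-style cell that
    mixes the current frame's features with a hidden state carried over
    from earlier frames. Weights are random, not trained."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        self.w_in = [[rng.uniform(-0.5, 0.5) for _ in range(input_size)]
                     for _ in range(hidden_size)]
        self.w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
                      for _ in range(hidden_size)]

    def step(self, features, hidden):
        new_hidden = []
        for i in range(self.hidden_size):
            total = sum(w * x for w, x in zip(self.w_in[i], features))
            total += sum(w * h for w, h in zip(self.w_rec[i], hidden))
            new_hidden.append(math.tanh(total))
        return new_hidden


def summarise_video(frames, cell):
    """Extract features frame by frame, then fold them through the
    recurrent cell so the final state depends on the whole sequence."""
    hidden = [0.0] * cell.hidden_size
    for frame in frames:
        hidden = cell.step(frame_features(frame), hidden)
    return hidden


# Two tiny 2x2 "frames" with made-up pixel values between 0 and 1.
video = [[[0.0, 0.1], [0.2, 0.3]], [[0.9, 0.8], [0.7, 0.6]]]
cell = RecurrentCell(input_size=2, hidden_size=3)
state = summarise_video(video, cell)
reversed_state = summarise_video(video[::-1], cell)
```

Because the hidden state is passed from one frame to the next, playing the same frames in a different order gives a different final state. That sensitivity to order is what a still-image model on its own lacks, and it is why a recurrent component is a natural fit for recognising actions that unfold over time.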

The system was tested on around 3,000 videos that are routinely used in computer vision experiments. Testing showed that the system could outperform existing detection technology, achieving a success rate of around 20% on videos showing everyday activities. Whilst this is probably still too low for practical use, it nonetheless shows promising progress.