Abstract

Robots are increasingly present in everyday human life, and autonomous navigation is part of this progression. This study investigates robust simultaneous tracking of multiple objects.
A software application capable of tracking different objects, mainly humans and robots, is built using a Kinect camera mounted on a moving platform. Computer vision technology is used to provide a tool that improves human–robot interaction. By choosing the recently released Kinect camera as the sensor, colour and depth images can be combined, yielding a clear advantage in the tracking implementation. The sensor is mounted on a mobile robot that can report its own position at any moment from the sensors on its wheels. This information is essential for transforming data captured from different frames and positions into a common reference system.
To date, the Kinect has mainly been used to track one or two humans from a fixed position. This work shows that it can also track humans and much smaller objects simultaneously when the camera is mounted on a mobile robot.
The Point Cloud Library (PCL) is the framework used to process the data obtained from the sensor. Filtering, segmentation and clustering with PCL yield the detection of objects; descriptors are then created and compared with those obtained in previous frames to correlate detections across frames. A training process for the recognition of humans and robots is performed beforehand. The descriptor used is the Viewpoint Feature Histogram (VFH), which is invariant to object distance and size.
The main constraint imposed is that the solution must be integrated into the Mobotware framework; it has therefore been created as a plug-in running directly in the framework. The Kinect introduces limitations of its own, such as a narrow field of view and the impossibility of outdoor use.
The application is validated with tests and the results obtained are presented. Recognition of the trained objects proved successful, and the computation time, although not ideal, is sufficient to follow how the scene varies over time, and thus to track humans and SMRs moving at normal speed. In all cases the system detects and tracks the objects and keeps learning from its detections over time. However, when the camera itself moves, tracking becomes less effective, mainly due to the appearance of false detections.