How can robots perceive the world and their own movements so that they accomplish navigation and manipulation tasks? In this module, we will study how images and videos acquired by cameras mounted on robots are transformed into representations like features and optical flow. Such 2D representations allow us then to extract 3D information about where the camera is and in which direction the robot moves. You will come to understand how grasping objects is facilitated by the computation of 3D posing of objects and navigation can be accomplished by visual odometry and landmark-based localization.

From the lesson

Pose Estimation

In this module we will be learning about feature extraction and pose estimation from two images. We will learn how to find the most salient parts of an image and track them across multiple frames (i.e. in a video sequence). We will then learn how to use features to find the position of the camera with respect to another reference frame on a plane using Homographies. We will also learn about how to make these techniques more robust, using least squares to hand noisy feature points or RANSAC to remove completely erroneous feature points.