Extended Object Tracking

This example shows you how to track extended objects. Extended objects are objects whose dimensions span multiple sensor resolution cells. As a result, the sensors report multiple detections of the extended objects in a single scan. In this example, you will use different techniques to track extended objects and evaluate their tracking performance.

Introduction

In conventional tracking approaches such as global nearest neighbor (multiObjectTracker, trackerGNN), joint probabilistic data association (trackerJPDA) and multi-hypothesis tracking (trackerTOMHT), tracked objects are assumed to return one detection per sensor scan. With the development of sensors that have better resolution, such as a high-resolution radar, the sensors typically return more than one detection of an object. For example, the image below depicts multiple detections for a single vehicle that spans multiple radar resolution cells. In such cases, the technique used to track the objects is known as extended object tracking [1].

The key benefit of using a high-resolution sensor is getting more information about the object, such as its dimensions and orientation. This additional information can improve the probability of detection and reduce the false alarm rate.

Extended objects present new challenges to conventional trackers, because these trackers assume a single detection per object per sensor. In some cases, you can segment the sensor data to provide the conventional trackers with a single detection per object. However, by doing so, the benefit of using a high-resolution sensor may be lost.

In contrast, extended object trackers can handle multiple detections per object. In addition, these trackers can estimate not only the kinematic states, such as position and velocity of the object, but also the dimensions and orientation of the object. In this example, you track vehicles around the ego vehicle using the following trackers:

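A conventional multi-object tracker using a point-target model, multiObjectTracker

A GGIW-PHD (Gamma Gaussian Inverse Wishart PHD) tracker, trackerPHD with a ggiwphd filter

A prototype extended object tracker using a rectangular target model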

You will also evaluate the tracking results of all trackers using trackErrorMetrics and trackAssignmentMetrics.

Setup

Scenario

In this example, which recreates the example Sensor Fusion Using Synthetic Radar and Vision Data (Automated Driving Toolbox), there is an ego vehicle and three other vehicles: a vehicle ahead of the ego vehicle in the right lane, a vehicle behind the ego vehicle in the right lane, and an overtaking vehicle. The overtaking vehicle begins its motion behind the three other vehicles, moves to the left lane to pass them, and ends in the right lane ahead of all three vehicles.

In this example, you simulate an ego vehicle that has 6 radar sensors and 2 vision sensors covering the 360 degree field of view. The sensors have some overlap and some coverage gap. The ego vehicle is equipped with a long-range radar sensor and a vision sensor on both the front and back of the vehicle. Each side of the vehicle has two short-range radar sensors, each covering 90 degrees. One sensor on each side covers from the middle of the vehicle to the back. The other sensor on each side covers from the middle of the vehicle forward.

In this example, you use some key metrics to assess the tracking performance of each tracker. In particular, you assess the trackers based on their accuracy in estimating the positions, velocities, dimensions (length and width) and orientations of the objects. These metrics can be evaluated using the trackErrorMetrics class. To define the error of a tracked target from its ground truth, this example uses a 'custom' error function, extendedTargetError, listed at the end of this example.
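As a rough illustration of the kind of errors described above, the following sketch computes position, velocity, dimension, and yaw errors for one track and one truth object. It is not the extendedTargetError function from this example. The struct fields, the sample values, and the helper wrapTo180Deg are assumptions made for illustration.

```matlab
% Hypothetical track and truth values (all names and layouts are assumptions).
track.Position = [20.3; -1.1];   track.Velocity = [9.8; 0.2];
track.Dimensions = [4.5; 1.7];   track.Yaw = 178;      % degrees
truth.Position = [20.0; -1.5];   truth.Velocity = [10; 0];
truth.Dimensions = [4.7; 1.8];   truth.Yaw = -179;     % degrees

% Wrap an angle difference into the interval [-180, 180) degrees.
wrapTo180Deg = @(a) mod(a + 180, 360) - 180;

% Euclidean errors for the vector quantities, wrapped error for the yaw.
posError = norm(track.Position - truth.Position);
velError = norm(track.Velocity - truth.Velocity);
dimError = norm(track.Dimensions - truth.Dimensions);
yawError = abs(wrapTo180Deg(track.Yaw - truth.Yaw));

fprintf('Position %.2f m, velocity %.2f m/s, dimensions %.2f m, yaw %.1f deg\n', ...
    posError, velError, dimError, yawError);
```

Wrapping the yaw difference matters near the ±180 degree boundary: in this sketch, headings of 178 and -179 degrees differ by 3 degrees, not 357.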

You will also assess the performance based on metrics such as the number of false tracks or redundant tracks. These metrics can be calculated using the trackAssignmentMetrics class. To define the distance between a tracked target and a truth object, this example uses a 'custom' distance function, extendedTargetDistance, listed at the end of this example. The function defines the distance metric as the Euclidean distance between the track position and the truth position.
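The distance described above can be sketched in a few lines. This is not the extendedTargetDistance function itself. The state layout [x;vx;y;vy] and the names trackState, truthPosition, and euclideanDistance are assumptions made for illustration.

```matlab
% Hypothetical track state [x; vx; y; vy] and 2-D truth position (assumed layout).
trackState = [10; 2; 3; 0];
truthPosition = [13; -1];

% Extract the position entries (indices 1 and 3) and take the Euclidean norm.
positionIndex = [1 3];
euclideanDistance = @(state,truthPos) norm(state(positionIndex) - truthPos(:));

d = euclideanDistance(trackState, truthPosition);
fprintf('Distance between track and truth: %.2f m\n', d);
```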

Point Object Tracker

The multiObjectTracker System object™ assumes that every object can be detected at most once by a sensor in a scan, and it uses a global nearest neighbor approach to associate detections to tracks. In this case, the simulated radar sensors have a high enough resolution to generate multiple detections per object. If these detections are not clustered, the tracker generates multiple tracks per object. Clustering returns one detection per cluster, at the cost of having a larger uncertainty covariance and losing information about the true object dimensions. Clustering also makes it hard to distinguish between two objects when they are close to each other, for example, when one vehicle passes another vehicle.

These results show that, with clustering, the tracker can keep track of the objects in the scene. However, they also show that the track associated with the overtaking vehicle (yellow) moves from the front of the vehicle at the beginning of the scenario to the back of the vehicle at the end. At the beginning of the scenario, the overtaking vehicle is behind the ego vehicle (blue), so radar and vision detections are made from its front. As the overtaking vehicle passes the ego vehicle, radar detections are made from the side of the overtaking vehicle and then from its back, and the track moves to the back of the vehicle.

You can also see that the clustering is not perfect. When the passing vehicle passes the vehicle that is behind the ego vehicle (purple), both tracks are slightly shifted to the left due to the imperfect clustering. Similarly, throughout the scenario, the clustering sometimes fails to group all the radar detections from the same object into a single cluster. As a result, the point tracker generates two tracks for the overtaking vehicle.
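The trade-off of clustering described in this section can be illustrated with a small sketch. It replaces several detections of one vehicle with their centroid and inflates the covariance by the spread of the detections. This is one simple way to form a clustered detection and is not necessarily the clustering used in this example. The detection values and the noise covariance R are assumptions.

```matlab
% Hypothetical radar detections of one vehicle, as [x; y] columns in meters.
detections = [20.1 21.5 22.8 24.0; 1.9 2.2 1.8 2.1];
% Assumed per-detection measurement noise covariance.
R = diag([0.1 0.1]);

% A point tracker sees only the centroid of the cluster.
centroid = mean(detections, 2);

% The spread of the detections inflates the uncertainty of the centroid.
spread = cov(detections');
clusterCovariance = R + spread;

disp(centroid');
disp(clusterCovariance);
```

The clustered detection keeps a single position and a larger covariance. The individual detections, which carry the information about the length and width of the vehicle, are discarded.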

GGIW-PHD Extended Object Tracker

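In this section, you use a GGIW-PHD tracker (trackerPHD with ggiwphd) to track objects. Unlike multiObjectTracker, which uses one filter per track, the GGIW-PHD is a multi-target filter that describes the probability hypothesis density (PHD) of the scenario. To model the extended target, GGIW-PHD uses the following distributions:

Gamma: Positive value that describes the expected number of detections.

Gaussian: State vector that describes the target's kinematic state.

Inverse-Wishart: Positive-definite matrix that describes the elliptical extent.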

The model assumes that the distributions are mutually independent. Thus, the probability hypothesis density (PHD) in the GGIW-PHD filter is described by a weighted sum of the probability density functions of several GGIW components.
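The weighted sum described above can be written compactly. The notation here is an assumption chosen for illustration and does not come from this example: \gamma is the measurement rate, x the kinematic state, X the extent matrix, w_j the weight of component j, and J the number of components.

```latex
D(\gamma, x, X) = \sum_{j=1}^{J} w_j \,
    \mathcal{G}(\gamma;\, \alpha_j, \beta_j) \,
    \mathcal{N}(x;\, m_j, P_j) \,
    \mathcal{IW}(X;\, v_j, V_j)
```

Each term is a product of a gamma density with shape \alpha_j and rate \beta_j, a Gaussian density with mean m_j and covariance P_j, and an inverse-Wishart density with v_j degrees of freedom and scale matrix V_j. The product form reflects the independence assumption.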

A PHD tracker requires calculating the detectability of each component in the density. The calculation of detectability requires configurations of each sensor used with the tracker. You define these configurations for trackerPHD using the trackingSensorConfiguration class.

Set up sensor configurations

% Allocate memory for configurations.
sensorConfig = cell(numel(sensors),1);

% Allocate memory for resolution noise.
resolutionNoise = zeros(2,2,numel(sensors));

for i = 1:numel(sensors)
    % Configuration of each sensor. SensorLimits are defined by the
    % field of view and are the only parameters for detectability.
    sensorConfig{i} = trackingSensorConfiguration(sensors{i}.SensorIndex,...
        'SensorLimits',[-1/2 1/2].*sensors{i}.FieldOfView',...
        'FilterInitializationFcn',@initCVPHDFilter,'SensorTransformFcn',...
        @cvmeas);

    % Origin of the sensor with respect to the ego vehicle
    originPosition = [sensors{i}.SensorLocation(:);sensors{i}.Height];

    % Orientation of the sensor with respect to the ego vehicle
    orientation = rotmat(quaternion([sensors{i}.Yaw sensors{i}.Pitch sensors{i}.Roll],'eulerd','ZYX','frame'),'frame');

    % The SensorTransformParameters of the configuration requires
    % information about the sensor's mounting location and orientation in
    % the tracking frame, assembled in a struct in the following format:
    coordTransforms(1) = struct('Frame','Spherical',...
        'OriginPosition',originPosition,...
        'Orientation',orientation,...
        'HasVelocity',false,...
        'OriginVelocity',zeros(3,1),...
        'HasRange',false);

    % Set the transform for each sensor
    sensorConfig{i}.SensorTransformParameters = coordTransforms;

    % All sensors are at the same update rate and the tracker is updated
    % at the same rate. Therefore, IsValidTime is true for all sensors.
    sensorConfig{i}.IsValidTime = true;

    if i <= 6
        % A radar cannot report a detection with accuracy more than its
        % resolution for extended targets. Radars typically use a point
        % target model, which does not capture the correct uncertainty.
        resolutionNoise(:,:,i) = diag([sensors{i}.AzimuthResolution sensors{i}.RangeResolution].^2);

        % Clutter density for radar sensors
        sensorConfig{i}.ClutterDensity = sensors{i}.ClutterDensity;
    else
        % Vision sensors report 1 detection per object. This information
        % can be captured in the sensor configuration. This enables the
        % tracker to not generate possible partitions for those detections.
        sensorConfig{i}.MaxNumDetsPerObject = 1;

        % Clutter density for vision sensor
        sensorConfig{i}.ClutterDensity = sensors{i}.ClutterDensity;
    end
end

Define the tracker.

% The trackerPHD creates multiple possible partitions of a set of
% detections and evaluates them against the current components in the PHD
% filter. The 2 and 5 in the function below define the lower and upper
% Mahalanobis distance thresholds between detections. This is equivalent
% to defining that each cluster of detections must be a minimum of 2
% resolutions apart and a maximum of 5 resolutions apart from each other.
partFcn = @(x)partitionDetections(x,2,5);

tracker = trackerPHD('SensorConfigurations',sensorConfig,...
    'PartitioningFcn',partFcn,...
    'ConfirmationThreshold',0.75,... % Weight of a density component to be called a confirmed track.
    'MergingThreshold',200,... % Threshold to merge components belonging to same track ID.
    'AssignmentThreshold',50); % Minimum distance of a detection cell to give birth in the density.

These results show that the GGIW-PHD can handle multiple detections per object per sensor, without the need to cluster these detections first. Moreover, by using the multiple detections, the tracker estimates the position, velocity, dimensions and orientation of each object. The dashed elliptical shape in the figure demonstrates the expected extent of the target.

The GGIW-PHD filter assumes that detections are distributed around the target's elliptical center. Therefore, the tracks tend to follow observable portions of the vehicle. Such observable portions include the rear face of the vehicle that is directly ahead of the ego vehicle and the front face of the vehicle that is directly behind the ego vehicle. In contrast, the length and width of the passing vehicle were fully observed during the simulation. Therefore, its estimated ellipse has a better overlap with the actual shape.
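The idea behind partitioning detections at several distance thresholds, as configured for the tracker in this section, can be pictured with a simplified distance-partitioning sketch. This is not the toolbox implementation of partitionDetections. It links detections whose Mahalanobis distance is below a threshold and reads the cells off the connected components. The detection values and the covariance R are assumptions.

```matlab
% Hypothetical 2-D detections, one [x; y] column each, with a shared
% measurement noise covariance (both are assumptions for this sketch).
z = [0 0.5 2.0 10.0 10.4; 0 0.2 0 5.0 5.3];
R = 0.25*eye(2);
n = size(z,2);

% Pairwise Mahalanobis distances between detections.
D = zeros(n);
for a = 1:n
    for b = 1:n
        dz = z(:,a) - z(:,b);
        D(a,b) = sqrt(dz'/R*dz);
    end
end

% For each threshold, link detections closer than the threshold and
% read the cells off the connected components of the resulting graph.
thresholds = [2 5];
numCells = zeros(size(thresholds));
for k = 1:numel(thresholds)
    A = (D < thresholds(k)) & ~eye(n);
    cellIdx = conncomp(graph(A));
    numCells(k) = max(cellIdx);
    fprintf('Threshold %g: %d cells\n', thresholds(k), numCells(k));
end
```

With these values, the lower threshold splits the detections into three cells and the upper threshold merges them into two. Evaluating several such partitions lets a tracker choose the grouping that best matches its current components instead of committing to a single clustering.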

Prototype Extended Object Tracker

To create the prototype extended object tracker, you first have to specify a model for the extended object. The following model defines the extended objects as rectangular, with similar dimensions to the ego vehicle. Each extended object is assumed to be making a coordinated turn about its pivot, located at the center of the rear axle.

Create the tracker object by defining the model used for tracking, the number of particles used for tracked objects and for undetected objects, and the sampling algorithm. In this example, you use Gibbs sampling to associate the detections with the tracks of extended objects.

These results show that the prototype can also handle multiple detections per object per sensor. Similar to GGIW-PHD, it also estimates the size and orientation of the object.

Notice that the estimated tracks, which are modeled as rectangles, fit the simulated ground truth objects well, which are depicted by the solid color patches. In particular, the tracks are able to correctly track the shape of the vehicle along with the kinematic center.

Evaluate Tracking Performance

Evaluate the tracking performance of each tracker using quantitative metrics such as the estimation error in position, velocity, dimensions and orientation. Also evaluate the track assignments using metrics such as redundant and false tracks.

The assignment metrics illustrate that two redundant tracks, Track 11 and Track 93, were initialized and confirmed by the point object tracker. The redundant tracks result from imperfect clustering, where detections belonging to the same target were clustered into more than one clustered detection. Also, the point object tracker created and confirmed a few false tracks: Track 2, 18 and 66. In contrast, the GGIW-PHD tracker and the prototype tracker maintain tracks on all three targets and do not create any false or redundant tracks. These metrics show that both extended object trackers correctly partition the detections and associate them with the correct tracks.

The plot shows the average estimation errors for the three types of trackers used in this example. Because the point object tracker does not estimate the yaw and dimensions of the objects, they are not shown in the plots. The point object tracker is able to estimate the kinematics of the objects with reasonable accuracy. The position error of the vehicle behind the ego vehicle is higher because its track was dragged to the left when the passing vehicle overtook it. This is also an artifact of imperfect clustering when the objects are close to each other.

As described earlier, the GGIW-PHD tracker assumes that measurements are distributed around the object's extent, which places the centers of the tracks on observable parts of the vehicle. This can also be seen in the position error metrics for TruthID 2 and 4. The tracker is able to estimate the dimensions of the object with about 0.3 meters accuracy for the vehicles ahead of and behind the ego vehicle. Because of the higher certainty defined for the vehicles' dimensions in the initCVPHDFilter function, the tracker does not collapse the length of these vehicles, even when the best-fit ellipse has a very low length. Because the passing vehicle (TruthID 3) was observed on all dimensions, its dimensions are measured more accurately than those of the other vehicles. However, as the passing vehicle maneuvers with respect to the ego vehicle, the error in its yaw estimate is higher.

The prototype uses a rectangular shaped target and uses ray-intersections to evaluate each detection against the estimated track state. This model helps the tracker estimate the shape and orientation more accurately. However, the process of ray-intersections, combined with sequential Monte Carlo methods, is computationally more expensive than using closed-form distributions.

Compare Time Performance

Previously, you learned about different techniques, the assumptions they make about target models, and the resulting tracking performance. Now compare the run times of the trackers. Notice that the GGIW-PHD filter offers significant computational advantages over the prototype, at the cost of slightly decreased tracking performance.

function phd = initCVPHDFilter(varargin)
% Initialize a 2-dimensional constant velocity PHD filter.

% The initcvggiwphd creates a 3-D constant velocity filter.
phd3d = initcvggiwphd(varargin{:});

% Create a 2-D filter using the first four states.
% PositionIndex is [1 3] as the constant velocity state is defined as
% [x;vx;y;vy].
% ProcessNoise is defined as the acceleration in the x and y directions.
% Because of time-dependence it is of non-additive nature. See constvel
% for more information.
phd = ggiwphd(phd3d.States(1:4,:),phd3d.StateCovariances(1:4,1:4,:),...
    'StateTransitionFcn',@constvel,...
    'StateTransitionJacobianFcn',@constveljac,...
    'MeasurementFcn',@cvmeas,...
    'MeasurementJacobianFcn',@cvmeasjac,...
    'HasAdditiveMeasurementNoise',true,...
    'HasAdditiveProcessNoise',false,...
    'ProcessNoise',diag([0.1 0.1]),...
    'MaxNumComponents',1000,...
    'ExtentRotationFcn',@(x,dT)eye(2),...
    'PositionIndex',[1 3]);

% Set the sizes of the filter when created with a detection cell. Without
% any input arguments, the function, initcvggiwphd, creates a filter with
% no components.
if nargin > 0
    % A higher value on degree of freedom represents higher certainty in
    % the dimensions. The initial value is provided with that of a standard
    % passenger vehicle. A higher certainty also prevents collapsing of
    % length or width of the track when only one of the faces is visible.
    phd.DegreesOfFreedom = 1000;
    phd.ScaleMatrices = (1000-4)*diag([4.7/2 1.8/2].^2);
    phd.GammaForgettingFactors = 1.03;
end
end
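The extent prior set in the initialization function can be unpacked with a short arithmetic sketch. It assumes that the factor (1000-4) in the source code is the normalization that relates the scale matrix to the intended extent, and that 4.7 and 1.8 are the length and width of a passenger vehicle in meters. The variable names are hypothetical.

```matlab
% Values taken from the initialization function; names are assumptions.
degreesOfFreedom = 1000;
vehicleLength = 4.7;   % meters
vehicleWidth = 1.8;    % meters

% Squared semi-axes of an ellipse that spans the vehicle length and width.
extent = diag([vehicleLength/2 vehicleWidth/2].^2);
scaleMatrix = (degreesOfFreedom - 4)*extent;

% Dividing by the same factor recovers the semi-axes of the ellipse.
semiAxes = sqrt(diag(scaleMatrix/(degreesOfFreedom - 4)));
fprintf('Semi-axes: %.2f m by %.2f m\n', semiAxes(1), semiAxes(2));
```

Under these assumptions, the prior ellipse has semi-axes of 2.35 meters and 0.90 meters, which is half the length and half the width of the vehicle.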

Summary

This example showed how to track objects that return multiple detections in a single sensor scan using different approaches. These approaches can be used to track objects with high-resolution sensors, such as a radar or laser sensor.