What’s TLD?

How does it work?

It is a long story; please read Zdenek’s paper for the details. If you compiled
ccv with FFMPEG support, you can run it from the command line:

./tld <Your Video> x y width height

It outputs the tracked coordinates for each frame.

What about performance?

TLD is implemented closely following Zdenek’s paper, but it still differs
significantly in quite a few aspects. I’ve run extensive tests to make sure that
its performance, in terms of both accuracy and speed, matches the original
implementation.

Accuracy-wise:

TLD uses a randomized algorithm, so the results can vary from run to run. I ran
ccv’s TLD implementation on the test videos with “rotation == 0” and otherwise
default parameters. By doing 3 runs and picking the median, I was able to
generate some meaningful data to analyze.
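The median-of-3 procedure above can be sketched as follows. This is only an
illustration of the aggregation step: the function name and the scores are
hypothetical placeholders, not data measured with ccv.

```python
from statistics import median

def median_of_runs(run_scores):
    """Return the median score across repeated runs of a randomized tracker."""
    if not run_scores:
        raise ValueError("need at least one run")
    return median(run_scores)

# Hypothetical accuracy scores from 3 runs on one test video (placeholders,
# not measured results).
scores = [0.71, 0.78, 0.74]
summary = median_of_runs(scores)
```

The median is less sensitive than the mean to a single unlucky run, which is
why it suits a randomized algorithm evaluated over a small number of runs.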

By enabling the “rotation” technique, you can achieve near real-time performance
on QVGA video with a minor loss of accuracy. With “rotation == 1” (the default
parameter), TLD spends around 15ms on tracking, 50ms on detecting, and 50ms on
learning for 320x240 video on a single thread of an i7-2620M at 2.7GHz.

Under the hood?

ccv’s TLD implementation differs from Zdenek’s original Matlab implementation in
several significant ways:

Zdenek’s implementation uses aspect-ratio normalized examples (15x15); these
examples are normalized in advance so that a simple multiplication yields the
correlation confidence. ccv’s implementation uses aspect-aware examples
(constrained to an area of 400); the examples are left as they are, and a
normalized correlation coefficient is computed to get the confidence score.
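The difference between the two approaches can be sketched like this. It is a
minimal illustration, assuming a zero-mean normalized correlation over
flattened pixel lists; the function names are hypothetical and this is not
ccv’s or Zdenek’s actual code, nor does it show how either implementation maps
the correlation to a final confidence.

```python
import math

def normalized_correlation(a, b):
    """Zero-mean normalized correlation of two raw patches (flattened)."""
    if len(a) != len(b) or not a:
        raise ValueError("patches must be non-empty and the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den if den > 0 else 0.0

def pre_normalize(a):
    """Normalize a patch once (zero mean, unit length) when it is stored."""
    mean = sum(a) / len(a)
    d = [x - mean for x in a]
    norm = math.sqrt(sum(x * x for x in d))
    return [x / norm for x in d] if norm > 0 else d

def dot(a, b):
    """Plain multiply-and-sum; enough once both patches are pre-normalized."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical 2x2 patches, flattened row by row.
p = [10.0, 20.0, 30.0, 45.0]
q = [12.0, 19.0, 33.0, 40.0]

on_the_fly = normalized_correlation(p, q)             # normalize at compare time
precomputed = dot(pre_normalize(p), pre_normalize(q)) # normalize at store time
```

Both paths give the same value; the trade-off is whether the normalization cost
is paid once when an example is stored or each time two examples are compared.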

4). Pseudo-random Number Generator:

Zdenek’s implementation uses srand() for random number generation and seeds it
with 0. ccv’s implementation uses a Mersenne-Twister random number generator with
an environment-dependent seed.
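The practical effect of the two seeding strategies can be sketched as follows.
Python’s random module happens to use a Mersenne Twister generator, so it serves
as a stand-in here; the use of os.urandom as the seed source is an assumption
for illustration, since the text does not say which environment value ccv
actually uses.

```python
import os
import random

# Fixed seed: every run draws the same sequence, so results are repeatable.
fixed_a = random.Random(0)
fixed_b = random.Random(0)
seq_a = [fixed_a.random() for _ in range(3)]
seq_b = [fixed_b.random() for _ in range(3)]

# Environment-dependent seed (assumed source: OS entropy). Each run is likely
# to draw a different sequence, so results vary from run to run.
env_seed = int.from_bytes(os.urandom(8), "little")
env_rng = random.Random(env_seed)
seq_env = [env_rng.random() for _ in range(3)]
```

With a fixed seed, repeated runs are identical. With an environment-dependent
seed, they are not, which is why repeating the runs and taking the median is
useful when evaluating accuracy.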