OpenFP

What is OpenFP?

OpenFP creates audio fingerprint files from music tracks with the OpenFP client and matches them against a set of reference fingerprints provided by the OpenFP server.
The following picture provides an overview of the tooling:

Before using the matching software, you need to extract fingerprints with the extraction tool openfp_extract. All fingerprints are then loaded into openfp_server during startup.

Match requests are issued against the server with the openfp_match client, which queries the server and presents the results.

How does it work?

Audio Extraction

To process input audio we use libfftw3, which operates on raw PCM16
audio data. This format can be created using ffmpeg with the
parameters "-f u16le -acodec pcm_s16le". openfp_extract does this
automatically, so it can extract fingerprints from any input supported
by ffmpeg (videos, audio files).
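
As an illustration of the conversion step, the following minimal sketch
builds an ffmpeg command line with the parameters quoted above and
captures the raw output. The function names and the use of "-" (stdout)
as output target are assumptions made for this example; the actual
invocation inside openfp_extract may differ.

```python
import subprocess


def build_ffmpeg_command(input_path):
    """Build an ffmpeg argument list that writes raw PCM to stdout."""
    return [
        "ffmpeg",
        "-i", input_path,         # any input ffmpeg can decode
        "-f", "u16le",            # output format flag quoted in the text
        "-acodec", "pcm_s16le",   # codec flag quoted in the text
        "-",                      # write to stdout (assumption of this sketch)
    ]


def decode_to_pcm(input_path):
    """Run ffmpeg and return the raw PCM bytes it produces."""
    result = subprocess.run(
        build_ffmpeg_command(input_path),
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        check=True,               # raise if ffmpeg exits with an error
    )
    return result.stdout
```

Calling decode_to_pcm("track.mp3") requires ffmpeg to be installed and
on the PATH.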

Fingerprint Extraction

The following graph is a block diagram describing the fingerprint extraction.

Reduce noise using any type of low pass. In our tests, an IIR low pass
without output reduction gave the best results. The current implementation
uses a 12 Hz IIR low pass (5th order Butterworth).

Quantize values to binary flags (energy band active/inactive).

Reduce output data (e.g. drop 3 out of 4 results).
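
The three steps listed above can be sketched for the energy series of a
single band as follows. This is a simplified illustration, not the
OpenFP implementation: it swaps in a one-pole IIR low pass instead of
the 5th order Butterworth filter, and the quantization rule (flag set
when the smoothed energy rises) is an assumption. All function names
are hypothetical.

```python
def lowpass(values, alpha=0.2):
    """One-pole IIR low pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    out = []
    y = values[0] if values else 0.0
    for x in values:
        y += alpha * (x - y)
        out.append(y)
    return out


def quantize(values):
    """Return 1 where the value rose compared to the previous frame, else 0."""
    if not values:
        return []
    flags = [0]  # the first frame has no predecessor to compare against
    for prev, cur in zip(values, values[1:]):
        flags.append(1 if cur > prev else 0)
    return flags


def decimate(values, keep_every=4):
    """Keep one result out of keep_every (drop 3 out of 4 by default)."""
    return values[::keep_every]


band_energy = [0, 0, 0, 0, 10, 10, 10, 10, 0, 0, 0, 0]
flags = decimate(quantize(lowpass(band_energy)))
```

For this input, the flags are set only while the smoothed energy is
still rising after the jump from 0 to 10.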

Fingerprints

The result of the fingerprint extraction is a set of 32-bit
subfingerprints stored in a single output file that represents the
original audio file. Each bit of a 32-bit value describes the spectral
power in a specific Bark band. A single bit cannot encode an actual
power value; it only encodes a difference, i.e. an edge of a band
energy change (as produced by the low pass filtering).
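
To make the layout concrete, the following sketch packs 32 per-band
flags into one 32-bit subfingerprint. The bit order (band 0 in the
least significant bit) and the function name are assumptions made for
this example; the text above does not specify them.

```python
def pack_subfingerprint(flags):
    """Pack 32 binary band flags into one unsigned 32-bit integer."""
    if len(flags) != 32:
        raise ValueError("expected exactly 32 band flags")
    value = 0
    for band, flag in enumerate(flags):
        if flag:
            value |= 1 << band  # band 0 maps to the least significant bit
    return value


# Only the lowest and the highest band show an energy edge.
example = pack_subfingerprint([1] + [0] * 30 + [1])
```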

To speed up matching, we calculate MFCC feature vectors and use
them to cluster the subfingerprints of all fingerprints. A feature
vector is stored in the fingerprint file every n subfingerprints.
Clustering happens during startup of the server process.

Fingerprint Matching

Matching is performed by openfp_server, which answers requests
sent by openfp_match. Matching is a two-step process:

Find suitable match position

Evaluate fingerprints following match position

Suitable match positions are found by comparing one or more
randomly chosen fingerprints from the audio sample under test
with all reference fingerprints. A position is suitable if the
Hamming distance between the two compared fingerprints falls below
a certain threshold. For each suitable position, the average Hamming
distance over the following fingerprints (corresponding to a
few seconds of audio) is then calculated. Only if this average is
also below the matching threshold is the fingerprint considered a
real match.
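
The two steps described above can be sketched as follows. This is a
minimal illustration under stated assumptions, not the openfp_server
code: the threshold and window values are arbitrary, the probe is
selected by index rather than at random, and all names are
hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two 32-bit subfingerprints."""
    return bin(a ^ b).count("1")


def find_matches(sample, reference, probe_index=0, threshold=4, window=2):
    """Return the reference positions at which the sample matches."""
    probe = sample[probe_index]
    following = sample[probe_index + 1 : probe_index + 1 + window]
    matches = []
    for pos, ref in enumerate(reference):
        # Step 1: cheap check of a single fingerprint pair.
        if hamming(probe, ref) >= threshold:
            continue
        # Step 2: average distance over the fingerprints that follow.
        candidates = reference[pos + 1 : pos + 1 + len(following)]
        if not following or len(candidates) < len(following):
            continue  # not enough data to confirm the position
        total = sum(hamming(a, b) for a, b in zip(following, candidates))
        if total / len(following) < threshold:
            matches.append(pos)
    return matches


reference = [0xFFFF0000, 0x12345678, 0x0F0F0F0F,
             0x00FF00FF, 0xDEADBEEF, 0x55555555]
noisy_sample = [0x0F0F0F0E, 0x00FF00FF, 0xDEADBEEB]   # single-bit errors
unrelated = [0x0F0F0F0F, 0xFFFFFFFF, 0x00000000]      # only the probe fits
```

With these values, noisy_sample is accepted at reference position 2,
while unrelated passes the first step at the same position but is
rejected by the second.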

The two-step matching is motivated by performance considerations.
The openfp_server implementation also uses fingerprint clustering
to further eliminate unnecessary comparisons.
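
How clustering can remove comparisons is sketched below. The example
assumes that cluster centroids are already known (for instance from a
k-means run at startup) and assigns each feature vector to its nearest
centroid by Euclidean distance. The distance measure, the way
centroids are obtained and all names are assumptions of this sketch;
the actual clustering used by openfp_server may differ.

```python
import math


def nearest_cluster(vector, centroids):
    """Index of the centroid closest to the given feature vector."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(vector, centroids[i]))


def build_index(reference_vectors, centroids):
    """Map each cluster index to the reference positions assigned to it."""
    index = {}
    for position, vector in enumerate(reference_vectors):
        cluster = nearest_cluster(vector, centroids)
        index.setdefault(cluster, []).append(position)
    return index


def candidate_positions(query_vector, index, centroids):
    """Reference positions that share a cluster with the query."""
    return index.get(nearest_cluster(query_vector, centroids), [])


centroids = [(0.0, 0.0), (10.0, 10.0)]
reference_vectors = [(1.0, 0.0), (9.0, 9.0), (0.0, 2.0), (11.0, 10.0)]
index = build_index(reference_vectors, centroids)
candidates = candidate_positions((8.0, 9.0), index, centroids)
```

Only the positions in candidates would then go through the Hamming
distance comparison, instead of every reference position.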