Coding of Signal Distances

By PetrosB, on March 20th, 2013

The Data Compression Conference (DCC) 2013 is happening right now, and tomorrow I’ll be presenting a new paper that I co-authored with Shantanu Rane, titled “Efficient Coding of Signal Distances Using Universal Quantized Embeddings” [1]. This is a continuation of earlier work on coding of distances [2].

I am quite excited about this work for two reasons:

1. It is the first practical application of my earlier work on universal quantization [3], which I think provides a very promising coding framework for a number of applications.

2. The results show a remarkable improvement over previous results. We have managed to demonstrate a bitrate reduction of more than 33% over our earlier work in [2] and more than 50% over existing methods.

The paper provides a framework to analyze the performance of embeddings when used to evaluate distances in the embedding domain. As it turns out, the embeddings derived from the universal quantization methods in [3] allow the system designer to decide the range of distances to be preserved by the embedding and to use fewer bits if that range is smaller. In many inference applications—such as ones using nearest-neighbor search—what matters is that small distances are preserved, not necessarily larger ones. Thus, appropriately designing the embedding results in significant savings in bits. These savings are a great example of information scalability, where the coding complexity scales with the amount of information one needs to extract.
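To give a flavor of the idea, here is a minimal sketch of a universal quantized embedding along the lines of [3]: a signal is randomly projected, dithered, quantized, and only the least significant bit of each quantizer output is kept. This is my own illustrative NumPy sketch, not the paper's implementation; the dimensions, the step size `delta`, and the distance values are arbitrary choices. The key behavior it illustrates is that the Hamming distance between embeddings tracks small signal distances but saturates for large ones, with `delta` controlling the preserved range.

```python
import numpy as np

rng = np.random.default_rng(0)

def universal_embedding(x, A, w, delta):
    # Randomly project, add dither, quantize with step delta, and keep
    # only the least significant bit (a non-monotonic quantizer).
    return (np.floor((A @ x + w) / delta).astype(int)) % 2

n, m = 128, 512        # signal dimension, number of embedding bits (arbitrary)
delta = 1.0            # quantization step: sets the range of preserved distances
A = rng.standard_normal((m, n))
w = rng.uniform(0, delta, size=m)

x = rng.standard_normal(n)
for d in (0.1, 0.5, 2.0):
    p = rng.standard_normal(n)
    y = x + d * p / np.linalg.norm(p)   # a point at signal distance d from x
    ham = np.mean(universal_embedding(x, A, w, delta)
                  != universal_embedding(y, A, w, delta))
    print(f"signal distance {d:4.1f} -> embedding Hamming distance {ham:.3f}")
```

Running this, the Hamming distance grows roughly proportionally with the signal distance while the distance is small relative to `delta`, then flattens out near 1/2 for large distances: exactly the distance-range trade-off described above, where bits are spent only on the distances one cares about.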

The figure below demonstrates the performance of our method in an image retrieval problem. It shows the probability of correct retrieval using a query image as a function of the bitrate used to code each descriptor extracted from the query image. You can see the paper for more details on the retrieval scheme, but that scheme is not really the point of the paper; our focus is on the embedding-based coding method and the analysis. The black lines demonstrate the performance of universal embeddings, while the colored lines show the performance of earlier work (including ours).