
Introduction

This paper presents the interpolation algorithm based on the Lanczos3 filter that is used in Intel® Integrated Performance Primitives (Intel® IPP). On the Intel AVX architecture, this algorithm gives a 1.5x performance gain compared with the Intel SSE implementation.

Intel IPP implements the most popular interpolation algorithms, from the simplest (nearest neighbor, bilinear) to the more sophisticated: supersampling (which gives the best image quality when reducing image size, without any artifacts), various cubic filters, and the so-called Lanczos3 filter [1]. The Lanczos3 filter often preserves the sharpness of lines while keeping tonal transitions sufficiently smooth, and it does so much better than bicubic algorithms.

This algorithm uses 36 pixels of the source image (a 6x6 neighborhood) to calculate the intensity of each pixel in the destination image. The filter operation is rather expensive: each output pixel requires 42 multiplications and 35 additions.
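The text gives the operation counts without deriving them. One accounting that reproduces them, assuming the filter is applied separably (six horizontal weights, then six vertical weights), is sketched below. The function name, the OpCount type, and the row-major layout of the neighborhood are hypothetical and are not taken from Intel IPP.

```c
/* Hypothetical operation counters, used only to illustrate the arithmetic. */
typedef struct { int mul; int add; } OpCount;

/* Interpolate one destination pixel from a 6x6 source neighborhood.
   nb: 36 values in row-major order (assumed layout);
   wx, wy: six horizontal and six vertical filter weights. */
static float lanczos3_pixel(const float *nb, const float *wx,
                            const float *wy, OpCount *ops)
{
    float result = 0.0f;
    for (int r = 0; r < 6; ++r) {
        /* Horizontal pass for row r: 6 multiplications, 5 additions. */
        float row = nb[6 * r] * wx[0];
        ops->mul += 1;
        for (int c = 1; c < 6; ++c) {
            row += nb[6 * r + c] * wx[c];
            ops->mul += 1;
            ops->add += 1;
        }
        /* Vertical pass: 1 multiplication per row, 5 additions in total. */
        if (r == 0) {
            result = row * wy[0];
        } else {
            result += row * wy[r];
            ops->add += 1;
        }
        ops->mul += 1;
    }
    return result;
}
```

Under this assumption the totals are 6 x 6 + 6 = 42 multiplications and 6 x 5 + 5 = 35 additions, which matches the figures quoted above.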

This algorithm is used in the function ippiResizeSqrPixel when the interpolation parameter is set to IPPI_INTER_LANCZOS [2].

This function resizes the source image ROI by xFactor in the x direction and yFactor in the y direction. The image size can be either reduced or increased in each direction, depending on the values of xFactor and yFactor. The result is resampled using the interpolation method specified by the interpolation parameter and written to the destination image ROI. Pixel coordinates x' and y' in the resized image are obtained from the following equations [3]:

Equation 2. 2D Resize Transform (Forward)

x' = xFactor*x + xShift
y' = yFactor*y + yShift

Where x and y denote the pixel coordinates in the source image.
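As a small illustration of Equation 2, the forward transform can be written as a plain C helper. The Point2D type and the function name are hypothetical and exist only for this sketch; they are not part of the Intel IPP API.

```c
/* Hypothetical helper type for a pair of pixel coordinates. */
typedef struct { double x; double y; } Point2D;

/* Forward resize transform from Equation 2:
   x' = xFactor*x + xShift, y' = yFactor*y + yShift. */
static Point2D resize_forward(Point2D src, double xFactor, double yFactor,
                              double xShift, double yShift)
{
    Point2D dst;
    dst.x = xFactor * src.x + xShift;
    dst.y = yFactor * src.y + yShift;
    return dst;
}
```

For example, with xFactor = 2, yFactor = 0.5, xShift = 1, and yShift = 0, the source pixel (10, 4) maps to (21, 2).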

The function requires an external buffer, pBuffer, whose size can be computed beforehand by calling the function ippiResizeGetBufSize.

Flowchart of Algorithm

Figure 2. Pipeline

Filtering

Calculation of the Arrays of the Indexes

The first stage finds the indexes of the source pixels needed for the interpolation. It clips srcROI and calculates its new coordinates after the transform with the specified factors and shifts [Equation 2].

These coordinates are then mapped back to the source image. These transforms are detailed here.

The augment value dcol is the distance along the X-axis between the integer coordinate (index) of a pixel in the source image and its floating-point coordinate obtained from the inverse transform. Likewise, the augment value drow is the corresponding distance along the Y-axis [Equation 3].
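Equation 3 is not reproduced in this text, so the following is only a sketch under an assumption: the inverse transform is taken to be the algebraic inverse of Equation 2, and the index is taken to be the floor of the mapped coordinate. The SrcCoord type and the function name are hypothetical.

```c
/* Hypothetical result type: integer source index plus fractional offset. */
typedef struct { int index; double frac; } SrcCoord;

/* Map one destination coordinate back to the source axis, assuming the
   inverse of Equation 2: x = (x' - shift) / factor.
   index is floor(x); frac plays the role of dcol (or drow). */
static SrcCoord map_to_source(int dstCoord, double factor, double shift)
{
    double src = ((double)dstCoord - shift) / factor;
    SrcCoord c;
    c.index = (int)src;                      /* truncates toward zero */
    if ((double)c.index > src) c.index -= 1; /* adjust to floor for negatives */
    c.frac = src - (double)c.index;          /* always in [0, 1) */
    return c;
}
```

Calling the helper once per destination column and once per destination row would fill the two index arrays and the dcol and drow arrays. With factor = 2 and shift = 0, destination coordinate 5 maps to source coordinate 2.5, that is, index 2 with an offset of 0.5.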

Preparation of the Lanczos3 Filter

Before the interpolation, the filter is applied [Equation 1].

Figure 3. Three-lobed Lanczos Window
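Equation 1 is not reproduced in this text. For reference, the standard definition of the three-lobed Lanczos kernel, which is presumably what Equation 1 denotes, is:

```latex
L(x) =
\begin{cases}
  \operatorname{sinc}(x)\,\operatorname{sinc}\!\left(\dfrac{x}{3}\right), & |x| < 3,\\[4pt]
  0, & \text{otherwise},
\end{cases}
\qquad
\operatorname{sinc}(x) =
\begin{cases}
  1, & x = 0,\\[2pt]
  \dfrac{\sin(\pi x)}{\pi x}, & x \neq 0.
\end{cases}
```

The kernel equals 1 at x = 0 and 0 at every other integer, so a sample that falls exactly on a source pixel reproduces that pixel.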

The filter is implemented as a table tblLanczos3 (see The Filter Kernel).

The function ownLanczos3, which uses the values of this table, is called twice:

For columns with parameters ownLanczos3(dcol, width, colY);

For rows with parameters ownLanczos3(drow, height, rowY);

Here dcol and drow are the augment values (see Calculation of the Arrays of the Indexes), width and height are the dimensions of the processed image, and colY and rowY are the floating-point values produced by the Lanczos filter for the horizontal and vertical interpolations.

This function is implemented in assembler, but only its C analogue is presented here (see The Lanczos3 Filter Implementation).

Processing of the Possible Borders

Note that, in the general case, the resizing operation may require border processing and replication of the missing border pixels. This is performed by special functions to which vectorization and optimization cannot be applied. Because the number of such pixels is very small, these functions do not affect performance and are not considered here.
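The border functions themselves are not shown in this paper. A common way to replicate missing border pixels, given here only as an assumed illustration, is to clamp each source index to the valid range before reading. Both function names are hypothetical.

```c
/* Clamp a source index to [0, size - 1] so that reads outside the image
   return the nearest edge pixel (border replication). */
static int clamp_index(int i, int size)
{
    if (i < 0) return 0;
    if (i >= size) return size - 1;
    return i;
}

/* Read one pixel of a row with replicated borders. */
static float fetch_replicated(const float *row, int width, int x)
{
    return row[clamp_index(x, width)];
}
```

The per-pixel branches in such a helper are one reason this kind of code is usually kept out of the vectorized inner loop and applied only near the image edges.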

General Processing

The Lanczos interpolation proceeds by means of the ownResize32plLz function (see The General Processing Function), where the parameters are the following:

The interpolation by rows is the most complicated and time-consuming phase of this processing. It is implemented with intrinsics in the function ownRowLanczos32pl (see AVX Implementation for Interpolation by X-Direction), where the parameters are the following:
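To make the row phase concrete, here is a scalar sketch of a six-tap horizontal pass over one row. It is not the ownRowLanczos32pl code and does not use AVX intrinsics. The function name, the parameter list, the tap positions (index-2 through index+3), the weight layout, and the index clamping at the edges are all assumptions made for illustration.

```c
/* Scalar sketch of the interpolation of one row in the X direction.
   src, srcWidth: one source row and its length;
   dst, dstWidth: output row and its length;
   idx[x]: integer source index for destination column x;
   w[6*x .. 6*x+5]: six filter weights for destination column x. */
static void row_lanczos_scalar(const float *src, int srcWidth,
                               float *dst, int dstWidth,
                               const int *idx, const float *w)
{
    for (int x = 0; x < dstWidth; ++x) {
        float acc = 0.0f;
        for (int k = 0; k < 6; ++k) {
            int s = idx[x] - 2 + k;            /* tap position in source */
            if (s < 0) s = 0;                  /* replicate left border */
            if (s >= srcWidth) s = srcWidth - 1; /* replicate right border */
            acc += w[6 * x + k] * src[s];
        }
        dst[x] = acc;
    }
}
```

A 256-bit AVX register holds eight single-precision values, so a vectorized version can accumulate several outputs per instruction. The gathering of six neighbors around a different index for each output is what makes this phase harder to vectorize than the pass over columns.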

About the Author

Yuri Tikhomirov is a Senior Software Engineer with the Software Solutions Group (Visual Computing Software Division, CIP, Intel IPP). He is focused on CPU-specific code development for Intel IPP libraries for all existing and future Intel Architectures. Yuri has been with Intel for five years. His email is yuri.tikhomirov@intel.com.
