Abstract: In this project, an investigation of Linear Pixel Shuffling for subpattern searching was performed using software created for this purpose. Linear Pixel Shuffling is a method to get a random-like permutation of pixels. Because the algorithm is generic, it has been used for displaying images, graphics rendering, image compression, image morphology, and digital halftones. The algorithm was compared against Space Domain Correlation, Frequency Domain Correlation, and Random Ordering for its quickness in matching a pattern. On average, Linear Pixel Shuffling found a partial match in a third of the time compared to Space Domain, Frequency Domain, and Random Ordering.

Traditional textbook subpattern searching algorithms are based on Space Domain Correlation and Frequency Domain Correlation. The two are mathematically equivalent; the only difference is the domain in which they operate. Other techniques, such as Random Ordering (picking the pixel locations at random), can be used in place of Space Domain and Frequency Domain. In this paper, Linear Pixel Shuffling (LPS) is investigated as an additional method for subpattern searching.

In Space Domain Correlation, the cross-correlation with the pattern is found at every pixel location in the image by summing the pixel-by-pixel products at that location. This is done starting at location [0,0] and moving one pixel at a time across each scanline (i.e. [0,0], then [0,1]...[0,N], then [1,0]...[1,N] and so on until [N,N]).
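
The raster scan described above can be sketched in Python as follows. This is a minimal illustration, not Sherlock's C code: the function name `space_domain_correlation`, the list-of-rows image layout, anchoring the pattern at its top-left pixel, and scoring only locations where the pattern fits entirely inside the image are all assumptions made for the example.

```python
def space_domain_correlation(image, pattern):
    """Return (best_row, best_col, best_score) for a raster-order scan.

    image and pattern are lists of rows of numeric pixel values.
    The score at each location is the sum of pixel-by-pixel products.
    """
    img_h, img_w = len(image), len(image[0])
    pat_h, pat_w = len(pattern), len(pattern[0])
    best = (0, 0, float("-inf"))
    # Visit locations one scanline at a time: [0,0], [0,1], ..., then [1,0], ...
    for row in range(img_h - pat_h + 1):
        for col in range(img_w - pat_w + 1):
            score = 0
            for dr in range(pat_h):
                for dc in range(pat_w):
                    score += image[row + dr][col + dc] * pattern[dr][dc]
            if score > best[2]:
                best = (row, col, score)
    return best


# Tiny example: a bright 2x2 block on a dark background.
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 9, 9, 0],
    [0, 0, 9, 9, 0],
]
pattern = [[9, 9], [9, 9]]
result = space_domain_correlation(image, pattern)
```

The example uses a bright object on a dark background because a raw sum of products only peaks at the object when the background contributes little to the sum.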

In Frequency Domain Correlation, the correlation is computed in the frequency domain instead of the space domain. This is done by taking the Fourier Transform of the image and of the pattern, multiplying the transforms together, and then taking the inverse Fourier Transform. There will be a peak in the resulting field corresponding to the location of the pattern in the image. In theory, this gives the same results as Space Domain Correlation.
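
The equivalence of the two domains can be illustrated on a one-dimensional signal. The sketch below is not the FFT routine used by Sherlock; it uses a naive discrete Fourier transform so that it stays self-contained, and all function names are hypothetical. Note that it multiplies one transform by the complex conjugate of the other, which is what the correlation theorem calls for when computing a correlation rather than a convolution.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = 0j
        for t, v in enumerate(values):
            total += v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(total / n if inverse else total)
    return out


def correlate_direct(signal, pattern):
    """Circular cross-correlation computed in the space domain."""
    n = len(signal)
    return [sum(signal[(t + shift) % n] * pattern[t] for t in range(n))
            for shift in range(n)]


def correlate_frequency(signal, pattern):
    """Same correlation via transforms: IDFT(DFT(signal) * conj(DFT(pattern)))."""
    spec_signal = dft(signal)
    spec_pattern = dft(pattern)
    product = [a * b.conjugate() for a, b in zip(spec_signal, spec_pattern)]
    return [c.real for c in dft(product, inverse=True)]


signal = [0, 0, 0, 5, 7, 5, 0, 0]
pattern = [5, 7, 5, 0, 0, 0, 0, 0]   # zero-padded to the signal length
direct = correlate_direct(signal, pattern)
via_dft = correlate_frequency(signal, pattern)
peak = max(range(len(direct)), key=lambda i: direct[i])
```

Both routes give the same values up to floating-point rounding, and the peak falls at the index where the pattern begins in the signal.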

LPS is a method to get a random-like permutation of pixels. Unlike other techniques, LPS creates a pixel ordering that is spread evenly across the image. It wraps around the image by use of modular arithmetic. A simple arithmetic progression is needed to determine the next pixel location. Refer to "Linear Pixel Shuffling for Image Processing, an Introduction"(1) and "Linear Pixel Shuffling Made Easy" by Peter Anderson (pga(at sign)cs.rit.edu) for more information.

Experiments were designed to test the speed and efficiency of LPS compared to three other algorithms: Space Domain, Frequency Domain, and Random Ordering.

The reason for comparing LPS against Space Domain is that Space Domain is the traditional textbook approach to pattern searching. It was hypothesized that, because LPS jumps around, it could find a pattern faster than Space Domain. It would take about the same amount of time for both algorithms to cover every pixel in an image; with LPS, however, the pattern's neighborhood could be found before every pixel location had been searched. Even if LPS doesn't find the exact center of the pattern before Space Domain, LPS would give a faster indication of where the pattern could be found within the image.

It's well known that Frequency Domain can be much faster than Space Domain; the speed-up factor depends upon the ratio between image and subpattern size. For large ratios (e.g. a 3x3 correlation for a 500x500 image), it is faster to work in the space domain. When the subpattern size is a large fraction of the whole image size, it is faster to work in the frequency domain. This difference in speed is based upon the number of operations (adding, multiplying, etc.) that need to be done. In this experiment, the Frequency Domain algorithm was compared against LPS for cases where Frequency Domain was faster than Space Domain.

Random Ordering has a good chance of finding the pattern sooner than Space Domain or Frequency Domain. The ordering guarantees that every pixel location will be investigated.
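
One way to obtain a random ordering that still visits every pixel exactly once is to shuffle the full list of locations. The sketch below is an assumption about how such an ordering could be built, not a description of Sherlock's implementation; the function name `random_ordering` is hypothetical.

```python
import random


def random_ordering(width, height, seed):
    """Return every (x, y) pixel location exactly once, in shuffled order."""
    locations = [(x, y) for y in range(height) for x in range(width)]
    rng = random.Random(seed)   # seeded so that a run can be repeated
    rng.shuffle(locations)
    return locations


order = random_ordering(500, 500, seed=471)
```

Fixing the seed makes a run reproducible, which is useful when comparing timings across algorithms.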

The hypothesis of the present paper was that LPS could give faster approximate results. Instead of being sequential, LPS wraps around the image by use of modular arithmetic. LPS tries to space the jumps evenly. All that is needed to determine the next pixel in sequence is a simple arithmetic progression. Even though the exact center of a pattern within an image might be unknown, LPS could indicate where that pattern could be found faster than other subpattern searching algorithms would. This was found to be, indeed, the case.

The objective of this project is to compare Linear Pixel Shuffling as a Subpattern Searching tool, against more traditional algorithms that are based on Space and Frequency Domain Correlation. The study will demonstrate the speed and efficiency of LPS as a potentially enhanced method for Subpattern Searching.

There were three experiments. The first experiment kept the center of the pattern constant while changing the size. The second experiment kept the pattern size the same and moved the pattern around the image. The third experiment was a real-world example.

Experiment 1 had five cases. The pattern sizes were: 3x3, 5x5, 11x11, 41x41, and 81x81 (all values are in pixel width and height). The image size was 500x500 pixels and consisted of a black rectangle (the pattern) on a white background. This first experiment was to see if LPS had any potential at all.

Experiment 2 had 20 cases. The pattern was chosen to be an 11x11 pixel square. The pattern size was chosen to be this large to give Random Ordering a chance. The center positions of the pattern can be seen in Figure 1 on page 4. The image size was 500x500 pixels and consisted of a black rectangle on a white background. This experiment was to see whether LPS is shift invariant or was just lucky in Experiment 1, jumping early in its execution to the position in the image where the pattern happened to be.

Experiment 3 had one case (see Figure 2 on page 5). Unlike the first two experiments (in which the input data was computer generated with no noise), this experiment features a scanned page of text. The pattern to find is a circle on the text document. The circle was created by a rubber stamp. The image size was 612x792 pixels.

For each experiment, each case ran using all four algorithms (Space Domain, Frequency Domain, LPS, and Random Ordering). The results were then plotted to compare distances (the Euclidean distance between where the object really was and where the procedure thought it was) and correlation values.

The distance values were plotted to see which algorithm found the pattern (or at least the area where the pattern was) the fastest.

Correlation values were plotted to see the semi-random progression of LPS compared to the other algorithms. In this type of graph, Space Domain and Frequency Domain would plot delta functions.

Software was created to compare the various algorithms. Sherlock is an Open Look™ application that can read in an image and pattern and run the four algorithms. The image and pattern files are in an internal format which consists of a simple header followed by bytes of data (1 byte/pixel). Currently, the image and pattern must be grayscale (0 = black, 255 = white).

Sherlock was written in C using the XView toolkit on a Sun SPARCStation 4/40. Within Sherlock, two external routines were imported: a Fast Fourier Transform routine originally written in Fortran by Norman Brenner in 1968, which was translated to C by Keith Knox of Xerox Webster Research Center in September 1990; and timing macros by George V. Neville-Neil, Universiteit Twente (gnn@cs.utwent.nl). The software can run either in GUI mode or in batch mode. Batch mode can process commands from a file or interactively from the keyboard. Two result files are created for each algorithm run: the correlation results and the distance results. An example plot of correlation values for LPS can be seen in Figure 3 on page 7. An example distance plot for all four algorithms can be seen in Figure 4 on page 8.

The correlation values are the amount of correlation between image and pattern at each pixel location within the image. These are not true correlation values in the sense of the Space Domain equation. In order to speed up processing, it was faster to find the difference (rather than the correlation) between image and pattern and then invert so that the maximum value showed the smallest difference. So, in effect, these values are really differences rather than correlations. The search index is the pixel ordering. A search index of 100 is the 100th pixel coordinate correlated. For Space Domain, this graph would show a delta function.
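
The difference-based score described above might look like the following sketch. The source only states that a difference is computed and then inverted; the use of the sum of absolute differences, the inversion by subtracting from the worst possible total (255 per pattern pixel), and the name `difference_score` are assumptions made for illustration.

```python
def difference_score(image, pattern, row, col):
    """Inverted sum of absolute differences with the pattern's top-left at (row, col).

    Higher is better; a perfect match scores 255 * (number of pattern pixels).
    """
    total_difference = 0
    for dr, pattern_row in enumerate(pattern):
        for dc, pattern_value in enumerate(pattern_row):
            total_difference += abs(image[row + dr][col + dc] - pattern_value)
    worst_case = 255 * len(pattern) * len(pattern[0])
    return worst_case - total_difference


WHITE, BLACK = 255, 0
image = [[WHITE] * 6 for _ in range(6)]
for r in range(2, 4):
    for c in range(3, 5):
        image[r][c] = BLACK          # 2x2 black square at rows 2-3, cols 3-4
pattern = [[BLACK, BLACK], [BLACK, BLACK]]

exact = difference_score(image, pattern, 2, 3)    # pattern fully overlaps the square
partial = difference_score(image, pattern, 2, 2)  # half of the pattern overlaps
miss = difference_score(image, pattern, 0, 0)     # no overlap at all
```

A partial overlap already scores higher than the background, which is what allows an ordering that jumps around the image to report the pattern's neighborhood before it reaches the exact center.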

The other results file contains the Euclidean distance between where Sherlock thinks the pattern is and where it really is. The distance between the two points is given by:

$$d = \sqrt{(x_1 - x_0)^2 + (y_1 - y_0)^2} \qquad \text{(EQ 1)}$$

where (x0,y0) are the true coordinates of the pattern and (x1,y1) are the estimated coordinates.

When an algorithm starts, the estimated position of the pattern is set to (0,0). This means that if the pattern is near location (0,0), the distance will start off small as opposed to a pattern that is further away.
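
The effect of the (0,0) starting estimate on the first distance value can be seen in a short sketch. The function name `distance` and the two sample pattern positions are illustrative only.

```python
import math


def distance(true_xy, estimated_xy):
    """Euclidean distance between the true and estimated pattern positions."""
    x0, y0 = true_xy
    x1, y1 = estimated_xy
    return math.hypot(x1 - x0, y1 - y0)


# Every algorithm starts with the estimate at (0, 0).
start_far = distance((400, 400), (0, 0))   # pattern far from the origin
start_near = distance((3, 4), (0, 0))      # pattern close to the origin
```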

The results files are formatted to be read in with ACE/gr, a free 2D graphing program for exploratory data analysis (xvgr - Open Look™ or xmgr - Motif version) written by Paul Turner (of the Oregon Graduate Institute, email pturner@ese.ogi.edu). This enables the results to be graphed quickly.

After an algorithm has run, information about it is presented. In batch mode, it is written to the screen. In UI mode, a window pops up with the information (see Figure 5 on page 9). The information given is the same in either case. This includes:

is the name of the image file to use. This option is required. This file must be in Sherlock internal format (see the header_util utility).

-b { batchfile | - }

specifies that Sherlock should run in batch mode. No graphical UI is shown in this mode. A file of commands (batchfile) can be given to be interpreted. If a "-" is given for a filename, batch mode commands are read in from stdin. See Section 4.1.3, "Batch Mode," on page 13. A graphical UI will be presented if this option isn't given.

-ii { y | n }

Determine how to invert the image for the Frequency Domain Correlation algorithm. See Section 4.3, "Frequency Domain," on page 15. A "y" will always invert and a "n" will never invert. By default, Sherlock will automatically determine if inversion is needed.

-ip { y | n }

Determine how to invert the pattern for the Frequency Domain Correlation algorithm. See Section 4.3, "Frequency Domain," on page 15. A "y" will always invert and a "n" will never invert. By default, Sherlock will automatically determine if inversion is needed.

-l 'p1 p2 xinc yinc'

are the parameters to use for the Linear Pixel Shuffling algorithm. "p1" is the first maximum number. "p2" is the second maximum number. "xinc" is the X increment. "yinc" is the Y increment. All numbers are integers. See Section 4.5, "Linear Pixel Shuffling," on page 16 for more details on what these numbers should be and why they are needed. If not given, they are calculated using a modified Fibonacci sequence.

-p filename

is the name of the pattern file to use. This file must be in Sherlock internal format (see the header_util utility).

-rc filename

is the name to save the correlation data to. This file is written out to a format to be read in by ACE/gr. Default is "corr_results.data".

-rd filename

is the name to save the distance data to. This file is written out to a format to be read in by ACE/gr. Default is "dist_results.data".

-rs number

is the random seed to use in the Random Ordering algorithm. Default is to use the number of seconds since 00:00:00 GMT, Jan. 1, 1970 as returned by the time(3) function.

-s subtitle

is the subtitle for the output graphs (corr_results.data and dist_results.data). Default is to have no subtitle on the graphs.

-t title

is the title for the output graphs (corr_results.data and dist_results.data). Default is to have no title on the graphs.

By default, Sherlock will start in the interactive graphical UI mode. The image will be displayed with a command panel (of button menus) above it (see Figure 6 on page 11). If the pattern is specified on the command line, it is displayed in its own window. The left footer of the image and pattern windows displays Sherlock's current state (e.g. Ready, Loading..., etc.).

The button menus on the command panel are labeled File, Pattern, and Run.

The File button menu contains I/O options and "Exit". Within this menu, the following choices are available:

Load Image... - load a new image into memory

Load Pattern... - load a (new) pattern into memory

Save Pattern... - save a pattern to disk

Exit - exit Sherlock

FIGURE 7. Sherlock file menu

Selecting the load/save options will bring up a window asking for a directory and a filename. After this information is entered, the user can either click the "Apply" button to apply the input or "Cancel" button to discard the changes. Values entered for directory and filename will remain with the individual windows (i.e. a filename given to the Load Image window will appear the next time this window is displayed and won't appear for the other I/O windows).

The "Save Pattern..." choice is grayed out until there is a pattern in memory to save.

Files to be loaded must be in Sherlock internal format. Files are saved in this internal format.

The Pattern button menu contains choices for dealing with patterns. The "Select" choice will crop the pattern from the image. The mouse can be used to create a crop box (see Mouse Operations paragraph below). The "Display" choice will display a pattern that's in memory. Both choices are grayed out until needed (for "Select" that's creating a crop box, for "Display", that's having a pattern in memory to display).

FIGURE 8. Sherlock pattern menu

The Run button contains the algorithms to run. The four choices are: "Space Domain Correlation", "Frequency Domain Correlation", "Linear Pixel Shuffling", and "Random Ordering". This button is grayed out until there is a pattern in memory so that the algorithms can run.

FIGURE 9. Sherlock run menu

A gauge is displayed when an algorithm is being executed. This gauge displays how much of the algorithm run time has been spent and how much more there is to go.

FIGURE 10. Sherlock status gauge

Mouse Operations are as follows. The crop box used to select a pattern from the image is drawn by holding down the left button. The box can be stretched by moving the mouse with the left button held down. As soon as the left button is released, the crop box can no longer be adjusted. If the left button isn't being held down, the current mouse position is displayed in the right footer of the image window. This allows the user to see how well the algorithm worked to find the center location of the pattern.

The colormap is set to a grayscale colormap (and so the images are limited to 8 bit/pixel grayscale). Closest colors are used if a certain grayscale is unavailable (meaning that a white background can look slightly grayish). The data used in the calculations or saved to disk are the actual values, not the colormapped values.

Sherlock can also run non-interactively in batch mode. In this mode, no images are displayed. This mode can be used in a pure ASCII environment.

A file of commands can be given to be interpreted. If a "-" (dash) is given as the filename for the "-b" option, commands are read from stdin.

The following commands are available:

crop x y width height

Crop out an area from the image to become the pattern. X and Y are the (X,Y) starting coordinates of the area to crop. Width is the width (in pixels). Height is the height (in pixels). For example, the command:

crop 10 20 100 150

will crop an area from the image starting at pixel position (10,20) that is 100 pixels wide by 150 pixels high. This cropped area becomes the pattern.

exit

Exit the program.

invert type {yes|no|auto}

Determine how to invert the image or pattern. This is the same as the "-ii" and "-ip" switches above (with the addition of "auto"). The type parameter determines which data is affected. It can be "image" or "pattern". Setting to "yes" will always invert. Setting to "no" will never invert. Setting to "auto" will let Sherlock try to automatically determine if inversion is needed. "Auto" is the default setting. An example:

invert pattern no

will cause Sherlock to not invert the pattern. This is useful if the pattern's (or image's) first pixel isn't the background color. See Section 4.3, "Frequency Domain," on page 15 for an explanation on why inversion is sometimes necessary.

load type filename

Load a file into memory. The type parameter determines what is being read in (image or pattern). Type can be "image" or "pattern". The filename is the name of the file to read in. For example, the command:

Save a file to disk. The type parameter determines what is being written out (image or pattern). Type can be "image" or "pattern". This version currently only supports saving patterns, so the type is effectively ignored at this point (i.e. the pattern will be saved regardless of what type is set to). The filename is the name of the file to write to. For example, the command:

save pattern abc.raw

will save the pattern from memory to "abc.raw".

seed = { number | ? }

Set the random seed to the number indicated. If the '?' is given instead of a number, Sherlock will get the random seed for the Random Ordering algorithm from the time(3) function. For example, the command:

seed = 471

will set the seed for the random number generator to 471.

sleep seconds

Sleep for the time indicated. The time argument is given in seconds. For example, the command:

sleep 120

will sleep for 120 seconds (2 minutes).

!command

Execute a shell escape. Command is the command to run. For example, the command:

The Fourier Transform function used assumes that the origin is in the middle of the scanline. However, Sherlock calculates using the origin as the first pixel in the scanline. In order to match up perfectly with the Space Domain Correlation, either "beercanning" or "checkerboarding" would have to be done so that the Fourier Transform function sees the data with the origin in the middle. "Beercanning" and "checkerboarding" are methods that can transform an image to move the origin from being the first pixel in a scanline to being in the center of a scanline (or vice versa). In order to save processing time, Sherlock doesn't do either. It compensates by calculating the origin (middle) location of the pattern within the image, knowing that the Fourier Transform function used will always return the maximum correlation location at the upper left corner of the pattern. Had Sherlock done either "beercanning" or "checkerboarding", the maximum correlation would have been found in the center of the pattern, as it is for Space Domain.

In order for the Frequency Domain Correlation to work, Sherlock needs a black (pixel value 0) background with a non-black object. This is so zero padding of black can be applied to the pattern and image to boost the size to that of a power of 2 (so that the Fast Fourier Transform function can work as fast as possible). Therefore, the first pixel (location [0,0]) is checked to see if it's equal to 0. If it is, Sherlock assumes the background is black and the object is non-black. If it isn't, Sherlock inverts the image (or pattern). This assumes that the pattern won't be covering position (0,0). If it is, or if for some reason the auto detection isn't working properly, the inversion behavior can be controlled by switches (-ii and -ip) and, in the case of batch mode, by commands (invert type {yes|no|auto}).
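
The padding and automatic inversion described above can be sketched as follows. The function names, the square padded layout, and the use of 255 minus the pixel value as the inversion are assumptions for illustration; only the check of pixel [0,0] against 0 and the power-of-2 sizing come from the text.

```python
def next_power_of_two(n):
    """Smallest power of 2 that is >= n (for n >= 1)."""
    size = 1
    while size < n:
        size *= 2
    return size


def prepare_for_fft(pixels, size, invert="auto"):
    """Optionally invert 8-bit pixels, then zero-pad to a size x size square.

    invert is "yes", "no", or "auto"; "auto" inverts when pixel [0][0] is not 0.
    """
    do_invert = invert == "yes" or (invert == "auto" and pixels[0][0] != 0)
    padded = [[0] * size for _ in range(size)]
    for r, row in enumerate(pixels):
        for c, value in enumerate(row):
            padded[r][c] = 255 - value if do_invert else value
    return padded


# A 3x3 image with a white background and one black pixel.
image = [[255, 255, 255], [255, 0, 255], [255, 255, 255]]
size = next_power_of_two(max(len(image), len(image[0])))
padded = prepare_for_fft(image, size)
```

With these definitions a 500x500 image would be padded to 512x512, and the 612x792 image of Experiment 3 to 1024x1024.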

The steps for this algorithm are as follows:

find power of 2 size equal to or greater than the largest dimension in pattern and image

The % symbol represents remaindering. Both the increments and the maximum values can be given on the command line (-l option). The maximum values (x_max, y_max) are equal to or greater than the image dimensions.

The parameters were obtained from a table of suitable parameters.

The algorithm steps are the same as for SPACE DOMAIN (see above) except for the method of selecting the next pixel within the image (step 3). Instead, the change in position is calculated using the equations listed above.
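
A pixel ordering of this kind can be sketched in Python. The update rule used here (add a fixed increment to each coordinate and take the remainder modulo the corresponding maximum value, skipping locations that fall outside the image) is an assumption based on the description of remaindering, increments, and maximum values; the function name `lps_ordering` is hypothetical. The parameters are those reported for Experiments 1 and 2.

```python
def lps_ordering(width, height, x_max, y_max, x_inc, y_inc):
    """Return image pixel locations in an assumed Linear Pixel Shuffling order.

    x and y each advance by a fixed increment and wrap with %; points that
    fall outside the width x height image are skipped.
    """
    order = []
    x = y = 0
    for _ in range(x_max * y_max):
        if x < width and y < height:
            order.append((x, y))
        x = (x + x_inc) % x_max
        y = (y + y_inc) % y_max
    return order


# Parameters reported for Experiments 1 and 2 (500x500 image).
order = lps_ordering(500, 500, 509, 557, 237, 380)
```

Under this assumed rule, every pixel is visited exactly once when the two maximum values share no common factor and each increment shares no common factor with its maximum value, as is the case for 509, 557, 237, and 380.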

For the correlation data, the difference values (see Section 4.2 on page 14 which describes this) are saved during an algorithm's execution. After the algorithm is done, the following equation is used for Space Domain, LPS, and Random Ordering data to convert differences into normalized correlation values (from 0% to 100% correlation).

(EQ 3)

NCi is the normalized output data correlation value. Di is the difference value. P is the number of values. For Frequency Domain results, the data is normalized so that it can be compared to the other algorithms' results.

The results for each experiment are in two tables. The first table lists the number of seconds it took for each algorithm to find its first partial match of the pattern. The second table lists the number of seconds that it took to find the center of the pattern.

For the first two experiments, an algorithm was claimed to have found a partial match when the distance became smaller than the initial distance value. This is reasonable since there was no noise in the image and the higher correlation values could only have come from matching the pattern. The LPS parameters used for these experiments were 509 and 557 for the maximum values and 237 and 380 for the increments for X and Y, respectively.

Because the image in Experiment 3 had noise (and therefore no sharp indication of partial success in the correlation), it wasn't enough to say that the algorithm had found the pattern when the distance became shorter. Instead, the Space Domain results were used to find the maximum distance from first contact with the pattern to the center. This was known because Space Domain first makes contact with a pattern at its boundary and then moves toward the center (which results in a delta function). This maximum distance was used as a threshold. Any algorithm that had a distance equal to or smaller than this threshold value was considered to be matching the pattern and not something else. For this experiment, 619 and 809 were the maximum values and 288 and 552 were the increments for X and Y, respectively.

For all three experiments, the center of the pattern was considered matched when the distance was equal to 0.
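
These matching criteria can be applied to a recorded distance trace as in the sketch below. The function names and the sample trace are hypothetical; the trace is made-up data used only to exercise the code, not a result from the experiments.

```python
def first_partial_match(distances):
    """Index of the first sample whose distance is below the initial distance."""
    initial = distances[0]
    for index, value in enumerate(distances):
        if value < initial:
            return index
    return None


def first_threshold_match(distances, threshold):
    """Index of the first sample at or below a threshold (Experiment 3 rule)."""
    for index, value in enumerate(distances):
        if value <= threshold:
            return index
    return None


def first_center_match(distances):
    """Index of the first sample whose distance is exactly 0."""
    for index, value in enumerate(distances):
        if value == 0:
            return index
    return None


# Hypothetical distance trace: the estimate starts at (0, 0), jumps into the
# pattern's neighborhood, then lands on the center.
trace = [565.7, 565.7, 565.7, 4.2, 4.2, 1.0, 0.0, 0.0]
partial_index = first_partial_match(trace)
threshold_index = first_threshold_match(trace, threshold=2.0)
center_index = first_center_match(trace)
```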

First, let's look at Experiment 1. The following table shows how long it takes for pixels to be processed. The values were generated from the Space Domain. Since Space Domain processes sequentially in row order, we know how many pixels in the image the algorithm had to process before finding the center location of the pattern. Using this information, we can calculate the seconds/pixel (or pixels/second) rate. This rate can then tell us how long it took to go through N pixels.

For the table above, column 1 is the case (in terms of pattern size). Column 2 is the number of pixels to go through to get to the center of the pattern. Column 3 is the seconds/pixel rate based upon the number of pixels and the time it took to process them. For example, for the first case (3x3), we know from Table 2 on page 18 that it took 12 seconds for Space Domain to find the center. Since the center is at (400,400), we know column 2 to be 200,400 pixels. Just divide 12 by 200,400 to get the seconds/pixel rate. Column 4 is the inverse of column 3 to show the pixels/second rate. Column 5 is the number of pixels that had to be processed by LPS until a partial match was made. This is determined by looking at the pixel ordering that LPS went through. Since the same parameters were used for LPS in Experiments 1 and 2, this ordering will be constant. Column 6 is the number of seconds it took to get through the number of pixels listed in column 5.
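
The arithmetic for the 3x3 case can be checked with a few lines of Python. The variable and function names are illustrative; the numbers are those quoted above.

```python
image_width = 500
center_x, center_y = 400, 400
seconds_to_center = 12            # Space Domain time for the 3x3 case

# Row-order scan: full rows above the center, plus the pixels before it in its row.
pixels_to_center = center_y * image_width + center_x
seconds_per_pixel = seconds_to_center / pixels_to_center
pixels_per_second = pixels_to_center / seconds_to_center


def seconds_for(pixel_count):
    """Estimated time to process pixel_count pixels at the measured rate."""
    return pixel_count * seconds_per_pixel
```

This gives 200,400 pixels, about 5.99E-05 seconds/pixel, and 16,700 pixels/second for this case.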

Since each case has a different sized pattern, the time it takes to calculate the correlation at each pixel location in the image is different. The bigger the pattern, the more calculations are needed and the longer the time (as shown in columns 3 and 4). Column 6 is based on the individual times for each case.

As can be seen in column 6, LPS finds a partial match in well under 2 seconds.

Figure 12 on page 22 shows the pattern size vs. the number of seconds it took an algorithm to find a partial match. This graph shows LPS, Frequency Domain, and Random Ordering to be pattern size invariant (with LPS being the quickest of the three), whereas Space Domain is dependent on pattern size.

The descriptions for the columns are the same as for the table in the last section. Column 1 lists the cases in terms of pattern centers. Unlike Experiment 1, the pattern is the same size for all cases. This should lead us to believe the timings for columns 3 and 4 should be the same. They do differ by a small amount. The mean for seconds/pixel is 6.4668E-04. This is roughly 26 times greater than the standard deviation (2.4581E-05). Column 6's results are based on this mean value.

Again, it can be seen that LPS finds a partial match in under 2 seconds.

The results for Frequency Domain in the 408x67 case obviously don't seem to belong. This anomaly is explained by background processes running at the same time. Repeat trials of this case show the results to be similar to other case results.

The same approximate values resulted for Space Domain, LPS, and Random Order as in the original test case. Rerunning this particular case shows that the 140 seconds for Frequency Domain from the original experiment is not correct. The mean values for Frequency Domain are close to other case results, as expected.

Figure 13 on page 25 shows the position vs. the number of seconds it took an algorithm to find a partial match. Position is in terms of sequential order (i.e. for a 500x500 pixel image, pixel (1,1) is at sequential position 501). This graph shows LPS, Frequency Domain, and Random Ordering to be position invariant (with LPS being the quickest of the three). Space Domain is shown to be dependent on position.

The descriptions for the columns are the same as for the table in the first section. Comparing the number of seconds in column 6 (~96.5 seconds) to the number of seconds LPS took to find a partial match in Table 5 on page 20 (98 seconds), we see that the values for LPS in Table 5 are indeed correct.

The results for Space Domain varied depending upon the location of the pattern within the image and the pattern size. If the pattern was located near the starting position, Space Domain had a better chance of being the first to find the pattern. A bigger pattern size meant more calculations per correlation value.

Frequency Domain had good results (compared to Space Domain) when the ratio between image and pattern was small. That is, the pattern was a large fraction of the image. It was consistent for Experiments 1 and 2 when the size of the image didn't change. This was due to the fact that for a given image size, Frequency Domain will always have to do the same number of operations and so the time it takes should be consistent. Timing results were independent of position and pattern size since the timing of this algorithm is based on the number of operations performed. The number of operations is based on the number of pixels (power of 2 size) in which it has to transform.

Since the pattern size is the same in all cases for Experiment 2, the timing differences can be averaged. On average, LPS found a partial match 47.7, 97.4, and 82.5 seconds faster than Space Domain, Frequency Domain, and Random Ordering, respectively.

LPS was able to find the pattern much more quickly than the other algorithms most of the time. Within 2 seconds it was able to scan the 500x500 pixel image and match a part of the pattern (Experiments 1 and 2). Results show LPS to be 10 to 5,422 seconds faster for changes in pattern size, 47.7 to 82.5 seconds faster depending on pattern position, and 452 to 10,088 seconds faster for a real test case. It is invariant to pattern size and independent of position.

This project did not investigate the effects of image/pattern noise. However, the single real test case examined suggests that LPS may yield similar processing speed improvements when noise is present.

Although LPS took longer to find the exact center in some cases, it is useful as a subpattern searching algorithm. Intelligent processing can be added to allow LPS to find the center much faster than is shown here. LPS has consistently demonstrated that it can find a pattern much more quickly than the other algorithms in this project.

The work done here in no way shows the full potential of LPS. In a real application, once the pattern's neighborhood is found, there would be no need to continue looking at every pixel as was done here. This experiment merely lays down the foundation to show that LPS is an algorithm that can be made use of for subpattern searching.

LPS calculates correlation the same way as Space Domain and as such can get fooled. If any noise has a better correlation than the pattern, LPS can mistake the noise for the pattern. Any intelligence that can help Space Domain avoid false matches may also help LPS (such as noise reduction filters, thresholding correlation values, etc.). Noise reduction is beyond the scope of this project.

One problem seen with LPS is when there is an unknown number of patterns. LPS's strength is the ability to find a match quickly. When the number of patterns is not known in advance, the question arises as to when LPS should stop searching.

As was mentioned before, LPS sometimes takes longer to find the exact center. This can be improved on. This experiment takes a simplistic approach and continues to check every pixel, even after finding a partial match.

One way to improve performance is to zero in when that first partial match is found. Instead of continuing through the whole image, concentrate on the specific area where the match was found by running the LPS algorithm on this sub-sample.

Another way to speed up performance is to improve the correlation calculation. Any way to speed up the calculation would speed up the whole process.

Noise was not really an issue in this project. The first 2 experiments were noise-free. One direction to go with LPS research is to look into how it handles noise. Given the first 2 experiments with noise, the same results would be expected. It's with intelligent processing that noise becomes more troublesome. What happens when you get some correlation? Should LPS investigate the sub-sample? Or is the correlation only noise and LPS would be wasting its time? Also, if LPS is trying to zero in on the center, how long should it process before it realizes that the pattern isn't in the sub-sample? One way to get around these problems would be to set a threshold. Any correlation above this threshold would be considered pattern, anything below would be noise.

In the case of an unknown number of patterns, one intelligent processing approach worth investigating is to set up the LPS parameters based on the size of the pattern so that you are likely to hit a pattern (if one exists) on every jump. One pass through the image would tell you the correlation hot spots to zero in on.

There are several different methods to find parameters for LPS. Another area to look into is to discover which method produces the best parameters.