Filter Banks for Images on Hexagonal Grid

Abstract

As the Hexagon Grid becomes popular for machine vision,
analysis-synthesis filter banks tailored for that Grid
are needed.
In this paper we exhibit two perfect reconstruction (P.R.) decompositions
for signals on a hexagon lattice.
Each example filter bank has better properties than other
banks proposed for the
hexagon grid, and, it would seem, better properties than any possible
comparable filter bank for the square grid.

One example is an orthogonal
coding transform with one low-pass and three high-pass filters
similar to that used in the Simoncelli-Adelson quadtree pyramid for the
hexagon grid.
Unlike the Simoncelli-Adelson filter bank, we achieve (exact)
perfect reconstruction
with compact (and integer-valued) filters.

The other example is a 3-ary ``blurring'' filter bank
which has a perfect inverse (``sharpening'') filter bank.
In prior art, one is usually satisfied with almost-perfect reconstruction
for such systems.

Since mathematical expressions are somewhat difficult to depict
in HTML, this version, while self-contained, omits some of
the detailed mathematical descriptions in the original LaTeX version.

1.0 Introduction

Images are usually sampled on a Square Lattice
but a Hexagon Lattice has several important advantages
which amply justify its use for many applications
(e.g. robot vision, deep-space imaging, biomedical image processing).

In a companion paper (``Decomposition of orthogonal filter banks'')
we describe a general method for
representing perfect reconstruction (P.R.) multi-dimensional filter banks.
In this paper we give two example P.R. decompositions
for signals on a hexagon lattice.

Basic Definitions

To make the paper self-contained,
we offer brief practical definitions of filterbank terminology.

A complete filterbank decomposition replaces the N
pixels of an image with Nfilterbank outputs (transform
coefficients).
Each output is the scalar product of the image and a filter.
The locus of non-zero taps in the filter is called its support kernel.
The support kernels in our example filters are each of size 16 or 17,
about the minimum for a ``high-quality'' filter on a 2-dimensional signal.

A filterbank is compact when all its support kernels
have finite size.

A complete compact filterbank offers perfect reconstruction
when an inverse complete compact filterbank
exists which allows the original signal
to be (perfectly) reconstructed from the filterbank outputs.
The two filterbanks are called analysis (decomposition)
and synthesis (reconstructing inverse).

An analysis filterbank is orthogonal when its (synthesizing)
inverse is exact and is simply the transpose of the analysis filterbank.
Both our example filterbanks
provide perfect reconstruction, but only the first is orthogonal.

In a K-ary filterbank the N filters are derived
from K basic filters by shifting. Our two examples are
4-ary and 3-ary filterbanks respectively.
Since the N image pixels produce only N / K
outputs for each basic filter, K is called the decimation factor.

A K-ary filterbank may have fewer than K basic filters when
rotations are considered. For example, the 2-D wavelet decomposition
in a method like JPEG-2000 is a 4-ary filterbank, but two of the filters
are equivalent after 90o rotation, so it has
only three filters distinct after rotation.
Our example filterbanks have respectively only two and one filters which
are distinct after rotation.

A compaction filterbank is a K-ary analysis filterbank serving the
goal of signal data compression.
Usually one of the K filters is a ``low-pass''
filter which will carry most of the information and the other (K-1)
filters are ``high-pass.''

A pyramidal filterbank is a filterbank system
where the low-pass outputs
are recursively presented back to the analysis filterbank.

Filterbank Example 1: Quadtree (Pyramidal) Compaction

The Quadtree Wavelet Pyramid provides the essential step in many
2-D signal coding methods when signal samples are arranged on
a square grid.
As Simoncelli and Adelson showed
cite{Simoncelli:90a},
Quadtree Pyramids are available
for images on a hexagon grid that have better behavior than
the square-grid Quadtrees.
One nicety is that all three high-pass filters are identical,
after rotation by 120o or 240o.
Such a filter bank might be called ``quasi-binary'' --- after
rotation there are only two basic filters.
(The square grid quadtree has two high-pass filters identical after
rotation by 90o and a third, trouble-some, high-pass filter
called the ``saddle-shape.'')
In Section 2 we discuss the advantages of hexagonal sampling, the
Simoncellian pyramid and some other design choices.

The popular Simoncelli-Adelson filter bank
cite{Simoncelli:90a}
uses 37-tap filters and is only nearly-orthogonal.
In Section 3 we exhibit a similar 4-ary
decomposition with comparable compaction power, better
spatial localization (16-tap filters),
easier computation and exact orthogonality.

Filterbank Example 2: Blur and Perfect-Inverse Sharpen

Functions to attenuate (``blur'')
or enhance (``sharpen'') high-frequency information are quite
useful in signal processing.
The two functions are logically the inverse of each other, but
sharpen/blur filter banks which provide perfect reconstruction
(operate as biorthogonal analysis/synthesis filter bank pairs)
are rarely used, and designers are often satisfied with approximate
reconstruction.

In Section 4 we exhibit 3-ary ``blurring'' and ``sharpening''
filter banks which are each other's perfect inverses.
(Obviously we speak of deliberate digital blurring;
there is no perfect inverse for real-world noisy blurring so
our method doesn't supplant methods like Wiener filtering.)

The blurring filters have similar nearly circular
shapes with one centered at each sample of the signal, and similarly
for the sharpening filters.
The inverse of a finite non-trivial 1-ary filter bank
is always infinite,
so in prior art, one is usually satisfied with almost-perfect
reconstruction
for such systems, but we achieve exact reconstruction with a simple trick:
The three basis filters are not identical, although they are very
similar and become identical after 120o and 240o rotation.
This filter bank might be called ``quasi-unary'' --- after
rotation there is but a single basic filter.

2.0 Design Choices for Compaction Transform

The image compaction filter bank espoused in this paper operates on the
hexagon grid, is 4-ary, ``Simoncellian''
and has as its support a particular 16-pixel
``snowflake-like'' shape.
We will first justify these choices.

Hexagon Grid

The hexagon grid has several major advantages over the square grid:

Pixels have nearly Circular Shape.
The Voronoi cell of the new lattice is a hexagon. The hexagon is a
good approximation to a circle, leading directly to the next three advantages.

Hardware Advantages. Physical pixels (printer dots,
electron beams) usually have
a circular shape and thus operate more effectively on the hexagon grid.

Sampling Efficiency. Mean distance to the nearest
lattice point is reduced compared with a Square Grid of the same density.

Circular 7-pel Shape. A compact and symmetric 7-pixel
shape provides filter kernels which are more effective and
more efficient than any Square Grid counterpart.
There is also a circular 12-pel shape, and so on.

Symmetry. The regular hexagon has 6-fold symmetry
versus only 4-fold symmetry for the square.

Well-behaved Connectivity. Differential treatment of
lateral and catercorner neighbors complicates many low level pixel
processing tasks on the square grid, and even gives ``paradoxical'' behavior.
(Regions may lack contiguous boundaries in the 4-neighbor topology,
while distinct edges may cross in the 8-neighbor topology.)
These difficulties are avoided on the Hexagon Grid where each pixel has
six equal neighbors, and where
the dual graph of the Voronoi is triangular
cite{Deutsch:70}.

Improved Angular Resolution. With more lateral neighbors,
curves and edges can be followed more easily and
accurately
cite{Golay:69}
cite{Legault:73}.
Line ``aliasing'' is less of a problem
cite{Staunton:89a}.

Well-behaved Operators. Some of the cited advantages have led
designers to treat even the square grid
(and ``quincunx grid'') using the elegant methods
appropriate for the hexagon grid
cite{Graham:82}.

Storage Savings. The improved sampling efficiency means
fewer pixels need be stored for a given resolution.

Speed. Most of the above advantages lead directly to faster
image processing, often twice as fast as on the square grid
cite{Deutsch:72}.

The final point here applies particularly to Simoncellian
decompositions, as we discuss in the next subsection.

Orientation Decomposition

Separable 4-ary (quadtree) wavelet coding on the square grid uses a
low-pass term (``LL''),
two similar high-pass terms (``HL,'' ``LH'' ---
horizontal and vertical gradients)
and a ``saddle-shape'' term (``HH'').
This fourth term is unfriendly, since it combines the leftover energy
from disjoint regions of Fourier space
cite{Simoncelli:90c}.
Compression is achieved by discarding low-energy coefficients but small
errors in the ``saddle-shape'' term may cause diagonal edge energy
to appear to ``flip'' orientation --- thus a small arithmetic error can cause
a severe visual flaw.

On the hexagon grid there are
4-ary image decompositions comprising a low-pass term
and three high-pass terms equivalent after rotation by 120o.
Such decompositions were
introduced by Simoncelli
cite{Simoncelli:90a},
so we'll call them
Simoncellian.
With finer orientation banding and without the ``unfriendly'' saddle-shape,
quantization policies for lossy image compression are
simpler and perform better than the square-grid
counterparts
cite{Balak:93}
cite{Wegmann:96}.

Pyramidal Reduction by Four

The Hexagon Grid allows Tritree and Septree
Pyramids
cite{Burt:80}
cite{Burt:81}
cite{Golay:69}
as well as the Quadtree.
Resolution reduction by a factor of seven is too coarse to get
much benefit from pyramidalization,
but reduction by three might seem appealing.
However the Tritree has some serious disadvantages.

The axes of the lower resolution grid
in a 4-tree (Quadtree) will be parallel to the
original axes, but for the 3- and 7-trees the axes will
be rotated by 30o and approx 19.1o respectively.
Such rotations might be severely impractical.

The choice of a Quadtree allows one to exploit coding software and
visualization tools already developed for the Square Grid.
The Quad-tree Pyramid form of a rectangular
shaped image is often visualized by dividing the image into four congruent
rectangles with the upper-left being a ``thumbnail'' view, and recursively
so dividing the thumbnail.
This can also be done with a Hexagon Grid image of overall rectangular shape,
but there doesn't seem to be a similar visualization method for Tri-tree
or Sep-tree Pyramids (even if the overall image were of triangular
or heptangular shape).

A Tritree Pyramid would use three basic filters instead of four;
the three filters might be
a low-pass, and a horizontal and vertical gradient.
Such a system seems superior to the ordinary Square-Grid Quadtree
Pyramid since the ``saddle-shape'' filter is eliminated,
and might be a good choice despite the disadvantages mentioned.
Still, the basic gradients of a hexagon-based system ``should''
be offset from each other by 120o, not 90o,
so the Simoncellian Quadtree seems more ``natural.''

4-Trigon Tile Shape

Use of a quadtree does not directly imply reliance on tiles of
specifically four pixels, but their use is almost dictated.
The support bases for the filters in 4-ary perfect reconstruction
filterbanks may need to be derived from basic 4-pixel subtiles.
(The support kernels for Simoncelli's filters were not multiples of
four in size, but this filter system
did not provide true perfect reconstruction.)

Displayed here are three possible shapes for 4-pixel tiles.
Looking from left to right, we will call them
the R-Trigon, the Rhombus, and the L-Trigon.
(``Trigon'' is an invented word, but it is convenient to have
a specific name for this triangle shape of area four.)

We will use the R-Trigon as a tiling primitive, and will need
a specific pixel numbering, as shown, for definiteness.

We prefer the Trigon to the rhombus because

The tile has three-fold symmetry.
The rhombus has only two-fold symmetry.

The tile has a center pixel.
Each tile (or symmetric filter on the tile)
has a unique most representative pixel.
This simplifies some operations which require mapping between
pixel and transform spaces.

Nice decompositions.
The ``Simoncellian'' banks and those based on
center-surround are derived readily via this tile.
The Rhombus, on the other hand,
leads to decompositions resembling those of the Square.

Since overlap is an important attribute of good coding transforms,
we will combine two or more of these 4-pel basic tiles (``subtiles'')
into a larger basic kernel to serve as the support of
our highest resolution filters.
Seven subtiles (28 pixels) is a plausible choice for a nearly
circular support, but for ease of exposition in this paper
we propose a simpler kernel using just four subtiles.
(It may even be impossible to construct a 28-pel orthogonal
filter bank with the excellent symmetries our example will possess.)

Even when one does not overlap tiles, successive layers of
a coding pyramid will implicitly use larger tiles: 16, 64, etc.
for a Quadtree Pyramid.

Trigon of Trigons

In a hierarchy where 16-pel tiles are built from 4-pel tiles, and so
on, one need not use the same tiling geometry at each resolution level.
In the following figure, the 16-pel shape at the left is
an ``R-Trigon of R-Trigons.''
The shape at the right is an ``L-Trigon of R-Trigons.''
(The colors have no significance except to make the subtiling
more apparent.)

Having chosen the ``Trigon'' as a basic tile (``subtile''),
and a kernel of four such subtiles (16 pixels) as the filter support,
we are left with the choice of
which four subtiles to include in the kernel.
The subtiles form a hexagon grid of one-fourth the original density,
so the rhombus and Trigon that we just saw in a preceding Figure
are the obvious choices here as well.
Most of the reasons we just gave for preferring the Trigon over
the rhombus apply again, so we choose a Trigon of Trigons
as our basic 16-pixel kernel.
There are two additional reasons for preferring that shape:

Distinct Trigon share at most a single subtile.
Rhombi can have two common subtiles which complicates certain
algebraic procedures.

Softer boundary defects.
The rhombus shape leads to the straight lines which cause
visible ``blocking'' defects in lossy image compression.
The Trigon of Trigons shape yields
diffusely shaped boundaries, so errors are
less objectionable subjectively
cite{Pearson:84}.
Furthermore, the ``interlocking'' boundaries increase
inter-block correlations and this leads directly to more powerful
non-linear synthesis filters
cite{Allen:94}
which are known to dramatically improve
the decompressed fidelity of lossy compressed images.

We still have the choice of either an R-trigon of R-trigons, or an L-trigon
of R-trigons.
The former has a more pronounced ``jigsaw-like'' shape
(it has a ``diffuse perimeter'' in which neighboring tiles
will reach into each other like the knobs of a jigsaw puzzle)
and would increase the postprocessing advantage just mentioned,
but this is more relevant in a block than an overlapped
transform, so we use the L-trigon of R-trigons.

Symmetric Kernels

Finally we mention the excellent symmetries that Simoncelli's
filters, and ours, possess.
Most imaging engineers will be intuitively delighted
by our low-pass filter with its three axes of symmetry,
as well as our Simoncellian high-pass
filters which also have an axis of symmetry.

This presents a contrast to the Quadtree separable Wavelet coding used with
the square grid:
It is a well-known theorem that the only orthogonal symmetric
wavelet is the Haar wavelet, which gives quite poor compression performance.

3.0 Highly Symmetric 4 X 16 Filter Bank

The original LaTeX paper uses this example as a case study
in the design of multi-dimensional P.R. filterbanks.
In this version of the paper, we abbreviate the derivation of
the filterbanks.

We introduce a 4 X 4 block orthogonal transform
Gf(X) as a building block.

Gf(X) = (

3f2-1

2f

2f

2f

) .

X / (3f2+1)

2f

f2+1

-2f2

-2f2

2f

-2f2

f2+1

-2f2

2f

-2f2

-2f2

f2+1

As described in the LaTeX paper, the parameter f
is related to a Givens angle y by
f = (1 + cos y)/(sqrt(3) sin y)

A Beautiful Filter Bank!

Using Gf(X) as a building block, the
4 X 16 filterbank T(a,b)we introduce for the hexagonal grid can be
described simply:
T(a,b) = Gb . Shift . Ga

Shift is a 4 X 16 permutation matrix which shifts each
assymetric filter output to an adjoining subtile.
(The LaTeX paper offers more detail and clarity.)
Elsewhere
cite{Allen:98}
I prove that filterbanks on the Trigon-of-Trigons shape
conform to this equation
if and only if they satisfy a number of useful symmetry conditions.

The necessary and sufficient condition that
the high-pass filters of T(a,b)
have zero DC response (This is called the DC separation constraint) is
b = (3a+1)/(3-3a)
or
b = (a-1)/(3a+1)

The values given are exact.
(They must be divided by 1014 to achieve normalization.)
The other two filters can be visualized by rotating the High-pass
filter 120o and 240o.

The filter banks described in this section are orthogonal.
There are similar filter banks which satisfy the same strong symmetry
conditions, but which are biorthogonal --- the analysis and
synthesis filter banks are different.
In general one can design biorthogonal filter banks which provide
better compaction than any orthogonal filter bank
of the same size
cite{Aase:95},
but we have avoided that here to keep our exposition simple.

4.0 Perfect Unblurring

We will exhibit a 3-ary P.R. filter bank system
whose filters all have the same circular shape.
It is useful for invertible sharpening or blurring.

The obvious blurring filter for the hexagon grid is
the seven-tap kernel comprising a large central weight, and six equal
smaller weights at the neighbors.
If the smaller weights are negative the filter is a sharpening
filter.
The seven-tap blurring and sharpening filters
are approximately each other's inverse; by adding 11
tiny-valued taps outside the seven-pel central kernel
we will make the filters to be exact inverses.

The design of the following filter is similar to the first example,
though it uses an extra Shift stage:
T = Ha,b . Shift . Hc,d . Shift . He,f

Another difference is that the synthesis filterbank
differs from the analysis, having the same pattern but
different parameters:
A = (b - a2) / (ab - 1)
B = (a - b2) / (ab - 1)

Apply a 3 X 3 transform to non-overlapping
3-pixel tiles on the hexagon grid; then apply an operator lattice-shift
operator.
The result is three corresponding 3-tap filters
(shown via subscripts 1 to 3)
throughout a 9-point tile cluster,

. . a1 . .
. b1 11 b3 .
. . 12 13 a3
. a2 b2 . .

So far the lattice shift has only relabeled the filters,
but now we apply the 3 X 3 filter bank again (replacing
(a,b) with (c,d)).
The three resultant new filters will be identical after
rotation by 120o so we need only show one,

. . a . .
. b 1 bc .
. . d c ac
. ad bd . .

Now repeat this process one more time: lattice-shift the three 9-point
filters and blend them, again using the 3 X 3 transform,
now parameterized with (e,f).
Again we need only show one filter.

I've taken the liberty of adding lines to
make the familiar 7-pixel ``center-surround''
shape clearly visible.
Outside that kernel,
ten tiny-valued taps are provided and chosen specifically
to allow the perfect inverse filter to be finite.
The perfect reconstruction property of these useful filter banks
was derived effortlessly by the A_1 B_1 A_2 B_2 A_3 cascade.

Here is the blurring filter which is the perfect inverse of
the sharpening filter just shown. Remember that all three
rotations of these filters must be used, in a 3-ary filterbank.