For a long time I toyed with the idea of applying Fourier
analysis to a typographic page. It seemed to me that certain hard-to-define
visual qualities of a perfectly set page can be revealed and perhaps even
measured by taking a close look at its Fourier spectrum. It remained just an idea
until I came across Peter Burnhill's wonderful book on Aldine typography
(Type spaces: in-house norms in the typography of Aldus Manutius),
which contained a very detailed analysis of typographic norms and page geometry.
My interest in the subject was renewed, and I realized that I had everything
I needed to go ahead and try my idea out. Here is the list of materials:

The plugins are free, work with PSP and Photoshop, and are available with source code (side note:
by a strange coincidence, Alex Chirokov graduated from the same place as I did, only 14 years later).

IN HOC VOLVMINE HAEC CONTINENTVR
(a.k.a. Scriptores historiae Augustae, Ren. 87:8; Adams S781; BMSTC I 217; Palau 48)
is typographically similar to the famous octavo classics series (1501–15). The author, Cassius Dio (164–c. 235),
was a Roman historian of Greek descent; the full text of this work, in English translation, can be found
here.
As far as I can tell, the leaf refers to the brief reign of Didius Julianus.
It is set in Griffos italic, and is representative of the high quality Aldine typography.

The original 2400dpi scan proved to be too large for processing, so I scaled it down by a factor of 5 and cropped the margins.
The resulting image of the text block is 3MB on disk and 9.5MB in memory (32 bits per pixel).
Here are the exact dimensions:

Averaging the measurements above, we get an image resolution of 18.95±0.05 dpmm (dots per millimetre).
Thus, the real size of the image is 70.7±0.2mm × 131.4±0.3mm.

Transformation

Fourier transformation converts a greyscale image into an equivalent form showing the phases
and amplitudes of sine waves of various frequencies; the position of a pixel defines
the angle and frequency of the wave, while the color information for the pixel stores
the phase and amplitude/intensity of the wave
(if you want a more detailed introduction,
go here).
In the original image space, each individual wave looks like a regular set of parallel
bars with intensity alternating between darker and lighter shades of grey. The wave
covers the whole picture and its average intensity is ½ (1=white, 0=black).
The phase of a wave determines positions of its maxima; the interference
of the waves of many different frequencies, amplitudes and phases recreates the
original image. In theory, the transformation does not lose any information and is reversible.
When the transformation is applied to discrete images, rounding errors do occur, but in
most cases, including ours, the result of direct transformation (FFT) followed by
inverse transformation (IFFT) is visually indistinguishable from the original.
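
The round trip described above is easy to check numerically. The following is a minimal sketch using NumPy rather than the Photoshop plugin; the synthetic image and all variable names are assumptions made for illustration, and the plugin's own 8-bit channel storage (which adds further rounding) is not modelled.

```python
import numpy as np

# Synthetic stand-in for a greyscale page: values in [0, 1], 1 = white.
rng = np.random.default_rng(0)
image = rng.random((64, 48))

spectrum = np.fft.fft2(image)      # forward transform (complex values)
amplitude = np.abs(spectrum)       # wave amplitudes
phase = np.angle(spectrum)         # wave phases, in radians

# Rebuild the complex spectrum from amplitude and phase, then invert.
restored = np.fft.ifft2(amplitude * np.exp(1j * phase)).real

# Largest per-pixel difference between the original and the round trip.
max_error = float(np.max(np.abs(restored - image)))
print(max_error)
```

In double precision the difference stays at the level of floating-point rounding, far below anything visible.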

Alex Chirokovs FFT plugin converts the lightness channel of any 32-bit image
into its FFT form in-place, writing the Fourier image in the resulting images H (hue) and
L (lightness) channels. The H channel contains phase information; the L channel
contains amplitudes. After FFT, the S (saturation) channel of the H-S-L split is set
to a grey level of ½ (127) and is not used in reverse transformation. To operate on
channels independently, the image is split into three (one per channel) and later recombined
(most image editors support HSL splitting/combining; PaintShop represents channels
as 8-bit greyscale images). Of the two channels carrying information, the L, which
encodes waves amplitudes, is more interesting (see picture on the left). The
phase channel (H) is harder to interpret and is usually left intact, whereas the
intensity channel can be edited, recombined with the other two, and transformed
back into original image form.

The intensity picture of the original page shows small-to-medium variations of the tone
between adjacent pixels (representing waves of close frequency and direction). These changes,
when blurred by the eye, give very smooth transitions with very little contrast except for
a few places around the middle. The contrast can be artificially enhanced for analysis by
averaging and posterizing the picture into small number of fixed intensities. I chose 3-bit
posterization (8 intensity levels) and colored the result using map-like pseudocolors (blue
to white through green). Also, since the averaging loses important intensity peaks, I added
those peaks using intensity threshholding on the non-averaged channel. The peaks look like
small white dots located in the central area.
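
The enhancement steps above (averaging, 3-bit posterization, and restoring peaks by thresholding) can be sketched roughly as follows. This is only an illustration: the window size, the threshold of 0.9, the synthetic channel, and the function names `box_average` and `posterize` are all assumptions, and the pseudocolor mapping is omitted.

```python
import numpy as np

def box_average(channel, radius=1):
    """Mean over a square window of side 2*radius+1, with edges replicated."""
    padded = np.pad(channel, radius, mode="edge")
    size = 2 * radius + 1
    h, w = channel.shape
    total = np.zeros((h, w), dtype=float)
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + h, dx:dx + w]
    return total / size ** 2

def posterize(channel, bits=3):
    """Quantize values in [0, 1] to 2**bits equally spaced levels."""
    levels = 2 ** bits
    index = np.minimum((channel * levels).astype(int), levels - 1)
    return index / (levels - 1)

# Synthetic amplitude channel: a dim background with one isolated bright peak.
rng = np.random.default_rng(4)
channel = 0.6 * rng.random((32, 32))
channel[10, 10] = 1.0

poster = posterize(box_average(channel))   # smooth, then reduce to 8 levels
peaks = channel > 0.9                      # threshold the non-averaged channel
display = np.where(peaks, 1.0, poster)     # paint the peaks back in as white
```

Averaging flattens the single bright pixel into its neighbourhood, which is why the peaks have to be taken from the non-averaged channel.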

Analysis

When a discrete Fourier transformation is applied to a finite image, there are natural limits
for the wave frequencies: the shortest identifiable wavelength is two pixels, while the longest
wavelength is equal to the size of the picture (if the picture is not a square, horizontal and vertical
components of the frequency can have different maxima). The FFT image is laid out so that the
information concerning the long waves is in the center of the image, while the shortest waves are
represented by pixels close to the edges. Since amplitude channels of FFT images are symmetrical
around the center, only one half of the image needs to be analyzed.
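
The layout and the symmetry of the amplitude channel can be illustrated with a small NumPy sketch. It assumes the plugin's centered layout is equivalent to applying NumPy's `fftshift` to a standard FFT; the random image and the sample offsets are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
image = rng.random((64, 48))    # hypothetical greyscale image (rows, columns)

# fftshift moves the zero-frequency (longest-wave) term to the array center.
amplitude = np.abs(np.fft.fftshift(np.fft.fft2(image)))
cy, cx = image.shape[0] // 2, image.shape[1] // 2

# The center pixel holds the sum of all pixel values (overall brightness).
center_is_total = bool(np.isclose(amplitude[cy, cx], image.sum()))

# For a real-valued image the amplitudes are point-symmetric about the center:
# the wave at offset (dy, dx) has the same amplitude as the one at (-dy, -dx).
dy, dx = 5, 7
symmetric = bool(np.isclose(amplitude[cy + dy, cx + dx],
                            amplitude[cy - dy, cx - dx]))
print(center_is_total, symmetric)
```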

The picture on the left displays the central part of the FFT image, showing waves with
wavelengths of 0.5mm and larger. The edges of the FFT image are important for analysis of higher
frequencies responsible for grain, imperfections, scanner noise etc. (click on
the picture to see the whole image). The nested ellipses in the picture show the areas containing
waves with wavelengths larger than 1.0, 0.5, 0.25, and 0.125mm respectively.

First, lets take a look at the peaks close to the center (I marked them with red dots). The
center pixel and pixels around the center encode amplitudes of very long waves, responsible for background
color and very smooth transitions of the tone in the original image; they aren't very interesting.
There are half a dozen peaks right above and below the center point, spaced at equal intervals. The
first three peaks above the center are marked vf, vf*2, and vf*3; they represent the main vertical
wave and its harmonics (shorter waves with frequencies which are exact multiples of the main wave).
As is the case with any musical instrument, the main tone is accompanied by its harmonics with higher
frequencies; when superimposed, they shape the actual waveform, turning the pure sinusoidal shape
into something more sophisticated. In our case, harmonics allow for more abrupt transitions between
black and white, as will be shown below.

Positions of the peaks can be measured and compared to the central pixel at 670,1245:

vf, 3rd harmonic up: 670,1146 (99 up)

vf, 2nd harmonic up: 670,1179 (66 up)

vf, 1st harmonic up: 670,1212 (33 up)

central point: 670,1245

vf, 1st harmonic dn: 670,1278 (33 dn)

vf, 2nd harmonic dn: 670,1311 (66 dn)

vf, 3rd harmonic dn: 670,1344 (99 dn)

Given the vertical offset of the peak in pixels, the actual wavelength in pixels can
be calculated as the quotient of the height of the image and the offset; for the
main vertical wave we have 2490/33 = 75.5±2.5pix between maxima. In real
terms, dividing by the 18.95±0.05 dpmm resolution, we get a 4.0±0.1mm
wavelength (distance between maxima). As you will see below, this wave corresponds
to the vertical line period and its length is equal to the Aldine classics' line
increment measured by Peter Burnhill (figure 35 in Type Spaces).
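
The arithmetic above can be reproduced for the main vertical wave and its two marked harmonics. This is a small sketch using only the numbers quoted in the text (image height 2490pix, resolution 18.95 dpmm, offsets of 33, 66, and 99 pixels); the function name `wavelength` is chosen here for illustration, and the error margins are not propagated.

```python
HEIGHT_PX = 2490          # image height used in the text
RESOLUTION_DPMM = 18.95   # pixels per millimetre, from the text

def wavelength(offset_px, size_px=HEIGHT_PX):
    """Wavelength in pixels of the wave whose peak lies offset_px from the center."""
    return size_px / offset_px

for harmonic, offset in enumerate((33, 66, 99), start=1):
    px = wavelength(offset)
    mm = px / RESOLUTION_DPMM
    print(f"vf*{harmonic}: {px:.1f}pix, {mm:.2f}mm")
```

Each harmonic has exactly one half or one third of the main wavelength, since its offset from the center is an exact multiple of 33.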

The remaining points of interest are not peaks of high intensity, but rather centers
of hills, responsible for many diagonal waves of close frequencies and orientation.
These centers are marked df1, df2, and df3 (since the FFT image is symmetrical,
the same points are present below the center). Here are the positions of the
centers and their offsets from the center point:

df3: 581±5,1026±8 (89 left, 219 up)

df2: 581±10,1122±15 (89 left, 123 up)

df1: 586,1220(approx.) (84 left, 25 up)

central point: 670,1245

df1: 754,1270(approx.) (84 right, 25 dn)

df2: 759±10,1368±15 (89 right, 123 dn)

df3: 759±5,1464±8 (89 right, 219 dn)

With these numbers, we can easily calculate the orientation and
wavelengths of the diagonal waves. It can be done
componentwise, e.g. the horizontal component of df3's wavelength (x) is 1340/89 = 15pix
between maxima along the horizontal axis; the vertical component (y) is 2490/219 = 11.4pix.
The actual wavelength is x×y/sqrt(x²+y²),
which in our case is 9.0±0.3pix or 0.48±0.02mm in real units. The angle
is atan(x/y), or 53 degrees clockwise from vertical.
Making these calculations for all centers, we get the following results (vf is
added for completeness):

vf: 75.5±2.5pix or 4.0±0.1mm, 90° from vertical

df3: 9.0±0.3pix or 0.48±0.02mm, 53° cw from vertical

df2: 12±1pix or 0.63±0.05mm, 37° cw from vertical

df1: 16pix or 0.85mm (average), 9° cw from vertical
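
The componentwise calculation can be repeated for all three centers with a few lines of Python. The sketch below uses only the image size (1340×2490pix), the resolution, and the nominal offsets listed earlier; the function name `diagonal_wave` is an assumption for illustration, and the measurement uncertainties are ignored.

```python
import math

WIDTH_PX, HEIGHT_PX = 1340, 2490   # image size used in the text
RESOLUTION_DPMM = 18.95            # pixels per millimetre, from the text

def diagonal_wave(dx, dy):
    """Return (wavelength in pix, wavelength in mm, angle in degrees from vertical)."""
    x = WIDTH_PX / dx                    # horizontal wavelength component, pix
    y = HEIGHT_PX / dy                   # vertical wavelength component, pix
    length = x * y / math.hypot(x, y)    # x*y / sqrt(x^2 + y^2)
    angle = math.degrees(math.atan2(x, y))
    return length, length / RESOLUTION_DPMM, angle

for name, dx, dy in (("df1", 84, 25), ("df2", 89, 123), ("df3", 89, 219)):
    length, mm, angle = diagonal_wave(dx, dy)
    print(f"{name}: {length:.1f}pix, {mm:.2f}mm, {angle:.0f} degrees")
```

Because the nominal offsets are used without their error margins, the last digit may differ slightly from the rounded values listed above.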

Visualization

A nice thing about the Fourier transform is that you can invert it and get
the original image back. However, it is more interesting to tweak the amplitude channel
before making the inverse transformation; this way one can filter out some
unneeded frequencies or emphasize the important ones. Basic filters for low
and high frequencies just sharpen or soften the image; we will try something
fancier here.

The easiest thing to do is to mask the waves we aren't interested in and
see what happens to the image after the FFT's H, S, and modified L channels
are recombined and IFFT is applied to get back to the image space. To mask the
waves we don't need, we can select the useful ones (above and below the
center), invert the selection, and reduce the brightness of the selection by,
say, 80%. It is important to keep the very center of the FFT amplitude channel
unaltered: it is responsible for the overall brightness of the image and dimming
it down will make the image black, with most of the information lost due to numerical
roundoff errors. The result is shown on the left; the individual letterforms are gone
and the picture resembles what it actually is: an interference of many
waves of various frequencies coming from different angles. However, one can
notice that some characteristics of the original picture remain; the
lines are blurred, but still clearly distinguishable, even the running
head is still visible. This is possible because although many component
waves are lost or suppressed, the phase plane which keeps origin points
for all waves is intact and can still determine the layout of the
interference picture.
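
The masking procedure can be approximated in NumPy as a rough sketch. It assumes amplitudes are stored linearly, so that "reducing brightness by 80%" means multiplying by 0.2; the plugin's L channel may use a different scale. The synthetic page, the band of kept columns, and the function name `dim_outside` are likewise assumptions for illustration.

```python
import numpy as np

def dim_outside(image, keep_mask, factor=0.2):
    """Scale amplitudes outside keep_mask by factor; phases stay untouched."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    amplitude, phase = np.abs(spectrum), np.angle(spectrum)
    keep = keep_mask.copy()
    cy, cx = image.shape[0] // 2, image.shape[1] // 2
    keep[cy, cx] = True    # never dim the center: it carries overall brightness
    amplitude = np.where(keep, amplitude, amplitude * factor)
    filtered = amplitude * np.exp(1j * phase)
    return np.fft.ifft2(np.fft.ifftshift(filtered)).real

# Synthetic "page": horizontal bars with a period of 8 rows, plus noise.
rng = np.random.default_rng(2)
rows = np.arange(64)[:, None]
page = 0.5 + 0.3 * np.cos(2 * np.pi * rows / 8) + 0.1 * rng.random((64, 64))

# Keep a narrow vertical band through the center (waves above and below it).
keep_mask = np.zeros(page.shape, dtype=bool)
keep_mask[:, 30:35] = True
result = dim_outside(page, keep_mask)
```

Because the center pixel is left alone, the mean brightness of `result` matches that of `page`, while the noise outside the kept band is strongly attenuated.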

Alternatively, we can emphasize the interesting waves by selecting them
in the FFTs amplitude channel and increasing their brightness without touching
the rest. The result (shown on the right) has a lot in common with the original
picture, but the details of the original letterforms are traded for the intensity of
major repeating features and now the relation between the two becomes more obvious.
The main vertical wave (called vf on the FFT image) is a line increment; diagonal
waves correspond to various repeated font features. The waves can be seen more
clearly by emphasizing each one individually, dimming other frequencies and
superimposing the picture of the individual wave with the original text. Although
the result is somewhat artificial from the pure FFT standpoint, it shows much
more clearly which features correspond to each wave.

The most obvious match is the 9° wave (df1); it corresponds to the main repeated stem rate and is angled
at the average ascender/descender angle. Its average horizontal period is 0.84mm,
which corresponds to 2½ of Burnhill's units (0.333mm × 2.5 = 0.833mm, ½ of the x-height).
The next wave, df2, can be attributed to repeated connection strokes and frequent
ligatures which have inclined shapes, different from those of standalone letters.
At 37°, it is close to the prototype pen angle. The last one, df3, is
less pronounced than the first two. Its angle is steeper (53°) and its maxima
are closer together (although the horizontal spacing is the same as for df2). Closer
inspection allows us to identify it with features of a, e, æ,
and possibly with high connection strokes of standalone m and n.
The details can be viewed by clicking on the pictures below.

df1: 0.85mm (average), 9° cw

df2: 0.63±0.05mm, 37° cw

df3: 0.48±0.02mm, 53° cw

Acknowledgements

I am grateful to Alex Chirokov for making this work possible and to Andrew Pochinsky
for his thoughtful comments.

Miscellanea

To see the role of phase information, observe what happens to the original
picture when the FFT phase channel is zeroed. The
amplitudes are the same, but all the waves are shifted to a meaningless common
origin, which results in an unrecognizable interference picture.
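
The effect of zeroing the phase can be reproduced numerically. The following NumPy sketch uses a random image as a stand-in for the page; the variable names are assumptions for illustration, and zero phase is taken to mean a phase of 0 radians for every wave.

```python
import numpy as np

rng = np.random.default_rng(3)
image = rng.random((32, 32))     # hypothetical greyscale image

amplitude = np.abs(np.fft.fft2(image))

# Same amplitudes, every phase set to zero, then back to image space.
zero_phase = np.fft.ifft2(amplitude).real

# The amplitude spectrum of the result is unchanged...
same_amplitudes = bool(np.allclose(np.abs(np.fft.fft2(zero_phase)), amplitude))

# ...but every wave now has a maximum at the same pixel, the array origin.
brightest = np.unravel_index(np.argmax(zero_phase), zero_phase.shape)
print(same_amplitudes, brightest)
```

All waves pile up at a single common origin, which is why nothing of the original layout survives.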