1/f Music

In [1], R. F. Voss and J. Clarke found that if you take the melodies of songs
from the radio as sequences of numbers, the power spectra of these sequences
follow a 1/f line. In [2] they produced random melodies whose spectra were
either flat ("white noise"), 1/f^2 ("Brownian" or "brown" noise), or slopes
in between, including the exact midpoint, 1/f ("pink" or "flicker" noise).
Experimental subjects picked 1/f sequences as the most musical and pleasant
sounding, neither too random, like white noise, nor too boring, like Brownian
noise (a pre-echo of the "edge of chaos" idea).
This added to the semi-mystical literature on 1/f spectra,
and started a major branch of fractal computer music composition.
The work was publicized in Martin Gardner's Scientific American column [3].

Waveforms vs. Melody Lines

Just to be absolutely clear, the noise we're talking about is not audio noise
(although there are white, pink and brown noises in the audio spectrum too).
Just as sound can be represented as sequences of (say) 44 thousand numbers
per second, with frequencies from 20 thousand cycles per second down to
twenty cycles per second,
so a melody line can be represented as a sequence of numbers,
with each number representing the height of a note on the musical staff.
These numbers change
as often as a new note is played, and range in frequency from the fastest
figures being played (e.g. sixty-fourth notes at one quarter note per second
would be 16 numbers, or 8 cycles, per second), down to a frequency of one
cycle in the duration of a piece (e.g., if a piece lasts an hour, that's
1/3600 cycles per second).
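
The arithmetic in that example can be checked with a few lines of Python. This is only an illustrative sketch; the function name and its parameters are made up for the purpose.

```python
def melody_frequency_range(notes_per_second, duration_seconds):
    """Return (highest, lowest) frequency in cycles per second for a note sequence."""
    # The fastest up-down wiggle takes two notes, so the top frequency
    # is half the note rate.
    highest = notes_per_second / 2
    # The slowest component completes one cycle over the whole piece.
    lowest = 1 / duration_seconds
    return highest, lowest

# Sixty-fourth notes at one quarter note per second give 16 notes per second;
# the piece is assumed to last one hour.
high, low = melody_frequency_range(notes_per_second=16, duration_seconds=3600)
print(high, low)
```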

Random sequences of numbers are called "noise" even though the sequences of
note numbers we're talking about are at frequencies below the range of
human hearing. Also, we'll still be talking about "power", which varies like
the square of amplitude, even though these aren't sequences of voltages.

White, Brown, and Pink Noise

If each number is an independently-generated random number, then you have white
noise. Over a long time, the Fourier transform of a sequence of random numbers
has about the same amplitude, and thus the same power, at every frequency.

Brownian noise is white noise integrated. That is, each number in the sequence is
a random number in a range centered on zero, added to the previous
number. With Brownian noise, the amplitude at each frequency is inversely
proportional to the frequency, and since the power is the square of the
amplitude, it's proportional to 1/f^2.
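
As a minimal sketch of the two easy cases, here is white noise as independent random numbers and Brownian noise as its running sum. The uniform range of -1 to 1, the function names and the seed are assumptions chosen for illustration.

```python
import random

def white_noise(n, rng):
    """n independent random numbers in a range centered on zero."""
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]

def brown_noise(n, rng):
    """Running sum of white noise: each value is the previous one plus a random step."""
    total = 0.0
    out = []
    for step in white_noise(n, rng):
        total += step
        out.append(total)
    return out

rng = random.Random(1)  # arbitrary fixed seed so the run is repeatable
white = white_noise(8, rng)
brown = brown_noise(8, rng)
print([round(x, 2) for x in white])
print([round(x, 2) for x in brown])
```

The white values jump around independently, while each Brownian value stays within one step of its predecessor, which is why it sounds like it wanders.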

Noise is "pink" when the power at each frequency is proportional to 1/f, which means
the amplitude is like 1/sqrt(f). This isn't as easy to arrange as white or brown
noise. Pink noise (or any other spectrum shape) can be made with inverse Fourier
transforms, or with multi-stage filters applied to white noise, but Voss found an
easy way to get pretty-good pink noise.

Voss's Pink Noise Generator

The key is that pink noise contains the same amount of power in each octave.
From one octave to the next, there are twice as many frequencies, but the 1/f
rolloff means that each has half the average power. That means that a good approximation
to pink noise can be had by adding the outputs of a series of random number generators,
one for each octave of the noise you want to generate. For instance, a melody with
sixteen notes can be generated with four random number generators being updated at
every note, every other note, every fourth note, and every eighth note:

Alternatively, we can draw this as a binary tree of numbers.

The best information I've seen on generating 1/f noise with computers is [4].
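
The sixteen-note, four-generator scheme described in this section can be sketched as follows. The value range of -3 to 3 per generator, the names and the seed are assumptions; only the update schedule (every note, every other note, every fourth, every eighth) comes from the description above.

```python
import random

def voss_tracks(n_notes=16, n_generators=4, low=-3, high=3, seed=None):
    """Return one row per generator; row k holds generator k's value at each note."""
    rng = random.Random(seed)
    current = [0] * n_generators
    rows = [[] for _ in range(n_generators)]
    for i in range(n_notes):
        for k in range(n_generators):
            # Generator k gets a fresh random value every 2**k notes.
            if i % (2 ** k) == 0:
                current[k] = rng.randint(low, high)
            rows[k].append(current[k])
    return rows

def voss_melody(rows):
    """Add the generators' values note by note."""
    return [sum(column) for column in zip(*rows)]

rows = voss_tracks(seed=4)
melody = voss_melody(rows)
for row in rows:
    print(row)
print(melody)
```

Printing the rows shows the slowest generator holding one value for eight notes at a time, while the fastest changes on every note; the melody is their sum.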

Pyramid Music

This adding of a number to all the numbers in a sequence,
taking all the notes in a phrase and moving them all up or down the scale by the
same amount, is called transposition, and it happens all the time to phrases
within songs (think of the melody to "My Baby Does the Hanky Panky").
Phrases can be repeated by having nodes in the tree share subtrees. The idea
of "pyramid music" is to give the levels one, two, three, four nodes, etc.,
instead of one, two, four, eight...

These hierarchically-combined, transposed phrases
still have 1/f spectra (although more spiky),
but sound more musical than plain 1/f melodies.
Repetition and transposition give you themes and melodies,
and the two together make it sound as if something's going on here,
as if somebody's doing something on purpose. Pink melodies
sometimes sound childish or pedantic, but always as if they're
intended to be music even when they're really bad.
Artificial intelligence programs often act alien, stiff or inscrutable;
sometimes they make mistakes that no person would make,
but I think consistent artificial childishness or human-like stupidity
holds some kind of clue that's rare in AI.

Structure Choices

In Voss's pink noise generator, each node in the tree has two subtrees all its own,
and one random number that transposes the combination of the two subphrases.
Forcing sharing of subtrees means the nodes have to choose which subphrases to combine.
(You can see in the diagram above that the pattern of lines is no longer uniform;
I've chosen the connections semi-randomly.)
I call these choices "structure choices," and I keep them separate from the choices
of transpose amounts. The reason is that when I add parallel sequences of numbers
related to the same song,
for instance dynamics or the circle-of-fifths sequence explained below,
the parallel sequences need to follow the same song structure, yet have different
choices corresponding to the transpose amounts in the melody.

It would be easy for the random structure choices to use some phrases a lot, and
leave other phrases out altogether. I add the constraint that every phrase (or note)
from one layer be used in the layer above. The diagram above follows this rule.
The way the program enforces the rule is just to redo the whole set of structure choices
for a layer until they satisfy the rule. (I spent a lot of time sweating over
inventing a smarter method before seeing this obvious way!)
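
The redo-until-it-fits rule is easy to sketch. The version below assumes each phrase combines exactly two phrases from the layer below, picked uniformly at random (possibly the same one twice); the function name and those details are illustrative guesses rather than a description of the actual C code.

```python
import random

def choose_structure(n_below, n_above, rng):
    """Pick two phrases from the layer below for each phrase above.

    Redraws the whole layer until every phrase below is used at least once.
    """
    if 2 * n_above < n_below:
        raise ValueError("not enough slots to use every phrase below")
    while True:
        choices = [(rng.randrange(n_below), rng.randrange(n_below))
                   for _ in range(n_above)]
        used = {index for pair in choices for index in pair}
        if len(used) == n_below:
            return choices

rng = random.Random(7)  # arbitrary seed
structure = choose_structure(n_below=4, n_above=3, rng=rng)
print(structure)
```

For small layers a valid set turns up after a handful of tries, so the brute-force retry costs almost nothing.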

Doing by Copying, Syncopation

Rather than build the melody by walking a tree structure, it's easier to build
each layer as a list of notes, one layer at a time. Each new phrase is just some
contiguous set of notes from the previous layer, copied into the new layer with a
transpose added.
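
A minimal sketch of the copying step, with made-up note numbers and a hypothetical helper name:

```python
def copy_phrase(prev_layer, start, length, transpose):
    """Copy `length` contiguous notes from prev_layer, shifting each by `transpose`."""
    return [note + transpose for note in prev_layer[start:start + length]]

prev_layer = [0, 2, 4, 5, 7, 9, 11, 12]   # made-up notes from the layer below
new_layer = []
new_layer += copy_phrase(prev_layer, start=0, length=4, transpose=0)
new_layer += copy_phrase(prev_layer, start=4, length=4, transpose=-2)
print(new_layer)   # [0, 2, 4, 5, 5, 7, 9, 10]
```

Because each phrase is just a slice of a flat list, `start` can be any index into the previous layer.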

This allows syncopation: the notes from the
lower level don't have to start on a boundary where a whole phrase was assembled,
but can be offset by some number of beats. The yellow blocks in the score below
show how the "Maple Leaf Rag" uses variations of the same phrase, once starting at
the second beat of a measure, then starting at the first beat:

Scale, Key or Mode

1/f and pyramid melodies sound better on diatonic (white note) or pentatonic
(black note) scales than on a chromatic scale. Transposed phrases, in
particular, aren't usually moved an exact chromatic interval but are fitted to
the scale in use. But having the program force notes into a predetermined
scale seems arbitrary and lame, and doesn't allow for accidentals or changes
of key in mid-song. I would like the scale, mode or key of the melody to come
from something more primitive or organic somehow.

The most natural method I've found so far is to let
each note be a compromise between one sequence
that goes up and down a chromatic scale, and another sequence that goes
around the circle of fifths. I call the current incarnation of this
method "PyraQuant5." There are more details about it below.

Example Music

There are two examples of PyraQuant5 music here. One is long-playing,
but a relatively small file because it's in MIDI format:
PyraQuant5_s1134278015_qp73.mid
This is a 73-minute, 163k MIDI file consisting of about 200 22-second
piano pieces. The long name refers to the arguments the program was run
with, in particular the random number seed.

The second is a larger file that plays for a much shorter time:
PyraQuantTheme.mp3
It's a 45-second, 381k MP3 version of my favorite two pieces from the MIDI
file. They sound a bit like Vince Guaraldi's Charlie Brown music.

Everyone asks whether I've tried making the notes different lengths
instead of just pounding eighth notes. I would like to. A phrase at
any level of the hierarchy could be replaced by a single note, or a rest.
But I haven't figured out how to combine longer notes and rests with
syncopation and leading and trailing notes.

Meanwhile, an ancestor of PyraQuant called PYRAMUS7
produced long and short notes by what I think of as a cheat:
it combined any string of eighth notes at the same pitch into a single
longer note. PYRAMUS7 produces pyramid music on a diatonic scale, not
using the circle-of-fifths method. Here are 17 minutes of PYRAMUS7 songs
that I found interesting and collected years ago, now converted to MIDI:
PYRAMUS7 HITS.
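
The note-merging cheat described above amounts to run-length encoding the pitch list. Here is a sketch in Python (not the original BASIC); the function name and the MIDI-style note numbers are assumptions.

```python
from itertools import groupby

def merge_repeats(eighth_notes):
    """Collapse runs of equal pitches into (pitch, length_in_eighths) pairs."""
    return [(pitch, len(list(run))) for pitch, run in groupby(eighth_notes)]

print(merge_repeats([60, 60, 62, 62, 62, 64]))   # [(60, 2), (62, 3), (64, 1)]
```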

You might find PyraMus7HitsToTxt's 48 lines of BASIC easier to digest than
the 2500 or so lines of C that go into PyraQuant (!).
There is more about the program in the source tar file below.

PyraQuant Code Description

PyraQuant5 is my nickname for the current version of the program,
actually called pyracirc5. It's a pure C program that produces output
in MIDI, AIFF or WAV format (options -m, -a, -w).

The following is a slightly out-of-date description of how the main
program works. See the README file in the source code directory for
descriptions of the other source files involved.

I generate two streams of pyramid random numbers.
The "linear number" goes up and down the chromatic scale, like a 1/f
melody. The "circle number" goes around the circle of fifths. Generally
it doesn't go more than halfway around in either direction.

The generated numbers aren't necessarily rounded to ints (see below).
The note for the melody is a compromise between the "linear" and "circle"
numbers: for each of an octave of (exact) chromatic notes around the linear
number, it looks at the distance the note
is from the linear number, combined with the distance it is on the
circle of fifths from the circle number. The note with the
minimum combined distance is picked for the output melody.
The combining function has been Euclidean distance and Manhattan distance
(sum of the absolute values of the differences)
at various times. Right now it's Manhattan distance with varying weights.
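
A rough sketch of that note-picking step, using weighted Manhattan distance. It assumes MIDI-style note numbers (60 = middle C), a circle of fifths numbered 0 = C, 1 = G, 2 = D and so on, and a candidate window of twelve notes around the rounded linear number; the names and default weights are hypothetical, not taken from the C source.

```python
def circle_position(note):
    """Position of a note's pitch class on the circle of fifths (0 = C, 1 = G, ...)."""
    # Multiplying by 7 mod 12 maps semitone steps to fifths steps.
    return (7 * (note % 12)) % 12

def circle_distance(position, circle_number):
    """Shortest way around the 12-step circle between two positions."""
    d = abs(position - circle_number) % 12
    return min(d, 12 - d)

def pick_note(linear, circle, w_linear=1.0, w_circle=1.0):
    """Choose the chromatic note with the smallest weighted Manhattan distance."""
    center = round(linear)
    candidates = range(center - 6, center + 6)   # one octave of chromatic notes

    def cost(note):
        return (w_linear * abs(note - linear)
                + w_circle * circle_distance(circle_position(note), circle))

    return min(candidates, key=cost)

print(pick_note(60.2, 0.0, w_circle=0.0))    # 60: circle ignored, nearest note wins
print(pick_note(61.0, 0.0, w_circle=10.0))   # 60: pulled from C sharp to C
```

With the circle weight at zero the result is plain rounding; raising it drags notes toward whatever region of the circle of fifths the circle number currently sits in.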

The "pyramid" sequences are built out of phrases arranged in layers.
The bottom layer of a 2^(n-1)-note song is n independent random notes
(one-note "phrases").

In the higher levels, each phrase is built by
concatenating two adjacent phrases from the layer below, then
transposing the resulting phrase up or down by a small random amount.
Each phrase may also be inverted or time-reversed, and there are
additional tweaks having to do with syncopation and with making sure
that each phrase in a level is used at least once in the next.
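
Stripped of inversion, time reversal, syncopation and the every-phrase-used rule, the layering just described can be sketched like this. The note range, the maximum transpose and the names are assumptions made for the example.

```python
import random

def build_pyramid(n, rng, note_range=(-6, 6), max_transpose=2):
    """Return the top phrase of an n-layer pyramid as a list of note numbers."""
    # Bottom layer: n independent one-note phrases.
    layer = [[rng.randint(*note_range)] for _ in range(n)]
    while len(layer) > 1:
        next_layer = []
        # Each new phrase joins two adjacent phrases from the layer below,
        # then shifts the result by a small random amount.
        for left, right in zip(layer, layer[1:]):
            shift = rng.randint(-max_transpose, max_transpose)
            next_layer.append([note + shift for note in left + right])
        layer = next_layer
    return layer[0]

song = build_pyramid(5, random.Random(3))   # arbitrary seed
print(len(song), song)
```

Each layer has one phrase fewer than the one below and phrases twice as long, so five bottom notes end up as a single sixteen-note phrase at the top.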

Volume, balance, timbre, attack and decay were added as pyramidal
sequences too. (Some of these are only heard through AIFF or WAV
outputs, not MIDI.)

The choice of which phrases to combine into larger phrases
is the same (copies of the same rng) for all the sequences: linear,
circle-of-fifths, vol, bal, timbre, attack, decay, but the random
transposes or offsets use separate random numbers in the separate
sequences.
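
One way to picture "copies of the same rng" is to give every sequence a structure generator seeded identically and a transpose generator seeded differently. The sketch below flattens the structure to one choice per phrase just to show the idea; the seeds, ranges and function name are hypothetical.

```python
import random

def make_sequence(n_phrases, structure_seed, transpose_seed):
    """Return (structure choices, transposes) for one parallel sequence."""
    structure_rng = random.Random(structure_seed)   # same seed => same choices
    transpose_rng = random.Random(transpose_seed)   # differs per sequence
    choices = [structure_rng.randrange(n_phrases) for _ in range(n_phrases)]
    transposes = [transpose_rng.randint(-2, 2) for _ in range(n_phrases)]
    return choices, transposes

linear = make_sequence(8, structure_seed=42, transpose_seed=1)
volume = make_sequence(8, structure_seed=42, transpose_seed=2)
print(linear[0] == volume[0])   # structure choices match
print(linear[1], volume[1])     # transposes are drawn independently
```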

At some point I made zero the most common transpose, so there would
be a lot of exact copying of phrases, at least in the
melody-determining sequences (expression sequences aren't quantized and
always have some amount of "transpose" between copies of a phrase).
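
Making zero the most common transpose can be done with a weighted draw. The particular transposes and weights below are invented for illustration and are not the program's actual values.

```python
import random
from collections import Counter

TRANSPOSES = [-2, -1, 0, 1, 2]
WEIGHTS = [1, 2, 6, 2, 1]   # hypothetical weights; zero gets half the total

def pick_transpose(rng):
    """Draw one transpose amount, favoring zero (an exact copy)."""
    return rng.choices(TRANSPOSES, weights=WEIGHTS, k=1)[0]

rng = random.Random(0)  # arbitrary seed
counts = Counter(pick_transpose(rng) for _ in range(1000))
print(counts.most_common())
```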

PyraQuant: the original notes and all the phrase-transposes for the
linear and circle-of-fifths sequences are quantized to semitones.
This is the default now, with an option to turn it off.

circness: a pyramidal sequence that determines the relative importance
or weight of the circular sequence compared to the linear sequence.

Source Code

The source is in pyracirc5.tgz.
"Pyramus7HitsToTxt.txt" is a BASIC program, but the rest of the code is in C.
If you're used to building C programs in a Unix-style environment it will
seem familiar; if not, you'll probably be lost, sorry. README explains the
jobs of the various source files.