Principles of Sound Synthesis

This article discusses the principles, techniques and popular equipment used to synthesise
musical instrument sounds.

The Structure of Sound

Sound is the perceived vibration (oscillation) of
air resulting from the vibration of a sound source (e.g. guitar sound
board, speaker cone, hair dryer, etc). We can describe any regular
(periodic) vibration, and hence the resulting waveform, as the sum of
simpler vibrations (harmonics). Each harmonic is a simple sine wave
(often called a pure tone) with its own frequency and amplitude. The
graph below shows how a simple pure tone varies with respect to time.

The graph below shows a more complex waveform comprised
of three pure tones of different respective frequencies and amplitudes.

The relative amplitudes and frequencies of the harmonics, and how they
change over time, define timbre, i.e. the character of the sound we hear. The
graph below shows how plotting amplitude against frequency for a
given waveform gives us a visual representation of its harmonic
content.

Note, though, that the harmonic content will change over time if the waveform
doesn't repeat, in other words if it isn't periodic.
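As a minimal sketch of the amplitude-against-frequency view described above, the following builds a waveform from three pure tones and recovers their amplitudes with a direct discrete Fourier transform. The frequencies, amplitudes, sample rate and function name are illustrative assumptions, not values from the article, and a real analysis tool would use a fast Fourier transform rather than this slow direct sum.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Return the amplitude in each frequency bin from 0 up to (not including) N/2."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        # A real sine of amplitude A puts A*N/2 into its bin, so rescale.
        scale = 1.0 / n if k == 0 else 2.0 / n
        mags.append(abs(acc) * scale)
    return mags

SAMPLE_RATE = 800   # samples per second (assumed)
N = 80              # 80 samples give a bin spacing of 800 / 80 = 10 Hz

# Three pure tones: (frequency in Hz, amplitude). Illustrative values only.
PARTIALS = [(100, 1.0), (200, 0.5), (300, 0.25)]
wave = [sum(a * math.sin(2 * math.pi * f * i / SAMPLE_RATE) for f, a in PARTIALS)
        for i in range(N)]

spectrum = dft_magnitudes(wave)
# Keep only the bins that hold significant energy, keyed by frequency in Hz.
peaks = {k * SAMPLE_RATE // N: round(m, 3)
         for k, m in enumerate(spectrum) if m > 0.01}
print(peaks)  # {100: 1.0, 200: 0.5, 300: 0.25}
```

Because the waveform here repeats exactly within the analysed window, the spectrum shows three clean peaks. A waveform that changes over time would need to be analysed in successive short windows.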

Synthesis provides a means of constructing timbres
to emulate existing instruments and create new sounds. There
are two approaches to synthesising these timbres:

1) Analysis of the sound itself and directly trying to emulate
it.

2) Analysis of the physical workings
of the musical instrument and trying to model its sound generation.

Acoustic Sound Generation

Most musical instruments can be considered as
consisting of three parts:

Excitation Source – this gives the system energy, for instance
a violin bow, plectrum, hammer or player’s breath.

Wave-guide – this is the main part of the instrument that oscillates
(for instance the string of a guitar or the air column in a flute). When
the system oscillates in a steady periodic fashion it produces a complex
waveform that can be expressed in terms of a fundamental frequency and harmonics.

Resonator – this primarily takes energy away from the wave-guide
to produce sound. The resonator is typically the sound box, bell or body
of the instrument. The resonator will oscillate in its own way in sympathy
with the wave-guide, so changing the oscillation of the wave-guide and modifying the
resulting timbre. The sympathetic oscillating properties of the
resonator are often referred to as formants. These formants have an important
bearing on the quality of the sound produced.

When we synthesise sound we need to consider levels
of excitation, oscillation/harmonics and how the system releases energy
to be heard as sound. We will first review
some classic synthesis techniques and then explore some relatively new
‘state of the art’ avenues for sound synthesis.

Additive Synthesis

A C3 Hammond Organ

The classic Hammond C3 shown allows the musician to build
the harmonic content of a sound by adding in ‘flutes’ (approximating
to sine waves) of progressively higher frequencies. This
is done by pulling out drawbars to varying extents, so increasing the
signal levels of the respective harmonics.
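The drawbar idea can be sketched as a weighted sum of sine waves. The pitch ratios below follow the standard nine Hammond drawbar footages (16', 5 1/3', 8', 4', 2 2/3', 2', 1 3/5', 1 1/3', 1'), expressed relative to the 8' drawbar. The linear mapping from drawbar position to level and the function name are simplifying assumptions made here for illustration.

```python
import math

# Pitch of each of the nine drawbars as a multiple of the played note,
# taking the 8' drawbar as 1.
DRAWBAR_RATIOS = [0.5, 1.5, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 8.0]

def organ_sample(note_hz, drawbars, t):
    """Return the organ output at time t (seconds).

    drawbars: nine settings from 0 (pushed in) to 8 (fully out).
    The linear position-to-level mapping is an assumption, not a
    measurement of a real instrument.
    """
    if len(drawbars) != len(DRAWBAR_RATIOS):
        raise ValueError("expected nine drawbar settings")
    return sum((pos / 8.0) * math.sin(2 * math.pi * note_hz * ratio * t)
               for pos, ratio in zip(drawbars, DRAWBAR_RATIOS))

# Example registration (illustrative): first three drawbars fully out.
registration = [8, 8, 8, 0, 0, 0, 0, 0, 0]
one_cycle = [organ_sample(110.0, registration, n / 44100.0) for n in range(401)]
```

Pulling a drawbar further out raises only the level of its own partial, which is what makes the approach additive.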

Hammond organs are designed to emulate the steady state
of forced response instruments such as brass, woodwinds and bowed strings.
Little effort is made to emulate the transient response, though the player
can choose to make the second or third harmonic more prominent at the
start of a note.

Vibrato can be achieved from the spinning horn of
a Leslie speaker. When the horn rotates, the
resultant Doppler shift of frequency superimposes on the original signal,
giving a pleasing phasing effect. More sophisticated
computer-based additive synthesisers exist, giving the programmer greater
control over harmonic and temporal parameters.

Subtractive Synthesis

A minimoog.

The classic minimoog shown provides a number of harmonically
rich oscillator waveforms such as sawtooth, pulse and square waves, which can be filtered
to emulate the timbre of real instruments.
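A minimal sketch of the subtractive idea: start with a harmonically rich sawtooth and remove upper harmonics with a low-pass filter. The one-pole filter used here is an assumption chosen for brevity. It is far gentler than the filter in a real analogue synthesiser, and the naive sawtooth is not band-limited, so this illustrates the principle rather than the sound.

```python
import math

def sawtooth(freq_hz, t):
    """Naive (non-band-limited) sawtooth in the range -1 to 1."""
    phase = (freq_hz * t) % 1.0
    return 2.0 * phase - 1.0

def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """Smooth a signal with y[n] = y[n-1] + a * (x[n] - y[n-1])."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    y = 0.0
    out = []
    for x in samples:
        y += a * (x - y)
        out.append(y)
    return out

SAMPLE_RATE = 44100.0  # assumed
raw = [sawtooth(440.0, n / SAMPLE_RATE) for n in range(4410)]
# A low cutoff removes most of the upper harmonics, giving a duller tone.
dull = one_pole_lowpass(raw, 200.0, SAMPLE_RATE)
```

Raising `cutoff_hz` lets more harmonics through and brightens the sound. Sweeping it over time is what an envelope applied to the filter does.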

Transient and steady state responses of the instrument are
emulated by a time-dependent voltage envelope usually consisting of attack,
decay, sustain and release sections whose respective durations and levels can be
controlled by the user.
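The envelope can be sketched as a piecewise-linear function of time. The parameter names, the linear segments and the assumption that the key is released after the decay has finished are all choices made here for illustration; real envelope generators usually have exponential segments.

```python
def adsr(t, attack, decay, sustain_level, release, note_off):
    """Envelope level (0 to 1) at time t seconds after the key is pressed.

    attack, decay, release: segment durations in seconds (all > 0).
    sustain_level: level held while the key stays down (0 to 1).
    note_off: time at which the key is released; assumed to be
              later than attack + decay.
    """
    if t < 0.0:
        return 0.0
    if t < attack:                      # rise from 0 to full level
        return t / attack
    if t < attack + decay:              # fall from full level to sustain
        return 1.0 - (1.0 - sustain_level) * (t - attack) / decay
    if t < note_off:                    # hold while the key is down
        return sustain_level
    if t < note_off + release:          # fade out after release
        return sustain_level * (1.0 - (t - note_off) / release)
    return 0.0

# Multiply any oscillator output by the envelope to shape its loudness.
levels = [adsr(n / 100.0, 0.1, 0.2, 0.5, 0.3, 1.0) for n in range(150)]
```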

Such an envelope can also control the filter section,
modifying the filtering of the waveforms with respect to time. Vibrato
can be achieved by modulating the main oscillator with
a low frequency oscillator. Two oscillators
of similar frequencies (one slightly detuned) together give a pleasing
phasing effect and add more weight to the sound. Although
not too bad at brass and strings, the Moog doesn’t do a particularly
good job of impersonating anything but itself.

Wave equations beyond the 1D case

For many types of instruments where the wave-guide has more
than one dimension (e.g. plates, bars and membranes) standing waves may be
possible along a number of axes. For instance, we could show that a rectangular
plate has many more modes of vibration than a harmonic series
allows. The range of possible inharmonic frequencies increases
further when we consider bars, bells, membranes and circular plates, the
latter having so many inharmonic frequencies that no distinct fundamental
dominates.
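To illustrate how quickly the modes of a plate depart from a harmonic series, the sketch below assumes a thin rectangular plate with simply supported edges, for which the frequency of mode (m, n) is proportional to (m/Lx)² + (n/Ly)². That boundary condition and the function name are assumptions made here; other edge conditions give different ratios.

```python
def plate_mode_ratios(lx, ly, max_index=3):
    """Mode frequencies of a simply supported rectangular plate,
    relative to the lowest (1, 1) mode.

    lx, ly: side lengths (any consistent unit).
    Returns a list of ((m, n), ratio) pairs sorted by ratio.
    """
    def relative(m, n):
        return (m / lx) ** 2 + (n / ly) ** 2

    base = relative(1, 1)
    modes = [((m, n), relative(m, n) / base)
             for m in range(1, max_index + 1)
             for n in range(1, max_index + 1)]
    return sorted(modes, key=lambda item: item[1])

# For a square plate the ratios come out as 1, 2.5, 2.5, 4, 5, 5, 6.5, 6.5, 9
# rather than the 1, 2, 3, 4, ... of a harmonic series.
for mode, ratio in plate_mode_ratios(1.0, 1.0):
    print(mode, round(ratio, 2))
```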

Simply constructing or filtering a harmonic series
will be insufficient to model these important cases. Analysis
of a well-made bell will show that although some ‘overtones’
do not correspond to a harmonic series they may correspond to other
consonant intervals such as thirds and minor thirds giving the sound
a musically useful character.

Frequency Modulation

A Yamaha DX7

In theory, any instrument sound can be emulated with six
sinusoids operating as frequency modulators of each other. For instance, the
wave shape of a carrier sine wave can be radically changed by a second, modulating
wave that serves as an input to the carrier's sine function.

For instance consider two sine waves, each with its own
amplitude and frequency. We
can modulate one with the other as follows
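A minimal sketch of this modulation, assuming the phase-modulation form commonly used to describe FM synthesis, x(t) = A_c sin(2π f_c t + A_m sin(2π f_m t)). The symbols A_c, A_m, f_c, f_m and the function names are assumptions introduced here for illustration and are not taken from the article.

```python
import math

def fm_sample(t, carrier_hz, carrier_amp, mod_hz, mod_amp):
    """One sample of a two-operator FM signal at time t (seconds).

    The modulator's output is added to the phase of the carrier, so
    mod_amp controls how strongly the carrier's shape is altered.
    """
    modulator = mod_amp * math.sin(2 * math.pi * mod_hz * t)
    return carrier_amp * math.sin(2 * math.pi * carrier_hz * t + modulator)

def sideband_frequencies(carrier_hz, mod_hz, order):
    """Frequencies at which FM places energy: carrier plus or minus k * modulator."""
    return sorted(carrier_hz + k * mod_hz for k in range(-order, order + 1))

SAMPLE_RATE = 44100.0  # assumed
tone = [fm_sample(n / SAMPLE_RATE, 440.0, 1.0, 110.0, 2.0) for n in range(4410)]
```

With `mod_amp` set to zero the result is the plain carrier sine wave. Increasing it moves more energy into the sidebands, which is how a single pair of operators produces a range of timbres.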

Looking at the frequency spectrum of x…

We can see that extra overtones are added to the more familiar
harmonics f, 2f, 3f, 4f, etc. Having just one
modulating oscillator does not give spectacular results. Usually an
FM synth will have 4 to 6 oscillators (‘operators’) that
can be routed in a variety of ways (‘algorithms’), the carrier
always being the last in the modulation chain. The
technique is notoriously hard to program, most musicians opting for
the subtle tweaking of presets rather than full-on programming. Though
the technique does well at impersonating electric pianos, bells and
xylophones, acoustic pianos and guitars don't sound so convincing.

Sampling

A Fairlight: once the price of a house, but you can now have
better sampling on your PC for nothing.

The Fairlight shown here is an early and expensive pioneer of the sampler
(following on from earlier tape-based Mellotrons).
Here the sound of a given note of a musical instrument is digitally
recorded. When the user presses the corresponding
key on an electronic keyboard the sound is replayed. If
the user presses a different key at a different key strike pressure,
the sound is pitch shifted (by changing the speed of playback) and its volume
modified appropriately. The problem with this is that the resulting
timbre will not reflect that of a real instrument. For instance if you
sample a sung ‘hello’ at middle C, the voice will sound
distinctly munchkin-like when played just a few notes higher. As
the timbre of real musical instruments varies considerably with pitch and volume
dynamics, samplers need to compensate for this with multiple samples for various
pitch ranges and keyboard strike pressures. The
result is that the more information recorded, the more convincing the sound.
This causes a problem if memory is limited, since ten seconds of
uncompressed stereo CD quality audio takes up about 1.7 MB (44,100 samples
per second × 2 bytes × 2 channels × 10 seconds). However, by streaming
data directly from a hard disk, sample files of the order of gigabytes can be used. The technique is employed by Nemesys’s
‘Gigasampler’ software.
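Two pieces of arithmetic from the paragraph above can be sketched directly: the storage needed for uncompressed audio, and pitch shifting by changing playback speed. The function names are hypothetical, and the linear interpolation is a simple assumption; it shows why the shifted note is also shorter, which is part of the munchkin effect.

```python
def pcm_bytes(seconds, sample_rate=44100, channels=2, bytes_per_sample=2):
    """Storage for uncompressed audio; defaults are CD quality stereo."""
    return int(seconds * sample_rate * channels * bytes_per_sample)

def playback_ratio(semitones):
    """Playback speed needed to shift pitch by a number of equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)

def repitch(samples, semitones):
    """Pitch shift by stepping through the recording faster or slower.

    Reads between stored samples with linear interpolation. Shifting up
    shortens the sound and shifts every formant up with it.
    """
    ratio = playback_ratio(semitones)
    out = []
    pos = 0.0
    while pos < len(samples) - 1:
        i = int(pos)
        frac = pos - i
        out.append(samples[i] * (1.0 - frac) + samples[i + 1] * frac)
        pos += ratio
    return out

print(pcm_bytes(10))                      # 1764000 bytes for ten seconds
octave_up = repitch(list(range(100)), 12)  # half as many samples as the original
```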

Nemesys’s ‘Gigasampler’ software.

Modern samplers are getting better at pianos and strings
though still fail to deliver convincing guitar sounds. The non-linearities
of the excitation of a plucked string and the timbral changes between
frets help define the sound so much that sampling alone fails. Another
way to avoid the munchkin effect during pitch shifting is to try to
preserve the duration of the signal with stretching algorithms and, more
importantly, the formants of the resonator. Such techniques are employed
by Roland’s much acclaimed ‘VariPhrase’ samplers and
studio effects units like Antares’ ‘Auto-Tune’ (used
to try to make boy and girl bands who can’t sing sound less obnoxious,
as if that’s possible).

Roland’s VP9000 VariPhrase sampler

All the above techniques have serious problems when trying
to deliver the sounds of real musical instruments. The classic instruments
so far shown have…

Do the instruments thoroughly account for damping, feedback and second order
resonance?

3) The parameters
used tend to have no relation to the physical parameters of the instrument.

Virtual Modular Synthesis

Rather than using specific hardware, relatively ordinary
PCs with popular sound cards have the ability to emulate oscillators, ADSRs,
filters, frequency modulators, samplers, etc in real time. The generic nature
of PCs allows users to connect up virtual synthesiser components in many familiar
and totally new ways via intuitive user interfaces. The result can be very
impressive. For instance, it's hard to tell the difference between a real minimoog
and a virtual minimoog created in Native Instruments’ Generator software. Synthesiser
components can be selected from menus, dragged around, inter-connected and
assigned virtual controllers such as knobs and sliders.

The recent development and
popularity of virtual modular synths has been facilitated by:

The popularity of personal computers, which are faster, cheaper and more amenable than ever.

Ever faster CPUs: Intel and AMD have recently released 1 GHz processors for PCs.

Generic Application
Programming Interfaces (DirectX, VST2.0), so that software developers can talk
to operating systems and applications to create sounds at a high level in
generic terms rather than in a hardware-specific way. For instance, a developer
who writes a chorus/flange effect using DirectX function calls can have that
effect integrated into numerous different sequencers like Cubase and
Cakewalk, sound editors like Sound Forge and Cool Edit, and even virtual synthesisers
like VAZ and Generator.

Examples include
Native Instruments’ Generator, Seer Systems’ Reality, Syn-C Modular, Dreamstation
and Vaz Modular. If you are interested in synthesis and have access to a reasonable
computer and MIDI keyboard I strongly recommend you visit http://www.hitsquad.com/smm where you can
download perfectly usable shareware versions of SynC and Dreamstation as well
as countless demos and some freeware.

Here's Syn-C modular used to construct a virtual Hammond Organ.

Sync Modular – impressive modular synthesis shareware.

Digital Wave-guide Modelling

This is a recent development in sound synthesis
whereby the physics behind musical instruments is applied indirectly to sound
generation. The whole point of this module has been to explain some
of the physics of musical instruments so that in the future you may
apply your knowledge to improving or creating musical sounds. Physical
modelling is the application of the principles learned.

Instead of creating the sound directly with oscillators,
models of the physical processes that produce sound are created. The
result is

more expressive and realistic sounds

a technique ideal for software only implementation on generic PCs

no need for dedicated hardware or lots of memory

The main difficulty is the potentially huge demand on the
CPU if the model is not optimised. However, PCs are getting faster.

Brute force approach

Physical modelling in its purest form would use numerical
methods to solve equations of motion with respect to boundary conditions.
We have spent a lot of time discussing the equations of motion (wave
equations) for a variety of systems and examined in some depth the nature
of excitation and boundary conditions. We could implement much of the
maths learned directly, effectively recreating musical instruments in
the computer, though this can be computationally expensive.

Better Approach

Partially solve the
equations for changing parameters using:

lookup tables

lumped processes

novel algorithms

Delay Lines

Model string as wave-guide

When we pluck a string we create waves that move along
the string. The above diagram shows how arrays
of values called ‘Delay Lines’ can be used to represent
a virtual string (wave-guide). Values representing energy move along
the arrays being modified (filtered) at Bridge and Nut terminations.
High frequency energy may be filtered more than low frequency energy
hence the nature of oscillation is changed by the string terminations
over time.
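A minimal sketch of such a virtual string, using two delay lines for the waves travelling towards the bridge and towards the nut. The class and method names, the triangular pluck shape, the two-point averaging filter at the bridge and the damping value are all assumptions made here for illustration, not a description of any particular product.

```python
from collections import deque

class WaveguideString:
    """Two delay lines carrying displacement waves in opposite directions.

    Index 0 is the nut end and index length - 1 is the bridge end.
    """

    def __init__(self, length, damping=0.99):
        self.right = deque([0.0] * length)  # travelling nut -> bridge
        self.left = deque([0.0] * length)   # travelling bridge -> nut
        self.damping = damping
        self._previous_at_bridge = 0.0

    def pluck(self, position, amount=1.0):
        """Add a triangular displacement peaking at position (0 to 1 along the string)."""
        n = len(self.right)
        peak = max(1, min(n - 2, int(position * n)))
        for i in range(n):
            shape = i / peak if i <= peak else (n - 1 - i) / (n - 1 - peak)
            # Half the displacement goes into each travelling wave.
            self.right[i] += 0.5 * amount * shape
            self.left[i] += 0.5 * amount * shape

    def step(self):
        """Advance both waves by one sample and reflect them at the ends."""
        at_bridge = self.right.pop()
        at_nut = self.left.popleft()
        # Bridge: invert, lose a little energy, and average adjacent samples
        # so that high frequencies decay faster than low ones.
        smoothed = 0.5 * (at_bridge + self._previous_at_bridge)
        self._previous_at_bridge = at_bridge
        self.left.append(-self.damping * smoothed)
        # Nut: treated as rigid, so the wave is simply inverted.
        self.right.appendleft(-at_nut)

    def tap(self, position):
        """Read the string displacement at a point (0 to 1 along the string)."""
        i = int(position * (len(self.right) - 1))
        return self.right[i] + self.left[i]

string = WaveguideString(50)
string.pluck(0.3)
output = []
for _ in range(3000):
    string.step()
    output.append(string.tap(0.2))
```

The round trip is twice the delay-line length, so at a sample rate of 44,100 Hz this 50-sample string would sound at roughly 441 Hz. Changing `position` in `pluck` or `tap` alters which harmonics are emphasised, much as on a real string.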

Some of the energy may be taken from the system at a given point to be converted to sound. For
instance, to model an electric guitar:

Steve Howe of Yes

String plucked at given position (energy added to delay line)

The virtual string vibrates (energy moves along the delay lines and is filtered at the terminations)

Virtual pickups tap the energy (data from a point along the array is extracted)

We can further create an electric guitar sound by…

Feeding energy to a virtual distortion pedal along virtual cables to a virtual amp

Adding delayed energy back into the delay line to create a feedback loop
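The last two steps can be sketched as a simple signal chain. The `tanh` soft clipper standing in for the distortion pedal, the parameter values and the function name are assumptions chosen for illustration. In a fuller model the fed-back signal would be injected into the string's delay lines rather than summed at the input as it is here.

```python
import math

def amp_chain(pickup_signal, gain=5.0, delay_samples=100, feedback_gain=0.3):
    """Distort a pickup signal and feed a delayed copy of the output back in.

    gain: how hard the signal is driven into the soft clipper.
    delay_samples: delay before the output is fed back, in samples.
    feedback_gain: fraction of the delayed output that is fed back.
    """
    out = []
    for n, x in enumerate(pickup_signal):
        fed_back = feedback_gain * out[n - delay_samples] if n >= delay_samples else 0.0
        # tanh leaves quiet signals almost unchanged and squashes loud ones,
        # which adds harmonics and keeps the feedback loop bounded.
        out.append(math.tanh(gain * (x + fed_back)))
    return out

# A single click shows the feedback path: it returns after delay_samples.
click = [1.0] + [0.0] * 250
processed = amp_chain(click)
```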

The Yamaha VL1 (1994) was the first commercial physical modelling synthesiser

There is a whole gamut of physics that can be applied to
physical modelling. Physical modelling gives the programmer much more control
over how a system will respond to different levels of excitation. This gives
the musician access to a much more expressive and realistic sounding instrument.

Ironically, early Hammonds and Moogs are now being physically
modelled as instruments in their own right. The tone wheels of a Hammond organ
and the analogue circuits of a real Moog exhibit non-linear behaviours that give
these instruments character and depth. These two classic cases show that a
poor synthesiser of the sound of real musical instruments is by no means a
bad musical instrument.

Re-use of material permitted provided it
is clearly labelled "(c) University of Salford, www.acoustics.salford.ac.uk"