It often happens that the model which is most natural from a
conceptual (and manipulation) point of view is also the most effective
from a compression point of view. This is because, in the ``right''
signal model for a natural sound, the model's parameters tend to vary
quite slowly compared with the audio rate. As an example,
physical models of the human voice and musical instruments have led to
expressive synthesis algorithms which can also represent high-quality
sound at much lower bit rates (such as MIDI event rates) than normally
obtained by encoding the sound directly [46,259,262,154].
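
To make the rate argument concrete, the following minimal sketch synthesizes a single sinusoid whose amplitude and frequency are specified only 100 times per second and linearly interpolated up to the audio rate. The function name `synthesize_tone`, the frame rate of 100 Hz, the sampling rate of 44100 Hz, and the decaying 440 Hz test tone are all illustrative assumptions, not taken from the cited work.

```python
import math

def synthesize_tone(frame_amps, frame_freqs, frame_rate=100, sample_rate=44100):
    """Synthesize one sinusoid from per-frame amplitude and frequency values.

    Hypothetical helper: parameters arrive at frame_rate and are linearly
    interpolated to sample_rate.
    """
    hop = sample_rate // frame_rate  # audio samples per parameter frame
    samples = []
    phase = 0.0
    for k in range(len(frame_amps) - 1):
        for n in range(hop):
            t = n / hop  # position within the current frame, 0 <= t < 1
            amp = (1 - t) * frame_amps[k] + t * frame_amps[k + 1]
            freq = (1 - t) * frame_freqs[k] + t * frame_freqs[k + 1]
            samples.append(amp * math.sin(phase))
            phase += 2 * math.pi * freq / sample_rate  # accumulate phase
    return samples

# One second of a decaying tone gliding slightly down from 440 Hz (assumed values).
num_frames = 101
frame_amps = [math.exp(-3 * k / (num_frames - 1)) for k in range(num_frames)]
frame_freqs = [440.0 - 5.0 * k / (num_frames - 1) for k in range(num_frames)]

samples = synthesize_tone(frame_amps, frame_freqs)
num_params = len(frame_amps) + len(frame_freqs)
print(len(samples), "audio samples from", num_params, "parameter values")
```

In this toy case, 202 slowly varying parameter values stand in for 44100 audio samples. A real coder would also quantize and entropy-code the parameters, which this sketch does not attempt.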

The sines+noise+transients spectral model follows a natural perceptual
decomposition of sound into three qualitatively different components:
``tones'', ``noises'', and ``attacks''. This compact representation
for sound is useful for both musical manipulations and data
compression. It has been used, for example, to create an audio
compression format comparable in quality to MPEG-AAC [24,25,16]
(at 32 kbps), yet it can be time-scaled or frequency-shifted
without introducing objectionable artifacts [149].
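
As a rough illustration of why such a parametric description is easy to manipulate, the sketch below renders a toy sound from separate ``tones'', ``noises'', and ``attacks'' parameters, and performs time scaling and frequency shifting by changing parameters rather than by processing the waveform. Everything here is an assumption made for illustration: the `render` function, the parameter dictionary layout, the use of uniform white noise, and the choice to leave the attack unstretched under time scaling. It is not the codec described in the cited references.

```python
import math
import random

def render(params, time_scale=1.0, freq_scale=1.0, sample_rate=8000, seed=0):
    """Render a toy sines+noise+transients description (hypothetical layout)."""
    rng = random.Random(seed)
    n_total = int(params["duration"] * time_scale * sample_rate)
    # Assumption: the attack keeps its original length when the sound is stretched.
    n_attack = int(params["attack_duration"] * sample_rate)
    out = []
    for n in range(n_total):
        t = n / sample_rate
        # Shared decay envelope, stretched along with the overall duration.
        env = math.exp(-params["decay"] * t / time_scale)
        # "Tones": a sum of sinusoids whose frequencies are scaled directly.
        tone = sum(a * math.sin(2 * math.pi * f * freq_scale * t)
                   for a, f in params["partials"])
        # "Noises": white noise at a fixed level (no spectral shaping here).
        noise = params["noise_level"] * rng.uniform(-1.0, 1.0)
        # "Attacks": a short noise burst with a linearly decaying envelope.
        if n < n_attack:
            attack = params["attack_level"] * rng.uniform(-1.0, 1.0) * (1 - n / n_attack)
        else:
            attack = 0.0
        out.append(env * (tone + noise) + attack)
    return out

# Illustrative parameter values only.
params = {
    "duration": 0.5,            # seconds
    "decay": 4.0,               # envelope decay rate, 1/seconds
    "partials": [(0.5, 220.0), (0.25, 440.0)],  # (amplitude, frequency in Hz)
    "noise_level": 0.05,
    "attack_duration": 0.01,    # seconds
    "attack_level": 0.2,
}

original = render(params)
stretched = render(params, time_scale=2.0)   # twice as long, same pitch
shifted = render(params, freq_scale=1.5)     # same length, higher pitch
print(len(original), len(stretched), len(shifted))
```

Because pitch, duration, and attack length are separate parameters in this sketch, stretching the sound does not lower its pitch or smear its onset, which is the kind of artifact that waveform-domain resampling would introduce.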