Revision as of 16:39, 26 January 2010

1 Introduction

In the late 1990s I (Henning Thielemann) developed a graphical audio signal synthesis and analysis application called Assampler for the Amiga.
However, it turned out that graphical programming does not scale well.
Thus some years later I started to rewrite much of its functionality in plain Haskell.
There is both a low-level interface and
a high-level framework for automatic inference of the sample rate and use of physical units in a sound processing network.
This generalizes the usual restricted splitting into audio rate and control rate signals.
Routines are now also suitable for real-time processing using the StorableVector library, a Stream-like data type and aggressive inlining.
There is an interface for using synthesized sounds for rendering Haskore music.
The library uses the Numeric Prelude library and its numerical type class hierarchy.

2 Overview

The aim of this project is audio signal processing using pure Haskell code.
The highlights are:

advanced framework for signal processing supported by physical units, that is, the plain data can be stored in a very simple number format, even fixed point numbers, but the sampling parameters, rate and amplitude, can be complex types, like numbers with physical units.

Unlike other software synthesizer packages there are not two global rates, namely control and sample rate. Instead there can be many different rates. The control rate of a signal processor can be bound to the rate of the processed audio signal or it can be independent. In the latter case the internal control parameters are interpolated, because these are the ones that are expensive to compute. In case of constant interpolation and integer ratio of control and sample rate, you get the behaviour known from CSound and SuperCollider.
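The special case of constant interpolation with an integer ratio between control rate and sample rate can be illustrated on plain lists. This is only a minimal sketch: the names `holdControl` and `amplify` are hypothetical and not part of the library, and plain lists are used for clarity, not speed.

```haskell
-- Hypothetical helpers for illustration; not the library's API.

-- Constant interpolation: hold every control value
-- for 'ratio' consecutive audio samples.
holdControl :: Int -> [a] -> [a]
holdControl ratio = concatMap (replicate ratio)

-- Apply a gain given at control rate to a signal given at audio rate.
-- 'ratio' is the number of audio samples per control value.
amplify :: Int -> [Double] -> [Double] -> [Double]
amplify ratio gains audio = zipWith (*) (holdControl ratio gains) audio
```

For example, `amplify 2 [1, 0.5] [1, 1, 1, 1]` applies the first gain to the first two samples and the second gain to the remaining two.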

frameworks for inference of sample rate and amplitude, that is, sampling rate and amplitude can be omitted in most parts of a signal processing expression. They are inferred automatically, just as types are inferred in Haskell's type system. Although the inference of signal parameters needs some preprocessing, the frameworks preserve the functional style of programming and do not need Arrows and the corresponding notation.

We have tried three approaches, of which the last is the most promising.

Explicitly maintain a dictionary of signal parameters in a Reader-Writer-State monad, which must be computed completely before any signal processing takes place. This forces all signal parameters to share the same type and prevents infinitely many signal processors from being involved (e.g., concatenation of infinitely many short noises).

Simulation of logic programming by lazy cycles of function applications (i.e., tied knots, fixed points). The main problems are quadratic computational complexity and cumbersome, error-prone usage. Namely, for each input you have to handle a parameter output, and vice versa, for propagation of parameters through the network. You need combinators (infix operators) for combining these functions, but you will easily run into cases where you must plug things together manually, which is a nightmare.

Unify only the sample rate. Use a Reader functor/monad. Amplitude is propagated from inputs to outputs only. This is a bit conservative, but is simple and comprehensive and fulfils our needs so far.
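The third approach can be sketched as follows, restricted to the sample rate (amplitude propagation is left out). This is a simplified illustration under stated assumptions, not the library's actual types: `SampleRate`, `Process`, `sine`, `mix` and `render` are hypothetical names, and signals are plain lists.

```haskell
-- Hypothetical names; a sketch of the idea, not the library's types.

-- Sample rate in samples per second.
newtype SampleRate = SampleRate Double

-- A Reader-like functor: a value that depends on a sample rate
-- which is supplied only once, from the outside.
newtype Process a = Process { runProcess :: SampleRate -> a }

instance Functor Process where
  fmap f (Process g) = Process (f . g)

instance Applicative Process where
  pure = Process . const
  Process f <*> Process g = Process (\r -> f r (g r))

-- Sine oscillator with the frequency given in Hertz.
-- The sample rate is not an argument; it is read from the context.
sine :: Double -> Process [Double]
sine freq = Process $ \(SampleRate r) ->
  [ sin (2 * pi * freq * fromIntegral n / r) | n <- [0 :: Int ..] ]

-- Mixing combines two processes, so both see the same sample rate.
mix :: Process [Double] -> Process [Double] -> Process [Double]
mix x y = zipWith (+) <$> x <*> y

-- Fix the sample rate at the very end and take the first n samples.
render :: SampleRate -> Int -> Process [Double] -> [Double]
render rate n p = take n (runProcess p rate)
```

An expression like `mix (sine 440) (sine 660)` mentions no sample rate at all; it is fixed only by the call to `render`, for instance `render (SampleRate 44100) 10 (mix (sine 440) (sine 660))`.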

We checked several low-level implementations in order to achieve reasonable speed. The standard list data structure is very convenient for programming but much too slow for signal processing. We try to get rid of it in several ways:

A fusion framework based on mapAccumL- and unfoldr-like functions for plain Prelude lists. Since the optimisation rules do not fire reliably in current GHC versions (6.8-6.12) (e.g., rules are not specialised if a function gets specialised to a monomorphic type), we end up with intermediate list structures too often.

A chunky list based on StorableVector is much faster if higher-order functions like map and unfoldr are inlined. However, this data structure is not elementwise lazy (a problem for feedback) and can store only values of Storable type (e.g., functions are excluded).

A data structure analogous to the Stream framework, where a list is represented by a StateT s Maybe a, which generates signal values by calling the generator function. In this approach fusion happens by inlining, and lists or other data structures can be used for sharing and feedback.

A similar generator type based on (somewhat portable) LLVM assembly code that is compiled to machine code at run-time. The code is fast by default: there is no clutter due to missing inlining, too much laziness or inefficient data structures. We can even utilize parallel (SIMD) instructions to a fair degree. The downside is that you have to specify which parameters are baked into the compiled functions and which ones remain parameters of those functions. You also have to write the core signal functions using LLVM assembly language.

A combination of all approaches but the first seems to be a good choice so far. However, maintaining all code versions for comparison purposes has led to much code duplication in the meantime.
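The generator representation from the Stream-like approach can be sketched as follows. This is a hedged illustration, not the library's definition: the type `Signal` and the functions `iterateS`, `mapS`, `takeS` and `toList` are hypothetical names, and the step function is written as a plain `s -> Maybe (a, s)`, which is what a `StateT s Maybe a` wraps.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
import Data.List (unfoldr)

-- A signal is a step function together with an initial state.
-- 'Nothing' marks the end of the signal.
-- The state type is hidden, so signals of equal element type are compatible.
data Signal a = forall s. Signal (s -> Maybe (a, s)) s

-- Infinite signal x, f x, f (f x), ...
iterateS :: (a -> a) -> a -> Signal a
iterateS f = Signal (\x -> Just (x, f x))

-- Mapping only wraps the step function; no intermediate list is built.
mapS :: (a -> b) -> Signal a -> Signal b
mapS f (Signal next s0) =
  Signal (\s -> fmap (\(a, s') -> (f a, s')) (next s)) s0

-- Truncation adds a counter to the state.
takeS :: Int -> Signal a -> Signal a
takeS n (Signal next s0) =
  Signal
    (\(k, s) ->
       if k <= 0
         then Nothing
         else fmap (\(a, s') -> (a, (k - 1, s'))) (next s))
    (n, s0)

-- Materialise the signal, e.g. for sharing or for feedback.
toList :: Signal a -> [a]
toList (Signal next s0) = unfoldr next s0
```

Because none of these functions is recursive, a compiler can inline a chain such as `toList (takeS 4 (mapS (*2) (iterateS (+1) 0)))` into a single loop, which is the sense in which fusion happens by inlining.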

Support for causal processes. Causal signal processes depend only on past data and thus are suitable for real-time processing (in contrast to a function like time reversal). These processes are modelled as mapAccumL-like functions. Many important operations like function composition maintain the causality property. They are important in feedback loops, where they statically guarantee that no future data is accessed.
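A causal process in the mapAccumL style can be sketched like this. The type `Causal` and the functions `mapC`, `integrate`, `delay1`, `composeC` and `applyC` are hypothetical names chosen for illustration; the library's real definitions may differ.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
import Data.List (mapAccumL)

-- A causal process: a state transition function in the argument order
-- expected by 'mapAccumL', plus an initial state.
-- Each output may depend only on the state and the current input.
data Causal a b = forall s. Causal (s -> a -> (s, b)) s

-- Stateless process.
mapC :: (a -> b) -> Causal a b
mapC f = Causal (\() a -> ((), f a)) ()

-- Running sum of all inputs seen so far.
integrate :: Num a => Causal a a
integrate = Causal (\acc x -> let acc' = acc + x in (acc', acc')) 0

-- Delay by one sample, emitting 'x0' first.
delay1 :: a -> Causal a a
delay1 x0 = Causal (\prev x -> (x, prev)) x0

-- Composition pairs the two states, so the result is causal again.
composeC :: Causal a b -> Causal b c -> Causal a c
composeC (Causal f s0) (Causal g t0) =
  Causal
    (\(s, t) a ->
       let (s', b) = f s a
           (t', c) = g t b
       in ((s', t'), c))
    (s0, t0)

-- Run a causal process on a (possibly infinite) list.
applyC :: Causal a b -> [a] -> [b]
applyC (Causal f s0) = snd . mapAccumL f s0
```

A function like time reversal cannot be expressed as a `Causal` value, because the step function never sees more than the current input. Since outputs are produced sample by sample, `take 3 (applyC integrate [1 ..])` terminates even though the input is infinite.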