1 Introduction

About cl-mixed

This is a bindings library to libmixed, an audio mixing and processing library.

How To

Precompiled versions of the underlying library are included with cl-mixed. If you want to build it manually, however, refer to the libmixed repository.

First, set up your input and output parts such as file decoders and audio systems. Handling this is not a part of libmixed. To see an example of how to incorporate them, see the test.lisp file.

Load the system through ASDF or Quicklisp:

(ql:quickload :cl-mixed)

Now you'll need to integrate your inputs and outputs with the Mixed system. In order to do that, you'll want an unpacker and a packer segment. Both of those create "packed-audio", which holds all the information about how the raw audio data is encoded in a byte buffer.

If you don't already have a byte buffer from your input or output implementation, passing NIL for the c-buffer will automatically create one for you, which you can then access with data. An example call might look like this:

(cl-mixed:make-unpacker NIL 4096 :int16 2 :alternating 44100)

A further optional parameter, after those shown above, designates the sample rate of the buffers that the source converts to or the drain converts from. This "buffer sample rate" has to be the same across all segments in a mixer pipeline. It defaults to 44100. Creating a packer looks and works exactly the same as creating an unpacker.

Next you'll want to create the segments that'll do the actual audio processing you want. For this example, let's create a 3D audio segment (space-mixer) and a fade effect (fader).

(cl-mixed:make-space-mixer)
(cl-mixed:make-fader :duration 5.0)

Next we'll need to create the buffers that are used to manipulate the audio internally and bind them to the appropriate inputs and outputs of our segments.

For this pipeline we create three buffers, input, left, and right, each large enough to hold 500 samples. We then connect the left output of the source (that is, the unpacker) to the fader's single input. Then we connect the right buffer to the unpacker's right output, just so that it has both outputs set. If your unpacker only has one channel, you can leave that out. If it has more, you'll have to repeat this for the other channels as well. Next we connect the fader's output to the space-mixer's first input. Finally we connect the left and right outputs of the space-mixer segment to the left and right inputs of the packer respectively. For the fader segment we can connect the same buffer to both input and output, as it is declared to work "in place". For the space-mixer segment we need distinct buffers, hence the extra input buffer.

Now we can create our segment-sequence object, which keeps the order in which to process each segment.

(cl-mixed:make-segment-sequence source fader space drain)

Finally we can move to the main processing loop, which should look as follows:

Here, process-source and process-drain are functions that cause your source to put samples into its buffer and your drain to read the samples out of its buffer. Running this now will just give you a fade-in effect, which isn't too exciting. Since we haven't actually set or changed any of the 3D audio parameters, that effect remains inaudible.

Changing the loop body to also move the source on each iteration should cause the source to circle around the listener as it is fading in. If you change the tt change factor from 0.001 to something higher, it will circle faster, and the doppler effect should become more noticeable.
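As an illustration of the kind of value such a loop body computes, the following sketch derives a position on a circle around the listener from a running counter tt. The function name circle-location, the radius, and the choice of the X/Z plane are assumptions made for this example and are not part of cl-mixed. The result is a list of three floats, which is the shape of value that the space-mixer's :LOCATION field is documented to take.

```lisp
;; Hypothetical helper, not part of cl-mixed.
(defun circle-location (tt &optional (radius 10.0))
  "Return a list (x y z) on a circle of RADIUS around the origin at angle TT."
  (list (* radius (sin tt))
        0.0
        (* radius (cos tt))))

;; Advance TT by a small step per iteration; a larger step circles faster.
(loop for tt from 0.0 by 0.001
      repeat 3
      collect (circle-location tt))
```

In a real loop, the list returned for each step would be stored as the location of the corresponding space-mixer input before the next call to mix.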

The following segments are included with the standard libmixed distribution:

basic-mixer Linearly mix multiple inputs.

delay Delay the input by a given amount of time.

fader Fade the volume of a source in or out.

frequency-pass Filter out low or high frequencies.

generator Generate simple wave forms.

ladspa Use a LADSPA plugin.

noise Generate noise.

pitch Shift the pitch.

repeat Record an input and then repeat it back.

space-mixer Mix multiple inputs as if they were in 3D space.

volume-control Adapt the volume and pan of a stereo signal.

See the next section on how to make custom segments.

Creating Custom Segments

While it is perfectly possible to create custom segments in C and load them into your Lisp image, you can also write them directly in CL. This may be desired if performance is not crucial or if you want to quickly prototype an effect before translating it to a lower-level language.

In order to create a segment in Lisp, you must subclass virtual and at the very least implement the mix method for it. Here's an example for a very primitive echo effect:

In order to achieve the echo effect we keep samples of a given duration around in a ring buffer and then decrease their potency with each iteration while adding the new samples on top. Of course, a more natural sounding echo effect would need more complicated processing than this. Regardless, this segment can now be integrated just the same as the fader segment from the above introductory code.
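The ring-buffer idea described above can be sketched independently of cl-mixed. The following is a minimal, standalone version that works on plain Lisp vectors rather than on a virtual subclass; the names make-echo, delay-samples, and falloff are assumptions for illustration, and the delay is given in samples rather than seconds.

```lisp
;; Standalone sketch of the echo algorithm; does not use cl-mixed.
(defun make-echo (delay-samples &optional (falloff 0.8))
  "Return a closure applying a primitive echo from vector IN to vector OUT.
The closure keeps DELAY-SAMPLES samples in a ring buffer across calls."
  (let ((ring (make-array delay-samples :element-type 'single-float
                                        :initial-element 0.0))
        (offset 0))
    (lambda (in out)
      (dotimes (i (length in) out)
        ;; New sample plus the attenuated sample from DELAY-SAMPLES ago.
        (let ((sample (+ (aref in i) (* falloff (aref ring offset)))))
          (setf (aref out i) sample
                (aref ring offset) sample
                offset (mod (1+ offset) delay-samples)))))))

;; Feed a single impulse through an echo of two samples' delay.
(let ((echo (make-echo 2 0.5))
      (samples (vector 1.0 0.0 0.0 0.0 0.0 0.0)))
  (funcall echo samples samples))
;; => #(1.0 0.0 0.5 0.0 0.25 0.0)
```

Because each input sample is read before the corresponding output sample is written, the same vector can be passed as both IN and OUT, which mirrors the "in place" property discussed for the fader.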

Concepts

Buffer

A buffer holds an array of float-encoded audio samples. Everything within libmixed deals in float samples and reads from these buffers and/or writes to these buffers. This means everything that processes audio will be able to work within the same constraints of 32-bit float encoded samples, without having to worry about different sample rates, sample encodings, or channel layouts. Buffers have a fixed number of samples that they can hold, which should typically be consistent throughout the entire system.

Packed-Audio

The packed-audio is a representation of audio data that is packed into a single array and isn't in the standard buffer format. Many libraries that decode or encode audio files, play audio back, or process audio in some other way expect a single audio sample array of a particular samplerate, sample encoding, and channel layout, rather than the standardised sample buffers that libmixed uses. The packed-audio allows you to handle the conversion from this single array, encoded format, to the buffer format of libmixed and back.

Particularly relevant are the unpacker and packer segments, which you can use to handle the edges of the pipeline where other libraries interact with libmixed.
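To make the conversion concrete, here is a minimal sketch of what unpacking amounts to for one specific case: stereo, :int16, :alternating data being split into two float buffers. The function name and the scaling by 32768 are assumptions for illustration; the real unpacker segment is implemented in libmixed, supports other encodings and layouts, and also resamples where needed.

```lisp
;; Standalone sketch; not the libmixed implementation.
(defun unpack-alternating-int16 (packed left right)
  "Split interleaved stereo int16 samples in PACKED into float vectors LEFT and RIGHT.
Samples are scaled to the range [-1.0, 1.0)."
  (dotimes (frame (min (length left) (floor (length packed) 2)))
    (setf (aref left frame)  (/ (aref packed (* 2 frame)) 32768.0)
          (aref right frame) (/ (aref packed (1+ (* 2 frame))) 32768.0)))
  (values left right))

(unpack-alternating-int16 #(16384 -16384 -32768 0)
                          (make-array 2 :initial-element 0.0)
                          (make-array 2 :initial-element 0.0))
;; => #(0.5 -1.0), #(-0.5 0.0)
```

A packer performs the reverse step, interleaving the per-channel float buffers back into a single encoded array.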

Segment

Libmixed allows you to define a pipeline to process audio in. This pipeline -- or graph, if you will -- is pieced together by segments that exchange data between each other through buffers. Each segment has a number of inputs, outputs, and fields that you can set and get. Every input and output can also have a different number of applicable fields, but each must have at least the buffer field, which designates the buffer that is connected at that point. You can retrieve information about how many inputs and outputs the segment expects or supports as well as information about the fields it understands by using the info function on a segment.

Aside from the inputs, outputs, and fields, each segment has three methods that are central to the mixing of audio: start, mix, and end. start and end allow you to prepare and clean up work shortly before and after mixing has been done. This can be important for real-time audio processing that cannot afford long pauses. The mix method performs the actual mixing operation and should cause new samples to appear in the outputs' buffers.

Mixer

Mixers are segments that take a run-time variable number of inputs and mix them together to a single output per channel. Libmixed provides two standard mixers out of the box, basic-mixer and space-mixer, which should serve most needs.

Segment-Sequence

A segment-sequence simply ties together a number of segments and performs the start, mix, and end operations in the order the segments were added to the sequence. This is mostly for convenience, in order to quickly perform the actual mixing, once the pipeline has been completely assembled already.
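The ordering that a segment-sequence enforces can be sketched with a toy model. Everything below (toy-start, toy-mix, toy-end, toy-gain, run-sequence) is hypothetical and deliberately avoids the cl-mixed names; it only shows the pattern of starting all segments, mixing through them in the order they were added, and ending them afterwards.

```lisp
;; Toy model of a segment sequence; none of these names exist in cl-mixed.
(defgeneric toy-start (segment) (:method (segment) segment))
(defgeneric toy-mix (samples segment))
(defgeneric toy-end (segment) (:method (segment) segment))

(defclass toy-gain ()
  ((factor :initarg :factor :reader gain-factor)
   (buffer :initarg :buffer :reader gain-buffer)))

(defmethod toy-mix (samples (segment toy-gain))
  ;; Scale the first SAMPLES entries of the shared buffer in place.
  (dotimes (i samples)
    (setf (aref (gain-buffer segment) i)
          (* (gain-factor segment) (aref (gain-buffer segment) i)))))

(defun run-sequence (segments samples)
  "Start every segment, mix SAMPLES samples through each in order, then end them."
  (mapc #'toy-start segments)
  (unwind-protect
       (dolist (segment segments)
         (toy-mix samples segment))
    (mapc #'toy-end segments)))

;; Two gain stages sharing one buffer: 2.0 is applied first, then 3.0.
(let ((buffer (vector 1.0 2.0)))
  (run-sequence (list (make-instance 'toy-gain :factor 2.0 :buffer buffer)
                      (make-instance 'toy-gain :factor 3.0 :buffer buffer))
                2)
  buffer)
;; => #(6.0 12.0)
```

In this sketch the end step runs inside unwind-protect, so the segments are ended even if mixing signals an error.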

A bypassed segment does not perform any operations when
mixed. The exact effects of this vary per segment, but
usually for a segment that transforms its inputs
somehow this will mean that it just copies the input to
the output verbatim.

Note that not all segments support bypassing. Check the
:FIELDS value in the segment’s info plist.

This method should be called as close as possible
after all desired calls to MIX are done. Calling
MIX after END is called is an error. Some segments
may require END to be called before their fields
can be set freely. Thus mixing might need to be
’paused’ to change settings.

Returns a plist with the following entries:
  :NAME        -- A string denoting the name of the
                  type of segment this is.
  :DESCRIPTION -- A string denoting a human-readable
                  description of the segment.
  :FLAGS       -- A list of flags for the segment.
                  Each should be one of:
    :INPLACE        -- Output and input buffers may be
                       identical as processing is
                       in-place.
    :MODIFIES-INPUT -- The data in the input buffer
                       is modified.
  :MIN-INPUTS  -- The minimal number of inputs that
                  needs to be connected to this
                  segment.
  :MAX-INPUTS  -- The maximal number of inputs that
                  may be connected to this segment.
  :OUTPUTS     -- The number of outputs that this
                  segment provides.
  :FIELDS      -- A list of plists describing the
                  possible fields. Each plist has the
                  following entries:
    :FIELD       -- The keyword or integer denoting
                    the field.
    :DESCRIPTION -- A string for a human-readable
                    description of what the field
                    does.
    :FLAGS       -- A list of keywords describing the
                    applicability of the field. Each
                    must be one of:
      :IN      -- This field is for inputs.
      :OUT     -- This field is for outputs.
      :SEGMENT -- This field is for the segment.
      :SET     -- This field may be written to.
      :GET     -- This field may be read.

Note that this value is cached after the first
retrieval. You are thus not allowed to modify the
return value.
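As a sketch of how such a plist can be inspected, the example below uses a hand-written plist in the documented shape. The concrete entries in *example-info* are made up for illustration and are not real output from INFO, and the helper names supports-field-p and inplace-p are hypothetical.

```lisp
;; A hand-written plist in the documented shape; NOT real output from INFO.
(defparameter *example-info*
  '(:name "fader"
    :description "Fade the volume of a source in or out."
    :flags (:inplace)
    :min-inputs 1
    :max-inputs 1
    :outputs 1
    :fields ((:field :buffer
              :description "The buffer connected at this point."
              :flags (:in :out :set))
             (:field :bypass
              :description "Bypass the segment's processing."
              :flags (:segment :set :get))))
  "Illustrative data only.")

(defun supports-field-p (info field)
  "Return the entry for FIELD from the :FIELDS list of INFO, or NIL."
  (find field (getf info :fields)
        :key (lambda (entry) (getf entry :field))))

(defun inplace-p (info)
  "Return true if INFO lists the :INPLACE flag."
  (member :inplace (getf info :flags)))

(getf (supports-field-p *example-info* :bypass) :flags)
;; => (:SEGMENT :SET :GET)
```

Since the returned value is cached and must not be modified, a copy made with copy-tree is the safe starting point if you need to alter the data.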

This vector will become out of date if the segment’s
buffers are added or removed from the C side directly,
or directly through this vector. Thus you should never
modify this directly and instead always
make sure to go through INPUT.

This processes the given number of samples through
the pipeline. It is your job to make sure that
sources provide enough fresh samples and drains
will consume enough samples. Calling MIX with more
samples specified than any one buffer connected to
the segments in the sequence can hold is an error and
may crash your system. No checks for this problem
are done.

Calling MIX before START has been called or after
END has been called is an error and may result in
crashes. No checks for this problem are done.

If you want to ensure that the sequence is complete
and able to process the requested number of samples,
you should call CHECK-COMPLETE after running START.

This vector will become out of date if the segment’s
buffers are added or removed from the C side directly,
or directly through this vector. Thus you should never
modify this directly and instead always
make sure to go through OUTPUT.

The value must be either :RECORD or :PLAY.
When in record mode, the segment will fill its internal
buffer with the samples from the input buffer, and copy
them to the output buffer. While in this mode it is thus
"transparent" and does not change anything.
When in play mode, the segment continuously plays back
its internal buffer to the output buffer, ignoring all
samples on the input buffer.

This vector will become out of date if the sequence’s
segments are added or removed from the C side
directly, or directly through this vector. Thus you
should never modify this directly and instead always
make sure to go through ADD/WITHDRAW.

Accessor to the source segment attached to an input buffer or location.

Some mixers support attaching a source
segment to an input buffer. The effect being that the
segment is mixed before the corresponding buffer is
used, allowing for dynamic addition and removal of
segments without the need to alter the pipeline.

This method should be called as close as possible
to the next calls to MIX. Calling MIX before
START is called or after END is called is an error.
After START has been called, changing some segments’
fields may result in undefined behaviour and might
even lead to crashes.

6.1.6 Classes

Class: basic-mixer()

This segment additively mixes an arbitrary number of inputs to a single output.

Linear mixing means that all the inputs are summed
up and the resulting number is divided by the number
of inputs. This is equivalent to having all the
inputs play as "individual speakers" in real life.
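The summing and dividing described above can be written out directly. This is a standalone sketch on plain Lisp vectors with a hypothetical function name; it is not how the basic-mixer is invoked through cl-mixed.

```lisp
;; Standalone sketch of linear mixing; does not use cl-mixed.
(defun linear-mix (inputs out)
  "Write the per-sample average of the vectors in INPUTS into OUT and return OUT.
INPUTS must be a non-empty list of vectors at least as long as OUT."
  (let ((n (length inputs)))
    (dotimes (i (length out) out)
      (setf (aref out i)
            (/ (loop for input in inputs
                     sum (aref input i))
               n)))))

(linear-mix (list (vector 1.0 0.0) (vector 0.0 1.0))
            (make-array 2 :initial-element 0.0))
;; => #(0.5 0.5)
```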

If no handle is given to the object upon creation, the proper
corresponding foreign data is automatically allocated. The
pointer to this data is then associated with the instance to
allow resolving the pointer to the original Lisp object.
Finalisation of the foreign data upon garbage collection of
the Lisp object is also handled.

The actual foreign allocation and cleanup of the data is
handled by ALLOCATE-HANDLE and FREE-HANDLE respectively. The
subclass in question is responsible for implementing
appropriate methods for them.

See HANDLE
See ALLOCATE-HANDLE
See FREE-HANDLE
See FREE
See POINTER->OBJECT

This just delays the input to the output by a given
number of seconds. Note that it will require an
internal buffer to hold the samples for the required
length of time, which might become expensive for very
long durations.

The from and to are given in relative volume, meaning
in the range of [0.0, infinity[. The duration is given
in seconds. The fade type must be one of the following:
:LINEAR :CUBIC-IN :CUBIC-OUT :CUBIC-IN-OUT, each
referring to the respective easing function.
The time is measured starting from the last call to START.
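For orientation, the four fade types can be sketched with the commonly used definitions of these easing functions. Whether libmixed uses exactly these curves is an assumption; the function names ease and fade-volume are hypothetical.

```lisp
;; Common textbook easing curves; assumed, not taken from libmixed.
(defun ease (type x)
  "Map progress X in [0, 1] through the easing function named by TYPE."
  (ecase type
    (:linear x)
    (:cubic-in (expt x 3))
    (:cubic-out (1+ (expt (1- x) 3)))
    (:cubic-in-out (if (< x 0.5)
                       (* 4 (expt x 3))
                       (1+ (* 4 (expt (1- x) 3)))))))

(defun fade-volume (from to duration type elapsed)
  "Return the volume ELAPSED seconds into a fade from FROM to TO over DURATION seconds."
  (let ((x (max 0.0 (min 1.0 (/ elapsed duration)))))
    (+ from (* (- to from) (ease type x)))))

;; Halfway through a five-second fade-in:
(fade-volume 0.0 1.0 5.0 :cubic-in 2.5)
;; => 0.125
```

The progress value is clamped to [0, 1], so the volume stays at the target once the duration has elapsed.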

LADSPA (Linux Audio Developers’ Simple Plugin API)
is a standard interface for audio effects. Such effects
are contained in shared library files and can be loaded
in and used with libmixed straight up.

Please refer to the plugin’s documentation for necessary
configuration values, and to the libmixed documentation
for how to set them.

Packed-audio represents an interface to an outside sound source or drain.

The object holds all the necessary information to describe
the audio data present in a raw byte buffer. This includes
how many channels there are, how the samples are laid out
and how the samples are formatted in memory. It also includes
the samplerate of the channel’s source so that it can be
converted if necessary.

See MAKE-PACKED-AUDIO
See SOURCE
See DRAIN
See C-OBJECT
See OWN-DATA
See DATA
See SIZE
See ENCODING
See CHANNELS
See LAYOUT
See SAMPLERATE

This segment converts data from individual sample buffers to data for packed-audio.

This is mostly useful at the edges to convert to
something like an audio file library or audio
playback system from the internal buffers as used
by Mixed.

The samplerate argument defines the sample rate
in the input buffers. If it diverges from the
sample rate in the packed-audio, resampling occurs to
account for this. To change the resampling method,
use the :RESAMPLER field. The value must be a
pointer to a C function of the following signature:

The segment is first in record mode when created.
Once you have the bit you want to repeat back, you
can switch the repeat-mode to :PLAY. It will then
ignore the input and instead continuously output the
recorded input bit.

See MAKE-REPEAT
See DURATION
See REPEAT-MODE
See SAMPLERATE
See BYPASS

A segment is responsible for producing, consuming,
combining, splitting, or just in general somehow
processing audio data within a pipeline.

A segment is connected to several buffers at its
inputs and outputs. Each input, output, and the segment
itself may have a number of fields that can be accessed
to change the properties of the segment’s behaviour.

Some of these properties may not be changed in real
time and instead might require ending the mixing
first.

See C-OBJECT
See INPUTS
See OUTPUTS
See INFO
See START
See MIX
See END
See INPUT-FIELD
See OUTPUT-FIELD
See FIELD
See INPUT
See OUTPUT
See CONNECT

Each input represents an individual source in space.
Each such source can have a location and a velocity,
both of which are vectors of three elements. If the
velocity is non-zero, a doppler effect is applied to
the source.

The segment itself also has a :LOCATION and :VELOCITY,
representing the listener’s own properties. It has
some additional fields to change the properties of the
3D space. In total, the following fields are available:

  :LOCATION       -- The location of the input or
                     listener. Value should be a list
                     of three floats.
  :VELOCITY       -- The velocity of the input or
                     listener. Value should be a list
                     of three floats.
  :DIRECTION      -- The direction the listener is
                     facing. Value should be a list of
                     three floats. Default is (0 0 1).
  :UP             -- The vector pointing "up" in
                     space. Value should be a list of
                     three floats. Default is (0 1 0).
  :SOUNDSPEED     -- The speed of sound in the medium.
                     Default is 34330, meaning "1" is
                     measured as 1cm.
  :DOPPLER-FACTOR -- This can be used to over- or
                     understate the effect of the
                     doppler. Default is 1.0.
  :MIN-DISTANCE   -- The minimal distance under which
                     the source has reached max volume.
  :MAX-DISTANCE   -- The maximal distance over which
                     the source is completely inaudible.
  :ROLLOFF        -- The rolloff factor describing the
                     curvature of the attenuation
                     function.
  :ATTENUATION    -- The attenuation function describing
                     how volume changes over distance.
                     Should be one of :NONE :LINEAR
                     :INVERSE :EXPONENTIAL.
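To give a feel for how :MIN-DISTANCE, :MAX-DISTANCE, and :ROLLOFF interact with the attenuation types, the sketch below uses the distance models known from OpenAL. That libmixed follows exactly these formulas is an assumption, and the function name attenuate is hypothetical; consult the libmixed sources for the authoritative definitions.

```lisp
;; OpenAL-style distance models; assumed, not taken from libmixed.
(defun attenuate (type distance min-distance max-distance rolloff)
  "Return a volume factor for a source at DISTANCE from the listener."
  ;; Distances are clamped to [MIN-DISTANCE, MAX-DISTANCE] first.
  (let ((d (max min-distance (min max-distance distance))))
    (ecase type
      (:none 1.0)
      (:linear (- 1.0 (/ (* rolloff (- d min-distance))
                         (- max-distance min-distance))))
      (:inverse (/ min-distance
                   (+ min-distance (* rolloff (- d min-distance)))))
      (:exponential (expt (/ d min-distance) (- rolloff))))))

;; Halfway between a min distance of 10 and a max distance of 100:
(attenuate :linear 55.0 10.0 100.0 1.0)
;; => 0.5
```

In this sketch only the :LINEAR model with a rolloff of 1.0 reaches zero at the maximal distance; the other models merely stop attenuating further once the distance is clamped.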

See MIXER
See MAKE-SPACE-MIXER
See LOCATION
See INPUT-LOCATION
See VELOCITY
See INPUT-VELOCITY
See DIRECTION
See UP
See SOUNDSPEED
See DOPPLER-FACTOR
See MIN-DISTANCE
See MAX-DISTANCE
See ROLLOFF
See ATTENUATION
See *DEFAULT-SAMPLERATE*

This segment converts data from packed-audio to individual sample buffers.

This is mostly useful at the edges to convert from
something like an audio file library to the format
needed by Mixed.

The samplerate argument defines the sample rate
in the output buffers. If it diverges from the
sample rate in the packed-audio, resampling occurs to
account for this. To change the resampling method,
use the :RESAMPLER field. The value must be a
pointer to a C function of the following signature:

Default methods for INPUT/OUTPUT-FIELD to
handle the recording of the input/output
buffers already exist. Every other method
by default does nothing. You should in the
very least implement a method for MIX on
your subclass.

See SEGMENT
See INFO
See START
See MIX
See END
See INPUT-FIELD
See OUTPUT-FIELD
See FIELD
See INPUTS
See OUTPUTS

Reader for a cons cell that holds information about the buffer in the channel.

If the cons cell stores NIL in its car, then the data buffer
is not owned by the channel and may not be freed by the
library. If it is T, the cdr must be the pointer to the
buffer data. This data is freed when the channel is GCed.