Does the dog have Turing nature?

Taming the ScriptProcessorNode

Jan 30th, 2013

The Web Audio API provides a graph-based API for audio generation and
processing primitives, with a focus on high performance and low latency. For
custom processing that is not covered by the built-in native audio nodes, it
provides a ScriptProcessorNode whose processing is determined by a JavaScript
function. Though the ScriptProcessorNode is presented like any other node
type by the API, its behaviour differs from that of the native nodes in some
fundamental ways. This post examines some of these differences using a simple
chime model as the use case, and derives some suggestions for the Web Audio API
specification.
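
As a starting point, here is a minimal sketch of a single chime built from an
oscillator and a gain node, consistent with the description that follows. The
function wrapper, its name playSingleChime and the ac parameter (assumed to be a
standard AudioContext) are assumptions made to keep the sketch self-contained.
The 880 Hz frequency and 0.25 level are borrowed from the chime function later
in this post, and createGain is the current name for what older WebKit builds
called createGainNode.

```javascript
// Sketch only. `ac` is assumed to be a standard AudioContext
// (for example `new AudioContext()` in a browser).
function playSingleChime(ac) {
  var startTime = ac.currentTime + 5.0; // trigger 5 seconds into the future
  var stopTime = startTime + 10.0;      // stop 10 seconds after it starts

  var osc = ac.createOscillator();      // "source" node
  osc.frequency.value = 880.0;

  var gain = ac.createGain();           // "processor" node
  gain.gain.setValueAtTime(0.25, startTime);
  gain.gain.setTargetAtTime(0.0, startTime, 2.0); // decay, time constant 2 s

  osc.connect(gain);
  gain.connect(ac.destination);

  osc.start(startTime);
  osc.stop(stopTime);
  return stopTime;
}
```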

The above code produces a single chime that lasts for about 10 seconds, with a
decay time constant of 2 seconds. Though this is not very useful as it stands,
it helps to illustrate a couple of aspects of nodes in the Web Audio API.
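
To make the "decay time constant" concrete: an exponential approach to zero
with time constant tau falls to 1/e of its starting value after tau seconds.
The sketch below works through that arithmetic. The helper names amplitudeAt
and perSampleDecay are hypothetical and not part of the Web Audio API.

```javascript
// Level of an exponential decay toward zero, t seconds after it begins.
function amplitudeAt(t, initial, tau) {
  return initial * Math.exp(-t / tau);
}

// The same decay expressed as a per-sample multiplier, which is the form
// a sample-by-sample processing loop would use.
function perSampleDecay(tau, sampleRate) {
  return Math.exp(-1.0 / (tau * sampleRate));
}

var afterOneTau = amplitudeAt(2.0, 0.25, 2.0);  // 0.25 / e
var atStopTime = amplitudeAt(10.0, 0.25, 2.0);  // 0.25 * e^-5
```

With a starting level of 0.25 and a 2 second time constant, the level at the
10 second mark is roughly 0.0017, so stopping the oscillator there cuts off
very little audible sound.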

First, we see two types of nodes being used – a “source” node (the oscillator)
which generates a signal without needing an input, and a “processor” node (the
gain) which takes an input signal, does something with it and sends a modified
signal to its output.

Second, we see a time limited triggering of the OscillatorNode. The node is
triggered 5 seconds into the future and is stopped 10 seconds after it starts.
The way the Web Audio API is designed, once the oscillator node has stopped,
it becomes rather useless and needs to be discarded. This is because start/stop
can be used only once on source nodes. Therefore when using an OscillatorNode
as in this example, the references to osc and gain are no longer necessary
and we can add the following lines at the end of the above code block.

    osc = null;
    gain = null;

With no references protecting the oscillator and gain nodes from garbage
collection, one might expect them to be destroyed soon after the references are
given up, before the oscillator gets a chance to generate any sound. Fortunately,
the oscillator holds a reference to itself until its stop time is reached,
after which the reference is released and the subgraph between the source and the
context destination is destroyed. This behaviour of native source nodes is
documented in the Dynamic Lifetime section of the Web Audio API documentation.

Abstracting the chime model

The chime model described in the previous section only plays one chime. This is
practically useless in the real world. A more useful incarnation of the chime model
would permit us to trigger a chime at any time we want and at any frequency
we want as well. The model manages the nodes necessary for its function
purely internally.

This is easily encapsulated as a function –

    var AC = new webkitAudioContext(); // We'll assume this here onwards.

    function chime(freq, output) {
        var stopTime = AC.currentTime + 10.0;
        var osc = AC.createOscillator();
        osc.frequency.value = freq || 880.0;
        var gain = AC.createGainNode();
        osc.connect(gain);
        gain.connect(output || AC.destination);
        gain.gain.value = 0.25;
        gain.gain.setTargetAtTime(0.0, AC.currentTime, 2.0);
        osc.start(AC.currentTime);
        osc.stop(AC.currentTime + 10.0);
        // References to osc and gain are given up
        // upon return from chime().
    }

Replacing the gain node with a ScriptProcessorNode

Now let’s consider what happens if we try to replace the gain node’s behaviour
(limited to this example) using a ScriptProcessorNode.

chime_jsgain: a straightforward replacement of the gain node with a script node.

Problem 1: The vanishing script node

If you try to make a chime using chime_jsgain, you’ll find
that the sound stops abruptly, well before the 10 second duration given. This
is because the script node is garbage collected almost immediately. This is
a WebKit implementation bug. The oscillator node, which has a
persistent reference until its stop time, holds a reference to the script node
through its output, and yet the script node is garbage collected. A known
workaround for this bug is to maintain a global reference to the script node.
For this, we can use the following simple scheme –
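
A minimal sketch of such a scheme is given here. The names keep and drop are
the ones used by the code later in this post. The array-based registry behind
them (keptNodes) is an assumption; any globally reachable container would do.

```javascript
// Globally reachable registry. As long as a node is in here, the garbage
// collector cannot reclaim it. (Hypothetical implementation.)
var keptNodes = [];

// Holds on to a node and returns it, so that creation calls can be wrapped.
function keep(node) {
  keptNodes.push(node);
  return node;
}

// Releases a node previously passed to keep(). Unknown nodes are ignored.
function drop(node) {
  var i = keptNodes.indexOf(node);
  if (i >= 0) {
    keptNodes.splice(i, 1);
  }
  return node;
}
```

Because keep returns its argument, it can wrap the creation call directly, as
in keep(AC.createScriptProcessor(1024, 1, 1)).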

This will result in the script node being preserved during the course of
the chime.

Problem 2: The eternal script node

With the modifications of the preceding section, the oscillator node will
disappear after 10 seconds, but the script node will persist and continue to be
processed. To prevent this, we arrange for the script node to be
removed from the graph and for its global reference to be dropped once it has
processed enough samples.

    function chime_jsgain(freq, output) {
        var osc = AC.createOscillator();
        var stopTime = AC.currentTime + 10.0;
        osc.frequency.value = freq || 880.0;

        var gain = keep(AC.createScriptProcessor(1024, 1, 1));
        gain.onaudioprocess = (function () {
            var amplitude = 0.25;
            var decay = Math.exp(-1.0 / (2.0 * AC.sampleRate));
            var stopTime_samples = Math.ceil(AC.sampleRate * stopTime);
            var finished = false;

            return function (event) {
                var i, N, inp, out;
                if (finished) {
                    return; // Don't do anything after stopTime has elapsed.
                }
                inp = event.inputBuffer.getChannelData(0);
                out = event.outputBuffer.getChannelData(0);

                // Limit the number of samples we compute.
                var now_samples = Math.floor(AC.sampleRate * AC.currentTime);
                N = Math.min(out.length, stopTime_samples - now_samples);
                for (i = 0; i < N; ++i, amplitude *= decay) {
                    out[i] = amplitude * inp[i];
                }
                if (N < out.length) {
                    // Reached end time before end of buffer.
                    finished = true;
                    setTimeout(function () {
                        drop(gain);
                        osc.disconnect();
                        gain.disconnect();
                    }, 0);
                }
            };
        }());

        osc.connect(gain);
        gain.connect(output || AC.destination);
        osc.start(AC.currentTime);
        osc.stop(stopTime);
    }

We now have a script node based implementation that works like the native
implementation in all respects. This kind of “stop time” behaviour can be
lifted into a separate function, scriptWithStopTime, that transforms a
function (event, toSamp) {...} into an onaudioprocess handler.
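
The core of such a lifting function might look like the sketch below. To keep
it free of any audio context, it takes a clock object exposing sampleRate and
currentTime (an AudioContext has both) and an onFinished callback for the
cleanup. The name stopTimeHandler and both of those parameters are assumptions
made for illustration.

```javascript
// Wraps handler(event, toSamp) into an onaudioprocess-style callback that
// stops calling the handler once stopTime (in seconds) has been reached.
function stopTimeHandler(clock, stopTime, handler, onFinished) {
  var stopTime_samples = Math.ceil(clock.sampleRate * stopTime);
  var finished = false;

  return function (event) {
    if (finished) {
      return; // Nothing to do after the stop time has elapsed.
    }
    var length = event.outputBuffer.length;
    var now_samples = Math.floor(clock.sampleRate * clock.currentTime);

    // Number of samples the handler is allowed to compute in this buffer.
    var toSamp = Math.max(0, Math.min(length, stopTime_samples - now_samples));
    handler(event, toSamp);

    if (toSamp < length) {
      // Reached the stop time before the end of the buffer.
      finished = true;
      setTimeout(onFinished, 0); // e.g. drop() and disconnect() the node
    }
  };
}
```

A script node would then be set up with something like
node.onaudioprocess = stopTimeHandler(AC, stopTime, handler, cleanup).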

Replacing the oscillator with a ScriptProcessorNode

Now, let’s consider what happens if we try to implement the OscillatorNode
(limited to this example) using a ScriptProcessorNode. We can use
scriptWithStopTime to take a first shot at it.
While we’re there, we’ll also generalize the chime model to take a time
at which to trigger the chime.

If you schedule a chime into the future instead of “right now”,
the script node oscillator will keep running from “right now” until
the end. The period from “now” to the scheduled start time is all
wasted computation.

The node responsible for the chime starting on time is the gain
node. Since the oscillator runs continuously, the phase at
which the chime begins depends on the start time passed in.
Ideally, we want the oscillator to start at phase 0 when
the chime starts, and we want that start to be sample accurate.

To solve the wasted computation problem, we can make the onaudioprocess
be a no-op until the start time. This is best done by writing a
scriptWithStartStopTime as we did with scriptWithStopTime.

    function scriptWithStartStopTime(startTime, stopTime, handler) {
        console.assert(stopTime >= startTime);

        var node = keep(AC.createScriptProcessor(1024, 1, 1));
        var startTime_samples = Math.floor(AC.sampleRate * startTime);
        var stopTime_samples = Math.ceil(AC.sampleRate * stopTime);
        var finished = false;

        node.onaudioprocess = function (event) {
            if (finished) {
                return;
            }
            var t = Math.floor(AC.currentTime * AC.sampleRate);
            var fromSamp = Math.max(0, startTime_samples - t);
            if (fromSamp >= event.outputBuffer.length) {
                // Not started yet.
                return;
            }
            var toSamp = Math.min(event.outputBuffer.length, stopTime_samples - t);

            // Handler signature changed to include both start and stop indices.
            handler(event, fromSamp, toSamp);

            if (toSamp < event.outputBuffer.length) {
                finished = true;
                setTimeout(function () {
                    // Refer to the node through the closure.
                    drop(node);
                    node.disconnect();
                }, 0);
            }
        };

        return node;
    }
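
The index arithmetic in that handler is easier to follow in isolation. The
sketch below pulls it out into a pure function and steps it through four
consecutive callbacks. The helper name sampleWindow and the toy numbers (a
1024 Hz sample rate and 256-sample buffers, chosen so that every time maps to
a whole sample index) are assumptions for illustration only.

```javascript
// Returns the [fromSamp, toSamp) range of the current buffer that lies
// between startTime and stopTime, or null if the node has not started yet.
function sampleWindow(sampleRate, currentTime, startTime, stopTime, bufferLength) {
  var t = Math.floor(currentTime * sampleRate);
  var fromSamp = Math.max(0, Math.floor(sampleRate * startTime) - t);
  if (fromSamp >= bufferLength) {
    return null; // Not started yet.
  }
  var toSamp = Math.min(bufferLength, Math.ceil(sampleRate * stopTime) - t);
  return { fromSamp: fromSamp, toSamp: toSamp };
}

// A chime from 0.375 s to 0.875 s, seen by callbacks at 0, 0.25, 0.5, 0.75 s.
var windows = [0.0, 0.25, 0.5, 0.75].map(function (now) {
  return sampleWindow(1024, now, 0.375, 0.875, 256);
});
```

The first callback is skipped, the second starts partway into its buffer, the
third fills the whole buffer, and the fourth ends partway through, which is
the condition that triggers cleanup.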

The above version saves some computation but not all. The JS node does not
need to receive any callbacks whatsoever until it is time to start generating
samples. To achieve this, we need to schedule the node to connect and start
just in time, say 0.1 seconds before the scheduled start time. To do this,
we need to generalize scriptWithStartStopTime to work with the nodes
to which the script node is expected to connect.

    function scriptWithStartStopTime(input, output, startTime, stopTime, handler) {
        startTime = Math.max(startTime, AC.currentTime);
        stopTime = Math.max(stopTime, AC.currentTime);
        console.assert(stopTime >= startTime);

        var kBufferLength = 512; // samples
        var prepareAheadTime = 0.1; // seconds
        var startTime_samples = Math.floor(AC.sampleRate * startTime);
        var stopTime_samples = Math.ceil(AC.sampleRate * stopTime);
        var finished = false;
        var node = null; // Created by prepareNode(), possibly after we return.

        function onaudioprocess(event) {
            if (finished) {
                return;
            }
            var t = Math.floor(AC.currentTime * AC.sampleRate);
            var fromSamp = Math.max(0, startTime_samples - t);
            if (fromSamp >= event.outputBuffer.length) {
                return; // Not started yet.
            }
            var toSamp = Math.min(event.outputBuffer.length, stopTime_samples - t);

            // Handler signature changed to include both start and stop indices.
            handler(event, fromSamp, toSamp);

            if (toSamp < event.outputBuffer.length) {
                finished = true;
                setTimeout(function () {
                    drop(node);
                    input && input.disconnect();
                    node.disconnect();
                }, 0);
            }
        }

        function prepareNode() {
            node = keep(AC.createScriptProcessor(kBufferLength, 1, 1));
            node.onaudioprocess = onaudioprocess;

            // Setup the necessary connections.
            input && input.connect(node);
            output && node.connect(output);
        }

        var dt = startTime - AC.currentTime;
        if (dt <= prepareAheadTime) {
            prepareNode();
        } else {
            // Wake up prepareAheadTime seconds before the start time.
            setTimeout(prepareNode, Math.floor(1000 * (dt - prepareAheadTime)));
        }

        return node; // null if the node is yet to be created.
    }
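
The just-in-time part comes down to one small calculation: how long to wait
before wiring up the node so that it is ready a little ahead of the start
time. A sketch of that calculation follows; the helper name prepareDelayMs is
hypothetical.

```javascript
// Milliseconds to wait before creating and connecting the script node, so
// that it is in place prepareAheadTime seconds before startTime.
// Returns 0 when the node should be prepared immediately.
function prepareDelayMs(startTime, currentTime, prepareAheadTime) {
  var dt = startTime - currentTime;
  if (dt <= prepareAheadTime) {
    return 0;
  }
  return Math.floor(1000 * (dt - prepareAheadTime));
}
```

Timer callbacks are not sample accurate, which is why the node is prepared
slightly early and the exact start is left to the fromSamp computation inside
the audio callback.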

We can use scriptWithStartStopTime to include such dynamically determined
source or processing nodes within a subgraph … except when two such nodes
need to be connected to each other and both remain unrealized for a while. In
such cases, we can use an intermediate unity gain node for simplicity. Here
then is our final chime model with a js node for the gain or the oscillator.
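
As an illustration of the handlers such a model needs, here is a sketch of an
oscillator handler and a decaying gain handler that follow the
handler(event, fromSamp, toSamp) convention. The factory names sineHandler and
decayGainHandler are hypothetical, and each handler returns the number of
samples it wrote, which callers are free to ignore.

```javascript
// Sine oscillator. The phase starts at 0 on the first sample it is asked to
// produce and carries over from one buffer to the next.
function sineHandler(freq, sampleRate) {
  var phase = 0.0;
  var dphase = 2.0 * Math.PI * freq / sampleRate;
  return function (event, fromSamp, toSamp) {
    var out = event.outputBuffer.getChannelData(0);
    for (var i = fromSamp; i < toSamp; ++i) {
      out[i] = Math.sin(phase);
      phase += dphase;
      if (phase >= 2.0 * Math.PI) {
        phase -= 2.0 * Math.PI;
      }
    }
    return Math.max(0, toSamp - fromSamp);
  };
}

// Exponentially decaying gain with time constant tau (in seconds).
function decayGainHandler(amplitude, tau, sampleRate) {
  var decay = Math.exp(-1.0 / (tau * sampleRate));
  return function (event, fromSamp, toSamp) {
    var inp = event.inputBuffer.getChannelData(0);
    var out = event.outputBuffer.getChannelData(0);
    for (var i = fromSamp; i < toSamp; ++i, amplitude *= decay) {
      out[i] = amplitude * inp[i];
    }
    return Math.max(0, toSamp - fromSamp);
  };
}
```

Either handler can be passed to scriptWithStartStopTime, with a unity gain
node in between if both the oscillator and the gain are script nodes.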

Accounting for tail time

The gain node is simple in that it produces one output sample for each input
sample it receives on its input pin. Other node types, especially filter and
convolution nodes, may “tail off” well after the input to these nodes has
ceased. In other words, the true stop time of a node, at which the node is
free to be garbage collected, is at the end of such a tail time, which may
only be known while the node is running.

A simple way to account for this is to have the handler passed to
scriptWithStartStopTime use the toSamp argument only for information and
return the number of samples it actually produced. If the handler has produced
samples up to the buffer’s capacity, it is taken to be still running and must
not be stopped. If it stopped somewhere before the end of the buffer, it can be
taken to be finished and cleanup can be triggered. This logic can be expressed
in a modified scriptWithStartStopTime as shown below –

    function scriptWithStartStopTime(input, output, startTime, stopTime, handler) {
        startTime = Math.max(startTime, AC.currentTime);
        stopTime = Math.max(stopTime, AC.currentTime);
        console.assert(stopTime >= startTime);

        var kBufferLength = 512; // samples
        var prepareAheadTime = 0.1; // seconds
        var startTime_samples = Math.floor(AC.sampleRate * startTime);
        var stopTime_samples = Math.ceil(AC.sampleRate * stopTime);
        var finished = false;
        var node = null; // Created by prepareNode(), possibly after we return.

        function onaudioprocess(event) {
            if (finished) {
                return;
            }
            var t = Math.floor(AC.currentTime * AC.sampleRate);
            var fromSamp = Math.max(0, startTime_samples - t);
            if (fromSamp >= event.outputBuffer.length) {
                return; // Not started yet.
            }
            var toSamp = Math.min(event.outputBuffer.length, stopTime_samples - t);

            // Return value of handler is used to decide when to stop. The handler is
            // required to produce as many samples as possible. If it still can't fill
            // up the buffer, it is deemed to have finished.
            var samplesProduced = handler(event, fromSamp, toSamp);

            if (fromSamp + samplesProduced < event.outputBuffer.length) {
                finished = true;
                setTimeout(function () {
                    drop(node);
                    input && input.disconnect();
                    node.disconnect();
                }, 0);
            }
        }

        function prepareNode() {
            node = keep(AC.createScriptProcessor(kBufferLength, 1, 1));
            node.onaudioprocess = onaudioprocess;

            // Setup the necessary connections.
            input && input.connect(node);
            output && node.connect(output);
        }

        var dt = startTime - AC.currentTime;
        if (dt <= prepareAheadTime) {
            prepareNode();
        } else {
            // Wake up prepareAheadTime seconds before the start time.
            setTimeout(prepareNode, Math.floor(1000 * (dt - prepareAheadTime)));
        }

        return node; // null if the node is yet to be created.
    }
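
A handler with a tail might look like the following sketch: a one-pole
smoothing filter that keeps ringing after its input ends and reports how many
samples it produced. The factory name onePoleWithTail and the epsilon
threshold used to decide when the tail has died away are assumptions for
illustration.

```javascript
// One-pole smoother y[n] = a*y[n-1] + (1-a)*x[n]. After toSamp the input is
// treated as silence and the filter rings out until |y| drops below epsilon.
function onePoleWithTail(a, epsilon) {
  var y = 0.0;
  return function (event, fromSamp, toSamp) {
    var inp = event.inputBuffer.getChannelData(0);
    var out = event.outputBuffer.getChannelData(0);
    var i = fromSamp;

    // Live phase: input is still arriving.
    for (; i < toSamp; ++i) {
      y = a * y + (1.0 - a) * inp[i];
      out[i] = y;
    }

    // Tail phase: no more input, keep going while the output is significant.
    for (; i < out.length && Math.abs(y) >= epsilon; ++i) {
      y = a * y;
      out[i] = y;
    }

    // Fewer samples than the buffer can hold means the tail has ended.
    return i - fromSamp;
  };
}
```

While the tail is still audible the handler fills the buffer completely, so
the node is kept alive. The first buffer it leaves short marks the true stop
time.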

Reflections on the API

The final form of scriptWithStartStopTime
expresses functionality that is essential to the use of script nodes as
algorithmic signal sources and signal processors that can be scheduled
to sample accuracy. Although this function adds the required functionality,
the Web Audio API would be better off with a script node type whose
lifetime can be managed just like that of the native nodes and which supports
the dynamic behaviour available with the native nodes.

Not having it built in results in an inconsistency that is awkward to
work around: start(t) and stop(t) methods added in pure
JavaScript are necessary not only for script nodes that are pure signal
generators, but also for signal processors. This is because a script node might
otherwise end up wasting compute cycles either idling or computing audio that
is going to be discarded by the rest of the chain. A stop(t) method is
necessary for such processor nodes because the Web Audio API does not
currently provide any notification mechanisms for monitoring connections to a
node’s input and output so that processor nodes can self destruct when their
inputs die.

Concluding suggestions

The use case worked through in this post demonstrates how abstractions built on
other nodes do not extend to script nodes without special considerations. The
two kinds of usage for script nodes discussed here – sources and signal
processors – are common scenarios. Having the following features built into
the ScriptProcessorNode would help avoid the problems that crop up in
straightforward usage of script nodes.

Permit creation of script nodes without inputs. This better models source
nodes. Passing 0 for “number of input channels” may be enough at the API
level.

Add start(t) and stop(t) methods to permit script nodes to be used as
signal sources, with such nodes not taking up any system resources during
their inactive periods.

Add dynamic lifetime support similar to native nodes, whereby unreferenced
“signal processor” script nodes driven by source nodes are automatically
released once the source node finishes, even if the source node is itself a
script node. To achieve this, the time at which the inputs cease must be
available as part of the event structure passed to the onaudioprocess
callback, so that the callback can begin any tail phase that it needs to
complete before it commits suicide.

Specify a convention and/or API to support tail times beyond
the time indicated in a stop(t) call or after its inputs have
been end-of-lifed.

PS: Steller’s jsnode model

The Steller library features a jsnode model that wraps the API’s script
node into a node type that can be scheduled as discussed in this post. It also
adds some conveniences such as multiple inputs (albeit single channel), and
a-rate controllable and schedulable AudioParam parameters. You can use
Steller’s jsnode model like this -