vosim

Description

This opcode produces a simple vocal simulation based on glottal pulses with formant characteristics.
Output is a series of sound events, where each event is composed of a burst of squared sine pulses followed by silence.
The VOSIM (VOcal SIMulation) synthesis method was developed by Kaegi and Tempelaars in the 1970s.

kForm - formant center frequency. Length of each pulse in the burst is 1/kForm seconds.

kDecay - a damping factor applied from pulse to pulse. It is subtracted from the amplitude at each new pulse.

kPulseCount - number of pulses in the burst part of each event.

kPulseFactor - the pulse width is multiplied by this value at each new pulse.
This results in formant sweeping. If the factor is < 1.0, each new pulse is shorter, so the formant sweeps up; if it is > 1.0, each new pulse is longer,
so the formant sweeps down. The final pitch of the formant is kForm * pow(kPulseFactor, kPulseCount).

The output of vosim is a series of sound events, where each event is composed of a burst of squared sine pulses followed by silence.
The total duration of the events determines fundamental frequency.
The length of each single pulse in the squared-sine bursts produces a formant frequency band. The width of the formant is determined by the ratio of silence to pulses (see below). The final result is also shaped by the damping factor from pulse to pulse.
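The structure of one event can be illustrated outside Csound. The following is a minimal Python sketch, not the opcode's implementation: it assumes a pulse factor of 1, computes the squared sine directly instead of reading a table, and uses the hypothetical names render_event, sr and amp for the function, the sample rate and the event amplitude.

```python
import math

def render_event(sr, amp, k_fund, k_form, k_decay, k_pulse_count):
    """Render one event: a burst of squared-sine pulses, then silence.

    Illustrative only; assumes kPulseFactor == 1 and positive k_fund, k_form.
    """
    event_len = int(sr / k_fund)   # samples in the whole event
    pulse_len = int(sr / k_form)   # samples in one pulse
    out = [0.0] * event_len        # untouched samples remain silence
    pos = 0
    for _ in range(k_pulse_count):
        if pos + pulse_len > event_len:
            break                  # a pulse that would outlast the event is not started
        for i in range(pulse_len):
            out[pos + i] = amp * math.sin(math.pi * i / pulse_len) ** 2
        pos += pulse_len
        amp -= k_decay             # damping is subtracted at each new pulse
    return out

event = render_event(sr=8000, amp=1.0, k_fund=100.0, k_form=400.0,
                     k_decay=0.1, k_pulse_count=3)
```

With these values the event is 80 samples long: three pulses of 20 samples each, with decreasing peaks, followed by 20 samples of silence.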

A small practical problem in using this opcode is that no GEN function will create a squared sine wave out of the box. Something like the following can be used to create the appropriate table from the score.
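Independently of the score, the intended table contents can be sketched in Python. This is an illustration only, not Csound score code: it assumes that the table should hold a single squared-sine pulse (half a sine cycle, squared) plus one guard point, and squared_sine_table is a hypothetical helper name.

```python
import math

def squared_sine_table(size=8192):
    """Return size + 1 samples of sin^2 over half a sine cycle (0..pi).

    The last sample is an assumed guard point and is (numerically) zero.
    """
    return [math.sin(math.pi * i / size) ** 2 for i in range(size + 1)]

table = squared_sine_table(8)
```

The result is a single smooth hump that starts and ends at zero and peaks at 1.0 in the middle of the table.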

The count of pulses multiplied by the pulse width should fit in the event length (1/kFund).
If it does not, the algorithm does not break; it simply does not start any pulse that would outlast the event.
This might introduce a silence at the end of the event even if none was intended.
Consequently, kForm should be higher than kFund, otherwise only silence is output.
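The fitting rule can be checked with simple arithmetic. The sketch below works in seconds and assumes a pulse factor of 1; the opcode itself may count in samples, so exact boundaries can differ. pulses_started is a hypothetical helper name.

```python
import math

def pulses_started(k_fund, k_form, k_pulse_count):
    """Return (pulses actually started, trailing silence in seconds).

    Assumes kPulseFactor == 1, so every pulse lasts 1/k_form seconds.
    """
    event_len = 1.0 / k_fund
    pulse_len = 1.0 / k_form
    # Whole pulses that fit in one event; the small epsilon guards against
    # floating-point round-off when the ratio is an exact integer.
    fits = math.floor(k_form / k_fund + 1e-9)
    started = min(k_pulse_count, fits)
    silence = max(0.0, event_len - started * pulse_len)
    return started, silence

print(pulses_started(100.0, 500.0, 3))  # burst fits, silence is intended
print(pulses_started(100.0, 450.0, 8))  # burst is cut short
print(pulses_started(100.0, 80.0, 8))   # kForm below kFund
```

The three calls start 3, 4 and 0 pulses respectively. In the second case only four of the eight requested pulses fit and a short unintended silence remains; in the third case the whole event is silent.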

Vosim was created to emulate voice sounds using a model of glottal pulse.
Rich sounds can be created by combining several instances of vosim with different parameters.
One drawback is that the signal is not band-limited. But as the authors point out, attenuation of high-pitch components is -60 dB
at 6 times the fundamental frequency. The signal can also be changed by changing the source signal in the lookup table.
The technique has historical interest, and can produce rich sound very cheaply (each sample requires only a table lookup and a single multiplication for attenuation).

As stated, formant bandwidth depends on the ratio between pulse burst and silence in an event.
But this is not an independent parameter: the fundamental decides the event length, and the formant center defines the pulse length. It is therefore impossible to guarantee a specific burst/silence ratio, since the burst length has to be an integer multiple of the pulse length. The decay of pulses can be used to smooth the transition from N to N±1 pulses, but there will still be steps in the spectral profile of the output. The example code below shows one approach to this.
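The stepping can be made concrete with a small calculation. The following Python sketch is separate from the opcode's own example code: it assumes a pulse factor of 1, and nearest_pulse_count and target_fraction (the desired share of the event occupied by the burst) are hypothetical names.

```python
def nearest_pulse_count(k_fund, k_form, target_fraction):
    """Pick the pulse count whose burst share is closest to target_fraction.

    Returns (pulse count, burst share actually achieved).
    Assumes kPulseFactor == 1 and k_form >= k_fund > 0.
    """
    if k_fund <= 0 or k_form < k_fund:
        raise ValueError("need k_form >= k_fund > 0")
    pulses_per_event = k_form / k_fund           # usually not an integer
    n = round(target_fraction * pulses_per_event)
    n = max(1, min(n, int(pulses_per_event)))    # at least one pulse, and it must fit
    return n, n / pulses_per_event

for fund in (100.0, 110.0, 130.0):
    print(fund, nearest_pulse_count(fund, 600.0, 0.5))
```

With kForm at 600 Hz and a target share of one half, a fundamental of 100 Hz gives exactly 0.5 with three pulses, while 110 Hz gives 0.55 (still three pulses) and 130 Hz gives about 0.43 (two pulses). The achieved share drifts and then jumps as the fundamental moves, which is the stepping described above.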

All input parameters are k-rate. The input parameters are only used to set up each new event (or grain). Event amplitude is fixed for each event at initialization.
In normal parameter ranges, when ksmps < 500, the k-rate parameters are updated more often than events are created. In any case, no wide-band noise will be injected into the system by k-rate inputs being updated less often than they are read,
but some other artefacts could be created.

The opcode should behave reasonably in the face of all user inputs. Some details:

kFund < 0: This is forced to positive - no point in "reversed" events.

kFund == 0: This leads to an "infinite" length event, i.e. a pulse burst followed by an indefinitely long silence.

kForm == 0: This leads to infinite length pulse, so no pulses are generated (i.e. silence).
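The three cases above can be summarized as a small Python sketch. It models only what is listed here, works in seconds, and assumes k_form is not negative (that case is not described above); event_setup and produces_sound are hypothetical names, not part of Csound.

```python
import math

def event_setup(k_fund, k_form):
    """Return (event_len, pulse_len) in seconds for the cases listed above."""
    fund = abs(k_fund)                                      # kFund < 0 is forced positive
    event_len = math.inf if fund == 0 else 1.0 / fund       # kFund == 0: endless event
    pulse_len = math.inf if k_form == 0 else 1.0 / k_form   # kForm == 0: endless pulse
    return event_len, pulse_len

def produces_sound(k_fund, k_form):
    """True when at least one whole, finite pulse fits in the event."""
    event_len, pulse_len = event_setup(k_fund, k_form)
    return math.isfinite(pulse_len) and pulse_len <= event_len

print(event_setup(-100.0, 500.0))
print(produces_sound(0.0, 500.0))
print(produces_sound(100.0, 0.0))
```

A negative fundamental yields the same lengths as its positive counterpart, a zero fundamental still produces one burst, and a zero formant frequency produces none.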