Final Touches
May 30, 2013

Thanks to all who attended the CCRMA class presentations and
Spring Concert. I was happy with the sound of my piece, Powers of Two, on
the predictably amazing 24.6-channel surround sound system.

As my (perhaps) final task for this term, I’d like to produce a binaural
mixdown of the piece. The ATK in SuperCollider looks promising.
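
In case the ATK route doesn’t pan out, one crude alternative is to convolve each
speaker feed with a head-related impulse response (HRIR) pair for that speaker’s
direction and sum into two ears. A minimal sketch, assuming a hypothetical
24-channel mixdown file and per-speaker HRIRs stored as equal-length (2, taps)
arrays:

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import fftconvolve

fs, feeds = wavfile.read("powers_of_two_24ch.wav")   # hypothetical 24-channel mix
feeds = feeds.astype(np.float64) / 32768.0           # 16-bit PCM -> [-1, 1)

out = None
for ch in range(feeds.shape[1]):
    hrir = np.load(f"hrir_{ch:02d}.npy")             # hypothetical (2, taps) HRIR pair
    left = fftconvolve(feeds[:, ch], hrir[0])        # assume all HRIRs have equal taps
    right = fftconvolve(feeds[:, ch], hrir[1])
    if out is None:
        out = np.zeros((len(left), 2))
    out[:, 0] += left
    out[:, 1] += right

out /= np.abs(out).max() + 1e-12                     # normalize to avoid clipping
wavfile.write("binaural.wav", fs, (out * 32767).astype(np.int16))
```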

Composition Notes II
May 8, 2013

Here are some more thoughts I’m having as I compose my piece. I am starting to
explore different medium- to long-time-scale gestures – that is, elements of
the piece that arise from the combination of many different processed sound
samples.

Gestures

The important gestural elements seem to be:

Spatial

Temporal

Spectral

Although the temporal and spectral distortion effects are properties of the
frequency quantization process I’m using (as I have discussed), they
contribute to an emergent gestural character when applied with different
parameters to repetitions of the same input audio samples. This combines quite
well with the way I am spatializing the piece (which I discussed
last time). The “loudspeaker orchestra” type spatialization allows the
listener to localize each sound to a particular speaker. With these gestures
distributed among multiple speakers, the piece gives the impression of a chorus
of speakers with different characteristics.

I should note, of course, that my spatial diffusion is not technically in line
with the loudspeaker orchestras of, say, the BEAST of Birmingham. Such
performances usually involve stereo audio playback, and the dynamic effects
stem from orchestrating the volumes of the different speakers.

Samples

I have been pretty happy with some simple, clean electric guitar recordings. I
broke these into about six phrases and processed them with various window
lengths and frequency-domain bit depths.

At the moment, I’m looking for audio with more percussive low-frequency
content, as well as material that yields more spectral than temporal distortion.
The guitar audio has yielded very melodic mid-range content, but I’d like to
move to a more atonal and bass-heavy section.

I’m having some success with some recordings I did of Pablo playing his
berimbau, which I slowed down and then processed with the MDCT quantizer.

Name

I’m considering Powers of Two as a name for this composition. Seeing as
powers of two arise in the MDCT window lengths, the bit depth (the number of
quantization levels being, of course, a power of two), etc., it seems fitting. I’m still thinking about it,
though.

Composition Notes
May 1, 2013

In this post, I’ll discuss some decisions that have arisen as I begin
constructing my piece. As I have mentioned, the main thrust of the piece is to
showcase the different effects that can be achieved with uniform quantization
in the MDCT domain.

Audio to process

I am immediately confronted with the choice of input audio to
process. As I mentioned in class, I had initially considered a long recording
of Romain cheerfully whistling and welding his Chanforgnophone in the
garage behind the Knoll on a recent evening. However, I was unhappy with
the results after processing. Since the signal was so complex (containing
Romain’s singing/whistling, transient metallic clunks and drawers slamming,
the low-frequency drone of the welding apparatus, and the intermittent
high-frequency noise of the welding), the effect fell somewhere along a
continuum between slightly (but annoyingly) degraded and extremely time-smeared
and distorted (though not in a controllably harmonic or melodic way).

Instead, I found it much more fruitful to take very short audio samples (around
a second) and batch-process them with (a) MDCT window sizes from 256 samples
(5.8 ms) to 2^18 = 262,144 samples (5.94 seconds), and (b) quantization bit
depths from 4 to 14. Here are some examples that I find interesting.
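
The batch run itself is just a nested parameter sweep. A sketch of what it might
look like, where process() is a hypothetical wrapper around the MDCT quantizer
(one possible implementation is sketched in the April 20 post at the bottom of
this page) and the file names are illustrative; note that a direct-form MDCT
becomes impractically slow at the largest window sizes:

```python
import itertools

win_lens = [2 ** p for p in range(8, 19)]   # 256 (5.8 ms) up to 2^18 = 262144 (5.94 s)
bit_depths = range(4, 15)                   # 4 to 14 bits

for win_len, bits in itertools.product(win_lens, bit_depths):
    process("guitar_phrase.wav",            # hypothetical input sample
            f"guitar_phrase_w{win_len}_b{bits}.wav",
            win_len=win_len, bits=bits)
```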

In the prototype piece I’m showing in class today, I explore the combination of
these extreme types of processed samples. However, I will continue to explore
different types of short audio samples to process, as well as the
middle ground of window length and bit depth parameters.

Spatialization

One other decision I made concerns spatialization, since I’d like to take
advantage of the multichannel setup we’ll have at Bing. There are
numerous ways to do this, such as ambisonics,
vector-based amplitude panning (VBAP), and the loudspeaker orchestra
approach. (In fact, I think we could also make use
of the WFS array, which seems like it will make its way to Bing for the
concert.) I have chosen to take the loudspeaker orchestra route, in the
tradition of the venerable Acousmonium and the BEAST of Birmingham.
In fact, I believe Chris Chafe’s Tomato Music uses this spatialization
technique (most recently at the Feb. 16th concert at Bing).

I believe that addressing each processed signal to one speaker is the ideal
technique, since the elements of my composition are artificial and exist
divorced from space. That is, I am not creating a virtual space with any
cues regarding reverberation – the true reverberation from the input audio is
highly distorted, if present at all, and I am not adding any digital
reverberation since the long-window processed samples achieve a similar,
if unnatural, effect. Additionally, I think of ambisonics and, to a certain
extent, VBAP, as a way to sonically hide the speakers from the listener. For
my purposes here, I’d like the audience to be aware of each speaker and the
sound it produces. Moreover, I’d like each speaker to cultivate a personality
or character through the piece. Perhaps one speaker tends to speak in the
glitchy, narrow-frequency language of the low-bit-depth processed samples, while
other speakers tend toward the higher-bit-depth, longer-window samples (and thus
speak at different time scales, with rich reverb and pre-echo dominating).
Since this piece derives from a technological process, it seems entirely
congruous and desirable that the speakers serve as focal points and performers.
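
Concretely, this routing amounts to placing each processed sample at a time
offset on exactly one channel of a 24-channel buffer. A minimal sketch with a
hypothetical event list and file names, assuming mono samples already at the
output sample rate:

```python
import numpy as np
from scipy.io import wavfile

N_SPEAKERS, FS = 24, 44100
events = [                      # (onset in seconds, speaker index, processed sample)
    (0.0, 3, "guitar_w1024_b4.wav"),
    (2.5, 17, "guitar_w131072_b12.wav"),
]

length = FS * 60                # a one-minute buffer, say
mix = np.zeros((length, N_SPEAKERS))
for onset, spk, path in events:
    _, x = wavfile.read(path)   # assumed mono, 16-bit, already at FS
    x = x.astype(np.float64) / 32768.0
    i = int(onset * FS)
    x = x[:max(0, length - i)]  # truncate anything running past the buffer
    mix[i:i + len(x), spk] += x # address exactly one loudspeaker

wavfile.write("piece_24ch.wav", FS, (np.clip(mix, -1, 1) * 32767).astype(np.int16))
```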

I look forward to playing some of my ideas and getting thoughts or comments!

Side Two (2012)
April 25, 2013

Here’s an improvised piece I performed with fellow CCRMAlites
Spencer Salazar and
Myles Borins.
Dinkelspiel Auditorium, Stanford. December 1, 2012.

for Guitar, Swarm, Gametrak and Monome

Performance from a concert with Roberto Morales-Manzanares & the CCRMA Ensemble:
https://ccrma.stanford.edu/events/roberto-morales-and-ccrma-ensemble

I’m using my guitar as input to my sonified swarm simulation
(on GitHub here). Myles is using his
monome and some Ableton/Max for Live magic. Spencer is
using the Gametrak controller to apply granular synthesis in real time to the
sounds made by Myles and me. The result was a spooky, middle-of-Echoes
soundscape.

From the
program note:

Through generative systems, algorithmic composition, gestural control, and live looping, Spencer Salazar, Timothy O’Brien, and Myles Borins create a unique and improvised soundscape. Developed over the fall quarter under the tutelage of Roberto Morales, “Side Two” is a collaboration of individuals and their respective technologies, steeped in the theory and practice of improvisation.

Spectral Characteristics of the Uniform MDCT-Domain Quantizer
April 22, 2013

As I begin to craft a musical work featuring the spectral artifacts of
MDCT-domain quantization, I decided to look at what the quantizer does to some
very basic signals. I started with a basic chirp created
in Audacity. That is, I created a pure sine wave which starts from 20 Hz
and ramps linearly
to 20 kHz over 3 minutes at a constant amplitude of 0.7. I perceived no
spectral artifacts (either from listening or from looking at the spectrograms)
resulting from 8-bit quantization of MDCT values at window lengths equal to
powers of 2 from 1024 to 131072.
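
For reproducibility, the test signal can also be generated in code rather than
in Audacity. A sketch using scipy’s chirp and the hypothetical process() wrapper
from the April 20 post below; I cap the sweep at moderate window sizes here
because the direct-form MDCT in that sketch is far too slow for the largest
windows:

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import chirp

fs, dur = 44100, 180                                  # 3-minute linear sweep
t = np.arange(int(fs * dur)) / fs
x = 0.7 * chirp(t, f0=20, t1=dur, f1=20000, method="linear")
wavfile.write("chirp_3min.wav", fs, (x * 32767).astype(np.int16))

for p in range(10, 14):                               # 1024- to 8192-sample windows
    process("chirp_3min.wav", f"chirp_w{2**p}_b8.wav", win_len=2 ** p, bits=8)
```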

Of course, it occurred to me that such slow variation of the pure sine wave
would not yield noticeable distortion except at correspondingly huge window
sizes. I then created a more condensed chirp, 30 seconds long. As you can see
from the below plot, the faster variation in frequency produced some nice
distortion. The windows, which are overlapped by 50%, smear
the frequency components in time. As we’ll see, this is especially interesting
in the form of pre-echo. Moreover, this is a nice example of the reverb-type effect that
is easy to produce with this setup. Finally, one can see that a lower bit depth
tends to concentrate the frequency components highlighted by the effect.

In the above figure, the middle plot is slightly marred by some amplitude
clipping in the output signal. Output amplitude that is too large or too small,
depending on the bit depth and window size, is something I will have to look
at closely.

Next, I created some white noise; since its spectrum is flat, the main thing to
notice here is the frequency shaping. The below plot shows a very significant
distortion in the spectral characteristics of the signal.
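
The noise test can be generated the same way (the parameters here are
illustrative):

```python
import numpy as np
from scipy.io import wavfile

fs = 44100
noise = np.random.uniform(-0.7, 0.7, fs * 30)         # 30 s of uniform white noise
wavfile.write("noise_30s.wav", fs, (noise * 32767).astype(np.int16))
process("noise_30s.wav", "noise_w4096_b8.wav", win_len=4096, bits=8)
```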

Below, I have also included a spectrogram of the audio example I played in
class. Of note is the spectral flattening that occurs, which I will also
investigate further.

Introducing My Project for CCRMA Music 220C
April 20, 2013

As part of Music 220C at Stanford’s CCRMA, I intend to explore the
creative potential of frequency-domain quantization/dequantization. This
follows on and was inspired by the work I did this past winter in
Marina Bosi’s Perceptual Audio Coding course.

However, whereas the goal of that class was to teach students how to develop
coders that achieve perceptual transparency at relatively low data rates,
I approach the same or similar algorithms from the opposite perspective.
I’m not concerned with reducing the data rate at all; instead, I’m interested in
generating as much distortion as possible, with the end goal of crafting an
electroacoustic piece showcasing the various phenomena that arise.

At this point, I have code that takes 16-bit PCM files (i.e., ordinary
.wav files) and encodes the MDCT-derived frequency components via uniform
quantization. The window length for the time-to-frequency transform is
adjustable, as is the bit depth of the quantization. (I have so far found that
8- or 12-bit uniform quantization yields significant and interesting spectral
distortion, depending on the input audio. More on that later.)
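
To make the pipeline concrete, here is a minimal sketch of that chain – not my
actual code, but the same idea: a sine-windowed MDCT with 50% overlap
(following the convention in Marina Bosi’s book), a midtread uniform quantizer
scaled to each block’s peak coefficient (one simple choice), and windowed
overlap-add resynthesis. This is the hypothetical process() helper referenced
in the posts above; the direct-form transform is O(M²) per block, so the
largest windows really want an FFT-based MDCT.

```python
import numpy as np
from scipy.io import wavfile

def mdct(block):
    """Direct-form MDCT (Bosi & Goldberg convention): 2M samples -> M coefficients."""
    M = len(block) // 2
    n = np.arange(2 * M)
    k = np.arange(M)
    phase = np.pi / M * (n[None, :] + 0.5 + M / 2.0) * (k[:, None] + 0.5)
    return np.cos(phase) @ block

def imdct(X):
    """Inverse MDCT: M coefficients -> 2M time samples (before overlap-add)."""
    M = len(X)
    n = np.arange(2 * M)
    k = np.arange(M)
    phase = np.pi / M * (n[:, None] + 0.5 + M / 2.0) * (k[None, :] + 0.5)
    return (2.0 / M) * (np.cos(phase) @ X)

def quantize_uniform(X, bits, xmax):
    """Midtread uniform quantize/dequantize over [-xmax, xmax]."""
    step = 2.0 * xmax / (2 ** bits - 1)
    return np.round(np.clip(X, -xmax, xmax) / step) * step

def process(infile, outfile, win_len=1024, bits=8):
    """Read 16-bit PCM, quantize its MDCT coefficients, and resynthesize."""
    fs, x = wavfile.read(infile)
    x = x.astype(np.float64) / 32768.0          # 16-bit PCM -> floats in [-1, 1)
    if x.ndim > 1:
        x = x.mean(axis=1)                      # mix down to mono for simplicity
    M = win_len // 2
    w = np.sin(np.pi / win_len * (np.arange(win_len) + 0.5))  # sine (Princen-Bradley) window
    x = np.pad(x, (M, M + (-len(x)) % M))       # pad so the blocks tile evenly
    y = np.zeros(len(x))
    for start in range(0, len(x) - win_len + 1, M):           # hop of M = 50% overlap
        X = mdct(w * x[start:start + win_len])
        Xq = quantize_uniform(X, bits, xmax=np.abs(X).max() + 1e-12)
        y[start:start + win_len] += w * imdct(Xq)             # windowed overlap-add
    wavfile.write(outfile, fs, (np.clip(y[M:-M], -1, 1) * 32767).astype(np.int16))
```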

Additionally, I have some simple modifications that allow me to (a) linearly
ramp the window size, and (b) smoothly cycle between window sizes.
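
The ramp and cycle modes reduce to computing a per-block schedule of window-size
exponents. A sketch of that bookkeeping (a genuinely variable-size MDCT also
needs transition windows to preserve reconstruction, which I gloss over here):

```python
import numpy as np

def ramp_schedule(n_blocks, p_start=8, p_end=14):
    """Linearly ramp the window-size exponent, snapped to powers of two."""
    p = np.linspace(p_start, p_end, n_blocks)
    return 2 ** np.round(p).astype(int)

def cycle_schedule(n_blocks, p_lo=8, p_hi=14, period=64):
    """Smoothly cycle the exponent between two extremes with a raised cosine."""
    t = np.arange(n_blocks)
    p = p_lo + (p_hi - p_lo) * 0.5 * (1 - np.cos(2 * np.pi * t / period))
    return 2 ** np.round(p).astype(int)
```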

I’d like to implement one other feature. I’ve noticed that the quantization
tends to emphasize certain frequencies in the input audio. I believe I could
modify the uniform quantization by scaling the input to the quantizer – that
is, I would take the frequency values that are about to be quantized and
scale them by some function. After dequantization, I would reverse the scaling
operation. Thus, I’d change the uniform quantization to something that’s quite
adjustable in just two extra steps. I suspect that I’d be able to “tune” the
spectral response to achieve musical results. (I should note that this scaling
process is also an approach that Marina Bosi mentions in her book as a
way to improve uniform quantization.)
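
As a sketch of the idea, here is a power-law companding wrapper around the
midtread quantizer from the sketch above; the exponent is the adjustable
“tuning” parameter (my choice for illustration – any invertible scaling
function would do), and power=1.0 recovers plain uniform quantization.

```python
import numpy as np

def companded_quantize(X, bits, xmax, power=0.5):
    """Power-law compand -> uniform midtread quantize -> inverse compand."""
    mag = np.clip(np.abs(X) / xmax, 0.0, 1.0) ** power    # forward scaling into [0, 1]
    step = 2.0 / (2 ** bits - 1)                          # midtread step size on [-1, 1]
    q = np.round(np.sign(X) * mag / step) * step          # the usual uniform quantization
    return np.sign(q) * np.abs(q) ** (1.0 / power) * xmax # reverse the scaling
```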

I have included some initial sound files demonstrating this technique. Note
that the streaming audio might only work on Chrome.