February 23, 2008

A couple of years ago, Pau Guillamet and I wrote a series of tutorials on Linux audio tools for the mainstream Spanish magazine Personal Computer & Internet.
Well, after all this time the tutorials finally see the light!

Actually, I have had the editors’ permission to release them online for some time; but this last week I had the perfect excuse to work on their formatting (using wiko, of course), since I gave a seminar on this topic at the ESMUC. A seminar that, by the way, I enjoyed very much giving, and I wouldn’t mind repeating the experience!

Indeed, some applications (Ardour especially, I’d dare say) have changed a lot during this lapse of time. And other apps have not changed that much. Anyway, I hope it can be of some use to people willing to introduce themselves to the power of the Linux audio tools.

January 25, 2008

While it is true that the clam-devel mailing list and IRC channel have been a little quiet recently (especially compared with the summer period; well, it was called “summer of code” for a good reason!), this doesn’t mean that development activity has been low. (Being an open-source project, the commits say it all.)

The quietness is related to David and me now being involved with the acoustics group of the Fundació Barcelona Media, where we work in a more traditional (and so less distributed) fashion, collaborating with people who actually sit together. Moreover, I enjoy very much working with such a skilled and interdisciplinary team (half physicists, half computer scientists), and seeing that Clam is proving very useful in these 3D-audio projects. These latest developments on 3D audio rendering were mostly driven by the IP-RACINE European project, which aims to enhance digital cinema.

The kind of development we do in Clam has also changed since last summer. Instead of improving the general infrastructure (for example, the multi-rate data-flow system or the NetworkEditor) or the existing signal processing algorithms, what we’ve done is… writing plugins. Among many other things, the new plugins feature a new lightweight spectrum and FFT, and efficient low-latency convolutions.

And this feels good. Not only because the code-compile cycle is sooo fast, but because it means that the framework infrastructure is sufficiently mature and its extension mechanisms are very useful in practice. Further, rewriting the core spectral processing classes allowed us to simplify a lot of the new code and its dependencies. Therefore, the new plugins only depend on the infrastructure, which I’d dare to say is the most polished part of Clam.

And now that the IP-RACINE final project demos have been successfully passed, it is a great time to show some results here.

Flamencos in a virtual loft

Listen to it carefully through headphones (yes, it will only work with headphones!). You should be able to hear the scene as if you were actually moving through it, identifying the direction and distance of each source. It is not made by just automating panning and volumes, but by modelling the room, taking into account how the sound rebounds off all its surfaces. This is done with ray-tracing and impulse-response techniques.

This stereo version has been made using 10 HRTF filters. However, our main target exhibition setup was 5.0 surround, which gives a better immersive sensation than the stereo version. So try it if you have surround equipment around:

Credits: images rendered by Brainstorm Multimedia, audio rendered by Barcelona Media, and music performed by “Artelotú”.

Well, the flamenco musicians in the video were supposed to be real actors. Ah! Wouldn’t it have been nice?

What was planned

The IP-Racine final testbed was all about integration work-flows among different technological partners. The whole audio work-flow is very well explained in this video (Toni Mateos speaking, briefly featuring me playing with NetworkEditor).

So, one of the project outcomes was this augmented-reality flamencos video in a high-definition digital cinema format. To that end, a chroma set was set up (as the picture below shows), and it was to be shot with a high-end prototype video camera with position and zoom tracking. The tracking metadata stream fed both the video and the audio rendering, which took place in real time: all quite impressive!

The shooting of the flamenco group “Artelotú” in a chroma set

Unfortunately, at the very last moment a little demon jumped in: the electric power got unstable for a moment, and some integrated circuits of the high-end camera literally burned.

That’s why the flamencos are motionless pictures. Also, in the absence of a camera with a position-tracking mechanism, we chose to freely define the listener path with a 3D modelling tool.

How we did it

In our approach, a database of pressure and velocities impulse-responses (IRs) is computed offline for each (architectural) environment using physically based ray-tracing techniques. During playback, the real-time system retrieves IRs corresponding to the sources and target positions, performs a low-latency partitioned convolution and smoothes IR transitions with cross-fades. Finally, the system is flexible enough to decode to any surround exhibition setup.
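
To make the transition smoothing concrete, here is a minimal sketch (illustrative code, not the actual FBM/CLAM implementation; a naive direct-form convolution stands in for the low-latency partitioned one, and the function names are mine):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Direct-form convolution of a dry signal with an impulse response.
// The real-time system uses a low-latency partitioned FFT convolution
// instead, but the resulting signal is the same.
std::vector<float> convolve(const std::vector<float>& x, const std::vector<float>& ir)
{
    std::vector<float> y(x.size() + ir.size() - 1, 0.f);
    for (std::size_t i = 0; i < x.size(); ++i)
        for (std::size_t j = 0; j < ir.size(); ++j)
            y[i + j] += x[i] * ir[j];  // accumulate each shifted, scaled copy of the IR
    return y;
}

// Linear crossfade from the rendering made with the outgoing IR to the
// one made with the incoming IR, smoothing the transition as positions change.
std::vector<float> crossfade(const std::vector<float>& oldRender, const std::vector<float>& newRender)
{
    const std::size_t n = std::min(oldRender.size(), newRender.size());
    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i)
    {
        const float t = n > 1 ? float(i) / float(n - 1) : 1.f;  // ramps 0 -> 1
        out[i] = (1.f - t) * oldRender[i] + t * newRender[i];
    }
    return out;
}
```

In the real-time system the two renderings come from convolving the same dry frame with the old and the new IR, and the crossfade hides the switch between them.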

The audio rendering (both real-time and offline) is done with Clam, while the offline IR calculation and 3D navigation are done with other tools.

The big thanks

This work is a collaborative effort, so I’d like to mention the whole FBM acoustics/audio group: Toni Mateos, Adan Garriga, Jaume Durany, Jordi Arques, Carles Spa, David García and Pau Arumí. And of course we are thankful to everyone who has contributed to Clam.

And last but not least, we’d like to thank “Artelotú”, the flamenco group that put the duende in such a technical demo.

Lessons for Clam

To conclude, this is my quick list of lessons learnt while carrying out this project with Clam.

The highly modular and flexible approach of Clam was very well suited to this kind of research-while-developing. The multi-rate capability and the data-type plugins were especially relevant.

The data-flow and visual infrastructure is sufficiently mature.

Prototyping and visual feedback are very important while developing new components. The NetworkEditor data monitors and controls were the most valuable debugging aids.

December 20, 2007

When I upgraded Ubuntu from Feisty to Gutsy, the newer freebob firewire-audio driver broke the support for the Focusrite Saffire Pro. I have not tested the non-pro versions (Saffire and Saffire LE), but I guess it applies to them as well.

The main freebob/ffado developer quickly identified the problem and proposed a provisional patch, which is what I’m gonna explain in this post. For a more definitive solution we’ll probably have to wait for the next ffado release. Hopefully soon.

First, the symptoms: this is how the Gutsy freebob complains when starting jack with a Saffire Pro.

The problem is related to the sample-rate interface. The quick solution is to not use that interface; it’s just a matter of commenting out a piece of code.

Update, 9th March:
Pieter Palmers pointed out to me that from version 1.0.9 there is a ./configure switch that does what the patch below does. So you can safely skip the part of this post about downloading the 1.0.7 version and applying the patch.

There is a problem here: for some reason libtool (or some package that includes it) is missing from the debian/control file. This seems to me a packaging bug, and it makes autoreconf give errors like this one:

possibly undefined macro: AC_ENABLE_STATIC

So, install libtool:

sudo apt-get install libtool

And build:

autoreconf -f -i -s
./configure
make
sudo make install

To make sure that jack will take the new libfreebob (in /usr/local), we will hide the current freebob libs from jackd, I mean the ones installed by the Debian package.
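
The hiding itself is just a couple of commands. A sketch, assuming the package installed the libs as /usr/lib/libfreebob.so* (check with ls /usr/lib/libfreebob* first, since the exact paths may differ):

```shell
# move the packaged freebob libs out of the dynamic linker's sight
sudo mkdir -p /usr/lib/freebob-hidden
sudo mv /usr/lib/libfreebob.so* /usr/lib/freebob-hidden/
# rebuild the loader cache so jackd now resolves the copy in /usr/local
sudo ldconfig
```

To undo it, just move the files back and run ldconfig again.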

I’m very thankful to Pieter Palmers for the quick help on the #ffado IRC channel.

My next posts will talk about the reason I needed firewire audio in Linux. It was related to a real-time 3D audio exhibition developed with CLAM. It just happened yesterday and I’m very happy that it all worked really well.

// prepare a 3 seconds buffer and write it
const int size = sampleRate*3;
float sample[size];
float current=0.;
for (int i=0; i<size; i++)
	sample[i] = current; // (the rest of the loop body, which updated 'current', got truncated in the original post)

You’ll find a complete reference on the available formats in the sndfile API doc. But these are typical subformats of the Wav format. As in the example above, put them after the SF_FORMAT_WAV | portion:

SF_FORMAT_PCM_16
SF_FORMAT_PCM_24
SF_FORMAT_PCM_32
SF_FORMAT_FLOAT

December 16, 2007

In my last how-to we played a multichannel video with surround audio (encoded with ac3) using mplayer and jack. Jack (the audio server and router) is perfect for firewire audio interfaces. However, the Ubuntu Gutsy mplayer is not jack-enabled by default.

To see the list of audio backends:

mplayer -ao help

However, it is very easy to compile a new jack-enabled mplayer .deb package. First, create a temp dir (for example, debs) and change to it. Then install libjack with its development headers:

sudo apt-get install libjack-dev

Install all the packages to build the mplayer-nogui package. This is done in a single command:

sudo apt-get build-dep mplayer-nogui

Now just get the source package and build it (the -b option is for build). This will take time.

sudo apt-get -b source mplayer-nogui

The mplayer build system automatically searches for libjack headers and when found configures its build system to use jack.

Finally, install the new packages (dpkg complains with an “error processing mplayer”, which you can ignore):

sudo dpkg -i *.deb

To use it, choose the jack audio driver with -ao jack. A nice feature is that it automatically connects all the outputs to the physical device inputs.

mplayer -ao jack someaudiofile.mp3

If your file has more than 2 channels:

mplayer -ao jack -channels 5 surround.wav

As a final note, I’ve checked that the Debian (not Ubuntu) package is jack-enabled by default. The reason is that their packaging scripts differ a little bit (specifically, the debian/control file): the Ubuntu one doesn’t define the jack development build dependency.

December 12, 2007

AC3 is a compressed (lossy) audio format that allows multichannel (from 1 to 6 channels) and is widely used in DVDs and video (mpeg2 and mpeg4) files.

Surprisingly, I found creating multichannel ac3 files in Linux much harder than expected. This is basically because most of my searches led to experiences with ffmpeg and no other tools. However, the fact is that encoding more than 2 channels doesn’t work (at least I wasn’t able to make it work) using the ffmpeg version that comes with Ubuntu Gutsy.

For example, this seemingly correct ffmpeg command produced an ac3 file with correct metadata, but totally silent and useless:

Embed the created .ac3 audio into a video

It is very easy to do using avidemux, a very nice graphical application designed for video encoding and cutting tasks. Use the menus to Open the video file, then choose Audio, Main Track, select external AC3, and Save Video.

December 3, 2007

Text processing in Python is so easy that I don’t feel like doing that kind of programming in C++ at all. However, sometimes I don’t have the choice. For example, I just had to implement a CLAM Processing that loads a big table of float data from a text file. Specifically, this is done in the processing’s ConcreteConfigure method.

What annoys me most about C++ file streams is the time I spend understanding their API, let alone remembering it! So, since I’ve just implemented a quite generic solution that loads a table of floats, let’s blog about it and keep it at hand.

The input file format is actually very flexible in terms of separators and the number of columns per row. The only requirement is that everything can be read as a float.
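
The actual CLAM code isn’t reproduced here, but a minimal sketch of such a loader could look like this (the function name is mine, and it assumes whitespace separators; other separators would need a little preprocessing of each line):

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read a text stream as a table of floats: one row per non-empty line,
// any whitespace as separator, any number of columns per row.
std::vector<std::vector<float>> loadFloatTable(std::istream& input)
{
    std::vector<std::vector<float>> table;
    std::string line;
    while (std::getline(input, line))
    {
        std::istringstream row(line);
        std::vector<float> values;
        float value;
        while (row >> value)        // stops at the first token that isn't a float
            values.push_back(value);
        if (!values.empty())        // skip blank lines
            table.push_back(values);
    }
    return table;
}
```

In the Processing’s ConcreteConfigure you would open the configured file with std::ifstream and hand it to this function, failing the configuration if the resulting table is empty.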