Audio and Music Production - Part 1

JACK, audio and inter-connectivity

Producing music on a computer is good fun. Not only do you get to create and record the most outlandish sounds and the most repetitive rhythms, but you can do so from the luxury of a workstation that 10 or 15 years ago would have cost the earth. This series of tutorials aims to cover a broad range of Linux audio software, from synthesizers and effects to sequencing and recording, while attempting to produce a finished and mastered piece of music suitable for burning onto a CD. This month is basically a primer on how to configure your system for real-time sound production using a synthesizer, effects and a drum machine. While we won't cover installation, all of the applications used in this tutorial are readily available, to the extent that they are usually part of any major distribution. For example, with Mandrake 10.1, everything is available in the `contribs' RPM repository, and installation should be as easy as

urpmi package-name

While Linux may not initially seem like the ideal platform for music production, it does actually have quite a lot going for it. With good-quality drivers, an innovative connection protocol and several feature-rich sequencing and recording packages, there is currently a greater amount of audio production software available for Linux than there was for Windows only a few years ago, and it's expanding rapidly. While it could be said that the packages available for Linux are sometimes a little eccentric, this only means that for certain types of music, Linux is ideal.

For other genres, you just need to adapt your working methods a little: even if you don't appreciate the finer aspects of Scandinavian minimalist electro, Linux is still perfectly capable. It wasn't so long ago that music production involved considerable investment in some serious hardware. Not only did you need a wide variety of instruments to make the actual sounds (and the ability to get something meaningful out of them), you needed something to mix the sounds together, as well as a way to record your masterpiece. As computers have become increasingly powerful, more and more of the actual recording process has moved away from external hardware to the PC, to such an extent that in certain genres of production, everything from the composition and recording to the final stages of mastering can be achieved within software alone.

Studio time

Until recently, Linux lagged considerably behind both Microsoft's Windows and Apple's OS X in its audio production capabilities. It lacked both the hardware and the software support, which is hardly surprising considering the amount of investment these kinds of products take to develop. However, things are changing, and it genuinely feels as though Linux is on the cusp of breaking into a more mainstream position for audio work and, more importantly, bringing some of its own open philosophy with it. Where previously a recording studio may have used a Linux box for sharing audio files or as a web server, there are now studios bringing Linux closer to the actual recording process.

It may seem obvious, but perhaps the most important thing to consider for audio work is the soundcard. It's here that sound enters a system, and any artefacts or problems with distortion that enter the recording at this point are going to be very difficult to resolve later. The quality of a card is very difficult to judge. Even though a manufacturer often prints various specifications for its product, there can still be a wide range in the quality of the output. When the technical specifications of certain consumer cards are compared with those of professional or semi-professional cards, they may appear to offer broadly similar performance for very little outlay. However, you would almost certainly find that the sound is considerably inferior. This is usually down to the quality of the analogue-to-digital converters, but other things, such as internal clock stability or circuit shielding (essential to avoid recording the sound of your own hard drive), affect the overall sound quality.

When using a system for audio production, you need it to be responsive. This not only means that when you lower the volume, it gets quieter immediately, but also that when you play a note on a virtual synthesizer, you don't have to wait half a second for the note to come out of your speakers. The responsiveness of a system, measured as its latency, is usually a balance between the soundcard and the driver. Started in 1999, the Advanced Linux Sound Architecture (ALSA) project aimed to unify the audio sub-system for the Linux kernel, with the intention of replacing the then ageing Open Sound System with a well-designed, low-latency platform for ALSA audio drivers and software to communicate with one another. After five years of development, version 1 of the ALSA sound system can be found as the default audio layer in all 2.6-level kernels, and it's thanks to its brilliant design and modularisation that Linux has the potential to become a killer platform for audio production.

The next most important part of the chain is the quality of the soundcard driver. A poorly designed driver can make the difference between having a constant struggle with stability and response, and being simply a transparent part of your system. With other operating systems, this is usually down to the manufacturer's resources and experience. With Linux, it's usually down to how open the manufacturer has been with the volunteers that develop the drivers for themselves. Two professional soundcard manufacturers known for their support of Linux are M-Audio (now owned by Digidesign/Avid) and RME.

The main disadvantage with ALSA is a direct result of its power: its complexity. ALSA not only gives you the choice of which modules to include or exclude from kernel space, but also direct control over all the various internal routing options available for your audio hardware. Unlike drivers for other systems that restrict you to the manufacturer's (usually rather limited) configuration, ALSA gives you complete freedom over internal routing. With the Creative Audigy card, for example, you can freely assign any of the inputs through an internal matrix, mix them with internal sound sources and send them to any of the outputs, all through ALSA. This can be a little overwhelming, especially if you happen to load the ALSA mixer only to see 50 or more channels to control, and it's often better to leave them at their default values. However, it's good to know the power is there if you need it.
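To see that channel count for yourself, ALSA's command-line mixer can list every control the driver exposes. A quick sketch (the card number 0 is an assumption; yours may differ):

```shell
# Count the simple mixer controls on the first soundcard;
# on a Creative Audigy this can easily reach 50 or more:
amixer -c 0 scontrols | wc -l
```

The same controls appear in graphical mixers such as alsamixer, which is usually a friendlier place to explore them.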

Another innovation that makes Linux audio so promising is Jack. This is the inter-connectivity layer that sits on top of ALSA and enables all the various audio utilities and applications built to support Jack (and most are) to share audio and MIDI data. MIDI is the protocol traditionally used for interconnecting physical pieces of hardware, such as synthesizers to sequencers. As that hardware has been translated into virtual software equivalents, the need for inter-operability remains, in the same way there's a need for inter-process communication.

Jack of all trades

Jack shares the same paradigm that you would use in a recording studio filled with various bits of hardware. The synths are connected to the mixing console, from where the audio signal can be routed to various effect units, before being brought back to the console and sent to a recording device, such as a hard disk recorder or tape machine. Jack enables you to make all the same connections, just virtually: there's no need for physical cables and clutter, and you can route and re-route connections freely, as long as the software is compatible. That said, without good management your Jack connections can soon look like the virtual equivalent of a studio with wires all over the place. Jack has become so useful that it has even been ported to OS X!

There are other considerations when configuring your system for music production. To improve system performance and the latency of your hardware, it's much better to use a kernel configured for multimedia applications. The most stable approach is to use a 2.4-series kernel built with the pre-emptive patches. It's still possible to make similar improvements to the 2.6 kernel, but instead of the pre-emptive patches, you need to build and install the real-time kernel module. Unless there's a pre-built kernel for your distribution, it's only really worth the effort if you plan to use Linux extensively for audio, since you can still get perfectly respectable results with the kernel that's already installed.
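As a rough sketch of what that involves on a 2.6 system: pre-emption is a build-time option, while the real-time module is loaded afterwards and granted to your audio group. The module name and group ID below are assumptions; check your distribution's documentation and /etc/group:

```shell
# Kernel build-time option (set via make menuconfig):
#   CONFIG_PREEMPT=y
# Afterwards, load the realtime LSM module so that members of the
# audio group (gid assumed to be 29 here) can request real-time
# scheduling without running as root:
modprobe realtime gid=29
```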

Once you've got Jack and ALSA installed and configured for your sound hardware, the next thing to do is actually run Jack. While this can be done from the command line, Jack is easier to manage through a graphical interface, especially when it comes to the various connections that need to be made between audio applications, and one of the best utilities for managing Jack is qjackctl.
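For reference, a bare command-line invocation of the Jack server looks something like the following; the flags are a sketch and worth checking against `jackd --help` on your system:

```shell
# Start Jack with the ALSA backend on the first card (hw:0),
# at 44.1kHz, with 1024 frames per period and 2 periods per buffer:
jackd -d alsa -d hw:0 -r 44100 -p 1024 -n 2
```

qjackctl does nothing more mysterious than build a command line like this from the options you choose in its Setup page.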

Before pressing the Start button in qjackctl to launch the Jack server, it needs to be configured by clicking on Setup, which presents most of Jack's configuration options. If you've managed to configure and install a real-time capable kernel, it's better to change the Server Path to jackstart; otherwise it can be left as jackd. Obviously, the audio driver should be configured to use ALSA for best performance. The interface can be left on default, but changing it to the physical interface you use (by selecting hw:0, for example) will improve performance. You also need to make sure that the sample rate is set to a sensible number. It's usually better to use 44100 if you can, because this avoids re-sampling the audio if you need to burn it directly to a CD, but some cards (most notably those by Creative) are locked at 48000. After this, the most important parameter on the Settings page is the Frames/Period setting, which specifies the size of the buffer for the real-time processing of audio: the lower it is, the lower the latency or delay. A sensible setting to start with is 1024 (which gives an overall latency of about 42ms at a sample rate of 48K), and if your system behaves itself without producing crackles or clicks, you can easily reduce this. Better soundcards can run with a Frames/Period setting of 128 or less, which gives a barely perceptible latency of around 6ms.
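Those latency figures follow directly from the buffer settings. Assuming Jack's default of two periods per buffer, the overall delay is frames-per-period multiplied by the number of periods, divided by the sample rate. You can check it with plain shell arithmetic (integer milliseconds, so the results are rounded down slightly):

```shell
# 1024 frames x 2 periods at 48000Hz -> about 42ms:
echo $(( 1024 * 2 * 1000 / 48000 ))
# 128 frames x 2 periods at 44100Hz -> just under 6ms:
echo $(( 128 * 2 * 1000 / 44100 ))
```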

Once Jack is configured, you can safely apply the settings and start the server. If all has gone well, the status display in qjackctl will say Started and should show a running percentage of currently used CPU power (higher CPU usage has an adverse effect on the performance of the Jack server). When problems occur, such as the Frames/Period being set too low, errors called xruns are generated as Jack skips bits of audio to keep up. When this happens, it's reported in the Message window, and the total number of errors encountered in the current session can be found in the field just below the running status in the qjackctl status display.

The easiest way to illustrate how all this hangs together is to create a simple setup that generates some sound. Before doing anything else, launch qjackctl and start the server. Once the server is running, press Connect to open up the Connections pane. Without any other Jack-compatible programs running, all you should see in the Audio window are your soundcard's physical connections, with its inputs (capture ports) under Readable Clients and its outputs (playback ports) under Writeable Clients. MIDI connections would be shown under the MIDI tab.
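The same information is available from a terminal via Jack's `jack_lsp` utility, which is handy for checking exactly what your card's ports are called, since the names vary between drivers:

```shell
# List every port the running Jack server knows about:
jack_lsp
# The same list, with current connections shown beneath each port:
jack_lsp -c
```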

Sound generation

A good sound source to use is ZynAddSubFX, a fairly comprehensive synthesizer. When first launched, it's easiest to select the Beginner user interface when prompted. The lower third of the synth's screen is taken up with a virtual keyboard you can click on to create a sound. Alternatively, when the synth's window is active, you can use your QWERTY keyboard to play two octaves in a `virtual keyboard' arrangement. Pressing a key should make the level meters under the keyboard jump, showing that the synth is generating some output, but you probably won't be able to hear anything. The reason for this is that the synth isn't `virtually' wired up in Jack. If you've still got the qjackctl Connect window open, you should be able to see that the synth has been added to the Readable Clients, which now show the new instance of ZynAddSubFX. To hear anything from the synth, you need to select it from the Readable Clients list, choose the corresponding output for your soundcard under Writeable Clients and click on Connect. Going back to the synth and playing the keyboard now should finally produce a sound.
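Incidentally, the same connection can be made from the command line with `jack_connect`. The port names below are assumptions for a typical setup, so check them with `jack_lsp` first:

```shell
# Wire the synth's stereo outputs to the card's playback ports
# (both sets of port names are assumed -- verify with jack_lsp):
jack_connect ZynAddSubFX:out_1 alsa_pcm:playback_1
jack_connect ZynAddSubFX:out_2 alsa_pcm:playback_2
```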

By default, the sound generated is very simple. It's based on additive synthesis, and consists of a single sine wave following a simple envelope, which gives the sound its distinctive attack, followed by the fading decay. A single sine wave doesn't contain any other harmonics, which is why it sounds `pure' in the way it does. As a contrast, the waveform that contains the most harmonics is called a sawtooth; as the name suggests, it looks like the teeth of a saw. To change the current sound to use a sawtooth waveform, you first need to take a deep breath, then open the Expert view (Misc > User Interface Mode > Yes), click on the Edit Instrument box, then the large Edit underneath Addsynth, and finally the small Change button in the lower left corner. You should now be in the ADsynth Oscillator Editor, which shows a graphical representation of the sine wave that's generating the sound. To change this to a sawtooth, choose Saw from where it currently says Sine. To live in denial and hide all this complexity, select Misc > Switch User Interface in the first window to get back to the original screen.
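For the curious, the harmonic claim can be made precise: an ideal sawtooth contains every harmonic of the fundamental, with the nth partial falling off in amplitude as 1/n:

```latex
\operatorname{saw}(t) = \frac{2}{\pi}\sum_{n=1}^{\infty} (-1)^{n+1}\,\frac{\sin(n\omega t)}{n}
```

The sine wave the synth starts with is just the n = 1 term on its own, which is why it sounds so plain by comparison.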

Rack 'em up

When you now play the sound, you should notice that as a result of its greater harmonic content, it sounds richer and has a considerably different timbre to the original. This kind of sound is perfect for passing through a flanging effect, and we can easily do that with Jack. The virtual equivalent of a rack of effects for Jack is, unsurprisingly, called `Jack Rack', and once launched it presents very little to give any idea of what it actually does. You may have noticed, though, that a pair of inputs and outputs has appeared in the Connections pane of qjackctl. Jack Rack is a way of processing audio in the same way that external effects units do. To wire Jack Rack into the audio path, click on Disconnect All in the qjackctl Connections window and connect the ZynAddSubFX output to the jack_rack input. Now connect the jack_rack output to the alsa_pcm input.
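The same re-wiring can also be scripted from a terminal; once again the port names are assumptions to be checked against `jack_lsp -c` on your system:

```shell
# Break the direct synth-to-card link, then route through the rack
# (all port names assumed -- verify with jack_lsp -c):
jack_disconnect ZynAddSubFX:out_1 alsa_pcm:playback_1
jack_connect ZynAddSubFX:out_1 jack_rack:in_1
jack_connect jack_rack:out_1 alsa_pcm:playback_1
```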

Jack Rack is only compatible with LADSPA effects and, despite their terrible name (it stands for Linux Audio Developer's Simple Plugin API), they're becoming considerably more numerous. They are Linux's equivalent of the ubiquitous VST effects, or Audio Unit effects for the Mac. The flanger effect we need is from the SWH-Plugins package and is accessed by selecting Add > F > Flanger. This adds the flanger effect to the chain, but to hear it you first need to enable it by pressing the Enable button. The sound should change again as it goes through the effect. Try holding several keys down in the synth window, then selecting Jack Rack with the mouse and adjusting the flanger's controls while the notes sustain.

Of course, every minimalist piece of synthesizer music needs some kind of drum beat, and the best way to get this in Linux is with a drum machine called Hydrogen. This is a very capable drum machine that features user-configurable drumkits, pattern sequencing and a really comprehensive mixing section. Unlike the other software here, Hydrogen needs to be told to use Jack, as by default it's configured to use ALSA and complains if the sound device is already in use.

Once the program is running, just go to File > Preferences, select the Audio System tab and change ALSA to Jack, then click on Restart Driver. Once the driver has restarted, you should see that not only has Hydrogen appeared in the Connections window, but it has connected itself to the alsa_pcm outputs by default. As it stands, this is what we would want anyway: Jack automatically takes care of mixing all the signals together when there's more than one connection at any one point. To get a drum beat running as soon as possible, try using File > Open Demo > GM_kit_demo1.h2song, and press Play. What you get is the kind of beat you used to get out of Casio keyboards, but it shows what Hydrogen is capable of.

If you're new to all this, it can be overwhelming, but we've already covered the basics of audio under Linux. Jack acts as the central hub for all things, connecting and interconnecting sequencing applications, synthesizers, drum machines and effects. In next month's tutorial we'll attempt to bring together the various parts of this introduction with the start of a composition using Rosegarden, a KDE sequencer and audio recording package, as we take the next step in composing a complete piece of music from within Linux.