PortAudio: Portable Audio Processing for All Platforms

To write an audio application that samples, edits, or otherwise manipulates sound, the first decision you have to make is choosing which platform you want to lock yourself into. After all, even the most basic real-time audio playback functions are close to the bare metal of the operating system. If you're going to put time and maybe money into an audio development effort, of course you want the widest swath of platforms for release. PortAudio answers the call by delivering a free, cross-platform, open-source audio I/O library. It lets you write simple audio programs in C that will compile and run on many platforms, including Windows, Mac, and Linux/Unix.

PortAudio, which provides a very simple API for recording and/or playing sound using a simple callback function, is intended to promote the exchange of audio synthesis software between developers on different platforms. It includes example programs that synthesize sine waves and pink noise, perform fuzz distortion on a guitar, list available audio devices, and much more. Carnegie Mellon University's PortMusic project, which includes MIDI and soon will provide sound file support, recently selected PortAudio as its audio component.

Playing Musical Platforms

The PortAudio library supports an array of platforms including Windows, Linux, and Macintosh variants (see Table 1), but if you don't have prior audio development experience you quickly will find yourself adrift in a sea of API standards. After computer audio went mainstream with the Windows 3.1 MultiMedia Extension (MME) and the ubiquitous .WAV file back in 1991, a variety of solutions followed. First came DirectSound in the Windows 95 era, which unfortunately lacked a record capability. The Windows 2000/XP generation then introduced the fastest solution for Windows users: Windows Driver Model Kernel Streaming (WDMKS).

The latency requirements of your application should dictate your choice of API. If your sound program does not require a quick response time (close to a "live" performance), you are certainly free to use MME or DirectSound. However, if you require very low latency (below 20ms response time), you will need ASIO (Steinberg's Audio Stream Input/Output) or WDMKS. The downside of ASIO is that it usually requires proprietary drivers that at best require end-user installation and at worst are not even available for cheaper audio systems. (For more details, refer to the SoundCard FAQ.)

Getting Ready to Sound Off

To start programming with PortAudio, the first thing you need to do is go to www.portaudio.com and pick out a relevant distro. Because V18.1, the last official release, is nearing the three-year-old mark, you might as well start with a current V19 code snapshot. (An older precompiled DLL for PortAudio V17 also is available, but that's all as of this writing.) Either way, it's a matter of unpacking a ZIP file or tarball, because PortAudio is pretty much distributed in a source-only format.

As you might expect with any streaming interface, PortAudio supports two different programming models: a blocking API and a non-blocking API. The non-blocking API was developed first. The blocking API came later and is still unofficial. Although simple command-line tools can use a blocking API with little impact, a modern GUI application would need to spawn a thread to manage blocking I/O calls. Otherwise, the app looks dead to both the OS and the end user during I/O.

This article examines only the non-blocking API. A typical non-blocking PortAudio application requires the following steps:

1. Write a callback function that PortAudio (PA) will call when audio processing is needed.

2. Initialize the PA library and open a stream for audio I/O.

3. Start the stream: PA now will call your callback function repeatedly in the background.

4. Inside your callback, read audio data from the inputBuffer and/or write data to the outputBuffer.

5. Stop the stream by returning 1 from your callback or by calling a stop function.

6. Close the stream and terminate the library.
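The callback contract in the steps above can be tried out without any audio hardware. What follows is a minimal simulation, not PortAudio code: run_fake_stream() is a hypothetical stand-in for the loop that PortAudio runs on its background thread, and the callback signature is deliberately simplified (the real callback receives additional arguments such as timing information and status flags). The SawState struct and all function names are assumptions made for this sketch. It shows the part you write yourself: fill the output buffer, then return 0 to keep going or 1 to stop.

```c
#include <stddef.h>

/* State shared between the application and its callback (hypothetical). */
typedef struct {
    float phase;             /* current sawtooth value, in [-1.0, 1.0) */
    float step;              /* amount added to phase for each frame */
    unsigned long remaining; /* frames still to generate before stopping */
} SawState;

/* Simplified callback type: an assumption for this sketch, not the
   real PortAudio callback signature. */
typedef int (*FakeCallback)(const void *input, void *output,
                            unsigned long frameCount, void *userData);

/* Fill one mono float buffer with a sawtooth wave.
   Returns 0 to ask for more calls, 1 when the tone is finished. */
int saw_callback(const void *input, void *output,
                 unsigned long frameCount, void *userData)
{
    SawState *st = (SawState *)userData;
    float *out = (float *)output;
    unsigned long i;

    (void)input; /* output-only stream: no input buffer is used */

    for (i = 0; i < frameCount; i++) {
        if (st->remaining > 0) {
            out[i] = st->phase;
            st->phase += st->step;
            if (st->phase >= 1.0f)
                st->phase -= 2.0f; /* wrap from +1.0 back to -1.0 */
            st->remaining--;
        } else {
            out[i] = 0.0f; /* pad the final buffer with silence */
        }
    }
    return st->remaining == 0 ? 1 : 0;
}

/* Hypothetical stand-in for the audio engine: call the callback once per
   buffer until it returns nonzero or maxBuffers calls have been made.
   Returns the number of callback invocations. */
unsigned long run_fake_stream(FakeCallback cb, void *userData,
                              float *buffer, unsigned long framesPerBuffer,
                              unsigned long maxBuffers)
{
    unsigned long calls = 0;

    while (calls < maxBuffers) {
        calls++;
        if (cb(NULL, buffer, framesPerBuffer, userData) != 0)
            break;
    }
    return calls;
}
```

In a real program, the stand-in loop disappears: you hand the callback and the state pointer to PortAudio when you open the stream, and the library does the calling for you.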

Hello PortAudio, A Sample Application

Although ASIO, WDMKS, and DirectSound layers are available, the sample application discussed in this section uses Windows MME, the lowest common denominator. First, you need to build a static library out of the following modules:


What you get is about five seconds of pure sawtooth wave pleasure! But, that's not the point. You now have a platform-independent, sound-synthesizing piece of code with which you could implement any number of effects.

PortAudio comes with about four dozen test programs. Look at the guitar fuzz distortion box simulator "pa_fuzz.c" (see below) so you can rock on like Peter Frampton and Joe Walsh. Use essentially the same build command as before:

You can represent the signal in many ways with PortAudio. The most common mechanism is to use float values from -1.0 to +1.0 to represent the audio signal (paFloat32). You can also use 16-bit integers if you are more comfortable with that or some other representation. The CubicAmplifier() function simulates the distortion that an analog amplifier would produce, the mathematics of which are beyond the scope of the current discussion.
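To make the two sample representations concrete, here is a small sketch that converts between float samples in the -1.0 to +1.0 range and 16-bit integers. The function names are made up for this illustration, and the scale factors (32767 on the way out, 32768 on the way back) are one common convention rather than anything PortAudio mandates. In practice, you pick a sample format when you open the stream, and PortAudio delivers buffers in that format, so a conversion like this only matters when your own code mixes representations.

```c
#include <stdint.h>

/* Convert one float sample to a signed 16-bit sample.
   Input outside [-1.0, 1.0] is clamped; the fraction is truncated. */
int16_t float_to_int16(float sample)
{
    if (sample > 1.0f)
        sample = 1.0f;
    if (sample < -1.0f)
        sample = -1.0f;
    return (int16_t)(sample * 32767.0f);
}

/* Convert a signed 16-bit sample to a float in [-1.0, 1.0). */
float int16_to_float(int16_t sample)
{
    return (float)sample / 32768.0f;
}
```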

The PortAudio system is designed to work in a near real-time environment, thus the use of callback functions. The fuzzCallback() function receives an input buffer, an output buffer, the number of frames, timing information, buffer status flags, and a pointer to a user-defined storage area. A frame in an input or output buffer contains a complete set of samples for all channels involved (in this case, two for stereo). Each buffer holds as many frames as the incoming frameCount specifies (which may be zero); this program asks for 64 frames per buffer (FRAMES_PER_BUFFER).

Although this example uses two-channel audio, you can set up any number of channels. The fuzzCallback() function fills the output buffer with silence in the case of "no input." If you do have input, it fuzzes the left channel (zero) and copies the input clean to the right channel (one). If your distortion were sensitive in the time domain, you could use the timeInfo struct to retrieve the following times in seconds:

When the first sample of the input buffer was received at the audio input

When the first sample of the output buffer will begin being played at the audio output
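The per-frame work that the fuzz callback performs can be sketched as a plain function over interleaved stereo buffers. This is an illustration under assumptions, not the code from pa_fuzz.c: cubic_shape() is a hypothetical cubic waveshaper standing in for the article's CubicAmplifier(), and fuzz_frames() is a made-up name that takes only the buffers and the frame count, leaving out the timing, flags, and user-data arguments of a real callback.

```c
#include <stddef.h>

/* Hypothetical waveshaper for this sketch: a cubic curve that maps
   [-1.0, 1.0] onto itself while pushing mid-level samples outward. */
float cubic_shape(float x)
{
    float t;

    if (x < 0.0f) {
        t = x + 1.0f;
        return t * t * t - 1.0f;
    }
    t = x - 1.0f;
    return t * t * t + 1.0f;
}

/* Process interleaved stereo frames laid out as L0 R0 L1 R1 ...
   Channel 0 (left) is distorted; channel 1 (right) is copied clean.
   If in is NULL ("no input"), the output is filled with silence. */
void fuzz_frames(const float *in, float *out, unsigned long frameCount)
{
    unsigned long i;

    for (i = 0; i < frameCount; i++) {
        if (in == NULL) {
            out[2 * i] = 0.0f;
            out[2 * i + 1] = 0.0f;
        } else {
            out[2 * i] = cubic_shape(in[2 * i]);
            out[2 * i + 1] = in[2 * i + 1];
        }
    }
}
```

Because each frame carries one sample per channel, the buffers hold frameCount times two floats, which is why the index is doubled.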

Initializing PA and opening the stream are next. Pa_Initialize() must of course be the first PortAudio call your application uses, just as Pa_Terminate() is the last. After that, you need to set up the parameters of your input streams and output streams. The default input device is usually Microsoft Sound Mapper, which flows from the line-in input of your soundcard (or equivalent). Other possible inputs might be your modem input, CD audio, or other things depending on drivers and hardware. You also could create sophisticated callback algorithms where you mix multiple channels down to one channel or vice-versa.
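As a rough idea of what a channel-mixing callback might do, the sketch below averages interleaved stereo frames down to mono and duplicates mono samples up to stereo. Both function names are invented for this example, and equal-weight averaging is just one simple mixing choice among many.

```c
/* Average each interleaved stereo frame (L, R) into one mono sample. */
void downmix_stereo_to_mono(const float *stereo, float *mono,
                            unsigned long frameCount)
{
    unsigned long i;

    for (i = 0; i < frameCount; i++)
        mono[i] = 0.5f * (stereo[2 * i] + stereo[2 * i + 1]);
}

/* Copy each mono sample into both channels of an interleaved stereo frame. */
void upmix_mono_to_stereo(const float *mono, float *stereo,
                          unsigned long frameCount)
{
    unsigned long i;

    for (i = 0; i < frameCount; i++) {
        stereo[2 * i] = mono[i];
        stereo[2 * i + 1] = mono[i];
    }
}
```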

Finally, you are ready to call Pa_OpenStream() and get the streams ready for immediate use. Because latency is always your enemy, keep opening the stream separate from starting it. The input and output channels must share the same sample rate (in this case, CD-quality 44,100 Hz) and the same number of frames per buffer-load.
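The link between buffer size, sample rate, and latency is simple arithmetic, sketched here with two hypothetical helper functions. Keep in mind that this covers only the time represented by one buffer; drivers and hardware add their own buffering, so the real end-to-end latency will be higher.

```c
/* Duration of one buffer in milliseconds. */
double buffer_duration_ms(unsigned long framesPerBuffer, double sampleRate)
{
    return 1000.0 * (double)framesPerBuffer / sampleRate;
}

/* Largest whole number of frames that fits inside a latency budget. */
unsigned long frames_for_budget(double budgetMs, double sampleRate)
{
    return (unsigned long)(budgetMs * sampleRate / 1000.0);
}
```

With the values used in this article, buffer_duration_ms(64, 44100.0) comes to roughly 1.45 ms per buffer, and frames_for_budget(20.0, 44100.0) shows that a 20ms budget corresponds to 882 frames at that rate.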

At first glance, the remainder of the program may leave you scratching your head. Pa_StartStream() calls a platform-specific function to get a thread going, which begins callbacks immediately. The Win32 implementations all eventually call CreateThread(), although to me the WDMKS code seems a lot simpler than the Win MME version. The two ways to get out of the callback loop are returning a value of 1 from the callback or calling a stop function such as Pa_StopStream(); calling Pa_CloseStream() on an active stream also ends it.

Get Creative

Your creativity is the limit to what you can do with PortAudio: convert data streams from one format to another in real time, simulate surround sound or other sophisticated multi-channel audio, or even create performance-quality effects. Best of all, you aren't overcommitted to any platform, which makes PortAudio my choice for open source audio projects.

About the Author

Victor Volkman has been writing for C/C++ Users Journal and other programming journals since the late 1980s. He is a graduate of Michigan Tech and a faculty advisor board member for Washtenaw Community College CIS department. Volkman is the editor of numerous books, including C/C++ Treasure Chest and is the owner of Loving Healing Press. He can help you in your quest for open source tools and libraries; just drop an e-mail to sysop@HAL9K.com.
