Generating Sound Waves with C# Wave Oscillators

Description

For the longest time, I've been baffled by the concept of sound in computing. How in the world is sound stored? How is it played back? In classic Coding4Fun style, we'll learn by doing in this article, by building a wave oscillator application.

Optional Reading

I cover the basics behind this article in a multi-part blog series, which is worth checking out if you get stuck.

What's An Oscillator?

An oscillator is a device or application that generates a waveform. In electrical engineering terms, it's a device that outputs an electrical current with varying voltage. If you plot the voltage over time, you get a regular wave in a particular form, such
as a sine, square, triangle or sawtooth.

An oscillator is the most basic type of synthesizer. Analog synths use electrical circuits to output a sound wave. Digital synthesizers do the same thing, but with software.

You can create a pretty neat-sounding instrument by combining the outputs of multiple oscillators. For example, if you have three oscillators oscillating at a frequency of 440 Hz (concert A pitch), but each of them has a different waveform (saw, square, sine), you get a very interesting, layered sound.

But before we get too deep into this subject, let's briefly explore the physics of sound.

The Physics of Sound

Sound happens when air pressure changes on your ear drum. When you clap in an empty room, pressure waves bounce all over the place and dance on your eardrum. The changes in pressure are detected continuously by your ear.

Digitally, “pressure” is represented by a scalar value called amplitude. The amplitude (loudness) of the wave is measured thousands of times per second (44,100 times per second on CDs). Each measurement of pressure (a.k.a. amplitude) is called a sample: CDs are recorded at 44,100 samples per second, each sample with a value between the minimum and maximum amplitude for the bit depth.

Think about 44,100 samples per second. That's a lot of stuff for your ear to detect. That's how we're able to hear so much stuff going on in the mix of a song, especially in stereo tracks where you have 44,100 samples per second, per ear.

It turns out that there is a horribly intense mathematical theorem (the Nyquist–Shannon sampling theorem) which tells us that 44,100 samples per second is enough to accurately represent any pitch up to half that rate, 22.05 kHz. The human ear can really hear only up to about 20 kHz, so 44.1 kHz is a more than high-enough sampling rate.

Terminology

So now you have a rather glancing overview of how sound works, and perhaps some clues as to how we should go about representing it in computers. Let's go over all this new terminology (plus some even newer terms) in delicious, bulleted format:

· Sample: A measurement of a sound wave at a very small point in time. 44,100 of these measurements in a row form a single channel of CD-quality audio.

· Amplitude: The value of a sample. Max and min values are dependent upon the bit depth.

· Bit depth: The number of bits used to represent a sample. 16-bit, 32-bit, etc. Max amplitude is (2^depth) / 2 – 1.

· Sample rate (a.k.a. sampling rate; not to be confused with bit rate): The number of samples per second of audio. 44,100 is standard for CD-quality audio.
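To make the bit-depth math concrete, here's a quick sketch (the class and method names are mine, not from the app) that computes the amplitude range for a given bit depth:

```csharp
using System;

static class BitDepthDemo
{
    // Max amplitude for a signed sample: (2^depth) / 2 - 1
    public static int MaxAmplitude(int bitDepth) => (1 << bitDepth) / 2 - 1;

    // Min amplitude is one step larger in magnitude: -(2^depth) / 2
    public static int MinAmplitude(int bitDepth) => -((1 << bitDepth) / 2);

    static void Main()
    {
        Console.WriteLine(MaxAmplitude(16)); // 32767, the max value of a short
        Console.WriteLine(MinAmplitude(16)); // -32768, the min value of a short
    }
}
```

That 16-bit range (-32,768 to 32,767) is exactly the range of the .NET short type, which is why the app works with arrays of shorts.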

How sound is represented

By now, you've probably surmised that a second of audio data is somehow represented by an array of some integer data type with a length of 44,100. You would be correct in that assumption. However, if you want sound to play from a computer's sound card, that data has to be accompanied by a bunch of format information. WAV is probably the easiest format to deal with.

However, we are taking a slightly easier route, by using DirectSound. DirectSound gives us a lot of nice classes for all the format information, abstracting all that stuff away and allowing us to pump a stream of data into a DirectSound object and play it.
Perfect for a synthesizer app!

So, let's get started!

Building the app

I learned some Blend while working with this app, since it's built on WPF. The image buttons are just radio buttons. I had to differentiate the group number per instance of the user control at runtime (in the constructor of the Oscillator class).

I'm a terrible UI designer for the most part, so this is about as sexy as I'm willing to make this application. But feel free to make it look and act better!

Designing the UI

There's a dirty little secret in this application. It says it can oscillate three waves, but in truth, there's a constant (set to 3) that you can modify; you could have six if you wanted. How did I accomplish this? Each synth that you see is an instance of a WPF user control called Oscillator.xaml.

I have a StackPanel called Oscs in the main window. In the Window_Loaded event handler of the main window, I use this bit of code to add instances of the usercontrol:
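A minimal sketch of that code, assuming a StackPanel named Oscs, a constant for the oscillator count, and an Oscillator constructor that takes the index it uses to build a unique radio button group name (the exact names and constructor signature are my guesses, not necessarily the original ones):

```csharp
// Number of oscillators to create; bump this to 6 if you want more.
private const int OscillatorCount = 3;

private void Window_Loaded(object sender, RoutedEventArgs e)
{
    for (int i = 0; i < OscillatorCount; i++)
    {
        // Each Oscillator is a WPF user control; passing the index is one
        // way to give each instance a distinct radio button group at runtime.
        Oscillator osc = new Oscillator(i);
        Oscs.Children.Add(osc);
    }
}
```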

The long rectangular canvas is used to plot the values of the generated wave, so you can visualize the wave as it's played. The plot is scaled along the X axis so you can see the general shape of the wave; at 44,100 samples per second, an unscaled plot would be unreadable.

Earlier in the article, I noted that a second of sound is basically a really, really long array of samples. In this app they're 16-bit signed integers (shorts), though other formats store samples as 32-bit floating-point values between -1 and 1. We use this data to plot the graph as well. More on that later.

Now that we have the UI figured out (dynamic addition of oscillators), let's take a look at exactly how the sound is produced.

Bzzzzt! Making Sounds and the Mixer

One of the many cool things about DirectSound is that it basically wraps the WAV format for you. You set the buffering/format options and then shove a bunch of data into it, and it will play. Magic.

The way I've architected the solution is a little more modular. None of the oscillators has the ability to play itself; rather, each uses its UI to control values such as frequency, amplitude, and wave type. These values are tied to public properties. The Oscillator component does virtually no audio work at all.

The generation of audio data is handled by the custom Mixer class, which takes a collection of Oscillators and, based on their properties, creates a composite of all the generators. This is done by averaging the corresponding samples from every oscillator and writing the results into a new array of data.

The Mixer class looks like this:
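A rough sketch of its shape. For simplicity, this version receives the oscillators directly (the real app hands the Mixer a reference to the main window instead, as we'll see later), and the member names are my guesses:

```csharp
using System;
using System.Collections.Generic;

class Mixer
{
    private const int SampleRate = 44100;
    private const int bufferDurationSeconds = 1;

    private readonly List<Oscillator> oscillators;

    public Mixer(List<Oscillator> oscillators)
    {
        this.oscillators = oscillators;
    }

    // Generate one second of data for each oscillator, then average the
    // corresponding samples into a single composite array of shorts.
    public short[] Mix()
    {
        int length = SampleRate * bufferDurationSeconds;
        var perOscillator = new List<short[]>();
        foreach (Oscillator osc in oscillators)
            perOscillator.Add(GenerateOscillatorSampleData(osc));

        short[] output = new short[length];
        for (int i = 0; i < length; i++)
        {
            int sum = 0;
            foreach (short[] samples in perOscillator)
                sum += samples[i];
            // Averaging (rather than summing) keeps the composite inside
            // the 16-bit amplitude range, so nothing clips.
            output[i] = (short)(sum / perOscillator.Count);
        }
        return output;
    }

    private short[] GenerateOscillatorSampleData(Oscillator osc)
    {
        // Builds one second of samples from osc's frequency, amplitude,
        // and wave type; the wave math is covered next.
        throw new NotImplementedException();
    }
}
```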

One of the workhorses of the Mixer class is the method GenerateOscillatorSampleData. This takes an Oscillator as an argument, giving it access to the public properties set in the UI. From there, the algorithm generates one second of sample data (specified by the member bufferDurationSeconds) based on the wave type that has been selected in the UI. This is where the mathy stuff comes into play. Check out this method, and the different cases in the switch statement that determine what kind of wave to create, below.
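Here's a hedged reconstruction of what that method does. To keep the sketch self-contained it takes the frequency (Hz), amplitude (0 to 1), and wave type directly instead of an Oscillator, and the WaveType enum is my own; the real listing may differ in the details:

```csharp
using System;

enum WaveType { Sine, Square, Saw, Triangle }

static class WaveMath
{
    public static short[] GenerateOscillatorSampleData(
        double frequency, double amplitude, WaveType waveType)
    {
        const int sampleRate = 44100;
        const int bufferDurationSeconds = 1;
        int length = sampleRate * bufferDurationSeconds;
        short[] data = new short[length];

        for (int i = 0; i < length; i++)
        {
            double t = (double)i / sampleRate;      // time of this sample, in seconds
            double phase = frequency * t % 1.0;     // position within the cycle, 0.0 .. 1.0
            double value;                            // normalized sample, -1.0 .. 1.0

            switch (waveType)
            {
                case WaveType.Sine:
                    value = Math.Sin(2 * Math.PI * frequency * t);
                    break;
                case WaveType.Square:
                    value = phase < 0.5 ? 1.0 : -1.0;       // high half, then low half
                    break;
                case WaveType.Saw:
                    value = 2.0 * phase - 1.0;              // ramp from -1 up to 1
                    break;
                case WaveType.Triangle:
                    value = 1.0 - 4.0 * Math.Abs(phase - 0.5);  // peak mid-cycle
                    break;
                default:
                    value = 0.0;
                    break;
            }

            // Scale the normalized value up to the 16-bit amplitude range.
            data[i] = (short)(value * amplitude * short.MaxValue);
        }
        return data;
    }
}
```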

The Mixer is the heart of the app, and it's a beautiful example of object orientation and cohesion. Give it three things (oscillators) and it spits out a new thing you can use (an array of sample data).

Now that we have the sample data, all we have to do is play it back using DirectSound.

Sound Playback with DirectSound

As I mentioned, DirectSound provides a wrapper over the WAV format. You set up your buffer and format information and then feed it a bunch of data in the form of an array of shorts (arrays of trousers are known to cause errors).

First, we initialize the format information and buffer in the Window_Loaded event handler of the main window. The values below are not really arbitrary; they're explained in the Optional Reading section above (see Demystifying the WAV Format). This handler also contains the code that adds the oscillators, as shown earlier in the article.
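A sketch of that setup, assuming the Managed DirectX DirectSound wrapper (Microsoft.DirectX.DirectSound); the field names and the windowHandle variable are placeholders, and the real source may differ:

```csharp
private Device device;
private SecondaryBuffer buffer;
private WaveFormat format;

private void InitializeSound()
{
    device = new Device();
    // windowHandle stands in for however the app exposes its HWND/Control.
    device.SetCooperativeLevel(windowHandle, CooperativeLevel.Priority);

    format = new WaveFormat();
    format.FormatTag = WaveFormatTag.Pcm;
    format.Channels = 1;                        // mono
    format.BitsPerSample = 16;                  // one short per sample
    format.SamplesPerSecond = 44100;            // CD-quality sample rate
    format.BlockAlign = (short)(format.Channels * format.BitsPerSample / 8);
    format.AverageBytesPerSecond = format.SamplesPerSecond * format.BlockAlign;

    BufferDescription description = new BufferDescription(format);
    description.BufferBytes = format.AverageBytesPerSecond;  // one second of audio
    description.ControlVolume = true;
    description.GlobalFocus = true;             // keep playing when the window loses focus

    buffer = new SecondaryBuffer(description, device);
}
```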

When you click the Play button, the application takes its collection of oscillators and passes the values of the UI controls to the Mixer (which is initialized on each click with a reference to the main form window, so it can grab the Oscillator user controls).

The mixer outputs an array of shorts, which we write to a DirectSound buffer.
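Sketched out, again assuming the Managed DirectX DirectSound types and my own placeholder names, that click handler looks something like:

```csharp
private void Play_Click(object sender, RoutedEventArgs e)
{
    // The Mixer grabs the Oscillator user controls from the main window.
    Mixer mixer = new Mixer(this);
    short[] samples = mixer.Mix();

    // Copy the samples into the DirectSound buffer and loop it so the
    // tone sustains until Stop is clicked.
    buffer.Write(0, samples, LockFlag.EntireBuffer);
    buffer.Play(0, BufferPlayFlags.Looping);
}
```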

Drawing Pretty Graphs

All that's left is to draw the graph of the waveform on the canvas. Below is the GraphWaveform method. This method could graph anything it wanted to, as long as it was an array of shorts (not trousers). It's reminiscent of trying to graph things using Flash
back in the day, when you had to actually figure out points and lines (most likely on paper), but WPF's Polyline object makes this rather trivial.
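Here's a sketch of what GraphWaveform might look like; the step size and the control names (WaveCanvas, WaveLine) are my guesses:

```csharp
private void GraphWaveform(short[] samples)
{
    const int step = 100;   // plot every 100th sample so the shape stays visible
    var points = new PointCollection();

    double width = WaveCanvas.ActualWidth;
    double height = WaveCanvas.ActualHeight;
    int pointCount = samples.Length / step;

    for (int i = 0; i < pointCount; i++)
    {
        double x = i * (width / pointCount);
        // Map the -32,768..32,767 sample range onto the canvas height,
        // with zero amplitude in the vertical middle.
        double y = height / 2
                 - (samples[i * step] / (double)short.MaxValue) * (height / 2);
        points.Add(new Point(x, y));
    }

    // WaveLine is a Polyline sitting on the canvas; WPF handles the drawing.
    WaveLine.Points = points;
}
```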

Conclusion

This was a really fun little project that took way less time to code than it does to explain. It's a great exercise because it requires you to think about an ancillary field of science before you can sit down and code, which is really what coding for fun's
all about, anyway!

If you want to try this out, the download link for the source code is at the top of the article.

About The Author

Dan Waters is an Academic Evangelist at Microsoft, covering schools in the Pacific Northwest, Alaska, and Hawaii. He is based in Bellevue, WA. Dan has way too many guitars at home and tries to entice both of his young daughters to learn how to play them.
Music, technology, and music+technology are among his favorite hobbies, along with snowboarding and trying to maintain cool dad status. You can find his blog at
www.danwaters.com or follow him on Twitter at
www.twitter.com/danwaters.

The Discussion

Steve Syfuhs

Well really, we can only hear up to around ~14k-15k depending if you are male/female, and our ears stop responding around ~18k...which, in a mostly unrelated way, is partly why some of the older audio codecs were able to compress audio without much audible loss.

This is pretty cool though. I've been wanting to build out an application like this for my test bench for a while now.
