In the fall of 2006, I created a simple synthesizer using the C# programming language. This was a proof of concept project commissioned by James Squire, an associate professor at the Virginia Military Institute. I had a blast working with James on this project and got bit by the "softsynth" bug. This led me to begin writing a more elaborate "toolkit" for creating software based synthesizers using C#. Special thanks to James for giving me permission to reuse the code from our project here at Code Project.

This is Part I in a planned three-part article. Here, I will be giving an overview of the toolkit. Part II will dig deeper into the toolkit by showing you how to create a simple synthesizer. Part III will demonstrate how to create a more sophisticated synthesizer.

If you are familiar with softsynths, the first question you may be asking is whether this toolkit is VST compatible. It is not. However, I have heard of efforts to create a .NET/VST host. If that becomes a reality, if it hasn't already, it won't be hard to adapt most of the toolkit to work with it.

This is the second major version of the toolkit. The main change from the first version is the removal of MDX (Managed DirectX) for waveform output. In its place is a custom library I have written. The latency is slightly longer but playback is more stable. In addition, the second version supports functionality for creating your own effects. The toolkit comes with two effects: chorus and echo; I hope to add more in the future. Also, version two supports recording the output of the synthesizer to a wave file.

A synthesizer is a software or hardware (or some combination of the two) device for synthesizing sounds. There are a vast number of approaches to synthesizing sounds, but regardless of the approach most synthesizers use the same architectural structure. Typically, a synthesizer has a limited number of "voices" to play notes. A voice is responsible for synthesizing the note assigned to it. When a note is played, the synthesizer assigns a voice that is not currently playing to play the note. If all of the voices are already playing, the synthesizer "steals" a voice by reassigning it to play the new note. There are many voice stealing algorithms, but one of the most common is to steal the voice that has been playing the longest. As the voices play their assigned notes, the synthesizer mixes their output and sends it to its main output. From there, it goes to your speakers or sound card.
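The voice-allocation scheme described above can be sketched in a few lines. Note that the class and member names here are illustrative stand-ins, not the toolkit's actual API:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical voice with just enough state for allocation.
public class SimpleVoice
{
    public bool IsPlaying { get; private set; }
    public long StartedAt { get; private set; }
    public int Note { get; private set; }

    public void Trigger(int note, long timestamp)
    {
        Note = note;
        StartedAt = timestamp;
        IsPlaying = true;
    }
}

public class VoiceAllocator
{
    private readonly List<SimpleVoice> voices;
    private long clock;

    public VoiceAllocator(int voiceCount)
    {
        voices = Enumerable.Range(0, voiceCount)
                           .Select(_ => new SimpleVoice())
                           .ToList();
    }

    public SimpleVoice NoteOn(int note)
    {
        // Prefer a free voice; otherwise steal the one playing the longest.
        SimpleVoice v = voices.FirstOrDefault(x => !x.IsPlaying)
                        ?? voices.OrderBy(x => x.StartedAt).First();
        v.Trigger(note, clock++);
        return v;
    }
}
```

With two voices, playing a third note steals the voice that has been playing the longest, exactly as described above.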

A voice is made up of a number of components. Some create the actual sound a voice produces, such as an oscillator. An oscillator synthesizes a repeating waveform with a frequency within the range of human hearing. Some components do not produce a sound directly but are meant to modulate other components. For example, a common synthesizer component is the LFO (low frequency oscillator). It produces a repeating waveform below the range of human hearing. A typical example of how an LFO is used is to modulate the frequency of an oscillator to create a vibrato effect. An ADSR (attack, decay, sustain, and release) envelope is another common synth component. It is used, among other things, to modulate the overall amplitude of the sound. For example, by setting the envelope to have an instant attack, a slow decay, and no sustain, you can mimic the amplitude characteristics of a plucked instrument, such as the guitar or harp. Together, these components synthesize a voice's output. Below is an illustration showing a four voice synthesizer:
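To make the envelope idea concrete, here is a minimal linear ADSR sketch. This is not the toolkit's Envelope class, just an illustration of the attack/decay/sustain/release behavior described above, with times measured in samples:

```csharp
// Illustrative linear ADSR envelope; output is an amplitude in [0, 1].
public class AdsrEnvelope
{
    private readonly int attack, decay, release; // lengths in samples
    private readonly float sustain;              // sustain level, 0..1
    private int pos;                             // samples since Trigger (or Release)
    private bool released;
    private float levelAtRelease;

    public AdsrEnvelope(int attackSamples, int decaySamples,
                        float sustainLevel, int releaseSamples)
    {
        attack = attackSamples;
        decay = decaySamples;
        sustain = sustainLevel;
        release = releaseSamples;
    }

    public void Trigger() { pos = 0; released = false; }

    public void Release()
    {
        levelAtRelease = Current();
        pos = 0;
        released = true;
    }

    private float Current()
    {
        if (pos < attack) return pos / (float)attack;       // ramp up
        int d = pos - attack;
        if (d < decay) return 1f + (sustain - 1f) * d / decay; // ramp down
        return sustain;                                      // hold
    }

    // Returns the next amplitude value and advances one sample.
    public float Next()
    {
        float level = released
            ? (pos < release ? levelAtRelease * (1f - pos / (float)release) : 0f)
            : Current();
        pos++;
        return level;
    }
}
```

Setting a near-instant attack, a slow decay, and zero sustain with this sketch would give the plucked-string amplitude shape mentioned above.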

Early analog synthesizers were built out of modules, an earlier name for synthesizer components. Each module was dedicated to doing one thing. By connecting several modules together with patch cords, you could create and configure your own synthesizer architecture. This created a lot of sonic possibilities.

When digital synthesizers came on the scene in the early 80s, many of them were not as configurable as their earlier analog counterparts. However, it was not unusual to find digital representations of analog modules such as oscillators, envelopes, filters, LFOs, etc.

Software based synthesizers began hitting the scenes in the 90s and have remained popular to this day. They are synthesizers that run on your personal computer. Because of the versatility of the PC, many software synthesizers have returned to the modular approach of early analog synthesizers. This has allowed musicians to use the power of analog synthesizers within a stable digital environment.

The output of a synthesizer is a continuous waveform. One way to simulate this in software is to use a circular buffer. This buffer is divided into smaller buffers that hold waveform data and are played one after the other. The software synthesizer first synthesizes two buffers of waveform data. These buffers are written to the circular buffer. Then the circular buffer begins playing. As it plays, it notifies the synthesizer when it has finished playing a buffer. The synthesizer in turn synthesizes another buffer of waveform data which is then written to the circular buffer. In this way, the synthesizer stays ahead of the circular buffer's playback position by one buffer. This process continues until the synthesizer is stopped. The result is a seamless playback of the synthesizer's output.
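The scheme above can be sketched conceptually. Here a queue stands in for the circular buffer and OnBufferFinished stands in for the audio driver's "buffer done" notification; all names are illustrative, not the toolkit's:

```csharp
using System;
using System.Collections.Generic;

public class BufferedPlayback
{
    private readonly Queue<float[]> pending = new Queue<float[]>();
    private readonly Func<float[]> synthesize;

    public BufferedPlayback(Func<float[]> synthesize)
    {
        this.synthesize = synthesize;
        // Prime the circular buffer with two buffers before playback starts.
        pending.Enqueue(synthesize());
        pending.Enqueue(synthesize());
    }

    public int QueuedBuffers => pending.Count;

    // Called each time the output device finishes playing a buffer.
    public float[] OnBufferFinished()
    {
        float[] played = pending.Dequeue();
        pending.Enqueue(synthesize()); // stay one buffer ahead of playback
        return played;
    }
}
```

After each completed buffer, exactly one new buffer is synthesized, so the queue depth stays constant and playback never runs dry.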

Because it is necessary for the synthesizer to stay one buffer ahead of the current playback position, there is a latency equal to the length of one buffer. The larger the buffers, the longer the latency. This can be noticeable when playing a software synthesizer from a MIDI keyboard. As you play the keyboard, you will notice a delay before you hear the synthesizer respond. It is desirable then to use the smallest buffers possible. A problem you can run into, however, is that if you make the buffers too small, the synthesizer will not be able to keep up; it will lag behind. The result is a glitching effect. The key is to choose a buffer size that minimizes latency while allowing the synthesizer time to synthesize sound while staying ahead of the playback position.
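The latency trade-off is simple arithmetic: one buffer's worth of samples divided by the sample rate. The numbers below are examples, not toolkit defaults:

```csharp
// Back-of-the-envelope latency math for one buffer of audio.
public static class Latency
{
    public static double Milliseconds(int bufferSamples, int sampleRate)
    {
        return 1000.0 * bufferSamples / sampleRate;
    }
}
```

At 44100 samples per second, a 1024-sample buffer gives roughly 23 ms of latency, while a 2048-sample buffer gives roughly 46 ms; halving the buffer halves the latency but also halves the time the synthesizer has to fill it.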

The key challenge in designing this toolkit has been to decide what classes to create and to design them in such a way that they can work together to simulate the functionality of a typical synthesizer. Moreover, I wanted to make it easy for you to create your own synthesizers by simply plugging in components that you create. This took quite a bit of thought and some trial and error, but I think I have arrived at a design that meets my goals. Below I describe some of the toolkit's classes. There are a lot of classes in the toolkit, but the ones below are the most important.

The Component class is an abstract class representing functionality common to all effect and synthesizer components. Both the SynthComponent class and the EffectComponent class are derived from the Component class.

The Component class has the following properties:

Properties

Name

SamplesPerSecond

The Name property is simply the name of the Component. You can give your Component a name when you create it. For example, you may want to name one of your oscillators "Oscillator 1." The SamplesPerSecond property is protected; it provides derived classes with the sample rate value.
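Based on the description above, the Component base class might look something like the sketch below. This is a reconstruction for illustration; the toolkit's actual declaration may differ:

```csharp
// Sketch of the Component base class described above.
public abstract class Component
{
    public string Name { get; set; }

    // Protected: only derived components need the sample rate.
    protected int SamplesPerSecond { get; }

    protected Component(string name, int samplesPerSecond)
    {
        Name = name;
        SamplesPerSecond = samplesPerSecond;
    }
}

// Trivial concrete component, just to show construction.
public class NamedComponent : Component
{
    public NamedComponent(string name, int samplesPerSecond)
        : base(name, samplesPerSecond) { }
}
```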

The SynthComponent class is an abstract class representing functionality common to all synthesizer components. A toolkit synthesizer component is very much analogous to the modules used in analog synthesizers. Oscillators, ADSR envelopes, LFOs, filters, etc. are all examples of synthesizer components.

The SynthComponent class has the following methods and properties:

Methods

Synthesize

Trigger

Release

Properties

Ordinal

SynthesizeReplaceEnabled

The Synthesize method causes the synthesizer component to synthesize its output. The output is placed in a buffer and can be retrieved later. The Trigger method triggers the component based on a MIDI note; it tells the component which note has triggered it to synthesize its output. The Release method tells the component when the note that previously triggered it has been released. All of these methods are abstract; you must override them in your derived class.

The Ordinal property represents the ordinal value of the component. That doesn't tell you very much. I will have more to say about the Ordinal property later. The SynthesizeReplaceEnabled property gets a value indicating whether the synthesizer component overwrites its buffer when it synthesizes its output. In some cases, you will want your component to overwrite its buffer. However, in other cases when a component shares its buffer with other components, it can be useful for the component to simply add its output to the buffer rather than overwrite it.
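Putting the members above together, a SynthComponent skeleton might look like this. The Component base class is omitted for brevity, and the exact signatures are my guesses, not the toolkit's definitive declarations:

```csharp
// Sketch of the SynthComponent contract described above.
public abstract class SynthComponentSketch
{
    public int Ordinal { get; protected set; }
    public bool SynthesizeReplaceEnabled { get; set; }

    public abstract void Synthesize();          // fill the output buffer
    public abstract void Trigger(int midiNote); // a note has been played
    public abstract void Release();             // the note has been released
}

// A do-nothing component showing what the overrides look like.
public class SilentComponent : SynthComponentSketch
{
    public int LastNote { get; private set; }

    public override void Synthesize() { }
    public override void Trigger(int midiNote) { LastNote = midiNote; }
    public override void Release() { }
}
```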

The toolkit comes with a collection of SynthComponent derived classes, enough to create a basic subtractive synthesizer. These components should be treated as a starting point for components you write.

There are two classes that are derived from the SynthComponent class: MonoSynthComponent and StereoSynthComponent. These are the classes that you will derive your synth components from. They represent synthesizer components with monophonic output and stereophonic output respectively. Both classes have a GetBuffer method. In the case of the MonoSynthComponent class, the GetBuffer method returns a one-dimensional array of type float. The StereoSynthComponent's GetBuffer method returns a two-dimensional array of type float. In both cases the array is the underlying buffer that the synth component writes its output to.
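The two buffer shapes can be sketched as follows. The class names and constructors here are illustrative stand-ins for the toolkit's MonoSynthComponent and StereoSynthComponent:

```csharp
// Mono components write to a float[]; stereo components to a float[,].
public abstract class MonoSketch
{
    private readonly float[] buffer;
    protected MonoSketch(int samples) { buffer = new float[samples]; }
    public float[] GetBuffer() { return buffer; }
}

public abstract class StereoSketch
{
    private readonly float[,] buffer;   // [channel, sample]
    protected StereoSketch(int samples) { buffer = new float[2, samples]; }
    public float[,] GetBuffer() { return buffer; }
}

// Trivial concrete classes, just to make the sketch usable.
public class MonoOsc : MonoSketch
{
    public MonoOsc(int samples) : base(samples) { }
}

public class StereoOsc : StereoSketch
{
    public StereoOsc(int samples) : base(samples) { }
}
```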

The EffectComponent class is an abstract class representing functionality common to all effect components. Effects reside at the global level; they process the output of all of the Voices currently playing. At this time, the toolkit comes with two EffectComponent derived classes: Chorus and Echo.

The EffectComponent class has the following methods and properties:

Methods

Process

Reset

Properties

Buffer

The Process method causes the effect to process its input. In other words, Process causes it to apply its effect algorithm to its input. The Reset method causes the effect to reset itself. For example, Reset would cause an Echo effect to clear its delay lines. The Buffer property represents the buffer the effect uses for its input and output. An effect should read from its buffer, apply its algorithm, and write the result back to the buffer.
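A minimal echo following the Process/Reset/Buffer contract described above might look like this. It is a hypothetical sketch, not the toolkit's Echo class, and omits the EffectComponent base class:

```csharp
using System;

// Minimal in-place feedback echo.
public class SimpleEcho
{
    private readonly float[] delayLine;
    private readonly float feedback;
    private int writePos;

    // The buffer the effect reads from and writes back to.
    public float[] Buffer { get; set; }

    public SimpleEcho(int delaySamples, float feedback)
    {
        delayLine = new float[delaySamples];
        this.feedback = feedback;
    }

    public void Process()
    {
        for (int i = 0; i < Buffer.Length; i++)
        {
            float delayed = delayLine[writePos];
            float dry = Buffer[i];
            Buffer[i] = dry + delayed;                      // read, apply, write back
            delayLine[writePos] = dry + delayed * feedback; // feed the delay line
            writePos = (writePos + 1) % delayLine.Length;
        }
    }

    public void Reset()
    {
        // Clear the delay line, as described above for the Echo effect.
        Array.Clear(delayLine, 0, delayLine.Length);
        writePos = 0;
    }
}
```

An impulse fed through a two-sample delay comes back out two samples later, which is the behavior a Reset call then discards.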

Voice Class

The Voice class is an abstract class that represents a voice within a synthesizer. You will derive your own voice classes from this class. The Voice class is derived from the StereoSynthComponent class, and overrides and implements the SynthComponent's abstract members.

The Synthesizer class is the heart and soul of the toolkit. It drives synthesizer output by periodically writing the output from its voices to the main buffer. It also provides file management for a synthesizer's parameters.

The SynthHostForm class provides an environment for running your synthesizer. Typically, what you will do is create a System.Windows.Forms based application. After adding the necessary references to the required assemblies, you derive your main Form from the SynthHostForm class. It has the following members you must override:

Methods

CreateSynthesizer

CreateEditor

Properties

HasEditor

The CreateSynthesizer method returns a synthesizer initialized with a delegate that creates your custom Voices. This will become clearer in Part II. The CreateEditor returns a Form capable of editing your synthesizer's parameters. And the HasEditor property gets a value indicating whether you can provide an editor Form. Providing an editor Form is optional. If you don't want to create an editor, you can rely on the SynthHostForm to provide a default editor. If no editor is available, CreateEditor should throw a NotSupportedException.

The Voice class treats its synthesizer components as nodes in a directed acyclic graph. Each component has an Ordinal property (as mentioned above). The value of this property is one plus the sum of all of the Ordinal values of the components connected to it. For example, an LFO component with no inputs has an Ordinal value of 1. An oscillator component with two inputs for frequency modulation would have an Ordinal value of 1 plus the sum of the Ordinal values of the two FM inputs. Below is a graph of a typical synthesizer architecture. Each component is marked with its Ordinal value:
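The Ordinal rule can be expressed directly: one plus the sum of the Ordinals of all connected inputs. The node class below is an illustration of the rule, not the toolkit's implementation:

```csharp
using System.Collections.Generic;
using System.Linq;

public class ComponentNode
{
    public string Name { get; set; }
    public List<ComponentNode> Inputs { get; } = new List<ComponentNode>();

    // One plus the sum of the Ordinals of the components connected to it.
    public int Ordinal
    {
        get { return 1 + Inputs.Sum(input => input.Ordinal); }
    }
}
```

An LFO with no inputs has an Ordinal of 1, and an oscillator modulated by that LFO and an envelope has an Ordinal of 1 + 1 + 1 = 3, matching the example above.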

When you create your own voice class, you derive it from the abstract Voice base class. You add components to the Voice with a call to its AddComponent method. The Voice adds components to a collection and sorts it by the Ordinal value of each component. As the Synthesizer is running, it periodically calls Synthesize on all of its voices. If a voice is currently playing, it iterates through its collection of components calling Synthesize on each one.

Because the components are sorted by their Ordinal value, the order in which a component synthesizes its output is in sync with the other components. In the example above Oscillator 1's frequency is being modulated by LFO 1 and Envelope 1. The outputs of both LFO 1 and Envelope 1 need to be synthesized before that of Oscillator 1; Oscillator 1 uses the outputs of both LFO 1 and Envelope 1 to modulate its frequency. Sorting by Ordinal value ensures that components work together in the correct order.
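The sorted synthesize pass can be sketched as follows. The interface and class names are illustrative, and the LoggingNode exists only to make the ordering visible:

```csharp
using System.Collections.Generic;

public interface ISynthNode
{
    int Ordinal { get; }
    void Synthesize();
}

public class VoiceSketch
{
    private readonly List<ISynthNode> components = new List<ISynthNode>();

    public void AddComponent(ISynthNode component)
    {
        components.Add(component);
        // Keep the collection sorted by Ordinal so that modulators
        // synthesize before the components they modulate.
        components.Sort((a, b) => a.Ordinal.CompareTo(b.Ordinal));
    }

    public void Synthesize()
    {
        foreach (var component in components)
            component.Synthesize();
    }
}

// Records the order in which components synthesize.
public class LoggingNode : ISynthNode
{
    public static readonly List<string> Log = new List<string>();
    private readonly string name;
    public int Ordinal { get; }

    public LoggingNode(string name, int ordinal)
    {
        this.name = name;
        Ordinal = ordinal;
    }

    public void Synthesize() { Log.Add(name); }
}
```

Even if the oscillator is added first, its higher Ordinal puts it after its modulators in the pass.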

This approach to organizing and implementing signal flow through a synthesizer was very much inspired by J. Paul Morrison's website on flow-based programming. The idea is that you have a collection of components connected in some way with data flowing through them. It is easy to change and rearrange components to create new configurations. I'm a strong believer in this approach.

The Visual Studio solution download for this part is the same for Parts II and III. It includes the toolkit plus two demo projects. One is for a simple synthesizer. The other is for a "Lite Wave" synthesizer that is quite a bit more complex.

The toolkit is dependent on my MIDI toolkit for MIDI functionality. The MIDI toolkit is in turn dependent on several of my other projects. I have included and linked the proper assemblies in the download, so the Solution should compile out of the box.

This has been a brief overview of the synth toolkit. Hopefully, I will be able to improve this article over time as questions come up. It has been challenging to write because there is a lot of information to cover. On the one hand, I have wanted to give a useful overview of the toolkit. On the other hand, I did not want to get bogged down in details. Time will tell if I have struck the right balance.

If you have an interest in softsynths, I hope I've piqued your interest enough for you to continue on to Part II. It provides a more in-depth look by showing you how to create a simple synthesizer using the toolkit.

About the Author

Aside from dabbling in BASIC on his old Atari 1040ST years ago, Leslie's programming experience didn't really begin until he discovered the Internet in the late 90s. There he found a treasure trove of information about two of his favorite interests: MIDI and sound synthesis.

After spending a good deal of time calculating formulas he found on the Internet for creating new sounds by hand, he decided that an easier way would be to program the computer to do the work for him. This led him to learn C. He discovered that beyond using programming as a tool for synthesizing sound, he loved programming in and of itself.

Eventually he taught himself C++ and C#, and along the way he immersed himself in the ideas of object oriented programming. Like many of us, he got bitten by the design patterns bug and a copy of GOF is never far from his hands.

Now his primary interest is in creating a complete MIDI toolkit using the C# language. He hopes to create something that will become an indispensable tool for those wanting to write MIDI applications for the .NET framework.

Besides programming, his other interests are photography and playing his Les Paul guitars.

This synth is awesome and the information in the article has been fantastic. It's reignited my interest in synths!

I have been able to get things running under WinRT but my real passion is Windows Phone. It seems though that Sanford.Multimedia and Sanford.Wave will need a little TLC to get them working.

I understand that the author is a busy person with many better things to do than worry about my Windows Phone synth aspirations, but I am hoping that he might hear my plea (and promise of no further questions) and share the original source so I can see if I can bring them up to date.

I hope this helps. I'm not sure what's actually in that file. I just happened to find it on my server (I had a hunch I uploaded something there years ago). I need to do some archaeology through some of my older hard drives to really get all my old Code Project stuff together.

The wave stuff relies on interopting with the Win32 API. This may not work for the phone API. Maybe it has an audio API you could use in its place.

As you might or might not be aware, Windows Phone 7 lacks MIDI support, which is one thing my app desperately needs. I was part of the way down the road of writing my own synth when I came across your Synth Toolkit and MIDI Toolkit. While I haven't tried them out yet, it looks like they'll save me a ton of time. Thanks man!
Mike

When I saw it was you, Leslie, that authored this I was hoping so much that it would be VST compatible. While I'm disappointed that it isn't, I also know how much of a pain in the a$$ it would be to do so.

When you are ready to tackle that bit, there are plenty of resources on the net to help you out (after you register on Steinberg's website to download their SDK obviously). Readers who are looking for a C# VST SDK can try http://vstnet.codeplex.com/. Disclaimer: I've never used this myself.

You might know the Java Synthesizer API. Java supports soundbanks (e.g. SF2, wave files) which contain wavetables for several instruments (e.g. the General MIDI set).
Which wavetable format do you use in this project? I want to create a custom wavetable for playing a General MIDI instrument. If I use an existing SoundFont (SF2) and extract all the wave samples from it (using Java), how can I use those samples within this project?

Which wave samples does the library need to synthesize? Are there samples for each octave, note or something other?

Hi Leslie.
I can't get it running under Windows 7 64-bit. As I try to play notes on the virtual keyboard, no sound appears (no error or exception is thrown).
As I try to close the application, the window doesn't respond, without any of the default Windows "not responding" marks. I simply can't click on any elements or move the window. The same problem occurs if I try to load any other program.
Can anyone reproduce this error?

[Edit]
As I pause the application during debugging, it stops in the Synthesizer.Stop() method at the call to outputDevice.Reset().

[Edit2]
It seems to be a problem with the 64-bit DirectX audio library. If I update the project settings to compile for x86, the application works again.

Is it possible to use Auto Pitch Correction (Auto-Tune) with this code sample? I am trying to find a way to take a sound stream and Auto-Tune it, for a fun Windows Mobile app. But I know nothing about sound manipulation.

I have been looking for a softsynth implementation in a nice high level language so I can learn the basics by example, and this is a really amazing piece of work. I was amazed that you put this amount of work out in the public domain like this, many thanks!

It doesn't seem as though I can catch those events fired from that Windows Form when it's in view. I tried both KeyPress and KeyDown and neither of the two allowed me to extend the keyboard's functionality into the program. This would be a really impressive addition to the project; any ideas how I can implement this?

First of all, I want to say that I love this tutorial series, and I'm looking forward to Part III.

I've been tinkering with the idea of creating a softsynth which plays in real time based on peripheral input (any peripheral, but I'm currently thinking Wiimote). The softsynth will play alongside a finished musical track, and will know the pitch of the track at any time (probably through some sort of cue file). By manipulating the peripheral, the softsynth will change the pitch to different harmonics of the pitch of the song. Your modular design seems a perfect start for this.

First of all, I want to say that I love this tutorial series, and I'm looking forward to Part III.

Thank you!

Michel de Brisis wrote:

I've been tinkering with the idea of creating a softsynth which plays in real time based on peripheral input (any peripheral, but I'm currently thinking Wiimote). The softsynth will play alongside a finished musical track, and will know the pitch of the track at any time (probably through some sort of cue file). By manipulating the peripheral, the softsynth will change the pitch to different harmonics of the pitch of the song. Your modular design seems a perfect start for this.

Sounds cool! My goal was to design the toolkit in such a way that it can be easily expanded. It will be exciting to see what folks come up with, ideas like yours that I wouldn't have thought of on my own.

but I had to up the buffer size to 2048 to prevent popping (this is on a Turion X2 laptop).

It's strange, sometimes I can get away with 1024 as a buffer size without popping, then at other times I have to bump it up to get smooth playback.

BTW, there is a default bank file in the Release folder in the bin folder. Copy this to the debug folder. It was originally there, but I think the Code Project editor deleted some files (nothing critical) to make the download smaller.

The toolkit relies on MDX (Managed DirectX) for waveform output. There are two problems I've encountered with this. The first, and by far the worst, is that the output occasionally begins glitching. This is usually in response to a window event, such as moving a window or clicking on a control. When this begins happening, I will move a window around randomly until playback resumes normally. Very strange. And I have no idea for a solution.

The second problem is with the DirectX effects. They add a ton of latency and there's no way to tap into their output in order to record the results.

Today, I tore out DirectX of the toolkit (I still have an earlier version with it in), and replaced it with a custom library I wrote: Sanford.Multimedia.Wave. It includes classes for waveform output (as well as writing and reading wave files). I've been playing around with it tonight, and it seems more stable than DirectX. It has slightly more latency, the lowest I've been able to get is 20ms, which is not bad.

As far as the effects go, I've added functionality for you to write your own. I've added two effects, a stereo delay and a chorus, to the toolkit. Unlike the DirectX effects, they don't add a lot of latency and, more importantly, you can record the output from them directly to a wave file.

I'd definitely ditch MDX. I checked the August 2007 SDK, and it is still there, but Microsoft never got it out of beta, and unfortunately it looks like they never will. They do have a new audio architecture in the August 2007 SDK called XAudio2 that looks very promising, and it seems to address several of the issues you have. There's no managed wrapper for it yet, but it's purely COM and should be fairly easy to use in C#.

I'm writing an audio related app at the moment and I've been finding the audio side of things really difficult to settle on. At the moment it's using DirectX and I'm so not happy with it as it is. Ideally I'd like to go ASIO. I found this article of yours really interesting. Do you have any plans to go to ASIO?

I have zero experience with C++ or wrapping a C++ lib for C# so I'm looking at all the alternatives I can find at the moment.

I'm going to take a good look at this XAudio2 ...

Just out of interest what are the details of your multimedia.wave namespace?

Just out of interest what are the details of your multimedia.wave namespace?

It's a simple wrapper around the Win32 multimedia wave API for sending waveform data. I originally wanted to write an article about it which would include the code for download, but unfortunately I haven't been able to get around to it.

I know what you mean about ASIO ... I have to have it in an application I'm making as I need to properly support low latency 24-bit audio up to 192kHz!!! I keep putting it off though; I've hardly even looked at C++ code, let alone written any.

Information about supporting 24-bit audio is very thin on the ground. I'm hoping that XAudio2 will do it out of the box from C# so I can use that until I've cracked ASIO.