Programming Notes

Sunday, 18 January 2015

From early on in our courses we were warned of a certain group design project. Older electronics students would share stories of it like war veterans. Even younger lecturers could recall their project experiences. Last February / March it was my turn to face this task: the D4 group design exercise. This was probably the most intense 3 weeks of my degree so far, and I can't really see it being beaten. The sheer volume of work, incredibly short time frame and outrageous specification ensured that we were working flat-out right through to the last second. Although we have much larger third and fourth year projects, we get an entire year for both, so the workload is spread over a much longer time (for most students). Reflecting on the project after having had time to recover, we did manage to achieve some cool things.

D4 always starts off shrouded in secrecy. First, a backronym is released, giving us absolutely no idea what we'll be doing. For us, the acronym was "BOOMBASTIC". A week later, everyone attends a lecture where the project is revealed, the specification given and deadlines set. BOOMBASTIC turned out to be "Body-Operated One-Man Band with Amplification, Storage and Transmission Integrated Circuits". Apart from "Integrated Circuits", this was actually a decent description of what we had to build - in groups of 6 we had to create a portable audio device for a street performer, with the spec demanding a ridiculously large number of features. The device had to amplify its output, be battery powered, stream audio wirelessly and support saving and playback of performances - all to be flawlessly implemented in 12 days. Fortunately, I don't think I could have been given a better project, considering my interest in audio programming.

I was part of team "Eminem" and one of the others in our group thought up a cool concept - what if we turned a computer keyboard into a musical keyboard? We envisaged a small module with a USB port, allowing any standard keyboard to be plugged in and used as a musical instrument. We split the design up into separate tasks: power supply, amplifier, wireless streaming, keyboard interface, peripherals and the synth software. I took up the software responsibilities, basically writing a synthesiser from scratch to run in real-time on a Raspberry Pi. Since the synth was quite key to having a product that did anything interesting, I had a fair amount of pressure to get my part working to some degree. Everyone had a responsibility to design and get a working module though, the idea being we could work as independently as possible.

There were several key parts to my software design. I used a relatively simple method of audio generation called "wave table" synthesis - in this method we calculate and store a single period of a waveform. The wave table can hold anything from something as simple as a sine wave to something as complex as a piano sound. By reading from the table, which is just an array in a program, at different speeds we can generate different frequencies. This is intuitive if we read it back at nice multiples e.g. 2x or 4x. For all other frequencies, there will be some aspect of interpolation, which is just estimating the value between stored samples. For this project high fidelity audio wasn't really an aim, so I settled on linear interpolation. My understanding of this technique came from the book I used to teach myself audio programming - "The Audio Programming Book" by Richard Boulanger.

In my design I decided on having two wave tables - this allows you to do lots of cool stuff like amplitude modulation (AM), frequency modulation (FM) or just mixing between the two tables. I also wanted enveloping of the output - so notes could have attack and decay. This was supported by also having two envelope tables - sets of points corresponding to the envelope of a given note. Finally, for maximum flexibility, I included two extra tables called "tone maps". These map every ASCII letter to a frequency - by default the frequencies correspond to the Western scale using equal temperament, but these tables allow the synth to make pretty much any noise.

The basic idea of my design.

After a few all-nighters I ended up creating the name "ASCIIboard" and it was adopted for our project. For some reason we liked the idea of using ASCII codes, which are essentially just named byte codes, to represent different operations. As a team we came up with a specification for what each letter would do, with a large chunk reserved for producing musical sounds. All the peripherals talked to the Pi using a really simple protocol where single characters (so essentially just single bytes) were exchanged. One of the most important principles of the software is that everything can be done using these codes. There is a simple core API which accepts characters and does everything based on them. Any user code or higher level processing simply produces these codes and feeds them to the API. I spent a while on the core API trying to make sure it could do anything using this set of 127 unique codes. But by only using these codes, it's possible to completely reproduce the output of the synthesiser anywhere - on a Pi or even on a PC - just by feeding the ASCII codes into the synth.

Polyphony was probably too ambitious given the time, but I pretty much had it working by the end. Polyphony basically means being able to produce multiple notes at once - think of chords on a piano. On the first attempt I didn't realise how hard it would be, so the solution was buggy. A lot of exception faults later, I stopped and thought hard about what I had done, then came up with a data structure and a way to manage the polyphony that would be a bit more robust. Although there are only 2 wave tables, we also need oscillators to generate audio. For the uninitiated, an oscillator is a device that generates a single frequency, whether in circuitry or software. In this case an oscillator is simply a structure containing a position in a wave table, with an amount to increment by on each sample. If we have say 5 oscillators, we can generate 5 different notes simultaneously. More oscillators, however, mean more computation for each sample. The Pi couldn't support enough polyphony to have every note sounding at once (convenient as that would be, it would also sound awful) - this meant I had to set a limit on the number of notes sounding simultaneously. With a limited number of oscillators, I had to manage which were in use and which were free, and track which oscillator belonged to which note. In the end I used a linked list based structure to track available and used oscillators, with oscillators returned to a pool after a note is released and requested when a new note is pressed.

The core API design.

Probably the most outrageous and unnecessary part of my design was the ability for a user to add in their own features and functions. The processing was organised around a list of callback functions, so that a user could insert their own code into the program and reconfigure the synthesiser to do anything. This wasn't really needed, because no other user will probably ever use my code, and there wasn't really any credit received for this feature. But basically, the software could continue to be extended fairly easily with no modification to the underlying code.

I used this design to implement some useful example functions. One of these was tremolo - achieved by writing a sine wave (or triangle wave) to the second wave table and performing AM. Remembering the principle the design is based on, all this function does is generate a stream of character codes which feed into the synth. Any notes the user now plays, which pass to the synth core, will have tremolo applied (it sounds like a note fading in and out quickly). Vibrato is another useful one - a slight varying of pitch - done similarly with a sine wave in table 2 and FM of the first signal. Hopefully it's kind of clear now how lots of things can be achieved with this really simple design.

Finally, just in case I hadn't already attempted enough, I added some basic filtering in as well. I implemented a low-pass Butterworth filter which could be added to the output of the synth. This was an example of a feature that couldn't really be done simply by generating characters. An attempt was made to add a (dynamic range) compressor to the output, as I found the synth was really quiet with a single note sounding but really loud with lots of notes. I realised I was being stupid and just scaled the output by the number of notes playing; a compressor is maybe another project for another day.

From this I probably make the project sound overall straightforward. It's quite easy with hindsight to make it sound like I designed stuff and it worked perfectly; unfortunately this wasn't the case (and never is for an engineer). The first part of implementation was to get the synth core API working on a PC and producing an audio file output, so not running in real time. I could feed a few characters in, step through with a debugger and examine the audio output with Audacity. Without this step I probably wouldn't have got anything working. However, moving the design to the Pi was incredibly frustrating - I had a pretty cool synth module working on a PC, but for a few days the Pi couldn't even produce the simplest sine wave tone without sounding mangled. There was also a much bigger learning curve for ALSA (sound in Linux) and getting PortAudio to run on a Pi. One of the biggest challenges was audio glitches on the output. The final solution was to use the callback mechanism in PortAudio, and also to use the circular buffer structure supplied by the library. This circular buffer structure is thread safe and it just works, unlike my own buffer. All the buffer had to do was store a set of generated samples, written to and read from in blocks of samples.

An oscilloscope trace of the glitching before the fix - in this case I was hoping for a sine wave.

The main difficulty for me was the tight time constraints. In a perfect world I would have liked to have programmed an ARM processor or at least used a real-time Linux kernel. Sadly, there was no time to waste tinkering with new hardware - the Raspberry Pi was a compromise on a truly embedded system which wasn't that different to developing on a PC. Also, quite early on we had to ditch any hope of using a USB keyboard and hacked an old PS/2 keyboard instead. The final system had a lot of unnecessary electronics, partly because of the way we had to split it up into isolated working parts. D4 taught me a lot about compromise and failure, and the consequences of those when you're doing a group project. The project was even introduced to us as our "first lesson in failure", so it wasn't a surprise lots of things weren't successful.

Perhaps the biggest failure for us was the final stage - integrating all the parts together. In the last week I easily put in 13+ hours a day, with the final night being my first real "all-nighter" coding. Although not everyone was as keen as me, almost all of us put in a lot of work, and had a lot to do. Because of this, we all kind of forgot that we would at some point need a complete unit, with parts working together. Fortunately, because of the simplicity of the protocol, we did get some of the bits communicating. Most importantly, the keyboard worked and we had a design that made sounds. This integration was all done within the last couple of hours, with frantic work literally done in the final minutes of the project. A week later, each team was allocated about an hour to set up a demonstration of the project. This ended up, for us, being a frantic hour disassembling our nicely packaged, but slightly "bricked" design, so that I could hack together the parts that all worked reliably. Luckily for the team, the markers were happy with our end result, even if it was far from where we'd have liked it to be.

For me the final important lesson was that as electronics students we are now capable of making something really cool if we just take the initiative. We have the tools now to be able to come up with concepts, designs, then source components and quickly prototype something. I'm going to leave a few resources for the interested. Just for a bit more information, here's a link to my project report. Here's a link to the code just in case anyone has a vague interest in looking at / hacking the code themselves. Finally, here's a link to the photo album for D4 2014 (which features all the groups in my cohort). There were some other interesting and impressive projects, from gloves that produced piano sounds, to laser harps, to a print out, playable, paper keyboard.

Friday, 19 September 2014

The amount I learnt in first year of university was ridiculous. The course touched on almost every area of electronics - maths, physics, solid state, digital design, programming, communications, to name a few. In second year we were trusted to go and hone these new skills, mainly in the form of larger group projects i.e. the D4 project.

Most of the modules were advanced something - continuing on from the work of last year. One of the most interesting new subjects was a module on computer architecture. At the start I had a vague idea of things like instructions, but by the end I was up to speed on computer design, concepts such as processor pipelines and caching.

The first major project was designing an integrated circuit (D2). There was a specification of a few modules we had to implement - things like a 4-bit adder, ring oscillator and an 8-bit counter. As a team we designed and built the different modules. Then, at the end, we fudged together some kind of final IC and realised it wasn't all going to fit. This was the first real taste of having to work as a team, and having a fairly tricky deadline. There were plenty of stressful moments and late nights staying in the lab until 11pm (even on a Sunday...).

The circuit designs were done by hand, although someone in the class discovered an amazing tool - Logic Friday - which is a free program that minimises logic for you. Once we knew which gates we needed, we started simulating in OrCAD. Simultaneously, we were laying out the designs on an IC using L-Edit, putting down the tracks and trying to cram the designs into the smallest possible space. The whole work-flow at this point was pretty nasty and clunky. There was a lot of exporting hundreds of files to use the different tools, for example to simulate the IC design by creating a PSpice model. Also, the L-Edit middle mouse button fixation was fairly frustrating. However, this was our first experience with the tools, and we all rushed into the design without spending much time getting accustomed to the programs we were using.

Why I had no life for about a week - everything was done by hand

This was early on in Semester 1; fast forward quite a few months (including winter exams) and we had a physical IC with our design etched on. In Semester 2 we had to test it, which meant writing a set of test vectors (inputs and expected outputs) that could test the chip for any faults.

D2 was tough, but the main highlight of 2nd year at Southampton was the main group project D4. D4 is a fairly crazy 3 weeks of designing, implementing and presenting, and a traditional part of second year at Southampton for electronics. D4 is going to need its own write up to do it justice, but in summary we had to build some kind of musical "thing", ideally used by a street performer - meaning it needed to be portable, have built-in amplification and the possibility of being battery powered. There was a whole range of projects from different people but ours was essentially a (computer) keyboard operated synthesiser, all processing done by a Raspberry Pi.

After D4 it wasn't long before summer exams, which were all fine and there weren't too many surprises. Generally, I find focussing on exams uninspiring, although it forces you to properly understand all the material you've neglected. For me, there was a fairly big chunk of knowledge missing where lectures had been going parallel to the group projects.

Sunday, 7 September 2014

It's been ages since I put anything up here, so I'm going to have to account for the last couple of years. There was some unfinished stuff on my first year at uni, so I guess that should go first.

The most important outcome of last year was definitely passing my first year of university. First year was awesome and I have no regrets about choosing electronics. The balance between software and hardware is exactly what I was hoping for - there was lots of programming along with circuit theory, semiconductor physics and control theory, among other things. The course was relatively intense; there were often 9-5 days of labs and lectures, and a couple of projects at the end of each semester. Not only did I have to learn a completely new set of skills, I had to learn to cook, survive student nights out and in general learn to live away from home. It's crazy to think that in the first lab I barely knew how to use an oscilloscope, but by the end of first year we were building and testing complex circuits, designing PCBs and testing solar cells.

ECS - The cause and solution to all my problems

In terms of C and C++ I took a lot out of the embedded programming part of the course, which was an area I hadn't touched before. In one of the first labs we had to build a sturdy microcontroller board based on an AVR ATmega644P chip. In later labs we were given tasks to do with this board, like make a simple heart rate monitor, or make a tiny speaker play a given tune. Lots of twiddling was done at the bit level, and hours were spent looking through the monstrous data sheet trying to find what a specific register did. The final project was definitely satisfying - we had to create a voltage boosting circuit which could take a dynamic load. To do this we had to use the microcontroller as a control system, constantly adjusting a pulse and monitoring the output voltage. We also had to have communication between a PC and the board so that a new voltage could easily be set.

One really interesting area that was covered was programmable logic devices (PLDs) and Hardware Description Languages. We were taught some SystemVerilog and given a few CPLDs to play with in the lab. Although the stuff we did was basic, it was a completely different way of programming. We had to start to think in terms of sequential and combinational logic, where combinational logic is all done in parallel.

The biggest worry about doing this course was that I would lose my enjoyment of programming. When someone is telling you that you have to do something, psychologically that seems to change how we all approach it. Being spoon fed, made to learn stuff for exams, made to rush some coursework you left too late is not fun, and you get so little out of it. It did feel a little like that when I was doing some programming labs: in the stress of constant marking and assessment, I wanted to do the minimum needed and go home. However, I think that overall, I was given opportunities I wouldn't have had if I hadn't chosen this course. Most importantly, I learnt how to start applying my skills to real-world applications. I got to see my code make things physically happen, solve real problems, and I guess at the same time get me a few marks towards a degree!

Wednesday, 31 October 2012

For a presentation that I've got to do for university I've returned to the project I did writing a Windows application to show the Mandelbrot set. Because I couldn't cram everything into 7 minutes of Powerpoint presentation, I wrote up a little background.

Wednesday, 17 October 2012

Hopefully you've seen my post on my phase vocoder project. Otherwise there won't be much motivation for this slog at the theory; when you have this much maths and theory there better be a good reason for it. Fortunately, the phase vocoder can do some extremely cool stuff, like shift the pitch of some audio while keeping it at the same speed. If you haven't read about Fourier's theory, or the Fourier transform, look back to simpler times. Without understanding this you really won't understand the next bit.

An FFT is pretty useful for breaking down the frequencies in a snapshot of audio. However, a snapshot is just a snapshot - capturing an instant in time. Generally we'd be talking a transform size of 1024, up to maybe 4096 samples. You might wonder why you can't just take a transform of an entire 3-4 minutes of audio, and then use that for all your spectral needs. While that transform will have an astounding number of spectral points - i.e. really super "spectral resolution" - it will have awful "time resolution" - i.e. you have no way of seeing how the frequencies change over time. You wouldn't be able to pick out the distinct thumps of a bass drum or the squeal of a 30 second guitar solo, because it's all squished down into a generic spectral mush over the whole song.

What would the intuitive next step be, if we want to see how the frequencies change in time? What about taking two transforms, with the first transform covering the first half of the audio and the second transform the second half? Now we have half the number of spectral points, but we do have some time resolution, i.e. we can see how the spectrum changes over the audio. Moving on, the natural progression is to make the transforms smaller and have more of them. You could make the transform size 1024 samples and then just split the audio into tiny sections. After taking loads of Fourier transforms, you end up with a load of spectral data. You will now be able to see how the frequencies change over time, with good enough time and spectral resolution. This is the beginnings of the STFT - the Short-Time Fourier Transform. A good visual way to imagine this process is as a spectrogram (often a set of bars that light up, indicating frequency levels) on a CD player or in a digital audio workstation.

It would be nice if the STFT was as simple as slicing up audio and shifting the magical FFT along, but there are catches. Because the FFT abruptly slices audio, but then assumes that snapshot is periodic, you get some nasty spectral leakage. The phenomenon is illustrated below.

How the Fourier transform mauls your signal

Spectral leakage is the bane of DFT processing. Discontinuities are pretty much guaranteed when transforming an arbitrary signal. There is no perfect solution to this, but there is a very good one. Often you'll hear mention of the "transform window"; the window is the set of samples that the DFT is currently "looking at". Leaving the samples alone, this window is called a rectangular window. Ultimately we want to get rid of those jumps at the sides of the window. To do this we use a windowing function, a function designed to smooth out discontinuities while not affecting the frequencies too much.

To apply the window you simply multiply each input sample with the corresponding sample of the windowing function. There are many windowing functions, e.g. Blackman, triangular, Gaussian, but the two most popular are the Hann window and the Hamming window. They're almost identical, apart from a slightly different constant. The formulas and pictures of what they look like are given nicely on Wikipedia.

Windowing however mangles the audio a bit, and on reconstruction the audio will sound like it's coming in and out really quickly (amplitude modulation) - not really what you wanted. The solution is to overlap the snapshots enough that on reconstruction the original signal can be rebuilt fine. For most windows an overlap of 75% is required. Often, for the STFT, we refer to the "hop size", i.e. how many samples you "hop" the window over each time. If N is the transform size, a common hop size is N/4 or maybe, if you can afford the extra data, N/8.

The STFT, pictorially (drawn badly)

The STFT is useful for some spectral things, for example the cross synthesis project I did earlier, but not capable of shifting pitch or stretching time without distortion. However, with only a small modification of the spectral data, we can – this operation is the heart of the phase vocoder algorithm. All will be revealed in the next installment.

Saturday, 13 October 2012

I find that when I learn stuff, a good way to not forget the stuff is to write notes. I write notes on the things that took a long time to get, the things I found interesting and the answers to the questions in my head at the time.

As a result, I've written a pretty ridiculous amount of notes for all the stuff I've learnt on Digital Signal Processing (DSP) - about 10,000 words or 37 pages, done in a window of extreme boredom in the summer holidays.

I'm not sure how useful these notes will be to anyone else, as the notes won't be as thorough as a well written textbook. Anyway, here they are. Enjoy.

I've recently moved to university (college for Americans) and so life has been a bit hectic for the last 3 weeks. The course is great (electronics) and I'm sure now that engineering suits me better than computer science. Anyway, the result of settling in is I've done hardly any programming, apart from the introductory C labs they give to everyone.

A side note is I can appreciate how hard the programming aspect will be for a complete newcomer. I'm not completely sure about lectures on programming and how much people will get out of them, but who knows? The practicals seem to throw people in the deep end a little, and there doesn't seem to be much critical feedback - the staff are teaching good style and methods, but I'm not sure what will stop people developing bad habits.

The lack of activity will finally let me catch up to where I am in (audio) programming. I've only just briefly started looking at Csound, which is kind of like learning a new language in itself. The last major project I did though was creating a phase vocoder. This was probably the most exciting thing I've done with C++.

In short, the phase vocoder breaks some audio down into the frequencies that make it up (spectral data), using a STFT, and then converts the data to a form which is independent of time. Once you have data independent of time, you can play the audio back at any speed you want and keep the pitches all the same (time stretch). Once you can time stretch, you can resample, bringing the audio back to speed but now with the pitch raised or lowered.

Maybe as motivation for the theory, here is the output of my phase vocoder on some audio samples.
I'm going to follow up with the theory and pseudocode in the near future. Another pretty cool application of the phase vocoder is performing a "spectral freeze", where you hold onto a single spectral frame of data. The result sounds pretty unnatural. The phase vocoder can also cross two spectra, and even interpolate between spectral data.

I got a fair number of hitches in the design process. "The Audio Programming Book" has quite a compact implementation, in typical C style. I wanted to use C++, and I wanted to create something much more general and expandable. I created classes to encapsulate audio signal data and spectral data - this worked well because it followed the RAII (Resource Acquisition Is Initialisation) principle, and as a result I got no memory leaks in my program. However, this layer of complexity did have a performance impact every time you wanted to actually access the data. The solution was to make a function that passed the internals out (sounds like bad practice, but I'm pretty sure it's okay for that kind of object) and allow the buffer array to be accessed normally. I also had a class for the phase vocoder itself.

Once I'd fudged something together, obviously the output was nothing like it was meant to be. I searched for more answers online, but was met mostly with impenetrable articles from journals. I intensely debugged my code, fixed some problems but still wasn't getting output of any value. Probably the most frustrating part of debugging is when you find a bug but the program still doesn't work.

In true programmers' fashion, after testing every function in turn and stepping through almost every line, comparing different outputs with manually calculated values, I pinned down the problem and slapped myself. In the tiny function "phase_unwrap", which brings the phase (just an angle in radians) down to the range -pi to +pi, instead of adding / subtracting two pi I was adding / subtracting pi. This tiny change was ruining the output. I guess this acts as yet another reminder to always, always check every single function you write and run some expected input and output through it. The moment you assume a function is too trivial to get wrong is the moment you set yourself up for hours of debugging!