24 November, 2011, 11:26:44 PM

Here is my situation. I am trying to play back sampled audio on an old Mattel Aquarius. The computer can only put out a high value or a low value to the audio channel. It is very similar to the PC speaker in this regard. I have been toying with different ways to get sampled audio out of the machine, using a PC, Cooledit and a program that I wrote to do conversion and playback. The absolute max sampling rate that I can get out of the machine is 50,700 Hz, although at that rate I really am not allowing the audio signal to max out, so it can be very quiet. I have created a 1 bit encoder that takes an 8 bit unsigned .RAW file as an input.

The first thing I do is downsample my target audio to 8 bit mono, then save it as a raw file. My converter then takes that raw file and converts it to a 1 bit audio data stream. I want to create audio files that I can use with my video player, and since I am limited to 1 megabyte of storage total (bankswitched cart) and I need to use most of the data storage for my video frames, I really need a way to create half-way decent sounding 1 bit audio. In addition, I would like a method to optimize higher sampling rate audio for other purposes (or to just see how "good" the audio could get) but I will limit this question to the task at hand. First, let's start with my source file:

So, straightforward enough... Now here is the output from a simple program that takes every 8 bit unsigned value greater than 127 and makes it a 1 and every value at 126 or lower and makes it a 0. No distribution of the error or dithering is performed. In order to play it back effectively, I converted the output to an 8 bit wave with values of 0 and 255 respectively:
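For reference, the simple converter described above could look something like this in Python. This is only a sketch, not the original program: the function names are made up, and since the post does not say what happens to a value of exactly 127, treating it as 0 is an assumption.

```python
def threshold_1bit(samples: bytes, threshold: int = 128) -> list[int]:
    """Map each 8-bit unsigned sample to 1 (>= threshold) or 0 (below it).

    With the default threshold, values above 127 become 1 and everything
    else, including 127 itself (an assumption), becomes 0.
    """
    return [1 if s >= threshold else 0 for s in samples]


def bits_to_wave_bytes(bits: list[int]) -> bytes:
    """Expand the 1-bit stream to 8-bit values 0 and 255 for playback on a PC."""
    return bytes(255 if b else 0 for b in bits)


if __name__ == "__main__":
    raw = bytes([0, 126, 127, 128, 255])
    bits = threshold_1bit(raw)
    print(bits)
    print(list(bits_to_wave_bytes(bits)))
```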

Inside my program I built in the ability to distribute the quantization error. It works like this:

Quantize the current sample (to 0 or 255) and take the difference.

Add the error to subsequent samples prior to their quantization, distributing various amounts.

I also have the ability to add dither in the same manner, random noise at up to 255.

I have tried to learn about noise shaping and dithering, etc, but frankly, at this low a sampling rate I am not sure what applies.

Here is an example where I take 1/3 of the quantization error and apply it to the next sample, or in the form of
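The error distribution described above, with 1/3 of the error carried to the next sample, might be sketched as follows. The function name, the decision threshold of 128 and the sign convention for the error are assumptions; the original program may differ, and dither is left out to keep the sketch short.

```python
def diffuse_1bit(samples, fraction=1 / 3):
    """Quantize 8-bit unsigned samples to 0 or 255, carrying part of the error.

    `fraction` of each sample's quantization error is added to the next
    sample before that sample is quantized.
    """
    out = []
    carry = 0.0
    for s in samples:
        v = s + carry                 # sample plus error carried from the previous one
        q = 255 if v >= 128 else 0    # 1-bit decision, stored as 0 or 255
        out.append(q)
        carry = (v - q) * fraction    # share of the error passed to the next sample
    return out


if __name__ == "__main__":
    # A steady mid-level input toggles between the two output levels.
    print(diffuse_1bit([128] * 8))
```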

Here is what I would like to know. Given a sampling frequency f (in this case, f = 7,061 Hz), how do I get the quantization noise to be focused at a less audible frequency? I understand that at 7,061 Hz the maximum frequency I can represent is about 3,500 Hz, but I know from looking at human ear frequency response curves that there is a dip at around 3,000 Hz (if I understand it correctly) and I would like to "push" the quantization distortion to that frequency.

Anybody have any good ideas? Also, any thoughts on a different way to approach this would be great.

So at a sampling rate of 7,061 Hz I am toast - unless there is some way to move the noise BELOW 100 Hz - is that even possible? At this point all I have done is play with trial and error, and I realize I won't get perfectly clean sound, but I'd like to figure out how to optimize it - some super sound genius must know how?

What you say reminds me of what some old audio programs did on MS-DOS before 1995.

Doing only 1 bit audio as you are intending won't get you much further than where you are. Dithering will not help at all, since the noise you're adding as dither is at the same level as the sound you intend to hear. Dithering adds noise, remember that.

You also cannot do any worthwhile noise shaping at that frequency. You said the machine supports up to 50 kHz but for some reason you're using 7 kHz for audio. The higher frequency might help with the noise shaping, but, again, you don't have enough resolution for a good solution.

Now, what did the programs that I mention do?

There are two solutions, depending on what the hardware allows you to do:

If you have some sort of volume that can be controlled by software, you simulate the bits using the volume control. This alone would be better than anything you've got until now.

The second solution is some sort of pulse width modulation. At 50 kHz maybe you don't have too much of a margin, but the idea is to control the volume (so, the bits) by varying the pulse width. Maybe you could get 4 or 5 bits with this solution at 50 kHz while playing up to 10 kHz of audio.
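As a rough illustration of the pulse width idea, here is a minimal Python sketch that turns each 8-bit sample into a fixed number of output bits, with the count of leading 1s proportional to the sample value. The function name and the `slots` parameter are assumptions for illustration. With 5 slots per sample at 50,700 output bits per second, the audio sample rate works out to 10,140 Hz with 6 distinct levels per sample.

```python
def pwm_encode(samples, slots=5):
    """Encode 8-bit unsigned samples as pulse-width-modulated bits.

    Each sample becomes `slots` output bits: a run of 1s whose length is
    proportional to the sample value, followed by 0s. This gives
    slots + 1 distinct levels per sample.
    """
    bits = []
    for s in samples:
        width = int(s * slots / 255 + 0.5)   # pulse width in output ticks, 0..slots
        bits.extend([1] * width + [0] * (slots - width))
    return bits


if __name__ == "__main__":
    # Silence-low, full-scale, and mid-level samples.
    print(pwm_encode([0, 255, 128]))
```

Because the pulse always starts at the same point in each sample period, the output contains a strong component at the audio sample rate, which is one plausible source of a whistle if that rate is within hearing range.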

Both ideas require the audio to have the desired amount of bits (i.e. it is a solution applied on playback). With a single bit, you have no solution.

Yes. No point moving the "noise" down towards DC IMO - it won't change per sample so will make things worse.

What you can do is process the signal before re-quantisation using DRC, EQ etc to make it as loud as possible beforehand, so maximising the limited SNR.
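A very simple stand-in for that kind of preprocessing is peak normalization followed by hard limiting, sketched below in Python. This is not real DRC or EQ, just the crudest way to push the level up; the function name and the `gain` parameter are assumptions for illustration.

```python
def maximize_level(samples, gain=1.0):
    """Peak-normalize 8-bit unsigned samples, then hard-limit to full scale.

    gain=1.0 scales the loudest peak exactly to full scale. gain > 1.0
    pushes the signal harder and clips anything that would exceed it.
    """
    centered = [s - 128 for s in samples]            # convert to signed around zero
    peak = max((abs(c) for c in centered), default=0)
    if peak == 0:
        return list(samples)                         # silence: nothing to scale
    scale = gain * 127 / peak
    out = []
    for c in centered:
        v = round(c * scale)
        v = max(-127, min(127, v))                   # hard limiter
        out.append(v + 128)                          # back to unsigned
    return out


if __name__ == "__main__":
    print(maximize_level([128, 138, 118]))
    print(maximize_level([128, 133, 138], gain=2.0))
```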

If you were using 50kHz, or anything close, then you could use noise shaping quite effectively - though the amount of ultrasonic energy you might create could damage the speaker. There would be less damaging but still quite effective compromises though.

EDIT: if parts have to be quiet, a little dither might help. Not the right amount (you don't have room for it!), but just a little.

For just playing back audio - I could use the 50,700 Hz, but to play audio and video at the same time, I have to use a lower sampling rate. The idea of making the audio louder before encoding sounds like the best bet. I'd still be interested on how to implement noise shaping at 50,700 Hz for other audio playback purposes.

I did implement PWM, but I ran into the same problem. I really can only get 11 effective PWM "levels", and I end up with a really annoying piercing whistle (or buzz, or square wave) rather than distortion noise. The distortion noise seemed more pleasant.

Are you sure that binary on/off is your only option? Can't you employ its 1-voice synthesizer to switch tones on and off?

I am sure something else is possible - but I have digital audio. All I can do is send a 1 or a 0 to the audio out port - that's it, full stop. If I send a series of 1's and 0's at a fixed frequency then I get a square wave, but that doesn't help me with sampled audio...

If you can actually run at 50.7kHz, that provides opportunities for increased fidelity. What would make sense to me would be to take an integer fraction of 50.7kHz and sample at that. Let's use 5070Hz (ie. 50700Hz / 10) as an example. Now, you do a trivial upsample to 50.7kHz by adding 9 zeros after each sample value, *then* you dither. Using noise-shaping, you can probably push some of the noise out of the audible range and improve your fidelity.

This might take too much CPU time, but it keeps the amount of audio signal data down, and improves fidelity.

This approach isn't restricted to 50.7kHz; you could likely do it for some other arbitrary sampling rate. The point is that the higher your functional sampling rate, the further up the frequency spectrum you can push the noise, which improves perceptual fidelity. My point is: More sampling rate = more better options for improving fidelity.

Ok, so, let's talk specifically about how to accomplish that. I have 50,700 Hz to work with, and I don't mind preprocessing the signal, in fact, that is my preference. The 50,700 Hz is the absolute max I can play back at - my source material is typically going to be 44,100 Hz. What math do I do to the signal to shape the noise up toward 25,000 Hz and what math do I do to add dither to the sample?
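One common answer to the "what math" question is first-order error feedback with optional dither, sketched below in Python. It assumes the input has already been converted to floats in the range -1 to 1 at the playback rate; the function name, the `dither_amp` parameter and the triangular dither are illustrative assumptions, not something proposed in the thread. Subtracting the previous quantization error from the current sample gives output = input + err[n] - err[n-1], so the error is high-pass filtered and its energy moves toward half the sampling rate.

```python
import random


def noise_shape_1bit(samples, dither_amp=0.0, seed=0):
    """First-order error-feedback quantizer producing a list of 0/1 bits.

    `samples` are floats in [-1.0, 1.0]. `dither_amp` scales optional
    triangular (TPDF) dither added just before the 1-bit decision.
    """
    rng = random.Random(seed)
    err = 0.0
    bits = []
    for x in samples:
        v = x - err                                    # feed back the previous error
        d = dither_amp * (rng.random() - rng.random())  # triangular dither in (-amp, amp)
        y = 1.0 if v + d >= 0.0 else -1.0              # 1-bit decision
        bits.append(1 if y > 0 else 0)
        err = y - v                                    # error made on this sample
    return bits


if __name__ == "__main__":
    bits = noise_shape_1bit([0.5] * 1000)
    # The average of the +1/-1 output tracks the input level.
    print(sum(2 * b - 1 for b in bits) / len(bits))
```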

What if you were to convert to 50,700 Hz Direct Stream Digital (aka 1-bit Delta-Sigma Modulation) and play that directly? If you can run at that speed, you might be able to realistically get about 4 or 5 bits of precision and still retain a reasonable bandwidth.

I think that is a great idea - how would one convert the audio to DSD at 50,700 Hz?
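A minimal offline conversion could go roughly like this: resample the source to 50,700 Hz, run it through a first-order delta-sigma modulator, and pack the bits into bytes. The Python sketch below makes several assumptions: input is floats in -1 to 1, linear interpolation stands in for a proper resampling filter, the modulator is only first order, and the MSB-first byte packing is a guess at what a player might want. All function names are hypothetical.

```python
def resample_linear(samples, src_rate=44100, dst_rate=50700):
    """Crude linear-interpolation resampler (a proper filter would sound better)."""
    n_out = len(samples) * dst_rate // src_rate
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate        # position in the source, in samples
        k = int(pos)
        frac = pos - k
        a = samples[k]
        b = samples[min(k + 1, len(samples) - 1)]
        out.append(a + (b - a) * frac)
    return out


def delta_sigma_1bit(samples):
    """First-order delta-sigma modulator: floats in [-1, 1] to a list of 0/1 bits."""
    integ = 0.0   # integrator state
    fb = 0.0      # previous output, fed back as +1.0 or -1.0
    bits = []
    for x in samples:
        integ += x - fb                      # accumulate input minus last output
        fb = 1.0 if integ >= 0.0 else -1.0   # 1-bit quantizer
        bits.append(1 if fb > 0 else 0)
    return bits


def pack_bits(bits):
    """Pack bits into bytes, MSB first, zero-padding the final byte."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        chunk = list(bits[i:i + 8])
        chunk += [0] * (8 - len(chunk))
        byte = 0
        for b in chunk:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    source = [0.25] * 441                    # 10 ms of a constant level at 44,100 Hz
    stream = delta_sigma_1bit(resample_linear(source))
    print(len(stream), len(pack_bits(stream)))
```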

Anyway, I took some advice from here and ran the source audio through a hard limiter to push the signal strength as high as possible prior to the 1 bit quantization. I then used Bitcrusher (VST plugin) and fiddled with the settings. Although it does bit crush, it still creates a file with 3 levels instead of 2, so then I ran it through my 1 bit quantizer, encoded it, burnt the EPROM and ran it on the original hardware. I captured the output at 44.1 kHz from the audio out port on the machine. (Modded Mattel Aquarius to output composite video and one channel of audio.)