Encoding an image to sound

The purpose of this project is to encode an image to a sound that can be viewed with a spectrogram. For some time I have known that musical artists have encoded pictures into their music. Most notable of these is artists is Aphex Twin. Luckily I had a copy of Windolicker and a great visualization program Sonic Visualiser. After looking at the images I decided it would be cool to try and encode my own images. I saw a few programs available, but decided it would be a better challenge to write my own program from scratch using Perl.

Spectrograms

A spectrogram is a graph representing the intensity or a frequency with relation to time. Normally the frequencies are along the Y axis, with the time on the X axis. The intensity of the frequency is represented by the brightness of the color. The frequency and color can use either a linear scale or a logarithmic scale. Below is an spectrogram of a few piano chords. The audio file used can be found on Wikipedia here.

Image encoding

The idea I had to encode the image was to simply create a sine wave at a corresponding frequency to represent the Y axis, a corresponding time to represent the X axis and a corresponding amplitude to represent the pixel color intensity.

Creating Sound

The first step to encoding an image was to learn how audio formats work. At first I tried writing a script that plays a frequency to the ‘/dev/dsp’ (Which is the sound card on Linux). When writing straight to /dev/dsp you are limited by a sample rate of 8000hz and a sample size of 8bits. Below simple Perl script that plays a concert A 440hz. To execute run ‘./sin.pl > /dev/dsp’.

The DSP defaults do not offer much fidelity I needed at least the fidelity of an audio CD, which is 16bits at 44.1khz. I did some of searching on CPAN to find a library that allowed me write wave files. Most of the audio libraries had a too much overhead for what I wanted to do. Instead I looked up the file format for a ‘.wav’ and coded my own library. This library is limited to only producing a 16bit 44.1khz mono wave.

Luckily I found a simple bitmap reader on CPAN called Image::BMP. This is a nice lightweight library that dose not depend on any external libraries or compiled code. Using this library I was able to easily load and read the bitmap data.

Encoding the Image

The first pass of my program disregarded the color data and only produced a frequency for the Y axis if the color intensity was less that half the sum of all colors. Below is an example. Note: I converted the WAV to an MP3 to conserve bandwidth, at 320kbps not much data is lost.

I was really shocked to fist see the image! The only tweaking I needed to do was to use a linear scale for the frequency. Also if I selected too high an amplitude for the sin wave, clipping occurred in areas with too much black. For image above I used an amplitude of about 1000 on a scale of 0 to 32768.

The next step was to add amplitude scaling to match the color intensity. For this I summed all the color channels for a given pixel and scaled it to represent the max amplitude ‘(R + G + B) / 768 * max_amplitude’. Below is a picture of me after using the scaling.

By selecting a color scheme that goes from black to white and using a linear scale for the volume I get a very good black and white image. To prevent clipping on very dark images I added an inverse option that will invert the color producing a negative image.

You can reverse the color scheme to go from white to black to produce the regular image

Full Program

Below you can view and/or download the full code to this program. Currently performance is not optimized. So don’t write me telling me its slow. I currently have a few idea to speed it up. Also for best results use a small image around 100px tall.

I just ran the algorithm using Strawberry Perl on Windows (yeah, guilty me, and my job). It doesn’t seems to work, all it gives out is a scrambled mess of frequencies when I watch the resulting WAV on Sound Visualizer. I want to fix it or find what’s wrong here, because I think this article is really creative (and I want to use it to obscure some hints in a project I have).