Audio encoding (where to start)

This is a discussion on Audio encoding (where to start) within the C++ Programming forums, part of the General Programming Boards category.

I'm planning on learning audio encoding: how to read it, parse it, play it, etc. I plan to learn video encoding later down the road, but I thought audio encoding would be the better place to start.

There are tons of different encodings for both audio and video. Some examples:
Audio: PCM, Vorbis, FLAC, WMA, AAC, MP3, etc.
Video: MJPEG, all the flavors of MPEG, WMV, etc.
Additionally, where audio and video meet (and where there is video, there is almost always audio...), there's usually another thing to worry about, a container format:
Container: Ogg, RIFF, MP4, ASF (Windows Media), etc.

However, when it comes down to it, if you want to know how to read a certain file or format, you just need to research. Find the appropriate documentation; chances are good that it's out there on the web somewhere. Read it, figure out how the format works, and build the appropriate software. (Note: the skills needed for binary file I/O and proper (de)serialization of data are a prerequisite here. And all the basics of programming are a prerequisite to that.)
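To give an idea of what the binary I/O and deserialization part looks like, here is a rough sketch that walks the chunks of a RIFF/WAVE file and pulls out the basic format fields. The names (WavFormat, read_le, read_tag, parse_wav) are made up for this example, and it assumes a plain PCM file with a standard "fmt " chunk; it is not a complete or robust WAV reader.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Fields taken from the "fmt " and "data" chunks (hypothetical struct for this sketch).
struct WavFormat {
    std::uint16_t audio_format = 0;    // 1 means uncompressed PCM
    std::uint16_t channels = 0;
    std::uint32_t sample_rate = 0;     // sample frames per second
    std::uint16_t bits_per_sample = 0;
    std::uint32_t data_bytes = 0;      // payload size of the "data" chunk
};

// Read a little-endian unsigned integer one byte at a time,
// so the result does not depend on the host's byte order.
template <typename T>
bool read_le(std::istream& in, T& out) {
    out = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        int byte = in.get();
        if (byte == std::istream::traits_type::eof()) return false;
        out = static_cast<T>(out | (static_cast<T>(byte) << (8 * i)));
    }
    return true;
}

// Read a four-character chunk tag such as "RIFF" or "fmt ".
bool read_tag(std::istream& in, std::string& tag) {
    char buf[4];
    if (!in.read(buf, 4)) return false;
    tag.assign(buf, 4);
    return true;
}

// Returns true and fills `fmt` once the "data" chunk is found.
// On success the stream is positioned at the first sample byte.
bool parse_wav(std::istream& in, WavFormat& fmt) {
    std::string tag;
    std::uint32_t riff_size = 0;
    if (!read_tag(in, tag) || tag != "RIFF") return false;
    if (!read_le(in, riff_size)) return false;
    if (!read_tag(in, tag) || tag != "WAVE") return false;

    bool have_fmt = false;
    // Each chunk is: 4-byte tag, 4-byte little-endian size, payload (padded to even length).
    while (read_tag(in, tag)) {
        std::uint32_t size = 0;
        if (!read_le(in, size)) return false;
        if (tag == "fmt ") {
            std::uint32_t byte_rate = 0;
            std::uint16_t block_align = 0;
            if (size < 16) return false;
            if (!read_le(in, fmt.audio_format) || !read_le(in, fmt.channels) ||
                !read_le(in, fmt.sample_rate) || !read_le(in, byte_rate) ||
                !read_le(in, block_align) || !read_le(in, fmt.bits_per_sample)) {
                return false;
            }
            // Skip any extension bytes plus the pad byte, if present.
            in.ignore(static_cast<std::streamsize>(size - 16 + (size & 1)));
            have_fmt = true;
        } else if (tag == "data") {
            if (!have_fmt) return false;
            fmt.data_bytes = size;
            return true;
        } else {
            // Unknown chunk: skip its payload and pad byte.
            in.ignore(static_cast<std::streamsize>(size) + (size & 1));
        }
    }
    return false;
}
```

To try it on a real file, open a std::ifstream in std::ios::binary mode and pass it to parse_wav; a std::istringstream works the same way for testing. Reading the integers byte by byte rather than casting a struct over the buffer avoids endianness and padding surprises.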

However, all that is only really needed if you for some reason want to understand the format for yourself. If you're just interested in reading/writing it, it's not needed. Other people have likely done the hard work for you. There are tons of libraries out there for encoding/decoding a/v formats. More open formats like Vorbis and FLAC practically come with the code - see their respective sites. Other formats have open-source implementations - libmpg123, LAME, libsndfile, etc. FFmpeg and its derivative apps read half the video stuff known to man. Libraries like these are under open-source licenses (often LGPL or BSD-style, so check the terms of each one) - use the tools available to you. I've written software to play MP3/Ogg files, but I do not know these formats - I used the respective libraries for the file -> sound wave part.

(Also, some formats (well, the algorithms involved) are patented, notably mp3. There might be others. :-/ Just be aware.)

I'm not planning on redistributing anything, so copyrights won't be a problem. I am also interested in understanding how it works, so I'll be writing the libraries myself. I will keep the libraries you mentioned in mind, however, in case I need a point of reference.

I would start by understanding the basics: how audio is represented in a system in its most primitive form, e.g. PCM. Once you understand that, try to follow how compressed audio works - there are simple forms (differential encoding and mu-law, aka u-law, for example), and there are complex forms (the audio codec used in GSM mobile phones, for example).
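As a rough illustration of both ideas, the sketch below generates a sine tone as 16-bit PCM samples and then compands each sample to 8 bits with the continuous mu-law formula (mu = 255). The function names are invented for this example, and the 8-bit code here is a plain signed value; it is not the bit-exact G.711 encoding that telephony uses.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

constexpr double kMu = 255.0;  // companding constant assumed for this sketch

// PCM is just the waveform sampled at regular intervals.
// This produces `count` signed 16-bit samples of a sine tone.
std::vector<std::int16_t> make_sine(double freq_hz, int sample_rate, int count) {
    const double two_pi = 6.283185307179586;
    std::vector<std::int16_t> pcm;
    pcm.reserve(static_cast<std::size_t>(count));
    for (int n = 0; n < count; ++n) {
        double t = static_cast<double>(n) / sample_rate;
        pcm.push_back(static_cast<std::int16_t>(
            std::lround(32767.0 * std::sin(two_pi * freq_hz * t))));
    }
    return pcm;
}

// Compress a 16-bit sample to a signed 8-bit code:
// y = sign(x) * ln(1 + mu*|x|) / ln(1 + mu), with x scaled to [-1, 1].
std::int8_t mulaw_encode(std::int16_t sample) {
    double x = sample / 32768.0;
    double y = std::log1p(kMu * std::fabs(x)) / std::log1p(kMu);
    long code = std::lround(127.0 * y);
    return static_cast<std::int8_t>(x < 0 ? -code : code);
}

// Expand an 8-bit code back to an approximate 16-bit sample:
// x = sign(y) * ((1 + mu)^|y| - 1) / mu.
std::int16_t mulaw_decode(std::int8_t code) {
    double y = code / 127.0;
    double x = (std::pow(1.0 + kMu, std::fabs(y)) - 1.0) / kMu;
    long s = std::lround(32767.0 * x);
    return static_cast<std::int16_t>(y < 0 ? -s : s);
}
```

Running mulaw_decode(mulaw_encode(s)) over the output of make_sine(440.0, 8000, 8000) gives back a waveform that is close to, but not identical to, the original. The logarithmic curve spends more of the 8-bit codes on quiet samples, so the error is small for small amplitudes and grows with loudness.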

For video formats, start by understanding how images are stored. Understanding general compression and lossy compression (some of which you'd get from the audio side) will also be helpful.
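A toy example of those two ideas, assuming an image stored as a flat array of 8-bit grayscale pixels: quantize throws away low-order bits (lossy), and a run-length encoder then packs the repeated values (lossless). The function names are made up for this sketch, and real image and video codecs use far more elaborate schemes than this.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

using Run = std::pair<std::uint8_t, std::uint8_t>;  // (count, value)

// Lossy step: keep only the top `bits` bits of each pixel (assumes 1 <= bits <= 8).
// The discarded detail cannot be recovered afterwards.
std::vector<std::uint8_t> quantize(const std::vector<std::uint8_t>& pixels, int bits) {
    const unsigned shift = static_cast<unsigned>(8 - bits);
    std::vector<std::uint8_t> out;
    out.reserve(pixels.size());
    for (std::uint8_t p : pixels) {
        out.push_back(static_cast<std::uint8_t>((p >> shift) << shift));
    }
    return out;
}

// Lossless step: collapse runs of equal values into (count, value) pairs.
// A run is capped at 255 so the count fits in one byte.
std::vector<Run> rle_encode(const std::vector<std::uint8_t>& data) {
    std::vector<Run> runs;
    for (std::uint8_t v : data) {
        if (!runs.empty() && runs.back().second == v && runs.back().first < 255) {
            ++runs.back().first;
        } else {
            runs.push_back(Run(std::uint8_t{1}, v));
        }
    }
    return runs;
}

// Expand (count, value) pairs back into the exact byte sequence that was encoded.
std::vector<std::uint8_t> rle_decode(const std::vector<Run>& runs) {
    std::vector<std::uint8_t> out;
    for (const Run& r : runs) {
        out.insert(out.end(), r.first, r.second);
    }
    return out;
}
```

Quantizing first makes neighbouring pixels more likely to be equal, which is what lets the lossless stage shrink the data. Lossy audio and video codecs follow the same broad pattern of discarding detail and then packing what is left.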