fre:ac Developer Blog

fre:ac is a free audio converter and CD ripper/audio CD grabber with support for various popular formats and encoders. It currently converts between MP3, MP4/M4A, WMA, Ogg Vorbis, FLAC, AAC, WAV, and Bonk formats and features full Unicode support, including support for UTF-8 freedb entries.
https://freac.org/index.php/en/developer-blog-mainmenu-9
Fri, 22 Feb 2019 13:58:48 +0000

fre:ac development status update 12/2018
https://freac.org/index.php/en/developer-blog-mainmenu-9/14-freac/300-freac-development-status-update-122018

Hi and welcome to the latest fre:ac development status update! I did not write a report for November as I was busy preparing the latest alpha release around the end of the month, so this report covers both November and December of 2018.

BeGeistert 2018

On the 3rd and 4th of November I attended the BeGeistert 2018 Haiku developers conference which was held in my hometown of Hamburg, Germany. I met great people from the Haiku project, got to work on the fre:ac port for Haiku and ported Monkey's Audio over.

It was a great event and a nice experience and I'm already looking forward to attending future BeGeistert meetings in the years to come.

Flatpak (and work on Snap)

The big news for December certainly is fre:ac finally being available as a Flatpak. This makes it easier than ever for Linux users to discover and try out fre:ac.

The fre:ac Flatpak is available from Flathub, the largest application repository for Flatpak distribution.

Now that the Flatpak is ready, I'm shifting my attention towards building a Snap package as well. There already is some progress, but I did not get to work on the project a lot during the holiday season, so I now expect the Snap to be ready around mid-January.

Enhancements

The December release of fre:ac added a notification component that can play a customizable sound or display a message box upon finished conversions. This was requested by several users doing very large conversions that might take anywhere from several minutes to hours. They often leave fre:ac running unattended, so an audible notification upon job completion can be useful.

Additionally, the December release now supports accent colors on macOS Mojave and improves scaling on HiDPI displays, especially with radio buttons and with edit fields where text was sometimes cut off in scaled mode.

Minor enhancements added in November and December include an option to switch stereo channels that was added to the channel converter DSP component and support for drag & drop in the tag editor. The latter did not make it into the December release though and will debut in the next alpha or beta.

Various fixes

In addition to the above changes, many bugs have been fixed in the past two months:

Fixed MP4 files not being created when <directory> placeholder is used in output filename pattern

Fixed tag editor being unable to open some file types on case-sensitive file systems

Fixed freaccmd issues when input and output files are the same

Fixed crash when converting mono to stereo

Fixed encoding mono MP3s

That's all for this issue. The report for January should be out in early February, so make sure to check the site for updates.

robert.kausch@bonkenc.org (Robert) | fre:ac | Sat, 05 Jan 2019 22:18:43 +0000

fre:ac development status update 10/2018
https://freac.org/index.php/en/developer-blog-mainmenu-9/14-freac/295-freac-development-status-update-102018

Hi all, this is the fre:ac development status update for October 2018. I have a lot of interesting things to share.

Haiku packages and fixes

First and foremost, I created fre:ac packages for the Haiku operating system. The Haiku project released their first beta at the end of September and fre:ac is now available via their HaikuDepot package manager. Just search for it and you are one click away from installing fre:ac.

I also fixed the Fraunhofer AAC encoder and updated the LAME MP3 encoder package for Haiku while at it. I plan to also port Monkey's Audio to this interesting OS.

But while it's great that fre:ac now runs on Haiku, the 20180913 release is still far from perfect. I implemented a lot of fixes in October to make the next alpha run much better:

Fixed window titles becoming inactive when a menu, dropdown or tooltip is shown

Fixed scrollbars not working when moving the cursor outside the window

Improved image downscaling using a weighted average box filter

Added support for font scaling (limited HiDPI mode)

Fixed application signature to avoid warning on startup

Fixed resource compilation to simplify package script

A new alpha with these fixes and more will probably be out in mid-November.

Notarization of macOS version

Apple recently introduced notarization for apps distributed outside the App Store. It's basically an automated malware and best-practices check. Applications that pass it are presented with a "this app was checked by Apple" message on macOS Mojave.

It took me a weekend to adapt the fre:ac package for this new security feature, but now it's functioning very well. Starting with the next release, fre:ac for macOS will be notarized by Apple.

Other fixes

Some other fixes have been implemented in October:

Linux HiDPI fixes: The Linux version of fre:ac will now evaluate the GDK_SCALE environment variable in order to automatically scale the font and UI element size on HiDPI displays.

freaccmd fixes for non-Windows systems: The fre:ac command line interface had some issues with spaces in file names. Namely, the current version does not work with the standard Unix shell space escaping (putting a backslash in front of each space character). This will be fixed in the next release.

Fixed issue with enabling "Write to input folder" and "Delete original files after encoding" options: When both options are enabled at the same time and the output filename equals the input filename, the current version of fre:ac can delete both the existing input file and the new output file after conversion, leaving you with neither. The next release will fix this and will be brought forward by one or two weeks for that reason.
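The last item comes down to checking whether input and output refer to the same file before deleting anything. The following is a minimal sketch of such a guard, not fre:ac's actual fix; the function name `should_delete_input` and the file names are made up for illustration.

```python
import os
import tempfile

def should_delete_input(input_path: str, output_path: str) -> bool:
    """Return True only if removing the input cannot also remove the output."""
    if not os.path.exists(output_path):
        return False  # the conversion produced nothing, so keep the original
    # samefile() compares the underlying file, so differently spelled
    # paths that lead to the same file are still detected as identical.
    return not os.path.samefile(input_path, output_path)

with tempfile.TemporaryDirectory() as folder:
    track = os.path.join(folder, "track.flac")
    other = os.path.join(folder, "track.mp3")
    for path in (track, other):
        with open(path, "wb") as handle:
            handle.write(b"audio")
    # Same file, reached through a slightly different path spelling.
    same_file_result = should_delete_input(track, os.path.join(folder, ".", "track.flac"))
    different_file_result = should_delete_input(track, other)
```

Comparing the files themselves rather than the path strings matters here, because two different spellings can point at the same file.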

Work place changes

Finally, after more than 15 years at my previous company, I changed jobs on the 1st of October. I'm now working for PreSonus Software Ltd. in Hamburg, Germany on tasks much closer to what I'm already doing in my free time with fre:ac.

I will mainly work on the Notion music notation software, but also look into improving Studio One's audio file export and conversion capabilities and much more.

I'm happy to be part of the PreSonus team and really looking forward to great things coming in the future.

That's all for this month. Be sure to come back next month for another update.

robert.kausch@bonkenc.org (Robert) | fre:ac | Sat, 03 Nov 2018 08:35:35 +0000

fre:ac development status update 09/2018
https://freac.org/index.php/en/developer-blog-mainmenu-9/14-freac/294-freac-development-status-update-092018

Hi all, this is the fre:ac development status update for September 2018.

New alpha release

I published a new alpha last month, integrating the SuperFast encoding mode and freaccmd's dynamic arguments support that I wrote about in last month's issue. With these features added, the new alpha is now almost feature complete with respect to what is planned for fre:ac 1.1 beta.

Improved tags interoperability

Several changes have been implemented to improve compatibility with tags written by other applications (most notably foobar2000). These changes enable fre:ac to extract more information from files containing such tags than previously possible. The individual improvements are:

Support reading and writing ID3v2 tags from/to .wav files

Support reading cue sheets from ID3v2 tags

Use album artist for artist field if only the former is set

One example where these will be useful is when ripping a CD to a single .wav file with foobar2000. Upcoming fre:ac releases will be able to list the individual tracks contained in the .wav file even if no separate .cue file is available.

Improved error handling

Despite fre:ac 1.1 alpha releases having supported parallel conversions for some time, error handling is still designed for single threaded operation. When a conversion fails, the affected thread shows an error message and the whole conversion process stops at that point.

This will be much improved in the next release. An error in one conversion will not stop the whole process any longer. Instead, error messages will be collected and displayed all in one dialog at the end of the conversion process.

In other words, when the 100th file in a 1000-file conversion job causes an error, fre:ac will continue to process the remaining 900 files instead of stopping the whole job. No more worrying that large unattended conversion jobs stop early without you noticing.
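The collect-and-continue behavior can be sketched with a thread pool. This is an illustration of the idea only and is not taken from fre:ac's code; `convert` is a hypothetical stand-in for a single conversion that fails for some inputs.

```python
from concurrent.futures import ThreadPoolExecutor

def convert(track: str) -> str:
    """Hypothetical stand-in for one conversion; fails for 'bad' inputs."""
    if "bad" in track:
        raise ValueError(f"cannot decode {track}")
    return track.rsplit(".", 1)[0] + ".mp3"

def convert_all(tracks, workers=4):
    done, errors = [], []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [(track, pool.submit(convert, track)) for track in tracks]
        for track, future in futures:
            try:
                done.append(future.result())
            except Exception as exc:
                # Collect the error instead of aborting the whole job.
                errors.append((track, str(exc)))
    return done, errors

done, errors = convert_all(["a.flac", "bad.flac", "c.flac"])
print(done)    # files that were converted
print(errors)  # everything that went wrong, reported once at the end
```

The failing track ends up in `errors` while the other two are still converted, which mirrors the single error dialog at the end of the job.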

HiDPI improvements

Last but not least, I've also implemented some improvements for the placement of tool windows and dialogs in HiDPI mode. Dropdown lists, popup menus, tooltips and dialogs should now be displayed where you would expect them when running in scaled mode.

This concludes this month's issue. Be sure to come back in a month for the next report.

There are only two things to report this time, but both are great news:

SuperFast codecs merged into mainline

The SuperFast versions of the LAME, AAC (FAAC, FDK-AAC and Core Audio), Opus and Speex codecs have finally been merged into mainline fre:ac. You still have to enable SuperFast mode manually on the Resources configuration page to make full use of them, but that option is set to become enabled by default before the fre:ac 1.1 final release.

SuperFast mode is marked experimental for now and has to be enabled manually

Read here and here for more details on fre:ac's SuperFast encoding technology.

Dynamic arguments feature merged

I wrote about the dynamic arguments feature for freaccmd in last month's issue. Now that code has been merged, making it the second big feature completed this month.

While not visible for users of the graphical UI, this will make things much easier for those who would like to script and/or automate conversions with the command line interface.

There will be a new alpha release with these changes in the next few days. Keep watching for it and stay tuned for the next update in about a month.

New alphas and SuperFast preview

There were two new alpha releases last month providing all the new features talked about in previous development status reports and fixing several issues. Unlike previous alphas, the current version is now recommended for everyone to try out. If you find any issues, please report them to support@freac.org.

The SuperFast technology will be included in the next regular alpha release and will then be available for everyone to use.

Dynamic encoder arguments for freaccmd

An important item on the feature list for fre:ac 1.1 beta is support for command line configuration arguments for all codecs added during fre:ac 1.1 development. Currently, freaccmd supports arguments for a very limited set of codecs only. The code to change that lives in the dynamic-arguments branches of fre:ac and BoCA and I made good progress towards merging it with the master branches in the past month.

With the new code, codecs can specify which command line arguments they support and freaccmd will make them available without the need for any codec specific code in the command line frontend itself.
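As a rough illustration of this design, the sketch below builds a command line parser purely from per-codec parameter descriptions. The codec names are real, but the parameter lists, flag names and defaults are invented for the example and do not reflect freaccmd's actual options.

```python
import argparse

# Hypothetical codec descriptions; real components would report their own.
CODEC_PARAMETERS = {
    "lame": [("--bitrate", int, 192), ("--mode", str, "VBR")],
    "opus": [("--bitrate", int, 128), ("--complexity", int, 10)],
}

def build_parser(codec: str) -> argparse.ArgumentParser:
    """Create a parser without any codec specific code in the frontend."""
    parser = argparse.ArgumentParser(prog="convert-sketch")
    for flag, value_type, default in CODEC_PARAMETERS[codec]:
        parser.add_argument(flag, type=value_type, default=default)
    return parser

options = build_parser("opus").parse_args(["--complexity", "5"])
print(options)
```

Adding a codec then only means adding a description; the frontend code stays untouched.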

Other changes

Improved handling of album artist in tag editor: Handling of album artists in the tag editor has been improved to fix issues when the field is changed in album editing mode. Prior to this change this could result in the album artist information getting lost for some or all of the relevant tracks.

Improved adjustment of dialogs to text sizes: Several dialogs have been reworked to adjust dynamically to the size of translated texts. The width of labels displayed in a dialog can vary greatly in different languages, so dynamic adjustment is necessary. This is not completed yet, so work on this will continue in the next few months.

That's all for this issue. Make sure to come back next month for the August status update.

The challenge

The main difficulty is that while most other formats have discrete frames of audio samples in their bitstreams, MP3 frames can overlap each other:

In this example, the average frame size is 4 blocks of data. The individual frame lengths are 4, 3, 4, 3, 1, 5, 5 and 7 blocks. In an AAC bitstream, each frame will simply have a length matching the number of data blocks required for that frame and the frames will neatly come one after another. In an MP3 bitstream, however, (at least for CBR files, VBR is more complicated) frames have a fixed size and when there is space left in a frame after all samples have been encoded, that space can be used by the following frames. This space available to following frames is called bit reservoir and allows the codec to maintain a set target quality in most cases, even when frame sizes are fixed and audio complexity changes.

Have a look at the example. The 5th frame is only one data block long and that data block fits completely into the 4th frame. It even leaves some space, so the first data block of the 6th frame starts in the 4th frame as well. Looking at only the 5th and 6th frame, their layout in the bitstream looks like this:

Here the frame headers come after the data and in case of frame #5, there even is data of another frame (#6) between its data and its header. In real world MP3 streams, the situation can be even more intricate.
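The example can be reproduced with a few lines of arithmetic. The sketch below assumes a simplified model that is not spelled out in the text: headers take no space, every frame slot holds exactly 4 blocks, and data is packed as tightly as possible. It reports in which slot each frame's data starts and how many of its blocks sit in earlier slots, i.e. in the bit reservoir.

```python
FRAME_SIZE = 4                            # blocks per fixed-size frame slot
DATA_LENGTHS = [4, 3, 4, 3, 1, 5, 5, 7]   # blocks of data per frame

def layout(lengths, frame_size):
    """Return (frame number, slot where its data starts, borrowed blocks)."""
    rows, position = [], 0
    for index, length in enumerate(lengths):
        slot_start = index * frame_size
        # A frame's data may start early, but must end within its own slot.
        if position + length > slot_start + frame_size:
            raise ValueError(f"frame {index + 1} does not fit")
        start_slot = position // frame_size + 1
        borrowed = min(length, max(0, slot_start - position))
        rows.append((index + 1, start_slot, borrowed))
        position += length
    return rows

rows = layout(DATA_LENGTHS, FRAME_SIZE)
for frame, start_slot, borrowed in rows:
    print(f"frame {frame}: data starts in slot {start_slot}, "
          f"{borrowed} block(s) taken from the reservoir")
```

Under these assumptions, frames 5 and 6 both start in slot 4, matching the description above, and the data of frame 6 is stored entirely ahead of its own header.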

Basic SuperFast operation

So this is a problem when implementing the SuperFast technology for MP3. SuperFast works by passing chunks of audio data to separate encoder instances and later joining the encoded data blocks back together in the right order. This requires the frames to be available in discrete form in order to deal with overlap and join the frames correctly. The SuperFast encoding loop usually looks like this:
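A toy version of that loop, assuming a codec with discrete frames: the "encoder" below just sums groups of samples, and the chunk size, overlap and function names are illustrative choices rather than values from fre:ac. Each chunk is encoded by its own worker together with a little audio preceding it, the overlapping frames are skipped, and the remaining frames are joined in chunk order.

```python
from concurrent.futures import ThreadPoolExecutor

FRAME = 4      # samples per encoded frame (toy value)
CHUNK = 5      # frames per chunk handed to one encoder instance
OVERLAP = 1    # extra leading frames each instance encodes and then drops

def encode(samples):
    """Toy stand-in for a codec: one 'frame' per FRAME samples."""
    return [sum(samples[i:i + FRAME]) for i in range(0, len(samples), FRAME)]

def encode_chunk(samples, first_frame):
    lead = min(OVERLAP, first_frame)            # the first chunk has no predecessor
    begin = (first_frame - lead) * FRAME
    end = (first_frame + CHUNK) * FRAME
    frames = encode(samples[begin:end])
    return frames[lead:]                        # skip the overlapping frames

def superfast_encode(samples, workers=4):
    total_frames = -(-len(samples) // FRAME)    # ceiling division
    starts = range(0, total_frames, CHUNK)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda start: encode_chunk(samples, start), starts)
        # map() yields results in submission order, so chunks stay sorted.
        return [frame for part in parts for frame in part]

samples = list(range(90))
parallel_frames = superfast_encode(samples)
print(parallel_frames == encode(samples))
```

With a real codec the overlap lets each instance settle into a sensible internal state before its first kept frame; in this toy model it only demonstrates the bookkeeping.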

MP3 difficulties

When dealing with MP3, multiple issues arise from the peculiarities around the bit reservoir:

The encoder might not return all encoded frames after processing a chunk of data as some frames might still be waiting for additional data to put in the bit reservoir.

Frames are not available in discrete form, but may be overlapping each other.

After dealing with the above, frames need to be put back into an MP3 compatible bitstream after joining.

Frames might require more reservoir than is available after joining with frames coming from other codec instances.

Previous attempts to create multi-threaded MP3 encoders dealt with these issues in a very simple way: They completely disabled the bit reservoir to get nicely laid out frames with no overlapping data. This solution cuts into the resulting MP3's quality, though, which is why such encoders never really gained traction.

So let's see how we can handle these issues more adequately.

Unraveling it

The first one is relatively simple. After encoding a chunk of data, we call lame_encode_flush_nogap to force the encoder to return all encoded frames even if they are not completely filled yet. This makes sure we can operate with all the relevant frames in the next steps.

The second issue is handled by a bitstream unpacker that parses the data returned by the encoder and extracts discrete frames from the bitstream. After this step all frames will be laid out as a frame header followed by the complete data belonging to that frame. No more intermixing with other frames' headers or data.
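The unpacking step can be sketched on a simplified stream model. Here every fixed-size slot carries a back pointer (playing the role of the `main_data_begin` field in MP3 side information), the length of the frame's data, and its payload blocks. The tuple layout and the helper names are assumptions made for this example, not the format used by the real unpacker.

```python
def unpack(slots):
    """Turn overlapping slots into discrete (frame number, data) pairs.

    Each slot is (back_pointer, data_length, payload_blocks). The back
    pointer says how many payload blocks before the slot's own payload
    the frame's data begins; 0 means it starts in its own slot.
    """
    stream, offsets = [], []
    for _, _, payload in slots:
        offsets.append(len(stream))    # where this slot's payload begins
        stream.extend(payload)         # headers are not part of the stream
    frames = []
    for number, ((back, length, _), offset) in enumerate(zip(slots, offsets), start=1):
        begin = offset - back
        frames.append((number, stream[begin:begin + length]))
    return frames

# Build a tightly packed test stream with 4 payload blocks per slot.
LENGTHS = [4, 3, 4, 3, 1, 5, 5, 7]
SLOT = 4
blocks = [f"f{n}" for n, length in enumerate(LENGTHS, start=1) for _ in range(length)]
starts = [sum(LENGTHS[:i]) for i in range(len(LENGTHS))]
slots = [(i * SLOT - starts[i], LENGTHS[i], blocks[i * SLOT:(i + 1) * SLOT])
         for i in range(len(LENGTHS))]

frames = unpack(slots)
for number, data in frames:
    print(number, data)
```

After unpacking, every frame owns exactly its own blocks, so chunks from different encoder instances can be reordered and trimmed without touching a neighbor's data.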

After unpacking, we are ready to perform overlap skipping and ordering of data chunks from different encoder instances.

When writing the ordered frames to the output stream, we now need to make sure to repack them back into an MP3 compatible bitstream. The repacker deals with frame sizes and the bit reservoir and tries to pack frames in the most compact way.

Sometimes, though, a frame requires more reservoir than is currently available and the repacker needs to find a way to fit it in. It basically has two options to accomplish this: If only a few extra bits are needed, the repacker can add padding to a frame. This adds one byte to the frame, which is sometimes enough to provide the required reservoir. In cases where it is not sufficient, the repacker can enlarge one or more previous frames to a bigger frame size. This usually provides enough reservoir, but requires all affected frames to be repacked again.

However, even this might not be enough when issue number 4 comes into play. In some rare cases, a frame requires so much reservoir that it is simply not possible to fit it into the bitstream. This can happen because one encoder instance cannot know how much reservoir will be left over by the instance encoding the preceding chunk. In cases where the preceding instance has to deal with a difficult to encode signal, it might leave next to no reservoir available to the next encoder.

Dealing with this was difficult. While there are some simple options like forcing the encoder to use a lower bitrate, these might potentially result in audible quality drops. So I tried to find another way to handle this.

Basically, the SuperFast algorithm will try to re-encode the audio part starting with the non-fitting frame and repeat this until it fits. To work around situations where it might never fit using this strategy, each time it fails, we try to put some more pressure on the bit reservoir by prepending a few frames of difficult to encode dummy data. These dummy frames force the encoder to spend some reservoir on them and lead to using less reservoir for our previously non-fitting frame, eventually allowing us to fit the frame into the bitstream.
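The control flow of this retry strategy might look roughly like the sketch below. It is a toy model under stated assumptions: `encode_from` is a hypothetical callback standing in for the real re-encoding step, and the numbers (a frame that needs 400 bytes, 60 bytes spent per dummy frame, two more dummy frames per attempt) are invented.

```python
def fit_frame(encode_from, available_reservoir, max_attempts=8, dummy_step=2):
    """Re-encode until the first real frame fits the available reservoir.

    encode_from(dummy_frames) is a hypothetical callback: it re-encodes the
    audio starting at the non-fitting frame, preceded by that many hard to
    encode dummy frames, and returns the reservoir the real frame then needs.
    """
    dummy_frames = 0
    for attempt in range(1, max_attempts + 1):
        needed = encode_from(dummy_frames)
        if needed <= available_reservoir:
            return attempt, dummy_frames
        dummy_frames += dummy_step     # put more pressure on the reservoir
    raise RuntimeError("frame still does not fit")

def toy_encoder(dummy_frames):
    # Toy model: every dummy frame uses up 60 bytes of reservoir that the
    # encoder would otherwise spend on the real frame.
    return max(0, 400 - 60 * dummy_frames)

attempts, dummies = fit_frame(toy_encoder, available_reservoir=200)
print(attempts, dummies)
```

The attempt limit is a safety net added for the sketch; the text does not say whether the real implementation bounds the number of retries.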

The result

With all these additional steps, the process for SuperFast LAME now looks like this:

Arriving at this point took several months of work, but was absolutely worth it. The SuperFast LAME encoder scales well with the number of CPU cores and can provide a 3.5x speedup on a quad-core processor. On my 8 core, 16 thread CPU, I was able to achieve up to 12x speed increase with it.
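To put these figures into perspective, speedup divided by the number of workers gives the parallel efficiency. The short calculation below uses only the numbers quoted above; treating the 16 hardware threads as 16 workers is an assumption.

```python
def efficiency(speedup: float, workers: int) -> float:
    """Fraction of the ideal linear speedup that is actually reached."""
    return speedup / workers

quad_core = efficiency(3.5, 4)     # 3.5x on four cores
per_core = efficiency(12, 8)       # 12x measured against eight physical cores
per_thread = efficiency(12, 16)    # 12x measured against sixteen threads

print(f"{quad_core:.1%} {per_core:.1%} {per_thread:.1%}")
```

A value above 100% per physical core simply reflects that the second hardware thread of each core contributes as well.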

Unlike previous attempts to speed up MP3 encoding, SuperFast LAME does this while still using the MP3 format's bit reservoir feature and uses an unmodified encoder library - the necessary changes are all implemented in the frontend application and could be used with alternative MP3 encoders as well.

I plan to implement this technology on top of the command line LAME frontend in the future. For now, my priority is on releasing fre:ac 1.1 beta and final versions, though. But keep watching this blog for future announcements about a SuperFast enabled stand-alone LAME version.

Source code

robert.kausch@bonkenc.org (Robert) | fre:ac | Fri, 20 Jul 2018 18:27:21 +0000

fre:ac development status update 06/2018
https://freac.org/index.php/en/developer-blog-mainmenu-9/14-freac/280-freac-development-status-update-062018

The June development update is overdue, but better late than never, here it is. It was a very productive month, so let's get right to the good stuff.

Parallel conversion jobs

The current alpha release supports only one conversion job at a time. Multiple tracks in a conversion can be processed in parallel, but when you try to start a new job while a conversion is still running, you just get a message asking if you would like to schedule the new job for after the current one is finished.

The next release will enable parallel conversion jobs. As long as there are CPU threads left, multiple conversions, possibly using different settings, can run at the same time. This helps when converting multiple albums to a single file per album or when ripping CDs using multiple drives.

Improved handling of automatic ripping

This brings us to the next item. There are some issues with the current alpha when using the automatic ripping option with multiple drives. When inserting a disc while other tracks are still in the joblist, the new ripping job will try to process those other tracks again, leading to some tracks being ripped more than once. Also, the new job will not start before any currently running rip is finished. Both issues will be fixed in the next alpha which makes ripping with multiple drives much more usable.

Fixed metadata bug with Core Audio on Windows

In May, a user opened an issue on GitHub reporting that when converting ALAC files to AAC using the Core Audio encoder on Windows, tags were missing on some files. I could easily reproduce the issue, but it seemed really strange. It occurred only when converting files decoded with an external decoder (i.e. a separate .exe called by fre:ac) and the selected encoder was Core Audio. That didn't seem to make any sense at first.

It turned out to actually be a bug in Apple's Core Audio implementation on Windows. It would make file handles created by its API calls inheritable by sub-processes. The sub-processes (in this case the external decoders) would then inherit any open handles and lock the respective files, making them unwritable by the tagger component.

Making handles inheritable is something that an API never should do as it can lead to unforeseeable behavior and very difficult to analyze bugs.

Fortunately there is a work-around by avoiding the problematic APIs. The next alpha release will include this fix.
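The concept of handle inheritance can be tried out with Python's standard library, which creates file descriptors as non-inheritable by default. This sketch only demonstrates the flag itself; it does not use the Core Audio API and is not the work-around implemented in fre:ac. The file name is arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "output.m4a")
    with open(path, "wb") as handle:
        descriptor = handle.fileno()
        # Safe default: child processes do not get this descriptor.
        inheritable_by_default = os.get_inheritable(descriptor)
        # Opting in is what the problematic behavior amounts to: any child
        # process started now could keep the file open.
        os.set_inheritable(descriptor, True)
        inheritable_after_opt_in = os.get_inheritable(descriptor)
        os.set_inheritable(descriptor, False)   # restore the safe state

print(inheritable_by_default, inheritable_after_opt_in)
```

An API that silently turns this flag on for handles it creates leaves the calling application with open files in sub-processes it never intended to share them with.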

Automatic codec builds

Until now, all the codecs included with fre:ac have been built manually: Set the correct compiler flags for each codec on each supported OS, apply necessary patches, configure the codecs with the right flags and run make to build them. This costs a lot of time whenever a new codec version is released and is also a bit error-prone, so it was necessary to change it.

I built a script to automate all the steps listed above for most of the necessary codecs and some other libraries. The script can compile FAAC, FAAD2, FDK-AAC, FLAC, LAME, libav, libogg, libsamplerate, libsndfile, Monkey's Audio, mpg123, Opus, RubberBand, Speex, Vorbis and WavPack on Windows, macOS, Linux and FreeBSD. Whenever a new version of one of these libraries is released in the future, I can simply update the package download URL and run the script to build a new release.
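One way to structure such a script is to describe every package as data and derive the build steps from it. The sketch below only generates the command list and does not run anything; the package name, URL, patch name and flags are placeholders, and the real script may be organized quite differently.

```python
from dataclasses import dataclass, field

@dataclass
class Recipe:
    name: str
    archive_url: str                  # placeholder, not a real download URL
    patches: list = field(default_factory=list)
    configure_flags: list = field(default_factory=list)

def build_commands(recipe: Recipe, cflags: str) -> list:
    """Translate a recipe into the shell commands of a typical build."""
    archive = recipe.archive_url.rsplit("/", 1)[-1]
    directory = archive.split(".tar")[0]          # assumes a .tar.* archive
    commands = [f"curl -L -O {recipe.archive_url}",
                f"tar xf {archive}",
                f"cd {directory}"]
    commands += [f"patch -p1 < ../{patch}" for patch in recipe.patches]
    flags = " ".join(recipe.configure_flags)
    commands.append(f'CFLAGS="{cflags}" ./configure {flags}'.rstrip())
    commands.append("make")
    return commands

recipe = Recipe("examplecodec", "https://example.org/examplecodec-1.0.tar.gz",
                patches=["fastcrc.patch"], configure_flags=["--disable-shared"])
commands = build_commands(recipe, cflags="-O2")
print("\n".join(commands))
```

With this layout, updating a library to a new version means changing a single URL in its recipe.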

Reworked donation dialog

The donation dialog has been reworked to support more payment types. Previously supporting only PayPal, the new dialog adds support for Donorbox, SEPA transfers and the Bitcoin and Ethereum crypto currencies.

Other items

A number of other changes have been implemented in the past month, the most notable of which are:

HiDPI icons: Preparing for the upcoming beta release, I added higher quality versions of the toolbar icons that now look crisp on HiDPI displays like Apple's Retina screens.

Completely translatable: In the current alpha release, not all strings are translatable. This applies to configuration dialogs for external codecs especially. The next alpha will fix this and enable translations for WavPack, Musepack, OptimFROG and TAK configuration dialogs along with some other previously untranslatable strings.

Fixed MP4 metadata bug: When converting multiple files in parallel to AAC or ALAC output, it can happen that some files end up being unoptimized due to a bug in the MP4v2 library used by fre:ac. Optimization of MP4 files means that tags and the seektable are moved to the beginning of the file for more efficient processing. The next alpha release will include a work-around for the MP4v2 bug fixing the issue of MP4 files not being optimized.

Downloads now hosted on GitHub: The links on the downloads page now point to GitHub instead of SourceForge. This enables direct downloads without an intermediate page to choose a mirror and allows downloading using right-click + save as.

SuperFast LAME status

There were some open issues with the SuperFast LAME implementation when I last wrote about it in the April status update. These have been fixed now and there will be another SuperFast preview release including LAME support very soon after the next alpha. I'm also preparing a technical article about how the MP3 bit reservoir is handled in SuperFast LAME. This should be out within one week from now.

That's it for this month. Be sure to come back in about one month for the next update.

robert.kausch@bonkenc.org (Robert) | fre:ac | Wed, 04 Jul 2018 22:39:09 +0000

fre:ac development status update 05/2018
https://freac.org/index.php/en/developer-blog-mainmenu-9/14-freac/279-freac-development-status-update-052018

Hi all, it's time for an update on fre:ac development again. The past month was quite productive and so I have lots of things to talk about.

Integration with Travis CI

The GitHub projects for the smooth Class Library, BoCA and fre:ac are now integrated with the Travis CI platform for automated build tests. Every commit to one of these repositories now starts automatic build processes on Linux and macOS to check if anything got broken. This improves the development process by ensuring that build-breaking issues will be noticed quickly.

The build processes are also started for pull requests, so anyone who submits a patch can immediately see if it breaks anything.

Fast CRC patches accepted into FLAC and Ogg

The patches for faster CRC calculations I wrote about last month have been accepted by the upstream FLAC and Ogg projects. So with the next FLAC and Ogg releases, any software using them will benefit from faster encoding and decoding.

Allowing playback during conversions

Until now, it has not been possible to play a track in fre:ac while a conversion is running. This limitation will be lifted with the next alpha release. You will be able to play tracks during conversions as long as they are not on a CD that is currently being ripped.

Faster AAC, APE and WMA encoding

fre:ac's AAC, Monkey's Audio (APE) and WMA encoder components use temporary files for writing output data. The content of these files is transferred to the actual output file after the encoding process is finished which causes a small delay at the end of each conversion. The next alpha release will fix this by writing directly to the actual output file from the start.

Making this possible required an addition to the internal IO filter interface and extensive testing. This is why it was not done like this earlier.

Improved handling of album artists

Starting with the next alpha release, fre:ac will make use of the <albumartist> placeholder in the default output filename pattern. This prevents the creation of separate folders for each track when dealing with sampler CDs. On samplers, the previously used <artist> placeholder would resolve to a different artist for each track, while <albumartist> will usually be something like Various artists and be the same for every track.
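The effect on a sampler can be shown with a small placeholder expansion routine. The patterns and tag values below are made up for the illustration; they are not fre:ac's actual default pattern.

```python
import re

def expand(pattern: str, tags: dict) -> str:
    """Replace every <name> placeholder with the matching tag value."""
    return re.sub(r"<(\w+)>", lambda match: str(tags[match.group(1)]), pattern)

OLD_PATTERN = "<artist> - <album>/<track> - <title>"        # hypothetical
NEW_PATTERN = "<albumartist> - <album>/<track> - <title>"   # hypothetical

sampler = [
    {"albumartist": "Various artists", "artist": "Artist A",
     "album": "Hits", "track": "01", "title": "One"},
    {"albumartist": "Various artists", "artist": "Artist B",
     "album": "Hits", "track": "02", "title": "Two"},
]

old_folders = {expand(OLD_PATTERN, tags).split("/")[0] for tags in sampler}
new_folders = {expand(NEW_PATTERN, tags).split("/")[0] for tags in sampler}
print(sorted(old_folders))
print(sorted(new_folders))
```

With the artist placeholder the two tracks land in two different folders, while the album artist placeholder keeps the sampler together in one.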

SourceForge Project of the Month

fre:ac has been chosen as the SourceForge Project of the Month for May 2018. This is the second time fre:ac has won this award, after October 2015. You can read a short interview with me in the SourceForge blog.

This closes this month's issue. Be sure to come back in June for another update.

robert.kausch@bonkenc.org (Robert) | fre:ac | Thu, 31 May 2018 21:35:21 +0000

fre:ac development status update 04/2018
https://freac.org/index.php/en/developer-blog-mainmenu-9/14-freac/278-freac-development-status-update-042018

It's time for a new development status update after an interesting month.

Optimized CRC routines for audio codecs

In case you missed it, here is my article on speeding up LAME, FLAC, Ogg and Monkey's Audio with faster CRC checks. The proposed CRC algorithm is roughly 5 times faster than the one previously used and results in a speedup of about 5% for FLAC encoding and decoding. Patches have been submitted to the upstream projects and I hope for integration in official releases of these codecs.

Fixed crashes with local CDDB queries

A user reported occasional crashes when querying a local CDDB database on Linux. This turned out to be a thread-safety issue that manifested itself only when the CDDB query dialog was displayed and then immediately closed before the main thread finished processing the window mapping event.

The issue affects all systems using the X11 window system, so it can happen on Linux, FreeBSD and other Unix-like systems.

This and another issue that I found while investigating it will be fixed in the next alpha release.

SuperFast LAME nearing completion

A whole bunch of changes have been incorporated into the SuperFast version of the LAME MP3 encoder component. It's almost complete now and an official preview release is getting closer.

This month's changes include:

Support for CBR mode and VBR rate limiting

Support for MP3s with frame CRCs

Writing Xing header table of contents

Writing Xing header CRCs

There is just one item left on my list now which is related to handling the bit reservoir in high complexity situations (especially with MPEG 2 streams at 22.05 or 24 kHz). In that case it can happen that an encoder thread tries to use more reservoir than actually is available. Special handling has to be implemented to resolve such situations. I hope to be able to finish this in May.

While waiting for SuperFast LAME, make sure to check out the 2nd SuperFast preview release with added support for FDK-AAC and Speex and tuning for Opus and Core Audio AAC.

robert.kausch@bonkenc.org (Robert) | fre:ac | Tue, 01 May 2018 11:28:47 +0000

Faster CRC checks to speed up codecs
https://freac.org/index.php/en/developer-blog-mainmenu-9/14-freac/277-fastcrc

So, I kind of stumbled into this, but always looking for possible optimizations, I simply had to explore it...

tl;dr: I accelerated checksum calculations and thus encoding times of LAME, FLAC, Ogg and Monkey's Audio using an optimized CRC algorithm. Find patches at the end of this post. These will be part of the next fre:ac 1.1 alpha release.

Calculating Xing/LAME header CRCs

Working on the LAME MP3 implementation of my SuperFast technology, I came across the necessity to do CRC checksum calculations. Every MP3 created by LAME has a Xing or LAME VBR header at the beginning. It contains index points to the MP3 as well as information about duration and gapless playback. At the end of this header, there are two CRC checksums, one for the MP3 bitstream and one for the header itself.

As the bitstream repacker used in SuperFast LAME changes the MP3's internal structure, an update of the Xing/LAME header's CRC values is necessary afterwards. I started with a simple implementation of the CRC16 algorithm that I wrote for the smooth Class Library. This created a small delay at the end of each conversion when the CRC for the MP3 file is updated. Not a big deal for the usually small MP3s weighing in at 3-4 MB. With larger files, however, like when converting a whole album to a single output file, it became painful. The CRC calculation added a delay of half a second for a 60 MB file on my i7 6900K system. On slower systems it would be much more.

Steps to optimize the calculation

The first thing I tried was using compiler optimizations for the CRC routines (GCC's -O3 instead of -Os). This brought the delay down to about a quarter second. Still too much for my taste, though.

I then started looking for optimized CRC algorithms and found Matt Stancliff's crcspeed repository. It is based on an algorithm developed by Intel that uses additional lookup tables to enable processing of multiple input bytes in a single step. There are different variants of this algorithm in circulation, processing different numbers of bytes in each step, but it's generally called slicing-by-X (where X is usually 2, 4, 8 or 16).

I updated my CRC implementation to use the slicing algorithm and did some measurements. The slicing-by-8 variant turned out to be roughly 10 times faster than my original version and 5 times faster than the GCC -O3 compiled one. There was very little additional speedup when using slicing-by-12 (which I found to be the fastest) or slicing-by-16, so I decided to stick with slicing-by-8 as a good compromise between speed and memory requirements. Using the slicing-by-8 algorithm reduced the delay at the end of the 60 MB MP3 conversion to just a few tens of milliseconds.
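To show the structure of the slicing technique, here is a compact sketch. It deliberately uses the common reflected CRC-32 instead of the CRC16 variant discussed above, so the result can be checked against `zlib.crc32`; it is not the code from the patches. Python will not reproduce the speed gain of a compiled implementation, so this only illustrates how eight table lookups replace eight byte-wise steps.

```python
import zlib

POLY = 0xEDB88320   # reflected CRC-32 polynomial, as used by zlib

def make_tables(count=8):
    tables = [[0] * 256 for _ in range(count)]
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ POLY if crc & 1 else crc >> 1
        tables[0][i] = crc
    for i in range(256):
        crc = tables[0][i]
        for k in range(1, count):
            # Table k holds the CRC of byte i followed by k zero bytes.
            crc = tables[0][crc & 0xFF] ^ (crc >> 8)
            tables[k][i] = crc
    return tables

T = make_tables()

def crc32_bytewise(data, crc=0):
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc = T[0][(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF

def crc32_slice8(data, crc=0):
    crc ^= 0xFFFFFFFF
    whole = len(data) - len(data) % 8
    for i in range(0, whole, 8):
        # Explicit little-endian reads work the same on any host CPU.
        lo = crc ^ int.from_bytes(data[i:i + 4], "little")
        hi = int.from_bytes(data[i + 4:i + 8], "little")
        crc = (T[7][lo & 0xFF] ^ T[6][(lo >> 8) & 0xFF] ^
               T[5][(lo >> 16) & 0xFF] ^ T[4][lo >> 24] ^
               T[3][hi & 0xFF] ^ T[2][(hi >> 8) & 0xFF] ^
               T[1][(hi >> 16) & 0xFF] ^ T[0][hi >> 24])
    for byte in data[whole:]:           # remaining 0-7 bytes, one at a time
        crc = T[0][(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF

data = bytes(range(256)) * 5 + b"tail"
print(hex(crc32_slice8(data)), crc32_slice8(data) == zlib.crc32(data))
```

The eight tables take 8 x 256 entries, which is the memory cost mentioned above as the trade-off against speed.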

But I did not stop there...

Looking further

So, if I have to calculate CRC checksums for the Xing/LAME header, LAME itself will have to do the same. You just don't notice a delay, because the calculation is not done at once at the end, but spread over the whole encoding process. But does LAME use an optimized CRC implementation? As it turned out, no.

I updated the LAME CRC routines with the slicing-by-8 algorithm and got a speed-up of only 0.5%. Not much, but I wondered if other codecs (especially lossless ones that generate more data) might benefit more.

I looked further and found non-optimal CRC implementations in FLAC, Ogg (used for Opus, Vorbis and other codecs) and Monkey's Audio. Replacing them with the optimized algorithm yielded similar results to LAME for the lossy formats. The lossless formats, however, benefit more from the optimization and are sped up by about 5% due to more data being generated. When using Ogg FLAC, the speed-up is roughly 10% due to CRCs being calculated for both the FLAC audio frames and the Ogg container pages.

So we get up to 5% speed-up in the usual case and around 10% improvement for the Ogg FLAC format. All by simply replacing the CRC algorithm with an optimized version.

Technical considerations

The original Intel algorithm and Matt Stancliff's version require separate implementations for big-endian and little-endian CPUs. I converted the algorithm to an endian-independent form, i.e. only one variant for all processors. I did not measure any significant speed difference after making the code endian-independent when compiling with optimizations turned on.

It's possible to speed up the CRC calculations even more using other methods such as using the PCLMULQDQ instruction on modern x86 CPUs. However, that would make the code depend on that platform and probably provide only marginal additional speed gains.

My implementation uses static lookup tables for LAME, FLAC and Ogg. This blows up code size a bit and I would have preferred calculating the tables on the fly on first use. That's difficult to get right in a portable, thread safe way in plain C though, so it is used only for Monkey's Audio which is written in C++ (allowing dynamic initialization of static data).

Speed gains

Here are some numbers showing relative speed gain when encoding and decoding with different codecs (all used with default settings):

Codec            Encode   Decode
LAME             0.5%     -
Opus*            0.5%     1%
Vorbis*          0.5%     2%
Monkey's Audio   4%       -
FLAC             5%       5%
Ogg FLAC         10%      15%

* Opus and Vorbis themselves are not optimized, but use the optimized Ogg container library.

The patches

Here are my patches to update the mentioned codecs' CRC calculations to the optimized slicing algorithm: