Thursday, September 29, 2016

libsquish's DXT1 "Cluster Fit" method applied to ETC1

(This post is probably of interest to like a dozen people in the world, so it's kinda hairy.)

libsquish (a popular DXT encoding library) internally uses a total-ordering-based method to find high-quality DXT endpoints. The same method can be applied to ETC1 encoding: using the equations in rg_etc1's optimizer's comments, you can solve for the optimal subblock color given each possible selector distribution in the total ordering, together with the current best intensity table index and the current best subblock color.

I don't actually compute the total ordering; instead, I iterate over all selector distributions present in the total ordering, because the actual per-pixel selector values don't matter to the solver. So basically, the new optimizer first tries the subblock's average color, then it computes and tries a series of "correction" factors (relative to the subblock's average color), which depend on the current intensity table index and the current best subblock color found so far (to account for clamping). A hash table is also used to prevent the optimizer from evaluating a trial solution more than once.
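To make the distribution count concrete, here's a minimal sketch (not code from the actual optimizer) that enumerates every selector distribution for an 8-pixel ETC1 subblock. Each distribution is just a count of how many pixels use each of the four 2-bit selectors, and there are C(8+3,3) = 165 of them, which is where the 165 figure in the update below comes from.

```cpp
// Minimal sketch: enumerate all selector distributions for an 8-pixel
// ETC1 subblock. A distribution (n0, n1, n2, n3) counts how many pixels
// use each selector; n0 + n1 + n2 + n3 == 8. The solver only needs these
// counts, not which pixel got which selector.
#include <cstdio>

int main() {
    int total = 0;
    for (int n0 = 0; n0 <= 8; n0++)
        for (int n1 = 0; n1 <= 8 - n0; n1++)
            for (int n2 = 0; n2 <= 8 - n0 - n1; n2++) {
                int n3 = 8 - n0 - n1 - n2;
                (void)n3;  // one distribution; a real optimizer would record it
                total++;
            }
    printf("%d distributions\n", total);  // prints 165
    return 0;
}
```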

Update 10/1: Okay, I computed a histogram of the "winning" subblock average color correction factors applied over a few hundred test textures. I then selected the top 64 correction factors (out of 165) and don't bother trying the rest. Here's a graph showing the usage histogram of the selector distributions across all the test textures, sorted by frequency (the most successful selector distributions are to the right).

How It Works

rg_etc1 tries refining the current best subblock color found so far as it scans "around" the ETC1 444 or 555 color 3D lattice surrounding the subblock's average color. This refinement approach takes as input the current set of 2-bit selectors, the current best intensity table index, and the current best subblock color (only to account for clamping). Given this information, you can compute an RGB "correction" factor which is subtracted from the subblock's average color to compute a potentially better (lower error) subblock color.
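Here's a minimal sketch of that solve, assuming the standard ETC1 intensity modifier tables and ignoring pixel-level clamping (which is exactly what the current best subblock color is needed for in the real optimizer). Without clamping, the correction is simply the average intensity modifier chosen by the selector distribution, identical for all three channels; trial_base() below is an illustrative name, not rg_etc1's API.

```cpp
// Sketch of the per-channel solve, ignoring pixel-level clamping.
// The selector-to-modifier mapping is simplified here, and a real encoder
// would also quantize the result to the ETC1 444 or 555 lattice.
#include <algorithm>

// The standard ETC1 intensity modifier tables.
static const int g_etc1_modifiers[8][4] = {
    {   -8,  -2,  2,   8 }, {  -17,  -5,  5,  17 },
    {  -29,  -9,  9,  29 }, {  -42, -13, 13,  42 },
    {  -60, -18, 18,  60 }, {  -80, -24, 24,  80 },
    { -106, -33, 33, 106 }, { -183, -47, 47, 183 }
};

// counts[s] = number of subblock pixels using selector s (sums to 8).
// Returns a trial base color for one channel, given that channel's
// subblock average.
int trial_base(int avg_color, int inten_table, const int counts[4]) {
    int sum = 0;
    for (int s = 0; s < 4; s++)
        sum += counts[s] * g_etc1_modifiers[inten_table][s];
    int correction = sum / 8;  // average modifier (rounding glossed over)
    return std::clamp(avg_color - correction, 0, 255);
}
```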

Here's my current optimizer's compute() function. It first tries the subblock's average color (at coordinates m_br, m_bg, m_bb) to establish a baseline (minimally useful) solution, then it iterates over the precomputed (and sorted) selector distribution table and applies (usually) a few dozen avg. color "correction" factors from this table. The table is sorted so that the entries with the highest probability of yielding the best correction appear first (as mentioned above).
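Since the function itself isn't reproduced here, the following is only a rough sketch of the flow just described. evaluate_solution() and the m_br/m_bg/m_bb names come from the post; trial_base() is the per-channel solve sketched above, and everything else (types, signatures) is illustrative.

```cpp
// Rough sketch of the described compute() flow -- not the actual code.
struct Color { int r, g, b; };
struct Dist { int counts[4]; };  // one selector distribution (sums to 8)

void evaluate_solution(const Color& c);  // assumed: trial-encodes the subblock,
                                         // tracks the best result, dedups (below)
int trial_base(int avg, int inten_table, const int counts[4]);  // earlier sketch

void compute_sketch(const Color& avg,   // the subblock average (m_br, m_bg, m_bb)
                    int best_inten,     // current best intensity table index
                    const Dist* sorted_dists, int num_dists) {
    evaluate_solution(avg);  // baseline: the subblock's average color

    // Walk the frequency-sorted distribution table; per the 10/1 update,
    // only the top entries (64 of 165) are worth trying.
    for (int i = 0; i < num_dists; i++) {
        const int* n = sorted_dists[i].counts;
        Color trial = { trial_base(avg.r, best_inten, n),
                        trial_base(avg.g, best_inten, n),
                        trial_base(avg.b, best_inten, n) };
        evaluate_solution(trial);
    }
}
```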

Note that evaluate_solution() uses a hash table to avoid trying the same solution more than once. The sorted table of total ordering selector distributions is at the bottom of this post.
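The dedup idea is straightforward; a minimal sketch (names illustrative, not rg_etc1's) might pack each trial color into a key and skip keys that were already evaluated:

```cpp
// Minimal sketch of duplicate-trial rejection via a hash set.
#include <cstdint>
#include <unordered_set>

static std::unordered_set<uint32_t> g_tried;

// Returns true if this trial subblock color was already evaluated.
bool already_tried(int r, int g, int b) {
    uint32_t key = ((uint32_t)r << 16) | ((uint32_t)g << 8) | (uint32_t)b;
    return !g_tried.insert(key).second;  // insert() fails if key is present
}
```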

About Me

Back in the day I worked for several years at Digital Illusions on things like the first shipping deferred shaded game ("Shrek" - 2001), software renderers, and game AI. Then, after working for Microsoft at Ensemble Studios for 5 years as engine lead on Halo Wars, I took a year off to create "crunch", an advanced DXTc texture compression library. I then worked 5 years at Valve, where I contributed to Portal 2, Dota 2, CS:GO, and the Linux versions of Valve's Source1 games. I was one of the original developers on the Steam Linux team, where I worked with a (somewhat enigmatic) multi-billionaire on proving that OpenGL could still hold its own vs. Direct3D. I also started the vogl (Valve's OpenGL debugger) project from scratch, which I worked on for over a year. In my spare time I work on various open source lossless and texture compression projects: crunch, LZHAM, miniz, jpeg-compressor, and picojpeg.