Conservative Morphological Anti-Aliasing (CMAA)

This sample, which can be downloaded here, presents a new image-based, post-processing anti-aliasing technique called Conservative Morphological Anti-Aliasing (CMAA). The technique was originally developed by Filip Strugar at Intel for use in GRID2 by Codemasters*, to offer a high-performance alternative to traditional multisample anti-aliasing (MSAA) while addressing artistic concerns with existing post-processing anti-aliasing techniques. The sample allows CMAA to be compared with several popular post-processing techniques and with hardware MSAA, both in a real-time rendered scene and on an existing image. The scene is rendered using a simple HDR technique and includes basic animation, so the user can compare how the different techniques cope with temporal artifacts as well as with static portions of the image.

Figure 1: CMAA Sample using HDR and animating geometry

MSAA has long been used to reduce aliasing in computer games and significantly improve their visual appearance. MSAA works by running the pixel shader once per pixel but running the coverage and occlusion tests at higher than normal resolution, typically 2x through 8x, and then merging the results together. Although significantly faster than supersampling, it still adds considerable cost compared to no anti-aliasing and is difficult to implement with certain techniques. One common way to approach this problem is to use image-based post-process anti-aliasing (PPAA), which became practical with GPU ports of Morphological Antialiasing (MLAA) [Reshetov 2009] [1] and further developments such as "Enhanced Subpixel Morphological Antialiasing" (SMAA) [2] and NVIDIA's "Fast Approximate Anti-Aliasing" (FXAA) [3]. Compared to MSAA, these PPAA techniques are easy to implement and work in scenarios where MSAA does not (such as deferred lighting and other non-geometry-based aliasing), but they lack adequate sub-pixel accuracy and are less temporally stable. They also cause perceptible blurring of textures and text, since it is difficult for edge-detection algorithms to distinguish between intentional colour discontinuities and unwanted aliasing caused by imperfect rendering.
Currently, two of the most popular PPAA algorithms are:

SMAA is an algorithm based on MLAA but with a number of innovations and improvements, and with a number of quality/performance presets. It implements advanced pattern recognition and local contrast adaptation, and the more expensive variations use temporal super-sampling to reduce temporal instability and improve quality. The SMAA algorithm version referenced in this document is the latest public code v2.7.

FXAA is a much faster effect. However, FXAA has simpler colour discontinuity shape detection, causing substantial (frequently unwanted) image blurring. It also has a fairly limited kernel size by default, so it doesn't sufficiently anti-alias longer edge shapes, while increasing the kernel size impacts performance significantly. The FXAA algorithm version referenced in this document is the latest public code, v3.8.

In this sample we introduce a new technique called Conservative Morphological Anti-Aliasing (CMAA). CMAA addresses two requirements that are currently not addressed by existing techniques:

To run efficiently on low- to mid-range GPU hardware, such as integrated GPUs, while providing a quality anti-aliasing solution. A budget of under 3 ms was used as a guide when developing the technique, at a resolution of 1600x900 running on a 15-watt, 4th Generation Intel® Core™ processor.

To be minimally invasive, so that it is acceptable as a replacement for 2x MSAA in a wide range of applications, including worst-case scenarios such as text, repeating patterns, certain geometries (power lines, mesh fences, foliage), and moving images.

CMAA is positioned between FXAA and SMAA 1x in computation cost (1.0-1.2x the cost of default FXAA 3.8 and 0.55-0.75x the cost of SMAA 1x). Compared to FXAA 3.8, CMAA provides significantly better image quality and temporal stability, as it correctly handles edge lines up to 64 pixels long and is based on an algorithm that only handles symmetrical discontinuities in order to avoid unwanted blurring (thus being more conservative). Compared to SMAA 1x, it provides less anti-aliasing because it handles fewer shape types, but it also causes less blurring and shape distortion and is more temporally stable (it is less affected by small frame-to-frame image changes).

CMAA has four basic logical steps (not necessarily matching the order in the implementation):
1. Image analysis for colour discontinuities (afterwards stored in a local compressed 'edge' buffer). The method used is not unique to CMAA.
2. Extracting locally dominant edges with a small kernel. (Unique variation of existing algorithms).
3. Handling of simple shapes. Not particularly unique.
4. Handling of symmetrical long edge shape. (Unique take on the original MLAA shape handling algorithm.)
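
As a rough illustration of the first logical step in the list above, here is a minimal Python sketch of threshold-based edge detection that packs the result into per-pixel bit flags. The luma-difference metric, the `threshold` default, and the `RIGHT`/`BOTTOM` bit layout are all assumptions made for illustration; this text does not specify the colour metric or buffer format the sample uses.

```python
# Hypothetical bit layout for the compressed edge buffer (not from the sample).
RIGHT, BOTTOM = 0x1, 0x2


def luma(rgb):
    """Rec. 709 luma of an (r, g, b) tuple; used here as an assumed colour metric."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def detect_edges(image, threshold=0.1):
    """Return per-pixel edge flags for a list of rows of (r, g, b) floats in [0, 1].

    A RIGHT flag marks a discontinuity between a pixel and its right neighbour,
    a BOTTOM flag marks one between a pixel and the pixel below it.
    """
    height, width = len(image), len(image[0])
    flags = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            c = luma(image[y][x])
            if x + 1 < width and abs(c - luma(image[y][x + 1])) > threshold:
                flags[y][x] |= RIGHT
            if y + 1 < height and abs(c - luma(image[y + 1][x])) > threshold:
                flags[y][x] |= BOTTOM
    return flags
```

Storing only two flags per pixel is enough to describe every edge once, since a pixel's left and top edges are the right and bottom edges of its neighbours.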

Step 2: Locally dominant edge detection (or, non-dominant edge pruning)
This step serves a similar function to "local contrast adaptation" in SMAA and "local contrast test" in FXAA, but with a smaller kernel. For each edge detected in Step 1, its colour delta value above the threshold (dEc) is compared to that of the 12 neighboring edges (dEn):

Figure 3: The edge remains an edge if its dEc > lerp( average(dEn), max(dEn), ldeFactor), where ldeFactor is empirically chosen (defaults to 0.35).
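
The test in the caption above can be written out directly. This is a minimal Python sketch, assuming the caller has already gathered the deltas of the 12 neighbouring edges; which edges make up that neighbourhood is shown only in the figure, so the function simply takes them as a list. The names `d_ec`, `d_en`, and `lde_factor` mirror the caption's symbols.

```python
def lerp(a, b, t):
    """Linear interpolation from a (t = 0) to b (t = 1)."""
    return a + (b - a) * t


def is_locally_dominant(d_ec, d_en, lde_factor=0.35):
    """Keep an edge only if its delta exceeds a blend of the neighbours' average and maximum.

    d_ec: colour delta above threshold for the candidate edge.
    d_en: colour deltas above threshold for the neighbouring edges.
    """
    average = sum(d_en) / len(d_en)
    return d_ec > lerp(average, max(d_en), lde_factor)


# One strong neighbour among eleven flat ones: average = 0.05, max = 0.6,
# so the cut-off is 0.05 + 0.55 * 0.35 = 0.2425.
neighbours = [0.0] * 11 + [0.6]
keep_strong = is_locally_dominant(0.3, neighbours)  # above the cut-off
keep_weak = is_locally_dominant(0.2, neighbours)    # below the cut-off
```

With `lde_factor` at 0 the cut-off is the neighbourhood average, and at 1 it is the neighbourhood maximum, so the parameter controls how aggressively weaker edges are pruned.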

This smaller local adaptation kernel size is somewhat less efficient at increasing effective edge detection range. However, it is more effective at preventing blurring of small shapes (such as text), reduces local shape interference from less noticeable edges, avoids some of the pitfalls of large kernels (a visible kernel-sized transition from un-blurred to blurry), and performs better.

Step 3: Handling of simple shapes
Edges detected in step 1, and refined in step 2, are used to make assumptions about the shape of the underlying edge before rasterization (virtual shape). For simple shape handling, all pixels are analyzed for existence of 2, 3 and 4 edge aliasing shapes, and colour transfer is applied to match the virtual shape colour coverage and achieve the local anti-aliasing effect (Figure 4). While this colour transfer is not always symmetrical, the amount of shape distortion is minimized to sub-pixel size.
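
The following Python sketch illustrates the general idea of simple shape handling: count the edges around a pixel and, for 2, 3 and 4 edge shapes, transfer a small amount of colour from the neighbours across those edges. The bit names, the `weight` value, and the blend formula are placeholders chosen for illustration; they are not taken from the sample, which this text does not describe at that level of detail.

```python
# Hypothetical per-pixel edge mask bits (not the sample's layout).
LEFT, TOP, RIGHT, BOTTOM = 0x1, 0x2, 0x4, 0x8
SIDES = (LEFT, TOP, RIGHT, BOTTOM)


def blend_simple_shape(center, neighbours, edge_mask, weight=0.1):
    """Blend a grayscale pixel toward the neighbours that lie across its edges.

    center: pixel value.
    neighbours: dict mapping each side bit to the neighbouring pixel value.
    edge_mask: OR of the side bits on which an edge was detected.
    weight: illustrative per-edge transfer amount (an assumption).
    """
    edged = [side for side in SIDES if edge_mask & side]
    if len(edged) < 2:
        return center  # 0 or 1 edges: not a simple shape, leave untouched
    transfer = sum(neighbours[side] - center for side in edged)
    return center + weight * transfer
```

Because colour only flows toward the processed pixel, a transfer like this is not symmetrical, which is why keeping the per-edge amount small limits the distortion to sub-pixel size.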

Step 4: Handling of symmetrical long edge shape
4a. Each edge-bearing pixel is analyzed for a potential Z-shape, representing the center of the virtual shape rasterization pixel step (which is mostly a triangle edge). The criterion used for this detection is illustrated in Figure 5. Four Z-shape orientations (with 90° difference) are handled.

4b. For each detected Z-shape, the length of the edge to the left and right is determined by tracing the horizontal edges (in the case of the two horizontal Z-shapes) on both sides, stopping when no edge is present on either side or when a vertical edge is encountered (Figure 6).
4c. The edge length from the previous step is used to reconstruct the location of the virtual shape edge and apply colour transfer (to both sides of the Z-shape) to match coverage that it would have at each pixel. This step overrides any anti-aliasing done in Step 3 on the same pixels.
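
The tracing in step 4b can be sketched in Python as a walk along a strip of edge flags. This is a simplified, one-dimensional model: the `H_EDGE`/`V_EDGE` flags, the `max_steps` cap, and the choice not to count the pixel carrying the vertical edge are assumptions for illustration, and the real Z-shape involves edges on two adjacent rows.

```python
# Hypothetical flags for one strip of pixels along a horizontal Z-shape.
H_EDGE, V_EDGE = 0x1, 0x2


def trace(strip, start, step, max_steps):
    """Count consecutive horizontal-edge pixels from start in direction step.

    Tracing stops when the horizontal edge ends, when a vertical edge is
    encountered, at the strip boundary, or after max_steps pixels.
    """
    length = 0
    x = start + step
    while 0 <= x < len(strip) and length < max_steps:
        flags = strip[x]
        if not (flags & H_EDGE) or (flags & V_EDGE):
            break
        length += 1
        x += step
    return length


def edge_lengths(strip, center, max_steps=64):
    """Return (left_length, right_length) of the edge around a Z-shape center."""
    return trace(strip, center, -1, max_steps), trace(strip, center, +1, max_steps)
```

Capping the walk keeps the cost bounded per Z-shape; the default of 64 here simply echoes the maximum edge length quoted earlier and may not match how the sample applies that limit.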

The inherent symmetry of this approach better preserves the overall average colour and original shapes of the image, ignores borderline cases, and is more temporally stable. Changes to one pixel (or a few pixels) do not induce drastic colour transfer or shape modification, in contrast to SMAA 1X, FXAA 3.8 and older MLAA-based techniques.
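
To make the colour-preservation argument concrete, here is a minimal Python sketch of a symmetrical colour transfer across a horizontal edge between two grayscale rows. The linear coverage model, the `half_len` parameter, and the choice of which row is blended on which side are assumptions for illustration; the sample's actual weights and orientation handling are not given in this text.

```python
def symmetric_transfer(upper, lower, center, half_len):
    """Apply a symmetrical colour transfer around a Z-shape step.

    upper, lower: grayscale rows on either side of a horizontal edge.
    center: column index where the virtual edge crosses the row boundary.
    half_len: number of pixels affected on each side of the step.

    The virtual edge is modelled as a straight line whose offset falls
    linearly from half a pixel at the step to zero at distance half_len,
    so pixel k receives the average coverage 0.5 * (1 - (k + 0.5) / n).
    """
    n = min(half_len, center, len(upper) - center)  # keep both sides in range
    new_upper, new_lower = list(upper), list(lower)
    for k in range(n):
        w = 0.5 * (1.0 - (k + 0.5) / n)
        xl = center - 1 - k  # left of the step: upper row takes lower colour
        xr = center + k      # right of the step: lower row takes upper colour
        new_upper[xl] = upper[xl] + w * (lower[xl] - upper[xl])
        new_lower[xr] = lower[xr] + w * (upper[xr] - lower[xr])
    return new_upper, new_lower
```

For two uniform rows, every amount of colour added on one side of the step is removed on the other, so the summed colour of the two rows is unchanged. An asymmetrical transfer would not have this property.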

The sample UI allows a direct comparison between several anti-aliasing techniques, which are selectable from a drop-down menu along with several debug features. All the techniques can be viewed in high detail using a zoom box that can be enabled from the UI and positioned with a right mouse click. For both CMAA and SAA, additional debug information highlights the actual edges that have been detected by the algorithm, and a slider allows you to adjust the threshold used for edge detection. In the case of CMAA, both the edge threshold and the non-dominant edge removal threshold can be modified. The effect on performance caused by modifying the thresholds can be viewed if the application is run with vsync disabled. GPU performance metrics are displayed in the upper left-hand corner of the application, showing the overall cost of rendering the scene and the time taken in the post-processing anti-aliasing code. When viewing the stats, additional debug information such as the zoom box and the edge view should be disabled, as they both lower performance by forcing sub-optimal code paths to be used. When viewing CMAA in the zoom box with "Show Edges" enabled, the zoom box will also animate to show the effects of applying CMAA to the image; this doesn't affect the rest of the display.

In addition to showing the effect of the various post-processing techniques on the real-time scene, the application allows a static image to be loaded and used as the source for the effect; the currently supported file format is PNG. A synthetic sample image is provided in the sample's media directory (Figure 9). Attempting to run the sample with 2x and 4x MSAA will have no effect on a loaded image, as MSAA would normally affect the image source, but CMAA, SMAA, FXAA and SAA can all be applied to the image. This feature allows anyone considering using any of the post-processing techniques in the sample to quickly load images taken from their own application and experiment with the various threshold parameters.

Generally, CMAA shows the fewest artifacts but is not as good as SMAA 1X [2] when handling certain angled lines; SMAA 1X has some issues with certain shapes but is otherwise high quality. FXAA 3.8 displays many artifacts and is unable to blur edges on long, slightly angled lines (bottom). Both SMAA 1X and FXAA 3.8 cause distortion of the pixel grid area as a result of the local contrast adaptation algorithm with a large kernel.