Before continuing our discussion of anti-aliasing, it helps to understand what aliasing is and why it happens at all. Aliasing in computer graphics is an instance of a more general problem studied by the mathematical discipline of sampling and filtering theory, and anti-aliasing techniques are the practical methods for dealing with it. Sampling is the process of converting a continuous signal into a discrete one; the opposite process is called reconstruction. During sampling, part of the original information is lost because the number of samples is finite. To understand what sampling is and how to choose samples correctly, look at image #1.
I borrowed the idea for this explanation of the sampling problem from the wonderful book "Real-Time Rendering" by Tomas Akenine-Moller, Eric Haines and Naty Hoffman. Because I can't redistribute any part of the book, not even its images, I wasn't lazy and drew my own picture, which I hope will help.
The image shows the so-called "wheel spin paradox", better known as the wagon-wheel effect. You may have seen it in movies: a wheel appears to spin in the direction opposite to its real rotation, or seems to stand still. The first row shows the original signal. In the second row the signal is sampled at a very low frequency, so the wheel looks like it rotates backwards. The third row shows a signal sampled at exactly twice the original frequency; the wheel appears motionless. The last row shows a signal sampled at a frequency slightly higher than twice the original. Notice how well the original signal is captured in that case. So to reconstruct the original signal well, the sample rate must be at least slightly more than twice the signal's frequency. This minimum sampling rate is called the Nyquist rate or Nyquist limit.
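The folding that produces the backward-spinning wheel can be sketched numerically. This is a minimal illustration (the function name `apparent_frequency` is mine, not a standard API): sampling folds any true frequency into the band from zero to half the sampling rate.

```python
def apparent_frequency(f_signal, f_sample):
    """Frequency an observer perceives after sampling f_signal at f_sample.

    Aliasing folds the true frequency into the band [0, f_sample / 2]:
    anything above the Nyquist limit shows up as a lower frequency.
    """
    f = f_signal % f_sample           # fold into one sampling period
    return min(f, f_sample - f)      # fold into the Nyquist band

# A wheel turning at 9 rev/s filmed at 10 frames/s seems to turn at 1 rev/s
print(apparent_frequency(9, 10))   # 1
# Below the Nyquist limit the frequency survives unchanged
print(apparent_frequency(3, 10))   # 3
```

Only when the signal frequency stays under half the sampling rate does the sampled motion match the real one, which is exactly the Nyquist condition described above.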
Now that we understand the nature of aliasing, let's see what happens when we transform objects from a 3D representation into a 2D one. You probably know that a line segment contains an infinite number of points, and the same is true for a rectangle. Moreover, you can prove that if the length of a line segment equals the perimeter of a rectangle, both contain an infinite, and equal, number of points. All of this applies in 3D as well. So there is no way to project a 3D object onto a 2D plane without losing some information, short of an infinite 2D representation. Since almost all commodity monitors have a finite number of 2D points (this value is usually called the screen resolution), we can't have that. To be fair, an infinite number of 2D points, that is, an infinitely large screen resolution, isn't really necessary: as we increase the resolution we mostly add detail to small or distant objects, while large objects, which already cover a big fraction of the screen, improve only very slightly, because their signal frequency is already comparable with the sampling rate. Still, no matter how fine the 2D grid is, we can always choose a 3D object so small that the grid can't cover it properly. Besides, a very high screen resolution is very inefficient in terms of memory, bandwidth and computation. The bandwidth bottleneck has become more important over the last couple of years with the rise of deferred rendering techniques, which require a very fat back buffer.
Let me give you some numbers on memory usage. Take 1680x1050 as a typical screen resolution for computer games. For it we need at least 1680 x 1050 x (4 bytes) x 3 ~= 20 MB of memory. Here I've assumed double buffering and one 32-bit depth buffer. If we simply double the resolution in each dimension, memory usage becomes ~20 * 4 = 80 MB (note this is the typical memory usage for MSAA x4 or FSAA x2). Increasing the resolution further gives 320 MB (equivalent to MSAA x8), 1280 MB (equivalent to a theoretical MSAA x16), 5120 MB (MSAA x32) and so on. As you can see, this quickly becomes impossible on commodity hardware, where we typically have one or two gigabytes, because graphics memory holds not only back buffers but also meshes, textures, other render targets (shadow maps, environment maps and so on) and everything else that makes your favorite game interesting.
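The arithmetic above can be captured in a small helper. This is just a back-of-the-envelope sketch under the same assumptions as the text (double-buffered 32-bit color plus one 32-bit depth buffer, every sample stored in full); the function name is mine.

```python
def framebuffer_bytes(width, height, samples=1,
                      bytes_per_pixel=4, buffer_count=3):
    """Rough framebuffer memory estimate: double-buffered color plus a
    32-bit depth buffer (3 surfaces total), each sample stored fully."""
    return width * height * samples * bytes_per_pixel * buffer_count

MB = 1024 * 1024
print(framebuffer_bytes(1680, 1050) / MB)              # ~20 MB
print(framebuffer_bytes(1680, 1050, samples=4) / MB)   # ~80 MB, MSAA x4
print(framebuffer_bytes(1680, 1050, samples=16) / MB)  # ~320 MB, MSAA x16-ish
```

Scaling the sample count by four multiplies the footprint by four, which is why the series 20, 80, 320, 1280 MB grows so quickly.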
So now we know that converting a 3D object into 2D loses original information. This tells us that displaying 3D graphics on a screen is itself a kind of sampling task, called rasterization. There are many different techniques for this task, and Google has plenty of information about them; here I'll just give you the basics (image #2).
First of all, the rasterizer is just one of many stages of the graphics pipeline; it runs right before the fragment shaders. The rasterizer receives vertices in clip space together with information about which primitive they should form: a line, a triangle, a polygon with more than three vertices, or something else. Its output is a set of pixels ready to be passed to the next stage. Suppose, for example, the rasterizer receives three vertices that must form a triangle. Connecting the vertices sequentially gives only a wireframe triangle, but we want a solid one, so the rasterizer projects the triangle onto the 2D grid and fills it. During this process we lose information about the original triangle, because only pixels whose centers lie inside the triangle boundary are produced. Even if a pixel overlaps the triangle, if its center is not inside the boundary, that pixel is dropped from the pipeline forever. This is an important point, because later you will see that MSAA takes a slightly different approach here.
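The pixel-center rule can be sketched with the classic edge-function test. This is a simplified model of what the hardware does (function names are mine, and real rasterizers also apply fill rules for points exactly on an edge, which I ignore here):

```python
def edge(ax, ay, bx, by, px, py):
    """Signed area test: positive when (px, py) is to the left of edge a->b."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def center_inside(tri, px, py):
    """True when the pixel center (px, py) lies inside a CCW triangle."""
    (ax, ay), (bx, by), (cx, cy) = tri
    return (edge(ax, ay, bx, by, px, py) >= 0 and
            edge(bx, by, cx, cy, px, py) >= 0 and
            edge(cx, cy, ax, ay, px, py) >= 0)

tri = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]   # counter-clockwise triangle

# Pixel (0, 0): its center (0.5, 0.5) is inside, so the pixel is emitted.
print(center_inside(tri, 0.5, 0.5))   # True
# Pixel (3, 3) overlaps nothing at its center, so it is skipped entirely,
# even though the triangle's hypotenuse passes nearby.
print(center_inside(tri, 3.5, 3.5))   # False
```

It is exactly this all-or-nothing center test that produces jagged edges: a pixel half covered by a triangle contributes either its full color or nothing.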
To reduce this problem there are two common and relatively old solutions; to be precise there are three, but the third is very expensive for real-time graphics. They are FSAA, MSAA, and the accumulation buffer with camera projection-matrix jittering. As I already said, the last one is very expensive in terms of performance. The idea behind it is simple: before displaying the final image, we render the same scene multiple times into a so-called accumulation buffer, each time jittering the projection matrix by a sample pattern. The accumulation buffer was commonly implemented in hardware and has slightly more bits per component to avoid color clamping. Then we resolve the accumulation buffer into the back buffer by simply dividing each color by N, where N is the number of scene draws. It works great and gives stable, pleasing results, but performance is very poor. This approach may be good, or even excellent, for movies, but not for games.
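The accumulate-then-divide resolve can be sketched as follows. Everything here is hypothetical scaffolding: `render` stands in for a full scene draw with a jittered projection matrix, and the "scene" is a one-dimensional row of pixels with a single edge in it.

```python
def accumulate(render, jitters):
    """Render the scene once per jitter offset, sum the results into an
    accumulation buffer, then resolve by dividing by the pass count N."""
    passes = [render(j) for j in jitters]
    n = len(passes)
    acc = [0.0] * len(passes[0])
    for img in passes:
        for i, color in enumerate(img):
            acc[i] += color
    return [color / n for color in acc]

# Toy "scene": a white region whose edge shifts with the sub-pixel jitter.
fake_render = lambda j: [1.0 if (x + j[0]) < 1.5 else 0.0 for x in range(4)]

# Two passes with different jitters: the edge pixel ends up half covered.
print(accumulate(fake_render, [(0.25,), (0.75,)]))   # [1.0, 0.5, 0.0, 0.0]
```

Note how the edge pixel resolves to 0.5 instead of snapping to 0 or 1; that smooth intermediate value is precisely the anti-aliased result, bought at the cost of rendering the whole scene N times.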
The first more practical method is FSAA, which stands for Full-Screen Anti-Aliasing. This approach is also known as SSAA (Super-Sampling Anti-Aliasing). The basic idea is to give each pixel some additional resolution. This is achieved by drawing the final image at 2x, 3x or 4x the target resolution, so every pixel of the original, unscaled image corresponds to a 2x2, 3x3 or 4x4 block of sub-pixels. To get the final color of a pixel we usually take the simple arithmetic average of all these sub-pixels. It looks pretty good, but the memory consumption and performance cost make this approach almost unusable in real-time graphics applications like games, with some exceptions.
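The FSAA resolve step is a plain box filter over each sub-pixel block. A minimal sketch (single-channel image as a list of rows, function name mine):

```python
def fsaa_resolve(hi_res, scale):
    """Downsample a super-sampled image by averaging each scale x scale
    block of sub-pixels into one output pixel (simple box filter)."""
    h, w = len(hi_res), len(hi_res[0])
    out = []
    for y in range(0, h, scale):
        row = []
        for x in range(0, w, scale):
            block = [hi_res[y + dy][x + dx]
                     for dy in range(scale) for dx in range(scale)]
            row.append(sum(block) / (scale * scale))
        out.append(row)
    return out

# A 2x super-sampled 4x4 image with a diagonal edge, resolved to 2x2.
hi = [[1, 1, 0, 0],
      [1, 1, 0, 0],
      [1, 0, 0, 0],
      [0, 0, 0, 0]]
print(fsaa_resolve(hi, 2))   # [[1.0, 0.0], [0.25, 0.0]]
```

The partially covered block averages to 0.25, giving the soft edge; the cost is that every one of those sub-pixels was fully shaded, which is exactly what MSAA avoids.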
The second solution is MSAA, which stands for Multi-Sample Anti-Aliasing. It is present in almost every commodity graphics card and is implemented in hardware. This technique also gives each pixel additional sub-pixel information, but unlike FSAA it does so in a performance-friendly form. With MSAA enabled, the pixel stops being an atomic unit of the rasterizer: each pixel now has its own set of sample positions, depending on the MSAA mode. This additional information is stored in an extra buffer called the sample buffer, which contains color/depth/stencil for every sample. So with the same number of samples as FSAA, the memory requirements are equal. To minimize the shading overhead, in other words the number of fragment shader executions, this approach does not run the fragment shader for every individual sample; instead it runs it only once per pixel. That single computed color is then copied to all the samples of the pixel covered by the primitive.
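The shade-once, cover-many behavior can be sketched per pixel like this (names and the 4-bit coverage mask encoding are mine; real hardware resolves with standardized sample patterns):

```python
def msaa_shade_pixel(shaded_color, coverage_mask, background, samples):
    """MSAA sketch: the fragment shader runs ONCE per pixel, and the single
    result is copied only to the samples the primitive actually covers."""
    return [shaded_color if (coverage_mask >> s) & 1 else background
            for s in range(samples)]

def msaa_resolve(sample_colors):
    """Resolve: average all stored samples into the final pixel color."""
    return sum(sample_colors) / len(sample_colors)

# 4x MSAA edge pixel: the triangle covers 3 of 4 samples (mask 0b0111),
# so one shader invocation produces a 75%-covered result.
sample_colors = msaa_shade_pixel(1.0, 0b0111, 0.0, 4)
print(sample_colors)                # [1.0, 1.0, 1.0, 0.0]
print(msaa_resolve(sample_colors))  # 0.75
```

Compare this with FSAA, where four shader invocations would have been needed for the same pixel; MSAA keeps the smooth 0.75 edge value while shading only once.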
A further optimization is CSAA, which stands for Coverage Sampling Anti-Aliasing. While MSAA decouples shading cost from the number of samples, CSAA goes further and decouples the coverage mask from the sample color/depth/stencil data. For example, CSAA x8 stores only 4 samples with color/depth/stencil plus 8 bits of coverage mask, while MSAA x8 uses 8 samples, each with its own color/depth/stencil. CSAA rests on the assumption that if we want further quality improvements, and further performance loss is unwelcome, then a finer coverage mask matters much more for a pixel than additional color and depth information. CSAA provides picture quality very similar to the corresponding MSAA mode while using a lot less memory.
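The storage saving is easy to estimate. The per-sample byte counts below are an assumption for illustration (4 bytes of color plus 4 bytes of depth/stencil); actual hardware layouts differ, but the ratio is the point:

```python
def msaa_bytes_per_pixel(samples, bytes_per_sample=8):
    """MSAA: every sample carries full color + depth/stencil
    (assumed 4 + 4 bytes per sample here)."""
    return samples * bytes_per_sample

def csaa_bytes_per_pixel(coverage_samples, stored_samples=4,
                         bytes_per_sample=8):
    """CSAA: only a few full samples, plus one coverage BIT per coverage
    sample (rounded up to whole bytes)."""
    mask_bytes = (coverage_samples + 7) // 8
    return stored_samples * bytes_per_sample + mask_bytes

print(msaa_bytes_per_pixel(8))   # 64 bytes per pixel at MSAA x8
print(csaa_bytes_per_pixel(8))   # 33 bytes per pixel at CSAA x8
```

Under these assumptions CSAA x8 stores roughly half the per-pixel data of MSAA x8, which is where the "similar quality, a lot less memory" claim comes from.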
The main disadvantage of this method is that it can't perform anti-aliasing on alpha-tested edges: unlike FSAA, it works only on geometry edges. There is a technique called Alpha to Coverage that tries to solve this problem, but it is available only on a limited range of graphics cards; Nvidia has officially supported it starting with the GeForce 7 series and DirectX 10. Maybe I'll come back to this topic later.
That's all for the introduction. Next time we'll talk about Deferred Shading and anti-aliasing techniques for this approach, which has become popular over the last several years.