Wednesday, June 15, 2011

Deferred decals

In this article I will try to explain a popular technique for adding extra detail to geometry without changing the geometry itself. The approach is often called “decals” or “spots”, and it is used in almost every commercial game to add detail to static and dynamic geometry and to apply dynamic effects such as bullet holes, explosion marks, blood splashes and so on.
When I first tried to implement a decal system, I could not find a suitable paper describing the technique. I spent a lot of time searching various forums for useful information, but found nothing except one article in the Game Programming Gems 2 (GPG 2) book, “Applying Decals to Arbitrary Surfaces” by Eric Lengyel. I implemented that technique and it worked, but the many artifacts (z-bias that is hard to tune, and so on) made me sad. Moreover, this approach is suitable only for static geometry and completely unsuitable for dynamic skinned geometry, because there we have to recalculate the decal’s vertices in sync with the animation frames. There are workarounds for skinned models, but they are a completely different technique, and they make the decal system much more expensive and harder to implement and support.
But once many game developers started using a new rendering approach called deferred shading (DS) and its variations such as LPP, DL, and so on, a new way of applying decals appeared that handles static and dynamic skinned geometry in almost the same simple and intuitive way.
The basic idea of the algorithm is very simple. If you have heard of Humus’s “Volume Decals”, we are done; if not, we are almost done. The only differences between this approach and Humus’s are the kind of texture used and how the texture coordinates are calculated.
Before talking about deferred decals, I'll quickly explain the common decal approach. What is a decal? It is a spot on some surface that blends with the original surface’s color. To put a decal on a surface without changing the original surface data, we have to clone the part of the surface’s geometry that intersects the decal’s volume, render these vertices over the original surface, and finally blend the decal's color with the surface's color. This process may be simple or very hard, depending on the surface type. For an almost flat surface, finding the decal’s geometry is relatively simple, but for a curved surface it may not be.
To find the surface vertices where the decal has to be, you need some collision data. Developers usually use a third-party physics engine to find the approximate position of the decal on the object, and then use a detailed geometry representation to select and clip the right vertices. There are many different approaches. Some engines keep a detailed geometry representation in the collision scene as concave triangle meshes (many physics engines support this only for the static collision scene, not for dynamic objects, because fully updating a highly optimized internal representation is too slow). Other engines use the physics engine to determine an approximate position of the decal and then use, for example, a BSP model (or something else) to determine the set of triangles that belong to the decal.
For static geometry this relatively expensive calculation is needed only once and may even be precomputed, but for skinned geometry we would need to redo it almost every frame, which is very bad. Moreover, even for static geometry it is not always possible to precompute the decal geometry or to avoid recalculating it. A simple example is terrain with dynamic level of detail: every time the terrain changes its LOD you have to recalculate the decal's vertices, or you will see terrible z-fighting where the decal is. We have to find a better solution. Currently I know 4 approaches:

  • Use an additional vertex stream with color that is unique for every model in the scene, even if the models share other geometry data such as positions, texture coordinates, animation vertex weights and so on. We dynamically add the decal’s color to this additional vertex stream. The process is very similar to adding decals to static geometry: we get the transformed geometry for the current frame, find the vertices affected by the decal, and add the color associated with this decal and its properties to the additional stream. If you do skinning on the CPU, as DooM 3 does, you already have the transformed geometry on the CPU side. If you use GPU hardware skinning, as FarCry does, you have to do the calculation on the CPU only once to get access to the transformed geometry, fill the color vertex stream with the decal’s color, and then do all transformations on the GPU as usual. This method depends heavily on the tessellation of the geometry, because it stores color per vertex rather than per pixel. That makes its quality similar to per-vertex lighting, so it is suited only for very blurry spots, not for detailed ones.
  • Similar to the previous one, but instead of writing color into an additional vertex stream, gather all vertices affected by a decal and animate them in the same way as the original geometry. Decals may be difficult to batch in this case, because for each decal we need to send the skeleton transformation matrices to the shader.
  • Based on an approximate collision model of the object, find the position, direction and size of a decal. With this info we can represent the decal as a box that has a position, the direction as its orientation, and the size as a scale along each of the three axes. Once every decal is represented by such a box, we can reuse this info to reconstruct decals on an arbitrary surface entirely in the vertex and pixel shaders. The rest is very similar to texture projection or shadow mapping. We can imagine that the box associated with each decal is a small camera that sees only a little piece of the object’s geometry. So if we construct view and projection matrices for this box (this is very easy and I’ll show it later), we can transform the object’s vertices from world space into the decal's virtual space. Now we have the clip-space positions of the object’s surface points for each box (decal). All that is left is to turn the clip-space coordinates into texture coordinates and use them to sample the decal’s textures. After we have fetched the decal color we can easily blend it with the main surface’s color. All this work may be done either in the main shaders (where the number of decals is strictly limited by the instruction count) or in an additional shader in an additional pass.
  • Use an additional texture for every model affected by decals. To add a decal, all we need to do is find the vertices affected by it and draw them into an additional render target, using the vertices' texture coordinates as their clip-space positions. This process is called writing into the texture wrapping (that is, rendering in texture space), and its output is a texture with the decal color. During the main shading pass we sample this texture with the same texture coordinates as the diffuse texture. This approach has a lot of drawbacks: high memory usage, the need for run-time mip-map generation, additional work on device loss in DX9, and so on.
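To make the last approach a little more concrete, here is a minimal C++ sketch of the mapping it relies on: turning a vertex's texture coordinate into a clip-space position so that the mesh is rasterized into its own texture layout. The type and function names (`Float2`, `Float4`, `uvToClipSpace`) are hypothetical and not taken from the demo. The sketch assumes the Direct3D convention (v = 0 is the top row of the texture, clip-space y = +1 is the top of the render target) and ignores the half-texel offset that DX9 would additionally require.

```cpp
#include <cassert>

// Hypothetical helper types; the names are illustrative only.
struct Float2 { float x, y; };
struct Float4 { float x, y, z, w; };

// Map a texture coordinate (u, v in [0, 1]) to a clip-space position.
// Assumption: D3D-style layout, so v grows downwards while clip-space y
// grows upwards, which is why y is flipped.
Float4 uvToClipSpace(Float2 uv)
{
    return Float4{ uv.x * 2.0f - 1.0f,   // u: [0, 1] -> x: [-1, +1]
                   1.0f - uv.y * 2.0f,   // v: [0, 1] -> y: [+1, -1]
                   0.0f,                 // depth is irrelevant here
                   1.0f };
}
```

A vertex shader doing the same thing would output this value as the position, while the decal color is computed from the real (world-space) vertex position.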
The deferred decals approach is based on texture projection but extends it so that almost no additional geometry is needed for decals at all. With deferred decals we do not need very detailed collision objects. All we need now is a very approximate collision object and access to the scene depth from the pixel shader (every DS pipeline provides this).
Each deferred decal may be represented as a box with the desired scale, orientation and position (as I described earlier). We just render this box with special shaders that perform the texture projection. In the vertex shader we reconstruct the view-projection matrix from the incoming parameters.
//-- 2. compute world matrix.
//-- then we premultiply it by the scale matrix. world = scale * rotateTranslate.
float3 scale = m_scale;
float3 pos   = m_pos;
float3 zAxis = normalize(/* argument missing in the source */);
float3 yAxis = normalize(/* argument missing in the source */);
float3 xAxis = cross(yAxis, zAxis);
float4x4 scaleMat = {
    {scale.x, 0, 0, 0},
    {0, scale.y, 0, 0},
    {0, 0, scale.z, 0},
    {0, 0, 0, 1}
};
float4x4 worldMat = {
    {xAxis, 0},
    {yAxis, 0},
    {zAxis, 0},
    {pos,   1}
};
//-- 3. final world matrix.
worldMat = mul(scaleMat, worldMat);
//-- 4. compute data for the decal view matrix.
float4x4 lookAtMat = {
    {xAxis.x, yAxis.x, zAxis.x, 0},
    {xAxis.y, yAxis.y, zAxis.y, 0},
    {xAxis.z, yAxis.z, zAxis.z, 0},
    {-dot(xAxis, pos), -dot(yAxis, pos), -dot(zAxis, pos), 1}
};
//-- 5. compute data for the decal proj matrix.
float4x4 projMat = {
    {2.0f / scale.x, 0, 0, 0},
    {0, 2.0f / scale.y, 0, 0},
    {0, 0, 1, 0},
    {0, 0, 0, 1}
};
//-- 6. calculate final view-projection decal matrix.
float4x4 viewProjMat = mul(lookAtMat, projMat);
Then, in the pixel shader, we use this view-projection matrix to calculate the texture coordinates for sampling the decal’s texture (very much as we do for shadow mapping or texture projection).
//-- calculate texture coordinates and clip coordinates.
float2 texCoord = i.pos.xy * /* scale factor missing in the source */;
//-- reconstruct pixel’s world position.
float3 pixelWorldPos = reconstructWorldPos(texCoord);

//-- reconstruct decal view-proj matrix.
float4x4 decalViewProjMat = { i.row0, i.row1, i.row2, i.row3 };

//-- calculate texture coordinates for projection texture.
float4 pixelClipPos = mul(float4(pixelWorldPos, 1.0f), decalViewProjMat);
pixelClipPos.xy /= pixelClipPos.w;
//-- accessing decal’s texture.
//-- Note: CS2TS is helper function performing conversion
//-- from Clip Space To Texture Space.
float4 oColor = sample2D(diffuse, CS2TS(pixelClipPos));
That is all. Of course you can compute the view-projection matrix on the CPU side, but I decided to eliminate the CPU overhead and move all the computation to the GPU. Many optimizations could be made to save processing power, but for clarity I wrote the code in the cleanest way.
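For readers who prefer the CPU-side variant, here is a minimal C++ sketch of the same view-projection construction, followed by the projection of a world-space point into decal texture space. It mirrors the shader matrices above, including the unscaled z row of the projection matrix. The math types, the `dir` and `up` inputs, and the names `decalViewProj` and `clipToTex` are assumptions made for this illustration; `clipToTex` is only a stand-in for the CS2TS helper, whose real body is not shown in this post. The sketch also assumes that `dir` and `up` are perpendicular.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal math types (row vectors, v' = v * M, as in HLSL mul(v, M)).
struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };
struct Mat4 { float m[4][4]; };

float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 cross(Vec3 a, Vec3 b) {
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}
Vec3 normalize(Vec3 v) {
    float len = std::sqrt(dot(v, v));
    return { v.x / len, v.y / len, v.z / len };
}
Mat4 mul(const Mat4& a, const Mat4& b) {
    Mat4 r{};  // zero-initialized accumulator
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}
Vec4 mul(Vec4 v, const Mat4& m) {
    const float in[4] = { v.x, v.y, v.z, v.w };
    float out[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
    for (int c = 0; c < 4; ++c)
        for (int r = 0; r < 4; ++r)
            out[c] += in[r] * m.m[r][c];
    return { out[0], out[1], out[2], out[3] };
}

// Build the decal view-projection matrix from the box parameters.
// Assumption: dir and up are perpendicular, so the axes are orthonormal.
Mat4 decalViewProj(Vec3 pos, Vec3 dir, Vec3 up, Vec3 scale) {
    Vec3 zAxis = normalize(dir);
    Vec3 yAxis = normalize(up);
    Vec3 xAxis = cross(yAxis, zAxis);
    Mat4 lookAt = {{ { xAxis.x, yAxis.x, zAxis.x, 0.0f },
                     { xAxis.y, yAxis.y, zAxis.y, 0.0f },
                     { xAxis.z, yAxis.z, zAxis.z, 0.0f },
                     { -dot(xAxis, pos), -dot(yAxis, pos), -dot(zAxis, pos), 1.0f } }};
    Mat4 proj = {{ { 2.0f / scale.x, 0.0f, 0.0f, 0.0f },
                   { 0.0f, 2.0f / scale.y, 0.0f, 0.0f },
                   { 0.0f, 0.0f, 1.0f, 0.0f },
                   { 0.0f, 0.0f, 0.0f, 1.0f } }};
    return mul(lookAt, proj);
}

// Stand-in for CS2TS: clip space [-1, +1] to texture space [0, 1], y flipped.
Vec2 clipToTex(Vec4 clip) {
    return { clip.x / clip.w * 0.5f + 0.5f, -clip.y / clip.w * 0.5f + 0.5f };
}
```

With a decal at (10, 0, 5) looking along +z and a box of size (2, 4, 1), the box center maps to the middle of the decal texture, and a point on the box's +x/+y corner maps to the texture corner (1, 0). Points whose texture coordinates fall outside [0, 1] lie outside the box footprint and should not receive the decal.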
Now you know how to use deferred decals, but there are some optimizations:
    • Using instancing for decals together with a texture array allows all decals to be rendered in one draw call (I use this approach in the demo).
    • If you’re using a DS pipeline you can blend decals directly into the main normal and albedo buffers, which saves memory and performance, because you don’t need to read the decals separately later during the lighting pass.
You should also know that deferred decals have some drawbacks. One of the most important is that a decal also influences geometry it is not meant for (look at the image).
In the image you can see an artifact that appears when both the terrain and the barrel model are inside the decal’s volume: in one case I want the decal to affect only the terrain, and in the other only the barrel but not the terrain. There is a solution. Its basic idea is to add to the g-buffer an additional channel that holds the object's unique index (0-255) in the current frame. In the same way, each decal has an associated index that says which kind of objects the decal wants to affect. I have a good feeling about this method and want to try implementing it in the near future. Moreover, this additional object index may be reused for material selection during the lighting pass.
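Since this masking idea is not implemented in the demo yet, the following C++ snippet is only a sketch of how the per-pixel test could look. It assumes that the g-buffer channel stores one 8-bit index per pixel and that a decal is applied only where its own index matches exactly; the structure and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical data: the extra g-buffer channel and the decal's target index.
struct GBufferSample { std::uint8_t objectIndex; };  // 0-255, written per pixel
struct Decal         { std::uint8_t targetIndex; };  // which objects to affect

// Assumption: a decal affects a pixel only if the indices match exactly.
bool decalAffectsPixel(const Decal& decal, const GBufferSample& pixel)
{
    return decal.targetIndex == pixel.objectIndex;
}

// Blend one color channel; pixels that fail the test keep the surface color.
float blendChannel(float surface, float decalColor, float alpha, bool affected)
{
    return affected ? surface + (decalColor - surface) * alpha : surface;
}
```

In a pixel shader the same test would simply discard the fragment (or output zero alpha) when the indices differ, so a decal aimed at the terrain leaves the barrel untouched even though the barrel is inside the decal's box.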
As with every technique, the deferred decals approach has its pros and cons.
Pros:
  • no z-fighting problems at all, due to the projective nature of the technique.
  • combines the decal approaches for static and dynamic geometry in one simple solution.
  • no detailed geometry representation is needed to find the exact vertices; a very approximate collision shape is enough.
  • no additional work is needed for lighting, because the lights' impact is handled during the DS lighting pass.
  • all DS lighting optimizations work for it as well.
Cons:
  • some additional work is needed to fix the artifacts inherent in texture projection.
  • unwise usage may result in very high overdraw (for example, if you use a very big decal on terrain). But we can apply the optimizations from the DS light resolving pass, because the two approaches are very similar.
  • works well only with DS-oriented pipelines (including forward shading with a z-pre-pass, but at the cost of additional memory).
Some screenshots and, of course, the demo with the shader code (anyone who wants the C++ code may mail me and I will send it):
P.S. The second part of the post "Anti-aliasing techniques" is still in progress.

1 comment:

  1. Hey neat work there on deferred decals.
    Is the only difference between your technique and humus is the use of different texture.
    If u could share the source code, would be great for learning.