Tuesday, November 1, 2011

Software occlusion culling trip. Part I: Packing the luggage.

Occlusion culling in computer graphics, especially in games, has become more and more important over the last several years because the complexity of virtual worlds grows very quickly. Games also tend to use much more complex materials with a very heavy shading cost. Traditional hardware occlusion culling (HOC) relies on the extremely efficient and scalable power of the GPU, but in current games it often becomes a bottleneck: rendering itself already consumes a lot of GPU power, the GPU is usually more heavily utilized than the (especially multicore) CPU, and any additional GPU work not directly related to rendering can cause noticeable drops in frame rate. So developers have started looking for a way to use occlusion culling that culls a lot of invisible objects from rendering without putting extra load on the GPU.

There is a "solution" called software occlusion culling (SOC); the difference is which kind of processing power is used. SOC is very similar to HOC in principle, but SOC uses the CPU to decide whether a given object is visible or not. In an age where CPUs keep increasing their core count this doesn't seem like a bad idea, because SOC is very well suited for parallelization and can utilize all the power of modern CPUs. To be precise, SOC has its own drawbacks and it isn't the only solution for occlusion culling. There are techniques which precompute the static scene and then use this precomputed data to quickly answer the question "Which objects are visible from the current point of view?" But they work well only with an almost static (often indoor) environment and only with very few dynamic objects in the scene. So if you want both occlusion culling and a relatively dynamic (potentially destructible) environment, SOC is the only suitable approach.
 
That theme has always been very interesting for me, and during this series of posts I want to implement software occlusion culling step by step. I know that this technique is not novel and that there are examples where it has been successfully implemented in commercial products (the Frostbite 2 engine, for example), but I'm interested in gaining experience in the many areas around this technique: software rasterization, SIMD optimizations, multithreading support and, of course, C++.

The first step of our trip will be understanding how the OC (HOC in particular) algorithm works and why HOC becomes less suitable for games with heavy GPU usage. OC is a very good optimization for virtual worlds with a large number of objects where only a relatively small part of the entire world is visible at any given time from the current camera point of view. The OC algorithm is built on two basic concepts: occluder objects and occluded objects. Occluder objects are renderable objects which hide other objects behind them. Not all objects are good occluders; good occluders are, for example, walls, big buildings and so on, in other words everything big enough to hide a lot of other objects. Occluded objects are objects which were culled from rendering because they are located entirely behind an occluder object. Note that OC does not need to be performed with some kind of object rendering; it may even be done fully analytically, i.e. by some mathematical formulation. For example, PVS (Potentially Visible Set, one kind of OC algorithm) uses an analytical approach, while HOC and SOC use a so-called numerical approach. Look at the image below to understand the roles of occluder objects (red) and occluded objects (black). The green triangle is the camera view frustum as seen along the XOZ plane. As you can see, occluder objects can significantly reduce the visible object count when OC is enabled.

Now let's try to answer the question "Why does HOC become a less suitable approach for games with heavy GPU usage?" The answer is hidden in the basic principle of how the CPU and GPU work together. Most of the time they work independently and asynchronously, which means the CPU is just a supplier of work for the GPU as its consumer. During rendering the CPU just calls GAPI (graphics API) functions, the graphics card driver collects these commands in its internal command buffer and from time to time flushes this buffer to the GPU for execution. There is no implicit synchronization between CPU and GPU in typical usage; even the Present() GAPI function is asynchronous and is just a usual buffered driver command like many others. But there are cases where synchronization is necessary, for example when we try to read data back from the GPU or ask the GPU to return some query result synchronously.

Let's talk a little about the query mechanism in modern GPUs. There are a lot of different queries; you can find a detailed description of most of them in the official DirectX documentation or the OpenGL specification. For HOC we are interested in one specific kind of query, the occlusion query. What is it? An occlusion query helps us answer the question "How many pixels of a rendered object actually made it to the screen?" Before rendering some geometry we ask the GAPI to create an occlusion query for us, then we render our geometry, and after that we ask the GAPI for the result, i.e. how many pixels were rendered. Notice that the third step, when we ask the GAPI to return the result, may be either synchronous or asynchronous. Often it is used synchronously, or in a pseudo-synchronous way where the query itself is asynchronous but we poll the GAPI for the result in an infinite loop. To understand why synchronizing the CPU and GPU is a really bad idea when we retrieve the results of an occlusion query, let's look at the image below. It illustrates the driver command buffer: the green rectangles are usual GPU commands like SetTexture, DrawPrimitive and so on, the two red rectangles are, respectively, issuing an occlusion query and retrieving its results, and the blue rectangle between them is the DrawPrimitive command for which we want to get the number of rendered pixels.

As you can see in the image, our CPU is slightly ahead of the GPU: the CPU has produced more commands than the GPU can execute right now, but the CPU doesn't wait while the GPU executes them; it just continues to write commands into the command buffer as usual and doesn't even suspect that the GPU is currently really busy and can't execute all these commands immediately. But what happens when we issue an occlusion query and wait for its results? We manually force the CPU to wait until the GPU has finished its work, which means the CPU now stalls until the red vertical line. If we have a lot of these queries, that may significantly drop overall performance. Don't forget also about the additional work for the GPU introduced by drawing the occluder geometry and the test geometry of the potentially occluded objects.
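To make this pattern concrete, below is a small sketch of how such a query is usually issued and read back with Direct3D 9. This is an illustration only: device and drawOccludee() are assumed to exist somewhere in your render loop, and the busy-wait loop at the end is exactly the stall described above.

//-- a hedged Direct3D 9 sketch: wrap an occlusion query around a cheap test
//-- draw, then block on the result (the bad, synchronous way).
IDirect3DQuery9* query = NULL;
if (SUCCEEDED(device->CreateQuery(D3DQUERYTYPE_OCCLUSION, &query)))
{
    query->Issue(D3DISSUE_BEGIN);
    drawOccludee();                 //-- usually just a bounding volume.
    query->Issue(D3DISSUE_END);

    //-- spinning here forces the CPU to wait until the GPU has executed
    //-- every command up to and including the query.
    DWORD visiblePixels = 0;
    while (query->GetData(&visiblePixels, sizeof(DWORD), D3DGETDATA_FLUSH) == S_FALSE)
    {
        //-- busy wait.
    }

    if (visiblePixels > 0)
    {
        //-- the object is visible, render the real geometry here.
    }
    query->Release();
}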

Now we know the answers to both questions, so let's think about how SOC can help us in this situation. Over the last several years CPUs have increased their core count fairly quickly, and now it's typical for a gamer's PC to have a CPU with 4 or more cores. Unfortunately, PC game developers are not as experienced in multicore programming as our colleagues from the console world, where the Xbox 360 offers three general-purpose cores (six hardware threads) and the PS3's Cell provides a PPE plus several SPEs. On the PC a typical game engine uses only two threads (there are exceptions of course, and the most advanced studios implement much more elaborate multithreading): the first one is the entry point of the game where the main command flow is executed, and the second one is the so-called background task thread, usually for loading content in the background so that we don't interrupt the main thread and cause a short fps drop. But CPU utilization is not uniform across these threads: while the main thread utilizes almost 100% of one CPU core, the second thread usually uses 20-40% of a core. So typically we use less than 70% of a dual-core processor, about 35% of a quad-core and only 15-20% of an eight-core CPU. That shows that we have a lot of unused processing power which we can use to improve our game and make it even better. As I mentioned earlier, SOC is very well suited for parallel processing, so we can relatively easily use this processing power to accelerate the OC algorithm. Moreover, since we do occlusion culling fully on the CPU, we don't see any negative influence of CPU-GPU synchronization.

So let's think about what we need to implement a fairly good SOC. We will do this step by step, describing the common principles along the way. Here is the list of steps to achieve our goal:
  • A simple software rasterization algorithm.
  • A simple single-threaded software occlusion culling algorithm.
  • SIMD optimization of the rasterization and culling algorithms.
  • A job system to parallelize SOC effectively.
  • An automated tool to generate occluder geometry from the render geometry without artist involvement.

That doesn't sound too hard, but as always the devil is in the details.
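To give a taste of where we are heading, here is a minimal, unoptimized sketch of the heart of the first two steps: a tiny CPU-side depth buffer into which occluder triangles are rasterized, plus a conservative screen-space rectangle test for a potential occludee. Everything here is illustrative: the names are mine, the triangles are assumed to be already projected to screen space (x, y in pixels, z in [0, 1]) with a consistent winding, and there is no SIMD or multithreading yet.

#include <algorithm>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

class SWDepthBuffer
{
public:
    SWDepthBuffer(int w, int h) : m_w(w), m_h(h), m_depth(w * h, 1.0f) {}

    //-- rasterize one occluder triangle with a simple edge-function test,
    //-- keeping the nearest depth per pixel.
    void rasterizeOccluder(const Vec3& a, const Vec3& b, const Vec3& c)
    {
        float area = edge(a, b, c);
        if (area <= 0.0f) return; //-- back-facing or degenerate.

        int minX = std::max(0,       (int)std::floor(std::min({a.x, b.x, c.x})));
        int maxX = std::min(m_w - 1, (int)std::ceil (std::max({a.x, b.x, c.x})));
        int minY = std::max(0,       (int)std::floor(std::min({a.y, b.y, c.y})));
        int maxY = std::min(m_h - 1, (int)std::ceil (std::max({a.y, b.y, c.y})));

        for (int y = minY; y <= maxY; ++y)
        for (int x = minX; x <= maxX; ++x)
        {
            Vec3 p = { x + 0.5f, y + 0.5f, 0.0f };
            float w0 = edge(b, c, p), w1 = edge(c, a, p), w2 = edge(a, b, p);
            if (w0 < 0.0f || w1 < 0.0f || w2 < 0.0f) continue;

            float z = (w0 * a.z + w1 * b.z + w2 * c.z) / area;
            float& dst = m_depth[y * m_w + x];
            dst = std::min(dst, z);
        }
    }

    //-- conservative test: an object is culled only if every pixel of its
    //-- screen-space rectangle is hidden behind nearer occluder depth.
    bool isOccluded(int minX, int minY, int maxX, int maxY, float nearestZ) const
    {
        minX = std::max(minX, 0);       minY = std::max(minY, 0);
        maxX = std::min(maxX, m_w - 1); maxY = std::min(maxY, m_h - 1);

        for (int y = minY; y <= maxY; ++y)
        for (int x = minX; x <= maxX; ++x)
            if (nearestZ < m_depth[y * m_w + x]) return false; //-- potentially visible.

        return true;
    }

private:
    static float edge(const Vec3& a, const Vec3& b, const Vec3& p)
    {
        return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
    }

    int m_w, m_h;
    std::vector<float> m_depth;
};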

Anti-aliasing techniques. Part II: Image-based anti-aliasing techniques.

I'm late with this part. I had even started writing it, but I was really busy at work and with my pet project. About a month ago I found a really good introduction to the existing image-based anti-aliasing techniques, along with a lot of papers, so I decided not to describe them here myself and just post the link: http://iryoku.com/aacourse/

Wednesday, June 15, 2011

Deferred decals

In this article I will try to explain a popular solution for applying additional detail to geometry without changing it. This approach is often called "decals" or "spots" and is widely used in almost every commodity game to add extra detail to static and dynamic geometry and to apply dynamic effects like bullet and explosion holes, blood splashes and so on.
When I first tried to implement a decal system, I couldn't find any suitable paper on how to implement this technique. I spent a lot of time searching for useful info on different forums, but didn't find anything except one article in the GPG 2 book named "Applying Decals to Arbitrary Surfaces" by Eric Lengyel. So I implemented this technique and it worked, but a lot of artifacts (hard-to-tune z-bias and so on) made me sad. Moreover, this approach is only suitable for static geometry and completely unsuitable for dynamic skinned geometry, because in that case we have to recalculate the decal's vertices in sync with the animation frames. There is a workaround for the skinned-model problem, but it is a completely different technique and it makes implementation and support of the decal system very expensive and hard.
But when many game developers started using a new approach to rendering called DS (Deferred Shading) and its different variations like LPP, DL and so on, a new approach was proposed for applying decals to static and dynamic skinned geometry in almost the same simple and intuitive way.
The basic idea of the algorithm is very simple. If you have heard about Humus's "Volume Decals", we are basically done; if not, we are almost done. The only difference between this approach and Humus's is which kind of texture we use and how we calculate the texture coordinates.
Before talking about the idea of deferred decals, I'll quickly explain the common decal approach. What is a decal? It's a spot on some surface which blends with the original surface's color. To put a decal on a desired surface without changing anything in the original surface data, we have to clone the part of the surface's geometry which intersects the decal's volume, render these vertices over the original surface and finally blend the decal's color with the surface's color. This process may be either simple or very hard, depending on the surface type. For an almost flat surface, finding the decal's geometry is relatively simple, but for a curved surface it may not be. To find the right surface vertices for the decal, you need some collision data. Usually developers use a third-party physics engine to find the approximate position of the decal on the desired object, and then use a detailed geometry representation to do the precise vertex selection and clipping. There are a lot of different approaches: some engines keep a detailed geometry representation in the collision scene via concave triangle meshes (many physics engines support this for the static collision scene only, but not for dynamic objects, because fully updating a highly optimized internal representation is too expensive), other engines use the physics engine to determine an approximate position of the decal and then use, for example, a BSP model (or something else) to determine the set of triangles which belongs to the decal.
For static geometry this relatively expensive calculation is needed only once and may even be precomputed, but for skinned geometry we would need to recalculate it almost every frame, and this is very bad. Moreover, even for static geometry it is not always possible to precompute or avoid recalculation of the decal geometry. A simple example is terrain with a dynamic level of detail: every time the terrain changes its LOD you have to recalculate the decal's vertices, or you will see terrible z-fighting where the decal is. We have to find a better solution. Currently I know 4 approaches:

  • Use an additional per-vertex color stream which is unique for every model in the scene even if the models share other geometry data like positions, texture coordinates, vertex weights for animation and so on. Into this additional stream we dynamically write the decal's color. The process is very similar to adding decals to static geometry: we get the transformed geometry for the current frame, find the vertices affected by the decal, and write the color associated with this decal and its properties into our additional stream. If you do skinning on the CPU like DooM 3 does, you already have the transformed geometry on the CPU side; but if you use GPU hardware skinning like FarCry, you only have to perform the transformation on the CPU once, to get access to the transformed geometry and fill the color stream with the decal's color, and then do all transformations on the GPU as usual. This method depends heavily on the tessellation of the geometry, because it stores color per vertex, not per pixel. That means it is very similar in quality to per-vertex lighting, and it's only suited for adding very blurred spots, not detailed ones.
  • Similar to the previous one, but instead of writing color into an additional vertex stream, gather all vertices affected by a decal and animate them the same way as the original geometry. It may be difficult to batch decals in this case, because for each decal we need to send the skeleton transformation matrices to the shader.
  • Based on the approximated collision model of the object, find the position, direction and size of a decal. With this info we can represent the decal as a box which has the position, the direction as its orientation and the size as its scale along each of the three axes. As soon as every decal is represented by such a box, we can reuse this info to fully reconstruct decals on an arbitrary surface entirely in the vertex and pixel shaders. The rest is very similar to texture projection or shadow mapping techniques. We can imagine that the box associated with each decal is a small camera from which we can see only a little piece of the object's geometry. So if we construct view and projection matrices for this box (this is very easy and I'll show it later), we can transform the object's vertices from world space into the decal's virtual space. Now we have the clip-space positions of the object's surface points for each box (decal). All that is left to do is turn the clip-space coordinates into texture coordinates and sample the decal's texture with them. Of course, after we have fetched the decal color we can easily blend it with the main surface's color. All this work may be executed either in the main shader (here you'll have a very limited number of possible decals because of the instruction count limit) or in an additional shader with an additional pass.
  • Use an additional texture for every model affected by decals. To add a decal, all we need is to find the vertices affected by it and then draw these vertices into an additional render target, using the vertices' texture coordinates as positions in clip space. This is essentially rendering into the model's texture unwrap, and on the output we have a texture with the decal color. During the main shading pass we sample this texture with the same texture coordinates as the diffuse texture. This approach has a lot of drawbacks: it uses a lot of memory, needs run-time mip-map generation, requires additional work in case of a lost device in DX9 and so on.
      The deferred decals approach is based on texture projection but extends it so that almost no additional geometry is needed for the decals at all. With deferred decals we don't need very detailed collision objects; all we need is a very approximate collision object and access to the scene depth from the pixel shader (an ability present in every DS pipeline).
Each decal in the deferred decals approach may be represented by a box shape with the desired scale, orientation and position (as I described earlier). We just render this box with special shaders that perform the texture projection. In the vertex shader, based on the incoming parameters, we reconstruct the view-projection matrix:
//-- 2. compute world matrix.
//-- then we premultiply it by the scale matrix. world = scale * rotateTranslate.
float3 scale = m_scale;
float3 pos   = m_pos;
float3 zAxis = normalize(m_dir.xyz);
float3 yAxis = normalize(m_up.xyz);
float3 xAxis = cross(yAxis, zAxis);
float4x4 scaleMat = 
{
{scale.x, 0, 0, 0},
{0, scale.y, 0, 0},
{0, 0, scale.z, 0},
{0, 0, 0, 1}
};
float4x4 worldMat = 
{
{xAxis, 0},
{yAxis, 0},
{zAxis, 0},
{pos,   1}
};
//-- 3. final world matrix.
worldMat = mul(scaleMat, worldMat);
//-- 4. compute data for the decal view matrix.
float4x4 lookAtMat = 
{
{xAxis.x, yAxis.x, zAxis.x, 0}, 
{xAxis.y, yAxis.y, zAxis.y, 0},
{xAxis.z, yAxis.z, zAxis.z, 0},
{-dot(xAxis, pos),  -dot(yAxis, pos), -dot(zAxis, pos), 1}
};
//-- 5. compute data for the decal proj matrix.
float4x4 projMat = 
{
{2.0f / scale.x, 0, 0, 0},
{0, 2.0f / scale.y, 0, 0},
{0, 0, 1, 0},
{0,   0, 0, 1}
};
//-- 6. calculate final view-projection decal matrix.
float4x4 viewProjMat = mul(lookAtMat, projMat);
Then in the pixel shader, using this view-projection matrix, we calculate the texture coordinates for sampling the decal's texture (very similar to what we do for shadow mapping or texture projection):
//-- calculate texture coordinates and clip coordinates.
float2 texCoord = i.pos.xy * g_screenRes.zw;
//-- reconstruct pixel’s world position.
float3 pixelWorldPos = reconstructWorldPos(texCoord);

//-- reconstruct decal view-proj matrix.
float4x4 decalViewProjMat = { i.row0, i.row1, i.row2, i.row3 };

//-- calculate texture coordinates for projection texture.
float4 pixelClipPos = mul(float4(pixelWorldPos, 1.0f), decalViewProjMat);
pixelClipPos.xy /= pixelClipPos.w;
//-- accessing decal’s texture.
//-- Note: CS2TS is helper function performing conversion
//-- from Clip Space To Texture Space.
float4 oColor = sample2D(diffuse, CS2TS(pixelClipPos));
That is all. Of course you can perform the computation of the view-projection matrix on the CPU side, but I decided to eliminate the CPU overhead and move all computation to the GPU. There are a lot of optimizations that may be performed to save processing power, but for clear understanding I wrote the code in the cleanest way.
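For completeness, here is a hedged CPU-side sketch of the same matrix construction, in case you would rather upload the final matrix as a shader constant. float3 and float4x4 here are tiny illustrative helper types following the same row-vector convention as the HLSL above, and the decal axes are assumed to be already normalized and orthogonal.

struct float3   { float x, y, z; };
struct float4x4 { float m[4][4]; };

static float3 cross(const float3& a, const float3& b)
{
    return { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x };
}
static float dot(const float3& a, const float3& b)
{
    return a.x*b.x + a.y*b.y + a.z*b.z;
}

//-- builds the view-projection matrix of the decal box's "virtual camera".
float4x4 decalViewProj(const float3& pos, const float3& zAxis,
                       const float3& yAxis, const float3& scale)
{
    float3 xAxis = cross(yAxis, zAxis);

    //-- look-at part: rows are the box axes, the last row is the translation.
    float4x4 view = {{
        { xAxis.x,          yAxis.x,          zAxis.x,          0.0f },
        { xAxis.y,          yAxis.y,          zAxis.y,          0.0f },
        { xAxis.z,          yAxis.z,          zAxis.z,          0.0f },
        { -dot(xAxis, pos), -dot(yAxis, pos), -dot(zAxis, pos), 1.0f },
    }};

    //-- orthographic projection sized to the box; because it is diagonal,
    //-- multiplying view * proj just scales the first two columns.
    float4x4 vp = {};
    for (int r = 0; r != 4; ++r)
    {
        vp.m[r][0] = view.m[r][0] * (2.0f / scale.x);
        vp.m[r][1] = view.m[r][1] * (2.0f / scale.y);
        vp.m[r][2] = view.m[r][2];
        vp.m[r][3] = view.m[r][3];
    }
    return vp;
}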
Now you know how to use deferred decals, but there are some optimizations:
    • Using instancing for decals in conjunction with a texture array allows rendering all decals in one draw call (in the demo I use this approach).
    • If you're using a DS pipeline you can blend decals directly into the main normal and albedo buffers, so you save memory and performance, because you don't need to read an extra buffer later during the lighting pass.
You also have to know that deferred decals have some drawbacks; one of the most important is keeping a decal from influencing geometry it was not meant for (look at the image).
In the image you can see an artifact when both the terrain and the barrel model are inside the decal's volume: in one case I want the decal to affect only the terrain, and in the other only the barrel but not the terrain. There is a solution, and its basic idea is to add an additional channel to the g-buffer representing the object's unique index (0-255) in the current frame. In the same way, each decal has an associated index which says which kind of objects this decal is allowed to affect. I have a good feeling about this method and want to try to implement it in the near future. Moreover, this additional object-index information may be reused for material selection during the lighting pass.
As with every technique, the deferred decals approach has its pros and cons.
Pros:
  • No problems with z-fighting at all, due to the projective nature of the technique.
  • Combines the decal approaches for static and dynamic geometry into one simple solution.
  • No detailed geometry representation is needed to find the exact vertices; a very approximate collision shape is enough.
  • No additional work is needed for lighting, because the lights' influence is handled during the DS lighting pass.
  • All DS lighting optimizations work for it as well.
Cons:
  • Some additional work is needed to deal with the inherent issues of the texture projection technique.
  • Unwise usage may result in very high overdraw (for example a very big decal on terrain), but we can apply the optimizations from the DS light resolving pass because both approaches are very similar.
  • Works well only with DS-oriented pipelines (including forward shading with a z-pre-pass, but with additional memory usage).
Some screenshots and of course the demo with the shader code (anyone who wants the C++ code may mail me and I will send it):
P.S. The second part of the "Anti-aliasing techniques" post is still in progress.

Thursday, March 3, 2011

Anti-Aliasing techniques. Part I: Introduction.


Here I want to talk about different techniques related to anti-aliasing.
Before we continue, it would be good to understand what aliasing is and why it happens at all. Aliasing in computer graphics is an instance of a more general problem studied by the mathematical discipline called "Sampling and Filtering Theory", and the set of techniques for dealing with it are the anti-aliasing techniques. Sampling is the process of converting a continuous signal into a discrete one; the opposite process is called reconstruction. During sampling some part of the original information is lost because the number of samples is finite. To understand what sampling is and how to choose the samples correctly, look at image #1.

I got the idea for this explanation of the sampling issue from the wonderful book "Real-Time Rendering" by Tomas Akenine-Moller, Eric Haines and Naty Hoffman. Because I can't distribute any part of the book, even images, I wasn't lazy and drew my own picture, which I think will help.

The image shows the so-called "wheel spin paradox", which you may often see in movies. The paradox is that a wheel may appear to spin in the direction opposite to its real rotation, or may even seem motionless. The first row shows the original signal. In the second row you can see the signal sampled at a very low frequency, so the wheel looks like it rotates in the opposite direction. The third row shows the signal sampled at exactly twice the original frequency; it looks like the wheel doesn't move at all. And in the last row the signal is sampled at a frequency slightly higher than twice the original. Notice how well the original signal is sampled in that case. So to sample the original signal well, it is absolutely necessary to have the sample rate at least slightly higher than twice the original signal frequency. This minimum sampling rate is called the Nyquist rate or Nyquist limit.

Now that we understand the nature of aliasing, let's see what happens when we transform objects from a 3D representation into a 2D one. You probably know that a line segment consists of an infinite number of points. The same is true for a rectangle; it also contains an infinite number of points. Moreover, if the length of a line segment equals the perimeter of a rectangle, both contain an infinite and equal number of points. All of this applies to 3D as well. So there is no way to transform a 3D object into 2D without losing some information, unless we had an infinite 2D representation. But since almost all commodity monitors have a finite number of 2D points (this value is usually called the screen resolution), we can't do that. To be fair, having an infinite number of 2D points, i.e. an infinitely big screen resolution, isn't really necessary: as we increase the screen resolution we mostly add detail to small or distant objects, while big objects, which cover a relatively large part of the screen, improve only very slightly, because their signal frequency is already comparable to the sampling rate. No matter how fine this 2D grid is, we can always choose a 3D object small enough that it can't be properly covered by the grid. Moreover, a very big screen resolution is very inefficient in terms of memory, bandwidth and computation. The bandwidth bottleneck has become more important in the last couple of years due to the use of deferred rendering techniques, which require a very fat back buffer.

I want to give you some feeling for the memory usage. Let's take a typical screen resolution used in computer games, 1680x1050. To render at it we need at least 1680 x 1050 x (4 bytes) x 3 ~= 20 MB of memory. Here I've assumed double buffering and one 32-bit depth buffer. If we just double the resolution in both dimensions, memory usage becomes ~20 * 4 = 80 MB (note that this is roughly the memory usage of FSAA 2x or MSAA x4). Doubling again gives 320 MB (FSAA 4x, comparable to a theoretical MSAA x16), then 1280 MB (FSAA 8x), 5120 MB (FSAA 16x) and so on. As you can see, this quickly becomes impossible on commodity hardware, where we typically have one or two gigabytes of video memory, because a game consists not only of back buffers but also of meshes, textures, other render targets (shadow maps, environment maps and so on) and everything else that makes your favorite game interesting.
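If you want to play with these numbers yourself, here is a tiny back-of-the-envelope helper, purely illustrative and using the same assumption of double buffering plus a 32-bit depth buffer (three 4-byte surfaces):

#include <cstdio>

double framebufferMB(int width, int height, int samplesPerPixel)
{
    const double bytesPerPixel = 4.0; //-- RGBA8 or a 32-bit depth format.
    const double surfaces      = 3.0; //-- two color buffers + depth buffer.
    double bytes = double(width) * height * samplesPerPixel * bytesPerPixel * surfaces;
    return bytes / (1024.0 * 1024.0);
}

int main()
{
    std::printf("1x  : %.0f MB\n", framebufferMB(1680, 1050, 1));  //-- ~20 MB
    std::printf("4x  : %.0f MB\n", framebufferMB(1680, 1050, 4));  //-- ~81 MB
    std::printf("16x : %.0f MB\n", framebufferMB(1680, 1050, 16)); //-- ~323 MB
    return 0;
}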

So now we know that when we convert a 3D object into 2D we lose some of the original information. This tells us that the process of showing 3D graphics on a display is also a kind of sampling task, called rasterization. There are a lot of different techniques for this task, and Google has a lot of information about them; here I'll just give some basic information (image #2).

First of all, the rasterizer is just one of many stages in the graphics pipeline; it runs right before the execution of the fragment shaders. The input of the rasterizer is a set of vertices in clip space and information about which primitive they form together: a line, a triangle, a polygon with more than three vertices or something else. The output is a set of pixels ready to be passed to the next stage. For example, say the rasterizer receives three vertices that form a triangle. Connecting the vertices sequentially gives a wireframe representation of the triangle, but we want a solid triangle, so the rasterizer projects it onto the 2D grid. During this process we lose information about the original triangle, because only pixels whose centers lie inside the triangle boundary are passed on. So even if a pixel overlaps the triangle boundary, if its center is not inside the triangle this pixel is dropped from the pipeline forever. This is an important note, because later you will see that MSAA uses a slightly different approach here.

To address this problem there are two common and relatively old solutions; to be precise three, but the third one is very expensive for real-time graphics. They are FSAA, MSAA and the accumulation buffer with jittering of the camera projection matrix. As I already said, the last solution is very expensive in terms of performance. The idea behind it is very simple: before we display the final image, we render the same scene multiple times into a so-called accumulation buffer (which was commonly implemented in hardware and has slightly more bits per component to avoid color clamping), each time with the projection matrix jittered by a sample pattern. Then we resolve the accumulation buffer into the back buffer by just dividing each color by N, where N is the number of scene draws. It works great and gives a stable, pleasing result, but performance is very poor. Maybe this approach is good or even excellent for movies, but not for games.

The first more acceptable method is FSAA, which stands for Full-Screen Anti-Aliasing; it is also known as SSAA (Super-Sampling Anti-Aliasing). The basic idea is to give each pixel some additional resolution. This is achieved by drawing the final image at a bigger resolution (2x, 3x, 4x), so for every pixel of the original, unscaled image we have a 2x2, 3x3 or 4x4 block of sub-pixels. To get the final color of the pixel we usually take the simple arithmetic average of all these sub-pixels. It looks pretty good, but the memory consumption and performance cost make this approach almost unusable in real-time graphics applications like games, though there are some exceptions.
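As an illustration of that resolve step, here is a minimal sketch of a 2x2 box-filter downsample on the CPU. The Color type and buffer layout are my own assumptions; real FSAA does this on the GPU, of course.

#include <cstdint>
#include <vector>

struct Color { uint8_t r, g, b, a; };

//-- averages every 2x2 block of sub-pixels of the super-sampled image
//-- into one pixel of the final image.
void resolve2x2(const std::vector<Color>& src, int srcW, int srcH,
                std::vector<Color>& dst)
{
    const int dstW = srcW / 2, dstH = srcH / 2;
    dst.resize(dstW * dstH);

    for (int y = 0; y < dstH; ++y)
    for (int x = 0; x < dstW; ++x)
    {
        int sum[4] = { 0, 0, 0, 0 };
        for (int sy = 0; sy < 2; ++sy)
        for (int sx = 0; sx < 2; ++sx)
        {
            const Color& c = src[(y * 2 + sy) * srcW + (x * 2 + sx)];
            sum[0] += c.r; sum[1] += c.g; sum[2] += c.b; sum[3] += c.a;
        }
        dst[y * dstW + x] = { uint8_t(sum[0] / 4), uint8_t(sum[1] / 4),
                              uint8_t(sum[2] / 4), uint8_t(sum[3] / 4) };
    }
}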

The second solution is MSAA, which stands for Multi-Sample Anti-Aliasing. It is present in almost every commodity graphics card and has a hardware implementation. This technique also tries to give each pixel additional sub-pixel information, but unlike FSAA it does so in a more performance-friendly form. With MSAA enabled the pixel stops being the atomic unit of the rasterizer; each pixel now has its own internal resolution depending on the MSAA mode. This additional information is stored in an additional buffer called the sample buffer, which contains color/depth/stencil for every sample. So with the same number of samples as FSAA, the memory requirements are equal. To minimize the shading overhead, i.e. the number of fragment shader executions, this approach doesn't run the fragment shader for every individual sample; instead it runs it only once per pixel, and this single computed color is then written to all the samples of the pixel which the primitive covers.

A further optimization is CSAA, which stands for Coverage Sampling Anti-Aliasing. While MSAA decouples shading cost from the number of samples, CSAA goes further and decouples the coverage mask from the sample color/depth/stencil storage. So, for example, CSAA x8 stores only 4 samples with color/depth/stencil and 8 bits of coverage mask, while MSAA x8 uses 8 samples, each with its own color/depth/stencil. CSAA relies on the assumption that if we want further quality improvements without further performance reduction, the coverage mask of a pixel is much more important than additional color and depth information. CSAA provides picture quality very similar to the corresponding MSAA mode while using a lot less memory.

The main disadvantage of this method is that it can't perform anti-aliasing on alpha-tested edges; unlike FSAA, it works only on geometry edges. There is a technique called Alpha to Coverage which tries to solve this, but it is available only on a limited range of graphics cards; Nvidia has official support for it starting from the GeForce 7 series and DirectX 10. Maybe I'll come back to this topic later.

That is all for the introduction. Next time we'll talk about Deferred Shading and the anti-aliasing techniques for this approach, which has become popular over the last several years.

Wednesday, March 2, 2011

This is just the beginning...

Hi, my name is Bronislaw Sviglo.
Before I continue I want to note that I'm not a native English speaker, so I will make a few (I hope :) ) mistakes. But anyway, I decided to use English as the primary language for blogging, because this decision gives me an opportunity to improve my English in the future, and because every programmer should know English.
Currently I'm working as a graphics programmer at wargaming.net. I began this blog because over the last couple of years I've accumulated a lot of ideas and I decided it would be cool to gather them together in one place. That will help me quickly find my old ideas, and maybe it will help someone else too. I will always try to make a little demo in my own framework, to not just "solve the spherical problem in a vacuum" for the specific topic, but to give a real example of an implementation and prove (or disprove) that the technique has a right to exist.