Realtime global illumination

April 2, 2011

I’ve had an idea floating around in my head for several months now, but evening classes, a hard drive failure, then a GPU failure prevented me from doing much about it until this weekend. First, a couple of screenshots of what I’ll be talking about.

Cornell box, I choose you!

The above screenshots are of real-time single-bounce GI in a static scene with fully dynamic lighting. There are 7182 patches, and the lighting calculations take 36ms per frame on one thread of an Intel Core 2 @2.4 GHz. The code is not optimized.

The basic idea is simple and is split into two phases.

A one-time scene compilation phase:

  • Split all surfaces in the scene into roughly equal sized patches.
  • For each patch i, build a list of all other patches visible from i along with the form factor between patches i and j:
    F_{ij} = \frac{cos \phi_i \, cos \phi_j}{\pi \, |r|^2} \, A_j
    Fij is the form factor between patches i and j
    Aj is the area of patch j
    r is the vector from i to j
    Φi is the angle between r and the normal of patch i
    Φj is the angle between r and the normal of patch j

And a per-frame lighting phase:

  • For each patch i, calculate the direct illumination.
  • For each patch i, calculate single-bounce indirect illumination from all visible patches:
  • I_i = \sum_j D_j \, F_{ij}
    Ij is the single-bounce indirect illumination for patch i
    Dj is the direct illumination for patch j

So far, so radiosity. If I understand Michal Iwanicki’s GDC presentation correctly, this is similar to the lighting tech on Milo and Kate, only they additionally project the bounce lighting into SH.

The problem with this approach is that the running time is O(N2) with the number of patches. We could work around this by making the patches quite large, running on the GPU, or both. Alternatively, we can bring the running time down to O(N.log(N)) by borrowing from Michael Bunnell’s work on dynamic AO and cluster patches into a hierarchy. I chose to perform bottom-up patch clustering similarly to the method that Miles Macklin describes in his (Almost) realtime GI blog post.

Scene compilation is now:

  • Split all surfaces in the scene into roughly equal sized patches.
  • Build a hierarchy of patches using k-means clustering.
  • For each patch i, build a list of all other patches visible from i along with the form factor between patches i and j. If a visible patch j is too far from patch i look further up the hierarchy.

And the lighting phase:

  • For each leaf patch i in the hierarchy, calculate the direct illumination.
  • Propagate the direct lighting up the hierarchy.
  • For each patch i, calculate single-bounce indirect illumination from all visible patches clusters.

Although this technique is really simple, it supports a feature set similar to that of Enlighten:

  • Global illumination reacts to changes in surface albedo.
  • Static area light sources that cast soft shadows.

That’s basically about it. There are a few of other areas I’m tempted to look into once I’ve cleaned the code up a bit:

  • Calculate directly and indirect illumination at different frequencies. This would allow scaling to much larger scenes.
  • Perform the last two lighting steps multiple times to approximate more light bounces.
  • Project the indirect illumination into SH, HL2 or the Half-Life basis.
  • Light probes for dynamic objects.

You can grab the source code from here. Expect a mess, since it’s a C++ port of a C# proof of concept with liberal use of vector and hash_map. Scene construction is particularly bad and may take a minute to run. You can fly around using WASD and left-click dragging with the mouse.


16 Responses to “Realtime global illumination”

  1. Miles Says:

    Hi Tom, you have probably already tried it but you can cache the indirect lighting between frames to get multiple bounce effects for “free”.

    Usually the lighting environment won’t be changing fast enough for the latency to be a problem and it should converge pretty quickly.

    Looks great though!

  2. Tom Madams Says:


    I’ve not actually tried inter-frame caching yet, I want to clean the code up a bit first. Well, a lot actually; it’s an embarrassment right now. Ideally, each bounce should just be a couple of loops over flattened arrays and then a vertex buffer/light map update.

  3. MJP Says:

    I’ve been playing around with a similar point-based approach, except I pre-calculate disc-to-disc visibility and do the runtime steps in a compute shader. At the moment I have it down do ~4.1ms for a scene with 20k sample points, and it’s completely bandwidth bound so I should be able to get that time lower by packing the data formats a little tighter. I haven’t even bothered with the hierarchical stuff yet, since the pre-computed visibility greatly reduces the number of calculations anyway. I was thinking that for scaling it up to bigger scenes, I could do some kind of real-time visibility determination for the sample points so that I only update the indirect lighting for points that actually need it for that frame.

    Anyway, very cool stuff! There’s definitely a lot of promising avenues to explore.

    • Tom Madams Says:

      Well it looks like you’ve got me beat by quite a way. Any chance that there’s a blog post about it coming up soon? I’d like to see some pretty pictures!

      What kind of test scene are you using? In my rather artificial setup, precomputed visibility only reduces the number of patch-to-patch interactions by about 33%, which doesn’t help that much. I can easily imagine that in more real-world scenes there’s a much higher rate of occlusion.

  4. MJP Says:

    I think it’s a little ways off from being safe for public consumption…it’s still riddled with hacks and commented-out code.

    The test scene isn’t too different from yours…it’s a floor with three colored walls, with a sphere, torus, and cone in the center. It really wasn’t visibility that helped out a lot for me, it was actually pre-computing the form factors. This let me reject points below a certain threshold, and also kept from having to sample the buffer containing sample point positions + normals + colors in the compute shader.

  5. Robin Green Says:

    Good grief, not the HL2 basis please. Ick.

  6. Tom Madams Says:


    You’re right of course, looking at the comparison images from “Efficient Irradiance Normal Mapping”, the HL2 basis gives worse results than I remember.

  7. Szymon Swistun Says:

    Hey Tom, did I see you are using a “tree” structure in there? We talked about this… I think you are doing it wrong 🙂

    Good stuff though. Would be nice to spread the cost of updating the patches / form factor relationships over several frames ( or frequencies ) and be able to introduce dynamic geo somehow into the mix, without special casing etc…

  8. Tom Madams Says:

    Things are a lot easier when you know your code never has to survive full production 😉

    I’m reworking the whole thing at the moment to use an octree (!) of SH probes, similar to the point-based approximate color-bleeding paper. I’m not sure if the quality will be up to snuff, but I think it would be easier to make dynamic updates work.

  9. This may sound like a nooby question, but, what is the usual way to generate the patches of a given scene? I’ve read a few articles on how GI is computed for each patch or point given an average light intensity computed on renders, piece of cake stuff for me to understand, but nothing mentioning on how the surfaces are actually generated. Which probably makes sense to them since the lighting is the focus of the article :p

    So how do you split your surfaces (before you get to the clustering phase to organize them)? And does your method work for any surfaces or just planar ones like in the Cornell Box?

    Your CPU is similar in spec to mine (Core 2 at 2.2ghz) so I’ve been meaning to try out some GI work to go with my deferred renderer.

    • Tom Madams Says:

      This is something I’ve never actually done myself, but I believe a common approach is to perform a UV unwrapping process similar to one used during lightmap generation. The DirectX SDK contains UVAtlas, which I’ve heard good things about:

    • chris Says:

      I’m more interested in knowing how you did the surface splitting. The results look quite similar to the ones shown here:

      I’m actually using XNA so I can’t use the D3DX library to use UVatlas functions. So other than that, I haven’t been able to find an algorithm that splits up surfaces into patches.

      • Tom Madams Says:

        Well, in my test case all the surfaces were quads, so generating the patches was trivial.

        Are you really interesting in splitting your geometry? These days, I think it’s more common to parameterize with textures (you can think of the texels as your patches if that makes it easier to visualize). UVAtlas can run off-line, so it doesn’t matter whether you’re using XNA or not.

        • Ah I see now, well I just prefer to use a managed library, but good to know that about UVAtlas. I’ll probably start using quads only, since they really are trivial to map (and you can still do a lot with quads). There’s still a lot that can be made using quads. Thanks for the visualization, that’s how I pictured the patches in your screenshots, like texels without a filter.

  10. Akres Says:

    Hi, could i use the image in my thesis? It illustrates nicely the artifact created by simple radiosity.
    Thanks 🙂

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: