Importance sampling and example

1. Importance sampling idea

We have random variable X_{1} , it has pdf (probability density function) p(x),

we have known
E\left ( f\left ( X_{1} \right ) \right )= \int f\left ( x \right )p\left ( x \right )dx

And according Monte Carlo estimator, if we want evaluate function f(x) by samples from p(x), we got
E\left ( f\left ( X_{1} \right ) \right ) \approx \frac{1}{n}\sum_{1}^{n}f\left ( x_{i} \right ) where x_{i} is sample from X_{1}

Let’s say we want( or can only) to sample from another distribution function q(x), but we still want to get the value E\left ( f\left ( X_{1} \right ) \right ) by Monte Carlo method, what we should do?

For clear, we use another random variable X_{2} , its pdf  q(x)

A little trick
E\left ( f\left ( X_{1} \right ) \right )= \int f\left ( x \right )p\left ( x \right )dx=\int f\left ( x \right )\frac{p\left ( x \right )}{q\left ( x \right )}q\left ( x \right )dx

it just looks like we want evaluate function f\left ( x \right )\frac{p\left ( x \right )}{q\left ( x \right )} by samples from q(x), using the new random variable X_{2} , we have

\int f\left ( x \right )\frac{p\left ( x \right )}{q\left ( x \right )}q\left ( x \right )dx = E\left ( f\left ( X_{2} \right )\frac{p\left ( X_{2} \right )}{q\left ( X_{2} \right )} \right )

According Monte Carlo like above, we should have
E\left ( f\left ( X_{2} \right )\frac{p\left ( X_{2} \right )}{q\left ( X_{2} \right )} \right ) \approx \frac{1}{n}\sum_{1}^{n}f\left ( x_{i} \right )\frac{p\left ( x_{i} \right )}{q\left ( x_{i} \right )}

so we have
E\left ( f\left ( X_{1} \right ) \right ) \approx \frac{1}{n}\sum_{1}^{n}f\left ( x_{i} \right )\frac{p\left ( x_{i} \right )}{q\left ( x_{i} \right )}

where x_{i} is sample from X_{2}

this is what importantce sampling does, it uses samples from another distribution funtion, but still evaluate the old function for old random variable.

2. why Importance sampling can do better

let’s caculate variance on

\sigma ^2 \left ( \frac{1}{n}\sum_{1}^{n}f\left ( x_{i} \right ) \right ) = \frac{1}{n^2} \sigma ^2 \left ( \sum_{1}^{n}f\left ( x_{i} \right ) \right ) = \frac{1}{n} \sigma ^2 \left ( f\left ( x \right ) \right )

\sigma ^2 \left ( \frac{1}{n}\sum_{1}^{n}f\left ( x_{i} \right )\frac{p\left ( x_{i} \right )}{q\left ( x \right )} \right ) = \frac{1}{n} \sigma ^2 \left ( f\left ( x \right ) \frac{p\left ( x \right )}{q\left ( x \right )} \right )

so it we select q(x) smartly, we can get a much smaller \sigma^2



A straight forward way to understand VSM

Why I want to re-understand VSM

In the original VSM paper,William and Andrew derived the MAGIC result from probability theory:



  • means the shadow map depth or shadow caster depth in shadow map
  • means the interpolated( filtered) value in shadow map
  • means the interpolated(filtered) value in shadow map
  • D means the shadow receiver’s depth)

The derivation looks very beautiful, but not so easy to understand , especially the idea to treat z as a random variable, so here I’ll try to explain it by middle school math, then we can kill the uncomfortable MAGIC feeling.


Firstly, let’s back to the problem why VSM had been invented : Soft Shadow.
In the real word,soft shadow are generated with a physic explanation(eg area light),
but visually, Soft shadow just means a smooth transition between shadowed area and unshadowed area.
so there are many methods to fake it in computer graphics, VSM is just one of them ( note: even with beautiful math, VSM is not a physic correct algorithm at all).

In the above graph, we have 2 neighborhood texels in shadow map corresponding to 2 light rays, their shadow depth value are Z1 and Z2,.
According shadow map algorithm, we can know pixel A is in shadow and pixel B is lit, what we care about is the area between A and B,

If applying regular shadow test, a sudden shadow density changing must happen at somewhere between A and B. Even we can use bilinear filter to fetch shadow, the smooth input value doesn’t generate a smooth output value because shadow testing result is a binary value. That is why we said shadow map is not filterable.

Let’s see what will happen if we apply the MAGIC shadow testing formula from VSM.

First, let’s introduce a interpolating variable t, t will be changing linearly (0-1) from pixel A to pixel B.
Then with linear filtering, between A and B, we can write:



here, we define:

Looks still mess, right?

Let’s see 2 special cases :

Special case 1: Perpendicular Reciever:

when receiver surface perpendicular to light direction (see reciever Rp)

For this scenario , we can easily know ( between t=0 and t=1)
Then final result:

So for this case, the shadow result will follow the most simple smooth transition when t change from 0 to 1 : Linear transition.

Special case 2: Reciever on interp(z):

When Shadow Reciever are on interp(z) ( see the red dash line in above picture).


we can know immediately:

Combine the two special cases,and the property of that Pshadow is a Monotonic function decreasing function to D (see Appendix)

For the area between interp(z) (red dash line) and Reciever Rp, we have:

we can say: PShadow is Monotonic changing between t and 1 along D decreasing. it’s a smooth transition too.

For the portion below Rp, the situation is same but the PShadow < t.

Short Summary

  • with t increasing to 1, PShadow will increase to 1 smoothly
  • with D decreasing to interp(z), PShadow will increasing to 1 smoothly too.
  • when ( receiver surfaceperpendicular to light) , it’s a linear transition.


From the result, we can say ShadowMap filtering start to make sense, since a interpolated shadow depth can generate a shadow value between 0 and 1. that’s why we say VSM is filterable.

we can go one step forward, if we blur the shadow map texture, the same transition effect will expand from 2 neighborhood shadow depth texels to more adjacent texel, then it results a more smooth shadow transition, that’s why we say VSM is pre-filtable.

Appendix:Pshadow is a Monotonic decreasing funtion to D.

Check the original define:

since if t is fixed,

is a fixed value,

so if D is increasing, then d is increasing, then Pshadow is decreasing,

Then we can know Pshadow is a Monotonic decreasing function to D.

PS. the original VSM paper can be downloaded at :

VSM “Variance”

[Tech Memo] Content Performance Profiler For Artists.

In game development,performance profiler is extremely helpful to optimize the game. but most Profiler is designed for programmers.

Actually artists(especially, shader/FX artists) need Profiler too.  unlike programmers who always want to find the reason behind of bad performance. What artists need is a tool which can verify their idea quickly,  if the performance doesn’t become good, then they go for another idea.

Obviously, Tools like pix can’t help them.

Read more of this post

[Tech Presentation] Baked Particle System

This is a new tech I added into our engine,  we call it “Baked Particle System”.

Not like traditional particle system which calculates everything on the fly, we have a pre-calculation stage (Baking).

In Baking stage, since we don’t need worry about CPU performance problem, we can do a bunch of stuffs that the real-time one can’t do.

After baking, we recording necessary information into vertex stream, playback it in runtime.

Here is a example video, for complex collision effects.

[demo video :Crazy bounce]

[demo video :Multi sprite Lighting]

Following is a PPT when I introduce the tech to my team internally, it give a briefing how the tech works,  I’d like to share it with you guys

PPT download

A brief on Pre-computed Particle system

[Tech Presentation] Fast Cascade Variance Shadow Map

Shadow rendering is always a problem in game developing, I worked on several titles in past years, there is always needs to write your own shadow for your game.  It’s all about Quality, Performance and memory.

what I shared here is for open space shooter engine. especially designed for Xbox360.

It supports:

  • Soft Edge (Variance Shadow map Tech)
  • Long Distance ( Optimized Cascaded Shadow map)
  • Stable (No shadow map rotation)
  • Fast ( 3ms include shadow caster rendering)

I share the ppt here, it come from one of internal presentations I did before. the target audience is for game developer  ( it also introduced some basic concepts for helping non-graphic programmers understanding).

hope it can help your own shadow system

Download PPT Cascade VSM Shadow

[Training Presentation] Power PC CPU Cache

For current generation game console,  they  all chose PowerPC as CPU architecture.

Even PowerPC framework is more clear and understandable than x86, it’s still complex if you want to know everything.

Fortunately,for most cases, what we only need to know  is :CPU CACHE  

This is a presentation I did in our team internally,  it explain some terms and concepts:

  • Cache line
  • Cache Mapping(Set Associativity)
  • D-Cache/I-Cache,Cache Levels
  • Cache Eviction Policy (LRU/PLRU)
  • Cache Write Policy (No-Allocation/Write Through, Write-Allocation/WriteBack)
  • LHS etc
Target Audience:  Junior/Mid level Programmer

Click follow Image to see sliders .(