Kajiya's Rendering Equation
===========================

Sebastien Loisel, zed@math.mcgill.ca

Overview
========

The basic idea is that we want to build a function I(X,Y), where X and Y are points in 3d, which tells you how much light is going from Y to X. Then, if the eye is at the origin, I(0,Y) gives you the intensity of the rays of light coming into the eye. This allows you to produce a picture.

It is important to stress here that I(X,Y) should be the 'final' amount of light going from Y to X, not merely the amount of light that Y emits towards X as a light source. We want I(X,Y) to include all the light generated from all the bounces that light makes in the scene.

It may at first seem hard to write down what I should then be, but a moment's thought will reveal that

   I(X,Y) = E(X,Y) + integral of I(Y,Z)*R(X,Y,Z) over all Z

is the equation we are looking for. In this equation, E(X,Y) is the light emitted from Y that gets directly to X (zero if X does not see Y). R(X,Y,Z) is a real number between zero and one which says what proportion of the ray I(Y,Z) hitting point Y gets reflected in the direction of X. That is, R(X,Y,Z)*I(Y,Z) is the intensity of the ray YZ after it gets bounced and becomes ray XY. The integral is simply the sum of all the contributions obtained from the bounces. Since the I(Y,Z) are defined the same way, we hope that this is indeed the formula we are looking for.

By writing

   L(I) = integral of I(Y,Z)*R(X,Y,Z) over all Z

(note that L(I) is again a function of X,Y), we can re-write the above equation as:

   I = E + L(I)

The engineer then moves on to simply say that:

   I - L(I) = E
   (1-L)I = E
   I = 1/(1-L) E

and the engineer argues that 1/(1-L) is 1+L+L^2+L^3+..., so that this is:

   I = E + L(E) + L^2(E) + L^3(E) + ...

which is simply the amount of light emitted, plus the amount of light emitted then bounced off a surface once, plus the amount of light emitted and bounced off two surfaces, and so on.
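For matrices, the engineer's geometric-series argument can be checked numerically. A minimal sketch, where the bounce operator L is modelled by a small made-up matrix with norm below one and the emitted light E by a made-up vector:

```python
import numpy as np

# Toy stand-ins, invented purely for illustration: a 2x2 "bounce"
# matrix L with norm below one, and an "emitted light" vector E.
L = np.array([[0.2, 0.1],
              [0.3, 0.2]])
E = np.array([1.0, 0.5])

# Solve (1 - L) I = E directly...
I_direct = np.linalg.solve(np.eye(2) - L, E)

# ...and by summing the series E + L(E) + L^2(E) + ...
I_series = np.zeros_like(E)
term = E.copy()
for _ in range(60):        # each pass through the loop adds one more bounce
    I_series += term
    term = L @ term

print(np.allclose(I_direct, I_series))   # True
```

The two answers agree, which is exactly the content of the 1/(1-L) = 1+L+L^2+... argument when the norm of L is below one.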
This is not a satisfactory proof, so the rest of this document justifies what we just did. To do this properly you need some decent linear algebra first. Note that "iff" in this document stands for "if and only if".

Vector spaces
=============

We call a vector space a set V with an operation +, along with a field of scalars F with operations * and +, together with an operation ^ which maps an element of F and an element of V into an element of V. All of this has to satisfy certain axioms, as follows:

1) F,+,* is a field. If you don't know what that is, think of the real numbers with addition and multiplication.
2) V,+ is an abelian group. If you don't know what that is, think of R^n - you can add vectors, and for any vector you can find its opposite vector such that the sum of a vector and its opposite is the zero vector.

Also, for any a,b in F and x,y in V, you need that:

3) (a+b)^x=a^x+b^x
4) (a*b)^x=a^(b^x)
5) a^(x+y)=a^x+a^y
6) 0^x=O where 0 is the null element in F and O is the null element in V
7) 1^x=x where 1 is the unit element in F

I'm probably forgetting something; when you start writing these things down you always forget one. Also, no one writes ^ for the mixed operation, you always write * and pray no one gets confused, so this is what I'll do from now on.

The next useful thing is linear operations. A linear operation L from a vector space V with scalars F into a vector space U with scalars F (same scalars!) is simply a function with the following properties: for all x,y in V and a in F,

1) L(x+y)=L(x)+L(y)
2) L(a*x)=a*L(x)

Let's look at these operations more closely. A linear operation is said to be invertible iff for any u in U there is one and only one v in V such that L(v)=u. This definition makes sense, but alas it makes it hard to check whether a linear operation is invertible. The following will make things easier.

Definition: the set of vectors {v in V : L(v)=u}, for any u in U, is called L^-1(u), or L inverse of u.
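The two defining properties of a linear operation are easy to check numerically for a concrete example. A quick sketch, using an arbitrary made-up matrix map on R^2 as the linear operation:

```python
import numpy as np

rng = np.random.default_rng(0)

# A made-up linear operation on R^2: multiplication by a fixed matrix A.
A = np.array([[2.0, -1.0],
              [0.5,  3.0]])

def L(v):
    return A @ v

x = rng.standard_normal(2)
y = rng.standard_normal(2)
a = 1.7

# Property 1: L(x+y) = L(x) + L(y)
print(np.allclose(L(x + y), L(x) + L(y)))   # True
# Property 2: L(a*x) = a*L(x)
print(np.allclose(L(a * x), a * L(x)))      # True
```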
L^-1(O) has a special name: it is the kernel of L, noted Ker(L).

Lemma: Ker(L) always contains the zero vector.

Proof: L(a*x)=a*L(x) for any a and x. In particular it's true for a=0: the left side is then L(0*x)=L(O) and the right side is 0*L(x)=O, so L(O)=O and thus O is in the kernel of L.

Lemma: L is invertible if and only if Ker(L) is a singleton (a set with only one point). By the previous lemma, Ker(L) is then {O}.

Proof: If L is invertible then, by definition, L^-1(O) has only one point, so there's nothing to prove. The other direction is harder. Assume now that Ker(L) has only one point. We need to prove that L^-1(v) is then a singleton for all v. Suppose that L^-1(v) has at least two points in it, say x and y. Then L(x)=L(y)=v. But then O=L(x)-L(y)=L(x-y). So x-y is in the kernel of L, which is simply {O}, thus x-y=O, or x=y. We assumed that x and y were two different points, so we have a contradiction. Thus it can not be that L^-1(v) has at least two points, therefore L^-1(v) has at most one point. (Strictly speaking, one must also check that L^-1(v) is never empty; this is automatic for the kinds of operations we will apply this to, so I won't fuss over it.) This is true for any v, therefore L is invertible.

If you define the + operation for linear operations in a suitable fashion, you can see that they form a vector space themselves. Take any two linear operations, L and M. Then we define [L+M](x) to be L(x)+M(x). Similarly, for a scalar a, we define [a*L](x) to be a*L(x). You can then verify from the axioms that the space of linear operations is indeed a vector space. In finite dimensional spaces these operations are always continuous, but in infinite dimensional spaces they need not be. At any rate, we don't really care about that here.

There is a special case of linear operations. When you have a linear operation L from V to V (same vector space), we call it a linear operator. (The dual space of V, by the way, is the space of linear operations from V into its field of scalars F, not the space of operations from V to V.)

To really get a hold of these operations, you need something more than just a simple vector space. There are various ways of doing this, but here we'll simply look at normed vector spaces.
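For matrices, the kernel lemma is easy to watch in action. A sketch with two made-up 2x2 matrices: one whose kernel is {O} (hence invertible), and one that sends a nonzero vector to O (hence not invertible):

```python
import numpy as np

# Full rank: the kernel is {O}, so M is invertible.
M = np.array([[1.0, 2.0],
              [0.0, 3.0]])

# The two columns are equal, so (1, -1) is a nonzero kernel vector.
S = np.array([[1.0, 1.0],
              [2.0, 2.0]])

print(np.linalg.matrix_rank(M))                            # 2
print(np.allclose(S @ np.array([1.0, -1.0]), 0.0))         # True
print(np.linalg.matrix_rank(S))                            # 1
```

Since S squashes a whole line of vectors onto O, the preimage S^-1(v) of any reachable v contains more than one point, which is exactly how the lemma says invertibility fails.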
A normed vector space is a vector space V with scalars R (the reals) or C (the complex numbers) together with an operation |.| such that, for any x,y in V and a in R or C:

1) |x| is a positive or zero real number, and |x|=0 only if x=O.
2) |x+y| <= |x|+|y| (triangle inequality)
3) |a*x| = |a|*|x|, where |a| is the absolute value of a

For instance, R^2 with the Euclidean norm |(x,y)|=sqrt(x*x+y*y) is a normed vector space.

A vector space with a norm is said to be complete iff, for any sequence of vectors v1, v2, v3, ..., vn, ... (infinite) such that |vn-vm| tends to zero whenever n and m grow large, there is a vector v such that |vn-v| goes to zero when n grows large. Here it is good to formalize what I mean by 'when n and m grow large'. I mean that, for any a>0 (strict inequality), there is a large integer N such that, whenever n and m are both larger than N, |vn-vm|<a. Similarly, '|vn-v| goes to zero when n grows large' means that, for any a>0 (strict inequality), there is a large integer N such that, whenever n is more than N, |vn-v|<a.

We can also put a norm on linear operations. For a linear operation L from a normed space into a normed space, define |L| to be the smallest real number such that |L(x)| <= |L|*|x| for all x. Equivalently, |L| is the supremum of |L(y)| over all vectors y with |y|=1. (For the operations we care about, this number is finite.) We write I for the identity operator, the operator with I(x)=x for all x.

Lemma: if L is a linear operator on V and |L|<1, then I-L is invertible.

Proof: Suppose that some non-zero x is in Ker(I-L), that is, (I-L)(x)=O, or L(x)=x. Put y=x/|x|, so that |y|=1 and L(y)=y. But |L| >= |L(y)| by the definition of |L|. So from the last equation we can tell that

   |L| >= |L(y)| = |y| = |x/|x|| = 1/|x| * |x| = 1

But we have assumed that |L|<1, so we have a contradiction! Therefore x can not be non-zero. So Ker(I-L) is {O} and, by the previous lemma, I-L is invertible.

Lemma: if a is strictly between 0 and 1 then the power series 1+a+a*a+a*a*a+... converges to 1/(1-a).

Proof: exercise. (Hint: try multiplying 1+a+a*a+a*a*a+... by (1-a) and check that you get back 1. Make sure that 1+a+a*a+a*a*a+... converges too.)

Lemma: If L is a linear operator from V to V with |L|<1, and V is a complete space, then for any vector x the power series x+L(x)+L(L(x))+L(L(L(x)))+... converges to a limit point.

Proof: We write L^n for L(L(L(...(x)))) with L occurring n times (e.g., L^3(x) is L(L(L(x)))) and we let L^0 be the identity operator. The norm of L^n is at most |L|^n. (Proof is left as an exercise.) We need to prove that, if n is large, then L^n(x)+L^(n+1)(x)+... is small in the norm of the vector space (this shows convergence, since the vector space is complete).
But

   |L^n(x)+L^(n+1)(x)+...| <= |L^n(x)|+|L^(n+1)(x)|+...   (*)

(proof is left as an exercise) and |L^k(x)| <= |L^k|*|x| <= |L|^k * |x|. Then the right side of (*) is at most

   |x|*(|L|^n+|L|^(n+1)+...) = |x|*|L|^n*(1+|L|+|L|^2+|L|^3+...)

With |L|=a<1, we know that 1+|L|+|L|^2+... converges to 1/(1-a), which is some real number, and |x| is also just some constant, so that, letting K=|x|/(1-a), this becomes

   |L^n(x)+L^(n+1)(x)+...| <= K*|L|^n

But since |L|<1, |L|^n shrinks to 0 as n goes to infinity, thus indeed the left side of (*) is small and the power series converges.

Theorem: with |L|<1, (I-L)^-1 is I+L+L^2+L^3+...

Proof: same as what you do for the scalars.

   (I-L)(I+L+L^2+...) = (I-L)(I+L+L^2+...+L^n) + (I-L)(L^(n+1)+L^(n+2)+...)

We've already shown that L^(n+1)+L^(n+2)+... goes to zero so, by picking n large, the second term is small. Then we distribute (I-L) on the first part and get:

   = (I+L+L^2+...+L^n) - (L+L^2+L^3+...+L^(n+1)) + (I-L)(L^(n+1)+...)
   = I - L^(n+1) + (I-L)(L^(n+1)+...)

With n suitably large, L^(n+1) and (I-L)(L^(n+1)+L^(n+2)+...) vanish, so that we get

   (I-L)(I+L+L^2+...) = I

Therefore, I-L and I+L+L^2+... are inverses of one another.

The really interesting vector spaces
====================================

Functions make a vector space. Take any two functions f and g; you can define the natural operations (f+g)(x)=f(x)+g(x) and (a*f)(x)=a*f(x). This satisfies all the vector space axioms (a is any real or complex number). Furthermore, you can put a norm on this space:

   |f| = (integral over all of R (or C) of |f(x)|^p dx)^(1/p)

This is called the p-norm of f. Proving that these norms satisfy the triangle inequality is hard and I will not do it here. It is also provable that these normed vector spaces are complete.

Lemma: for any decent g(x,t), the operation L(f) defined by

   L(f) = integral over all of R (or C) of g(x,t)*f(x) with respect to x

is a linear operator.
The function g maps x (a vector) and t (another vector) into R or C, and the function f maps x (a vector) into R or C also.

Sketch of proof: this is true simply from the linearity of the integral. There are important pitfalls, and for this reason I will not prove it. All these topics are part of what's called Functional Analysis, Measure Theory, the Lebesgue Integral, whatever, and I refer you to one of those books if you're really curious. However, it is usually convincing enough to just look at L(f+g) and L(a*g) and see how it makes sense to say that they are L(f)+L(g) and a*L(g) respectively. The pitfalls occur when the integral is infinite or (argh!) undefined.

Lemma: with L defined as above and |L|<1, I-L is also a linear operator, and its inverse is I+L+L^2+..., which we can compute or at least approximate.

Proof: simple application of what we did so far.

The Rendering Equation
======================

The idea is this. We want to compute J(x,y), the amount of light going from y (a point in 3d) to x (a point in 3d). We try to reason about J to see how it should be defined, and we choose the following definition:

   J(x,y) = E(x,y) + integral over all z (in R^3) of [ R(x,y,z)*J(y,z) ]

That is, the amount of light going from y to x is the sum of:

1- E(x,y), the light emitted from y to x (this is 0 if there is no direct line of sight from y to x or if y does not emit light). Points y in space that have a nonzero E(x,y) are light sources. Since the light source's emitted light depends also on the position of x (the target it is illuminating), it can be a directional light source and other gizmos.

2- We also need to count all the light rays coming in from a third point z, bouncing off the surface at y and reflecting in the direction of x. The light coming in from z is simply J(y,z). Now the surface may reflect this in any way we like, so R(x,y,z) just tells you how much J(y,z) is scaled when it bounces in from z against y and then flies off to x. In particular, if y does not see x, R(x,y,z)=0.
It doesn't make much sense to have R<0 anywhere, so we assume that R>=0. Now we can apply all the functional analysis we just puked over. Define L to be:

   L(f) = integral over all z (in R^3) of [ R(x,y,z)*f(y,z) ]

where f is any function of two variables y,z, each in R^3. Then the rendering equation can be re-written as:

   J(x,y) = E(x,y) + L(J)(x,y)
   J(x,y) - L(J)(x,y) = E(x,y)
   [(I-L)(J)](x,y) = E(x,y)

Or, in shorthand,

   (I-L)J = E

If |L|<1 (where |.| is the operator norm already defined) then we know that I-L is invertible, and then we have:

   J = ((I-L)^-1)E

Moreover, we've already shown that the inverse of I-L in this situation is given by I+L+L^2+..., so we re-write this as:

   J = E + L(E) + L^2(E) + L^3(E) + ...

Now we inspect the situation and attach a meaning to what we just did. |L|<1 simply means that all surfaces absorb a little light. Indeed, the norm of J, defined simply to be (the integral over all x,y of |J(x,y)|^p)^(1/p), is a measure of the total amount of light in the scene. To say that |L|<1 is to say that |L(f)|<|f| for all nonzero f, so that light is being absorbed by the objects in the scene. (L(f) is the light being bounced off things.)

E is just the emitted light. L(E) is the light after the first bounce. L^2(E) is the light after the second bounce, and so on. Indeed, we have just found a justification for some kind of raytracing here. Start from the light sources, see where the light gets, start from these new positions, see where it gets from there, etc., and add up all the light from all the bounces. The fun part, though, is that we now know that this process will work, and that indeed it reflects some sort of physical reality. Indeed, we can hope to get photorealistic images out of this!

But how do we evaluate this? All you need is something to solve the integral L(f) for any f. One good example is Monte Carlo integration, though other methods are available. Then you can evaluate the power series up to, say, the nth term and rest assured that your solution is close to reality.
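To make the "evaluate L(f) by Monte Carlo" step concrete, here is a minimal sketch in one dimension rather than 3d, with a made-up reflectance R and incoming light field f (everything here is invented for illustration): we estimate L(f) at a single point as the average of random samples of the integrand.

```python
import numpy as np

rng = np.random.default_rng(42)

# Made-up ingredients: the "scene" integrates over z in [0, 1].
def R(x, y, z):
    return 0.5                      # toy reflectance: scatter half the light

def f(y, z):
    return np.sin(np.pi * z) ** 2   # toy incoming light field

def L_of_f(x, y, n=200_000):
    # Monte Carlo: the average of R*f over uniform random z in [0, 1]
    # estimates the integral of R(x,y,z)*f(y,z) dz.
    z = rng.random(n)
    return np.mean(R(x, y, z) * f(y, z))

estimate = L_of_f(0.0, 0.0)
# Exact value here: 0.5 * integral of sin^2(pi z) dz over [0,1] = 0.25
print(estimate)
```

The estimate lands near 0.25 with an error that shrinks like 1/sqrt(n), which is the usual Monte Carlo trade-off: cheap in high dimensions (like the R^3 integrals above), but slow to converge.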
However, n may be hard to choose. In particular, it will need to be rather high if |L| is close to one. Indeed, for L^n+L^(n+1)+... to be small, you need to have |L|^n small. So if you want, say, |L|^n to be below some tolerance e, you need n to be at least log(e)/log(|L|), and this number of terms blows up as |L| approaches one - that is, the less absorbent the scene, the more bounces you must simulate.
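The dependence of n on |L| can be made concrete: the tail bound from earlier says the truncation error after n bounces shrinks like |L|^n, so pushing |L|^n below a tolerance e takes roughly log(e)/log(|L|) terms. A small sketch (the norms and the tolerance are made up):

```python
import math

def bounces_needed(a, eps):
    # smallest n with a**n <= eps, for 0 < a < 1 (a plays the role of |L|)
    return math.ceil(math.log(eps) / math.log(a))

# A dim scene (lots of absorption) versus a bright one (|L| close to 1):
print(bounces_needed(0.5, 1e-6))    # 20
print(bounces_needed(0.99, 1e-6))   # 1375
```

A scene whose surfaces reflect 99% of incoming light needs nearly seventy times as many bounce terms as one reflecting 50%, for the same accuracy.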