Lecture 4: 3D and 5D Rasterization

clemire

This counter-intuitive notion that you "spend lots more time to render blurry things" makes a lot more sense when you realize that the larger the circle of confusion is, the more space you have to sample to determine the color of a single point in the image plane.

soohark

Why can't we do something like this: render everything as if it were all in focus, keep track of a z-buffer, and then use the z-buffer to do a Gaussian blur with the radius varying by the depth in the z-buffer?
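Concretely, I'm imagining something like this rough sketch (a box window instead of a Gaussian for brevity; `focus_depth`, `strength`, and `max_radius` are made-up knobs, not anything from the slides):

```python
import numpy as np

def naive_zbuffer_dof(color, depth, focus_depth, strength=4.0, max_radius=8):
    """Single-image DOF approximation: blur each pixel with a radius
    derived from its own depth.

    color: HxWx3 float array (the sharp render); depth: HxW z-buffer.
    Note it gathers neighbors regardless of *their* depth, which is the
    source of the artifacts discussed below.
    """
    h, w, _ = color.shape
    # Blur radius grows with distance from the focus depth (clamped).
    radius = np.clip(strength * np.abs(depth - focus_depth), 0, max_radius)
    out = np.empty_like(color)
    for y in range(h):
        for x in range(w):
            r = int(round(radius[y, x]))
            if r == 0:
                out[y, x] = color[y, x]
                continue
            y0, y1 = max(0, y - r), min(h, y + r + 1)
            x0, x1 = max(0, x - r), min(w, x + r + 1)
            out[y, x] = color[y0:y1, x0:x1].reshape(-1, 3).mean(axis=0)
    return out
```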

Chris

@Soohark: I'm not entirely sure, but I feel like you'd get an incorrect blurring if you do that. Like imagine if you had a really shallow depth of field and you had object A that was in focus and then object B which is right next to object A on the 2D screen but is actually a couple feet behind. Object B would need to be very blurry and have a wide radius to do the blur, but what you might inadvertently wind up doing is accidentally including pixels from object A in your blur, even though object A is in focus.

So basically what I'm trying to say is that if you wait until after you render the scene to do the blurring, even if you do keep track of the z-buffer you might still include incorrect pixels from objects that aren't near your blurred object.

I'm not sure if what I've said above is at all correct but this is just my intuition.

soohark

@chris what if you only included pixels that are farther away than the pixel you're calculating the blur for, according to the z-buffer?

It wouldn't be completely correct, but it would be much faster, and I think the results would be acceptable.

mmp

@Chris: your intuition is solid.

There's a bigger underlying issue with a single image: one way to think of depth of field is that it's equivalent to rendering the scene from each and every point on the surface of the lens and then averaging those images together. Different points on the lens will end up seeing slightly different views of the scene; as such, a single view can't capture all of the objects that will be visible from other views.

(One interesting example is to consider looking straight on at a (small) cube shape. Normally, you can only see 3 sides of a cube, but if you think about shifting around on a lens area while looking at a cube, you can potentially "see" 5 sides of it. And this in turn gives an intuition about why backface culling with depth of field is such a bother!)

@soohark: people have come up with a number of approaches that try to do their best with a single image, or with multiple images. Some of them can work pretty well, but my sense of them is that they're hard to make work robustly in all cases. One approach is to render a "layered depth image" (z-buffer that records each surface in a pixel, not just the first one, even if the surfaces aren't transparent); this can give some additional information about stuff that's occluded in the main view but is useful for shifted views at other points on the lens....
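To make the "render from every point on the lens and average" picture above concrete, here's a rough sketch (the `render_from` callback is hypothetical; it stands in for a renderer that shifts the camera to an offset on the lens and re-aims the rays so the focal plane stays fixed):

```python
import numpy as np

def sample_lens_disk(n, lens_radius, rng):
    """Uniformly sample n points over a disk-shaped lens aperture."""
    r = lens_radius * np.sqrt(rng.uniform(size=n))  # sqrt => uniform by area
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
    return np.stack([r * np.cos(theta), r * np.sin(theta)], axis=-1)

def render_with_dof(render_from, lens_radius, n_samples=64, seed=0):
    """Depth of field as an average over viewpoints on the lens.

    render_from(offset) is a stand-in for a renderer that shifts the
    camera by `offset` (an (x, y) point on the lens) while keeping the
    focal plane fixed, and returns an HxWx3 image.
    """
    rng = np.random.default_rng(seed)
    offsets = sample_lens_disk(n_samples, lens_radius, rng)
    return np.mean([render_from(o) for o in offsets], axis=0)
```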

wmonroe

I'm pretty sure this idea of blurring based on the z-buffer came up on the 148 exam last year...and I definitely remember Vladlen Koltun describing it in 248. I got the impression that this is actually the technique that's used in practice for real-time DOF effects, despite all its shortcomings. In fact, I think for real-time performance, implementations tend to use only a box blur and base all blurring on the value of the z-buffer at the pixel being rendered, rather than figuring out exactly which pixels contribute to the current one based on their individual depths.

Needless to say, this is completely unphysical and leads to bizarre artifacts in some situations.

soohark

This is an interesting problem. When I learned to do Flash animations, I realized that when artists want to convey the rotation of symmetric objects, they'll often add lines, imperfections, or specular highlights that rotate with the object, even if they aren't realistic, so the viewer can see the rotation. I wonder if the error introduced here can sometimes be beneficial to animators as well.

mgao12

I agree with soohark, but I don't think blurring the edge of the sphere is a good way to show the rotation. It looks weird, since a real sphere wouldn't look blurred in any case. I think the best way to show the rotation would be to add a non-smooth texture to the sphere's surface, so the sphere looks different as it rotates. In that case, we can also use the motion blur of the texture to make it look more realistic.

tianxind

Agreed with the previous two comments. I think in real life we can tell that a ball is rotating because its surface is not perfectly uniform and smooth. Maybe we can use bump mapping to add small dents to the surface so that we can see some blurring (of highlights or edges) on the sphere.

clemire

Note that the spheres are rotating around an axis that is orthogonal to the plane of the slide. If you used a different axis of rotation, the spheres wouldn't shrink in such a uniform way.

mgao12

I noticed an interesting thing here. Take the third picture, for example. Suppose point A at the edge of the sphere rotates 120 degrees to point B. The line segment AB seems to be exactly tangent to the inner solid sphere.

I suppose the reason is that AB is the linear route along which we take the samples. All the lines outside it will have some samples taken from the space outside the sphere, so they get blurred. In contrast, all the lines inside it have all their samples inside the sphere, so they don't get blurred.
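A quick sanity check, assuming the slide shows an in-plane rotation by $\theta$ with linearly interpolated vertex positions: linear interpolation turns the rotation $R$ into the blend $M_t = (1-t)I + tR$, which is a rotation combined with a uniform scale

$$
s(t) = \sqrt{(1 - t + t\cos\theta)^2 + (t\sin\theta)^2},
\qquad
\min_t s(t) = s\!\left(\tfrac{1}{2}\right) = \cos\frac{\theta}{2}.
$$

So the interpolated silhouette shrinks to radius $r\cos(\theta/2)$ at mid-motion, and points inside that radius are covered at every sample time and stay sharp. For $\theta = 120^\circ$ this gives $r/2$, and the chord $AB$ traced by an edge point comes no closer to the center than $r\cos(\theta/2) = r/2$, so it is exactly tangent to the inner solid disk, which matches the picture.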

Tianye

With this in mind, we should limit the amount of motion between the starting and ending states, especially for rotation, in order for the linear approximation to work. I think it is more natural to think of motion blur as what we see at a certain point in space during the time interval, rather than as following the path of a point on the object. This is particularly important for rotation, so maybe it is easier to deal with motion blur in ray tracing.

atrytko

For arbitrarily defined t0, t1, shouldn't x'(t) be defined in terms of t0 and t1?

For example: $ \frac{t_1 - t}{t_1 - t_0} x_0 + \left(1 - \frac{t_1 - t}{t_1 - t_0}\right) x_1 $?

atrytko

To explain my thought process...

I was going to ask if for Option 2, it should read: "compute min of x'(t), max of w'(t)"?

I now see that this is unnecessary, since the values of x'(t) are by definition strictly in the range $[x_0, x_1]$. So, as it is written, it will give the same result, but with less computational expense.

clemire

If you compute the derivative of $x_r(t)$, you'll see that the only time it is ever 0 is when $x_0/w_0$ is equal to $x_1/w_1$ (i.e., the vertex doesn't actually move). Otherwise, this function always returns a point $x_r(t_i)$ that is a convex combination of $x_r(0)$ and $x_r(1)$, so option 2 seems a little more complicated than it needs to be. I think you can just take the min of $x_r(0)$ and $x_r(1)$ for the minimum x-coordinate, etc.
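Explicitly, assuming the interpolation has the form below (my notation, not necessarily the slide's):

$$
x_r(t) = \frac{(1-t)\,x_0 + t\,x_1}{(1-t)\,w_0 + t\,w_1}
\quad\Longrightarrow\quad
x_r'(t) = \frac{x_1 w_0 - x_0 w_1}{\big((1-t)\,w_0 + t\,w_1\big)^2},
$$

which is identically zero exactly when $x_1 w_0 = x_0 w_1$, i.e. $x_0/w_0 = x_1/w_1$; otherwise it never changes sign, so $x_r(t)$ is monotonic and always lies between $x_r(0)$ and $x_r(1)$.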

jingpu

I think there shouldn't be red regions left outside the four bounding boxes if the figure were drawn more carefully, since every point on the edges of the red region is a possible location of a vertex under the linear interpolation method.

mmp

Nice catch--you're right! (I've updated the original slide, but won't bother with uploading the fix to the website.)

wmonroe

Is it possible to find a linear interpolation of the bounding box itself? What I mean is, could one calculate some starting and ending bounding boxes such that one could recover a conservative estimate of the bounding box at an arbitrary point in time just by interpolating the corners, instead of computing many separate bounding boxes for intervals specified in advance? If it's possible to efficiently compute such an object, then I would expect it to give a tighter bound than using a small number of intervals and to be more efficient than using a large number. I would expect that the naive first try of simply interpolating between the actual starting and ending bounding boxes would break in some cases (like the backfaces surprise mentioned in slide 26), but I haven't been able to think of a concrete example.

Chris

So in lecture today Matt said that interleaving did not produce as good of a result as low-discrepancy but from this slide the two images seem to look pretty similar. Does the difference get more pronounced as more samples are taken?

mmp

Actually they get more similar at higher sampling rates. The difference really is a small one; I just personally find those structured artifacts in the noise from interleaved sampling fairly objectionable--they may not bother other folks as much...

tianxind

Actually I don't understand what low-discrepancy sampling is. Btw can we use a prettier model than this bigguy? :P

bmild

What kind of linear paths of a triangle's vertices in 3-space would make this happen?

tianxind

Yes same question here: can't imagine such a path

eye

Would moving linearly from back-facing to front-facing and then reverting to the original position count? I'm not sure if I'm also misunderstanding this, but for example, in the case of a wing flap, the start and end could both be at the top of the stroke but the triangle could be reversed at the bottom.

mmp

The first two pages of this paper lay out a case where it happens; does this help?

bmild

Yes, thanks!

jingpu

Given the lens formula, if we assume F, z, and z' are all positive, then F should be the smallest of the three. Therefore, going back to the figure, point F should be to the right of z'.
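To spell it out, assuming the lens equation has the form $\frac{1}{z} + \frac{1}{z'} = \frac{1}{F}$ with all quantities positive:

$$
\frac{1}{F} = \frac{1}{z} + \frac{1}{z'} > \frac{1}{z'} \;\Rightarrow\; F < z',
\qquad \text{and likewise } F < z.
$$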

mmp

You're right! (How embarrassing.) I've fixed the original slide--thanks!!!

Chris

It's worth pointing out that an image formed at z', a distance d in front of the focal plane, will appear more blurry than an image formed the exact same distance d behind it.

Another cool effect is the dolly zoom: by zooming in (narrowing the field of view) while also stepping farther back from the subject just the right amount, you can keep the subject framed the same size while increasing the blur in the out-of-focus regions. The background will also distort and appear to change size as you do this, which creates an overall uneasy/surreal effect.
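Roughly, the subject stays the same size on screen when the focal length is scaled with the subject distance, since the magnification is about $f/z$ when $z \gg f$:

$$
m \approx \frac{f}{z} = \text{const}
\;\Rightarrow\;
f_{\text{new}} \approx f_{\text{old}} \cdot \frac{z_{\text{new}}}{z_{\text{old}}}.
$$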

bmild

^^ As seen in Vertigo: http://youtu.be/je0NhvAQ6fM?t=31s

Tianye

Just checking if I'm getting this right. Are this slide and slide 37 two different ways of thinking about the circle of confusion (1. the area in object space that a certain point on the image plane can see; 2. the area on the image plane that a certain point in space can reach)? Here I suppose the point on the left should be $z_i'$, the image plane, and $z_f$ can be calculated from $1/f = 1/z_i' + 1/z_f$. The parameters we need to set for the lens would then be the distance to the image plane $z_i'$, the diameter of the lens, and the focal length.

brianjo

The "dz" notation makes it seem like x is shifted by the diameter of the circle of confusion when in fact it should be the radius (assuming u takes on values from (-1,1)).

Tianye

One conservative approach I can think of is to compute the bounding box for 3D rasterization as usual, then find the maximum z, compute the circle of confusion dz at that depth, and extend the previously computed bounding box accordingly. Is there a better way of doing this?

mmp

If you're computing an axis-aligned bounding box, that's about the best you can do, particularly because the blur amount varies non-linearly in z.

(Note, though, that you need to compute the blur at both the min z and max z of the bounding box, since, depending on the object's depth, sometimes one or the other will have the greater blur. At least it can be shown that the blur is never greater than it is at those two extremes.)
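Something along these lines, as a rough sketch (the names and the assumption that the box's x/y extents are measured in the same image-plane units as the circle of confusion are mine, not the slides' notation):

```python
def pad_bbox_for_defocus(bbox, focus_z, focal_len, aperture):
    """Conservatively expand a projected AABB for depth of field.

    bbox = (xmin, xmax, ymin, ymax, zmin, zmax); x/y are in the same
    units as the circle of confusion on the image plane. Uses the
    thin-lens circle-of-confusion diameter
        c(z) = aperture * focal_len * |z - focus_z| / (z * (focus_z - focal_len)),
    evaluated at BOTH zmin and zmax, since either end can blur more.
    """
    xmin, xmax, ymin, ymax, zmin, zmax = bbox

    def coc(z):
        return aperture * focal_len * abs(z - focus_z) / (z * (focus_z - focal_len))

    pad = 0.5 * max(coc(zmin), coc(zmax))  # pad by the blur radius
    return (xmin - pad, xmax + pad, ymin - pad, ymax + pad, zmin, zmax)
```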