A Note About Tools

That means there are going to be a lot of LaTex-created images in this section. Sorry, but it’s a lot less painful than trying to write out math in raw HTML. (Okay, writing this inspired me to search for LaTex to MathML translators, and of course the W3 itself maintains quite a nice list. Oh well, I promise I’ll look into that for next time.) In the meantime, I’m using the wiki built into Trac to write this post, with the TracMath plugin to take inline LaTeX blocks and render them as PNGs. This means that I just type

Well, okay, I don’t write that, because I can’t remember LaTeX math syntax to save my life. (I got pretty good at it in college, but it’s been a while.) Anyway, it’s hard to parse by eye, and I’m very prone to swap a u for a v, which can be unfortunate when you’ve just carefully drawn a distinction between them.

Instead, I’m using Maxima, a free Computer Algebra System. It’s stranger to pick up than Mathematica, but the price is right (and it’s quite powerful once you get used to its ways). It seems to be really flaky on recent Ubuntu releases, which is unfortunate, but I’m primarily using 5.17.0 for Windows and it works like a champ. It has a very nice tex function, which will spit out the TeX formatting for any given expression.

So in practice, I work the math in Maxima, have it convert to TeX, then paste that into the post. (Sometimes I tweak it slightly, for example to pull out a leading “1/2” rather than leaving the whole expression over 2 as Maxima seems to prefer. But for the most part, the expressions in the text are directly out of Maxima.) I have some ideas of writing a Trac plugin to close the loop and directly evaluate and typeset Maxima expressions in the text, but I haven’t got it all worked out yet: the equations in a piece generally build on each other, so you need enough lifecycle awareness to start and maintain a separate Maxima state for each page in flight. Also you probably need some non-displayed helper definitions and intermediate values, so you need a separate bit of syntax to indicate what should be rendered into the page. I think the fact of the existence of the PageOutlineMacro means it’s possible, but the fact that the PageOutlineMacro actually re-parses the whole page means it’s not necessarily easy. (I’d also still have to solve creating and maintaining enough state that the various individual blocks could fetch their expression / formatted results from the initial gather-and-run macro.) Anyway, for the moment I’ll do it the hard way.

Back to the Math

I’m going to run through a concrete k=3 (quadratic) example with 6 control points (so n=5), and claim that this demonstrates a general solution for k=3. I’m pretty sure it does, but it’s a lot easier to keep track of the indexes with a concrete number. Or so I claim.

So, the knot vector T should be the sequence [0, 0, 0, 1, 2, 3, 4, 4, 4]. Recall that indices are zero-based, and that Ni,1 is defined as

Given this, the Ni,1 terms will start being interesting at i=2, where N2,1= 1 for 0 <= u < 1. From then on, each Ni,1 is non-zero on Ni,1= 1 for i-2 <= u < i-2+1, up through the last value we need at N5,1. Using Maxima to evaluate the raw B-spline equation

for our knot vector T (again, this is the k=3 case), we get a whomping huge expression in terms of the control points P, the Ni,1 “switches”, and of course v. I won’t reproduce the whole thing here (fully expanded, there are 33 terms, and even my new wider columns can’t handle that), but here’s the coefficient of the second control point P1 for flavor:

Explicit Equations for Each Segment

The good thing is that we don’t ever really need the whole equation at once. Only one of the Ni,1 terms will be 1 at any given time — this is what produces the curve segments. By setting N2,1 through N5,1 to 1 in turn (and the other Ni,1 terms to 0) we can produce the four curve segments. In turn, these are

Maxima Script

For anyone who’s following along at home, here’s the Maxima to get our results so far:

Note that Maxima lists are one-based, hence the helper function T5 to preserve the zero-based indexing in the original definitions. Since Ni,0 is never referenced, I can get away with a list starting at index 1 (though I admit this is made more confusing by the choice of a zero-based loop variable. Oops.) Also note that BSpline5_3 wasn’t actually defined as a function there — that was just an equality, to make Maxima print it out nicely. Other than that, this is pretty much what it looks like. Oh, and notice that the recursive N definition has to avoid a couple of divide by zero cases to implement the “where 0/0 is defined to be 0” escape hatch.

Reparamaterization

Okay, those equations work but they look pretty ugly (and they’re different in each segment, but we’ll work on that in a minute). It took me a while to sort this out, but I sort of tipped my hand back in the “Definitions” post where I introduce w and j, so I’ll just repeat the equation from there: v = u (n-k+2) = j+w-1. Remember that j is the segment number ranging from 1 to n-k+2, which is exactly the subscript of segment in the explicit equations above. So we substitute w-1 in the first equation, 1+w-1 in the second and so forth. Using the Maxima definitions above, this is just evaluating the BSpline5_3_Seg function above at seg+w-1 rather than v:

Now we’re getting somewhere. The middle two equations are identical except for shifted subscripts on P. The first two and last two are different, but we can work that out in a minute. First let’s generalize the uniform version and check our claim that

works for those middle terms. This will also give us a peek at the what’s going on with the end segments.

What we’re computing here is the difference between the “real” equation for each of the four segments and what we get from using the uniform equation for all of the segments (ignoring the segment number).

Doesn’t look immediately promising, but there are two good things:

the two middle segments are as predicted by the uniform function above

the two end segments’ differences only involve two control points

We know that P1 and P4 are used in the middle two segments as well as the end segments, so we can’t do anything to them. However, the two end points (P0 and P5) are only used in the end segments. So what it we feed some different value (not the current P0, but some combination of P0 and P1) into the uniform equation? Can we make the result be the same as evaluating the “real” B-spline function on the original control points?

In the uniform version of segment 1, we replace P0 with a new variable x, then set it equal to the real first segment of the k=3 B-spline:

(Okay, that didn’t really need a LaTeX block, but it really did take me quite a while to figure out, and I was inexplicably happy when I found it (literally: one very really reason for this whole series is to explain to my wife why I was grinning so much that week). If I’d happened to have run into that formula up front I would have spared myself a great deal of frustrating math trying to figure out how to make sense of what I was seeing. Of course, I also wouldn’t have learned a lot about splines, so maybe it was better this way.)

Repeating the process for the last control point (in this case P5), we get x = 2 P5-P4:

This is a reassuring result — the equations are all symmetric, so it should be possible to reverse the control points and produce the same curve. Therefore, it’s nice to see that this formulation for the end-points is also symmetric.

So the practical upshot of this is that we now have a way to evaluate all the segments of a B-spline, even the “weird” end segments, with the same uniform equation. To do this, we have to massage the control points a little. Given the input sequence

( P0 , P1 , … , Pn-1 , Pn )

we need to make a copy, so we can feed the uniform function the sequence

( 2 P0 – P1 , P1 , … , Pn-1 , 2 Pn – Pn-1 )

instead. That doesn’t seem like much of a price to pay for freedom (or at least for being able to evaluate all segments of all k=3 B-splines curves with the same straight-line evaluation code).

The same general approach works for k=4 curves (cubic B-splines). Because the cubic blending functions involve more points in each segment, we have to replace both P0 and P1 with free variables, and solve for them simultaneously (using the known equations for the first two segments, instead of just the first segment), but it’s the same logic and it works quite nicely. I haven’t tried to generalize a formula for any value of k, mostly because quadratic and cubic B-splines are far more common than other types.

This took much longer to post than I’d planned, but I hope it was worth the wait for someone. I can absolutely guarantee that this isn’t an original result, but I haven’t actually seen it in Wikipedia or MathWorld or the usual suspects, so I feel entitled to be slightly smug.

November 1, 2009

Before getting into a correct treatment of the problem, and deriving a correct way to handle these differing end segments (as part of the B-spline series of posts), I should explain what I mean by “uniform” and why I’m so hung up on it. (That is, other than the fact that the Buniform function is an easily-understood mixing of three control points, while the recursive definition of the basis functions makes my head hurt.) In particular, why not just accept a solution along the lines of:

(I may have got the indexing slightly wrong, but I think this works (and is a good illustration of why I was being so pedantic about u, v, and w above — it’s not complex math, but it’s easy to mess up the parameterization).) Also note that I’m only looking at the weirdness at the beginning of the curve; properly the conditional would need to check whether the segment was at either end of the curve. This is a simplified version to support the discussion, not an actual implementation.

Endpoints are often exceptional cases; why not just define the evaluation with that exceptional case and move on? All the more so since my attempted trick of “pre-conditioning the data” (replacing the original sequence of control points with a sequence with repeats at the beginning and end) didn’t work? (An important background assumption to the rest of this discussion is that this function is going to be run many many times, since we need a lot of points on that curve to make it look like a smooth curve when we draw it on-screen, or when we send the points to a CNC router to cut metal, or whatever we’re doing with it. If we only needed to do it once, this would probably be the right solution.)

The Problem with Branching

The answer basically has to do with computer architecture and the way it has evolved over the past couple of decades. Once upon a time, when I was programming on my father’s hand-me-down Heathkit Z-80, this would have been a perfect solution — each time we needed to evaluate a B-spline, we’d simply decide if we were in the normal or weird section of the curve, and calculate accordingly. The problem is that as we poured billions of dollars into CPU fabrication techniques, we got more and more transistors, and needed some way to take advantage of them in order to make things run faster. One major solution is “pipelining”, which essentially can be described as doing many things at once. Rather than waiting for each step to complete before we start on the next step, the CPU will get a lot of things started at once, and pull them all together when needed. (Perhaps I’ve been watching too much Iron Chef, but this seems like partitioning work out to a sous chef.) In the extreme case of the P4 architecture, the pipeline was up to 20 stages long. A much more technical discussion can be found at http://arstechnica.com/old/content/2001/05/p4andg4e.ars but the basic takeaway is that for a modern CPU to run fast, it needs to be able to do many things at once.

This is the second part of a so-far three part series. See part one for some context.

For a typical background you can look at MathWorld or Wikipedia. These are pretty representative of the descriptions you’ll find on the internet in casual searching: technically correct, with nicely formatted equations, but not a whole lot of actual help if you’re trying to understand the things. In particular, the recursive definition of the basis functions are very clever but incredibly unintuitive as far as what they mean. (I also notice that the links share my own indecisiveness about how to title-case “B-spline”.)

A much more useful writeup can be found in the “B-spline Curves” section in these excellent class notes. This is a nice treatment with lots of pretty pictures, although I think the author makes things somewhat more complex by staying with the raw 0 to 1 parameterization throughout (so the elements of the knot vector end up as fractions rather than integers). Nothing wrong with it, just a different treatment (and I suspect one motivated by uniform representation across many kinds of curves, rather than my focus on uniform representation for all segments of the B-spline to ease high-volume computation).

(I should say at this point that I’m torn about how to present things. It’s very tempting to start with what I know now, tell a nice clean story, and maybe then offer an amusing anecdote about how even I didn’t quite get it at first… but that’s exactly what the textbook I was working from did (well, except for the amusing anecdote), and it didn’t serve me very well. So instead I’ll let the game tell the story, and just lay it out as it happened to me.)

The B-Spline family of curves are a nice way to make smooth curves from a series of control points. The idea is that the control points give you “handles” to pull the curve around; it won’t actually pass through any of them (except for the first and last ones), but it will be attracted to them while retaining some nice smoothness properties.

That much I knew coming in. For various reasons, I wanted to write a program to evaluate B-splines, with a particular eye to rendering them in 3D. This seemed like it was going to be a very easy task; instead, I learned a lot along the way.

December 3, 2008

It seems like I’m about to head into the dark territory of F# “quotations”, which apparently let you play compiler without all that tedious parsing. (Sorry for the absurdly shallow interpretation, but I haven’t played with them much yet.) Anyway, before I get into that, I thought I’d write up my progress on getting nicely rendered space curves.

Segmented Cylinders

Last post, I’d gotten a unit cylinder to render halfway decently, after some compromises (and throwing more triangles at the thing). After that, I had to translate and rotate them where I wanted them. That should have been a trivial step (“I have a unit vector along the x axis, please map it over here”), but turned out to be more trouble than it should have been. (I won’t tell the full story, but here’s a quick runthrough.) The Matrix.CreateWorld method looked very promising indeed, but kept squashing out any z component (for reasons which became clear later; at the time I was passing the z unit vector as my “up vector”). After some hunting, and reading far more than I ever wanted to about quaternions, I fell back to a simpler approach: I want to overlay the 0-1 line on the x axis onto some line p0 to p1. Clearly, I need to translate by p0 and scale by |p1-p0| (that’s supposed to be the length of the line, if my pseudo-typography isn’t coming through). In almost all cases, I also need to rotate; from basic vector math, the axis of rotation will be the cross producti × (p1-p0) (I can’t get the little hat over the i, but that’s the unit vector in the x direction). The amount of the rotation will be the angle between the vectors: acos(i • (p1-p0)). (I’m particularly fond of UK-based sites, since they tend to refer to “maths”. Juvenile, I know.) This all works fine except for the case when p0 and p1 actually lie on the x axis already; in that case, the cross product goes to all zeros, which was what was flattening out my matrices from CreateWorld (at least I think so; I never went back to test the theory). So I ended up with a scale, a conditional rotation, and then a translation. I was sad about the conditional — even though modern shaders can handle conditionals, it still feels like a mark of shame. There’s probably some way to be clever about this; after all, any axis of rotation will do just fine if you’re rotating zero degrees. But I was in a hurry to get something rendering again, so I gave up for the moment.

Anyway, this was enough to get me from oddly-lit unit cylinders all the way up to this:

Sandworm: Curve as discrete cylinder segments

In this case, I’m rendering the first Hermite basis function, since I had code lying around to calculate it. With 20 samples, the curve looks quite smooth (all things considered), but it’s discontinuous, and gets into worse trouble near the ends. (In areas of high curvature, the abruptly changing normals at lines of overlap look distinctly articulated, rather than like a smooth curve.) So the next step was to get at least C0 continuity, and try to do something about the normals.

November 27, 2008

Sometimes per-pixel lighting just can’t save you. That’s my final conclusion, but I thought the journey was interesting, and explored a few different ways of trying to figure out why your shaders aren’t doing what you expect.

This adventure started from a mistaken assumption. In a previous life, I worked a lot with texture mapping the world around a fixed viewpoint. In that setting, the actual geometry of the enclosing mesh doesn’t matter, as long as your texture is either pre-distorted (so that linear interpolation is geometrically correct) or sampled frequently enough (per-pixel computation to the rescue!). Somewhere along the line, I seem to have picked up the idea that as long as the outlines are correct, per-pixel computation will always let you reduce the geometry. That’s true for some extreme definitions of per-pixel computation (if you’re willing to do full per-pixel matrix computations, you can almost render an aligned quad and walk away); however, in the real world you have to be able to defer some computation to the vertex shader and take advantage of all that hardware linear interpolation.

Anyway, my goal is to draw cheap, correctly-lit cylinders, using as few triangles as possible. In the long-term, I want to use this technique for highly tessellated poly-lines, so there are serious upper bounds on the budget I can spend on each individual line segment. Thanks to my assumption above, I figured I should be able to get away with four points sampling a unit circle, and make up the difference in the pixel shader.

Per-pixel shading is necessary for this sort of task from the beginning; calculating color at each vertex just doesn’t cut it when you’re approximating non-linear shapes. Just for the record, though, here was the first result with per-vertex lighting of the “square cylinder”:

Per-Vertex Shading; 4-sided

I figured that everything would be sorted out with per-pixel lighting. In practice, though, here were my results (followed by a high-tessellation version as the “right answer”):

Per-Pixel Shading; 4 sided

Per-pixel; 2048-sided cylinder. The reference image.

These images look similar, but if you flip back and forth (I’ll try to get that working once I know WordPress a little better) you’ll see that the low-tessellation version has a band of brightness about a quarter of the way up the “cylinder”, while in the correct version, this is concentrated at the very bottom of the image.