Stuart Marks on Java, Software Systems, and other topics

That Infernal Scene Graph Warning Message II

In my earlier post on this topic I hinted that we had found a resolution to the issue surrounding the warning message, I hinted further in some of my replies to comments, and I even left it as sort of a cliffhanger as to what the resolution was. So, here’s the resolution.

We’ve decided that when a node is added to a group, that node is automatically removed from the group that previously owned it, if any. (Let’s call this the “auto-remove” feature.) We’ve also decided to turn off the warning message by default, but to have it be enabled optionally, possibly via a system property, for debugging purposes. Finally, we’ve relaxed the enforcement of some scene graph invariants in cases where the group’s content sequence is bound.

What’s the big deal? Well, there were a bunch of things pulling us in different directions. During the development of 1.2, we added code to prevent nodes from appearing more than once in the scene graph or creating cycles in the scene graph. One case in particular caused us a lot of trouble: adding a node to a group while the node was already a member of another group. We fully intended to disallow this case, and require that code remove a node from its old group before adding it to the new one. Unfortunately this caused a lot of our internal code to break. In versions 1.1 and prior, auto-remove was the specified behavior, and a surprising amount of code relied on this. Given the number of cases we ran across internally, we were sure that this would break a lot of external code. For this reason we decided to relax the restriction for this particular case, continue the auto-remove behavior temporarily, and issue a warning message instead. In a subsequent release, we were going to change the warning to an error and to remove auto-remove behavior.

During the development of our current release, we kept running into this issue. A couple of us wrote code that we thought was reasonable, yet it surprised us when the warning message came out! We had a few hallway conversations from time to time, but a clear-cut answer never emerged. Finally, we realized that we had to get the interested parties in a room and have a knock-down, drag-out meeting to resolve the issue. And so on October 7, 2009, Amy Fowler, Kevin Rushforth, Richard Bair, and I got into a conference room to decide the issue. Three hours later — with no breaks! — we had decided. Actually, it was a great meeting, without a lot of conflict. There were just a lot of issues to cover. Each of us came into the meeting with our initial opinions, but the issues were so close that I think each one of us switched sides at least once during the meeting.

Obviously I can’t reproduce the entire discussion here, but the gist of the arguments went something like this:

It’s simpler, more efficient, and more consistent to disallow auto-remove.

On the other hand, 1.1 allowed auto-remove, so this is incompatible.

On the third hand, we expect people to set up scene graphs at initialization time and modify nodes in-place, instead of doing scene graph surgery.

On the gripping hand, some applications really do need to move nodes around in the scene graph.

Well, that’s not too difficult, just remove the node from the old group first.

Sometimes (especially with bind) it’s difficult or even impossible to remove the node from the old group first.

But most of the cases we’ve seen with bind are actually poor coding practices that we want to have emit a warning message.

And so on, back and forth, and around in circles.

Let’s start off with the topic of moving nodes around within the scene graph. What’s the problem? Suppose you have a node n that you wanted to insert into group g. If you do this:

insert n into g.content;

This might generate the warning message if n were already a member of another Group. To avoid this, you’d have to do:

This is a bit subtle but overall pretty straightforward. First we have to test that n.parent is a Group. (It might be a CustomNode instead.) Note also that if n has no parent, n.parent will be null and the instanceof test will fail. If the test succeeds, we can cast n.parent to a Group and then remove n from the content variable. This is a bit inconvenient but not too bad. We could wrap this up in a nice function and use it in a bunch of places. We even considered adding a utility function to the scene graph to handle this.

Things start to get hairy, though, when when you add bind into the mix. What are people doing with bind that makes it difficult or impossible to remove the node from the old group first? Which uses of bind are “poor coding practices” and which are legitimate?

In order to answer these questions, we’ll need to review the semantics of bind. Consider the following:

var p = bind q + r;

If either q or r changes, the expression is recomputed and assigned to p. The rule is that only the subtree of the expression affected by the variable change is re-evaluated. So if q changes, r isn’t recomputed, and instead its saved value is added to q to get the result that’s then assigned to p. This is hard to see if q and r are simple variables, so let’s make the expression a bit more complicated:

var p = bind f(q) + g(r);

where f() and g() are functions. In this expression, if q changes, f() is called with the new value of q, as one would expect. However, the function g() is not called again. Instead, the saved value of the previous call to g(r) is used, added to the new value of f(q), giving the result assigned to p. You can see this by putting println() statements into f() and g() to see when they’re called. Try it!

What does this do? Initially it calls f(xval) and g(yval), then creates a new Point2D instance and initializes its x and y variables to the values obtained by calling the functions. Now suppose xval changes. Naturally f(xval) has to be called again; as before, the saved value of g(yval) is used and function g() isn’t called again. These values are then used to construct a new instance of a Point2D, which is then assigned to p. What happens to the old instance? Well, it still exists (probably) but if it’s no longer referenced, it’ll eventually get garbage collected.

The important point here is that an object literal is an expression that creates a new instance of an object, not unlike calling a constructor. It’s kind of similar to something like this:

var p = new Point2D(f(xval), g(yval));

except that Point2D, being a JavaFX class, doesn’t actually have a (Number, Number) constructor.

Usually we don’t want to create new instances of objects when the bind-expression is re-evaluated. In some sense we might prefer to do something like this:

var p = Point2D { x: bind f(xval) y: bind g(yval) };

Instead of creating a new Point2D object each time xval or yval changes, it would create one object and mutate its variables in-place. This doesn’t work though, since the x and y variables of Point2D are public-init. To bind to a variable initializer in an object literal, that variable must be declared public so that code outside the class can modify it. So if you want to bind something that has type Point2D, you always end up creating new instances.

Most scene graph objects have public variables that can be bound. Consider this:

This works quite nicely. One Rectangle instance is created and assigned to rect. If any of xval, yval, wval, or hval change, the variables in that single Rectangle instance are mutated, and the effect is that the Rectangle moves or changes size in-place in the scene graph. In turn, if you hook these values up to a Timeline or to one of the Transition classes, that’s how you get animations.

We could write this, and it would work, in that we’d get a Rectangle animating in the proper way. But it’s enormously wasteful. Each time any of xval, yval, wval, or hval changes, a new Rectangle instance is created and the old one is thrown away.

Now, when any of xval, yval, wval, or hval changes, a new Rectangle is created. Then, because the content sequence of the Group is bound, the old Rectangle is removed from the scene graph and the new Rectangle is inserted in its place. It’s quite common for an operation to change both the width and height of the Rectangle, by changing wval and hval. Let’s say wval changes first. This creates a new Rectangle and replaces the old Rectangle. Then, hval changes, and another new Rectangle is created and replaces the one that had just been created. Furthermore, replacing things in the scene graph is more expensive than moving pointers around. Additional calculations occur, such as recomputing bounds, invalidating and caching transformations, etc. Some of this can be done lazily (indeed, in the next release, more of it will be). But it’s undeniable that creating new objects, generating lots of garbage, and doing scene graph surgery because of where a bind is placed, is very expensive. It would be a lot more efficient to rewrite the above code like so:

This mutates the Rectangle in-place, doesn’t generate any garbage, and doesn’t do any scene graph surgery. We’ve improved the efficiency of some of our code quite significantly just by moving bind around.

Now, what does this have to do with the auto-remove behavior that we’ve been debating? Let’s take a look at this variation:

Note that the bind is outside the group. This might look convenient, since the scene graph will automatically be updated if any of angle, xval, yval, wval, or hval is modified. But look carefully: suppose that the value of angle changes. Since it’s inside the Group object literal, a new Group object is created and its rotate variable is set to the new the new value of angle. There’s no need to create another Rectangle object, though, since none of its values have changed. Instead, the same Rectangle object is placed into the content variable of the new Group, which is then assigned to g. The old Group is unreferenced and will eventually be garbage collected. Now, who removes the Rectangle from the old Group? Aha!

This is where the auto-remove issue comes up. In the above code fragment, changing the value of angle causes the same Rectangle instance to be placed into the new Group, but it’s not removed from the old Group. As I mentioned previously, in release 1.1 auto-remove was the specified behavior, so placing the Rectangle into the new Group would automatically and silently remove it from the old Group. Furthermore, there’s really no way for application code to get in there and remove the Rectangle from the old Group first; the bind processing pretty much happens all at once. This would seem to be an argument in favor of auto-remove.

But wait, this code is pretty wasteful. It turns out that the auto-remove warning message was actually pretty useful, since it pointed out a bunch of places in our code where we were doing stuff like this: generating useless garbage and performing wasteful scene graph surgery. We ended up rewriting the code along these lines:

True, we had to write bind five times. But this made things much more efficient, since it didn’t create any excess objects and it avoided doing any scene graph surgery. It also got rid of the warning message! So that convinced us to leave the warning message in, to get people to fix their bad code, and eventually to turn the warning into an error and effectively disallow auto-remove.

Or did it?

Like this:

LikeLoading...

Related

19 Responses

Excellent writing!
But I am not convinced by the conclusion…
Yes, the warning message can help in pinpointing bind issues.
But I bet most people will just squarely look at the message, wonder why it is issued, and bang their head on the wall. Or ask messages in forums… :)
So the message is useful only if very explicit, pointing at possible fixes (that’s the hard part!), and profusely documented like you did…

I wouldn’t expect you to be convinced by the conclusion… there is actually still more of the story to be told. We ended up deciding to allow auto-remove and to suppress the warning, but only after considering even more bind cases. Those will have to wait until the next post. (Sorry.) :-)

Great read. And I always “love” those situations where you end up rewriting half the application because you finally figured out that warning.

I understand the balance between ease-of-use and optimalisation, but taking such a performance penalty like that bind example probably will be, compared to the correct version… We want to prevent JavaFX from being branded as “slow” like Java/Swing was, just because of poor coding.

Stuart,
Excellent writeup on the status on the issues you guys go through. I can see why you guys often feel torn. I like how you marked on the calendar October 7, 2009. I’m sure its a day to remember as the team shapes the language and APIs. Keep up the great work!

Hopefully, moving forward the community and the JavaFX teams can continue to communicate and remind the users (us) what is poor coding and correct coding conventions. Imho, I think breaking compatibility on behavior from 1.1 isn’t a bad thing. (similar to Java 1.0.2 to 1.1 days relating to Awt events). I don’t mind the warnings being turned into errors.

Binding, like the line from Spiderman movie. “With great power comes great responsibility.” Sorry couldn’t resist.

Thanks for your comments! I should mention that this is a learning process for everybody. JavaFX is a young platform, and JavaFX Script is a young language. We’re still learning about how to write good libraries and design good APIs, and sometimes we only learn something when we make mistakes. :-) Thanks for coming along for the ride.

At the risk of splitting hairs, you seem to have two separate problems here: (1) in certain situations object regeneration causes problems with binds, and (2) some people have been using binds in inefficient ways. So you’re considering putting an optional warning on the former in order to solve the latter.

Okay, but suppose (hypothetically speaking) at some unspecified point in the future some coding pattern or API is developed that makes it desirable to use the auto-remove mechanism — in other words when an auto-remove occurs it wouldn’t necessarily be a sign of bad coding. Presumably the warning would still be shown?

I guess what I’m saying is that I (tentatively) agree with your conclusion, but not necessarily the way you seemed to have reached it. You should have treated the two problems as separate issues and explored where the arguments took you, rather than entangled them. Clearly a solution for the binding problem was needed, and clearly a mechanism that identifies and warns of misguided bind usage would be very useful, but I think they should be considered separately.

Like I say, splitting hairs :)

Btw, you used Group as your example, but I assume auto-remove has been implemented in the JavaFX runtime as a whole, not specifically the scene graph, right? In other words, it will work for any container classes of any type. (Just checking! :)

Splitting hairs is fine with me. :-) Your distinction between the two separate issues is a good one. I think we probably did distinguish between them in our discussions more clearly than I’ve represented in this essay. But the discussion was pretty fast-moving and we bounced around a lot so it’s hard to say.

I think I may also have misled you regarding the “conclusion” at the end of this post. I was probably trying to be too clever. The end of the post was really an intermediate “plateau” in the discussion process, not the real conclusion, and it differs from our final decision. I had actually stated our final decision at the top of the post, but by the end of the post I suspect that most readers will have forgotten it. :-)

Actually it’s not your fault, but mine. I intended to round up my comment by saying I was eagerly awaiting part III for the full picture, but forgot. :(

On the wider issue: my personal view is that it’s the mechanics of binding that causes the most confusion for newbie JavaFX Script programmers, and a lot of programmers (particularly those arriving from a design background rather than a coding background) would be greatly helped if the JFX runtime had an option to analyse and report on potential binding misuse.

[…] Marks, one of our fellow team mates in the JavaFX controls team, continues his series on ‘that infernal scene graph warning message‘. It’s a must read if you want a glimpse into the thought process that goes on within […]

[…] Marks, one of my fellow team mates in the JavaFX controls team, continues his series on ‘that infernal scene graph warning message‘. It’s a must read if you want a glimpse into the thought process that goes on within […]

Briefly, it makes high-performance graphics difficult to implement, and it muddles the semantics of things that start at the leaf of the tree and proceed upward, such as event distribution.

——-

The idea of “bind pinned” is interesting. I guess it would be “syntactic sugar” for placing a bind keyword had been placed at the position of each initializer. Could be a useful shorthand, I suppose.

I don’t think a compiler switch would be helpful though. The difficulty is that this would a “silent” semantic difference, where code that is written a particular way would have different meanings depending upon its compilation environment. If some code were written with one meaning in mind and other code were written the other way, it would be impossible to compile them together and have things work properly. Or if the compiler switch were inadvertently set the “other” way, mysterious breakage would likely result. That would be a pretty tough debugging session. I think it would be better to have an explicit syntax for declaring one way vs. the other.

Note also that sometimes you do want to create a new object instead of mutating the object in place when the bind fires. One reason is if you care about the identity of the object. This probably doesn’t occur that often, but sometimes it’s important to have new objects created instead of mutated in place. Another reason is that sometimes an object might have public-init variables that can be initialized but not mutated, so the only possible way to “change” them is to replace the object with one freshly created with different values.

We actually discussed this a bit internally. Personally I wouldn’t like such a change, though it might make sense in certain cases such as for Group. The difficulty is reasoning about the language. For instance, consider the following expression:

var a = bind X { b: c };

Now suppose c changes. Is a new instance of X created or is it mutated in place? Well, you can’t tell unless you go off and read the definition of X. Worse, suppose b is public-init, so the compiler generates code to create a new X. If the definition of X changes so that b becomes public (a compatible change) it might subtly affect code that depended on the old behavior.

To me the semantics of bind are subtle enough already, but at least they’re consistent (if dangerous). Adding some special cases might make certain things work better but overall I think it could lead to subtle bugs.

Interesting, kind of like an “autobind” where any use of xval is as if “bind xval” had been written.

With the current construct,

var xval = …;
var x = bind xval;

You can tell by looking at x that it’s going to change spontaneously, whereas if you assign to xval you can’t tell whether it’s just a plain assignment or whether it might have arbitrary side effects. (Consider if the declarations of xval and x are in widely separate pieces of code.)

On the other hand,

var xval = … cascade on replace;
var x = xval;

you have to opposite problem, where if you look at xval you can tell that it will probably have side effects if you assign to it (but it’s hard to tell where), but if you look at x you can’t tell that it will change spontaneously. Not sure which is worse!

Note that today you can do something sort-of similar with bidirectional binds. For example,

var xval = bind x with inverse;
var x;

Here, xval is like “cascade on replace” except that you get to specify exactly one place that’s affect by an assignment to xval … but of course xval will be changed if x changes, which might be very surprising.

I think I’ve seen “bind … with inverse” used only once, and even then it was problematic.

You are correct that the code would be less explicit, but don’t forget that in practice developers will be using a JavaFX IDE (NetBeans, etc.) which can easily highlight bound variables, show which (if any) other variables affect this one, and so on.

The current semantics for “bind” on the outside of an object literal have two problems, I think:
1) It causes a new instance to be created every time a dependency variable changes, which may be unexpected to many developers, leading to code which instantiates many more objects than was intended.
2) To avoid the above issue and get the desired behavior, a developer would have to “duplicate” the use of “bind” in potentially many places inside the object literal.

A possible solution would be to make the creation of new instances explicit, by requiring that the developer writes “bind new” instead of just “bind” on the outside of the object literal. When only “bind” is specified on the outside, the binding would occur as if every public variable inside had it explicitly specified. This would be a more intuitive syntax.

Of course, this would be an incompatible change to the language. But at this point of JavaFX development, I believe it is still doable.