Scenegraph Refactor?

I'm once again diving through the libraries and once again I'm thinking about how this could be done better.

Being well aware that this part of DarkRadiant is probably the most critical, I still think that there must be something that can be done to makes our lives easier.

Orbweaver, do you think it's possible to make a slow transition with the old and a newly setup nodesystem side by side? Or would that be far too cumbersome and pose a serious performance drain?

On a side note: Today I watched angua pasting a large part of her caves map into the bonehoard.map file. In the map there were about 3600 primitives and 550 entities. She selected only 300 of them and the system (2x 1.6 GHz) worked and worked for about 15 seconds before the new selection turned up. Maybe there was the renderer slowing down things as well, but this is not the only reason, I think.

I had a look at the scenelib.h where the Instance class is defined and I think that there is some serious recursion going on. Each time the selection status of a BrushInstance is changed, it calls the selectedChanged() method from the base class. This checks for a parent and if a parent is found, it traverses the whole subgraph and this parent traverses the subgraph again - I haven't dug deeper.

This is completely insane, rather than propagating the information from top to bottom through the hierarchy, the callbacks are pointing across the scene.

What's your estimation on this? Is this another mammut-task like the dmap compilation stuff and should we rather leave it as it is?

She selected only 300 of them and the system (2x 1.6 GHz) worked and worked for about 15 seconds before the new selection turned up.

Oh! Yes, I was going to mention slow selection, but erm.. I didn't yet.

I have a small work map with just 282 brushes and 29 entities, and I was selecting 3 of them, and then inverting my selection to delete the rest. The invert selection takes 8 seconds, and the delete, another 8 seconds. Ouch!

Orbweaver, do you think it's possible to make a slow transition with the old and a newly setup nodesystem side by side? Or would that be far too cumbersome and pose a serious performance drain?

Are you talking about the iscenegraph module? Since this is a module, it should theoretically be possible to have two versions, yes, although first it would need to be refactored into an external DLL since it is currently Radiant-internal.

This is completely insane, rather than propagating the information from top to bottom through the hierarchy, the callbacks are pointing across the scene.

That's what happens when coders don't look any further than their own nose when implementing new functions. They don't spot when something is horrendously inefficient.

What's your estimation on this? Is this another mammut-task like the dmap compilation stuff and should we rather leave it as it is?

I think you're really talking about the SelectionSystem, not the SceneGraph here are you not? It may be that both of these need to be refactored, but you probably know better than me how large a task this would be (although it is likely that the amount of code needing to be removed is larger than the amount of code needing to be re-added).

Yes, part of my observations were about the SelectionSystem, and part of them were about the scene::Instance, scene::Node stuff with its TypeSystem.

I will start some investigations about this and try to get an overview. I really think that we need to change something here, if we ever want to have some features like parenting and grouping.

As a first step, I thought that I would try to move a single inheritance out of the NodeCastSystem and port it over to either static_cast<> or dynamic_cast<> (I also had a look at boost::polymorphic_cast<> and boost::polymorphic_downcast<>, but I don't know if it's worth the hassle).

An example of typical GtkRadiant crap: The Doom3GroupInstance is in principle a Selectable, but it's not inheriting from a Selectable. Instead it's containing an obscure ObservedSelectable as member that is returned when trying to cast the class onto a Selectable. To make this possible, the TypeCastSystem is used. As result, a new coder like me is unlikely to recognize this class as Selectable wondering for half an hour, why this damn class gets selected in the first place.

I suspect that this was TypeCast stuff was put in there to address performance issues - but the way it's used, I highly doubt that this would really mean a performance gain in comparison to dynamic_cast<>.

As a first step, I thought that I would try to move a single inheritance out of the NodeCastSystem and port it over to either static_cast<> or dynamic_cast<> (I also had a look at boost::polymorphic_cast<> and boost::polymorphic_downcast<>, but I don't know if it's worth the hassle).

boost::polymorphic_cast<> is just a wrapper around dynamic_cast<> which throws an exception insted of returning 0, this probably isn't necessary when all of the Radiant cast functions check for 0 anyway.

I suspect that this was TypeCast stuff was put in there to address performance issues - but the way it's used, I highly doubt that this would really mean a performance gain in comparison to dynamic_cast<>.

More likely it pre-dates the addition of dynamic_cast to C++ (or at least the commonplace implementation of it in compilers). I'm pretty sure that whatever was motivating the Radiant developers, performance wasn't it.

An example of typical GtkRadiant crap: The Doom3GroupInstance is in principle a Selectable, but it's not inheriting from a Selectable. Instead it's containing an obscure ObservedSelectable as member that is returned when trying to cast the class onto a Selectable. To make this possible, the TypeCastSystem is used. As result, a new coder like me is unlikely to recognize this class as Selectable wondering for half an hour, why this damn class gets selected in the first place.

Ah! That's a nice one. I had a similar issue with out production code, that has been created by indian developers (the indians were kicked all now because of their crappy code quality ).

I have a class X and this class has an member variable of type Y. The class X also has an override inline for the point operator to return class Y instead. So in the normal code you see stuff like this:

X *var = blabla;
...
(*var)->FunctionOfY()

When I debugged the code I was looking for the FunctionOfY and couldn't find it and was quit confused because I didn't see why a function that doesn't exists doesn't give me a compiler or linker error. The only reference that turned up, was of course the one from class Y. And I spent at least 2 hours investigating how this can happen without giving me a compiler error, until I realized that this operator overload existed.

This is really one of the most crappy ideas, because it simply confuses a programmer who looks at the code as he thinks he points to one class, but in truth points to a completely different class.

I'll rename the class right now, those misnomers are making everything harder than it already is.

If you're looking into EntityKeyValues, you might want to consider whether this should be a private implementation subclass class inside the EntityCreator (like Doom3EntityClass in the eclassmgr plugin) rather than having in a header file which is copied and pasted into practically every module in the application.

I'm also pretty sure it is implemented appallingly, all this needs to be a std::map but I think it is some stupid list of pairs.

I created a new branch for some experiments with the scenegraph (in case you wonder what the new branch is for).

So far I could change the following:
- the BrushNode / PatchNode / WhateverNode classes are now actually deriving from scene::Node instead from scene::Node::Symbiot. This enables the use of dynamic_cast<>, because everything is handled via the scene::Nodes and not the symbiots. I also had to add a virtual destructor to the scene::Node class to make it polymorphic, otherwise the dynamic_cast throws a compiler error.
- the scene::Node::Symbiot class is gone and its single release() method has been merged into scene::Node itself, where it belongs.
- the Node_getNameable() function is already using dynamic_cast. This was probably the easiest step, because most other NodeTypeCast/get() stuff is implemented in a more complicated way.

Excellent work Greebo, I never even bothered to find out WTF the Node::Symbiot stuff was all about.

Hey! Does this mean you are working on a new renderer?

No, the scenegraph is not the same as the renderer -- it is the way objects are stored in memory to make up a map. However it is highly likely that there are inefficiencies in the scenegraph which affect renderer performance, so improving the scenegraph might speed up the rendering in some cases.

At least it might make our lives a bit easier, so it may be worth experimenting.

I have an SVN-related question: Now that I've opened up a new branch, I probably should merge in all the changes from the trunk from time to time. I assume I can do this with the "merge" command as well? Does this work reliably? This might come in handy if the scenegraph refactor branch lives longer than I first expected.

Merge does work reliably, but it can get confusing regarding which revisions you need to merge. It's best to think in terms of the changes between revisions, rather than revisions themselves (i.e. "I need to merge in all the changes to the trunk between the branch revision and head").

I could remove the NodeStaticCast template from scenelib.h, all the occurrences are using dynamic_cast<> instead. This was the easy part, everything else is using NodeContainedCast which is probably much harder of a nut to crack.

I even think I can merge back the current branch state back into the trunk, after I've done some more tests (my quick check has always been successful, the cast is performed correctly and the methods relying on it as well).

It might well be that I don't succeed when moving further from here, so I believe it's worth saving the current progress.

That's fine, the ContainedCasts will be difficult to move because they do not correspond to an underlying cast operation the way the StaticCasts correspond to dynamic_cast<>. Contained cast would have to be fixed by using the Adaptor pattern, i.e. the main Node subclass implements the methods in the required interfaces and then dispatches them to its owned implementing object, rather than returning the actual implementing object itself as it does now.

I could make some nice progress so far: the InstanceStaticCast, InstanceContainedCast and InstanceIdentityCast templates are completely gone and the CastTables are empty and unused. I'll have to look now into cutting them out of the codebase.

edit: InstanceContainedCast is history along with a few associated templates. Instances are completely migrated to dynamic_cast<> now.

More likely it pre-dates the addition of dynamic_cast to C++ (or at least the commonplace implementation of it in compilers). I'm pretty sure that whatever was motivating the Radiant developers, performance wasn't it.

Yes, this special type of casting was introduced in the beginning of 1.5 because the early C++ compilers had crappy rtti support, some still have. I don't know on which pattern, if any, this system is based.In my opinion, the real problem is the wrong usage of this mechanism, many classes have installedcasts which just return a reference to a membervariable.This messes with the semantics of inheritance and delegation,a delegation should be treated as such and not be hidden as pseudo-inheritance.

So far performance was never been kept in mind when writing 1.5, it was more playing catchup with featuresets and bugfixing most of the time. Sad, but true.

If you feel the urge to implement a new rtti-system, I could post some source of my engine which hasits own rtti-sys as "inspiration"-source. It works with dispatching based on virtual and static functions and does not requiere any variables per instance, just one integer per-class.

Yes, this special type of casting was introduced in the beginning of 1.5 because the early C++ compilers had crappy rtti support, some still have.

I strongly suspected this was the case, since GtkRadiant is an old codebase and had to support Microsoft compilers which seem to be even less standards-compliant than other compilers of their generation.

If you feel the urge to implement a new rtti-system, I could post some source of my engine which hasits own rtti-sys as "inspiration"-source. It works with dispatching based on virtual and static functions and does not requiere any variables per instance, just one integer per-class.

That's kind of you, but I'd much rather stick with the standard dynamic_cast rather than having to implement a home-grown equivalent. So far I don't have any reason to suspect that this will be a problem using GCC (it hasn't been up to now).

I'm down at 13 occurrences of NodeContainedCast - fingers crossed, I will have removed the TypeCast system in the evening.

Good god, the whole system? I'll be really glad to see the back of that, it means I might actually be able to move the scenegraph module into a separate DLL, which in all previous attempts has failed with some junk about the TypeSystem.