My last post comparing Ogre3D to OpenSceneGraph seems to have directed a fair amount of traffic towards this blog, so I wanted to write a follow-up comparison, now that I’ve actually used both for quite a while. Ogre3D and OpenSceneGraph have similar features – animation, shaders, particles, terrain, shadows, etc. – but are very different in their overall design, their respective capabilities, and the communities that grew up around them. I am not going to compare each individual feature (that would make for far too long a post!) but will instead compare them from a high-level perspective. We are going to look at the latest stable versions, that is Ogre3D 1.9 (released on November 24, 2013) vs. OpenSceneGraph 3.4.0 (released July 20, 2015).

Let’s take a look at the compelling aspects:

Reasons to prefer Ogre3D

1. Ogre3D is well documented

Quality of documentation can be subjective as it really depends on the type of documentation you prefer – some people prefer written tutorials, some people rely on a Wiki with code snippets, and others just learn by doing, looking up API references when necessary. That said, generally speaking I would consider Ogre3D to be better documented than OpenSceneGraph. The API reference is more extensive and verbose, plus we have the excellent manual written by Ogre3D’s founder, Steve Streeting, that covers the entire engine in-depth.

The main source of documentation for OpenSceneGraph is their mailing list / forum, and a (very outdated) wiki. I wouldn’t strictly consider a mailing list “documentation”, but the quality of information you get from there is very good. Often times I search for a particular subject and the first result is a mailing list post from OSG author Robert Osfield, explaining a particular subject or feature in all detail just like a book would.

Speaking of books, there are some books available covering Ogre3D and OpenSceneGraph, but I haven’t read any of them so can’t attest to how good they are.

2. Ogre3D is easy to use

When I first started with 3D graphics, Ogre3D was one of the first engines I’ve used. It was very easy to find your way around and get a basic scene going, and from there I just worked my way up to where I am today.

In OpenSceneGraph, you’ll often find more than one way of achieving the same result – which is a testament to its great flexibility, but can be a downside for the learning curve. Want to move or rotate a scene element? Here are 3 different transform nodes you can choose from. Want to change the appearance of a model? Here are 20 different StateAttributes, good luck finding the one you need. Want to hide a model? Go ahead and change its “node mask”. Want to move the camera manually? You’ll need to somehow calculate your “view matrix”. For a beginner in 3D graphics, all this could be overwhelming. In Ogre3D, these kinds of tasks are more streamlined, which eases the learning curve.
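To make the “node mask” idea a little less mysterious, here is a minimal toy sketch of the principle. It is not OpenSceneGraph code: ToyNode and countVisited are names I made up for illustration. Each node carries a bitmask, and a traversal skips any node (and everything below it) whose mask shares no bits with the traversal’s own mask.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for a scene graph node; not an OpenSceneGraph class.
struct ToyNode {
    std::uint32_t mask = 0xffffffffu;  // all bits set: visible to every traversal
    std::vector<ToyNode*> children;
};

// Counts the nodes a traversal with the given mask would visit.
// A node and its whole subtree are skipped when the masks share no bits.
inline int countVisited(const ToyNode& node, std::uint32_t traversalMask) {
    if ((node.mask & traversalMask) == 0) return 0;
    int visited = 1;
    for (const ToyNode* child : node.children)
        visited += countVisited(*child, traversalMask);
    return visited;
}
```

In this model, hiding a model just means setting its mask to 0, and using different bits for different traversals lets you hide something from one kind of traversal while keeping it visible to another.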

3. Ogre3D supports many rendering systems

In Ogre3D, calls to the graphics hardware are hidden away behind a “Rendering system”. There are multiple implementations for this rendering system, and before running an Ogre3D application the user can typically choose which rendering system to use. The currently available rendering systems are:
– OpenGL
– OpenGL 3.x (still in “beta” as of Ogre3D 1.9)
– OpenGL ES
– Direct3D 9
– Direct3D 11 (still in “beta” as of Ogre3D 1.9)

The idea is to give the user more choice while requiring no additional effort from the application writer. This choice can be useful because some graphics drivers on Windows are said to perform better with the Direct3D API, while on other OS’s OpenGL is your only option.

Unfortunately, this extra choice doesn’t come for free – Ogre3D is pretty good at abstracting away the differences between these rendering APIs, but there are some things that can’t be abstracted away. For example, in Direct3D you have to write shaders in the HLSL language, but OpenGL uses the GLSL language. These languages are pretty much the same aside from syntactic differences, but it’s still frustrating to try and support both – you’ll either have to write all your shaders twice, or use your own abstractions to generate API-specific shader code.

So generally speaking, an Ogre3D application will still be written with one (or more) of these rendering systems in mind, and the more of these you want to support, the more of a maintenance burden you will have.

OpenSceneGraph on the other hand supports only OpenGL, OpenGL 3.x and OpenGL ES. Also, the choice is made at compile time rather than runtime, making the decision up to whoever is compiling the package rather than the end user.

4. Ogre3D is fun to play around with

This is a bit hard to explain, but when I have a look around the Ogre wiki and forums, I see tons of posts with code snippets, material scripts, compositors and shaders that people have created “just for fun” and decided to share with the community.

I think one reason we’re seeing this tendency is because Ogre3D is a simpler engine, and the process for creating materials and compositors is all streamlined. You can write a material script or compositor script as a simple textfile, any other Ogre3D user can then use it simply by dropping the file into their asset folder.

OpenSceneGraph does feature material scripting as well, but the format is different. It’s part of their general purpose scene serialization library, so the material script files look a bit more abstract than the data-driven format that Ogre3D uses. OpenSceneGraph does not feature compositing scripts, although there is a community addon for that purpose.

Maybe another reason that we’re seeing more “snippets” for Ogre3D is that it’s more commonly used by hobbyist game developers that are just tinkering around – and thus more willing to share their creations – vs. OpenSceneGraph having a large community in the professional 3D space, people that are creating actual products and have busy lives.

Reasons to prefer OpenSceneGraph

1. OpenSceneGraph is flexible

As I hinted in the last article, OpenSceneGraph has a lot more functionality built into its scene graph than Ogre does – callbacks, node masks, node visitors, the ability to create custom nodes and custom StateAttributes, and the ability to attach a node to multiple parents, to mention a few. So what is all that good for? I certainly don’t expect a beginner to get use out of all these features, but they’re part of what makes OSG so extensible and customizable for your own needs. In addition, a flexible core library serves as the foundation for adding more features as “modules” that the core does not depend on, which we will look at in the next bullet point:

2. OpenSceneGraph is modular

Modularity is an important aspect when looking at the usability of a software library. A non-modular piece of software will become useless if one feature in it is not up to your requirements. With a modular design on the other hand, the user can simply decide to not use that particular module and swap it out for a different module, or create his own module.

OpenSceneGraph is incredibly well designed from a modularity standpoint. The core libraries are: “osg” handling the scene graph structure, “osgUtil” implementing rendering tasks and algorithms, and “osgViewer” handling the platform-specific render window creation. In osgViewer, you can easily plug in your own implementation of an osgViewer::GraphicsWindow, which allows for painless integration of OSG with third-party windowing code like SDL2. In Ogre3D this kind of integration is more difficult to pull off (I won’t go into details here, but have a look around for Ogre-SDL-QT-whatever integration guides – the short version of it is that it’s platform and rendering system dependent, and there are some bugs).

So called OSG “node kits” handle more specific features – shadow mapping (osgShadow), particles (osgParticle), animations (osgAnimation), etc. The core library doesn’t depend on any node kits, so the user can pick and choose whatever kits they want or write their own replacement kits. The “osgDB” module handles the reading and writing of files and management of a file “library”, but its use is optional – you can just as well plug in your own resource system by passing std::ifstream*’s to the OSG codecs. Speaking of codecs, various “plugins” that are loaded at runtime handle the reading and writing of various image, video and model formats.

Compare that to Ogre3D, which is modular to a certain degree (e.g. the rendering systems, terrain system, and codec plugins) but falls far short of what OpenSceneGraph has to offer in that respect. The biggest problem for me was that the resource system and the animation system are hardwired into the OgreMain core and so are not replaceable. This was one of the reasons we eventually decided for OpenMW to ditch Ogre3D and port to OpenSceneGraph.

3. OpenSceneGraph is fast

OpenSceneGraph is probably the fastest free and open source 3D engine in existence. You’re not going to find extreme optimizations like SIMD instruction sets or “AZDO” OpenGL hacks that would add a lot of complexity. Instead, OpenSceneGraph is simply fast by design, because it can run its drawing commands in parallel with scene culling, animation and whatever else the user is running in the main thread. The drawing commands themselves are also quite fast. OpenSceneGraph uses a state graph to manage OpenGL state, minimizing the number of redundant state changes. OpenSceneGraph also includes some auxiliary tools to help you get the most performance out of it: a scene graph optimizer tool, and a profiling overlay.
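As a rough illustration of why grouping by state pays off, here is a small self-contained sketch. It is a deliberate simplification: OpenSceneGraph’s real state graph is a tree of StateSets rather than a flat sort, and ToyDraw, countStateChanges and sortByState are hypothetical names, not OSG API. It only shows that drawing objects with identical state back to back reduces the number of state switches.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical draw call: 'stateId' stands for one complete set of render
// state and is assumed to be non-negative.
struct ToyDraw { int stateId; int meshId; };

// Number of times the renderer would have to switch state for this order.
inline int countStateChanges(const std::vector<ToyDraw>& draws) {
    int changes = 0;
    int current = -1;  // -1: no state applied yet
    for (const ToyDraw& d : draws) {
        if (d.stateId != current) { ++changes; current = d.stateId; }
    }
    return changes;
}

// Groups draws with identical state next to each other. The sort is stable,
// so the relative order of draws within one state is preserved.
inline std::vector<ToyDraw> sortByState(std::vector<ToyDraw> draws) {
    std::stable_sort(draws.begin(), draws.end(),
        [](const ToyDraw& a, const ToyDraw& b) { return a.stateId < b.stateId; });
    return draws;
}
```

With four draws alternating between two states, the unsorted order needs four state switches and the sorted order needs two.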

When OpenMW was ported from Ogre 1.9 to OpenSceneGraph, I measured 300-400% of the previous framerate in terms of “raw” drawing performance, and in the end 250% of the previous framerate once all the other subsystems were back in place. This was a particularly complex port though, so it’s not a fair comparison by any means. Please do your own benchmarks and draw your own conclusions!

The upcoming Ogre 2.x versions are supposed to resolve performance problems in 1.x, but since we are looking only at the latest stable versions we can’t look at it in this article (and I don’t have any experience with Ogre 2.x, either).

4. OpenSceneGraph is stable

“Stability” of a software library refers to the ability to upgrade your applications to a newer version of the library with minimal hassle. OpenSceneGraph has many users in the professional 3D space that have a vested interest in keeping their products updated, thus stability is important – and this is something OSG author Robert Osfield takes very seriously. Every change is carefully reviewed from a compatibility standpoint before it gets merged.

As for stability in Ogre 1.x versions, my experience has been okay-ish… there were a couple of incidents that I felt were unnecessary: pointless renamings that broke old user applications (“StringUtil::BLANK” to “BLANKSTRING”, particle API renaming), and a massive breakage of the resource API (that was thankfully fixed before the 1.9 release). Also note that the future of Ogre with the upcoming 2.0 and 2.1 versions is anything but “stable” – the changes are so massive that porting over a large application that used 1.x can be a scary undertaking. These drastic changes were made to resolve all the various performance bottlenecks in Ogre in one fell swoop.

Conclusion

We can’t say that one or the other engine is “better” – your mileage will vary depending on the things you use it for. Both are excellent engines and – with enough effort – can be made to do whatever you want, really. If you’ve already started a project in either engine, there is most likely no compelling reason to switch. If you are new to 3D rendering, I’d recommend starting with Ogre and see where it takes you. If you already have a bit of 3D experience, you should be able to make up your own mind based on the arguments above. I hope this post helped you decide either way. Happy coding!

Ok, so in the last post about my new PC, you may have got the gist that I wasn’t perfectly happy with the all-in-one Corsair watercooler. I did get the pump noise to a semi-acceptable level, but then there was still the annoying sound of the fans, and fans ramping up when the system is under load. I didn’t want to go back to air cooling either (my overclocks!), so the obvious next step was to build a custom watercooling loop. Custom loops are a lot more expensive than all-in-one pre-filled watercoolers, but do much better in terms of cooling performance and quietness.

Watercooling your PC isn’t as dangerous as it sounds. The chance of a leak in a properly installed loop is very slim, and even if a leak does happen, there’s a decent chance you’ll notice it before your components get damaged. There’s also a chance your hardware would still work fine even with puddles of water over it. Distilled water in itself isn’t conductive, however it can ionize over time due to contact with all the metal parts in the loop. That said, there is still a tiny chance of frying your hardware you’ll just have to live with. Personally, in the event that happens I would just take it as an opportunity to upgrade to newer hardware and forget about it.

Going into this, I was still a bit scared of the process. I am kind of a clunky person when it comes to handiwork, but it turns out the installation wasn’t difficult at all. In fact, planning out the loop and picking out components was more work than the actual build.

Since this is primarily a workstation computer, I decided to leave the videocard out of the loop for now, mostly because the fans don’t spin anyway when the videocard isn’t under heavy load, and because video card water blocks are really expensive.

To get a basic water cooling loop going, you will need:
– Coolant: a liquid substance moving through the loop. Most people recommend distilled water. You can also get colored fluids, but they are notorious for gunking up your loop and being harder to clean out.
– Pump: moves coolant, usually with rotor blades. Pumps are really silent these days, but it’s still recommended to decouple (e.g. put on a piece of foam) the pump so that vibrations do not transfer to your PC chassis.
– Reservoir: a tank feeding the pump. Technically, you could run a loop without a reservoir, but having one makes filling the system, bleeding (the process of getting air bubbles out), and monitoring the water level a lot easier. Some reservoirs have integrated pump mounts.
– Tubing: d’uh.
– CPU block: a metal block cooling your CPU, transferring the heat into the water.
– Radiators: Radiators cool down the water by transferring heat into the air. Most radiators are meant to have fans attached to them.
– Fittings: connects tubing between your CPU block, radiators, reservoir and pump top.

The design of your loop will depend on many factors, such as what hardware you’re cooling, how much space you have available in your PC case, whether or not you want a silent system, whether or not you’re overclocking, etc.

I highly recommend JayzTwoCents’ beginner’s guide to watercooling, but have a look around for other resources too. MartinsLiquidLab and SilentPcReview are good resources for individual part reviews. Once you’ve picked out parts, ask on a forum (e.g. overclock.net) for feedback; you want to be sure you are not buying incompatible or nonsensical parts.

With all the parts in front of you, assembling everything should only take a day or two. Then you’ll want to perform a leak test over night (with your hardware still unplugged), and the next day you can enjoy the awesome performance, quietness and badass look of your watercooled PC!

[Image: plumbing done]

The total cost of my loop was just under 500€. I could probably have built one for 200€, but I wanted to pick the highest quality parts and also went with one more radiator than really necessary, just so I can run the fans at slower (silent) speeds. To put the price into perspective, keep in mind that having your hardware extra cool will increase its lifespan; also, watercooling parts will generally last you forever. The only moving part is the pump, and even these things are typically rated for about 5 years of 24/7 operation.

Is watercooling worth it from a cost perspective? Not really. But it’s a fun tinkering hobby, and the only option for folks not willing to compromise on either the performance or the quietness of their PC.

So this is just a quick write-up of important Chromium extensions that I currently use, just for my own reference in case my user profile suffers a catastrophic failure or something. But maybe there are a couple of extensions in here that you guys didn’t know about and might find useful, so I figured I’d post it on the blog. Here we go!

Have you ever written a lengthy text post, hit “submit” and had the whole thing disappear because the connection timed out, or the forum asked you to log in, or told you that you cannot post more than once every 30 seconds? Lazarus is here to bring you salvation. Everything you type into a text field is saved automatically and can be restored by hitting the icon in the upper right corner, even if you have refreshed the page or browsed elsewhere. This extension has saved my ass so many times, it’s amazing.

Blocks social network buttons such as “Tweet this!” / “Share this!” from many sites. I don’t use social networks, so the buttons are useless for me. Even if I did use social networks, I would still find those buttons clunky and useless. They can also (and often do) track the sites that you visit.

This is an extension I only found today, wish I had known about it sooner!

Ever get annoyed by those pesky “Our site uses cookies” pop ups that many sites are riddled with nowadays? Unfortunately these are mandated because of a silly law in the EU. The whole idea is just broken. 99.9% of today’s websites use cookies, so I really don’t need to know that. Especially annoying when the notification is a modal dialog that prevents me from using the site until I grudgingly click “Accept”. And the fact that I accepted the notification is of course going to be stored as… you guessed it… a cookie. So for anyone who wants no cookies, or browses anonymously, the usability of the site will be crippled.

This extension effectively removes these annoying pop ups. If you do come across a site that the extension didn’t work for, make sure to right click and select “Report a cookie warning” so the developer can fix it.

If you don’t care about cookies, then this extension is for you. Even if you do care about cookies, you should still get this extension, and just use your browser’s cookie notification facilities in place of the site’s. It’s a win/win for everyone!

Finally done setting up my new computer. I bought it over two months ago, but it took me a long time to get it running as intended. I got shipped a damaged case, then later had the mainboard break after only two weeks of using it. I have also been struggling with a noise issue on the watercooling unit, which is finally resolved now.

As for Linux support on this new hardware, my experience has been flawless, save for one component.

Corsair H100i GTX on Linux

The Corsair water coolers only partially support Linux. Partial support meaning, the unit will run fine out of the box, but you will not be able to use the Corsair Link software to adjust fan speed, pump speed and LED color. There is an open source replacement for Corsair Link under development, but it does not appear to support the GTX models yet.

I long debated whether to get an RMA for my unit. The pump was very loud in the beginning and also made an unhealthy-sounding grinding noise at higher RPMs. The grinding problem fixed itself after a few weeks of usage. It was, however, still slightly too loud for my taste.

Finally, I noticed that I could run Corsair Link on Windows, set the desired preferences, then reboot back into Linux, with the preferences still persisting. This way, I set the pump to Quiet mode and now I’m happy with the noise level. The pump is almost inaudible over the fans now.

I was able to do this even in a virtual machine, using Windows 8 in VirtualBox. To make the VM recognize the Corsair Link internal USB device, you’ll need to add your (host) user to the vboxusers group, install the VirtualBox Extension Pack (not to be confused with the VirtualBox Guest Additions! I’m not sure though if this step is actually necessary), then enable the USB device in the virtual machine’s preferences.

Overclocking results

Nothing too impressive for an i7-4790k. I couldn’t get it stable at 4.7 GHz without going over 1.4 volts, which is a little more than I’m comfortable with – especially considering the cooler is still running on Quiet settings. Reading through results from other people, I seem to have gotten a below average chip. Most i7-4790k’s still cap at 4.7 GHz, though.

Frequency               Core voltage
4.4 GHz (stock Turbo)   1.150 V
4.5 GHz                 1.200 V
4.6 GHz                 1.350 V

I’m going to stick with 4.6 GHz for day-to-day usage. I probably could have achieved the same with high-end air cooling, which kind of puts the money spent on the H100i to waste. Well, at least it looks cool.

I am well aware that 1.35 volts will affect the lifespan of the chip, but that’s fine with me. If it breaks, I’ll simply have an excuse to upgrade again. Which makes sense anyway, given that Skylake chips have been released now.

Conclusion

Very happy with the results of this upgrade. I am seeing ~2.6 times the framerate in OpenMW compared to my old rig, and an even larger improvement in compile times. I can “make -j8” the whole project in about 2 minutes and 40 seconds. While there are still “faster” CPUs with more cores, you would sacrifice single-core performance, which is very important for the linker and single threaded applications in general (so almost every application). So for me, the 4790k is the best CPU that money can buy right now (ignoring the i7-6700k, which was unfortunately released after I purchased my system).

The graphics card might be overkill for what I’m doing right now, but at least it runs very quiet and has great power efficiency. I don’t play a lot of games these days, but I picked up Metro 2033: Redux the other day just to stress the card a bit, then promptly got lost in the game. It’s a great game, definitely the most immersive game I’ve played to date, and I would say the GPU purchase was worth it alone just to max out that game.

The GTX 970 STRIX edition also features a passive cooling mode when the GPU is not under heavy load. With OpenMW running at a framelimit of 120 fps, the GPU isn’t even stressed enough to start spinning its fans. Pretty impressive!

Did I mention that the PC looks awesome? I guess looks are not a deciding factor, but a nice bonus. My desk is generally quite empty these days, so I don’t mind having a little something to look at. I might get rid of the LED lighting later if I find it too distracting, but so far I’m liking it!

For my next build (maybe in 3 years or so) I plan on building a custom watercooling loop, just for the heck of it (and so I can overclock the sh*t out of my CPU while still being near silent). But this one will do for now.

A week into using OpenSceneGraph, I like to think I have a pretty good grasp on its concepts and internal workings. Having used Ogre previously helps a lot, but there are a few Ogre concepts that I’ve had to un-learn in order to proceed. This is not by any means a comparison of which library is “better”, nor do I think these are points the Ogre team should be improving on – it’s better to improve on the aspects that set the two libraries apart, in my opinion. The design goals are also different, with Ogre more focused on ease of use, and OSG more focused on flexibility.

Graph-based

In Ogre, the scene graph is mostly used to manage the derived transforms of the Renderables attached to leaf nodes, and that’s it. In OSG, the scene graph is a much more integral part of the library’s design and feature set. A few examples:

StateSets

In Ogre, a Material contains all possible render state in it, and Materials can only be set on Renderables, i.e. leafs.

In OSG, any node can have a StateSet attached to it. A StateSet behaves more like an std::map or std::set in that it only contains the render state you requested, and none else. For state that you haven’t set up, the renderer will effectively use either the parent’s State, or if no parent defines this State either, then the OpenGL default state. The State of a child node will override the State of its parent, unless the parent has an “override” flag set on it.
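The lookup rule described above can be sketched in a few lines. This is a toy model under my own assumptions, not OSG’s implementation: ToyValue, ToyStateSet and resolveState are made-up names, a piece of state is reduced to a single int, and the path is just a list of state sets from the root down to the leaf.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model only; these are not OpenSceneGraph types.
struct ToyValue {
    int value;
    bool overrideChildren;  // stands in for the "override" flag
};

// Like a map: holds only the state that was explicitly set on a node.
typedef std::map<std::string, ToyValue> ToyStateSet;

// Resolves one piece of state along a path from the root (front) to the
// leaf (back). Falls back to 'defaultValue' when no node on the path sets it.
inline int resolveState(const std::vector<ToyStateSet>& path,
                        const std::string& key, int defaultValue) {
    int result = defaultValue;
    bool locked = false;  // true once an ancestor used the override flag
    for (const ToyStateSet& set : path) {
        ToyStateSet::const_iterator it = set.find(key);
        if (it == set.end() || locked) continue;
        result = it->second.value;           // child replaces the parent's value
        locked = it->second.overrideChildren;
    }
    return result;
}
```

A child that sets the same key wins over its parent, a child that sets nothing inherits, and once a parent has marked a value as overriding, nothing further down the path can change it.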

I really like this approach: it makes it easy to toggle state on an entire sub-graph, whereas in Ogre you’d need to walk over all renderables and change their material, being careful not to affect any unintended renderables when materials are shared.

It also allows for better optimization of the renderer itself, by reducing redundant state changes, and rarely used state like stencil settings will have zero overhead if they’re not contained in any StateSets.

Passes and techniques

One side effect of the StateSet approach having children override their parent’s state is that multipass rendering is no longer inbuilt into the StateSet. A StateSet effectively describes a single pass only. In Ogre, each material has a vector of Techniques and each Technique a vector of passes, which then contains the actual render state.

In OSG, multipass rendering is handled by a special node, which traverses its children N times when N passes are defined, and sets the StateSet of the appropriate Pass on each traversal.
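Stripped of everything else, the traversal order looks roughly like the sketch below. It is only a toy (ToyPassNode and traverse are hypothetical names, not the osgFX API), with strings standing in for state sets and drawables.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy sketch; not the osgFX API.
struct ToyPassNode {
    std::vector<std::string> passes;    // one (named) state set per pass
    std::vector<std::string> children;  // names of the drawables below this node
};

// Returns the draw order as "pass:child" entries: all children are
// traversed once per pass, with that pass's state applied.
inline std::vector<std::string> traverse(const ToyPassNode& node) {
    std::vector<std::string> drawLog;
    for (const std::string& pass : node.passes)
        for (const std::string& child : node.children)
            drawLog.push_back(pass + ":" + child);
    return drawLog;
}
```

The children themselves know nothing about passes, which is the point: the pass logic lives entirely in the one special node.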

Being able to define multiple passes on an entire sub-graph rather than just on leafs is very powerful, and the implementation is in the stand-alone osgFX component, which means the renderer itself doesn’t have to deal with the complexity of multiple passes/techniques, nor do users who do not need multiple passes/techniques need to know that the feature even exists.

Nodes can have multiple parents

This seemed a bit strange when I first came across it. Nodes can have more than one parent, which means the node will be traversed (and rendered) multiple times, similar to the multipass implementation. Since each traversal is coming from a different node path, it can have a different world matrix, different State, etc. One example use for this is to create clones of a part of the scene at different positions, with very low memory usage.
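Here is a small sketch of the cloning use case, assuming a toy node type of my own (ToyDagNode and collect are not OSG names) and a 1D offset in place of a full world matrix. One drawable is attached under two parents and ends up being visited twice, at two different positions, while existing only once in memory.

```cpp
#include <cassert>
#include <vector>

// Toy node: a 1D offset instead of a full matrix keeps the sketch short.
struct ToyDagNode {
    float offset = 0.0f;
    bool drawable = false;
    std::vector<const ToyDagNode*> children;  // a child may sit under several parents
};

// Collects the world position of every drawable reached, once per node path.
// Like the real thing, this recurses forever if the graph contains a loop.
inline void collect(const ToyDagNode& node, float parentPos, std::vector<float>& out) {
    const float pos = parentPos + node.offset;
    if (node.drawable) out.push_back(pos);
    for (const ToyDagNode* child : node.children)
        collect(*child, pos, out);
}
```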

Obviously there are more sophisticated techniques for cloning geometry, like static geometry batching and instancing, but they have their own drawbacks, in particular with high memory usage, culling problems, micromanagement overhead, and not being able to properly set up the closest point lights for each individual mesh, which was a big issue in OpenMW.

Of course, with more power comes more responsibility, so you need to be careful not to create loops in your graph (I tried it – stack overflow).

Streamlined vertex formats

osg::Geometry only deals with non-interleaved vertex formats, i.e. vertices, normals, and UV coords are all in separate buffers. This makes it much easier to implement SW skinning/morphing or passing the vertices to your physics engine for raytesting purposes. Note that should you really need interleaved vertex data, it can still be used by creating your own Drawable class.
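To show what the difference means in practice, here is a sketch that splits an interleaved buffer into separate arrays. The layout (3 position floats, 3 normal floats, 2 UV floats per vertex) and the names SeparateArrays and deinterleave are assumptions for the example, not something OSG or Ogre prescribes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed interleaved layout: x y z | nx ny nz | u v  (8 floats per vertex).
struct SeparateArrays {
    std::vector<float> positions;  // 3 floats per vertex
    std::vector<float> normals;    // 3 floats per vertex
    std::vector<float> uvs;        // 2 floats per vertex
};

inline SeparateArrays deinterleave(const std::vector<float>& interleaved) {
    const std::size_t stride = 8;
    assert(interleaved.size() % stride == 0);  // only whole vertices
    SeparateArrays out;
    for (std::size_t i = 0; i < interleaved.size(); i += stride) {
        const float* v = &interleaved[i];
        out.positions.insert(out.positions.end(), v, v + 3);
        out.normals.insert(out.normals.end(), v + 3, v + 6);
        out.uvs.insert(out.uvs.end(), v + 6, v + 8);
    }
    return out;
}
```

Once the data is in this form, the positions array can be handed to a physics engine or a software skinning routine as-is, without the caller having to know about strides and offsets.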

No custom allocators

All allocations are done using the standard new, whereas in Ogre you have to use macros like OGRE_MALLOC / OGRE_MALLOC_T whenever passing allocated memory to the library. By default Ogre uses the nedmalloc allocator, which at the time it was implemented promised a performance boost on some platforms, but should be obsolete by now, as stated on its homepage:

(Windows 7, Linux 3.x, FreeBSD 8, Mac OS X 10.6 all contain state-of-the-art allocators and no third party allocator is likely to significantly improve on them in real world results).

It might be about time to switch Ogre back to using standard allocators, and while at it, maybe remove the custom allocator system altogether, because it does add some complexity for the user to deal with.

Build tools

Both Ogre and OSG are using CMake, so it can be interesting to compare how they’re using it.

I said this wouldn’t be a competition, but here I found the difference in complexity and platform specific workarounds to be staggering. OSG is the clear winner in this category… sorry. See for yourself:

So, I’ve been asked about my thoughts on the Ogre 2.1 release, and the implications for OpenMW. Here they are.

OpenGL3 requirement

The 2.1 branch has dropped all support for OpenGL2. OpenMW was never meant to run on hardware that originally ran Morrowind – still, dropping GL2 support seems a bit extreme at this time. For example, the open source Mesa drivers haven’t caught up to that point yet (at least not on Ubuntu 14.04, and thus not on one of my dev machines). Admittedly this may become more or less irrelevant a few years down the line.

Diverging branches

Development is currently split across 3 different unstable branches (1.10, 2.0 and 2.1). 2.0 and 2.1 are lagging behind the 1.10 branch with 500+ unmerged fixes. This makes contribution unattractive to say the least.

New material system

This is the feature I was looking forward to the most, but now I’m a little disappointed. The shader macro system is very basic and does not appear to support setting a uniform to a material property or to a scene parameter. Most of the work is done by the C++ implementation which is tightly coupled to a specific shader.

The default Physically Based Shading materials are nice but obviously not usable with Morrowind’s lighting system. Coding custom materials is possible but not without intimate knowledge of the underlying AZDO backend. The default PBS material C++ implementation comes at a whopping 2000+ lines and makes use of a “command buffer” and “vao manager” that we can obviously not expect Ogre users to be familiar with.

This also goes against the principle of making the shaders as user-modifiable as possible, one of my main goals with the rendering backend in OpenMW.

The general issue is trading flexibility and fast prototyping for hardcore performance. Overall, not something I am comfortable using, so I started looking for alternatives.

Meet OpenSceneGraph

OpenSceneGraph is an established high performance 3D graphics engine using OpenGL. I took a detailed look and fell in love with its design. All features we need are provided:

– Material stencil support: Ogre does not support stencil settings in its material system, which can be used by NIF files. OSG does support it. Admittedly not a very prominent feature, still nice to have.

Other notable changes

No resource system

OSG does not require a resource system, all loading can be done by the user. Hell yes! Ogre’s resource system was always ugly in that it expects everything (be it a material, mesh or texture) to be named, which *is* a real problem when internal resources conflict with user-defined resources. There are also inherent problems with Ogre’s resource manager being a singleton; we need multiple resource managers in OpenCS, one for each document.

In fact, the Ogre folk realized this mistake long ago and started a “Resource system redesign” GSOC project, which unfortunately to this day has not been finished.

No DirectX

Dropping support for this particular render system has been on my list for a while. If it were not for Ogre’s terrible OpenGL performance, this would have been done a long time ago. I look forward to being able to write shaders directly in GLSL rather than using the compatibility header we have in place. Another problem was that bugs cropping up with that renderer cannot be fixed by me and tend to accumulate.

In the past few days I’ve been taking a closer look at Ogre 2.0, to evaluate if and when we can start using it. I’m now confident that a prototype is feasible. Some work has been done on the prerequisites and an outline has been created for problematic issues. I’ve also listed some new features in Ogre 2.0, besides performance improvements, that are particularly exciting for us.

Dependencies

The first step is to port MyGUI. Luckily its use of Ogre is fairly minimal and abstracted via a Platform interface. It didn’t take long to get most demos working and I arrived at the following patch. One interesting change is the rendering callback. Previously, MyGUI registered a RenderQueueListener with Ogre’s scene manager to be notified when the “Overlay” render queue is hit. This no longer works in Ogre 2.0, because empty render queues are skipped. Curiously, I found the following comment in the Ogre source code (in 1.x versions, too!):

// NB only queues which have been created are rendered, no time is wasted
// parsing through non-existent queues (even though there are 10 available)

My only explanation for how MyGUI rendering could have actually worked in the first place is a bug in Ogre 1.x, causing empty queues to be traversed anyway – contrary to what the comment says.
After changing MyGUI to render from the frameRenderingQueued callback, it works. This is not a clean solution, and I’d like to leverage the compositor system instead – but custom compositor passes are not supported in Ogre 2.0 yet (though planned).
Also, MyGUI’s Render-to-texture and the RenderBox demo have not been ported yet. This is a tricky part, because direct updating of render textures no longer works as it used to: all rendering has to go through the compositor system.
This leaves a couple of loose ends that I’ll probably have to revisit later, but it is good enough for use in a prototype.

On to problematic points in OpenMW itself…

Tag points

Tag points are used to attach an object to a bone on a character’s skeleton – say, a weapon on their hand. This feature is widely used in our character system (not just for equipment, but also for supporting Morrowind’s segmented body parts).
Unfortunately, it hasn’t been ported to Ogre 2.0 yet:

TagPoints isn’t going to be fixed soon. First we need to integrate the new skeleton system into Entities

I’m considering working around this for now by creating tag points as regular scene nodes and updating their transform manually every frame based on the bone transforms. This is likely not very efficient, but at least it works.
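Roughly, that workaround boils down to one matrix product per attached object per frame. The sketch below is self-contained C++ and does not use Ogre's classes: Mat4, SceneNode, ManualTagPoint and updateTagPoint are hypothetical stand-ins, and it assumes the skeleton exposes each bone's transform relative to the character root.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the engine types; not Ogre's real API.
using Mat4 = std::array<float, 16>; // row-major 4x4 affine transform

Mat4 identity()
{
    Mat4 m{};
    m[0] = m[5] = m[10] = m[15] = 1.f;
    return m;
}

Mat4 translation(float x, float y, float z)
{
    Mat4 m = identity();
    m[3] = x; m[7] = y; m[11] = z;
    return m;
}

// Standard matrix product a * b (b is applied first, then a).
Mat4 multiply(const Mat4& a, const Mat4& b)
{
    Mat4 r{};
    for (std::size_t row = 0; row < 4; ++row)
        for (std::size_t col = 0; col < 4; ++col)
            for (std::size_t k = 0; k < 4; ++k)
                r[row * 4 + col] += a[row * 4 + k] * b[k * 4 + col];
    return r;
}

struct SceneNode { Mat4 world = identity(); };

// Emulated tag point: remembers which bone the attached node follows.
struct ManualTagPoint
{
    std::size_t boneIndex;
    SceneNode* attached; // node carrying e.g. the weapon mesh
};

// Call once per frame, after the skeleton has been animated.
// boneModelSpace[i] is bone i's transform relative to the character root.
void updateTagPoint(const ManualTagPoint& tag, const Mat4& characterWorld,
                    const Mat4* boneModelSpace, std::size_t boneCount)
{
    assert(tag.boneIndex < boneCount);
    tag.attached->world = multiply(characterWorld, boneModelSpace[tag.boneIndex]);
}
```

The cost is that this runs on the CPU for every attachment every frame, which is where the efficiency concern comes from.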

Scene node names

Currently, we use Ogre’s auto-generated scene node names to identify objects in a few places (whose idea was that?). This will not work in Ogre 2.0, because scene nodes don’t have unique names anymore.
I don’t expect too much trouble here – a search for getHandle() returns 93 hits and most of them are actually just getHandle() == "player" (*grumble*).
So, how do we deal with this? I’d like to remove the getHandle() function. We don’t really need it. The only legitimate use at the moment is for connecting Bullet’s rigid bodies with their MW-object for ray query purposes. We can easily change this to storing the object’s MWWorld::Ptr instead of the scene node name. And good riddance to World::searchPtrViaHandle, because it’s terribly inefficient.
If that plan fails, we could just use the unique scene node IDs (uint32) offered by Ogre 2.0.
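As a rough illustration of the difference between the two approaches, the sketch below contrasts a name-based lookup with a pointer stored on the rigid body. Bullet's btCollisionObject does offer setUserPointer() and getUserPointer(), but everything in this block is a self-contained stand-in: GameObject plays the role of MWWorld::Ptr, RigidBody mimics only the user-pointer slot, and the std::map is just one assumed way a handle lookup could be organised, not how World::searchPtrViaHandle actually works.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for MWWorld::Ptr.
struct GameObject { std::string id; };

// Hypothetical stand-in mimicking the user-pointer slot of a Bullet body.
struct RigidBody
{
    void* userPointer = nullptr;
    void setUserPointer(void* p) { userPointer = p; }
    void* getUserPointer() const { return userPointer; }
};

// Name-based approach: the ray query yields a scene node name, which then
// has to be resolved back to the object in a second step.
GameObject* findByHandle(const std::map<std::string, GameObject*>& byHandle,
                         const std::string& handle)
{
    auto it = byHandle.find(handle);
    return it == byHandle.end() ? nullptr : it->second;
}

// Pointer-based approach: the body hit by the ray already carries its object.
GameObject* objectFromBody(const RigidBody& body)
{
    return static_cast<GameObject*>(body.getUserPointer());
}
```

With the pointer stored on the body, the ray query result leads straight to the object and no scene node name is involved at all.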

Future

And that should be everything we need to worry about for the initial port. But there’s much more exciting stuff up ahead:

New material system

In 2012, I created a library for Ogre called shiny. This is what we currently use to effectively handle shader permutations, among a few other things.
My plan was to contribute a similar system back to Ogre directly, but as it turns out dark_sylinc has beaten me to it. He’s developed a new material system for Ogre 2.0 at feature parity with my proposal. Some of the terminology has changed, but otherwise the proposal gives you a good overview of the new system.
It’s currently sitting in a feature branch and not ready for prime time yet. It will also require some porting on our end – notably, assembling materials is now handled by the user in a callback, rather than handled generically by shiny’s material template system. This is a welcome change for me, as I always felt the template system to be somewhat limited and unsatisfying. It wasn’t used much in OpenMW anyways, since we have to convert materials on-the-fly from the NIF format.

While all this is still in development, it’s good to know that we can continue using shiny in the meantime – it builds and works just fine with the current Ogre v2-0 branch.

Hardware requirements

So far, OpenMW requirements have been fairly low. Shaders are used by default, but not strictly required.
With the current Ogre v2-0 branch, this hasn’t changed yet, but the fixed function pipeline has already been removed in a feature branch, along with the DX9 and GL2 render systems. So eventually the minimum requirement will be GL3 compatible hardware.
If you think you might be affected, don’t worry – we can keep support for compiling with Ogre 1.x in a branch.

Another question is whether we can keep maintaining DirectX support on our end. Ogre’s API does its best to be render-system agnostic, but unfortunately writing shaders doesn’t work that way. Currently, we use a porting header allowing us to generate shaders for both OpenGL GLSL and DirectX HLSL syntax from the same code. I don’t know if this will still work with the new material system. Shiny used the generic but slow boost::wave preprocessor, whereas the new material system uses a custom one. There’s a chance the header will trigger obscure edge cases or unimplemented features in the preprocessor. If that’s the case, then we will write shaders directly in GLSL, and DirectX support will be thrown out. I have no intention of maintaining two near-identical copies of all shaders.

Animating culled objects

In Ogre 1.x, culled (as in: not visible in the view frustum) objects do not have their animation and skeleton updated. Sounds sensible at first, but this can cause problems when the animation drives collision objects or other gameplay elements.
It’s funny that we’ve been hit by that exact issue after Matias called it:

Frustum culling at fine granularity conflicts with DOD (Data Oriented Design) paradigms, and thus reduces the overall performance when looking at most of the skeletons (which is most likely the general case). Furthermore fine frustum culling is useless in modern real world applications where there are like +4 camera passes (reflections, shadow mapping) as eventually everything ends up getting caught by the camera. Also some games need to get the skeleton data for collisions or logic, despite not being on camera

In Ogre 2.0, skeletons are always updated, even when not in the view frustum, so the above issue and similar ones should go away.

Coordinate precision

In cells far from the coordinate origin, objects begin to shake violently. This is due to precision issues when transforming individual vertices with large coordinates.
A common solution is to treat the camera as the world origin for rendering purposes. An option for this exists in Ogre 1.x, but I haven’t been able to use it due to minor bugs. Camera-relative rendering has been redesigned in Ogre 2.0. This gives us a perfect excuse to revisit the problem and hopefully get rid of it for good.
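The effect is easy to reproduce outside of any engine. The sketch below is plain C++ with made-up types (Vec3d, Vec3f) and deliberately exaggerated coordinates; it is not how Ogre implements camera-relative rendering, only the underlying idea: subtract the camera position while the numbers still have plenty of precision, and only then hand the small result to the 32-bit float pipeline.

```cpp
#include <cassert>

struct Vec3d { double x, y, z; }; // world positions kept in double precision
struct Vec3f { float x, y, z; };  // what ends up in the rendering pipeline

// Naive path: convert the absolute world position straight to float.
// Around 10 million units, neighbouring floats are a whole unit apart.
Vec3f toFloatAbsolute(const Vec3d& world)
{
    return { static_cast<float>(world.x),
             static_cast<float>(world.y),
             static_cast<float>(world.z) };
}

// Camera-relative path: subtract in double precision first, then convert.
// The result is small, so float has precision to spare.
Vec3f toFloatCameraRelative(const Vec3d& world, const Vec3d& camera)
{
    return { static_cast<float>(world.x - camera.x),
             static_cast<float>(world.y - camera.y),
             static_cast<float>(world.z - camera.z) };
}
```

For an object a quarter of a unit away from a camera sitting at x = 10,000,000, the absolute path rounds both positions to the same float and the offset is lost, while the camera-relative path preserves it.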

Non-uniform character scaling

A missing feature in OpenMW is scaling the width of NPCs by their “Weight” property, so most NPCs currently look a bit too skinny. The reason we can not currently support this is down to Ogre: when non-uniform scaling is used on a skeleton root, the scale is treated in bone-local space. This does not match our needs; rotated bones (e.g. arms) become longer instead of wider.
It’s good to know that this issue has been taken into consideration and fixed in Ogre 2.0’s new skeleton system, although I’m not sure when we can start using it (it’s still labelled as experimental and disabled by default).
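To see why the order matters, here is a tiny 2D sketch in plain C++. The names and the quarter-turn bone orientation are assumptions chosen for illustration and have nothing to do with Ogre's actual skeleton code. The bone's local x axis runs along the arm's length, and the bone is rotated so that this axis points along the character's height.

```cpp
#include <cassert>

struct Vec2 { float x, y; }; // x = character width, y = character height

// Bone orientation assumed in this sketch: a quarter turn, so the bone's
// local x axis (its length) ends up pointing along the character's height.
Vec2 boneToModel(Vec2 v) { return { -v.y, v.x }; }

// Component-wise (non-uniform) scale.
Vec2 scale(Vec2 v, Vec2 s) { return { v.x * s.x, v.y * s.y }; }

// Scale applied in bone-local space, before the bone's rotation.
Vec2 scaleInBoneSpace(Vec2 local, Vec2 s) { return boneToModel(scale(local, s)); }

// Scale applied in model space, after the bone's rotation.
Vec2 scaleInModelSpace(Vec2 local, Vec2 s) { return scale(boneToModel(local), s); }
```

With a width scale of (2, 1), a vertex at the tip of the arm, local (1, 0), moves twice as far along the height axis when the scale is applied in bone space, so the arm gets longer. Applied in model space, the tip stays where it was and a vertex on the side of the arm, local (0, 0.5), moves outwards instead, which is the widening we actually want.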

That’s it for today. The next post will be when I have substantial progress to show.