Following one man's task of building a virtual world from the comfort of his pajamas. Discusses Procedural Terrain, Vegetation and Architecture generation. Also OpenCL, Voxels and Computer Graphics in general.

Monday, July 9, 2012

The Uncanny Valley of Procedural Generation

As a developer of procedural worlds, what worries me most is not failing spectacularly at my goals. My biggest fear is producing something that looks believable, but is still somehow off. Even a seemingly perfect world can be rejected by your subconscious. You may not be able to put it into words, or point your finger at it, but you feel there is something wrong.

To make matters even worse, it seems we can be collectively hypnotized into liking something just because it is a new way of doing things. Soon the novelty wears off and we realize the Emperor was naked all along. We want to believe things look better than they do. For instance, we know 3D graphics is a developing field, so we are ready to forgive a lot, until something better appears and sets a new standard. Remember this beauty?

This is Peter Gabriel's "Kiss That Frog". You can see the video here. It won an MTV award for special effects in 1994. I remember loving this video. It is hard to watch now.

This is not specific to procedural techniques. Humans are equally able to generate uncanny, ugly things. The problem is that proceduralism makes it a lot easier. The world is not crafted by hand, so any aberration in the algorithms will be mindlessly multiplied. It also depends on the degree of realism you want to achieve. If you are trying to fake nature, odds are your creation is some form of monstrosity that will not stand the test of time.

I often wonder whether any sort of synthetic reality is doomed in the long term. At this point I don't know for sure, but I have two simple ideas to guide me across this maze:

1. Global rather than Local

Procedural methods can be divided into two large families: global methods and local methods. In local methods, the content generated for a given point in space does not depend on the content of neighboring points.

The Perlin noise function, for instance, can be evaluated locally. This means the output of the function depends only on the coordinates of the point, plus some constants. It also means many points can be evaluated in parallel. Since they are isolated from their neighbors, you do not need to compute any neighbors before evaluating a single point.

Local methods are blindingly fast, but this comes at a cost: they have no soul. They do not produce any information. Everything you see comes from a small seed of values and the specific ways these values are churned and shaken by clever algorithms.

Because of their speed and fairly good results, they can be used in many subtle places, but they should not be the backbone of your world. Our minds are very good at discovering redundancy. All these methods are like a kaleidoscope: they can trick you for a moment, but soon you start seeing the mirrors. And once you have seen them, the magic is gone for good.

Here is your typical multifractal Perlin terrain:
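To make the "local" property concrete, here is a minimal sketch of fractal lattice noise (value noise with fBm summation, a simpler cousin of Perlin noise). Note how every call depends only on the input coordinates plus a seed constant; no neighbor values are ever read, which is exactly why such functions parallelize trivially. All function names here are illustrative, not from any particular library.

```python
import math

def hash01(ix, iy, seed=1337):
    """Deterministic pseudo-random value in [0, 1) from integer lattice coords."""
    h = (ix * 374761393 + iy * 668265263 + seed * 974634599) & 0xFFFFFFFF
    h = ((h ^ (h >> 13)) * 1274126177) & 0xFFFFFFFF
    return (h ^ (h >> 16)) / 0x100000000

def value_noise(x, y, seed=1337):
    """Bilinearly interpolated lattice noise -- purely local."""
    ix, iy = math.floor(x), math.floor(y)
    fx, fy = x - ix, y - iy
    # Smoothstep fade so the surface is smooth across cell borders.
    ux, uy = fx * fx * (3 - 2 * fx), fy * fy * (3 - 2 * fy)
    a = hash01(ix, iy, seed);     b = hash01(ix + 1, iy, seed)
    c = hash01(ix, iy + 1, seed); d = hash01(ix + 1, iy + 1, seed)
    top = a + (b - a) * ux
    bot = c + (d - c) * ux
    return top + (bot - top) * uy

def fbm(x, y, octaves=5, lacunarity=2.0, gain=0.5, seed=1337):
    """Fractal Brownian motion: sum octaves of noise at increasing frequencies."""
    total, amp, freq, norm = 0.0, 1.0, 1.0, 0.0
    for _ in range(octaves):
        total += amp * value_noise(x * freq, y * freq, seed)
        norm += amp
        amp *= gain
        freq *= lacunarity
    return total / norm  # normalized back to [0, 1)

height = fbm(3.7, 1.2)
```

The terrain heightfield is just `fbm(x, y)` sampled over a grid: fast, embarrassingly parallel, and carrying no information beyond the seed.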

Just don't do it. We all know it is not real.

Another popular local method is L-systems, at least in their vanilla form: context-free grammars where symbols are replaced with no awareness of their surroundings. If used to produce trees, you soon have branches that intersect each other, or that grow in illogical directions. Here is one gem that illustrates this point:
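The vanilla form is tiny to implement, which is part of its appeal. A sketch of context-free parallel rewriting, using a classic branching grammar; the point to notice is that no rule can "see" where the turtle actually is in space, so nothing prevents branches from colliding:

```python
def lsystem(axiom, rules, iterations):
    """Apply parallel, context-free rewriting rules to the axiom string."""
    s = axiom
    for _ in range(iterations):
        s = "".join(rules.get(ch, ch) for ch in s)
    return s

# F = draw forward, [ ] = push/pop turtle state, + / - = turn.
# The rule fires identically everywhere, with no awareness of surroundings.
rules = {"F": "F[+F]F[-F]F"}
print(lsystem("F", rules, 2))
```

Feeding the resulting string to a turtle interpreter yields the familiar fractal plant, intersections and all.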

Global methods, on the other hand, are closer to simulations. The value of a single point may depend on the values of very distant points. Imagine a fluvial erosion filter: a point at the base of a mountain may be largely influenced by a long streak of points uphill.

Global methods are effective because they have cause-effect relationships built into them. This makes all the difference. It brings entropy into the world; it gives it time and history. Here is an example showing some fluvial erosion. The patterns you see here are far more believable:
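As a toy illustration of this global dependence (not a real erosion solver, which involves much more), here is single-flow-direction flow accumulation on a heightfield. Each cell drains to its steepest lower neighbor, so the flow arriving at a valley cell depends on every cell uphill of it, a cause-effect chain that no point-local function can reproduce:

```python
def flow_accumulation(height):
    """Route one unit of 'rain' per cell downhill; return accumulated flow."""
    h, w = len(height), len(height[0])
    flow = [[1.0] * w for _ in range(h)]
    # Visit cells from highest to lowest so uphill donors are processed first.
    order = sorted(((height[y][x], y, x) for y in range(h) for x in range(w)),
                   reverse=True)
    for z, y, x in order:
        # Find the lowest 8-connected neighbor (steepest descent).
        best, by, bx = z, None, None
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and height[ny][nx] < best:
                    best, by, bx = height[ny][nx], ny, nx
        if by is not None:            # pass the accumulated flow downhill
            flow[by][bx] += flow[y][x]
    return flow

# A tilted valley funneling into a trough along x == 2, draining toward y == 0.
terrain = [[abs(x - 2) + 0.1 * y for x in range(5)] for y in range(5)]
flow = flow_accumulation(terrain)
```

Carving the terrain proportionally to `flow` is what produces branching channel patterns; the key property is that the outlet cell ends up "knowing about" every cell in its drainage basin.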

In the same fashion, most successful tree and plant generators use some sort of simulation, or at least global constraints. For my trees I chose a global method that grows branches in full awareness of each other. It is a very simple algorithm, and it still beats the results you get from vanilla L-systems.
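The details of the author's algorithm are not given, but the idea of growing "in full awareness" can be sketched like this (a hypothetical 2D toy, not the actual method): every candidate segment is tested against all existing nodes, and growth that would crowd or intersect existing wood is simply rejected. That global distance check is precisely what a context-free grammar cannot express.

```python
import math, random

def grow_tree(steps=60, step_len=1.0, min_clearance=0.8, seed=7):
    """Grow branch segments, rejecting any that come too close to the tree."""
    rng = random.Random(seed)
    nodes = [(0.0, 0.0)]                      # trunk base
    tips = [((0.0, 0.0), math.pi / 2)]        # (position, heading): start upward
    for _ in range(steps):
        if not tips:
            break
        (x, y), ang = tips.pop(rng.randrange(len(tips)))
        ang += rng.uniform(-0.5, 0.5)         # wander a little
        nx, ny = x + step_len * math.cos(ang), y + step_len * math.sin(ang)
        # Global check: is this space already claimed by any existing node?
        if any(math.hypot(nx - px, ny - py) < min_clearance for px, py in nodes):
            continue                           # reject -- this tip dies here
        nodes.append((nx, ny))
        tips.append(((nx, ny), ang))
        if rng.random() < 0.3:                 # occasionally fork a new branch
            tips.append(((nx, ny), ang + rng.choice([-0.7, 0.7])))
    return nodes

nodes = grow_tree()
```

Because every accepted node passed the clearance test against the whole tree, no two branches can ever overlap, something a vanilla L-system cannot guarantee.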

The problem with simulations is that they are costly to evaluate. Information moves around, so they are harder to compute in parallel. Consider this example: the location, shape and strength of a river will be determined by a water source many kilometers away. A visitor to the virtual world may encounter the river long before the source, but still the source and everything in between must be accounted for. For worlds that are generated on demand as the viewer moves, this may be too much to handle.

Even then, you should always consider using a global method. It may make your solution more complex and slower, but you will have something to show.

2. Steal from Mother (Nature)

Nature has already spent a lot of time and energy producing the patterns we accept as real. We already use them to texture 3D models, there is no reason why we couldn't go beyond that.

Here is a very interesting approach to terrain synthesis. It combines elevation samples from real sites on Earth and stitches them together in new ways. This results in fairly believable scenes that can cover huge spaces without any apparent repetition.

In this case they used some samples taken from the Grand Canyon.

A similar technique can be used for smaller terrain features like rocks, cliffs and small boulders. It is possible to have a set of volumetric natural textures and map them over larger terrain features.

The biggest issue is how to mask the repetition, but this can be done using Wang tiles. I have reworked many of my core functions to work this way, and I like them better than my previous versions. I will be posting some results soon.
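For readers unfamiliar with Wang tiles, here is a minimal sketch of the stochastic tiling step (illustrative only; the author's actual functions are not shown). Each tile carries edge colors, and a tile may be placed only if its west and north edges match the neighbors already laid down; picking randomly among the valid candidates breaks up visible repetition while keeping seams consistent:

```python
import random

# (north, east, south, west) edge colors; the complete set over 2 colors,
# so a valid candidate always exists for any west/north constraint.
TILES = [(n, e, s, w) for n in (0, 1) for e in (0, 1)
                      for s in (0, 1) for w in (0, 1)]

def wang_tiling(width, height, seed=42):
    """Fill a grid with tiles whose shared edges match, chosen at random."""
    rng = random.Random(seed)
    grid = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            candidates = [t for t in TILES
                          if (x == 0 or t[3] == grid[y][x - 1][1])    # west = left tile's east
                          and (y == 0 or t[0] == grid[y - 1][x][2])]  # north = upper tile's south
            grid[y][x] = rng.choice(candidates)
    return grid

grid = wang_tiling(8, 8)
```

In practice each abstract tile maps to a terrain or texture sample whose borders were authored (or cut) to match by edge color, so an unbounded area can be covered from a small sample set.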

29 comments:

I would choose simulation over random any time! I personally hope to do a lot of simulation once I finish my game development course (mostly as hobby projects, probably, while I work for some company).

Global does not imply believable. You can have global methods that output a lot of nonsense. Actually most of them do. You have to beat them with a stick to get something borderline plausible. My point is that global methods are a more powerful framework. By using them you are improving your chances at synthesizing reality.

I'm not sure what you mean by "local techniques can be described using neighbors". If you are thinking about Cellular Automata, they are global methods.

I mean you can fairly simply make a riverbed look like it carved its way into the rock. Grand Canyon like. With fractals and all.

But you don't want it to be rocky ground all the time, so you need many different layers of rock and earth or sand and all that, and methods that calculate erosion for each type. This gets messy fast.

But I also think it's evolutionary design that will provide the most realistic look. Rocks need to look worn. You need to see erosion; you just don't expect mountains / rocks / earth to look like they came straight out of a function. You want them to look like they have been worn by wind and water for millions of years.

What I really like is some of the work of this guy: http://www.atomontage.com/

I remember that frog video at the time, and I remember thinking it looked rubbish. It was meant to be scary or trippy but it really was a bit cheap and nasty.

As far as reality judgment calls on what you do now viewed in the future, the safest bet is SUBTLETY, and yes, in capital letters! :)

Think of the first films that used animatronics: everything that could move, moved. So the poor 'creatures' would have quavering lips, spinning eyes and eyebrows, the works.

Same with CGI, look at water effects. 'Hey, I can do water, watch how I massively exaggerate the wave effect on screen!'

Subtlety now will be seen in the future as being forward thinking. Well, that's my view anyhoo.

I feel like games & the demoscene have an intrinsic advantage over videos and still images, in that they can be re-generated at runtime. An acceptable technical limit now (ie, unable to render over 1080p, 4Kx4K texture size limits, etc) will probably be laughable in ten years. A cleverly designed game/demo can take advantage of things that don't even exist yet, and thereby age better than a video or still image.

You are right about that. On the other hand, this advantage is seldom exploited. Games and demos rot at a higher pace than pre-rendered media. This Peter Gabriel video is contemporary with Doom.

It is not easy to take advantage of the future. A big part of the hardware evolution has not been quantitative, that is, just pushing more polys than before. There are big feature leaps that cannot be anticipated until the feature is really working. Think of multitexturing, shaders, access to textures from geometry shaders, tessellation. This trend will likely go on.

Instead of recombining multiple elevation maps, one could also try to find a generative model that can produce certain classes of elevation maps. This would greatly increase the amount of diversity in the elevation maps while still keeping a realistic feel. I am currently checking a hierarchical approach in which local methods are used to split the world into larger parts and global methods are used within these parts to govern the actual landscape. So basically using local methods for global features and global methods for local features.

Perhaps slightly off-topic or irrelevant (in which case I apologise), but is it possible that you have 'missed the forest for the trees'? By which I mean, you seem to have built incredible procedural tech, which looks more and more visually appealing with each post you make showing off your project, but unless you are intending to sell this tech/engine/whatever to outside parties, how is it going to apply to the game (?) you are intending to make? Two examples I think are relevant:

1. The aborted game 'Subversion' (by Introversion Studios) - Chris Delay spent a number of years developing his procedural world whilst avoiding the issues of 'core gameplay/concept' - i.e. what is the hook that makes it engaging for players, or drives the narrative being told?

2. Minecraft, whilst having (as a default setting at any rate) incredibly unpolished graphics, is a compelling experience where the player actively engages and creates their own unique narrative based upon the fundamentally rewarding game mechanics of mining and building.

By diverting energy towards the avoidance of entropy, is there the possibility that you have neglected other core mechanics? Have you given any thought as to how the player will ultimately experience the world/s you are building, and what their scope for interaction with it will be?

PS: I don't mean to sound at all negative, or critical of the (fine) work you have done; it is just that to my mind these issues seem important as regards the end goal. Thank you for your time.

At this point my focus is more about generic technology. Something that could be used to generate game worlds like the ones you see in commercial games, but at a fraction of the cost. I think it helps to understand where technology fails, and how it can be taken to new levels.

This also applies to the game I want to make. I don't have any special mechanic in mind yet, the only thing I know for sure is I want the environment to be a big part of the experience.

I believe this is a great thing, even if it never gets made into a game. It shows people what is possible with Procedural generation, perhaps even opening doors for other games/developers, hell, it made me (a future game programmer =3) interested in Procedural Generation.

Further, once you have all the technology, even if you decide to not make the game, the system is usable for more situations than just the original plan, so you could use it for a different project, or even make it open source and have it be a world-builder engine which people can use for their own projects (or for learning) XD.

Interesting post, but I think you're off-base with the conclusion "entropy matters". You could easily pump up the seed length in the Perlin noise landscape so that it contains more entropy than the one generated by fluvial erosion, but the latter would still look far better.

A more consistent conclusion would be that "structure matters". This obviously has limits too - PRNGs are very structured, but are assumed indistinguishable from random coins - but in the context of procedural design I think it's a better lesson.

Entropy in this sense is about the flow of energy and matter. Local functions, like Perlin's, can mimic some of this with relative success, like simple Brownian motion, but like you said aesthetically they don't go too far.

Local functions do not "follow" the flow, they statistically fake an end state. I could be so bold as to argue local functions do not have the property of entropy. They do not live in the realm of time. Even if they did (let's say we added a time coordinate to Perlin's), entropy is a vector. That is, a flow comes from somewhere and goes somewhere else. To me this implies globality.

Actually I would argue that unrealistic procedurally-generated assets can have a certain unique appeal to them. Striving for perfect realism is actually semi-boring (IMHO), as our senses are already attuned to reality and there is no surprise involved with perfect replicas. For example, there is a certain beauty to Voronoi patterns, even though perfect Voronoi patterns don't occur very often in nature (at least, geologically speaking).

The real benefit of procedural generation extends beyond a visual reproduction though. Through procedural generation, you can create a world which behaves (somewhat) logically. Sampling a bunch of random patches of terrain might not make sense in certain contexts (like if a mountain pops up in the middle of a flood plain, unless it were made of a very hard substance, such as a volcanic protrusion). Not to say there is not value in learning/sampling/stealing from nature. There is a lot of research on developing patterns based on other sample patterns, one I remember is here:

As for the problem of intersecting branches with trees, one solution I thought up (which others probably did as well) is just to create a cloud of points, and a minimum spanning tree (with special modifications to ensure a merge towards the base of the tree).
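The commenter's idea can be sketched directly: scatter a point cloud in the crown, then connect it with a minimum spanning tree rooted at the trunk base. Because the MST is globally cheapest, branches naturally merge toward the base rather than crossing. This is plain Prim's algorithm, O(n^2), without the "special modifications" the commenter mentions:

```python
import math, random

def mst_tree(points, root=0):
    """Return parent[i] for each point, forming a tree rooted at `root` (Prim)."""
    n = len(points)
    in_tree = [False] * n
    parent = [None] * n
    dist = [math.inf] * n
    dist[root] = 0.0
    for _ in range(n):
        # Pull the cheapest point not yet connected to the growing tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: dist[i])
        in_tree[u] = True
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < dist[v]:
                    dist[v], parent[v] = d, u
    return parent

rng = random.Random(1)
# Trunk base at the origin, plus a cloud of crown points above it.
pts = [(0.0, 0.0)] + [(rng.uniform(-3, 3), rng.uniform(1, 6)) for _ in range(30)]
parents = mst_tree(pts)
```

Drawing a segment from each point to `pts[parents[i]]` gives a branch skeleton; weighting edge costs (e.g. penalizing horizontal runs) is where the shaping modifications would come in.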

I think you need to be able to replicate nature perfectly before you adventure into deconstructing it.

Like Picasso and many other modern painters, they were gifted classically trained painters. They could paint realistically if they wanted and most did through their earlier phases. This is why their later wacky stuff is good. You need to get it right before you can take it apart.

Same with jazz. You need real skills, and to be able to play anything they throw at you; then you can start breaking the music.

But it is my opinion. I know there are other ways to create unique and appealing content.

They do not have a soul. ... a small seed of values and ... clever algorithms.

The `small seed of values` seems to be an unnecessary constraint. Could we achieve some `soul` with a larger seed of values? E.g. use wikipedia pages to direct the growth of trees, or cities, or terrains? You mentioned a solution that uses elevation data, but have you considered approaches that use less domain-specific but `soulful` data - something that humans already find interesting, and that lacks the repetitive nature.

I've been thinking recently about procedural generation, particularly how to offer artists a more ad-hoc, flexible level of supervision and control.

Picking out a bunch of domain-specific sample data seems like it could be very difficult, and not much easier to use for procedural generation.

The landscape example used Grand Canyon data. Would its results have been so good with just any arbitrary landscape data? Would just any samples from a forest be suitable? Would a symphony from Beethoven or Mozart result in similarly "interesting" landscapes or areas? I think it might, if utilized effectively.