Search this blog

06 September, 2014

As for the Mathematica 101, after the (long) introduction I'll be talking with code...

Introduction to "Scientific Python"

In this I'll assume a basic knowledge of Python, if you need to get up to speed, learnXinYminute is the best resource for a programmer.

With "Scientific Python" I refer to an ecosystem of python packages built around NumPy/SciPy/IPython. I recommend installing a scientific python distribution, I think Anaconda is by far the best (PythonXY is an alternative), you could grab the packages from pypi/pip from any Python distribution, but it's more of a hassle.

NumPy is the building block for most other packages. It provides a matlab-like n-dimensional array class that provides fast computation via Blas/Lapack. It can be compiled with a variety of Blas implementations (Intel's MKL, Atlas, Netlib's, OpenBlas...), a perk of using a good distribution is that it usually comes with the fastest option for your system (which usually is multithreaded MKL). SciPy adds more numerical analysis routines on top of the basic operations provided by NumPy.

IPython is a notebook-like interface similar to Mathematica's (really, it's a client-server infrastructure with different clients, but the only one that really matters is the HTML-based notebook one). An alternative environment is Spyder, which is more akin to Matlab's or Mathematica Workbench (a classic IDE) and also embeds IPython consoles for immediate code execution.

Especially when learning, it's probably best to start with IPython.

Why I looked into SciPy

While I really like Mathematica for exploratory programming and scientific computation, there are a few reasons that compelled me to look for an alternative (other than Wolfram being an ass that I hate having to feed).

First of all, Mathematica is commercial -and- expensive (same as Matlab btw). Which really doesn't matter when I use it as a tool to explore ideas and make results that will be used somewhere else, but it's really bad as a programming language.

I wouldn't really want to redistribute the code I write in it, and even deploying "executables" is not free. Not to mention not many people know Mathematica to begin with. Python, in comparison, is very well known, free, and integrated pretty much everywhere. I can drop my code directly in Maya (or any other package really, python is everywhere) for artists to use, for example.

Another big advantage is that Python is familiar, even for people that don't know it, it's a simple imperative scripting language.Mathematica is in contrast a very odd Lisp, which will look strange at first even to people who know other Lisps. Also, it's mostly about symbolic computation, and the way it evaluate can be quite mysterious. CPython internals on the other hand, can be quite easily understood.Lastly, a potential problem lies in the fact that python packages aren't guaranteed to have all the same licensing terms, and you might need many of them. Verifying that everything you end up installing can be used for commercial purposes is a bit of a hassle...

How does it fare?It's free. It's integrated everywhere. It's familiar. It has lots of libraries. It works. It -can- be used as a Mathematica or Matlab replacement, while being free, so every time you need to redistribute your work (research!) it should be considered.But it has still (many) weaknesses.

As a tool for exploratory programming, Mathematica is miles ahead. Its documentation is great, it comes with a larger wealth of great tools and its visualization options are probably the best bar none.Experimentation is an order of magnitude better if you have good visualization and interactivity support, and Mathematica, right now, kills the competition on that front.Python has IPython and Matplotlib. IPython can't display output if assignments are made, and displays only the last evaluated expression. Matplotlib is really slow, really ugly, and uses a ton of memory. Also you can either get it integrated in IPython, but with zero interactivity, or in a separate window, with just very bare-bones support for plot rotation/translation/scale.

Overall, I'll still be using Mathematica for many tasks, it's a great tool.As a tool for numerical computation it fares better. Its main rival would be Matlab, whose strength really lies in the great toolboxes Mathworksprovides. Even if the SciPy ecosystem is large with a good community, there are many areas where its packages are lacking, not well supported or immature.

Sadly though for the most Matlab is not that popular because of the unique functionality it provides, but because MathWorks markets well to the academia and it became the language of choice for many researchers and courses.Also, researchers don't care about redistributing source nearly as much as they really should, this day and age it's all still about printed publications...So, is Matlab dead? Not even close, and to be honest, there are many issues Python has to solve. Overall though, things are shifting already, and I really can't see a bright future for Matlab or its clones, as fundamentally Python is a much better language, and for research being open is probably the most important feature. We'll see.A note on performance and explorationFor some reason, most of the languages for scientific exploratory programming are reallyslow. Python, Matlab, Mathematica, they are all fairly slow languages. The usual argument is that it doesn't matter at all, because these are scripting languages used to glue very high-performance numerical routines. And I would totally agree. If it didn't matter.A language for exploratory programming has to be expressive and high-level, but also fast enough for the abstractions not to fall on their knees. Sadly, Python isn't.Even with simple code, if you're processing a modicum amount of data, you'll need to know its internals, and the variety of options available for optimization. It's similar in this regard to Mathematica, where using functions like Compile often requires planning the code up-front to fit in the restrictions of such optimizers.Empirically though it seems that the amount of time I had to spend minding performance patterns in Python is even higher than what I do in Mathematica. I suspect it's because many packages are pure python.It's true that you can do all the optimization staying inside the interactive environment, not needing to change languages. That's not bad. But if you start having to spend a significant amount of time thinking about performance, instead of transforming data, it's a problem.Also, it's a mystery to me why most scientific languages are not built for multithreading, at all. All of them, Python, Matlab and Mathematica, execute only some underlying C code in parallel (e.g. blas routines). But not anything else (all the routines not written in native code, often things such as plots, optimizers, integrators).Even Julia, which was built specifically for performance, doesn't really do multithreading so far, just "green" threads (one at a time, like python) and multiprocessing.Multiprocessing in Python is great, IPython makes it a breeze to configure a cluster of machines or even many processes on a local machine. But it still requires order of magnitudes more effort than threading, killing interactivity (push global objects, imports, functions, all manually across instances).Mathematica at least does the multiprocessing data distribution automatically, detecting dependencies and input data that need to be transferred.

Finally, code: NumPy by example. Execute each section in an IPython cell (copy and paste, then shift-enter)

# First, we'll have to import NumPy and SciPy packages.

# IPython has "magics" (macros) to help common tasks like this:

%pylab

# This will avoid scientific notation when printing, see also %precision

JobLib, makes multiprocessing easier (see IPython.Parallel too) but still not great as you can't have multithreading, multiprocessing means you'll have to send data around independent python interpreters :/

Parakeet, compiles annotated numpy functions to CPU and GPU code, similar to Numba

NumExpr, a fast interpreter of numerical expressions on arrays. Faster than numpy by aggregating operations (instead of doing one at at time)

For 3d animations, VisVisseems the only tool that is capable of achieving decent speed, quality, and has a good interface and support. It has a matlab-like interface, but actually creating objects (Line() instead of plot...) is muuuuch better/faster. Its successor is VisPy, still very experimental.

03 September, 2014

This accounts only for well-established methods, there are many other methods and combinations of methods I'm not covering. It's a sort of "recap".ForwardSingle pass over geometry generates "final" image, lights are bound to draw calls (via uniforms), accurate culling of light influence on geometry requires CSG splits. Multiple lights require either loops/branches in the shaders or shader permutations.

Culling lights on geometry requires geometrical splits (not a huge deal, actually). Can support "static" variations of shaders to customize for a given rendering case (number/type of lights, number of textures and so on) but "pays" such optimization with combinatorial explosion of shader cases and many more draw calls.

Decals need to be multipass, lit twice. Alternatively, for static decals mesh can be cut and texture layering used (more shader variations), or for dynamic decals color can be splatted before main pass (but that costs an access to the offscreen buffer regardless or not a decal is there).

Complex shaders might not run optimally. As you have to do texturing and lighting (and shadowing) in the same pass, shaders can require a lot of registers and yield limited occupancy. Accessing many textures in sequence might create more trashing than accessing them in separate passes.

Lighting/texturing variations have to be dealt with dynamic branches which are often problematic for the shader compiler (must allocate registers for the worst case...), conditional moves (wasted work and registers) or shader permutations (combinatorial explosion)

Many "modern" rending effects require a depth/normal pre-pass anyways (i.e. SSAO, screen-space shadows, reflections and so on). Even though some of these can be faded out after a preset range and thus they can work with a partial pre-pass.

All shading is done on geometry, which means we pay all the eventual inefficiencies (e.g. partial quads, overdraw) on all shaders.

Forward+ (light indexed)Forward+ is basically identical to forward but doesn't do any geometry split on the scene as a pre-pass, it relies on tiles or 3d grids ("clustered") to cull lights in runtime.

Benefits

Same memory as forward, more bandwidth. Enables MSAA.

Any material (same as forward)

Compared to forward, no mesh splitting necessary, much less shader permutations, less draw calls.

Compared to forward it handles dynamic lights with good culling.

Issues

Light occlusion culling requires a full depth pre-pass for a total of two geometrical passes. Can be somehow sidestepped with a clustered light grid, if you don't have to end up splatting too many lights into it.

All shadowmaps need to be generated upfront (more memory) or splatted in screen-space in a pre-pass.

All lighting permutations need to be addressed as dynamic branches in the shader. Not good if we need to support many kinds of light/shadow types. In cases where simple lighting is needed, still has to pay the price of a monolithic ubershader that has to consider any lighting scenario.

Compared to forward, seems a steep price to pay to just get rid of geometry cutting. Note that even if it "solved" shader permutations, its solution is the same as doing forward with shaders that dynamic branch over light types/number of lights and setting these parameters per draw call.

Deferred shadingGeometry pass renders a buffer of material attributes (and other proprieties needed for lighting but bound to geometry, e.g. lightmaps, vertex-baked lighting...). Lighting is computed in screenspace either by rendering volumes (stencil) or by using tiling. Multiple shading equations need either to be handled via branches in the lighting shaders, or via multiple passes per light.

Benefits

Decouples texturing from lighting. Executes only texturing on geometry so it suffers less from partial quads, overdraw. Also, potentially can be faster on complex shaders (as discussed in the forward rendering issues).

Allows volumetric or multipass decals (and special effects) on the GBuffer (without computing the lighting twice).

Allows full-screen material passes like analytic geometric specular antialiasing (pre-filtering), which really works only done on the GBuffer, in forward it fails on all hard edges (split normals), and screen-space subsurface scattering.

Less draw calls, less shader permutations, one or few lighting shaders that can be hand-optimized well.

Issues

Uses more memory and bandwidth. Might be slower due to more memory communication needed, especially on areas with simple lighting.

Doesn't handle transparencies easily. If a tiled or clustered deferred is used, the light information can be passed to a forward+ pass for transparencies.

Limits materials that need many different material parameters to be passed from geometry to lighting (GBuffer), even if shader variation for material in a modern PBR renderer tends not to be a problem.

Can't do lighting computations per object/vertex (i.e. GI), needs to pass everything per pixel in the GBuffer. An alternative is to store baked data in a voxel structure.

In general it has lots of enticing benefits over forward, and it -might- be faster in complex lighting/material/decal scenarios, but the baseline simple lighting/shading case is much more expensive.

Notes on tiled/clustered versus "stenciled" techniques

On older hardware early-stencil was limited to a single bit, so it couldn't be used both to mark the light volume and distinguish surface types. Tiled could be needed as it allowed more material variety by categorizing tiles and issuing multiple tile draws if needed.

On newer hardware tiled benefits lie in the ability of reducing bandwidth by processing all lights in a tile in a single shader. It also has some benefits for very small lights as these might stall early in the pipeline of the rasterizer, if drawn as volumes (draws that generate too little PS work). In fact most tile renderers demos like to show thousands of lights in view... But the reality is that it's still tricky to afford many shadowed lights per pixel in any case (even on nextgen where we have enough memory to cache shadowmaps), and unshadowed, cheap lights are worse than no lighting at all.Often, these cheap unshadowed lights are used as "fill", a cheap replacement for GI. This is not an unreasonable use case, but there are better ways, and standard lights, even when diffuse only, are actually not a great representation of indirect radiance. Voxel, vertex and lightmap bakes are often superior, or one could thing of special fill volumes that can take more space, embedding some radiance representation and falloff in them.In fact one of the typical "deferred" looks that many games still have today is characterized by "many" cheap point lights without shadowing (nor GI, nor gobos...), creating ugly circular splotches in the scene. Also tiled/clustered makes dynamic shadows somewhat harder, as you can't render one shadowmap at a time...

Tiled and clustered have their reasons, but demo scenes with thousands of cheap point lights are not one. Mostly they are interesting if you can compute other interesting data per tile/voxel.You can still get a BW saving in "realistic" scenes with low overlap of lighting volumes, but it's a tradeoff between that and accurate per-quad light culling you get from a modern early-z volume renderer.

It requires less memory per pass, but the total memory traffic summing all passes is roughly the same. The former used to be -very- important due to the limited EDRAM memory on xbox 360 (and its inability to render outside EDRAM).

In theory allows more material "hacks", but it's still very limited. In fact I'd say the material expressiveness is identical to deferred shading, but you can add "extra" lighting in the second geometry pass. On deferred shading that has to be passed along using extra GBuffer space.

Allows shadows to be generated and used per light, instead of all upfront like deferred shading/forward+

Issues

An extra geometric pass (could be avoided by using the same GBuffer as deferred shading, then doing lighting and compositing with the textures in separate fullscreen passes - but then it's almost more a variant of DS than DL imho)

30 August, 2014

One of the first things you learn when you start making games is to ignore most of the internet. Gamers can undoubtedly be wonderful, but as in all things internet, a vocal minority of idiots can overtake most spaces of discourse. Normally, that doesn't matter, these people are irrelevant and worthless even if they might think they have any power in their jerking circles. But if after words other forms of harassment are used, things change.

I'm not a game designer, I play games, I like games, but my work is about realtime rendering, most often the fact it's used by a game is incidental. So I really didn't want to write this, I didn't think there is anything I can add to what has already been written. And Anita herself does an excellent job at defending her work. Also I think the audience of this blog is the right target for this discussion.

Still, we've passed a point and I feel everybody in this industry should be aware of what's happening and state their opinion. I needed to make a "public" stance.

Recap: Anita Sarkeesian is a media critic. She began a successful kickstarter campaign to produce reviews of gender tropes in videogames. She has been subject to intolerable, criminal harassment. People who spoke in her support have been harassed, websites have been hacked... Just google her name to see the evidence.

My personal stance:

I support the work of Anita Sarkeesian. As I would of anybody speaking intelligently about anything, even if I were in disagreement.

I agree with the message in the Tropes Vs Women series. I find it to be extremely interesting, agreeable and instrumental to raise awareness of an in many cases not well understood phenomenon.

If I have any opinion on her work, is that I suspect in most cases hurtful stereotypes don't come from malice or laziness (neither of which she mentions as possible causes by the way), but from the fact that games are mostly made by people like me, male, born in the eighties, accustomed to a given culture.

And even if we can logically see the issues we still have in gender depictions, we often lack the emotional connection and ability to notice their prevalence. We need all the critiques we can get.

It's time to marginalize harassment and ban such idiots from the gaming community. To tell them that it's not socially acceptable, that most people don't share their views.

Right now for most of the video attacks (I've found no intelligent rebuttal yet) to Tropes Vs Women are "liked" on youtube. Reasonable people don't speak up, and that's even understandable, nobody should argue with idiots, they are usually better left ignored. But this got out of hands.

I'm not really up for a debate. I understand that there can be an debate on the merit of her ideas, there can be debate about her methods even, and I'd love to read anything intelligent about it.

We are way past a discussion on whether she is right or wrong. I personally think she is substantially right, but even if she was wrong I think we should all still fight for her to be able to do her job without such vile attacks. When these things happen, in such an extent, I think it's time for the industry to be vocal, for peopleto stop and just say no. If you think Anita's work (and her as a person) doesn't deserve at least that respect, I'd invite you to just stop following me, seriously.

Angelo Pesce. Twitter: @kenpex.
I'm a rendering technical director.
This blog is my place to jot (incoherent, disorganized) notes about various things, so I can remove them from my head and keep them safely on the internet.