Convert to HLSL clip-space as a last step in vertex shader?

From what I can tell, OpenGL uses a -1..+1 clip space along Z (so z/w lands in [-1, 1]) while Direct3D uses 0..1 (and I think the Y axis is also flipped somewhere??)

I have this arrangement where the client app doesn't know what the underlying vendor API is going to be, so it is up to the shader (ideally) to deal with the agnostic inputs.

Given OpenGL conventions running in an HLSL shader, the only challenge, I think, is the clip space.

I am hoping that this can be wrapped up in the last few lines of the vertex shader, and thought it would not hurt to just ask.
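For what it's worth, here is a minimal sketch of what such a footer might look like in HLSL, assuming the position was built with OpenGL-convention matrices; the function and variable names are made up for illustration:

```hlsl
// Hypothetical vertex-shader epilogue: remap a clip-space position built
// with OpenGL conventions for Direct3D. GL clip z spans [-w, +w]; D3D
// expects [0, w], so halve it and bias by half of w. Leaving w untouched
// keeps perspective-correct interpolation intact.
float4 GLClipToD3D(float4 p)
{
    p.z = (p.z + p.w) * 0.5;
    return p;
}

// Usage at the end of the vertex shader:
// output.position = GLClipToD3D(mul(modelViewProjection, input.position));
```

Whether the Y axis needs flipping too depends on where the window/render-target origin difference is handled, so it is left out here.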

FYI: For what it's worth, I've always liked OpenGL (column-major) matrices on the CPU side, though I know people bemoan them. The real value, in my opinion, is having the axes and position of the matrix accessible as contiguous memory. I don't know if that was the original motivation for the arrangement, but if your code needs to be around for a long time, I think this is best for everyone on the maintainer side. I'm not as crazy about the -1,1 clip space (at least it's consistent in all 3 dimensions), but I reckon it would help to just pick a side, and OpenGL is obviously the better side from an openness standpoint.

Last edited by michagl; 09-08-2012 at 02:32 PM.

God have mercy on the soul that wanted hard decimal points and pure ctor conversion in GLSL.

A different formulation might end up with w being 1; basically doing the perspective divide in the vertex shader. But I do not know if that is safe or not. EDITED: It's not safe; see A.R.'s comments below. In my tests I also ended up with a divide by 0. I don't know how likely that is in the real world, or what the end result would be. I am guessing this doesn't happen in theory, due to clipping at the near plane before the division would normally take place.
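For contrast, the unsafe formulation alluded to above would look something like this (illustration only; do not use):

```hlsl
// UNSAFE (illustration only): forcing w to 1 by dividing in the vertex
// shader.
// float4 p = mul(modelViewProjection, input.position);
// output.position = float4(p.xyz / p.w, 1.0);
//
// Two problems: p.w can be 0 for vertices on the camera plane, because
// clipping has not happened yet at this stage; and with w forced to 1 the
// rasterizer loses the information it needs for perspective-correct
// interpolation of the other attributes.
```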

Furthermore, I do realize that this gets computed every time the vertex shader runs. It's easy to argue that the matrices should just be prepared for the shader; but on the other hand it also makes sense not to burden client applications with low-level details. Time will tell, I suppose, what makes sense. But a simple environment variable can always optimize the post-processing away.

I am primarily only interested in whether or not this looks correct. I have not done extensive testing. The X axis may be flipped for all I know. But it looks normal enough.

I am also somewhat concerned about the divide by 0, but not seriously.



Wouldn't it just make more sense to have OpenGL and D3D-specific projection matrices? You know, have different functions to compute projection matrices and call them based on the renderer? After all, it is the projection matrix that defines the particular clip space.
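A sketch of that idea, assuming column-vector math as in the rest of the thread: fold a fixed bias matrix into the GL-style projection once on the CPU, so the shaders never have to care. The matrix and variable names are made up for illustration.

```hlsl
// Hypothetical fix-up, written in HLSL syntax for familiarity: a constant
// bias that remaps clip z from [-w, w] to [0, w] (z' = 0.5*z + 0.5*w).
static const float4x4 glToD3DClip =
{
    1, 0, 0,   0,
    0, 1, 0,   0,
    0, 0, 0.5, 0.5,
    0, 0, 0,   1
};

// Done once when the projection is built, not per vertex:
// projection = mul(glToD3DClip, glStyleProjection);
```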

Also, doing the perspective divide outside of hardware means that you won't be getting perspective-correct interpolation of components.

Also, doing the perspective divide outside of hardware means that you won't be getting perspective-correct interpolation of components.

Yeah, you're right there^. I don't count myself a genius, but the code in the second post, I think, just transforms between the clip conventions and, as you can see, keeps the w component. If I see perspective oddities at some point I will be sure to report back. Right now I am just getting started with this render framework.

As for micromanaging the projection matrices, etc.: I touched on this above. I realize it would be better. But this is being implemented in a vendor-API-agnostic arrangement. I realize the only real games in town are probably OpenGL and Direct3D. But the project is intended to be long-term, and the abstraction layer is not concerned with the contents of the shader registers, for instance; the interfaces do not even mention vertex or pixel shaders.

I reckon the post processing needs will be specified by preprocessor macros. And if the client application can reasonably guess the underlying implementation they can do the necessary preprocessing; and if the guess is right, get a minor performance boost.



I don't see how that has anything to do with this. Being API agnostic on the outside does not mean you're API agnostic on the inside. Indeed, that's the whole point of a platform-neutral API: you put all your platform specific code behind a wall of platform neutrality.

Your outside user should not be making projection matrices by themselves. Or if they are, then you need to modify those matrices before passing them along internally. And if your abstraction is too low-level to do this (that is, you don't know where the projection matrix is or can't otherwise identify it), then your abstraction is too low-level to be a proper API-agnostic abstraction. And then that is what you should be fixing.

^The point of the middling API is to be pluggable. So the number one point of order for it is to be easy to implement and maintain, as there can easily be multiple platform targets... every version of every vendor API is a candidate, including remote rendering, diagnostic, and version-compatibility layers.

The API just assembles and runs GPU programs and loads registers. It doesn't need to know about matrices, because those are shader variables and there can be any number of them or none. It also manages local (e.g. video memory) buffers and accelerates presentation (SwapBuffers) via a portable OS layer. It's maintainer focused and intended to be easy to implement with as few export (virtual method) requirements as possible.

If you must know


So the number one point of order for it is to be easy to implement and maintain

So... how does that jibe with having to hack a vertex shader and back-door this conversion between GLSL's clip space and HLSL's clip space? Because that's not easy to implement, and even you haven't found a correct implementation yet. And let's not forget that it becomes rather more complex if the user is using tessellation shaders or a geometry shader.

"easy to implement and maintain" is not going to work at the level of abstraction you're trying to work with. At least, it's not going to work with "cross-platform from user code". Because this won't be the last of the irreconcilable differences between OpenGL and D3D, especially if someone tries to use older GL versions. So either your abstraction is going to leak (ie: external code will have to do different things based on the platform) or implementing it is going to be a pain.

It's maintainer focused and intended to be easy to implement with as few export (virtual method) requirements as possible.

The number of "virtual method"s is a terrible measure of ease of implementation. What matters is what those methods actually do.

So... how does that jibe with having to hack a vertex shader and back-door this conversion between GLSL's clip space and HLSL's clip space? Because that's not easy to implement, and even you haven't found a correct implementation yet. And let's not forget that it becomes rather more complex if the user is using tessellation shaders or a geometry shader.

Why would this matter for either of these shader stages? They are not the same thing.

I don't consider it a hack. You just advertise the convention in play. Don't get me wrong. Another layer can be implemented to do these things. But it's above and beyond actually interfacing with the vendor APIs because you can implement this without touching the APIs. Also the conventions are much broader than the actual APIs. If you are interested I can show you how this all works with some links but it would be going further off topic. Maybe PM if you really want to know.

"easy to implement and maintain" is not going to work at the level of abstraction you're trying to work with. At least, it's not going to work with "cross-platform from user code". Because this won't be the last of the irreconcilable differences between OpenGL and D3D, especially if someone tries to use older GL versions. So either your abstraction is going to leak (ie: external code will have to do different things based on the platform) or implementing it is going to be a pain.

Maybe we are talking about two different goals. I can't really follow this line of questioning anyway. Sorry. External code will have to do different things depending on what it wants to do. Some things may be redundant. The idea is just to be able to swap out the vendor API without rewriting the code that touches the abstraction API. If a plugin fails it just fails, the end user can try another one and file a complaint.

The number of "virtual method"s is a terrible measure of ease of implementation. What matters is what those methods actually do.

Well, in my experience, if there are many, there is a lot more that must be done before a maintainer can start testing things; and that scares them away. It also makes headers a mess to follow and document. If you are making something monolithic, that is fine. But if you are making a specification for a plugin, keeping it simple makes things easier for everyone (including myself).


I reckon the post processing needs will be specified by preprocessor macros. And if the client application can reasonably guess the underlying implementation they can do the necessary preprocessing; and if the guess is right, get a minor performance boost.

Just to be clear on something I said a little while back that is a little confusing on reading.

In plain English: the "macros" would be set (they are; but you know what I mean) via an API that adds them to the shaders as a #define one way or another. And the "preprocessing" has nothing to do with the shader; that refers to prepping the projection (and/or transform) matrices before the shader runs.

So the basic workflow is: the client application does things however it wants. It may be in control of the shaders or it may not. It adds something like "OPENGL_CLIP" to the list of macros, and then the shader handles the macro. Probably it will have an include header added to it, and be expected to add a footer to the end of the vertex shader to catch OPENGL_CLIP. If the shader is GLSL, nothing is done. If it's HLSL, a little extra work is done by the shader. No big deal. If the client app can't live with that work, it can define DIRECT3D_CLIP, rework its projection matrices itself, and hope for the best.
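As a concrete sketch of that workflow (the macro and function names are made up for illustration; the framework is assumed to inject OPENGL_CLIP as a #define when it compiles the HLSL):

```hlsl
// Hypothetical footer in a shared include header.
float4 FinishClip(float4 p)
{
#ifdef OPENGL_CLIP
    // Inputs were built with OpenGL conventions; remap z to D3D's [0, w].
    p.z = (p.z + p.w) * 0.5;
#endif
    return p;
}
```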

The client app might guess D3D because it loaded a plugin called Direct3D or it might guess OpenGL because the plugin was able to assemble an OpenGL program.

To be honest, it looks pretty clear to me that we are moving toward configuring the entire device via text files, with the APIs doing not much more than managing video memory. So this kind of framework works out just fine.

At this stage the spec only does assembly, but at some point it will likely specify a pseudo shader language, most likely based on functional blocks that the plugins can reasonably implement.

The goal is to provide bare-bones tools for everyday people who want to make very presentable, full-fledged video games with a focus on the fundamentals.

I mean, I was looking at the D3D11 documentation the other day, and there are just so many interfaces, methods, and constants all over the place it is bewildering. But video games still look the same as they always have. So you have to imagine that most of it is just cruft.



The goal is to provide bare-bones tools for everyday people who want to make very presentable, full-fledged video games with a focus on the fundamentals.

Then you are really going about this the wrong way. You want to make a real abstraction, not a bunch of macros and other such nonsense.

People who want to make "very presentable, full-fledged video games" are either people with money (and therefore aren't going to use your tool when they can hire a programmer to write and maintain their engine for them) or indies. And indies aren't trying to use low-level tools; they're using higher-level stuff like UDK, Unity3D, XNA, and so forth. They don't want to deal with the low-level details, because low-level details take precious development time away from important things like gameplay and getting the game finished.

So the only people who would use this tool you're developing are hobbyists: people who noodle around with game code in their spare time, with no real determination to make a better game.

I mean, I was looking at the D3D11 documentation the other day, and there are just so many interfaces, methods, and constants all over the place it is bewildering. But video games still look the same as they always have. So you have to imagine that most of it is just cruft.

... what? Let's ignore the fact that "video games still look the same as they always have" is patently false. You are saying, in all seriousness, that most of D3D11's API is cruft because, in your estimation, games don't look any different. That the number of API functions is in any way related to the way games look. That if an API has a lot of functions, then games must look better or else the API is broken.

There is no chain of logical reasoning from "the API has lots of functions" and "games don't look better" that leads to "the API is full of functions that don't matter."