1. Easiest would be to write a 2D shader that sets z to 1.0 and w to 1 (in the 4D output position). In 3D mode the screen edges are at -1 and 1 along the x and y axes, so a triangle fan of the correct size will fill the screen.

2. You need an orthographic camera matrix instead of a perspective one. Perspective projection is done by dividing the position vector by its 4th element (w). So a matrix that gives a constant w won't do any perspective projection (e.g. set the matrix's bottom row to 0 0 0 1).
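To illustrate point 2, here is a minimal sketch in C. It assumes a column-major layout as OpenGL uses and the usual glOrtho-style coefficients; the names Mat4, Vec4, mat4_ortho and mat4_mul_vec4 are made up for this example and do not come from any library. The point is only that a bottom row of 0 0 0 1 leaves w at 1, so the later divide by w changes nothing.

```c
#include <assert.h>

/* Hypothetical helper types for this sketch only. */
typedef struct { float m[16]; } Mat4;      /* column-major: m[col * 4 + row] */
typedef struct { float x, y, z, w; } Vec4;

/* glOrtho-style matrix. Its bottom row is 0 0 0 1, so w passes through. */
static Mat4 mat4_ortho(float l, float r, float b, float t, float n, float f)
{
    assert(r != l && t != b && f != n);
    Mat4 o = {{0}};
    o.m[0]  =  2.0f / (r - l);
    o.m[5]  =  2.0f / (t - b);
    o.m[10] = -2.0f / (f - n);
    o.m[12] = -(r + l) / (r - l);
    o.m[13] = -(t + b) / (t - b);
    o.m[14] = -(f + n) / (f - n);
    o.m[15] =  1.0f;                        /* bottom row: 0 0 0 1 */
    return o;
}

/* Matrix * column vector, as a vertex shader would compute it. */
static Vec4 mat4_mul_vec4(const Mat4 *a, Vec4 v)
{
    Vec4 r;
    r.x = a->m[0] * v.x + a->m[4] * v.y + a->m[8]  * v.z + a->m[12] * v.w;
    r.y = a->m[1] * v.x + a->m[5] * v.y + a->m[9]  * v.z + a->m[13] * v.w;
    r.z = a->m[2] * v.x + a->m[6] * v.y + a->m[10] * v.z + a->m[14] * v.w;
    r.w = a->m[3] * v.x + a->m[7] * v.y + a->m[11] * v.z + a->m[15] * v.w;
    return r;
}
```

With mat4_ortho(0, 640, 0, 480, -1, 1), the pixel position (640, 480, 0, 1) ends up at x = 1, y = 1 with w still 1, i.e. exactly at the screen corner mentioned in point 1.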

- Some more small optimizations (one of them will help speed up another, bigger optimization I'm planning ;) ).

- glFinish improved by immediately marking all internal objects that were buffered for submission as being available again. This speeds up the following drawing calls because those will find free internal objects faster.

- Fix: non-VBO drawing (glDrawArrays / glDrawElements called with pointers into client memory) was buggy: the render-pipe wasn't flushed when necessary, which could lead to wrong objects being rendered.

- VBO locking improved not to read back data unless necessary. The primary result is probably not that intuitive: It speeds up *non*-VBO drawing quite a lot, to be exact by around a factor of 3! This is because Nova cannot render client memory data, so the library has to copy that into dedicated internal VBOs. Of course changes you make to your own VBOs are sped up too, but since you (hopefully) don't do that too often the overall performance gain won't be that noticeable.

- The GLSL patcher now also handles the following functions so that they can be used even though they have been removed from or replaced in the specs: texture2DProj, texture2DLod, texture2DProjLod, textureCubeLod.

"Really need a side-by-side comparison": is it always a technical difference? I mean, when I watch graphics based on composition on A1 it always looks better to me. Another thing is that I remember switching from a Matrox card to a newer 3D card a while back on another platform and noticing a reduction in IQ.

When comparing Warp3D to Composition in something like Wings Battlefield I prefer the latter, while when viewing blankers I prefer the OGL ones. Watching something similar on the recent Nova also left me a bit ambivalent as far as IQ is concerned, because of the smoother and better framerates in 3D. And it's not only framerates: on other platforms you sometimes get microstuttering, or uneven frame times as they call it nowadays (which we rarely get on A1), and that affects the impression of motion. So I guess there is both the still-picture comparison and the fluid-motion comparison to consider.

A similar thing happens to me when I visit forums like the ScummVM ones, where they compare versions and talk about technical differences, yet the Amiga versions always look much better to me.

@imagodespira
Now it truly begins to look nice and like a pleasant game.
Good work.
BTW please add a dragon: I love dragons so much.

>Seems like Warp3D NOVA (1.32) + OpenGLES2.0 (1.10) gives 3-4 times the frame rate compared to Warp3D

I agree from my own tests: when drawing under the best conditions for Warp3D (I mean a simple textured draw with no lighting or pixel effects), Nova is already 3.36 times faster, but Nova's advantage over Warp3D should increase for more complex scenes or effects...

Daniel has been commissioned by A-EON Technology Ltd to work on OpenGL ES 2.0 for Warp3D Nova.

He has now completed 11 updates since the initial version. Here is his latest update today:

Quote:

OpenGL ES 2 (v1.11) for Warp3D Nova (AmigaOS 4)

I picked up the pencil I dropped after successfully finishing up the initial version 1.0 for the 11th time now, so it's a new week with a new update once again :)

This time we got one fix and some real *massive* optimizations for a common-case scenario.

The fix: glViewport didn't always work correctly. A dumb typo made it falsely depend on the provided target window- / bitmap-height :P Thanks to Frank Menzel for reporting that one. Really wondering how this one could remain unnoticed for so long.

But now to the optimizations :)

It's all about non-VBO drawing commands. So what's that anyway, you may ask?

OGLES2 allows you to draw your geometry either from GPU memory (VBO) or directly from your application's RAM (non-VBO). The latter is often used in older progs when VBOs weren't available, in stuff ported from OGL(ES)1, for simplicity, for vertex-data that constantly changes, etc.

In contrast, Nova only allows you to draw your stuff through VBOs. Therefore the OGLES2 wrapper has to create / update at least one VBO internally if you want to draw something from your application's RAM. So for the lib-user it looks as if he'd draw from his RAM directly, but in reality the wrapper turns everything into a VBO behind the scenes. And that VBO modification means that the data has to be uploaded to the GPU. Furthermore it means that the lib has to wait until that VBO is not used by the GPU anymore, which could be the case if you issued another non-VBO draw-command before.

OGLES2 has to do all that for every single non-VBO glDraw-command you issue, because it has to assume that your vertex data changed. There is no way to tell OpenGL "hey, don't worry, that data will remain unchanged for the next 1000 draw-calls".

As you may guess, all this is a huge bottleneck. So I spent some time to improve that situation.

The basic idea is that it's actually faster to check the whole data for changes and to not upload anything if there's no change than to always upload. And instead of comparing the data I hash it and compare that hash only. The hashing function has been extremely optimized in 1.10 already (it's used internally for other things already). Anyway, that's the core idea, there's a bit more though.
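The hash-and-compare idea described above could be sketched roughly like this in C. This is not the library's code: the update doesn't say which hash is used, so FNV-1a stands in here as an assumption, and CachedVbo and upload_if_changed are hypothetical names. The actual GPU upload is only counted, not performed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a, 64 bit. A stand-in hash, NOT the library's own function. */
static uint64_t fnv1a64(const void *data, size_t len)
{
    const unsigned char *p = (const unsigned char *)data;
    uint64_t h = 14695981039346656037ULL;   /* FNV offset basis */
    for (size_t i = 0; i < len; ++i) {
        h ^= p[i];
        h *= 1099511628211ULL;              /* FNV prime */
    }
    return h;
}

/* Hypothetical record of what one internal VBO currently holds. */
typedef struct {
    uint64_t hash;     /* hash of the data uploaded last */
    size_t   size;     /* size of that data in bytes */
    int      valid;    /* 0 until the first upload */
    unsigned uploads;  /* counts the simulated uploads */
} CachedVbo;

/* Returns 1 if the data had to be (re-)uploaded, 0 if the upload was skipped. */
static int upload_if_changed(CachedVbo *vbo, const void *data, size_t size)
{
    assert(vbo != NULL && data != NULL);
    uint64_t h = fnv1a64(data, size);
    if (vbo->valid && vbo->size == size && vbo->hash == h)
        return 0;                           /* unchanged: nothing to do */
    /* A real wrapper would wait for the GPU and copy the data here. */
    vbo->hash = h;
    vbo->size = size;
    vbo->valid = 1;
    vbo->uploads++;
    return 1;
}
```

Calling upload_if_changed twice with the same array uploads once; changing a single vertex in between triggers a second upload. Comparing only hashes means two different data sets could in theory collide; how the library deals with that is not covered in the update.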

Note: throughout the following lines I'll present the fps of the boing-ball-test if compiled *not* to use VBOs! So it's 1024 identical balls (about 800 triangles each) rendered using client memory vertex- / index-data.

Before, it was very slow in that mode (more at Warp3D level than at W3D Nova level), especially with that big load (although the previous update already contained a nice speedup by around a factor of 3, as you probably remember). However this test situation can be considered a best-case scenario.

I also prepared a second version of that test, one that renders two different types of balls which use completely different vertex-/ index-data sets in an alternating way (ball type A, then type B, then A, B, ...). Triangle- and ball-count is the same as in test 1. This second test is used to somewhat simulate a worst-case scenario.

Optimization 1: client-RAM index-arrays are uploaded only if a data change has been detected.
Variant 1: 3.7 fps
Variant 2: 3.7 fps

Optimization 2: client-RAM vertex-arrays are uploaded only if a data change has been detected.
Variant 1: 16.0 fps
Variant 2: 3.7 fps

Wow :) Variant 1 becomes insane!

Interesting to note is that variant 2 isn't getting any slower (at least not measurably), despite the fact that it now computes a hash over 25 KB of vertex-data (about 800 vertices, 32 bytes each) - and it does so 1024 times per frame for nothing... Yes, the hashing has been optimized indeed ;)

Optimization 3: instead of just one VBO for index- and vertex-arrays the library now manages a whole lot of such VBOs internally.
Variant 1: 15.9 fps
Variant 2: 12.5 fps

Not bad, hm? ;)
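Why a pool of internal VBOs helps the alternating A / B test can be shown with a small simulation in C. Everything here is an assumption made for illustration: the update does not say how the pool is organized, so the fixed slot count, the round-robin replacement and the names VboPool, pool_acquire and simulate_alternating are hypothetical. Uploads are only counted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SLOTS 8   /* assumed maximum pool size, not from the update */

typedef struct { uint64_t hash; int valid; } Slot;

typedef struct {
    Slot     slots[POOL_SLOTS];
    size_t   count;    /* number of slots in use: 1 mimics the old behaviour */
    size_t   next;     /* round-robin replacement position */
    unsigned uploads;  /* counts the simulated uploads */
} VboPool;

/* Returns the slot holding data with this content hash, uploading on a miss. */
static size_t pool_acquire(VboPool *pool, uint64_t hash)
{
    assert(pool != NULL && pool->count >= 1 && pool->count <= POOL_SLOTS);
    for (size_t i = 0; i < pool->count; ++i)
        if (pool->slots[i].valid && pool->slots[i].hash == hash)
            return i;                       /* hit: reuse, no upload */
    size_t victim = pool->next;             /* miss: replace a slot */
    pool->next = (pool->next + 1) % pool->count;
    pool->slots[victim].hash = hash;
    pool->slots[victim].valid = 1;
    pool->uploads++;
    return victim;
}

/* Draws ball types A and B alternately and returns the number of uploads. */
static unsigned simulate_alternating(size_t slot_count, unsigned draws)
{
    VboPool pool = {0};
    pool.count = slot_count;
    for (unsigned d = 0; d < draws; ++d)
        pool_acquire(&pool, (d % 2 == 0) ? 0xA : 0xB);
    return pool.uploads;
}
```

With a single slot every one of 1024 alternating draws evicts the other ball type, so all 1024 draws upload. With several slots both data sets stay resident and only the first two draws upload, which fits the jump seen for variant 2.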

However there's one situation that doesn't benefit much from all this, namely when you render procedurally generated vertex data that changes all the time.

If you use glDrawElements then it's likely that at least the index-array-upload can be optimized away because in the common procedural use-case this will remain constant, but the always-changing vertex-data will not just not benefit from all this but might actually become slower than before because there's the additional hashing overhead now...

However, as we have seen at "Optimization 2", computing the hash is virtually free - at least in those tests here, which, after all, throw around about 50 times (!) more vertices while issuing about 40 times (!) more draw-calls than the current "Wings Remastered" beta does in the most heavily loaded strafing-scenes... (and "Wings Remastered" will be the heaviest Warp3D game in existence). Just to give you an idea of what amounts of data this boing-ball test is actually about...

So, if you stay within those limits the hashing overhead is definitely negligible.

So compared to v1.10 the new v1.11 delivers a performance gain of around factor 3.8 to 4.8 for common non-VBO situations!

And because it sounds even cooler:
Compared to v1.9 we have an improvement of an incredible factor 11.4 to 14.4!

Note: the actual performance gain highly depends on the size of the data you are about to draw with one draw-call. So don't expect that your numbers are identical to those above. But the overall order of magnitude will be around that.

Is that just a heads up that Daniel is still working on it, or is v1.11 already downloadable/within the Enhancer package?

I'm pretty sure that 1.11 is yet to be released. You're being given a peek at what happens behind the scenes. Daniel has managed to get some impressive performance boosts. All stuff to look forward to in the next Enhancer pack update.