On my way to do hierarchical occlusion culling on the GPU using compute shaders, guided by the scene's kd-tree. Here is a video showing only hierarchical frustum culling: http://www.youtube.com/v/FNuqvgHlrz8?version=3&hl=en_US&start= (notice how the culled areas become larger the farther away the camera is from a particular node). Also, the frustum the culling uses is smaller than the camera's frustum to show the effect.

So, how does it work:
- The CPU (Java, that is) initially assembles a list of kd-tree nodes at level N (where N is really small) very quickly and submits that to the compute shader via an SSBO
- At every pass, the compute shader reads in that list, performs culling, and writes the visible nodes into an output SSBO
- Those two buffer bindings ping-pong between each other
- An atomic counter buffer is used to record the actual number of written nodes in each step, which becomes the input for the next step (stream compaction with parallel prefix sums is absolute overkill for such a small number of nodes). I've also implemented warp-aggregated atomics in GLSL, only to see that the GLSL compiler seemingly does this automatically already (as is mentioned in the article for nvcc)
- The last pass writes MultiDrawArraysIndirect structs for the voxels in the visible nodes to a separate SSBO. The SSBO containing all voxels is built in such a way that all voxels within any kd-tree node are contiguous, so it is easy to generate a MultiDrawArraysIndirect struct from a kd-tree node alone, since each node stores the index of the first voxel it or its descendants contain along with the number of voxels
- This last SSBO is then used as the draw indirect buffer for a glMultiDrawArraysIndirectCount() call, together with the atomic counter buffer (holding the number of draw call structs), which becomes the indirect parameter buffer
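The per-node visibility test the compute shader performs can also be sketched on the host side. Here is a minimal Java sketch of the standard plane/AABB "positive vertex" test, assuming planes are stored as (nx, ny, nz, d) with n·p + d >= 0 meaning "inside"; the actual GLSL version differs:

```java
// Sketch of a per-node frustum test (hypothetical Java port for illustration;
// the real version is the GLSL compute shader linked in the post).
final class FrustumCull {
    // plane = {nx, ny, nz, d}; a point p is inside when n·p + d >= 0
    static boolean aabbInsidePlane(double[] plane, double[] min, double[] max) {
        // pick the "positive vertex" of the AABB with respect to the plane normal
        double px = plane[0] >= 0 ? max[0] : min[0];
        double py = plane[1] >= 0 ? max[1] : min[1];
        double pz = plane[2] >= 0 ? max[2] : min[2];
        return plane[0] * px + plane[1] * py + plane[2] * pz + plane[3] >= 0;
    }
    static boolean visible(double[][] planes, double[] min, double[] max) {
        for (double[] p : planes)
            if (!aabbInsidePlane(p, min, max))
                return false; // node is completely outside one plane -> culled
        return true;
    }
}
```

A node passing this test is then either refined into its children or, at the last pass, turned into a draw struct.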

Here is the compute shader doing the culling: https://gist.github.com/httpdigest/15399efe2b60a2b31d1c2cbe414ce5cf and here is some portion of the host code driving the culling:

Next will be what will bring the most benefit for this highly fragment-shader- and ROP-bound rendering: combining Hi-Z occlusion culling with the hierarchical frustum culling. This means that Hi-Z occlusion culling will also be done hierarchically, starting with a coarse kd-tree node level and refining the nodes when they are visible. The reason why I am doing frustum culling on the GPU is: Hi-Z culling has* to be done on the GPU, and doing it hierarchically through the kd-tree will benefit from fewer nodes to be tested.

*That's not entirely true, since there are games out there using a software rasterizer to cull on the CPU.

Today was an interesting day, as I witnessed how two parts of the rendering pipeline competed for being the major bottleneck when applying two slightly different techniques of voxel rendering.

The first technique was rendering point sprite quads covering the screen-space projection of the voxel, as presented by: http://www.jcgt.org/published/0007/03/04/ The second technique I came up with was to use the geometry shader to compute and generate the convex hull of the projected voxel with the help of this nice paper: https://pdfs.semanticscholar.org/1f59/8266e387cf367702d16acf5a4e02cc72cb99.pdf

While the first technique produces a very low load on vertex transform/primitive assembly, it suffers from many additional fragments being generated for close voxels, where the quad enclosing the screen-space projected voxel contains a large margin/oversize to a) still make it a quad and b) cover the voxel entirely. This produces a higher load on fragment operations (the fragment shader doing the final ray/AABB intersection and, likely more importantly, the ROPs reading/comparing/writing depth and writing color).

Now my idea was to reduce fragment operation costs by reducing the number of excess fragments the quad produces, by not making it a quad anymore but a perfectly fitting convex hull comprising either 4 or 6 vertices. Having heard many bad stories about how geometry shaders perform, I still gave it a try, and I was positively surprised by an improvement of roughly 21% in total frame time when generating the fragments with the convex hull for close voxels. Here, the cost of fragment operations was reduced to the point where it wasn't the bottleneck anymore; instead, vertex operations (passing the GL_POINTS to the geometry shader and there emitting a single triangle strip of either 4 or 6 vertices) now were.
One could literally see how, for moderately far away voxels where the screen-space quad had little oversize/margin, the quad rendering solution overtook the convex hull geometry shader solution. The latter, however, was ideal for close voxels. So, it's going to be a hybrid solution in the end. Here are some images and a video showing the overlap/margin of the point sprite quad rendering and the (missing) overlap of the convex hull rendering: http://www.youtube.com/v/7TFKwAUZ0qE?version=3&hl=en_US&start=
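For intuition on the 4-or-6 vertex count: the silhouette of a projected box is a quad when only one face is front-facing from the eye, and a hexagon when two or three faces are. A hypothetical sketch of that face-counting step (not the actual geometry shader, which works on the projected vertices):

```java
// Hypothetical sketch: the projected convex hull of an AABB is a quad (4 vertices)
// when the eye sees exactly one face, and a hexagon (6 vertices) when it sees
// two or three faces. Counts one front-facing face per axis slab the eye is outside of.
final class HullCount {
    static int hullVertexCount(double[] eye, double[] min, double[] max) {
        int visibleFaces = 0;
        if (eye[0] < min[0] || eye[0] > max[0]) visibleFaces++; // a +/-X face faces the eye
        if (eye[1] < min[1] || eye[1] > max[1]) visibleFaces++; // a +/-Y face faces the eye
        if (eye[2] < min[2] || eye[2] > max[2]) visibleFaces++; // a +/-Z face faces the eye
        // visibleFaces == 0 means the eye is inside the box; treat as the quad case here
        return visibleFaces <= 1 ? 4 : 6;
    }
}
```

This is why the geometry shader emits a triangle strip of either 4 or 6 vertices and never anything in between.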

Not much to show, but I have been modifying the core of the game to integrate YAML configuration instead of hardcoded classes. I managed to reduce the code base by ~1.8k LoC (the solution looks much cleaner now) at the cost of ~500 YAML lines. The YAML configuration is used as a base when something is being set up; for example, a unit is initialized from the following configuration:
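The original snippet isn't reproduced here; as a rough illustration, a unit definition along those lines (all keys and values hypothetical, not the game's actual schema) might look like:

```yaml
# Hypothetical unit definition; key names invented for illustration
unit:
  name: swordsman
  health: 120
  speed: 2.5
  hitbox:
    width: 16
    height: 24
  attacks:
    - name: slash
      damage: 15
      cooldown: 0.8
```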

Besides being cleaner, this has prepared the ground for reloading the configuration while the game is running in debug mode, which is super useful in my opinion when, for example, you need to tune object hitboxes and balance the game. I plan to make available both the YAML examples and the console-like widget I have built for the debug mode of my application.

Created a dynamic door/key event system, which basically allows me to alter tiled maps during runtime. For instance, I can change a tile's blocked value from true to false when the player holds a certain item.
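A minimal sketch of that idea, assuming Tiled-style string properties on tiles and a simple inventory set (all names hypothetical):

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the runtime tile-property change described above:
// when the player holds the required item, flip the tile's "blocked" flag.
final class DoorSystem {
    static void tryOpen(Map<String, String> tileProps, Set<String> inventory, String requiredItem) {
        if (inventory.contains(requiredItem) && "true".equals(tileProps.get("blocked"))) {
            tileProps.put("blocked", "false"); // door opens: tile no longer blocks movement
        }
    }
}
```

The collision code then only ever reads the "blocked" property, so the door needs no special handling there.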

Got an idea today for how to efficiently determine the set of actual visible voxels for gathering/caching irradiance only for those visible voxels. Will write a detailed description later. Here is a video highlighting the visible voxels captured at certain points:http://www.youtube.com/v/0xMQYkvWGJc?version=3&hl=en_US&start=

Thanks. Visual debugging is definitely important, and I was just tired of trying to analyze glGetBufferSubData() numbers. As for the time to do it: up till now this has not been a strong factor for me. I don't have any abstractions in the code, nor do I factor out common code or parameterize it in any way. Copy/pasting from a previous demo and applying slight modifications is just so much faster. Also, I try to keep the programs single-file and as small as possible to quickly iterate on different ideas. Only shader sources need additional files. This is why I am really looking forward to Java 12 becoming a thing with (multiline) raw string literals, so that a whole demo/example can indeed be one file (without having to resort to multiple concatenated "blabla\n" strings, of course).


There are only very few cases where the right abstraction doesn't give you more development speed. Finding the right abstraction that doesn't make small changes harder is a different topic, though. I used to think just like you, some time ago, in my own engine, before I found the discipline to do some "proper programming" and not just another bunch of hacks. It was so worth it. The more often one does it, the better one gets at it, and the more one realizes how much it is worth.


I have a system that uses a file-watching mode for auto-reloading and runtime-compiling shader code. How would you get this when you have all your shaders as static strings somewhere in your classes? Or don't you need runtime recompilation on change?

Yes, when you arrive at the point where you know which parts will change and which parts won't, then it is beneficial to build abstractions. But I certainly haven't reached this point yet. Any new technique I discover might make me throw away most of the code anyway. It is only after you decide on a certain "architecture" (rendering pipeline, capabilities, flexibility/pluggability, etc.) for your engine that you can start organizing your code in such a way that changes to moving parts become easy, while changes to static parts (which you anticipate will not change often) become harder. I've just recently used a very hacky Java Preferences way of storing and loading presets of window size/position and camera position and orientation to allow for performance comparisons between code changes.
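Such a preset store really can be just a handful of java.util.prefs calls. A sketch, assuming window size and camera position are stored as plain ints and floats (key names invented):

```java
import java.util.prefs.Preferences;

// Hypothetical sketch of a "hacky" preset store using java.util.prefs.Preferences:
// persist window size and camera position between runs for A/B performance comparisons.
final class Presets {
    static void save(Preferences p, int w, int h, float camX, float camY, float camZ) {
        p.putInt("winW", w);
        p.putInt("winH", h);
        p.putFloat("camX", camX);
        p.putFloat("camY", camY);
        p.putFloat("camZ", camZ);
    }
    static float[] loadCam(Preferences p) {
        // defaults kick in on the very first run, before any preset was saved
        return new float[] { p.getFloat("camX", 0), p.getFloat("camY", 0), p.getFloat("camZ", 0) };
    }
}
```

Preferences persists per-user without any file handling of its own, which is exactly what makes it attractive for a quick hack like this.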

Reloading of shaders at runtime sounds nice, definitely. With static (constant pool) strings this will of course not work. Most changes I had to do in the shaders also involved host/shader interface changes, especially when trying to find an optimal SSBO memory layout for BVH and kd trees. I had to play around with that a lot. So runtime reloading of shaders was simply not something I needed to look into in the past.

I got a "Power Chord" patch working as an option for the PTheremin yesterday.

Basically two sines, a perfect fifth apart [hmm, guitars generally use equal temperament instead? Maybe, maybe not, if they tune by using harmonics], with simple clipping, and with a parameter available for real-time interpolation between the clip and the sine. It sounds decent, not ideal.
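A sketch of that patch, assuming a hard clip threshold of ±0.6 and a mix parameter morphing between the clean sum and the clipped sum (the actual PTheremin thresholds and parameters will differ):

```java
// Sketch of a "power chord" patch: root + perfect fifth (3:2 frequency ratio),
// summed, hard-clipped, with mix interpolating between clean (0) and clipped (1).
// Clip threshold 0.6 is an assumed value for illustration.
final class PowerChord {
    static double[] powerChord(double rootHz, double mix, int n, double sampleRate) {
        double fifthHz = rootHz * 1.5; // perfect fifth = 3:2 ratio (just intonation)
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            double t = i / sampleRate;
            double clean = 0.5 * (Math.sin(2 * Math.PI * rootHz * t)
                                + Math.sin(2 * Math.PI * fifthHz * t));
            double clipped = Math.max(-0.6, Math.min(0.6, clean)); // simple hard clipping
            out[i] = (1 - mix) * clean + mix * clipped;            // real-time morph parameter
        }
        return out;
    }
}
```

The hard clip is also where the aliasing mentioned below comes from: clipping adds high harmonics that fold back over the Nyquist frequency unless filtered or oversampled.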

I tried a couple of other nonlinear equations given by DspRelated and CCRMA, but neither was radical enough.

It's not finished, though, as the aliasing is over the top. I am going to work on implementing a low-pass filter. My audio engineering friend points out that I may need to oversample to get enough filtering without also compromising the basic tone.

Having seen that on Nvidia drivers glGetProgramBinary() also outputs the ARB_vertex_program/ARB_fragment_program/NV_gpu_program4/5 assembly text, I began learning the NV_gpu_program4 assembly language and reading about low-level shader optimizations (MAD instead of MUL and ADD, using negation on instruction operands instead of negating the result, etc.) to squeeze the most performance out of it.

EDIT: Hit exactly this GLSL compiler bug now: https://devtalk.nvidia.com/default/topic/952840/bug-report-linker-error-when-indexing-dvec-in-an-unbounded-loop/ Plus, whenever I use bounded loops, the time the compiler takes to compile the GLSL code is directly proportional to the upper bound of the loop... So if I have for (int i = 0; i < 1024; i++) ..., glCompileShader() does not actually finish in foreseeable time... This makes me want assembly even more...

I learned a lot more about 3D rendering, and finally managed to add a third dimension to the map. I have low-poly lighting and adapted marching squares into 3D to give the map regions a better shape. The height map I'm using isn't the best and I'm far from done with the map rendering, so the mountains don't look like mountains yet. But they will!

I like the upper one a bit better, but there's not much of a difference, tbh. I would strongly suggest you start working on something more important than fine-tuning textures if you want to finish this within the next decade. Any gameplay yet? Do you have a main menu? A settings menu? How many units have you implemented so far?

I'm telling you this because I also get caught up with fine-tuning very often, and it's not very productive. It's not going to get you closer to your goal if you waste your time adjusting textures, trust me. You will probably make some huge changes later on that force you to swap out your fine-tuned textures anyway.

I understand you very well. Those textures just wouldn't give me any rest. I have both the main menu and a good part of the gameplay. But there is still a lot of work to be done.

Started working on a rudimentary options menu, which is basically just a visual representation of the underlying settings.json file of my game. GUI is really my weak spot, so I just played around a little and started implementing methods which allow the player to alter the settings file through the options menu. Backend methods for loading, saving and adapting the settings file are already done. Really not enjoying GUI somehow. The options menu is far from done but basically does what it needs to right now. I am also planning to allow the player to change controls later in a second tab, and maybe add a restore-defaults button.
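To illustrate the idea of the menu mirroring the file one-to-one, a settings.json along those lines (all keys hypothetical, not the game's actual file) might look like:

```json
{
  "resolution": { "width": 1920, "height": 1080 },
  "fullscreen": false,
  "masterVolume": 0.8,
  "controls": {
    "moveUp": "W",
    "moveDown": "S"
  }
}
```

Each widget in the options menu then simply reads and writes one of these keys, and a restore-defaults button just rewrites the file from a bundled default copy.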

Implemented multi-level k-d tree traversal. That means there are now two acceleration structures: top-level and bottom-level. The top-level tree is used to spatially subdivide the (transformed) axis-aligned bounding boxes of the few model instances. Each leaf in that tree then indexes into a bottom-level k-d tree for the model, which holds the voxels.

The following video shows a debug render of the actual k-d tree nodes, color-coded based on the number of descents from parent to child nodes and the number of neighbor/adjacent node followings. Each leaf of the top-level tree holds a per-instance transformation matrix (the video shows different per-instance positions and rotations). Each bottom-level k-d tree references a color palette which each voxel indexes into with a 0..255 byte value. So, this is compatible with MagicaVoxel color/material palettes. Next comes adding scaled-down models into the large voxel world. The video shows the chr_sword.vox model shipped with MagicaVoxel. http://www.youtube.com/v/kZ-LvwoMZVs?version=3&hl=en_US&start=

Layout of the acceleration structure:
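The actual layout listing is not reproduced here. As a hypothetical Java-side mirror of the two-level structure described above (all field names invented; the real SSBO layout will differ), it might look like this:

```java
// Hypothetical sketch of the two-level acceleration structure described above.
// Top-level leaves carry a per-instance transform and point at a bottom-level tree;
// bottom-level nodes reference a contiguous voxel range and the model's palette.
final class TopLevelNode {
    float[] transform = new float[16]; // per-instance model matrix (leaf only)
    int leftChild, rightChild;         // child node indices, -1 for leaves
    int bottomLevelRoot;               // leaf only: root of the model's bottom-level k-d tree
}
final class BottomLevelNode {
    int splitAxis;                     // 0 = x, 1 = y, 2 = z
    float splitPos;                    // split plane position along that axis
    int firstVoxel, voxelCount;        // contiguous range in the global voxel buffer
    int paletteIndex;                  // which MagicaVoxel-style 256-entry palette to use
}
```

Keeping the voxel range per node contiguous is what later makes generating draw structs from visible nodes trivial.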
