The StructBuffer holds the position of the 'sliding window'. We will use the Particle like this:

Particle p = new Particle(buf);

buf.position(13); p.x(0.4f); p.y(0.5f); p.z(0.6f); p.state(-5);
buf.position(14); p.x(0.3f); p.y(0.2f); p.z(0.1f); p.state(42);

So the Particle isn't sliding over the data, but the data is sliding underneath the Particle.
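Concretely, the accessor calls can boil down to absolute reads/writes at (window position × stride). Below is a toy mock-up of that idea; the class internals, the 16-byte stride and the field offsets are my assumptions for illustration, not Riven's generated code (which goes through Unsafe rather than ByteBuffer calls):

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the sliding-window view. Assumed layout per struct:
// float x @ 0, float y @ 4, float z @ 8, int state @ 12 => 16-byte stride.
class StructBuffer {
    final ByteBuffer backing;
    int base;                        // byte offset of the current window
    static final int STRIDE = 16;    // 3 floats + 1 int

    StructBuffer(ByteBuffer backing) { this.backing = backing; }
    void position(int structIndex)   { base = structIndex * STRIDE; }
    StructBuffer duplicate()         { return new StructBuffer(backing); } // shares storage
    ByteBuffer getBacking()          { return backing; }
}

class Particle {
    private final StructBuffer buf;
    Particle(StructBuffer buf) { this.buf = buf; }

    // each accessor is an absolute read/write relative to the window position
    float x()           { return buf.backing.getFloat(buf.base); }
    void  x(float v)    { buf.backing.putFloat(buf.base, v); }
    float y()           { return buf.backing.getFloat(buf.base + 4); }
    void  y(float v)    { buf.backing.putFloat(buf.base + 4, v); }
    float z()           { return buf.backing.getFloat(buf.base + 8); }
    void  z(float v)    { buf.backing.putFloat(buf.base + 8, v); }
    int   state()       { return buf.backing.getInt(buf.base + 12); }
    void  state(int v)  { buf.backing.putInt(buf.base + 12, v); }
}
```

Moving the window with position() and reusing the one Particle instance is what avoids per-element object creation.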

What if we want two Particles accessing the same dataset?

StructBuffer pBuf0 = buf;
StructBuffer pBuf1 = buf.duplicate();

Particle p0 = new Particle(pBuf0);
Particle p1 = new Particle(pBuf1);

pBuf0.position(11); p0...;
pBuf1.position(9);  p1...;

// p1.x = p0.y
p1.x( p0.y() );

Once we're done manipulating the particle-data, we can extract the ByteBuffer from the StructBuffer like this:

ByteBuffer bb = buf.getBacking();

As said above, the performance is 3 times faster (!!) than iterating over an array of "struct-objects" (Vec3), and about 66% of the speed of directly manipulating the FloatBuffers. This will only get better once Sun natively implements structs in the VM. It also takes away the burden of massive garbage collection and object creation.

I'm finalizing the source code at the moment, but performance is kind of stuck at this level (which is nothing to be ashamed about, IMO).

Is this a usable design / framework, are there suggestions how to change things? I'd like to hear your comments.


At first I want to apologize for commenting without being well informed about the whole structs discussion, though I read some postings and the RFE.

Well, one thing I'd really like to see cleared up is the difference between Structs and MappedObjects. As I said, I don't have the definitions (which may be included in some other posts) in my head, but in order to minimize misunderstandings, the following describes the way I will use both terms in this post:

Structs: An automatic mechanism to copy a class from or to a buffer (and only a buffer). Classes are still reference types and can be null, in contrast to C# structs, which are value types. (So no default constructor is needed for Java structs.)
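A minimal sketch of a struct in that sense, with explicit copy-in/copy-out (the load/save method names and the 16-byte layout are made up for illustration):

```java
import java.nio.ByteBuffer;

// Hypothetical copy-style struct: plain Java fields, copied to/from a buffer on demand.
class ParticleStruct {
    float x, y, z;
    int state;

    static final int SIZE = 16; // 3 floats + 1 int, tightly packed (an assumption)

    // copy the fields in from the buffer at an absolute struct index
    void load(ByteBuffer buf, int index) {
        int base = index * SIZE;
        x = buf.getFloat(base);
        y = buf.getFloat(base + 4);
        z = buf.getFloat(base + 8);
        state = buf.getInt(base + 12);
    }

    // copy the fields back out into the buffer
    void save(ByteBuffer buf, int index) {
        int base = index * SIZE;
        buf.putFloat(base, x);
        buf.putFloat(base + 4, y);
        buf.putFloat(base + 8, z);
        buf.putInt(base + 12, state);
    }
}
```

Between load() and save(), field access is plain Java field access; the buffer is only touched at the copy boundaries.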

MappedObjects: An interpretation of a contiguous segment of a buffer, with shared memory. This means any modification to the objects' (marked) attributes or properties is reflected in a change to the buffer. Further, two or more objects can be mapped to overlapping regions of the buffer.
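The shared-memory behaviour described here already exists at the raw-buffer level via NIO's duplicate(); a tiny demonstration, assuming nothing beyond the standard ByteBuffer API:

```java
import java.nio.ByteBuffer;

public class SharedViewDemo {
    public static float demo() {
        ByteBuffer a = ByteBuffer.allocateDirect(64);
        ByteBuffer b = a.duplicate(); // separate position/limit, same underlying storage
        a.putFloat(8, 1.5f);          // write through one view...
        return b.getFloat(8);         // ...and observe it through the other
    }
}
```

Mapped objects extend this sharing from raw get/put calls up to typed field access.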

OK, with this I can start commenting on your implementation:

Firstly, from my point of view you are talking about a mapping between a Buffer and Objects. The main difference from mapped objects as described above is that you usually only have a few instances: as many as the number of different structs that have to be accessed at once. Since a non-VM-integrated kind of mapped object always holds at least a single reference to a ByteBuffer, your technique should save a decent amount of memory. The other side of the coin is that, IMHO, the best argument for mapped objects is to avoid those fancy get and put method calls on a buffer. Your implementation, however, only lifts them to a slightly higher level: from primitives to objects (which are only allowed to have primitive attributes?). I don't like such a coding style; I find arrays of classes much more natural.

Btw: as far as I understand, your current implementation needs two compilation passes, right? If so, the following may be interesting to you: with Java 6, you can dynamically compile a dynamically written class, then load and instantiate it in the same runtime. (Take a look at javax.tools.)
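For reference, a hedged sketch of that javax.tools route (the class name and source string are placeholders; compiling to the working directory is the simplest setup, and a custom JavaFileManager could keep everything in memory instead):

```java
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.net.URI;
import java.util.Collections;

// Sketch: compile a runtime-generated class from a String with the Java 6 compiler API.
public class RuntimeCompile {

    // an in-memory "source file" backed by a String
    static class StringSource extends SimpleJavaFileObject {
        final String code;
        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }
        @Override public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    // Returns true if the generated source compiled. The .class file lands in the
    // working directory, from where a URLClassLoader could load and instantiate it.
    public static boolean compile(String className, String source) {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler(); // null on a bare JRE
        if (javac == null) return false;
        JavaCompiler.CompilationTask task = javac.getTask(
                null, null, null, null, null,
                Collections.singletonList(new StringSource(className, source)));
        return task.call();
    }
}
```

This removes the need for a separate build step for the generated struct accessors.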

Well, back to topic: I personally prefer the struct way, because in my experience multiple mappings onto the same data usually lead to undesirable side effects. Especially in multi-threaded applications, this can be really horrible to keep safe. What I really like about Java is that the language tries to avoid this. IMHO mapped objects are best compared to pointers, which were wisely not integrated into Java (in contrast to C#). What'll be the next thing, an extension which allows you to take control of finalization yourself? (Like delete in C++.)

The only argument against structs may be that a copy operation from or to a buffer is needed, but since structs are usually small objects, that shouldn't have a strong effect on the performance. Therefore I currently would do an implementation like this:

In future, I hope to change this to (which should be possible to do ourselves with Java 6 and annotations):

@Struct // --> this will automatically add the interface and generate an implementation / override the old one
public class Particle3 {
    public float x;
    public float y;
    public float z;
    // how about: @Endian=little ? this would define how the data is saved/loaded from the ByteBuffer
    public int state;
}

One of the main ideas of structs/mapped objects/whatevers was to avoid memory copies. You are completely able to produce the effect of structs by writing bytecode rewriters and such, but at the end of the day any savings you make in typing syntactic sugar are buggered by the VM being slow at doing the memory-copy thing. Just my 2p on the issue, not really to do with Riven's implementation.

Third, structs are not meant as copies; your StructBuffer impl. does a lot of copying, which is disastrous for performance.

You are kidding, aren't you?

One usually does ONE copy per update/frame, e.g. for dynamic geometry sent to the graphics card. Even with a heavy dynamic-geometry load this will NEVER be a performance penalty, let alone the bottleneck of an application. It's just a simple, fast copy, and if the data is dynamic it will be modified before the copy anyway. Java is great at these bulk operations; making use of them can drastically increase performance. I've opened another thread (Why mapped objects (a.k.a structs) aren't the ultimate (performance) solution) for explaining a possible use, because I don't want to hijack yours.

Btw, in C# + DirectX, geometry is always copied when sent to a VertexBuffer, since structs are always copied by merely assigning them to a variable (e.g. putting them into an array) or passing them as an argument (except when using the ref keyword). Nobody at Microsoft complains about performance. Further, it is very convenient, since copying to a stream/buffer is, as I described, automatically available for structs defined the way I advise.

Further, your load()/save() methods almost make me cry, ByteBuffer.getFloat() is even much slower than FloatBuffer.get() which is much slower than FloatBuffer.get(i) (which is much slower than Unsafe.getFloat(pointer) - which I use )

And if your benchmark tells you they have equal speed, your benchmark is flawed.

And about your copying: you're copying a_lot_of_times_per_frame/update, basically for every struct access, not in a 'batch-write' to the GPU.

It sounds to me like you never worked with NIO and performance-critical code, seeing the trivial mistakes you make. (No offence intended.)


Further, your load()/save() methods almost make me cry, ByteBuffer.getFloat() is even much slower than FloatBuffer.get() which is much slower than FloatBuffer.get(i) (which is much slower than Unsafe.getFloat(pointer) - which I use )

And if your benchmark tells you they have equal speed, your benchmark is flawed.

I have no benchmark, but what I can tell is that if I simply comment out the lines which fill the buffers sent to the graphics card every frame, the framerate doesn't increase at all. So copying the data to the buffer has no influence on performance for me (and I copy about half a million vertices (positions and normals only) every frame).

And about your copying: you're copying a_lot_of_times_per_frame/update, basically for every struct access, not in a 'batch-write' to the GPU.

It sounds to me like you never worked with NIO and performance-critical code, seeing the trivial mistakes you make. (No offence intended.)

You didn't understand my version. In my code the 'structs' are normal Java classes. Accessing a field means accessing a field, no buffer involved at all. Once all modifications are done, I either:

- copy the data to a buffer I have allocated once inside the VM, and send it to the graphics card with glBufferSubData
- or directly use the memory from OpenGL (glMapBuffer) and copy the data into the returned ByteBuffer
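The first option might look roughly like this; the GL upload itself is left as a comment, since it needs a live GL context and a binding such as LWJGL:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Sketch of the once-per-frame batch copy described above (names are illustrative).
public class BatchUpload {

    // allocate the direct buffer ONCE, up front (3 floats per vertex assumed)
    public static FloatBuffer allocate(int vertexCount) {
        return ByteBuffer.allocateDirect(vertexCount * 3 * 4)
                         .order(ByteOrder.nativeOrder())
                         .asFloatBuffer();
    }

    // per frame: pack the plain-Java vertex data into the reused buffer
    public static FloatBuffer pack(float[][] positions, FloatBuffer vb) {
        vb.clear();
        for (float[] p : positions)
            vb.put(p);            // bulk relative put per vertex
        vb.flip();
        // here one would call e.g. glBufferSubData(GL_ARRAY_BUFFER, 0, vb);
        return vb;
    }
}
```

One linear copy per frame, regardless of how many times the fields were touched beforehand.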

That sounds pretty much like: you don't understand my Ships, they are Cars.

No, they aren't that inefficient. This is a copy operation, constant in time (linear for an array). My whole point is that this will (almost) never be the bottleneck of a REAL application. And if it is, Java may be the wrong language for you, sorry. Invent a new one where all data is stored in some kind of buffer and objects are only mapped onto it; that would be great for you, right? (Also no offense intended.)

Then don't hijack this thread talking about things that you don't want, don't need, and don't understand.

I was interested in comments on the API, not about "do we need structs?" - yes, some of us need them, most of us don't.

Hey, don't blame me, that's why I opened the other thread. If you like, please comment there on whether you think the kind of optimization I discussed in the 2nd half is possible with your struct implementation. That would be constructive.

Further, I commented on your API: I don't like the style of encapsulating a single struct into a buffer class. Don't you remember that?

Btw, you said that ByteBuffer.putFloat is slower than FloatBuffer.put, but you used it yourself (access.putFloat, ...). Did I miss something?

Of course you don't like the API/design when you don't need it, because it's kinda "hackery" and non-Java, but it's required when performance is all one cares about.

And no, I'm not encapsulating a *single* struct into a buffer class (check the two-structs example). And the StructBuffer class is not bound to the Particle class, but can be used for any generated struct.

About access.putFloat() - My own quote:

Quote

ByteBuffer.getFloat() is even much slower than FloatBuffer.get() which is much slower than FloatBuffer.get(i) (which is much slower than Unsafe.getFloat(pointer) - which I use)

So I'm using the Unsafe-class (in a 100% safe way)
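For the curious, this is the usual way to get hold of Unsafe and do the pointer-style float access mentioned above; the reflection trick and the roundTrip helper are illustrative, not Riven's actual code:

```java
import sun.misc.Unsafe;
import java.lang.reflect.Field;

// Sketch of the Unsafe fast path: raw float access at a native address,
// with no per-access range check (unlike ByteBuffer).
public class UnsafeFloats {
    static final Unsafe UNSAFE;
    static {
        try {
            // the standard (unsupported) reflection trick to obtain the singleton
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            UNSAFE = (Unsafe) f.get(null);
        } catch (Exception e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // the "100% safe" part lies in keeping the pointer arithmetic inside
    // generated code whose bounds are known by construction
    public static float roundTrip(float v) {
        long addr = UNSAFE.allocateMemory(4);
        try {
            UNSAFE.putFloat(addr, v);
            return UNSAFE.getFloat(addr);
        } finally {
            UNSAFE.freeMemory(addr);
        }
    }
}
```

Skipping the bounds check is exactly where the speed advantage over ByteBuffer.getFloat() comes from.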


Look, what point exactly are you trying to make here? Sure, using Unsafe is hackery, but didn't Riven admit to that right from the start? I've been following this discussion, found it refreshing for a while, but am getting a little concerned now, cos I can't help feeling you missed the entire point! You consider structs as value types that maintain their own state. Riven's structs merely wrap existing data. You provide convenience methods to read from and write to a buffer. Riven's structs provide a (structured, fast-path) view on bulk data. Why do you bluntly suggest using a different language, when someone has just shown a way of getting really good performance out of Java, on a silver plate? This may not be your cup of tea, and to be fair, I wouldn't use this because of the Unsafe bit. But still, I'm fascinated!

I hope I haven't missed anything but anyway, that's my take of this thread so far.

Fecking structs, I wish I'd never called them that. The whole point is nothing to do with C or structs in C or "value types". It is all about providing object-oriented access to data held in ByteBuffers in an efficient manner which is easy to implement in the current VMs, without breaking any semantics in the language or the specs.

Fecking structs, I wish I'd never called them that. The whole point is nothing to do with C or structs in C or "value types". It is all about providing object-oriented access to data held in ByteBuffers in an efficient manner which is easy to implement in the current VMs, without breaking any semantics in the language or the specs.

I use the term 'dipper' for similar structures. The thing you talked about might be called a memory-mapped, or sliding-window, dipper.

Riven, I hate to stop you getting that bytecode transformer finished ;-) but I still have little insight into which buffer methods are fast and which aren't. There once was a thread on that (http://www.java-gaming.org/forums/index.php?topic=11385.0), but it ended with the conclusion that, at least for Mustang, all buffer methods should perform excellently.

So I learned that absolute puts and gets are faster than relative puts and gets, didn't I? Does it make a difference whether you call e.g. ByteBuffer.putFloat() or convert the ByteBuffer to a FloatBuffer and then call put? What else are the take-away points for making direct buffer access fast?
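To make the question concrete, here are the three access flavours being compared (relative puts advance the buffer's position, absolute ones don't, and the FloatBuffer view is created once rather than per access):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

public class PutFlavours {
    public static float[] demo() {
        ByteBuffer bb = ByteBuffer.allocateDirect(16 * 4).order(ByteOrder.nativeOrder());
        FloatBuffer fb = bb.asFloatBuffer(); // create the view once, outside any loop

        bb.putFloat(0, 1.0f);   // absolute put on the ByteBuffer (index in bytes)
        fb.put(1, 2.0f);        // absolute put on the FloatBuffer view (index in floats)
        fb.position(2);
        fb.put(3.0f);           // relative put: writes AND advances the position

        return new float[] { fb.get(0), fb.get(1), fb.get(2) };
    }
}
```

The view inherits the ByteBuffer's byte order at creation time, so the absolute byte-level write at offset 0 and the float-level read at index 0 see the same value.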

I've looked at Sun's DirectByteBuffer. It seems I can allocate, access and free memory using Unsafe pretty simply. The only reason not to use a ByteBuffer when going for performance seems to be that a range check is performed at every access. However, I found a lot of "anInt << 0" bitshifts. My brain, and my IDE too, say that these are completely pointless. What are they good for?

[some time later]

I wrote a small benchmark, but it lies. The Java float array is, depending on some fine tuning, up to 30 times faster than accessing the data via Unsafe. I guess the VM is understanding my benchmark and starts to cheat.

It seems that arrays are a LOT faster than any buffer available. Running on 1.5.0_06 -server. WTF? Looking at this, I have to ask: what are buffers good for? Why is the Unsafe access that slow? Shouldn't it be faster?
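One plausible reason such a micro-benchmark 'lies': without a warm-up pass and without consuming the results, HotSpot can dead-code-eliminate or hoist the entire loop. Below is a sketch of the usual countermeasures (volatile sink, warm-up before timing), not a rigorous benchmark:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

public class BenchSketch {
    static volatile float sink; // consuming results defeats dead-code elimination

    static float sumArray(float[] a, int reps) {
        float acc = 0f;
        for (int r = 0; r < reps; r++)
            for (int i = 0; i < a.length; i++)
                acc += a[i];
        return acc;
    }

    static float sumBuffer(FloatBuffer b, int reps) {
        float acc = 0f;
        for (int r = 0; r < reps; r++)
            for (int i = 0; i < b.capacity(); i++)
                acc += b.get(i); // absolute get
        return acc;
    }

    public static void main(String[] args) {
        float[] a = new float[1024];
        FloatBuffer b = ByteBuffer.allocateDirect(1024 * 4)
                                  .order(ByteOrder.nativeOrder()).asFloatBuffer();
        for (int i = 0; i < 1024; i++) { a[i] = i; b.put(i, i); }

        // warm-up so both paths get JIT-compiled before timing starts
        sink = sumArray(a, 1000) + sumBuffer(b, 1000);

        long t0 = System.nanoTime();
        sink = sumArray(a, 10000);
        long tArray = System.nanoTime() - t0;

        t0 = System.nanoTime();
        sink = sumBuffer(b, 10000);
        long tBuffer = System.nanoTime() - t0;

        System.out.println("array: " + tArray + " ns, buffer: " + tBuffer + " ns");
    }
}
```

A 30x gap usually means one side was eliminated, not that it was executed 30 times faster.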
