That is bizarre. An apparently trivial change makes the performance change by a factor of 2. It doesn't give me confidence that the code would work as expected on another JVM (e.g. one by IBM or Apple).

After rearranging the code a bit (to put loops inside their own methods, so it will be easier to see what is happening) and simplifying the loops (hotspot was doing everything in registers after the first few memory accesses, plus it was again too much code to check), I have come up with the following.

After I simplified the loops, I get a performance factor of 1.14.

Note that you do "& 0x2F", which is 47, while there are 64 elements in the array, so it must be "& 0x3F".
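A small sketch of why the mask matters: for a power-of-two length, index-wrapping only works with a mask of length - 1 (all low bits set); 0x2F has bit 4 clear, so some indices land in the wrong slot.

```java
public class MaskDemo {
    public static void main(String[] args) {
        // For a 64-element array, the correct mask is 0x3F (= 63 = length - 1).
        // 0x2F is 47 = 0b101111: bit 4 is missing, so indices with that bit
        // set alias other slots instead of wrapping correctly.
        System.out.println(48 & 0x3F); // 48 - correct, still in range
        System.out.println(48 & 0x2F); // 32 - wrong slot
        System.out.println(70 & 0x3F); // 6  - wraps like 70 % 64
    }
}
```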

I already PMed you, but you haven't replied yet, so could you please tell me how to get that native code printed? Thanks.

I already PMed you, but you haven't replied yet, so could you please tell me how to get that native code printed? Thanks.

PMs used to pop up here on login, didn't they? Sorry - I had not noticed the new one; the notification is hidden at the bottom of the main page...

As for printing out native code, download the debug 6.0 JVM and call it with -XX:+PrintOptoAssembly on the command line.

For your current code, I'm getting a factor of 1.1 on the first few iterations, then some compilation kicks in and the ratio changes to 2.2 and stays there. Try to access more than one object in the same method (3-4 of them) - you will get a much worse ratio.

The problem is that there seem to be certain kinds of operations in very simple cases which get totally optimized by hotspot (ratio of 1.1-1.3). But with anything more complicated we are back into let's-call-a-method mode, which gives a ratio of 10+.

Woah, you're making a direct-buffer of 1 byte, then accessing N bytes from it...

ByteBuffer bb = ByteBuffer.allocateDirect(1); - this is a local variable, used only to get a field reference for reflection. I could probably just use ByteBuffer.class instead of bb.getClass(), but I just wanted to be sure that I'd resolve it against the real class of a direct buffer.
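A minimal sketch of that reflection step, assuming the intent is to locate the buffer's native 'address' field by walking up from the runtime class of a throwaway direct buffer (the class and field names beyond java.nio.Buffer.address are implementation details of the JDK):

```java
import java.lang.reflect.Field;
import java.nio.ByteBuffer;

public class AddressLookup {
    public static void main(String[] args) {
        // Throwaway 1-byte direct buffer: used only to obtain the real
        // runtime class (e.g. java.nio.DirectByteBuffer) for reflection.
        ByteBuffer bb = ByteBuffer.allocateDirect(1);
        Class<?> c = bb.getClass();
        Field address = null;
        // Walk up the hierarchy; 'address' is actually declared on java.nio.Buffer,
        // so resolving against the real class and against ByteBuffer.class
        // ends up at the same field.
        while (c != null && address == null) {
            try {
                address = c.getDeclaredField("address");
            } catch (NoSuchFieldException e) {
                c = c.getSuperclass();
            }
        }
        System.out.println(address.getDeclaringClass().getName()); // java.nio.Buffer
        System.out.println(address.getType());                     // long
    }
}
```

Calling setAccessible(true) and reading the field would then yield the raw native address of the buffer's storage.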

One note - in my previous benchmark, it was the 'position' method which made the major difference. Without it, the unsafe version was 3-4 times faster! Unfortunately, it is needed to be able to reuse the same structure wrapper for various positions in the same buffer. Having one object per entry in a native array is not possible (memory, gc hit).

Edit: okay, I looked through your code, and noticed you didn't mean the ByteBuffer.position()

You basically use that method to move along the data, and 'land' wherever you like, to re-use objects... I don't know if that's such a good idea... Think about this:

MappedObject mo = new MappedObject(...);
mo.doSomething();

// "mo" points to other data now
// this is not like anything in java
float x = mo.x();

void doSomething(MappedObject obj) {
    obj.position(...);
}

My implementation is 100% safe, as long as the ByteBuffer is floating around. With MappedObject.position(...) you can wreak havoc and cause native crashes. You can't allow that to happen, ever. Checking input here (throwing exceptions) will disable inlining, which makes it kinda slow compared to non-struct classes.
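For concreteness, a hypothetical sketch of the bounds check under discussion - one range check per reposition, after which field accesses are unchecked raw-address arithmetic (all names here are illustrative, not the actual API from the thread):

```java
public class PositionCheckDemo {
    static final class Wrapper {
        final long base;     // pretend native base address
        final int capacity;  // number of elements
        final int sizeof;    // bytes per element
        long address;        // current element address

        Wrapper(long base, int capacity, int sizeof) {
            this.base = base;
            this.capacity = capacity;
            this.sizeof = sizeof;
        }

        void position(int element) {
            // One check per reposition; every field read/write after this
            // is unchecked, which is exactly where native crashes come from
            // if this check is omitted.
            if (element < 0 || element >= capacity)
                throw new IndexOutOfBoundsException("element " + element);
            address = base + (long) element * sizeof;
        }
    }

    public static void main(String[] args) {
        Wrapper w = new Wrapper(0x1000, 64, 16);
        w.position(3);
        System.out.println(Long.toHexString(w.address)); // 1030
        try {
            w.position(64); // out of range
        } catch (IndexOutOfBoundsException e) {
            System.out.println("rejected");
        }
    }
}
```

The tension the post describes is that the throwing branch can make hotspot refuse to inline the accessor, so the check itself has a cost beyond the comparison.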

We are talking here about a million objects per second. I can imagine such a structure being used to fill out vertex data inside opengl buffers - thousands of dynamic triangles each frame, giving hundreds of thousands to a million triangles per second. You certainly don't want to allocate anything to access a single vertex. One allocation per array of vertices is probably acceptable, but maybe even switching nio buffers is not too far fetched.

ByteBuffer.getFloat() which is very very slow
FloatBuffer.get() is still about 1,5-3x slower than class field access

The best performance you get with Javassist is 25-50% slower than 'normal' code. I have to say it's a nice transparent architecture though, but if you need raw performance, it's unacceptable, and when you see that unsafe.getFloat() is about 15-20% *faster* than class field access, the choice is easy. You'll lose the transparency, and have to change your code from fields to method-calls, but I think that's worth it, for the die-hards.
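For readers who haven't seen the raw-memory side being benchmarked, a minimal sketch of Unsafe-based float access (sun.misc.Unsafe is not a supported API; the reflection back door to its singleton is the usual idiom, and the offsets here are arbitrary):

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class UnsafeFloatDemo {
    public static void main(String[] args) throws Exception {
        // Standard (unsupported) back door to the Unsafe singleton.
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);

        // Allocate 16 bytes of native memory and read/write floats at raw
        // addresses - the kind of access the thread is comparing against
        // ByteBuffer.getFloat() and class field access.
        long addr = unsafe.allocateMemory(16);
        try {
            unsafe.putFloat(addr + 4, 3.5f);
            System.out.println(unsafe.getFloat(addr + 4)); // 3.5
        } finally {
            unsafe.freeMemory(addr); // no GC for native memory
        }
    }
}
```

Unlike a ByteBuffer access, there is no bounds or liveness check here at all, which is both where the speed and the crash potential come from.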

Second, you're using Lists in your benchmark, which will significantly influence performance, not to mention Math.sqrt()s in tight loops, which are very heavy, and Random.nextFloat(). Do you really want to measure that too?

Enough of all this crazy hackery! I don't get why there is such opposition to the ultra-simple idea that MappedObjects are.

1. Abstract class in java.nio containing a final reference to a ByteBuffer
2. Primitive fields mapped IN ORDER DECLARED, of sizes specified by the Java specs, no need for annotations
3. Reference fields held in the heap section
4. The JVM is free to detect classes extending MappedObject and may either optimise directly into machine code, or rewrite bytecode to provide similar but less efficient access by proxy

Why MappedObjects?

1. Clean, clear code, with no annotations, no caveats, no getters and setters to pollute OOP designs
2. High performance: bounds check performed only ONCE, on a setPosition() call
3. High performance: no need to create or destroy any objects, just use one and slide it around the buffer
4. High performance: no read-modify-write operations, it's all direct in-memory access
5. Provides all the benefits of a C-struct but fits seamlessly in with Java's object-oriented paradigm and behaves just like any other reference type by virtue of being a real object on the heap
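Since no such class exists in java.nio, here is a hedged sketch of what the proposal would look like from the user's side, with a hand-written stand-in for MappedObject and explicit buffer accesses emulating what the JVM would compile down to (all names beyond ByteBuffer are hypothetical):

```java
import java.nio.ByteBuffer;

public class MappedObjectSketch {
    // Stand-in for the proposed java.nio.MappedObject. In the proposal the
    // JVM itself would map the subclass's primitive fields onto the buffer.
    static abstract class MappedObject {
        final ByteBuffer buffer;
        int position; // byte offset of the current element
        MappedObject(ByteBuffer buffer) { this.buffer = buffer; }

        final void setPosition(int element) {
            // the single bounds check per slide, as point 2 above argues
            if (element < 0 || (element + 1) * sizeof() > buffer.capacity())
                throw new IndexOutOfBoundsException("element " + element);
            position = element * sizeof();
        }
        abstract int sizeof();
    }

    // In the real proposal these would be plain fields 'float x, y' mapped
    // in declaration order (x at +0, y at +4); accessors emulate that here.
    static final class Vec2 extends MappedObject {
        Vec2(ByteBuffer b) { super(b); }
        int sizeof() { return 8; }
        float x()        { return buffer.getFloat(position); }
        void  x(float v) { buffer.putFloat(position, v); }
        float y()        { return buffer.getFloat(position + 4); }
        void  y(float v) { buffer.putFloat(position + 4, v); }
    }

    public static void main(String[] args) {
        Vec2 v = new Vec2(ByteBuffer.allocateDirect(8 * 100));
        v.setPosition(5);        // slide the one wrapper, no allocation
        v.x(1.0f);
        v.y(2.0f);
        System.out.println(v.x() + " " + v.y()); // 1.0 2.0
    }
}
```

The point of the proposal is that the JVM could compile the field accesses into direct machine-code loads and stores against the buffer's memory, instead of the method calls shown here.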

Somebody give me a sound, concrete reason why MappedObjects, as I have described here, do not do everything we need to get clean, clear, concise, fast, object-oriented, easily implemented, side-effect-free interfacing with native data.

Are you sure you will never need an annotation? What about accessing data coming from the network - some fields can be in a different endianness. I agree that the default behaviour should not require any annotations - but allowing them for not-so-trivial cases could simplify usage a lot.

You can get all of the benefits of your MappedObject with my idea of bytecode weaving - with the single exception of having to put a transformation class in the classloader/on startup. It would even allow you to use field access directly, as it would be changed silently to use accessors. And it has a major benefit - it could be used out of the box, right now, without forcing a particular construct on the rest of the world, which is more concerned about JSP v7.0 than native access to resources.

As far as hackery is concerned... it would all be invisible from the client's point of view; only the library implementation would have to use a few magic hacks. I vaguely recall a certain library passing native pointers as ints/longs between calls and doing direct pointer arithmetic on them.

Enough of all this crazy hackery! I don't get why there is such opposition to the ultra-simple idea that MappedObjects are.

1. Abstract class in java.nio containing a final reference to a ByteBuffer
2. Primitive fields mapped IN ORDER DECLARED, of sizes specified by the Java specs, no need for annotations
3. Reference fields held in the heap section
4. The JVM is free to detect classes extending MappedObject and may either optimise directly into machine code, or rewrite bytecode to provide similar but less efficient access by proxy

Because it is a change in the language specification. Especially after the less than ecstatic reception of recent changes, I think the chances of such a change being accepted are slim.

You might still want annotations if you wanted to be able to represent C structs that had been packed to boundaries other than 1 byte; i.e. where padding has been inserted to maintain appropriate alignment.

You can get all of the benefits of your MappedObject with my idea of bytecode weaving - with the single exception of having to put a transformation class in the classloader/on startup. It would even allow you to use field access directly, as it would be changed silently to use accessors.

Do you know how the sliding-window * feature, which you thought was absolutely required, would be implemented with bytecode transformation? I'm very curious.

* using one object and moving it along the data

where MemoryMappedObject would implement the position/sliding method, normally visible, without any tricks.

The bytecode weaver would do the following:
1) If something extends MemoryMappedObject, remove the public fields, create correct getters/setters with whatever magic inside is needed (depending on implementation), and probably also pass SIZEOF as an extra argument to the super constructor
2) If something accesses any field of a MemoryMappedObject, convert the get/putfield to getter/setter calls
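A sketch of what the woven result might look like, under the assumptions above - the user writes a subclass with plain public fields, and the weaver replaces them with generated accessors against the backing buffer (all class and method names here are illustrative):

```java
import java.nio.ByteBuffer;

public class WeaverSketch {
    // What the user would write BEFORE weaving (fields only):
    //
    //     class Entry extends MemoryMappedObject {
    //         public float x;
    //         public float y;
    //     }
    //
    // Below is the hypothetical AFTER-weaving shape: fields removed,
    // accessors generated, SIZEOF passed to the super constructor (step 1),
    // and client code like 'e.x = 4.0f' rewritten to 'e.x(4.0f)' (step 2).
    static class MemoryMappedObject {
        final ByteBuffer data;
        final int sizeof;
        int offset;
        MemoryMappedObject(ByteBuffer data, int sizeof) {
            this.data = data;
            this.sizeof = sizeof;
        }
        // normally visible sliding method, no tricks needed
        public void position(int element) { offset = element * sizeof; }
    }

    static final class Entry extends MemoryMappedObject {
        Entry(ByteBuffer data) { super(data, 8); } // SIZEOF = 8 woven in
        // generated in place of 'public float x' (offset +0)
        float x()        { return data.getFloat(offset); }
        void  x(float v) { data.putFloat(offset, v); }
        // generated in place of 'public float y' (offset +4)
        float y()        { return data.getFloat(offset + 4); }
        void  y(float v) { data.putFloat(offset + 4, v); }
    }

    public static void main(String[] args) {
        Entry e = new Entry(ByteBuffer.allocateDirect(8 * 16));
        e.position(2);                 // slide the window to element 2
        e.x(4.0f);                     // weaver would emit this for 'e.x = 4.0f'
        System.out.println(e.x());     // 4.0
    }
}
```

In a real implementation the rewriting would happen at class-load time via a ClassFileTransformer or a custom classloader; the sliding window itself needs no bytecode magic at all, only the field-to-accessor rewrite does.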

On top of that, I could imagine a few extra properties/annotations:
a) possibility of explicitly giving a sizeof parameter (passed to the super constructor, or in an annotation which would be weaved to be passed in the constructor) - for easy alignment
b) specifying an explicit offset for a particular field
c) specifying the endianness of a particular field
d) (optionally, not sure about that, especially about multiple dimensions) possibility to denote arrays of values, like
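Points a) through c) could look something like the following - these annotations exist in no library, they are purely an illustration of the idea:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class AnnotationSketch {
    // Hypothetical annotations the weaver could honour.
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE)
    @interface Sizeof { int value(); }          // a) explicit sizeof, for alignment

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD)
    @interface Offset { int value(); }          // b) explicit field offset

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD)
    @interface BigEndian { }                    // c) per-field endianness

    // Example: a struct padded to 16 bytes, with a network-order field.
    @Sizeof(16)
    static class PacketHeader {
        @Offset(0)             public int   id;
        @Offset(4) @BigEndian  public short length;
        @Offset(8)             public long  timestamp;
    }

    public static void main(String[] args) {
        // The weaver would read these at transform time; shown here via
        // ordinary runtime reflection.
        int sizeof = PacketHeader.class.getAnnotation(Sizeof.class).value();
        System.out.println(sizeof); // 16
    }
}
```

The default (no annotations) would still behave exactly as described earlier; these would only kick in for the not-so-trivial cases like packed C structs or network data.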
