Can this section of code be made any faster? This is pretty crucial since this portion gets executed many times per frame.

I was wondering if there's a way to limit the variable newpix (using bit operations?) so that it is always between 0 and 255 (if the value is greater than 255, then it should be clamped to 255). Then I could simply discard the if statement which checks for overflow.

As far as I know, it can't be replaced by a simple bit operation, but the if statement is very cheap anyway. The only possible speed optimization I can see would be an unsigned byte type in Java (am I right? You would be able to skip the masking when converting to an int), but that doesn't help you much, does it?
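The original loop isn't shown in the thread, so here is only a rough sketch of what is presumably being discussed. The names addClamped, pixels and texture are made up for illustration. The "& 0xff" is the masking mentioned above: Java bytes are signed, so without it any value of 128 or more would be read as negative.

```java
// Hypothetical reconstruction; the original code is not shown in the thread.
class ClampWithBranch {
    // Adds texture values onto pixels in place, clamping each sum to 255.
    static void addClamped(byte[] pixels, byte[] texture) {
        for (int i = 0; i < pixels.length; i++) {
            // Mask with 0xff to read each signed byte as an unsigned 0..255 value.
            int newpix = (pixels[i] & 0xff) + (texture[i] & 0xff);
            if (newpix > 255) newpix = 255; // the overflow check in question
            pixels[i] = (byte) newpix;
        }
    }

    public static void main(String[] args) {
        byte[] pixels = {10, (byte) 200, (byte) 255};
        byte[] texture = {20, (byte) 100, 1};
        addClamped(pixels, texture);
        for (byte p : pixels) {
            System.out.println(p & 0xff);
        }
    }
}
```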

You are right that you want to get rid of the branch if possible (conditional branches are nasty for most CPUs), so to get around it you can do:

newpix |= ((255-newpix)>>31);

If newpix <= 255, then 255-newpix is non-negative. The arithmetic shift right by 31 propagates the sign bit through the whole int, giving 0. ORing this in gives no change.

If newpix > 255, then 255-newpix is negative, so the shift gives 0xffffffff. ORing that in makes newpix 0xffffffff as well, so when you cast back to a byte you get 0xff, the clamped value you were after.
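To check the trick against the plain comparison, here is a small sketch. The class and method names are made up, and it assumes newpix is the sum of two unsigned byte values, i.e. somewhere in 0..510.

```java
// Sketch only; assumes 0 <= newpix <= 510 (sum of two unsigned bytes).
class BranchlessClamp {
    static byte clamp(int newpix) {
        // (255 - newpix) is negative only when newpix > 255; the arithmetic
        // shift then yields all ones (-1), otherwise 0.
        newpix |= ((255 - newpix) >> 31);
        // The low 8 bits are now either the original value or 0xff.
        return (byte) newpix;
    }

    public static void main(String[] args) {
        // Compare against the straightforward clamp over the whole input range.
        for (int v = 0; v <= 510; v++) {
            int expected = Math.min(v, 255);
            int actual = clamp(v) & 0xff;
            if (actual != expected) {
                throw new IllegalStateException("mismatch at " + v);
            }
        }
        System.out.println("branchless clamp matches for 0..510");
    }
}
```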

This takes 3 logical ops (3 cycles), compared to a branch misprediction (~30 cycles) whenever the clamping kicks in. So if you clamp ~10% of pixels this way, the chances are that it will be around the same speed.

Your original looks a little odd though. Are your arrays really byte arrays? Are you not dealing with multiple colour channels at least? If you are trying to do additive or alpha blending, there are slightly more efficient ways to do this, as you can usually get away with handling R and B combined in one int, saving you a third of the work for a blend.
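For what it's worth, one common way of handling R and B together looks like the sketch below. This is not necessarily the exact routine meant above. It assumes packed 0xRRGGBB ints and an alpha in the range 0..256, and the names are made up. R and B sit 16 bits apart, so each product has room to grow without running into the other, and you need two multiply-add passes instead of three.

```java
// Sketch only; assumes 0xRRGGBB pixels and alpha in 0..256 (256 = fully src).
class RbBlend {
    static int blend(int src, int dst, int alpha) {
        int inv = 256 - alpha;
        // Red and blue together: each channel's weighted sum is at most
        // 255 * 256 = 0xff00, which fits in its own 16-bit lane.
        int rb = (((src & 0xff00ff) * alpha + (dst & 0xff00ff) * inv) >>> 8) & 0xff00ff;
        // Green on its own.
        int g = (((src & 0x00ff00) * alpha + (dst & 0x00ff00) * inv) >>> 8) & 0x00ff00;
        return rb | g;
    }

    public static void main(String[] args) {
        // Half-way blend of pure red over pure blue.
        System.out.println(Integer.toHexString(blend(0xff0000, 0x0000ff, 128)));
    }
}
```

The unsigned shift (>>>) matters here because the red lane can push the intermediate sum into the int's sign bit.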

Thanks. Unfortunately, it's slower now. It's a really elegant way of skirting the conditional statement, though.

I think the issue is not so much branch misprediction or cache misses, but the sheer number of instructions that get executed per frame.

EDIT: I'm using a P4, so part of the slowdown could be down to the issue described by Mark in his post above.

Quote

Your original looks a little odd though. Are your arrays really byte arrays? Are you not dealing with multiple colour channels at least? If you are trying to do additive or alpha blending, there are slightly more efficient ways to do this, as you can usually get away with handling R and B combined in one int, saving you a third of the work for a blend.

Hope this helps,

- Dom

I'm actually working with 8-bit IndexColorModels and a DataBuffer.Byte pixel array. The addition that you see is just adding corresponding pixel values from a pre-defined 8-bit texture map to the DataBuffer.Byte pixel array.

Hmm. What you suggested got me thinking though. I wonder if I could use a DataBuffer.Int with the IndexColorModel so that 4 8-bit pixel values can be combined in an int, and then perform the adding on the int instead...

I'm guessing you would have to do an awful lot of masking instead to prevent overflows to 'bleed' into the wrong bits, or am I missing something?

Overflows are now going to the wrong color and even to the 4th byte (i.e. addSaturated(0xff00ff, 0xff00ff) results in 0x1ff01ff) so the result can be slightly wrong. Maybe this isn't a problem though, but then again maybe it is...
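If the bleeding does turn out to be a problem, one way around it is to add the low 7 bits of every byte separately, so nothing can carry across a byte boundary, and then work out the top bit and the carry per byte by hand. The sketch below is just that idea, not the addSaturated referred to above, and the name addSaturated4 is made up. It treats the int as four independent unsigned 8-bit values.

```java
// Sketch only; treats an int as four packed unsigned bytes.
class PackedSaturatedAdd {
    static int addSaturated4(int a, int b) {
        // Sum the low 7 bits of each byte; the per-byte maximum is 0xfe,
        // so nothing carries into the neighbouring byte.
        int low = (a & 0x7f7f7f7f) + (b & 0x7f7f7f7f);
        // Fold the top bits back in to get the wrapped (mod 256) byte sums.
        int sum = low ^ ((a ^ b) & 0x80808080);
        // Carry out of each byte: majority of the two top bits and the
        // carry coming up from the low 7 bits.
        int carry = ((a & b) | ((a | b) & low)) & 0x80808080;
        // Turn each carry bit into a full 0xff byte and OR it in.
        int mask = (carry >>> 7) * 0xff;
        return sum | mask;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(addSaturated4(0xff00ff, 0xff00ff)));
    }
}
```

That is a fair number of extra operations, so whether it beats four separate byte adds would have to be measured.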

As of 1.5, Web Start supports arbitrary VM arguments, which ought to include -server. No doubt it will only work if the selected JRE has the server VM installed, but perhaps you could have an installable extension which contains the server VM dll and copies it to the right place in the chosen JRE. Obviously the .jar file would have to be signed, but it looks feasible.
