The fascinating part is the description of some guys 'hacking' the GPU to make it run math intensive simulations instead of graphic processing. I always wondered how it would be possible to take advantage of the power of modern GPUs to boost the performance of existing programs (assuming these programs do some math related computation of course). Does anybody have some examples besides those from the article?

> The fascinating part is the description of some guys 'hacking' the GPU to make it run math intensive simulations instead of graphic processing.

That was already done. Graphics adapters offer a fast floating-point unit and a huge chunk of fast RAM. I've seen an example of abusing the GPU for number crunching (e.g. brute-force attacks). It was much faster than using the fastest CPU available, and the CPU itself stayed idle.

There was also another program which did all of its (heavy math) calculations inside the GPU.

IMO it's quite interesting, but it doesn't work for all kinds of stuff (at least nowadays - but who knows what will happen in the industry in the next few years).

Interesting that you'd bring this up. The thread on the raycaster for J2ME got me to thinking. I implemented a Voxel based caster with no optimizations (i.e. all the math was done in real time) and found the performance to be quite acceptable. I began thinking about it and realized that as long as optimizations were done to limit the types of collisions (e.g. a voxel engine instead of free floating spheres and boxes), a real time raytracer should be possible today. Which of course got me to thinking about how I could make the GPU do it. (Why should I waste the processor's time?) Does anyone have any good articles on programming the GPU directly?
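For what it's worth, the heart of a grid/voxel caster like the one described above is just a DDA traversal: step cell by cell along the ray until you hit something solid. Here's a minimal 2D sketch of that idea (all names and the map layout are mine, not from the J2ME thread):

```java
public class GridRaycast {
    // Step a ray through a 2D grid until it enters a non-zero (solid) cell.
    // Returns {cellX, cellY} of the hit cell, or null if the ray leaves the map.
    static int[] cast(int[][] map, double posX, double posY, double dirX, double dirY) {
        int mapX = (int) posX, mapY = (int) posY;
        // Distance along the ray between successive x/y grid lines.
        double deltaX = dirX == 0 ? Double.MAX_VALUE : Math.abs(1.0 / dirX);
        double deltaY = dirY == 0 ? Double.MAX_VALUE : Math.abs(1.0 / dirY);
        int stepX = dirX < 0 ? -1 : 1, stepY = dirY < 0 ? -1 : 1;
        // Distance from the start point to the first x/y grid line.
        double sideX = dirX < 0 ? (posX - mapX) * deltaX : (mapX + 1.0 - posX) * deltaX;
        double sideY = dirY < 0 ? (posY - mapY) * deltaY : (mapY + 1.0 - posY) * deltaY;
        while (mapX >= 0 && mapY >= 0 && mapY < map.length && mapX < map[0].length) {
            if (map[mapY][mapX] != 0) return new int[] { mapX, mapY };
            if (sideX < sideY) { sideX += deltaX; mapX += stepX; }
            else               { sideY += deltaY; mapY += stepY; }
        }
        return null; // ray left the map without hitting anything
    }
}
```

The inner loop is branch-light and purely arithmetic, which is why "all the math in real time" turns out acceptable even without precomputed tables.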

Thanks a lump Cas. It would probably help if you elaborated a little bit. Honestly, if you load the entire voxel model + textures into VRam you should be able to get the GPU to do all the work for you. All you'd need to do is modify the voxel model as characters move around the map. You'll obviously need to take a few shortcuts (for example, RLE encode the slivers in the model) otherwise your voxel data is going to be fscking HUGE. The other thing I'd do is only worry about direct light sources. If you find a wall, cast out one more ray. If it doesn't hit a light source, simply set the pixel to the ambient value for the room and be done with it. (Sort of like Doom on steroids, but with super-pimped lighting.)
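Regarding RLE-encoding the slivers: a sliver here would be one vertical column of voxel values, and run-length encoding collapses the long empty/solid runs that dominate a voxel map. A toy version (the pair layout is my own choice, not a real engine's format):

```java
import java.util.ArrayList;
import java.util.List;

public class SliverRLE {
    // Encode one sliver (column) of voxel values as (value, runLength) pairs.
    static List<int[]> encode(int[] sliver) {
        List<int[]> runs = new ArrayList<>();
        int i = 0;
        while (i < sliver.length) {
            int value = sliver[i], len = 1;
            while (i + len < sliver.length && sliver[i + len] == value) len++;
            runs.add(new int[] { value, len });
            i += len;
        }
        return runs;
    }

    // Expand the runs back into a flat sliver.
    static int[] decode(List<int[]> runs) {
        int total = 0;
        for (int[] run : runs) total += run[1];
        int[] out = new int[total];
        int pos = 0;
        for (int[] run : runs)
            for (int j = 0; j < run[1]; j++) out[pos++] = run[0];
        return out;
    }
}
```

A mostly-empty 256-voxel column collapses to a handful of pairs, which is what keeps the model from getting fscking HUGE.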

It all rather depends on where you want the answers to be. If you're happy with them staying on the graphics card, then that's fine, but then they're not much use unless you're going to draw something with them - which is of course exactly how graphics APIs are designed: set up state, send the data, send an operation request. In which case, yes, it's a great idea! But if you're just trying to hijack the card into doing tons of maths computations for you, you'll be shafted by the incredibly slow speed of data transfer back to the CPU. No graphics systems are designed with getting information back out of a card and on to the CPU.

What you *really* want is some proper instructions in the CPU for doing SIMD. Like SSE2 or Altivec. And another CPU

It all rather depends on where you want the answers to be. If you're happy with them staying on the graphics card, then that's fine, but then they're not much use unless you're going to draw something with them - which is of course exactly how graphics APIs are designed: set up state, send the data, send an operation request.

Exactly! What I'm proposing would practically be a scenegraph in Video Ram managed by the GPU. Since the entire process begins and ends in the GPU/VRam, your AGP bus would spend most of its time doing nothing.

Quote

But if you're just trying to hijack the card into doing tons of maths computations for you, you'll be shafted by the incredibly slow speed of data transfer back to the CPU. No graphics systems are designed with getting information back out of a card and on to the CPU.

It's amazing how much the current AGP transfer rate from the video card -> CPU sucks. It should be the same rate as outgoing, but most 3D card drivers really neglect the readback direction. I'm thinking this may change tho. There's been a lot of interest from graphics professionals in using graphics cards to speed up rendering work. If the AGP readback speed were improved, they could get back animations of the quality of Dawn (or better) in near real time. This would be a huge improvement for television stations, which are often under very tight schedules to produce special effects for sports games and television shows. (Although the latter may be limited in its usefulness.) Just call it Video Toaster 2003.

Quote

What you *really* want is some proper instructions in the CPU for doing SIMD. Like SSE2 or Altivec. And another CPU

No, what you really need is a vector math coprocessor with a super-deep pipeline. Given only the task of doing vector math (no pixel pushing), it should be able to significantly outperform even most of today's GPUs. As an added bonus, I've heard that most specialized chips are very cheap to produce. Development is a killer tho...
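The kind of workload such a coprocessor (or SSE2/AltiVec) eats for breakfast is a streaming multiply-accumulate over whole arrays - no branches, no dependencies between iterations, so every pipeline stage stays full. A plain-Java sketch of the classic SAXPY operation, just to show the shape of the work:

```java
public class VectorMath {
    // y[i] = a * x[i] + y[i] over whole arrays: pure streaming
    // multiply-accumulate, the ideal case for a deep vector pipeline.
    static void saxpy(float a, float[] x, float[] y) {
        for (int i = 0; i < x.length; i++) {
            y[i] = a * x[i] + y[i];
        }
    }
}
```

On real SIMD hardware this loop would process 4+ elements per instruction; the Java version is only here to make the access pattern concrete.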

Well, the really interesting part is that double-precision numbers are actually now faster than single-precision floats (on the new chips). This increases the accuracy of the programs.

However, I remember Cas saying that he doesn't see the accuracy of computer hardware growing anytime soon. I don't know if this holds in Java, but it does in C++. The speed increase is tiny, but combined with the improved accuracy it is still significant. However, doubles take more memory than floats. I'm just trying to be objective.
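The accuracy/memory trade-off is easy to see: accumulating a value that isn't exactly representable in binary drifts visibly in single precision long before it does in double precision. A small demonstration (the loop count is arbitrary):

```java
public class PrecisionDemo {
    // Accumulate n copies of 0.1 in single precision.
    static float floatSum(int n) {
        float s = 0f;
        for (int i = 0; i < n; i++) s += 0.1f;
        return s;
    }

    // Same accumulation in double precision.
    static double doubleSum(int n) {
        double s = 0.0;
        for (int i = 0; i < n; i++) s += 0.1;
        return s;
    }

    public static void main(String[] args) {
        // After a million additions the float sum has drifted by hundreds;
        // the double sum is still accurate to several decimal places.
        System.out.println("float : " + floatSum(1_000_000));
        System.out.println("double: " + doubleSum(1_000_000));
        // The cost: a float is 4 bytes, a double 8, so vertex and
        // matrix data doubles in size if you switch wholesale.
    }
}
```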

Also, I remember an article on Gamasutra a while ago saying that next-generation platforms such as the Xbox 2 won't support floats in the future since they are unnecessary.

However, I can't see why the idea would not be worth pursuing. The new cards are coming out with DDR-2, and the idea that soon we might be able to dynamically store stuff in "VRAM" is very close to reality. However, I can't see how this would be supported in Java. I know for a fact that the GameCube does this to a certain degree. How Nintendo does it, I'm not sure. It is certain that they can load the most common textures in there.

As for the AGP issue, the buffer between the card and the AGP bus is simply not fast enough, but this problem is on the manufacturer side. Nvidia acknowledged this and admitted that the new GeForce 4 series cards can't fully utilize AGP 8x even though it's in the hardware. Thankfully, ATI has been a little bit better about this.

My friend who was at Sony's E3 developer thingie said that the new Sony console will not only have a few separate processors, but will also utilize "stacking". He said this brings a more object-oriented approach to the hardware, since each CPU (I'm including the GPU here) has its own stacking thread. So developers are able to send several different things down the rendering pipeline at the same time, and the outcome is hardware-synchronized, which means the "omgomgomg overhead, optimize everything, every day" era is gone.

So this means the designer can send matrix calculations down one CPU pipeline while sending vertices to be drawn down the GPU, while calculating normals on a second CPU and doing the game logic, or something similar. I think it sounds pretty awesome. I know there is a logical phallus in what I'm saying, since some of those things are required for one another, but that's how my friend explained it, and I have no reason to believe he is deceiving me.
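You can already fake a small version of that split on a stock JVM with plain threads: run independent stages concurrently and synchronize at a join point. A toy sketch - the stage names and the "work" are entirely made up for illustration, and real engines need much more care about memory visibility than this:

```java
public class ParallelStages {
    static float[] verts   = new float[1024];
    static float[] normals = new float[1024];

    // Run two independent stages on separate threads, then join.
    static void run() {
        Thread transform = new Thread(() -> {
            // stand-in for the "matrix calculations" stage
            for (int i = 0; i < verts.length; i++) verts[i] = i * 2f;
        });
        Thread lighting = new Thread(() -> {
            // stand-in for the "calculating normals" stage
            for (int i = 0; i < normals.length; i++) normals[i] = 1f;
        });
        transform.start();
        lighting.start();
        // Game logic could run here on the main thread in the meantime.
        try {
            transform.join(); // software version of the hardware-synchronized hand-off
            lighting.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The join() calls are the synchronization point the console would supposedly do in hardware; here it costs you an explicit wait.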

I think you ought to look up "phallus" in the dictionary, Cap'n. I think you meant "fallacy".

Regardless of the underlying architecture a well designed API should be able to cope with all of this strangeness. A few OpenGL extensions could take care of the unusual PS3 design. I do hope someone writes a driver for it.

1. A phallus is a model of an erect penis, especially one used as a symbol in ancient religions.

2. A phallus is a penis.

This is the definition I got from my philosophy teacher: a logical phallus, synonymous with logical fallacy, is a logical theory where one philosopher uses unclear statements to enhance his or her otherwise small philosophical penis size.
