Hello
I just came up with a pretty simple idea: multithreaded rendering.
I know that with DDraw only one write can be performed at a time on a given surface, and therefore I came up with the following....
Assume that we have an iso engine that can render in multiple viewports, as we define 'em (we already have this).
Then we launch 3 threads, each of which renders a part of the full viewport.
I.e.
+-------------------+
/ Drawn by Thread 1 /
+-------------------+
/ Drawn by Thread 2 /
+-------------------+
/ Drawn by Thread 3 /
+-------------------+
All three parts are then blitted together onto the same buffer.
Do you think this will produce faster rendering??
Any ideas on the concept?
RFC post
Paris Theofanidis
X-treme DevTeam
( http://www.x-treme.gr/ )

I don't think it'll be faster, because you're using the 2D card after all; if it could work that fast, it would run that fast even without threads. Instead, you could use double or triple buffering. And in your case, use threads for music, input, sound and graphics, with a timer and some priority rules...

Correct me if I'm wrong.

-* Sounds, music and story makes the difference between good and great games *-

No matter how many or how few threads you have, you still only have one CPU (usually), and it can only go so fast. I doubt you would gain any speed by having different threads render different things, and you would most likely have to duplicate a lot of code.

If you want to experiment with threads, try having one handle all the rendering and another one handle user input. This is a common practice.

Even if you are using one CPU, multithreading can be faster. Windows allocates CPU usage per application, and multithreading can claim more CPU time if done correctly. I do not use multithreading myself; I like to use SetPriorityClass and SetThreadPriority, or some functions like that.

I believe you're right about the graphics board not being able to do parallel blitting to the same surface, but can it do parallel (simultaneous) blitting to different surfaces? If not, how does double or triple buffering speed up the rendering?

DirectDraw is thread-safe, and will impose mutex locks on touching the buffer you're blitting to. So you would have the effect of serializing the blits, plus the additional overhead of thread synchronization, so it'll probably be a bit slower.

The benefit of triple buffering is due to the fact that the monitor has a refresh rate, and the image is only flipped when the scan passes. I'll try to explain this better or find a good article about the advantage of triple buffering. Basically your card will wait less time for the vsync, and thus you can gain up to 100% FPS. For example, at a 60 Hz refresh, a frame that finishes just after a flip would otherwise stall for almost 16.7 ms waiting for the next vsync; with a third buffer the card can start rendering the next frame immediately.

(not a very good explanation, I know)

-* Sounds, music and story makes the difference between good and great games *-

I have two threads in my engine. One continuously updates the screen from a smaller "render queue" as fast as it can. The second thread runs X times per second and runs through all the objects in the game, updating their positions etc. If an object is on or very close to where the screen is in the game world, it is added to the render queue so that it is drawn by the first thread.

This allows the user to scroll about and select objects while the game engine is updating information in the background.

hm... maybe if each thread blits to a different surface, and at the end all 3 surfaces are blitted to the backbuffer.....

The waste will be the difference in blit time between the video-card blit and software blitting, plus the final blitting of the 3 pieces together...

Yes, I think that having the effects on a different thread is better, since you don't blit directly to video memory using a normal blit, but blit each pixel yourself. (Unless you use a 3Dfx chip)

Usually your program is getting 99% of the CPU anyway, so multithreading gives no noticeable speedup, and will probably give a slowdown due to the additional context switching and synchronization overhead.

I tried this once: I created a multithreaded app with two CPU-intensive threads. Doing the two intensive CPU calculations in parallel took 8.67 seconds. Doing them sequentially in a single thread took 8.63 seconds.

If you're locking any of these surfaces, remember that Lock makes the system acquire the Win16Lock.

Also, please don't mess with SetPriority; you can cause major problems with your system by doing this (if you set it too high, disk buffers won't flush).

I'm not saying never use multithreading; use it in the right situation. IO-intensive tasks are great to put in another thread, since they spend most of their time blocked (waiting for a slow IO device like a disk), and if you're smart, you can get the effect of disk accesses taking zero time. User input in another thread is OK also (just please don't make it a busy-wait constantly checking the DInput states).

My game was taking up 99% of the CPU power; then I added 2 more threads and it only uses 76%. Maybe it's not sped up, but it can handle a lot more now without fear of slowing down. Multithreaded applications are good; multithreaded rendering may not be so good. What is going to guarantee that the background won't be the last thing drawn and the only thing seen? You might be able to do it, but it would involve more than just threads and end up being more trouble than it's worth

hmmm, I see your point now, mhkrause. True, I use threads in IO mode, especially when writing servers on Linux. On Windows I don't have much experience with threads.

Well, based on my Linux threads experience, I can tell that using multiple threads is the only way to serve heavy traffic, such as 100,000 requests/sec. I use multiple threads for acquiring the data through sockets, and then a lot more threads to process the data. To be honest, I never expected that the processing threads would be a lot slower!!!! But then, that's another project

Well, I think that the best thing is to write a multithreaded rendering app, and then measure the results.

PS: I don't need synchronization, nor do I care about the locking...

I'll try to give another explanation (summary) of what I've thought of so far...

Each thread renders a part of the map. Not a different layer, but a different area. These areas can then be puzzled together, and they then form the rendered scene.

Since each thread will be writing to its own memory space, I won't need to use semaphores or anything. The threads in NO way interact with each other. When all the threads have finished blitting, all the surfaces (one blitted by each thread) are put together.

PS: The IO used in rendering is really quick. It is actually the time it takes the data to go into memory, through the buses.

Heh, it's funny reading the posts of people who don't know how the hardware actually works and (maybe) have never hand-coded this kind of stuff in asm.

I'll say: NEVER use multithreading for rendering. You can use it for background music, input etc. (since they must be implemented asynchronously with rendering). When it comes to rendering, the following factors will greatly decrease performance with multithreading:

- task switching (very slow)
- data/code cache thrashing (the cache gives a lot of speedup if used correctly)
- some other factors...

While doing low-level rendering (mostly in inner loops), it's better not to interrupt the process. Also, cache misses while accessing data can greatly slow down the whole thing.

FlyFire, I may not be a hardware expert, but process/thread switching speed depends a lot on the OS, and as I've posted in a previous post, I'm not much into Win threads.

PS: Under UNIX each thread is actually a process... I'd never use threads under Linux for rendering. Does this rule apply under Windows too?

It's not just the CPU's cache that increases the speed. CPUs handle instructions in parallel, allowing them to execute more than one instruction per cycle. This parallel processing has been used since the 80486, if I recall correctly, and by optimizing important loops with this in mind you can speed up your code a lot more than you can imagine. However, this type of optimization usually turns your app's code upside down (I've seen it)

c 'ya

PS: How are things up there? The news haven't reported anything for some time now... Regards to my neighbour Russia

Yes, thread-switching speed also depends on the OS (the OS can handle it fast or slow), but of course it mainly depends on the CPU. The processor spends a lot of cycles switching from one thread to another, and this causes a great speed loss in rendering.

quote: This parallel processing was used since 80486 if i recall correctly,

No, only from Pentium-class processors

quote: However, this type of optimization usually turn's your app's code

The Intel 80386 was multithread-capable (protected mode). I think the 80386 was capable of parallel processing if correctly programmed (I think it can handle it in special cases). Even so, only Win32 used protected mode and parallel processing. M$ is always very late with its software solutions.

I use a thread for I/O and INPUT, while my 'main' thread produces the images. Maybe I'll create an 'AI' thread...

Only critical functions need to be coded in ASM, at the end of the process of writing an app.

-* Sounds, music and story makes the difference between good and great games *-

Technically, the 386 had native support for multiple processes in a sort of multiple-virtual-8086-machine setup. Yes, the 8086, the granddaddy of them all. I've never seen this taken advantage of, as the OS had to run specifically in that mode, and essentially no OS runs in that funky mode, but it's in the 386 technical specs.

However, I believe the original point about parallelism was that the 486 was pipelined, which is true. The second pipeline only handled about a third of the instruction set, and only ran the instructions it could handle about 1 time in 3, but it was there. The 586 was simply pipelined much, much better.

But really, all the discussion of pipelined/parallel whatever is essentially pointless, because the CPU still only executes a single thread of execution at a time. It will *not* execute a different thread in each pipeline, even with superscalar cores. In order to truly benefit from multithreading you need multiple processors. Multiple threads on a single-processor machine just simplify bookkeeping for the programmer (and thrash the cache). For something like rendering, the overlap in computation between the threads combined with the threading overhead makes multithreading impractical.

If your UNIX system is running each thread as a separate process, then you have an old and/or non-POSIX-compliant version of UNIX. POSIX provides for multiple threads of execution within a single process, and last I checked Linux was POSIX-compliant.

quote:Is this an unwritten rule? Of course, i can code whole app in asm and nothing can stop me

Just another idea for using threads in games: would it be possible to use 2 threads in place of a triple buffer? One thread could handle the game logic and the other could handle rendering... Once the rendering thread finished, it could use a simple semaphore to tell the logic thread to start again, and then call Flip(). That way you could start the next game loop while waiting for your vsync.

Is this doable? I haven't done any tests yet, but it seems like it has some potential.