(GL ES) Fast particle system?

I have built a simple particle system on top of the Texture2d class. To minimize texture switching, I'm using a single 128x128 texture holding plenty of 32x32 sprites. The problem is that it's not fast enough for me. I thought it could be more efficient if, instead of calling glDrawArrays for each particle, I called it once for all of them, but that's quite hard for me to implement. So I thought maybe someone has already made a fast particle system and can share the code. Or at least give some insight into making one - e.g. is it good to have one big texture, how much performance do I gain if I reduce the number of glDrawArrays calls by passing a bigger array, etc.

So far I've noticed that once particles grow quite big, to about 128 pixels in size, I instantly get a major slowdown, even if the particle count is not high.

Even with one very large particle, e.g. half of the screen or the whole screen, FPS drops significantly. Is this a limitation of the video hardware?

Also, what's better - having a non-alpha texture and using "additive" blending, or having an alpha texture with "normal" blending?
The first one takes less memory as a texture, so maybe it uses less bandwidth when blitted? However, additive blending is probably slower. Is it?

In general, there are three bottlenecks to be aware of:
1) animating the sprites (i.e. physics.) Profile your app and optimize as needed.
2) submitting data to GL. Use point sprites if you can. Minimize data sizes and draw calls either way.
3) rasterizing sprites. The GPU has a finite amount of fill rate, and blending isn't free.
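To put rough numbers on point 3, here is a minimal sketch that counts how many pixels get blended per frame. The function names are hypothetical, and the 320x480 screen used in the example figures is an assumption (the thread doesn't name a device). It ignores clipping and only shows how the cost scales with the square of the sprite size.

```c
#include <assert.h>

/* Hypothetical helper: pixels the GPU has to blend per frame for `count`
   square sprites that are `size` pixels on a side (clipping ignored). */
long blended_pixels(long count, long size)
{
    assert(count >= 0 && size >= 0);
    return count * size * size;
}

/* Overdraw factor: blended pixels divided by the number of screen pixels.
   A value of 1.0 means every screen pixel is blended once on average. */
double overdraw_factor(long count, long size, long screen_w, long screen_h)
{
    assert(screen_w > 0 && screen_h > 0);
    return (double)blended_pixels(count, size) / (double)(screen_w * screen_h);
}
```

Under these assumptions, 10 particles at 128 pixels touch 163,840 pixels, the same as 160 particles at 32 pixels, and already a bit more than one full 320x480 screen (153,600 pixels). That would be consistent with a slowdown when particles get big even though the count is low.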

Also, your assumptions about how much memory a texture takes are probably wrong. In general, don't assume that an RGB texture takes less memory than an RGBA texture. Some hardware only samples from 1, 2, or 4 channel data.
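As a rough illustration of the memory point, here is a small sketch. It assumes 8 bits per channel, no mipmaps and no compression, and it assumes that hardware limited to 1, 2 or 4 channels stores a 3-channel texture as 4 channels. Both function names are made up for this example, and real drivers may behave differently.

```c
#include <assert.h>
#include <stddef.h>

/* Channels actually stored, ASSUMING the hardware only supports
   1, 2 or 4 channels and pads 3-channel (RGB) data up to 4. */
int stored_channels(int requested)
{
    assert(requested >= 1 && requested <= 4);
    return requested == 3 ? 4 : requested;
}

/* Texture size in bytes at 8 bits per channel, without mipmaps.
   `pad` selects whether the padding assumption above is applied. */
size_t texture_bytes(size_t width, size_t height, int channels, int pad)
{
    int c = pad ? stored_channels(channels) : channels;
    return width * height * (size_t)c;
}
```

For the 128x128 texture from the first post, tightly packed RGB would be 49,152 bytes and RGBA 65,536 bytes, but on hardware that pads, both end up at 65,536 bytes, so the RGB version saves nothing.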

arekkusu Wrote:In general, there are three bottlenecks to be aware of:
1) animating the sprites (i.e. physics.) Profile your app and optimize as needed.
2) submitting data to GL. Use point sprites if you can. Minimize data sizes and draw calls either way.

Thanks!
But how effective is it? What percentage would I typically gain if I switch from triangles to point sprites? Same for draw calls - if it's 1-5 percent, as I currently assume, I wouldn't bother. This of course depends on how many calls there are, so let's assume there are 200 of them.

Quote:3) rasterizing sprites. The GPU has a finite amount of fill rate, and blending isn't free.

So if I draw fewer sprites, I would use less of the GPU's power.

Quote:Also, your assumptions about how much memory a texture takes are probably wrong. In general, don't assume that an RGB texture takes less memory than an RGBA texture. Some hardware only samples from 1, 2, or 4 channel data.

I came to this conclusion by analyzing the Texture2d class code, but now that I'm looking at it again, I'm not sure of this anymore.

Quote:Thanks!
But how effective is it? What percentage would I typically gain if I switch from triangles to point sprites? Same for draw calls - if it's 1-5 percent, as I currently assume, I wouldn't bother. This of course depends on how many calls there are, so let's assume there are 200 of them.

Well, it depends. If you've got a lot of other stuff going on in your scene, you're probably going to want to save as much as you can as you design. A point sprite is only 1 vert. A quad made of two triangles is 4 verts if the verts are indexed right (6 if they are not).

Also keep in mind that you either need to transform existing geometry or generate particle geometry on the fly if you are not using point sprites. So that's 4 verts + whatever effort is needed to get them to show up in the right place.

I'm not sure of the details of point sprites, but I would guess that it either generates a quad in hardware, or does it in software more efficiently than your own code would.
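If point sprites aren't an option, generating the geometry on the fly can still be done in one batch. Below is a minimal sketch, not a tested particle system: it fills one vertex array for all particles so that a single glDrawArrays call can draw them. The Vertex and Particle structs and build_batch are hypothetical names, and it assumes the 128x128 atlas with 32x32 sprites (a 4x4 grid) from the first post. It uses 6 non-indexed verts per particle to keep it short; glDrawElements with an index array would bring that down to 4.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interleaved layout: 2D position + texture coordinate. */
typedef struct { float x, y, u, v; } Vertex;

/* Hypothetical particle: center, size in pixels, atlas cell index (0..15). */
typedef struct { float x, y, size; int frame; } Particle;

enum { ATLAS_COLS = 4, VERTS_PER_PARTICLE = 6 };  /* 128 / 32 = 4 cells per row */

/* Writes two triangles per particle into `out`; returns the vertex count. */
size_t build_batch(const Particle *p, size_t count, Vertex *out, size_t capacity)
{
    const float cell = 1.0f / ATLAS_COLS;
    size_t n = 0;
    assert(count * VERTS_PER_PARTICLE <= capacity);
    for (size_t i = 0; i < count; ++i) {
        float h  = p[i].size * 0.5f;
        float x0 = p[i].x - h, x1 = p[i].x + h;
        float y0 = p[i].y - h, y1 = p[i].y + h;
        float u0 = (float)(p[i].frame % ATLAS_COLS) * cell, u1 = u0 + cell;
        float v0 = (float)(p[i].frame / ATLAS_COLS) * cell, v1 = v0 + cell;
        out[n++] = (Vertex){ x0, y0, u0, v0 };  /* first triangle  */
        out[n++] = (Vertex){ x1, y0, u1, v0 };
        out[n++] = (Vertex){ x0, y1, u0, v1 };
        out[n++] = (Vertex){ x1, y0, u1, v0 };  /* second triangle */
        out[n++] = (Vertex){ x1, y1, u1, v1 };
        out[n++] = (Vertex){ x0, y1, u0, v1 };
    }
    return n;
}

/* With GL ES 1.1, after enabling GL_VERTEX_ARRAY and GL_TEXTURE_COORD_ARRAY
   via glEnableClientState, the whole batch is then one draw call:
     glVertexPointer(2, GL_FLOAT, sizeof(Vertex), &verts[0].x);
     glTexCoordPointer(2, GL_FLOAT, sizeof(Vertex), &verts[0].u);
     glDrawArrays(GL_TRIANGLES, 0, (GLsizei)n);                        */
```

This only works with one draw call because all the sprites live in the same texture, so no texture switch is needed between particles. How much it gains over 200 separate calls depends on the device, so it's worth profiling both versions.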

Thank you guys. I've decided to release a game with what I had - I optimized the other code, and it's now pretty all right, though not very fast. However, if I eventually find a ready-made system (not just hints and the like), I'd gladly use it in a future game. No luck with this yet. Cocos2d's doesn't look very fast, though I will look at it again.

jaguard Wrote:Thank you guys. I've decided to release a game with what I had - I optimized the other code, and it's now pretty all right, though not very fast. However, if I eventually find a ready-made system (not just hints and the like), I'd gladly use it in a future game. No luck with this yet. Cocos2d's doesn't look very fast, though I will look at it again.

Hey, make sure that you are compiling cocos2d (the cocos2d library + particle example) with "thumb compilation" turned OFF (it is ON by default). You will gain 20~60% performance.