Question about ARB_buffer_storage, buffer streaming, etc.

Hi, I'm fairly new to OpenGL and have been doing research, reading books, etc. I noticed buffer streaming is an extremely common technique, but ARB_buffer_storage is fairly new (GL 4.4), so it's not covered much. I'm still trying to fully understand buffer streaming, so I'm probably wrong in what follows.

Would it be a good idea to allocate a fairly large VBO via glBufferStorage, map it with the coherent, persistent, and write bits, and then stream my data into the VBO as needed? If I understand this correctly, it's essentially saying to OpenGL "Hey, give me some memory we can share and I promise I won't do anything to the memory you're currently using"?

There is no downside to buffer storage except that it only works on newer hardware (in fact NVIDIA highly recommends it); however, you should only use it if you need to stream data, because it is still slower than rendering from a static buffer.

I wonder what the upsides are? The documentation says nothing definite about the gains that buffer storage gives, except that it's an immutable type of storage. Does it mean that immutable storage is necessarily faster than a mutable one?

With a persistently mapped buffer you do not have to keep mapping and unmapping the buffer across draw calls, or alternatively create new buffers prior to each draw call; both of these are more expensive than using an immutable storage buffer.

Does it mean that immutable storage is necessarily faster than a mutable one?

As far as I can see, yes, which you would expect because you have told OpenGL this buffer cannot be resized.

AMD drivers still do not support buffer storage. Therefore I did not have a chance to actually play around with it. This is how far I got with my understanding:

On NVIDIA and AMD, mapping is a performance killer because of synchronisation between the application thread and the driver thread. Because of that, glBufferSubData/glGetBufferSubData was the way to go! They also manage all the buffer synchronisation (not overwriting data that the GPU is currently working with, waiting for the GPU, and cache coherence), but they often insert unnecessary memory copies on the CPU side.

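With buffer storage you get way more control. But you must take care of synchronisation yourself with fences (glFenceSync/glClientWaitSync), and, if you map without GL_MAP_COHERENT_BIT, you must also sync for cache coherency yourself (for example with glMemoryBarrier and GL_CLIENT_MAPPED_BUFFER_BARRIER_BIT).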

To use buffer storage for streaming, it is advised to allocate a buffer three times the maximum size you need and then use it as a ring buffer: the first range is being used by the GPU, the second range is ready to be used by the GPU, and the third range is the one you write data to or read data from.
Also, the mapped storage could be VRAM mapped directly into your address space, so you should only read or write it sequentially for performance reasons.
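
To make the three-range idea a bit more concrete, here is a rough sketch in plain C of the bookkeeping only. StreamRing and the ring_* functions are names I made up for illustration, they are not part of OpenGL, and the in_flight flags just stand in for real GLsync fences so the logic can be followed without a GL context. I am assuming one region per frame and a caller-chosen alignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define REGION_COUNT 3 /* three ranges, as described above */

/* Hypothetical helper type, not an OpenGL type. */
typedef struct {
    size_t region_size;             /* bytes per region, rounded up to alignment */
    size_t current;                 /* index of the region the CPU writes next */
    bool   in_flight[REGION_COUNT]; /* stand-in for one GLsync fence per region */
} StreamRing;

/* Round n up to the next multiple of alignment. */
size_t align_up(size_t n, size_t alignment)
{
    return (n + alignment - 1) / alignment * alignment;
}

void ring_init(StreamRing *r, size_t max_bytes_per_frame, size_t alignment)
{
    assert(alignment > 0);
    r->region_size = align_up(max_bytes_per_frame, alignment);
    r->current = 0;
    for (size_t i = 0; i < REGION_COUNT; ++i)
        r->in_flight[i] = false;
}

/* Size you would pass to glBufferStorage and glMapBufferRange. */
size_t ring_total_size(const StreamRing *r)
{
    return r->region_size * REGION_COUNT;
}

/* Returns true and stores the byte offset of the current region if the CPU
 * may write to it. Returns false if the GPU may still be reading it; real
 * code would wait on that region's fence with glClientWaitSync here. */
bool ring_begin_write(const StreamRing *r, size_t *offset_out)
{
    if (r->in_flight[r->current])
        return false;
    *offset_out = r->current * r->region_size;
    return true;
}

/* Call after issuing the draw calls that read the current region. Real code
 * would store glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0) for the region. */
void ring_end_write(StreamRing *r)
{
    r->in_flight[r->current] = true;
    r->current = (r->current + 1) % REGION_COUNT;
}

/* Call once the fence for a region has signalled (then glDeleteSync it). */
void ring_mark_done(StreamRing *r, size_t region)
{
    assert(region < REGION_COUNT);
    r->in_flight[region] = false;
}
```

In a real renderer the offset from ring_begin_write would be added to the persistently mapped pointer for the memcpy and used as the base offset for the draw call. If ring_begin_write keeps returning false, the CPU is running more than two regions ahead of the GPU.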

Originally Posted by tonyo_au

There is no downside to buffer storage except that it only works on newer hardware (in fact NVIDIA highly recommends it); however, you should only use it if you need to stream data, because it is still slower than rendering from a static buffer.

Actually, it looks like ARB_buffer_storage was designed to work with much older hardware as well.
Buffer storage is not only better for streaming, it is meant to replace the old buffers! You can even create a storage that is static from the client side and cannot be changed by glBufferSubData (by leaving out GL_DYNAMIC_STORAGE_BIT when you create it).