Years ago I tried to replace immediate mode with a glBuffer(Sub)Data-based method. This translates badly. The buffer sizes are simply too small; the only way to make it work is to reorganize all the data to accumulate larger amounts, which is far beyond the scope anyone is willing to go to. And constant mapping/unmapping is even worse in this particular case.
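The accumulation approach mentioned above can be pictured as a thin batching layer: immediate-mode-style vertex calls append to a client-side array, and a flush uploads the whole batch in one go. This is only a sketch under assumptions - all names here are hypothetical, and the GL upload/draw calls are shown as comments since they need a live context.

```c
#include <stddef.h>

/* Hypothetical batching layer: glVertex-style calls append to a client
   array; batch_flush() would upload the whole batch with a single
   glBufferSubData and issue one draw call. */
typedef struct { float x, y, z; } Vertex;

#define BATCH_CAPACITY 65536
static Vertex batch[BATCH_CAPACITY];
static size_t batch_count = 0;

void batch_vertex3f(float x, float y, float z)
{
    if (batch_count == BATCH_CAPACITY)
        return; /* real code would flush here instead of dropping */
    batch[batch_count].x = x;
    batch[batch_count].y = y;
    batch[batch_count].z = z;
    ++batch_count;
}

/* Returns how many vertices the flush would have drawn. */
size_t batch_flush(void)
{
    size_t drawn = batch_count;
    /* With a live GL context, the upload and draw would go here:
       glBufferSubData(GL_ARRAY_BUFFER, 0,
                       batch_count * sizeof(Vertex), batch);
       glDrawArrays(GL_TRIANGLES, 0, (GLsizei)batch_count); */
    batch_count = 0;
    return drawn;
}
```

The point of the layer is exactly the reorganization cost described above: every call site that used to sit between glBegin/glEnd has to be routed through it.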

Of course, mapping immediate-mode drawing (of a few vertices) directly onto VBOs is the worst possible solution. It is much slower than immediate mode because of the significant per-call overhead.
I also understand the significant amount of coding needed to optimize everything. If immediate mode suits you, just keep using it. I'm not sure about macOS, but on Windows/Linux you should be able to use the compatibility profile, easily mix the ancient and modern approaches, and slowly drift toward a better, more optimized solution. I'm not sure persistent buffer storage can always beat immediate mode if direct mapping is used, but if you measured performance and concluded that it does, then that is excellent news.

Years ago I tried to replace immediate mode with a glBuffer(Sub)Data-based method. This translates badly.

And constant mapping/unmapping is even worse in this particular case.

Why? I'd assume an implementation caches immediate-mode commands anyway and executes them in a more optimal manner, regarding transfer over the bus, at the time you call glEnd(). Why would mapping, or updating via Buffer[Sub]Data(), not work just as well?

The buffer sizes are simply too small; the only way to make it work is to reorganize all the data to accumulate larger amounts, which is far beyond the scope anyone is willing to go to.

How does GL_ARB_buffer_storage increase the amount of storage you can allocate for a buffer? I don't get it. What the extension gives you are mechanisms to make buffer objects immutable (preventing client-side, but not server-side, updates), to map a pointer to buffer storage persistently into client space, and to do things like render-while-mapped (which was not possible otherwise); a vague sort of hint as to where a buffer object's data store is to be allocated (which, if you read the extension closely, you'll find is pretty useless, especially with UMAs like Intel hybrid graphics, and is not even guaranteed to work as expected on other architectures); and certain other things controlling mapped buffers and synchronization.
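For context, the persistent-mapping usage the extension enables is typically paired with manual multi-buffering, so the CPU never writes a region the GPU is still reading. A minimal sketch of the offset arithmetic, assuming a triple-buffered layout (the GL calls are comments and the names are made up):

```c
#include <stddef.h>

/* Hypothetical triple-buffered layout for a persistently mapped buffer.
   With ARB_buffer_storage one would allocate and map once up front:
     glBufferStorage(GL_ARRAY_BUFFER, 3 * REGION_SIZE, NULL,
                     GL_MAP_WRITE_BIT | GL_MAP_PERSISTENT_BIT |
                     GL_MAP_COHERENT_BIT);
     ptr = glMapBufferRange(GL_ARRAY_BUFFER, 0, 3 * REGION_SIZE,
                            GL_MAP_WRITE_BIT | GL_MAP_PERSISTENT_BIT |
                            GL_MAP_COHERENT_BIT);
   and then rotate through the three regions, guarding each with
   glFenceSync/glClientWaitSync before reuse. */

#define REGION_COUNT 3
#define REGION_SIZE  (1024 * 1024)

/* Byte offset of the region the CPU may write during a given frame. */
size_t region_offset(unsigned frame)
{
    return (size_t)(frame % REGION_COUNT) * REGION_SIZE;
}
```

Note that none of this changes how much storage you can allocate - it only changes how you synchronize access to it, which is the point being made above.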

As it stands, and to the best of my knowledge, implementations will not restrict how much stuff you upload to the GPU as long as there is memory. I did some tests a while ago, simply creating 1024k-sized VBOs and letting the code run until around 8GB of system memory was filled ... and the implementation never complained with an OUT_OF_MEMORY error. Memory allocation of buffer objects is completely opaque, and GL_ARB_buffer_storage, although the attempt was made, did not solve this problem. As a developer you can only hope stuff is put where you expect it to be put.
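A probe along the lines described above might look like the following sketch. The GL step is shown as a comment (it needs a context), and the loop here simply counts requested bytes up to a cap instead of actually exhausting memory; all names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical version of the allocation probe: keep creating 1024 KB
   buffer objects and count how much was requested. With a context,
   each step would be:
     glGenBuffers(1, &id);
     glBindBuffer(GL_ARRAY_BUFFER, id);
     glBufferData(GL_ARRAY_BUFFER, CHUNK, NULL, GL_STATIC_DRAW);
     if (glGetError() == GL_OUT_OF_MEMORY) break;  // reportedly never fires
*/
#define CHUNK ((size_t)1024 * 1024)

/* Accumulate CHUNK-sized "allocations" until the cap would be exceeded. */
size_t probe_bytes(size_t max_bytes)
{
    size_t total = 0;
    while (total + CHUNK <= max_bytes)
        total += CHUNK;   /* stand-in for one successful glBufferData */
    return total;
}
```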

That's why the project never went further than immediate mode 3.0, because anything beyond that simply was not doable efficiently.

Maybe you should ask for ideas on that in a separate thread! We may be able to help, you know?

Of course, mapping immediate-mode drawing (of a few vertices) directly onto VBOs is the worst possible solution

From what they (I don't know whether Nikki is female or male here) have told us so far, I assume they're dealing with much more than a few vertices. Maybe something like point clouds? That's why I suggested they open another thread and specify the problem at hand.

I'm not sure about macOS, but on Windows/Linux

Neither Apple nor Intel (on Linux) will provide ARB_compatibility when using context versions higher than 3.2 - with these vendors on these platforms, you're screwed if you want to stick to legacy stuff - which I personally find very sexy. We wouldn't be discussing a more than 20-year-old API for new or refactored applications if ARB_compatibility had never come up ...

This is really coming across as what Stack Overflow would call "a rant disguised as a question".

Is immediate mode currently causing you a problem? If the answer to that is "no" then congratulations! You can continue using immediate mode. Nobody is forcing you to upgrade your code. Stick with GL3.0 and use immediate mode safe in the knowledge that there is so much more legacy software out there using it that it's not going to go away any time soon.

If the answer is "yes" then evaluate your code. Buffer objects are nothing new - they've been part of core OpenGL since version 1.5! So stop treating them as some kind of scary new functionality, because they're not. So: does your vertex data need to change from frame to frame? If not, you can put it in a static buffer object and just be done with it. Your glBegin/glEnd pairs (and everything between them) can each be converted to a single glDrawArrays call and your job is done.
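The static-data conversion described above can be sketched as follows; the immediate-mode original and the upload/draw calls appear as comments since they need a live context, and the data is just an illustrative triangle.

```c
#include <stddef.h>

/* Sketch of the conversion. The immediate-mode original:
     glBegin(GL_TRIANGLES);
       glVertex3f(0.0f, 0.0f, 0.0f);
       glVertex3f(1.0f, 0.0f, 0.0f);
       glVertex3f(0.0f, 1.0f, 0.0f);
     glEnd();
   becomes a static client array, uploaded once:
     glBufferData(GL_ARRAY_BUFFER, sizeof(tri), tri, GL_STATIC_DRAW);
   and then drawn each frame with a single call:
     glDrawArrays(GL_TRIANGLES, 0, 3);                               */
static const float tri[] = {
    0.0f, 0.0f, 0.0f,
    1.0f, 0.0f, 0.0f,
    0.0f, 1.0f, 0.0f,
};

/* Vertex count glDrawArrays would be given (3 floats per vertex). */
size_t tri_vertex_count(void)
{
    return sizeof(tri) / (3 * sizeof(float));
}
```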

If your vertex data does need to change then evaluate the kind of changes it needs. Is this CPU-side code you can migrate to a vertex shader? A lot of what you think is dynamic data can actually be handled this way: frame interpolation, time-based stuff, etc. If it fits this description then you can still put it in a static vertex buffer and be done with it.
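The move-it-to-the-shader idea might look like the following GLSL, stored here as a C string; the attribute and uniform names are made up for illustration. The vertex buffer holds two keyframes and stays static; the CPU only updates one blend uniform per frame.

```c
#include <string.h>

/* Hypothetical GLSL vertex shader interpolating between two keyframe
   positions entirely on the GPU; the buffer itself never changes. */
static const char *interp_vs =
    "#version 130\n"
    "in vec3 aPos0;\n"          /* keyframe 0 position */
    "in vec3 aPos1;\n"          /* keyframe 1 position */
    "uniform float uBlend;\n"   /* 0..1, updated once per frame */
    "uniform mat4 uMvp;\n"
    "void main() {\n"
    "    vec3 p = mix(aPos0, aPos1, uBlend);\n"
    "    gl_Position = uMvp * vec4(p, 1.0);\n"
    "}\n";

/* Sanity helper: confirms the shader source blends the two keyframes. */
int shader_uses_mix(void)
{
    return strstr(interp_vs, "mix(aPos0, aPos1, uBlend)") != NULL;
}
```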

My experience is that the only real cases where vertex data needs to be absolutely dynamic are (1) a truly dynamic CPU-side particle system, or (2) 2D GUI code. Even in those cases you can still use glBegin/glEnd intelligently (i.e. put them outside a loop rather than inside one) and still get high performance. Or you can use client-side arrays (also not new/scary/risky - GL 1.1 this time). Like I said at the start, nobody is forcing you to use buffer objects, and nor is anybody forcing you to use the newer OpenGL. You've got an extremely rich set of options available, most of which are absolutely ubiquitously supported, so instead of complaining about problems, how about you start thinking about solutions?
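For the particle-system case, client-side arrays might be sketched like this - no buffer object at all, the pointer goes straight at application memory. The GL calls appear as comments (they need a context), and the struct layout is an assumed example, not anyone's actual code.

```c
#include <stddef.h>

/* Client-side vertex arrays (core since GL 1.1). Hypothetical usage
   for a CPU-side particle system, with verts updated in place:
     glEnableClientState(GL_VERTEX_ARRAY);
     glVertexPointer(3, GL_FLOAT, sizeof(ParticleVertex), &verts[0].x);
     glDrawArrays(GL_POINTS, 0, count);
     glDisableClientState(GL_VERTEX_ARRAY);
   The struct below shows the interleaved layout the stride argument
   would describe. */
typedef struct {
    float x, y, z;            /* position, GL_FLOAT */
    unsigned char r, g, b, a; /* color, GL_UNSIGNED_BYTE */
} ParticleVertex;

/* Stride passed to glVertexPointer/glColorPointer for this layout. */
size_t particle_stride(void)
{
    return sizeof(ParticleVertex);
}
```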

If your vertex data does need to change then evaluate the kind of changes it needs. Is this CPU-side code you can migrate to a vertex shader? A lot of what you think is dynamic data can actually be handled this way: frame interpolation, time-based stuff, etc. If it fits this description then you can still put it in a static vertex buffer and be done with it.

Great! Always the same suggestions that don't help. I have said before that reorganizing the vertex data would be prohibitive due to the work involved. It simply can't be done without rewriting half of the application's rendering code and that's completely out of the question. I do not want to do such a rewrite, it'd be months of work for no gain.

You always make it sound so simple: just rethink your approach and all will be fine. Well, in the real world it won't be fine! In the real world you've got to deal with huge, crufty code bases that do not like being torn apart and reassembled. The only chance you've got is to take the path of least resistance; in this case that means not touching the rendering logic itself, only changing the means of getting your data onto the GPU with as little change to the code as possible.

We put off the transition to core profile because of that - it simply was too slow - and now with persistently mapped buffers we finally have the chance. But what's stopping us cold is the plain and simple fact that a lot of the computers this needs to run on during the transition phase do not have a dedicated graphics card, thanks to some bean counters thinking that integrated chipsets have become good enough.

Great! Always the same suggestions that don't help. I have said before that reorganizing the vertex data would be prohibitive due to the work involved. It simply can't be done without rewriting half of the application's rendering code and that's completely out of the question. I do not want to do such a rewrite, it'd be months of work for no gain.

Well then don't rewrite it.

As I said earlier, nobody is forcing you to. You've said yourself that the current code is working fine, so why on earth would you rewrite it?

I'm totally failing to see what the problem you're facing here is. You have a codebase that you say is working fine, and you have no pressing need to rewrite it, so not rewriting it is always an option. It will continue to work fine.

If you really, really, really want to rewrite it, you do have other options. You can use client-side vertex arrays for dynamic objects, gradually transition without disrupting your codebase too much, and slowly put yourself in a position where you can make the jump to 4.x - but the important thing is that not rewriting it at all is also an option.

That's why I also said "rant disguised as a question" above, you know. You seem to be ignoring the fact that you don't have to rewrite this code, focusing instead on complaints and negatives. You don't have to rewrite, so don't.