Rendering large terrain heightmap (GL_MAX_ELEMENTS_VERTICES?)

I've been working on rendering a square terrain heightmap that I can manipulate in realtime, and my current setup works for anything up to around 129*129 vertices. But when I try 257*257 vertices, I see some strange artifacts; the wireframe doesn't appear correct and backface culling seems to fail. A second, flat "ghost" grid appears overlaid on the main one, and won't respond to my manipulations. When I try 513*513 vertices, the grid doesn't render at all, but the program is still slowed down as if vertex processing and everything else is being done.

I substituted simpler vertex and fragment shaders to see if the problem was there. The FPS jumped up and camera movement was as smooth as it was at lower heightmap resolutions, but the visual problems remained, so I concluded the main problem wasn't in the shaders.

Then I thought the problem might be in the number of vertices on screen (I was only doing backface culling). So I implemented a simple space partitioning system that broke the grid evenly into 33x33 squares and did some frustum culling so only those in the viewing frustum would be rendered (by binding a unique IBO per chunk). I'm certain the code's correct, since I reused it from an older project that worked fine. The visual artifacts still remained, so I concluded the problem wasn't with the number of triangles I'm attempting to draw.
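For reference, the per-chunk index buffers can be sketched roughly as below. This is a minimal sketch, not my exact code: the function name `buildChunkStripIndices`, the constant `kRestartIndex`, the 32-bit index type, row-major vertex layout, and the assumption that neighbouring chunks share their border vertices are all illustrative choices. Each chunk's indices point into the one shared VBO, and the winding order may need flipping depending on the culling setup.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical restart marker; it has to match whatever value is passed to
// glPrimitiveRestartIndex in the real program.
constexpr std::uint32_t kRestartIndex = 0xFFFFFFFFu;

// Builds triangle-strip indices for one chunk of a square grid.
// gridSide  : vertices per side of the whole grid (e.g. 257)
// chunkX/Z  : chunk coordinates, counted in chunks
// chunkSide : vertices per side of one chunk (e.g. 33)
// Assumes vertex (x, z) is stored at index z * gridSide + x and that
// adjacent chunks share their border row/column of vertices.
std::vector<std::uint32_t> buildChunkStripIndices(std::uint32_t gridSide,
                                                  std::uint32_t chunkX,
                                                  std::uint32_t chunkZ,
                                                  std::uint32_t chunkSide) {
    std::vector<std::uint32_t> indices;
    const std::uint32_t quads = chunkSide - 1;  // quads per chunk side
    const std::uint32_t x0 = chunkX * quads;    // first grid column of chunk
    const std::uint32_t z0 = chunkZ * quads;    // first grid row of chunk
    indices.reserve(quads * (2 * chunkSide + 1));

    for (std::uint32_t row = 0; row < quads; ++row) {
        // One strip per row of quads: alternate between this row of
        // vertices and the next one.
        for (std::uint32_t col = 0; col < chunkSide; ++col) {
            const std::uint32_t x = x0 + col;
            indices.push_back((z0 + row) * gridSide + x);
            indices.push_back((z0 + row + 1) * gridSide + x);
        }
        // End the strip so the next row starts fresh.
        indices.push_back(kRestartIndex);
    }
    return indices;
}
```

With a 257x257 grid and 33x33 chunks this gives 8x8 chunks, and something like `buildChunkStripIndices(257, 7, 7, 33)` would cover the far corner of the terrain.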

Now I'm considering whether the issue might be the size of the single VBO I'm using. I'm not sure how much a single VBO can hold, and how much my integrated graphics card can hold, total.

Each vertex is 60 bytes. 257x257 of them means nearly 4GB of data. That seems a lot; is it actually? If this is the case, breaking my grid up into squares with their own IBOs won't help much since they index into the same giant buffer. Having a separate VBO per chunk won't help either because they'll all be loaded anyway.

If the amount of data on the card is the problem, the only way I can think of around it would be to have an array of vertices CPU-side, and then build a separate VBO per chunk. Then I'd need to store all the data CPU-side and call glBufferData when I want to draw a specific chunk, which seems really wasteful and defeats the point of modern OpenGL programming. But it's the only thing I can think of; if there's too much data on the graphics card I need to make sure there's less on it at any given moment.

Is there any other, less wasteful way to reduce the memory load on the graphics card?

Edit - ignore the GL_MAX_ELEMENTS_VERTICES in the title. I put that there since it was returning 1200, which I thought might be the problem, but then I read a bit more and realized it only referred to glDrawRangeElements and didn't mean much anymore. Unfortunately I can't find any way to edit the title.

But when I try 257*257 vertices, I see some strange artifacts; the wireframe doesn't appear correct and backface culling seems to fail. A second, flat "ghost" grid appears overlaid on the main one, and won't respond to my manipulations. When I try 513*513 vertices, the grid doesn't render at all, but the program is still slowed down as if vertex processing and everything else is being done.

It sounds like z-fighting. What are the depth buffer precision, the near and far clip distances, and the values of the grid vertices? Also, could you post an image that illustrates the issue?

Originally Posted by fiodis

Each vertex is 60 bytes. 257x257 of them means nearly 4GB of data. That seems a lot; is it actually?

It seems ridiculously small.
You made a calculation error of three orders of magnitude. Your grid data occupies about 4MB. In practice it occupies somewhat more because of alignment, but it is still a very small VBO. A VBO can be as large as the available graphics memory (NVIDIA), or even extend into shared memory (AMD). I cannot guarantee this is still valid, since two years have passed since the last experiment I made.
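The arithmetic can be checked with a small sketch like the one below. The helper name `gridBufferBytes` is made up for illustration, and the figure ignores any alignment padding the driver may add.

```cpp
#include <cstddef>

// Raw size in bytes of a square grid of side x side vertices,
// ignoring any alignment or padding added by the driver.
constexpr std::size_t gridBufferBytes(std::size_t side,
                                      std::size_t bytesPerVertex) {
    return side * side * bytesPerVertex;
}

// 257 * 257 = 66,049 vertices at 60 bytes each: just under 4 MB, not 4 GB.
static_assert(gridBufferBytes(257, 60) == 3962940, "257x257 grid size");

// Even the 513x513 grid is only about 15.8 MB.
static_assert(gridBufferBytes(513, 60) == 15790140, "513x513 grid size");
```

Both sizes are far below what even an integrated GPU can hold in a single buffer.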

Here's a top view of a corner of the 257x257 heightmap, with a generic texture and wireframe enabled. The wireframe is done by calling glPolygonMode() with the appropriate flags and rendering the same VBO a second time; it works fine with lower resolution heightmaps. The near clip plane's at 1.0, and the far's at 100.0. Usually I have it at 0.001 and 1000.0, but I'd hoped reducing the range would resolve the issue. It didn't.
Here I've raised a few vertices and lowered a few vertices with my editing tool. It looks okay.
Here is the view from the bottom. I've got backface culling enabled, so most of these vertices shouldn't be visible, yet they are. Also it looks like there's an extra plane being drawn from the top of the "hill" I made in the previous screenshot. Additionally, from underneath the wireframe looks slightly off; there aren't any lines in the z direction, only x. (Y is up/down.)

Here are a couple pictures from the center of the terrain, where the problems are more visible. The green square is the graphical representation of my editing tool; it's an effect applied to the terrain entirely through shaders, so I'm not rendering any extra polygons there. Still, you're right, it looks like there's z-fighting, which is odd considering there aren't any overlapping polygons there.
Here's a view of the same area after I've raised a similar hill/valley pattern as I did on the corner. You see the valley and brush are occluded by some kind of giant "ghost" triangle which isn't in the VBO, divides the terrain exactly in half and doesn't respond to raise/lower commands. This is probably the cause of the z-fighting, but I've no idea why it's there.
Also, though I don't have a picture from underneath, backface culling is also failing here.
Very weird effects, and I no longer have any idea where they come from. You're right, 4MB is tiny and even my integrated Intel graphics shouldn't be having a problem.

Using glPolygonOffset did help make the wireframe stand out better and got rid of the wireframe z-fighting, thanks!

I was pretty sure the grid was correct; I'm using primitive restart to make sure the strips don't bend back and start over. After going over the code over and over again, I found the problem: I was using unsigned shorts as the data type for the index buffer! Their range only runs up to 65,535, and I had 66,049 vertices (257*257), so the indices wrapped back around to zero. Once I changed everything to unsigned ints, everything rendered fine, just as it did at lower resolutions. Now I only have to make sure I'm never drawing more than ~4 billion vertices in a single draw call.
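For anyone hitting the same thing, here is a minimal sketch of the wrap-around and of a check that would have caught it. The helper names `asShortIndex` and `fitsInShortIndices` are hypothetical, and reserving 0xFFFF as the restart index is an assumption about a typical setup, not something OpenGL forces.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// What effectively happens to a vertex index when it is stored in a
// 16-bit index buffer: the upper bits are dropped.
constexpr std::uint16_t asShortIndex(std::size_t vertexIndex) {
    return static_cast<std::uint16_t>(vertexIndex);
}

// True if every index for vertexCount vertices fits in an unsigned short.
// If reserveRestartIndex is set, 0xFFFF is assumed to be kept free as the
// primitive restart marker, so one fewer vertex is addressable.
constexpr bool fitsInShortIndices(std::size_t vertexCount,
                                  bool reserveRestartIndex) {
    const std::size_t maxShort = std::numeric_limits<std::uint16_t>::max();
    return vertexCount <= (reserveRestartIndex ? maxShort : maxShort + 1);
}

// 129x129 fits comfortably, 257x257 does not.
static_assert(fitsInShortIndices(129 * 129, true), "129x129 fits");
static_assert(!fitsInShortIndices(257 * 257, false), "257x257 overflows");

// The last vertex of the 257x257 grid (index 66,048) aliases vertex 512.
static_assert(asShortIndex(66048) == 512, "index wraps around");
```

So the vertices past index 65,535 were silently aliased onto the first rows of the grid, which would fit the flat "ghost" geometry that never responded to edits.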