I have a hard time understanding how this can possibly work on something as performance-critical as a GPU driver, but keeping drivers cross-platform is one of their stated goals.

I'm not an expert on this but I have spent some time looking into Gallium3D (though that was mostly a few years back!).

AIUI, it's source compatibility, not binary compatibility, so that's part of it. The other part is that, to a certain degree, the OS-specific stuff does two things:
a) set up a direct comms channel between the application and the graphics hardware, e.g. using shared memory. Direct 3D rendering under X, for example, mostly doesn't (in principle) interact with the X server at all.
b) talk to the windowing system about where the rendering is / should be.

Neither of these things is on the performance-critical path, which is the communication with the graphics driver itself. So in principle they don't really matter to graphics performance.

That still leaves the question of how they do the 3D API stuff, though... Currently, OpenGL on Linux is built on Mesa, which has a load of function calls for very high-level operations, which call lower-level operations, which call lower-level operations, etc. The default implementations are all software. Each graphics driver reimplements whichever operations its card can accelerate, according to its capability (better cards might accelerate more, or higher-level, ops).

For Gallium, the idea is that modern graphics cards all offer a certain range of functionality that can be hidden behind a low-level function-call interface (as opposed to the high-level OpenGL operations). The driver just implements these low-level operations and doesn't need to worry about which API is in use.

That suggests (to me - I am no expert) that the optimisation opportunities are at least going to be rather different compared to folding the driver code into the 3D API implementation. Maybe this will be beneficial, maybe it will be awkward...