I wasn't able to reproduce the "array access is out of bounds" error. I did need to add something like "layout(local_size_x=16, local_size_y=16) in;" to avoid a different error. What driver are you using?

Thanks, I was able to reproduce the problem with your new sample. This is a bug in our driver. Before ARB_shader_storage_buffer_object, which allows the last member of a buffer block to be an unsized array, it was already possible to declare unsized arrays, but the shader had to index the array somewhere in such a way that the compiler could determine the size from its use. In your example, g_output[128] makes our compiler treat g_output as the old style of unsized array and size it to 128. The other access, g_output[128 + index], is then flagged as out of bounds against that inferred size.

You can work around this bug by not indexing the unsized buffer array with a constant. For example, change g_output[128] to g_output[gl_LocalInvocationID.x + 128].
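For reference, here is a minimal sketch of what the workaround might look like in a complete compute shader. The block name, binding point, element type and the stored values are assumptions made for illustration, since the original sample is not shown in this thread. Only the g_output name and the 16x16 local size come from the posts above.

```glsl
#version 430

// Local size taken from the first reply in the thread.
layout(local_size_x = 16, local_size_y = 16) in;

// Hypothetical block: name, binding and element type are assumptions.
layout(std430, binding = 0) buffer Output {
    float g_output[];   // unsized array as the last (here: only) member
};

void main()
{
    uint index = gl_LocalInvocationIndex;   // 0..255 for a 16x16 work group

    // Form that triggers the bogus out-of-bounds error on affected drivers:
    //     g_output[128] = 1.0;
    // Workaround suggested above: make the index non-constant.
    g_output[gl_LocalInvocationID.x + 128u] = 1.0;

    // Dynamically indexed access, as in the original report.
    g_output[128u + index] = float(index);
}
```

Note that the rewritten access touches elements 128 to 143 rather than only element 128, so it is not a drop-in replacement if the exact element matters. With these indices the bound buffer needs room for at least 384 floats.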

The control flow barrier built-in function barrier() is allowed inside uniform flow control for compute shaders.

Currently (in 306.63) this is an error. Will this be fixed (for GLSL and assembly shaders)?

Yes. As I mentioned earlier in the thread (August 27), I filed a Khronos bug on this issue after realizing that this behavior would be problematic. We decided to simply remove the restriction from the GLSL 4.30 specification, rather than postponing the change to a future version of GLSL or leaving it as an extension. As you observe, this happened in revision 7. NVIDIA hasn't yet published a driver that removes the error, but we will definitely do so.

Oh, there's another spec bug related to this. The expressions leading to the execution of a barrier() must be "dynamically uniform". However, the section titled "dynamically uniform expressions" states that the concept applies only to fragment shaders.
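To make the distinction concrete, here is a small sketch of a barrier() inside uniform flow control, next to a commented-out case that is not uniform. The uniform g_passCount, the block name and the smoothing loop are hypothetical and exist only to give the barrier something to synchronize.

```glsl
#version 430

layout(local_size_x = 64) in;

// Hypothetical uniform: it has the same value for every invocation, so any
// condition built only from it is dynamically uniform.
uniform int g_passCount;

// Hypothetical storage block, sized by the application to cover all invocations.
layout(std430, binding = 0) buffer Data {
    float g_data[];
};

shared float s_data[64];

void main()
{
    uint lid = gl_LocalInvocationID.x;
    uint gid = gl_GlobalInvocationID.x;

    s_data[lid] = g_data[gid];
    memoryBarrierShared();
    barrier();

    // Every invocation runs the same number of iterations, so all of them
    // reach each barrier() below: this is uniform flow control.
    for (int pass = 0; pass < g_passCount; ++pass) {
        float neighbor = s_data[(lid + 1u) % 64u];
        barrier();                      // all reads finish before any write
        s_data[lid] = 0.5 * (s_data[lid] + neighbor);
        memoryBarrierShared();
        barrier();                      // all writes visible before next pass
    }

    // Not uniform: the condition differs between invocations of one work
    // group, so a barrier() here would not be reached by all of them.
    //     if (lid < 32u) { barrier(); }

    g_data[gid] = s_data[lid];
}
```

On a driver that still enforces the old restriction, the barrier() calls inside the loop would presumably be the ones rejected.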

// ...
struct Struct0 {
    ivec2 m0;
};
layout(std430) buffer Input {
    int data0;      // offset 0
    // I think the offset of data1 should be 16, according to the rule: "the base
    // alignment of the structure is N, where N is the largest base alignment value
    // of any of its members, and rounded up to the base alignment of a vec4".
    // With the 310.33 driver the offset is 8, so the structure base alignment is
    // not rounded up to the base alignment of a vec4.
    Struct0 data1;
} g_input;
// ...
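One way to check this is to declare the same members under both layouts and compare the offsets the driver reports. The sketch below does that. The offset comments reflect my reading of the OpenGL 4.3 core specification, where std430 is described as std140 except that array and structure base alignments and strides are not rounded up to a multiple of the base alignment of a vec4. Under that reading, the offset of 8 reported by the driver would be the expected std430 result. The second block and its names are hypothetical additions for the comparison.

```glsl
#version 430

layout(local_size_x = 1) in;

struct Struct0 {
    ivec2 m0;            // base alignment 8
};

// std140: the structure's base alignment is rounded up to that of a vec4.
layout(std140, binding = 0) buffer Input140 {
    int     data0;       // offset 0
    Struct0 data1;       // expected offset 16
} g_input140;

// std430: assuming the rounding to vec4 alignment is dropped for structures.
layout(std430, binding = 1) buffer Input430 {
    int     data0;       // offset 0
    Struct0 data1;       // expected offset 8 under that reading
} g_input430;

void main()
{
    // Touch both blocks so that neither is treated as unused.
    g_input430.data0    = g_input140.data0;
    g_input430.data1.m0 = g_input140.data1.m0;
}
```

The actual offsets can be read back on the application side with glGetProgramResourceiv, using the GL_BUFFER_VARIABLE interface and the GL_OFFSET property, which avoids relying on anyone's interpretation of the layout rules.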