| arb_extension = ARB_geometry_shader4 <!-- This is intentionally left unlinked due to the differences between it and the core feature -->


}}


|{{pipeline float}}

}}

}}

A '''Geometry Shader''' (GS) is a [[Shader]] program written in [[GLSL]] that governs the processing of primitives. It happens after primitive assembly, as an additional optional step in that part of the [[Rendering Pipeline Overview|pipeline]]. A GS can create new primitives, unlike [[Vertex Shader|vertex shaders]], which are limited to a 1:1 input to output ratio. A GS can also do layered rendering; this means that the GS can specifically say that a primitive is to be rendered to a particular layer of the framebuffer.


A geometry shader is optional and does not have to be used.

{{note|While geometry shaders have had previous extensions like GL_EXT_geometry_shader4 and GL_ARB_geometry_shader4, these extensions expose the API and GLSL functionality in ''very'' different ways from the core feature. This page describes only the core feature.}}


== Overview ==

Geometry shaders sit between [[Vertex Shader]]s (or the optional [[Tessellation]] stage) and the fixed-function [[Vertex Post-Processing]] stage. Vertex shaders have a 1:1 ratio of vertices input to vertices output. Each vertex shader invocation gets one vertex in and writes one vertex out.

Geometry shader invocations take a single [[Primitive]] as input and may output zero or more primitives. There are implementation-defined limits on how many primitives can be generated from a single GS invocation.

While the GS can be used to amplify geometry, implementing a form of tessellation, this is not the primary use for the feature. The general uses for GSs are:

* Layered rendering: taking one primitive and rendering it to multiple images without having to change bound render targets and so forth.
* [[Transform Feedback]]: This is often employed for doing computational tasks on the GPU (obviously pre-[[Compute Shader]]).

In OpenGL 4.0, GSs gained two new features. The first is the ability to write to multiple output streams. This is used exclusively with transform feedback, such that different feedback buffer sets can get different transform feedback data.

The other feature is GS instancing, which allows multiple invocations to operate over the same input primitive. This makes layered rendering easier to implement and possibly faster performing, as each layer's primitive(s) can be computed by a separate GS instance.

== Primitive in/out specification ==


The {{param|input_primitive}} type ''must'' match the primitive type used with the rendering command that renders with this shader program. The valid values for {{param|input_primitive}}, along with the valid OpenGL [[Primitive|primitive types]], are:


These work exactly the same way their counterpart OpenGL rendering modes do. To output individual triangles or lines, simply use {{code|EndPrimitive}} (see below) after emitting each set of 3 or 2 vertices.

There must be a {{code|max_vertices}} declaration for the output. The number must be a compile-time constant, and it defines the maximum number of vertices that will be written by a ''single'' invocation of the GS. It may be no larger than the implementation-defined limit of {{enum|GL_MAX_GEOMETRY_OUTPUT_VERTICES}}. Implementations must support a limit of at least 256. See the limitations below.
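As a minimal sketch of how the input and output declarations fit together, the following pass-through shader accepts triangles and re-emits each one unchanged. It assumes the vertex shader has already written a final clip-space {{code|gl_Position}}; the choice of {{code|triangles}} in and {{code|triangle_strip}} out is only one valid combination.

<source lang="glsl">
#version 330 core

// Input: one triangle per invocation. The draw call must use GL_TRIANGLES,
// GL_TRIANGLE_STRIP or GL_TRIANGLE_FAN to match this declaration.
layout(triangles) in;

// Output: a triangle strip of at most 3 vertices, i.e. a single triangle.
layout(triangle_strip, max_vertices = 3) out;

void main()
{
    // Forward the three input vertices without modification.
    for (int i = 0; i < 3; ++i)
    {
        gl_Position = gl_in[i].gl_Position;
        EmitVertex();
    }
    // Close the strip after exactly one triangle.
    EndPrimitive();
}
</source>

Because {{code|max_vertices}} is 3, emitting a fourth vertex from this shader would exceed the declared maximum.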

=== Instancing ===


}}

}}


The GS can also be instanced (note that this is different from [[Vertex_Rendering#Instancing|instanced rendering]]). This causes the GS to execute multiple times for the same primitive. This is useful for layered rendering and outputs to multiple streams (see below).

To use instancing, there must be an input layout qualifier:


The output primitives from instances are ordered by the {{code|gl_InvocationID}}. So 2 primitives written from 3 instances will create a primitive stream of: (prim0, inst0), (prim0, inst1), (prim0, inst2), (prim1, inst0), ...
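The following is a small sketch of an instanced GS, assuming GLSL 4.00 and a vertex shader that writes clip-space positions. Each of the three invocations emits one copy of the input triangle, shifted horizontally by an amount derived from {{code|gl_InvocationID}}; the shift value is purely illustrative.

<source lang="glsl">
#version 400 core

// Run this shader 3 times for every input triangle.
layout(triangles, invocations = 3) in;
layout(triangle_strip, max_vertices = 3) out;

void main()
{
    // Illustrative offset: -0.5, 0.0 or +0.5 depending on the instance.
    vec4 shift = vec4(0.5 * float(gl_InvocationID - 1), 0.0, 0.0, 0.0);

    for (int i = 0; i < 3; ++i)
    {
        gl_Position = gl_in[i].gl_Position + shift;
        EmitVertex();
    }
    EndPrimitive();
}
</source>

With this shader, each input primitive contributes three output triangles, one per instance, ordered by {{code|gl_InvocationID}}.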


== Inputs ==

Geometry shaders take a primitive as input; each primitive is composed of some number of vertices.

The outputs of the vertex shader (or [[Tessellation]] Stage, as appropriate) are thus fed to the GS as ''arrays'' of variables. These can be organized as individual values or as part of an [[Interface Block|interface block]].

The predefined outputs from the prior pipeline stage are already arranged in an interface block.

<source lang="glsl">
in gl_PerVertex
{
  vec4 gl_Position;
  float gl_PointSize;
  float gl_ClipDistance[];
} gl_in[];
</source>

The length of {{code|gl_in[]}} corresponds to the number of vertices in the input primitive. All input arrays will have the same length.

The order of vertices in input arrays corresponds to the order of the vertices specified by prior shader stages.
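To illustrate how per-vertex inputs arrive as arrays, here is a hedged sketch that reads both a loose input variable and an interface block. The names {{code|vertColor}}, {{code|VertexData}}, {{code|texCoord}}, {{code|geomColor}} and {{code|geomTexCoord}} are hypothetical; the previous stage would have to declare matching outputs.

<source lang="glsl">
#version 330 core

layout(triangles) in;
layout(triangle_strip, max_vertices = 3) out;

// Hypothetical loose input: one entry per vertex of the input primitive.
in vec3 vertColor[];

// Hypothetical interface block input, also arrayed per vertex.
in VertexData
{
    vec2 texCoord;
} vertIn[];

// Outputs are not arrays; they describe the vertex about to be emitted.
out vec3 geomColor;
out vec2 geomTexCoord;

void main()
{
    // gl_in.length() is 3 here because the input primitive is a triangle.
    for (int i = 0; i < gl_in.length(); ++i)
    {
        gl_Position  = gl_in[i].gl_Position;
        geomColor    = vertColor[i];
        geomTexCoord = vertIn[i].texCoord;
        EmitVertex();
    }
    EndPrimitive();
}
</source>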

=== Primitive inputs ===

There are some GS input values that are based on primitives, not vertices. These are:

<source lang="glsl">
in int gl_PrimitiveIDIn;
in int gl_InvocationID; //Requires GLSL 4.0 or ARB_gpu_shader5
</source>

{{code|gl_PrimitiveIDIn}} is the current input primitive's ID, based on the number of primitives processed by this shader since the current rendering command started. {{code|gl_InvocationID}} is the current instance, as defined when [[#Instancing|instancing geometry shaders]].

== Outputs ==

Geometry shaders can output as many vertices as they wish (up to the maximum specified by the {{code|max_vertices}} layout specification). To provide this, output values in geometry shaders are not arrays. Instead, a function-based interface is used.

GS code writes all of the output values for a vertex, then calls {{code|EmitVertex()}}. This appends a vertex with those output values to the current output primitive. After calling this function, all output variables contain undefined values.

The GS defines what kind of primitive these vertex outputs represent. The GS can also end a primitive and start a new one, by calling the {{code|EndPrimitive()}} function. This does not emit a vertex.

To write two independent triangles from a GS, emit the first three vertices with {{code|EmitVertex()}}, then call {{code|EndPrimitive()}} to end the strip and start a new one, then emit three more vertices with {{code|EmitVertex()}}.
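The two-triangle procedure described above can be sketched as follows. This example assumes point input and uses a hypothetical helper, {{code|emitAt}}, with arbitrary clip-space offsets chosen only for illustration.

<source lang="glsl">
#version 330 core

layout(points) in;
// Two separate triangles need 6 vertices in total.
layout(triangle_strip, max_vertices = 6) out;

// Hypothetical helper: emit one vertex offset from the input point.
// Every output must be rewritten before each EmitVertex() call,
// because outputs are undefined after a vertex has been emitted.
void emitAt(vec2 offset)
{
    gl_Position = gl_in[0].gl_Position + vec4(offset, 0.0, 0.0);
    EmitVertex();
}

void main()
{
    // First triangle, to the left of the point.
    emitAt(vec2(-0.20, -0.10));
    emitAt(vec2(-0.10, -0.10));
    emitAt(vec2(-0.15,  0.10));
    EndPrimitive(); // ends the strip so the next vertex starts a new triangle

    // Second triangle, to the right of the point.
    emitAt(vec2( 0.10, -0.10));
    emitAt(vec2( 0.20, -0.10));
    emitAt(vec2( 0.15,  0.10));
    EndPrimitive();
}
</source>

Without the first {{code|EndPrimitive()}} call, the six vertices would form a single strip of four connected triangles instead of two independent ones.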

Output variables are defined as normal for GLSL. They can be grouped into interface blocks or be single values, as appropriate.

Many of the predefined outputs are grouped into an interface block called {{code|gl_PerVertex}}:

<source lang="glsl">
out gl_PerVertex
{
  vec4 gl_Position;
  float gl_PointSize;
  float gl_ClipDistance[];
};
</source>

Output variables can be defined with [[GLSL Type Qualifier#Interpolation_qualifiers|interpolation qualifiers]]. The [[Fragment Shader]] equivalent interface variables should define the same variables with the same qualifiers.

Certain predefined outputs have special meaning and semantics.

<source lang="glsl">
out int gl_PrimitiveID;
</source>

The primitive ID will be passed to the fragment shader. The primitive ID for a particular line/triangle will be taken from the [[Provoking Vertex|provoking vertex]] of that line/triangle, so make sure that you are writing the correct value for the right provoking vertex.

The meaning for this value is whatever you want it to be. However, if you want to match the standard OpenGL meaning (i.e., what the [[Fragment Shader]] would get if no GS were used), just do this for each vertex before emitting it:

<source lang="glsl">
gl_PrimitiveID = gl_PrimitiveIDIn;
</source>

=== Layered rendering ===

Layered rendering is the process of having the GS send specific primitives to different layers of a [[Framebuffer_Object#Layered_Images|layered framebuffer]]. This can be useful for doing cube-based shadow mapping, or even for rendering cube environment maps without having to render the entire scene multiple times.

Layered rendering in the GS works via two special output variables:

<source lang="glsl">
out int gl_Layer;
out int gl_ViewportIndex; //Requires GL 4.1 or ARB_viewport_array.
</source>

The {{code|gl_Layer}} output defines which layer in the layered image the primitive goes to. Each vertex in the primitive must get the same layer index. Note that when rendering to cubemap arrays, the {{code|gl_Layer}} value represents layer-faces (the faces within a layer), not the layers of cubemaps.

{{code|gl_ViewportIndex}}, which requires GL 4.1 or {{extref|viewport_array}}, specifies which viewport index to use with this primitive. As with {{code|gl_Layer}}, the viewport index must be specified with the provoking vertex.

Note that {{extref|viewport_array}}, while technically a 4.1 feature, is widely available on 3.3 hardware, from both NVIDIA and AMD.

Note that layered rendering can be more efficient with GS instancing, as the separate GS instances can be processed in parallel. However, while {{extref|viewport_array}} is often implemented in 3.3 hardware, no 3.3 hardware offers {{extref|gpu_shader5}}.
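As a hedged sketch of layered rendering without instancing, the shader below replicates each triangle onto the six faces of a cubemap attached as a layered framebuffer. It assumes the vertex shader writes world-space positions to {{code|gl_Position}}, and the uniform {{code|faceViewProj}} is a hypothetical array of per-face view-projection matrices supplied by the application.

<source lang="glsl">
#version 330 core

layout(triangles) in;
// 6 faces * 3 vertices per triangle.
layout(triangle_strip, max_vertices = 18) out;

// Hypothetical uniform: one view-projection matrix per cube face.
uniform mat4 faceViewProj[6];

void main()
{
    for (int face = 0; face < 6; ++face)
    {
        for (int i = 0; i < 3; ++i)
        {
            // Write the layer for every vertex: outputs become undefined
            // after EmitVertex(), and all vertices of a primitive should
            // carry the same layer index.
            gl_Layer    = face;
            gl_Position = faceViewProj[face] * gl_in[i].gl_Position;
            EmitVertex();
        }
        // One independent triangle per face.
        EndPrimitive();
    }
}
</source>

With GS instancing available, the outer loop could instead be replaced by six invocations that each use {{code|gl_InvocationID}} as the face index.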

=== Output streams ===

When using [[Transform Feedback]] to compute values, it is often useful to be able to send different sets of vertices to different buffers at different rates. For example, GSs can send vertex data to one stream, while building per-instance data in another stream. The vertex data and per-instance data will be of different lengths, written at different speeds.

Multiple stream output ''requires'' that the output primitive type be {{code|points}}. You can still take whatever input you prefer.

To provide this, output variables can be given a stream index with a layout qualifier:

 layout(stream = {{param|stream_index}}) out vec4 some_output;

The {{param|stream_index}} ranges from 0 to {{code|GL_MAX_VERTEX_STREAMS}} - 1.

A default value for the stream can be set with:

<source lang="glsl">
layout(stream = 2) out;
</source>

All following {{code|out}} variables will use stream 2 unless they specify a stream. The default can be changed later. The initial default is 0.

To write a vertex to a particular stream, the function {{code|EmitStreamVertex}} is used. This function takes a stream index; only the output variables assigned to that stream are written. Similarly, {{code|EndStreamPrimitive}} ends a particular stream's primitive. However, since multiple stream output requires using {{code|points}} primitives, the latter function is not very useful.

Only values passed to stream 0 will actually be rendered; the rest of the streams will only matter if transform feedback is being used. Calling {{code|EmitVertex}} or {{code|EndPrimitive}} is equivalent to calling their stream counterparts with stream 0.
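A minimal sketch of multi-stream output follows, assuming GLSL 4.00 and point input. The output names {{code|particlePosition}} and {{code|particleAge}} are hypothetical, and the values written are placeholders; capturing them would additionally require transform feedback to be configured by the application.

<source lang="glsl">
#version 400 core

layout(points) in;
// Multiple streams require point output. Two vertices are emitted in total.
layout(points, max_vertices = 2) out;

// Hypothetical outputs, each assigned to its own stream.
layout(stream = 0) out vec4 particlePosition;
layout(stream = 1) out float particleAge;

void main()
{
    // Stream 0: the only stream that is rasterized.
    gl_Position      = gl_in[0].gl_Position;
    particlePosition = gl_in[0].gl_Position;
    EmitStreamVertex(0);

    // Stream 1: only visible through transform feedback.
    particleAge = 1.0; // placeholder value
    EmitStreamVertex(1);
}
</source>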

== Output limitations ==

There are two competing limitations on the output of a geometry shader:

# The maximum number of vertices that a single invocation of a GS can output.
# The total maximum number of output components that a single invocation of a GS can output.

The first limit, defined by {{enum|GL_MAX_GEOMETRY_OUTPUT_VERTICES}}, is the maximum number that can be provided to the {{param|max_vertices}} output layout qualifier. No single geometry shader invocation can exceed this number.

The other limit, defined by {{enum|GL_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS}}, is the total amount of data that a single GS invocation can write, counted in components. In GLSL terms, a component is one element of a vector, so a {{code|float}} is one component and a {{code|vec3}} is 3 components. This is different from {{enum|GL_MAX_GEOMETRY_OUTPUT_COMPONENTS}} (the maximum allowed number of components in {{code|out}} variables). The total output component limit applies to the number of components per vertex multiplied by the number of vertices written.

For example, if the total output component count is 1024 (the smallest value that GL 4.3 permits for this limit), and each vertex writes 12 components, the total number of vertices that can be written is 1024/12 = 85, rounded down. This is the absolute hard limit to the number of vertices that can be written; even if {{enum|GL_MAX_GEOMETRY_OUTPUT_VERTICES}} is larger than 85, because each vertex takes up 12 components, the true maximum that this particular geometry shader can write is 85 vertices.
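The arithmetic above can be made concrete with a sketch of a shader that writes 12 components per vertex. It assumes an implementation whose {{enum|GL_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS}} is exactly 1024; the outputs {{code|color}} and {{code|extra}} and the zig-zag geometry are hypothetical.

<source lang="glsl">
#version 330 core

layout(points) in;

// Components per vertex: gl_Position (4) + color (4) + extra (4) = 12.
// With a total component limit of 1024: floor(1024 / 12) = 85 vertices.
layout(triangle_strip, max_vertices = 85) out;

out vec4 color; // hypothetical output, 4 components
out vec4 extra; // hypothetical output, 4 components

void main()
{
    // Emit a zig-zag strip of 85 vertices starting at the input point.
    for (int i = 0; i < 85; ++i)
    {
        float x = 0.01 * float(i);
        float y = (i % 2 == 0) ? 0.0 : 0.05;

        gl_Position = gl_in[0].gl_Position + vec4(x, y, 0.0, 0.0);
        color       = vec4(1.0);
        extra       = vec4(0.0);
        EmitVertex();
    }
    EndPrimitive();
}
</source>

Adding one more {{code|vec4}} output would raise the per-vertex cost to 16 components and lower the usable maximum to 1024/16 = 64 vertices under the same assumption.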

A Geometry Shader (GS) is a Shader program written in GLSL that governs the processing of primitives. It happens after primitive assembly, as an additional optional step in that part of the pipeline. A GS can create new primitives, unlike vertex shaders, which are limited to a 1:1 input to output ratio. A GS can also do layered rendering; this means that the GS can specifically say that a primitive is to be rendered to a particular layer of the framebuffer.

A geometry shader is optional and does not have to be used.

Note: While geometry shaders have had previous extensions like GL_EXT_geometry_shader4 and GL_ARB_geometry_shader4, these extensions expose the API and GLSL functionality in very different ways from the core feature. This page describes only the core feature.


Overview

Geometry shaders sit between Vertex Shaders (or the optional Tessellation stage) and the fixed-function Vertex Post-Processing stage. Vertex shaders have a 1:1 ratio of vertices input to vertices output. Each vertex shader invocation gets one vertex in and writes one vertex out.

Geometry shader invocations take a single Primitive as input and may output zero or more primitives. There are implementation-defined limits on how many primitives can be generated from a single GS invocation.

While the GS can be used to amplify geometry, implementing a form of tessellation, this is not the primary use for the feature. The general uses for GSs are:

Layered rendering: taking one primitive and rendering it to multiple images without having to change bound render targets and so forth.

Transform Feedback: This is often employed for doing computational tasks on the GPU (obviously pre-Compute Shader).

In OpenGL 4.0, GSs gained two new features. The first is the ability to write to multiple output streams. This is used exclusively with transform feedback, such that different feedback buffer sets can get different transform feedback data.

The other feature is GS instancing, which allows multiple invocations to operate over the same input primitive. This makes layered rendering easier to implement and possibly faster performing, as each layer's primitive(s) can be computed by a separate GS instance.

Primitive in/out specification

Each geometry shader is designed to accept a specific Primitive type as input and to output a specific primitive type. The accepted input primitive type is defined in the shader:

layout(input_primitive​) in;

The input_primitive​ type must match the primitive type used with the rendering command that renders with this shader program. The valid values for input_primitive​, along with the valid OpenGL primitive types, are:

{| class="wikitable"
! GS input !! OpenGL primitives !! vertex count
|-
| points || GL_POINTS || 1
|-
| lines || GL_LINES, GL_LINE_STRIP, GL_LINE_LOOP || 2
|-
| lines_adjacency || GL_LINES_ADJACENCY, GL_LINE_STRIP_ADJACENCY || 4
|-
| triangles || GL_TRIANGLES, GL_TRIANGLE_STRIP, GL_TRIANGLE_FAN || 3
|-
| triangles_adjacency || GL_TRIANGLES_ADJACENCY, GL_TRIANGLE_STRIP_ADJACENCY || 6
|}

The vertex count is the number of vertices that the GS receives per-input primitive.

The output primitive type is defined as follows:

layout(output_primitive​, max_vertices = vert_count​) out;

The output_primitive may be one of the following:

* points
* line_strip
* triangle_strip

These work exactly the same way their counterpart OpenGL rendering modes do. To output individual triangles or lines, simply use EndPrimitive​ (see below) after emitting each set of 3 or 2 vertices.

There must be a max_vertices​ declaration for the output. The number must be a compile-time constant, and it defines the maximum number of vertices that will be written by a single invocation of the GS. It may be no larger than the implementation-defined limit of MAX_GEOMETRY_OUTPUT_VERTICES. The minimum value for this limit is 256. See the limitations below.

Instancing

The GS can also be instanced (note that this is different from instanced rendering). This causes the GS to execute multiple times for the same primitive. This is useful for layered rendering and outputs to multiple streams (see below).

To use instancing, there must be an input layout qualifier:

layout(invocations = num_instances​) in;

The value of num_instances​ is a compile-time constant, and must not be larger than MAX_GEOMETRY_SHADER_INVOCATIONS (the minimum implementations will allow is 32). The built-in value gl_InvocationID​ specifies the particular instance of this shader.

The output primitives from instances are ordered by the gl_InvocationID​. So 2 primitives written from 3 instances will create a primitive stream of: (prim0, inst0), (prim0, inst1), (prim0, inst2), (prim1, inst0), ...

Inputs

Geometry shaders take a primitive as input; each primitive is composed of some number of vertices.

The outputs of the vertex shader (or Tessellation Stage, as appropriate) are thus fed to the GS as arrays of variables. These can be organized as individual values or as part of an interface block.

The predefined outputs from the prior pipeline stage are already arranged in an interface block.

Primitive inputs

gl_PrimitiveIDIn is the current input primitive's ID, based on the number of primitives processed by this shader since the current rendering command started. gl_InvocationID is the current instance, as defined when instancing geometry shaders.

Outputs

Geometry shaders can output as many vertices as they wish (up to the maximum specified by the max_vertices layout specification). To provide this, output values in geometry shaders are not arrays. Instead, a function-based interface is used.

GS code writes all of the output values for a vertex, then calls EmitVertex(). This appends a vertex with those output values to the current output primitive. After calling this function, all output variables contain undefined values.

The GS defines what kind of primitive these vertex outputs represent. The GS can also end a primitive and start a new one, by calling the EndPrimitive()​ function. This does not emit a vertex.

In order to write two independent triangles from a GS, you must write three separate vertices with EmitVertex()​ for the first three vertices, then call EndPrimitive()​ to end the strip and start a new one. Then you write three more vertices with EmitVertex()​.

Output variables are defined as normal for GLSL. They can be grouped into interface blocks or be single values, as appropriate.

Many of the predefined outputs are grouped into an interface block called gl_PerVertex​:

The primitive ID will be passed to the fragment shader. The primitive ID for a particular line/triangle will be taken from the provoking vertex of that line/triangle, so make sure that you are writing the correct value for the right provoking vertex.

The meaning for this value is whatever you want it to be. However, if you want to match the standard OpenGL meaning (i.e., what the Fragment Shader would get if no GS were used), just do this for each vertex before emitting it:

gl_PrimitiveID = gl_PrimitiveIDIn;

Layered rendering

Layered rendering is the process of having the GS send specific primitives to different layers of a layered framebuffer. This can be useful for doing cube-based shadow mapping, or even for rendering cube environment maps without having to render the entire scene multiple times.

The gl_Layer​ output defines which layer in the layered image the primitive goes to. Each vertex in the primitive must get the same layer index. Note that when rendering to cubemap arrays, the gl_Layer​ value represents layer-faces (the faces within a layer), not the layers of cubemaps.

gl_ViewportIndex​, which requires GL 4.1 or ARB_viewport_array, specifies which viewport index to use with this primitive. As with gl_Layer​, the viewport index must be specified with the provoking vertex.

Note that ARB_viewport_array, while technically a 4.1 feature, is widely available on 3.3 hardware, from both NVIDIA and AMD.

Note that layered rendering can be more efficient with GS instancing, as the separate GS instances can be processed in parallel. However, while ARB_viewport_array is often implemented in 3.3 hardware, no 3.3 hardware offers ARB_gpu_shader5.

Output streams

When using Transform Feedback to compute values, it is often useful to be able to send different sets of vertices to different buffers at different rates. For example, GS's can send vertex data to one stream, while building per-instance data in another stream. The vertex data and per-instance data will be of different lengths, written at different speeds.

Multiple stream output requires that the output primitive type be points​. You can still take whatever input you prefer.

To provide this, output variables can be given a stream index with a layout qualifier:

layout(stream = stream_index​) out vec4 some_output;

The stream_index​ ranges from 0 to GL_MAX_VERTEX_STREAMS​ - 1.

A default value for the stream can be set with:

layout(stream = 2) out;

All following out​ variables will use stream 2 unless they specify a stream. The default can be changed later. The initial default is 0.

To write a vertex to a particular stream, the function EmitStreamVertex is used. This function takes a stream index; only the output variables assigned to that stream are written. Similarly, EndStreamPrimitive ends a particular stream's primitive. However, since multiple stream output requires using points primitives, the latter function is not very useful.

Only values passed to stream 0 will actually be rendered; the rest of the streams will only matter if transform feedback is being used. Calling EmitVertex​ or EndPrimitive​ is equivalent to calling their stream counterparts with stream 0.

Output limitations

There are two competing limitations on the output of a geometry shader:

The maximum number of vertices that a single invocation of a GS can output.

The total maximum number of output components that a single invocation of a GS can output.

The first limit, defined by GL_MAX_GEOMETRY_OUTPUT_VERTICES, is the maximum number that can be provided to the max_vertices​ output layout qualifier. No single geometry shader invocation can exceed this number.

The other limit, defined by GL_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS, is the total amount of data that a single GS invocation can write, counted in components. In GLSL terms, a component is one element of a vector, so a float is one component and a vec3 is 3 components. This is different from GL_MAX_GEOMETRY_OUTPUT_COMPONENTS (the maximum allowed number of components in out variables). The total output component limit applies to the number of components per vertex multiplied by the number of vertices written.

For example, if the total output component count is 1024 (the smallest value that GL 4.3 permits for this limit), and each vertex writes 12 components, the total number of vertices that can be written is 1024/12 = 85, rounded down. This is the absolute hard limit to the number of vertices that can be written; even if GL_MAX_GEOMETRY_OUTPUT_VERTICES is larger than 85, because each vertex takes up 12 components, the true maximum that this particular geometry shader can write is 85 vertices.