26. Cluster Culling Shading
This shader type has an execution environment similar to that of a compute shader, where a collection of shader invocations forms a workgroup and cooperates to perform coarse-level geometry culling and LOD selection. A shader invocation can emit a set of built-in output variables via the built-in function DispatchClusterHUAWEI(). The cluster culling shader organizes these emitted variables into a drawing command used by the subsequent rendering pipeline.
26.1. Cluster Culling Shader Input
The only inputs available to the cluster culling shader are variables identifying the specific workgroup and invocation.
26.2. Cluster Culling Shader Output
If a cluster survives culling in a cluster culling shader invocation, that invocation should emit a drawing command for the cluster so that it can be processed by the rest of the rendering pipeline. There are two types of drawing command: indexed and non-indexed. Both types consist of a set of built-in output variables whose definitions are similar to the members of VkDrawIndexedIndirectCommand and VkDrawIndirectCommand.
Cluster culling shaders have the following built-in output variables:
- built-in variable IndexCountHUAWEI is the number of vertices to draw in indexed mode.
- built-in variable VertexCountHUAWEI is the number of vertices to draw in non-indexed mode.
- built-in variable InstanceCountHUAWEI is the number of instances to draw.
- built-in variable FirstIndexHUAWEI is the base index within the index buffer.
- built-in variable FirstVertexHUAWEI is the index of the first vertex to draw.
- built-in variable VertexOffsetHUAWEI is the value added to the vertex index before indexing into the vertex buffer.
- built-in variable FirstInstanceHUAWEI is the instance ID of the first instance to draw.
- built-in variable ClusterIDHUAWEI is the index of the cluster being rendered by this drawing command. When the cluster culling shader is enabled, ClusterIDHUAWEI replaces gl_DrawID as the value passed to the vertex shader.
- built-in variable ClusterShadingRateHUAWEI is the shading rate of the cluster being rendered by this drawing command.
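As an illustration of how these outputs line up with the two indirect command structures, the following Python sketch groups the built-ins by drawing mode. The right-hand names are the members of VkDrawIndexedIndirectCommand and VkDrawIndirectCommand. The pairing of each built-in with a member is inferred from the matching descriptions above, and the dictionaries and the build_command helper are hypothetical names used only for this example, not part of any API.

```python
# Hypothetical model: which built-in outputs feed which command member.
# The pairing is inferred from the descriptions in the list above.
INDEXED_MODE = {
    "IndexCountHUAWEI": "indexCount",
    "InstanceCountHUAWEI": "instanceCount",
    "FirstIndexHUAWEI": "firstIndex",
    "VertexOffsetHUAWEI": "vertexOffset",
    "FirstInstanceHUAWEI": "firstInstance",
}

NON_INDEXED_MODE = {
    "VertexCountHUAWEI": "vertexCount",
    "InstanceCountHUAWEI": "instanceCount",
    "FirstVertexHUAWEI": "firstVertex",
    "FirstInstanceHUAWEI": "firstInstance",
}


def build_command(mode, outputs):
    """Translate a dict of built-in output values into command-style fields.

    Outputs that do not belong to the chosen mode (for example
    ClusterIDHUAWEI) are ignored by this sketch.
    """
    if mode == "indexed":
        mapping = INDEXED_MODE
    elif mode == "non-indexed":
        mapping = NON_INDEXED_MODE
    else:
        raise ValueError(f"unknown mode: {mode!r}")

    missing = [name for name in mapping if name not in outputs]
    if missing:
        raise ValueError(f"missing outputs for {mode} mode: {missing}")

    return {field: outputs[name] for name, field in mapping.items()}
```

For example, calling build_command("non-indexed", ...) with VertexCountHUAWEI, InstanceCountHUAWEI, FirstVertexHUAWEI and FirstInstanceHUAWEI set yields the four fields of a non-indexed draw, while the indexed variant requires the five indexed-mode outputs instead.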
26.3. Cluster Culling Shader Cluster Ordering
- When a cluster culling shader is used, all output clusters generated by DispatchClusterHUAWEI() in a given workgroup are passed to the subsequent pipeline stage before any cluster generated by a subsequent workgroup.
- Within a workgroup, the order of output clusters generated by DispatchClusterHUAWEI() is specified by the local invocation ID, from lower to higher values.
- If any cluster culling invocation in the workgroup does not call DispatchClusterHUAWEI(), no cluster will be sent to the subsequent rendering pipeline.
- Any cluster culling shader invocation may also call DispatchClusterHUAWEI() multiple times, as shown below:
// Cluster Culling Shader sample code:
        ......
    DispatchClusterHUAWEI();  // dispatch 0
        ......
    DispatchClusterHUAWEI();  // dispatch 1
        ......
    DispatchClusterHUAWEI();  // dispatch 2
        ......
In this case, the output sequence of clusters in a workgroup is as shown below (for a workgroup of 32 shader invocations):
1. shader invocation0.dispatch0
2. shader invocation1.dispatch0
            ..........
32. shader invocation31.dispatch0
33. shader invocation0.dispatch1
34. shader invocation1.dispatch1
            ..........
64. shader invocation31.dispatch1
65. shader invocation0.dispatch2
66. shader invocation1.dispatch2
            ..........
96. shader invocation31.dispatch2
26.4. Cluster Culling Shader Primitive Ordering
The following limited guarantees are provided for the relative ordering of primitives produced by a cluster culling shader:
- The order of primitives in a given cluster depends on the type of drawing command emitted:
  - DispatchClusterHUAWEI() with indexed output built-in variables: primitives are ordered by their vertices, sourced from lower index buffer addresses to higher addresses.
  - DispatchClusterHUAWEI() with non-indexed output built-in variables: primitives are ordered by their vertices, from a lower numbered vertexIndex to a higher numbered vertexIndex.
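The cluster ordering of section 26.3 and the per-cluster vertex ordering described here can be modelled with a short Python sketch. It is a simplified model, not an implementation: it assumes every invocation calls DispatchClusterHUAWEI() the same number of times, it represents the index buffer as a plain list, and all function names are hypothetical.

```python
def cluster_order(num_workgroups, invocations_per_workgroup, dispatches_per_invocation):
    """Return (workgroup, local_invocation, dispatch) tuples in output order.

    Assumption of this sketch: every invocation issues the same number
    of dispatches. Clusters from an earlier workgroup come first; within
    a workgroup, all "dispatch 0" clusters are emitted in local invocation
    order, then all "dispatch 1" clusters, and so on.
    """
    order = []
    for workgroup in range(num_workgroups):
        for dispatch in range(dispatches_per_invocation):
            for invocation in range(invocations_per_workgroup):
                order.append((workgroup, invocation, dispatch))
    return order


def indexed_vertex_order(index_buffer, first_index, index_count, vertex_offset=0):
    """Indexed mode: walk the index buffer from lower to higher positions."""
    window = index_buffer[first_index:first_index + index_count]
    return [index + vertex_offset for index in window]


def non_indexed_vertex_order(first_vertex, vertex_count):
    """Non-indexed mode: vertex indices in increasing order."""
    return list(range(first_vertex, first_vertex + vertex_count))
```

With one workgroup of 32 invocations and three dispatches each, cluster_order returns 96 entries in the same sequence as the numbered listing in section 26.3.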