Normally, SIMD8 tessellation evaluation shaders operate on a single
patch, with each channel operating on a different vertex within the
patch. For patch primitives with fewer than 8 vertices, this means
that some of the channels are disabled, effectively wasting compute
power.  This is a fairly common case: 3-vertex (triangle) and 4-vertex
(quad) patches each leave at least half of the channels idle.
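
To make the waste concrete, here is a minimal sketch of the arithmetic,
assuming one patch per SIMD8 thread and one vertex per channel as
described above. The names SIMD_WIDTH, idle_channels() and
utilization_percent() are hypothetical helpers made up for this
illustration; they are not part of the driver.

```c
#include <assert.h>

/* Assumption for this sketch: 8 channels per thread (SIMD8). */
#define SIMD_WIDTH 8

/* Number of channels left disabled when a single patch with
 * `patch_vertices` vertices occupies one SIMD8 thread, one vertex
 * per channel.  Hypothetical helper, only valid for 1..8 vertices.
 */
static int
idle_channels(int patch_vertices)
{
   assert(patch_vertices >= 1 && patch_vertices <= SIMD_WIDTH);
   return SIMD_WIDTH - patch_vertices;
}

/* Percentage of channels doing useful work, rounded down. */
static int
utilization_percent(int patch_vertices)
{
   assert(patch_vertices >= 1 && patch_vertices <= SIMD_WIDTH);
   return 100 * patch_vertices / SIMD_WIDTH;
}
```

Under these assumptions, a 3-vertex patch leaves idle_channels(3) == 5
channels disabled (37% utilization, rounded down), and a 4-vertex patch
leaves 4 disabled (50%).  Only an 8-vertex patch fills the thread.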