Input: Fixed a bug that caused non-keyboard buttons to be detected as keyboard input

GX2: Improved implementation of GX2CalcTVSize() and GX2CalcDRCSize()
GX2: Shader code optimizations. Up to 40% faster compile time for float-only shaders (measured on NVIDIA)
GX2: Added support for shader OP3 CNDGT_INT instruction
GX2: Added support for vertex format FMT_16_16_16_16, nfa=0, signed=0
GX2: Fixed software streamout reading format 32_32_32_FLOAT incorrectly
GX2: Added support for vertex shader gl_PointSize export
GX2: Fixed a race condition in which the GPU7 command processor could run ahead of the current write pointer before GX2Init() was called
GX2: Fixed sampler min and mag filter values being read from the wrong register bits
GX2: Added support for streamout binding the same buffer as input and output