Another thought is versioning. Many versions of the same API are likely to evolve over time. It may make things easier for developers if the function hash included a version number. The version number would be the same for all functions declared in the same namespace. This version number could be appended to the function signature.
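To make the idea concrete, here is a rough sketch of what appending a namespace-wide version to the signature could look like. Everything specific in it is an assumption for illustration: the "@" separator, the 4-byte truncation, the example signature, and the use of Python's hashlib.sha3_256 as a stand-in hash (it is NIST SHA-3, which does not produce the same digests as the Keccak variant Ethereum uses).

```python
import hashlib

# Hypothetical: one version number shared by every function in a namespace.
NAMESPACE_VERSION = 2


def function_hash(signature, version=None):
    """Return a short hash of a function signature, optionally versioned.

    The "@<version>" suffix and the 4-byte truncation are assumptions made
    for this sketch, not part of any agreed scheme.
    """
    if version is not None:
        signature = f"{signature}@{version}"
    # Stand-in hash: NIST SHA3-256, NOT the Keccak variant used by Ethereum.
    return hashlib.sha3_256(signature.encode("ascii")).digest()[:4]


sig = "transfer(address,uint256)"
unversioned = function_hash(sig)
versioned = function_hash(sig, NAMESPACE_VERSION)
print(unversioned.hex(), versioned.hex())
```

With a scheme like this, bumping the namespace version changes the hash of every function in it, so a caller built against an old version would fail to match rather than silently hit a function with changed semantics.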

Since you apparently have a uint8, why not represent bool with that? In fact, I wonder whether N%8==0 isn't somewhat arbitrary given how the EVM works; people might still end up taking uint8s and manually splitting them into 8 bools. Note that uint40 fits that scheme.
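For illustration, this is roughly the manual packing people might end up doing: 8 bools squeezed into one uint8 using only multiplication, division and modulo. The helper names are made up for this sketch.

```python
def pack_bools(flags):
    """Pack up to 8 booleans into one uint8, flag i going into bit i."""
    if len(flags) > 8:
        raise ValueError("a uint8 holds at most 8 bools")
    packed = 0
    for i, flag in enumerate(flags):
        if flag:
            packed += 2**i
    return packed


def unpack_bool(packed, i):
    """Read bit i back out: (packed / 2**i) % 2."""
    return (packed // 2**i) % 2 == 1


flags = [True, False, True, True, False, False, False, True]
packed = pack_bools(flags)
print(packed)  # 1 + 4 + 8 + 128 = 141
print([unpack_bool(packed, i) for i in range(8)] == flags)  # True
```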

I did notice that, as-is, the first and last N bits of a slot are the cheapest to select: x%2**N and x/2**(256-N), compared to (x/2**N)%2**L for L bits in the middle. Of course, if you add opcodes to chop slots up, the byte-sized approach is faster.
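The three selections can be checked quickly in Python, where integers are arbitrary precision, so a 256-bit slot is just an int. This only sketches the arithmetic; it says nothing about actual EVM gas costs, and the function names are made up.

```python
WORD_BITS = 256  # size of one slot


def low_bits(x, n):
    """Lowest n bits of the slot: x % 2**n (one operation)."""
    return x % 2**n


def high_bits(x, n):
    """Highest n bits of the slot: x / 2**(256 - n) (one operation)."""
    return x // 2**(WORD_BITS - n)


def middle_bits(x, n, l):
    """l bits starting n bits from the bottom: (x / 2**n) % 2**l (two operations)."""
    return (x // 2**n) % 2**l


# A slot with 0xAB in the top byte, 0xCD in bits 16..23 and 0xEF in the bottom byte.
word = (0xAB << 248) | (0xCD << 16) | 0xEF
print(hex(low_bits(word, 8)))         # 0xef
print(hex(high_bits(word, 8)))        # 0xab
print(hex(middle_bits(word, 16, 8)))  # 0xcd
```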

Also, you never set calldataload; you do set memory/storage slots. You probably only want to chop storage slots up, though. Still, to set one you first need to access the existing value and strip the bits you are about to replace, so there are costs there too. At the beginning of the slot (the lowest N bits):

contract.storage[i] = (contract.storage[i]/2**N)*2**N + value_N_bits

In the middle (L bits starting at bit N):

contract.storage[i] = (contract.storage[i]/2**(N+L))*2**(N+L) + value_L_bits*2**N + contract.storage[i]%2**N
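A small sketch of the read-modify-write this implies, with plain Python ints standing in for a storage slot. The function names are made up; in the middle case the sketch clears the N+L lowest bits, shifts the new L-bit value up by N bits, and adds the untouched lowest N bits back.

```python
def set_low_bits(slot, n, value):
    """Replace the lowest n bits of slot with value."""
    if not 0 <= value < 2**n:
        raise ValueError("value does not fit in n bits")
    return (slot // 2**n) * 2**n + value


def set_middle_bits(slot, n, l, value):
    """Replace l bits of slot, starting n bits from the bottom, with value."""
    if not 0 <= value < 2**l:
        raise ValueError("value does not fit in l bits")
    upper = (slot // 2**(n + l)) * 2**(n + l)  # everything above the field
    lower = slot % 2**n                        # everything below the field
    return upper + value * 2**n + lower


slot = 0x112233
print(hex(set_low_bits(slot, 8, 0xFF)))        # 0x1122ff
print(hex(set_middle_bits(slot, 8, 8, 0xFF)))  # 0x11ff33
```

Either way the existing slot has to be read first, which is where the extra cost comes from.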

Tbh though, it seems to me that just having opcodes for selecting bytes and setting selected bytes is the way to go, unless I am missing something. Then this whole story can be forgotten.