I didn’t catch this because my hand-calculated value of -36.0 was also wrong; it should have been -49.0. Now all tests pass:

[info] + penalize having gaps between the columns

It feels more logical now, but the penalty still isn’t nudging the game toward actually scoring or building the right setup. First, I want to split the utility function into a reward component and a penalty component so each is easier to test. Second, instead of the gaps, let’s try penalizing the heights in general.
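
To make the plan concrete, here is a minimal sketch of that split, written in Python for brevity. Everything in it is an assumption for illustration, not the project's actual code: the names `reward`, `penalty`, and `utility`, the choice to reward one point per cleared line, and the choice to penalize the sum of squared column heights.

```python
from typing import Sequence


def reward(lines_cleared: int) -> float:
    """Hypothetical reward: one point per cleared line."""
    return float(lines_cleared)


def penalty(heights: Sequence[int]) -> float:
    """Hypothetical penalty: sum of squared column heights.

    Squaring makes one tall column cost more than the same number
    of blocks spread across several short columns.
    """
    return float(sum(h * h for h in heights))


def utility(lines_cleared: int, heights: Sequence[int]) -> float:
    """Utility is reward minus penalty, so each part can be tested alone."""
    return reward(lines_cleared) - penalty(heights)
```

With the two parts separated, a test can pin down the penalty for a given set of column heights without having to account for the reward, and the reverse.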