Goodhart’s Law

Many organizations are choosing to do away with performance reviews because they are found to be ineffective. That they are done poorly by managers is one reason. Goodhart’s Law may be another.

This week I met with Jamil, a regional sales manager, to talk about measuring the impact of his training program, which focuses on developing behaviors that support better sales. Surprisingly, Jamil did not believe that metrics were valuable. At first, I thought he believed the training program or the people were so good that they did not need to be measured. However, it became clear that Jamil was struggling to set performance measures that actually got him what he wanted from the sales force.

I’ve said it myself. “What gets measured, gets done”. In other words, measuring a desired behavior tends to promote the desired behavior. However, there’s a dark side to this mantra. It’s possible that the measure actually becomes the target—and that’s all you get. This is known as Goodhart’s Law:

“When a measure becomes the target, it ceases to be a good measure”.

For those who need jokes explained:

You are striving to achieve a difficult-to-define goal, G.

You formulate G*, which is not G: it is simpler and more explicit than G. Still, G* is a legitimately correlated indicator of G being achieved.

Your team is given the target G*.

As time goes on, every means of achieving G* is pursued, because the system is set up so that a team member, a team, or an entire organization that aims at maximizing G* has a “competitive” advantage over those trying to juggle both G* and G. The advantage exists because G* is the performance metric that gets rewarded.

Eventually the correlation between G* and G completely breaks down.

You, like Jamil, might decide that measurement is hopeless.

Applied to this sales situation: Jamil indicated that one desirable behavior is making new account calls, and a common metric is the number of calls made in a day. This measure became the sole focus of the sales force, and Jamil saw a drop in revenue. With all attention on this one metric, there was likely little time spent building rapport with potential customers, because the motivation was to get on to the next call. The number of calls is what was being measured.
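For readers who like to see the mechanism in numbers, here is a minimal sketch of the calls example. Everything in it is an invented assumption for illustration, not data from Jamil’s team: the 240 minutes of selling time, the revenue per sale, and especially the `close_rate` curve, which simply assumes that very short calls almost never close.

```python
# Toy model of the calls-per-day metric. All numbers are hypothetical.
MINUTES_PER_DAY = 240        # assumed selling time per rep per day
REVENUE_PER_SALE = 1000.0    # assumed value of one closed account


def close_rate(minutes_per_call):
    """Assumed chance of closing: tiny for rushed calls, capped at 30%."""
    return min(0.30, 0.001 * minutes_per_call ** 2)


def day_outcome(minutes_per_call):
    """Return (calls made, expected revenue) for one rep in one day."""
    calls = MINUTES_PER_DAY / minutes_per_call                            # G*: the metric
    revenue = calls * close_rate(minutes_per_call) * REVENUE_PER_SALE    # G: the goal
    return calls, revenue


# Candidate call lengths in minutes, chosen only to illustrate the shape.
options = [2, 5, 10, 15, 25, 40]

# The call length that maximizes G* versus the one that maximizes G.
best_for_calls = max(options, key=lambda m: day_outcome(m)[0])
best_for_revenue = max(options, key=lambda m: day_outcome(m)[1])

for m in options:
    calls, revenue = day_outcome(m)
    print(f"{m:>2} min/call: {calls:6.1f} calls, expected revenue {revenue:8.0f}")
```

Under these assumptions, the call length that maximizes the number of calls is the shortest one, while the call length that maximizes expected revenue is considerably longer. A rep rewarded only on calls is pulled toward the first and away from the second.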

On the opposite end of the sales continuum, Jamil could simply measure the number of units sold. If I were the customer, there’s no one I’d rather buy from than a salesperson whose performance is measured by the number of units sold. These are the easiest people to negotiate with, because price is not the measure. “Well, I’ve got a deal for you”, they might say.

Few metrics are immune to Goodhart’s Law. A good test is to ask whether one can hit the target without adding any value to the system whatsoever. If so, the metric is susceptible. Pick any one of the performance metrics in your area, and see if it’s possible to get a perfect score on G* and not move the needle on G.

Some will get frustrated when this happens and choose to do away with metrics altogether, focusing solely on the “big G” goal. But that goal is really hard to define. It is hard to know whether the behaviors carried out day-to-day across an organization are actually moving you toward the goal (G), and how fast. Performance metrics are valuable leading indicators that tell us and others whether we are headed in the right direction, giving us time to adjust as needed. Without them, the workplace can be chaotic, with people moving in many directions and managers spending a lot of time wondering whether team members are making progress toward G fast enough and whether adjustments should be made.

There is no solution to Goodhart’s Law. So how do we deal with it?

For Jamil, the first step was to recognize that there is language to explain what he was feeling. This has been seen before. The second was to start recognizing that many organizations are more focused on G* than on G. Educational institutions and hospitals (think large, ambiguous goals with lots of behavioral measures) are classic places where Jamil found Goodhart’s Law in play. But look at yourself, your family, your team and your organization too. This awareness is powerful. The third step for Jamil was acceptance.

We invoke the spirit of the law rather than the letter of the law when we feel someone has gamed the system. This was Jamil’s attitude too. In our first conversation, Jamil wished that his team would understand and follow the spirit of the performance goals, not the letter. This, however, is a complex conversation that requires a lot of dialogue to understand the nuances, intentions, meaning, purpose and motivations. In a complex organization, alignment on all of these subtle points is nearly impossible. Groups that choose to have no performance reviews must follow this path.

Jamil has found that a “balanced scorecard” gets him closer to what he wants and helps develop great conversation. For example, on his balanced scorecard Jamil paired the number of new business calls with the number of follow-up calls to those clients, and the number of units sold with the profitability of those units. Think safety and speed. Think quantity and quality. The measures in each pair pull against one another. It is this tension that creates conversation about setting realistic goals for each metric. More measures of this type give you performance closer to “G”, closer to what you really desire. They also reduce the potential for abuse and give a better understanding of the tradeoffs made by focusing on one goal over another.
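One way to picture why paired measures are harder to game is the rough sketch below. It is not Jamil’s actual scorecard and not the formal Balanced Scorecard method. The metric names, targets and results are made up, and scoring a rep by their weakest metric is just one illustrative choice among many.

```python
# Hypothetical targets for the paired measures described above.
TARGETS = {
    "new_calls": 40,       # new business calls
    "follow_ups": 20,      # follow-up calls with those clients
    "units_sold": 30,      # units sold
    "margin": 0.20,        # profitability on those units
}


def attainment(actual, target):
    """Share of the target reached, capped at 1.0 so overshooting earns no extra credit."""
    return min(actual / target, 1.0)


def scorecard(actuals, targets=TARGETS):
    """Return per-metric attainment and an overall score equal to the weakest metric."""
    ratios = {name: attainment(actuals[name], targets[name]) for name in targets}
    return ratios, min(ratios.values())


# A rep who chases volume, and a rep who keeps the pairs in balance (invented numbers).
volume_chaser = {"new_calls": 80, "follow_ups": 2, "units_sold": 50, "margin": 0.05}
balanced_rep = {"new_calls": 36, "follow_ups": 18, "units_sold": 27, "margin": 0.18}

chaser_ratios, chaser_score = scorecard(volume_chaser)
balanced_ratios, balanced_score = scorecard(balanced_rep)

print("volume chaser:", chaser_ratios, "overall", round(chaser_score, 2))
print("balanced rep: ", balanced_ratios, "overall", round(balanced_score, 2))
```

On calls or units alone, the volume chaser looks like the star. Once each measure is paired with its counterweight and the weakest one sets the score, the balanced rep comes out ahead. The numbers themselves matter less than the conversation they force about which tradeoffs are acceptable.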

Any measure is imperfect. Some are better than others. This is not a perfect solution for Jamil, but he has brought his team in to discuss the impossible scorecard he has created. The conversation will be rich.

Look around your own space for those G* metrics that can be gamed. How could you balance each of those measures?