http://www.greenbridge.com
Thu, 17 Apr 2014 19:47:20 +0000

Reinhart and Rogoff story shows how important it is to check your indicators
http://misleadingindicators.com/?p=1081
Tue, 30 Apr 2013 21:34:56 +0000, Phil Green

This month two academic researchers showed that there were errors in the calculations behind the claim by economists Carmen Reinhart and Kenneth Rogoff that the relationship between government debt and real GDP growth is weak for debt/GDP ratios below a threshold of 90 percent of GDP, and that, above 90 percent, median growth rates fall by one percent and average growth falls considerably more.

Reinhart and Rogoff’s finding has been cited widely during the economic crisis of the last few years. Turns out there were some simple spreadsheet errors.

It does not matter how many degrees you have or what prestigious institution you work for: errors in indicators still happen, and they are common. Rogoff works at Harvard University and Reinhart at the University of Maryland. Their paper was published by the National Bureau of Economic Research.

Something similar happened when scientists were trying to measure ozone concentrations over Antarctica in the 1980s. NASA scientists had programmed their satellite to flag measurements under 180 Dobson units (a measure of ozone) as possible errors. They originally excluded them. Meanwhile scientists at the Amundsen-Scott ground station at the South Pole were reporting measurements of 300 Dobson units. The NASA scientists figured their satellite was reading incorrectly. In fact it was the ground station that was making the measurement error. These problems combined to delay the discovery of the ozone hole by about eight years.

Top ten reasons for reading misLeading Indicators: How to Reliably Measure your Business
http://misleadingindicators.com/?p=1066
Fri, 29 Mar 2013 21:02:22 +0000, Phil Green

There are many books that tell you what to measure to succeed in a business strategy or to improve business performance. Ideas on what to measure may indeed be very useful. But these ideas won’t help you, and may even be harmful, if you do not know whether what you are measuring is misleading you.

misLeading Indicators reveals the hidden and potentially misleading nature of indicators that can make or break a business.

Here are our top ten reasons you should read misLeading Indicators:

It will provide you with four clear principles for determining which indicators and measurements can be trusted, and which mislead; it illustrates these principles with many indicators from across a wide spectrum of businesses and functions.

It shows how common measurement clichés (e.g. “you can’t manage what you can’t measure”) and metaphors (e.g. automobile and airplane dashboards) can lead you to make misleading interpretations of indicators.

It will show you how to determine whether indicators based on counts are reliable (e.g. inventory, opinion polling).

It will show you how to determine whether indicators based on instrument measurements are accurate and precise (e.g. process measurements, temperature).

It will show you how to determine whether rankings and ratings are reliable (e.g. customer satisfaction ratings, audit scores).

It will show you how people, knowingly or unknowingly, manipulate and modify indicators to make them misleading.

It will show you techniques to develop indicators that focus employee efforts and will show how some indicators, by oversimplifying and glitzing up information displays, misdirect their efforts.

It will show you how common indicators of time series, such as Statistical Process Control charts, are misleadingly explained, justified and interpreted.

It will show you how averages can distort indicators and mislead about the underlying data.

It will show you why probability, and thus risk, cannot be measured, and why attempts to create measures of risk so often lead to spectacular failures such as mine explosions and business failures.

How Western Electric rules mislead in statistical process control
http://misleadingindicators.com/?p=1018
Tue, 26 Feb 2013 17:53:12 +0000, Phil Green

The statistical model behind control charts for in-control processes assumes a Gaussian process with no autocorrelation (i.e. independent observations), a constant mean and a constant variance: in other words, a white noise process. The various Western Electric rules try to find patterns that are not white noise, and thus show that the process is out of control.

It is quite easy to do a simple experiment to illustrate the flaw in the Western Electric rules. Generate, using some statistical software, several columns of Normal “random” numbers. Then apply the Western Electric rules. You will see that most of the columns fail the Western Electric tests, even though they are by definition “white noise.” For example, I generated 10 columns of n=100, with a Normal distribution, and used Minitab to plot I-MR charts and apply all tests. All 10 columns failed at least 1 test.

Individuals SPC chart on simulated Gaussian data. It failed four of the Western Electric rules.
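
The experiment can be sketched in a few lines of Python. This is a minimal illustration, not the Minitab procedure used above: it checks only the four classic Western Electric zone rules, and it assumes the true mean (0) and standard deviation (1) are known rather than estimated from the data as an I-MR chart would do. The function names `western_electric_violations` and `simulate` are hypothetical, chosen for this sketch.

```python
import random


def western_electric_violations(xs, mean=0.0, sigma=1.0):
    """Return the set of rule numbers (1-4) that fire anywhere in xs.

    Assumes the centreline (mean) and sigma are known in advance,
    which is a simplification made for this sketch.
    """
    z = [(x - mean) / sigma for x in xs]
    fired = set()
    for i in range(len(z)):
        # Rule 1: a single point beyond 3 sigma.
        if abs(z[i]) > 3:
            fired.add(1)
        for sign in (1, -1):
            # Rule 2: 2 of 3 consecutive points beyond 2 sigma, same side.
            if i >= 2 and sum(sign * v > 2 for v in z[i - 2:i + 1]) >= 2:
                fired.add(2)
            # Rule 3: 4 of 5 consecutive points beyond 1 sigma, same side.
            if i >= 4 and sum(sign * v > 1 for v in z[i - 4:i + 1]) >= 4:
                fired.add(3)
            # Rule 4: 8 consecutive points on the same side of the centreline.
            if i >= 7 and all(sign * v > 0 for v in z[i - 7:i + 1]):
                fired.add(4)
    return fired


def simulate(columns=10, n=100, seed=1):
    """Generate white-noise columns and list the rules each one triggers."""
    rng = random.Random(seed)
    results = []
    for _ in range(columns):
        data = [rng.gauss(0.0, 1.0) for _ in range(n)]
        results.append(western_electric_violations(data))
    return results


if __name__ == "__main__":
    results = simulate()
    failing = sum(1 for fired in results if fired)
    print(f"{failing} of {len(results)} white-noise columns triggered at least one rule")
```

Because fewer tests are applied here and the limits are not estimated from the data, the share of columns that trigger a rule will differ from the Minitab run, but you should still expect a good share of pure white-noise columns to trigger at least one rule.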

The Western Electric rules in this experiment conclude that all 10 columns are out-of-control, or “non random” (whatever that is), even though I generated the data with a so-called random number generator. The rules state (see here for example) that the probability of an out-of-control signal from a process we know to be in control is very small. For example, the probability that eight points in a row will be on the same side of the centreline is (supposed to be) 1/256. What went wrong?

Several things went wrong. The probability the rules are based on is the probability that the next seven (or whatever) future points fall into some particular pattern (for example, they are all above the mean). These are points that have not even happened yet. This is rarely acknowledged when people explain the rules.

This (false) probability is not useful to someone controlling a process. What is useful is the probability that the process is out of control, given the measurements you already have (and your knowledge about the process and how it works).

Statistical Process Control and six-sigma promoters turn around and pretend that these probabilities are the same. But the probability that you will find such-and-such a pattern on a control chart, using measurements that you already have, is most definitely not equal to the probability that the pattern will occur in the next few measurements. To calculate the probability that particular points that have already been measured indicate an “out-of-control” process, one would have to use a very different procedure than the one used by the Western Electric rules. Our simulation example illustrates this.

The second thing that went wrong is that there are multiple tests on the same data. This changes the probability that there will be a “false alarm,” in other words, a signal that the process is out of control when it is not.

The third thing that went wrong, or that often goes wrong, is the way people interpret the probability. Take the simple rule that says a process is out of control if a measurement goes outside a three standard deviation control limit. The chance of that happening in the next measurement, for a white noise process, is about 0.27%. On average, if we had 10,000 data points, we would expect about 27 to be outside these limits. In another experiment, I generated 100 columns of n=100 Normal (i.e. Gaussian) data, or 10,000 data points and ran that test. There were 27 instances where a point was outside the limit, as expected. These 27 instances came from 24 different columns. It would be tempting to infer that this means that 24 out of the 100 columns were “out of control.” But that is not what it means at all.
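
The arithmetic behind this example can be checked directly. The sketch below assumes independent standard Normal points, as in the simulation, and the variable names are ours for illustration. It computes the chance that a single point falls outside the three-sigma limits, the expected number of such points among 10,000, and the chance that a column of 100 in-control points contains at least one.

```python
import math


def two_sided_tail(k):
    """P(|Z| > k) for a standard Normal variable Z."""
    return math.erfc(k / math.sqrt(2))


# Chance that one white-noise point lands outside the 3-sigma limits.
p_point = two_sided_tail(3.0)

# Expected number of such points among 10,000 independent points.
expected_points = 10_000 * p_point

# Chance that a column of 100 independent points has at least one.
p_column = 1 - (1 - p_point) ** 100

# Expected number of flagged columns among 100 in-control columns.
expected_columns = 100 * p_column

print(f"P(point outside limits)      = {p_point:.4%}")
print(f"expected points in 10,000    = {expected_points:.1f}")
print(f"P(column of 100 is flagged)  = {p_column:.1%}")
print(f"expected flagged columns/100 = {expected_columns:.1f}")
```

Under these assumptions roughly a quarter of perfectly in-control columns of 100 points are expected to show at least one point outside the limits, so 24 flagged columns out of 100 is about what white noise alone produces.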

Philip Green and George Gabor are co-authors of misLeading Indicators: How to Reliably Measure Your Business, published by Praeger. www.misleadingindicators.com

The Department of Justice will have a tough time proving that Standard and Poors “inflated” its bond ratings.
http://misleadingindicators.com/?p=1004
Tue, 19 Feb 2013 18:19:15 +0000, Phil Green

The US Department of Justice has launched a civil lawsuit against the rating agency Standard & Poor’s, alleging “S&P issued inflated ratings that misrepresented the securities’ true credit risks” and that “S&P falsely represented that its ratings were objective, independent, and uninfluenced by S&P’s relationships with investment banks when, in actuality, S&P’s desire for increased revenue and market share led it to favor the interests of these banks over investors.” The suit targets S&P’s ratings of Collateralized Debt Obligations (CDOs) and Residential Mortgage-Backed Securities (RMBS).

S&P was the only rating agency to downgrade the US government’s debt in 2011, from AAA to AA+. Does being the only agency that downgraded US debt imply that S&P “deflated” its rating and that the other agencies were “accurate”? Or does it mean that the other agencies “inflated” theirs and S&P was accurate? Or neither?

A rating is like a guess that helps to answer the question “what is the chance that the issuer of this security will default?” Professor Matthew Sands, in a recorded delivery of one of the Richard Feynman lectures on physics at the California Institute of Technology on October 13, 1961, described guessing and chance like this: “by chance, we mean something like a guess. But why do we make guesses? We make guesses when we want to make a judgment when we have incomplete information or we have uncertain knowledge.”

Honest ratings are educated guesses made after carefully considering information about the issuer and evaluating it according to proprietary methods used by the rating agencies. They are not measurements or statements of fact. The agencies do not have, and cannot possibly have, complete information about the issuers and what the economy will do.

An educated guess is wrong if what happens is different from the guess, but that does not necessarily make it a bad guess. If you look at the weather in the morning and see dark clouds, it is reasonable to guess that it will rain and to bring your umbrella. If it does not rain, your guess was wrong, even though it was the right guess to make at the time given the (incomplete) information you had when you made it.

Ideally, two people with the same information should make the same guess. Suppose you made your guess about the weather based solely on your observation of the clouds. The weatherman would probably have made a different guess because he has much more information about the weather, and more knowledge about how to make guesses with it. But two weathermen with the same information should, ideally, make the same guess, even if they analyze it and reason with it differently, just as two neighbors who are looking at the same clouds should both guess that it will rain. (We discuss this at length in our book, misLeading Indicators).

But obviously the world is not ideal. Different people have different information, and reason with it differently to make different guesses, and different bond ratings.

To prove that S&P inflated its rating, the DOJ will have to prove two things. First, that the rating was not reasonable given the information available to S&P at the time they made it. The fact that the RMBSs and CDOs imploded is not evidence that the ratings were not reasonable, any more than the absence of rain proves your guess that it would rain was unreasonable. Second, the government will have to prove that the ratings were a deliberate attempt to mislead, rather than the product of the inherent difficulties of guessing with incomplete information in a non-ideal world.

This will prove to be difficult. S&P has issued a statement saying that “every CDO cited by the DOJ also independently received the same rating from another rating agency.” When S&P downgraded US debt in 2011, Daniel Indiviglio, then associate editor of The Atlantic, criticized S&P for being too hard with its rating.

But the DOJ has not since charged S&P for deflated ratings. Nor has the DOJ charged anyone for issuing inflated stock market ratings. There is no shortage of candidates.

A study by Weiss Ratings, an independent company that issues ratings on financial institutions, analyzed “buy,” “sell,” and “hold” ratings issued to companies that had gone bankrupt. The company looked at a total of fifty investment banking and brokerage firms that had issued ratings to nineteen companies that filed for Chapter 11 in the first four months of 2002. The Weiss study found that “ratings publicly available from 94% of Wall Street firms continued to recommend that investors buy or hold shares in companies that went bankrupt in 2002, right up to the very day these companies filed for Chapter 11.”

That’s like giving AAA ratings to bonds right until the day they tank. Bond ratings have a far superior track record, as the chart below shows.

It is a trivial fact that S&P was wrong with some RMBS and CDOs, as were many others. Using this as a basis for a legal action is wholly arbitrary, especially given that the DOJ ignores many other ratings that are also wrong. If bond rating agencies and weathermen face legal action for making guesses that turn out wrong, they’ll just stop making them.

The bond trader’s fallacy
http://misleadingindicators.com/?p=986
Mon, 28 Jan 2013 22:25:45 +0000, Phil Green

A few weeks ago one of us was sitting beside a retired bond trader at a luncheon. Both interlocutors being interested in probability and its applications in business, there ensued a disagreement about whether in a sequence of coin flips the flips were independent. The bond trader argued that anyone who believed they were not independent was guilty of the gambler’s fallacy.

Now we would hope that a bond trader would know what the gambler’s fallacy is and how to avoid it while trading profitably in bonds—usually with other people’s money—rather than trading like a gambler suffering from fallacious thinking.

The gambler’s fallacy goes like this. In games of chance gamblers sometimes mistakenly think that the result they are betting on must come soon because what has recently happened has departed from what they expected to happen. For example, in a game involving the rolling of a die, the gambler is betting on a roll of one. A one has not been rolled for many rolls, a much longer run than the 5/6 chance of a non-one on each roll would lead him to expect, so the gambler reasons that a roll of one is “due” and increases his bet.

There are two flaws in the gambler’s reasoning. The first is thinking he “knows” the probability of getting a one. Probability is not a physical property of the die that he can know. The probability is something he should assign given his state of knowledge about the die—including the outcomes of past rolls—and the way it is being rolled.

The second flaw is that he thinks that the probability of rolling a one increases because there has not been a one for a while. The gambler is falsely reasoning that the die too must somehow be counting, and that it will somehow “know” when it’s time to roll a one.

In a simple game of chance in which long-term outcomes are consistent and conform to predictions, the first flaw is logically serious but practically insignificant. If somehow, through some kind of divine intervention, we actually “knew” the probability of an outcome, then, and only then, would the rolls of the die be logically and physically independent. You would not be able to learn anything from past rolls that would enable you to improve your inferences about future rolls. You “know” the probability, so the past, the data, can’t help you any further. Thus the logical independence of future rolls from past rolls.

In business and life there cannot even be a pretence that we “know” the probability. We deal with uncertainty by assigning probabilities to outcomes based on our state of knowledge at the time. Probability is never “known,” and probability assignments are continually updated based on new information.

We’ll modify our bent coin example slightly (from our last post). Somebody hands you a bent coin. All you know about it is that it is bent and that it has two sides. You have no choice but to assign a probability of ½ to getting a head. Then you start flipping, curious whether it will land more often heads or tails. Every new flip tells you something new about the coin and the experiment. Let’s say your first few flips produce heads: it’s reasonable then to assign a slightly higher probability to heads than tails. As you keep flipping you notice there is roughly a two to one ratio of heads to tails. By that time, it makes sense for you to assign a 2/3 probability for heads. You could reasonably infer that if you kept flipping you would get heads and tails in approximately a two to one ratio.

This shows that the flips are not logically independent because you learn from your past flips to make inferences about future flips. The flips are, though, physically independent because the flips have no physical effect on each other. Independence, like probability itself, is a logical, not a physical, concept.
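
One way to make this updating concrete is Laplace’s rule of succession. The sketch below assumes a uniform prior over the coin’s unknown heads frequency, which is our choice for illustration and not something the example above specifies; the function name `prob_heads` is hypothetical. A different prior would move the assignment faster or slower.

```python
from fractions import Fraction


def prob_heads(heads, tails):
    """Probability assigned to heads on the next flip, given the flips so far.

    Assumes a uniform prior over the coin's heads frequency (an assumption
    made for this sketch), which gives Laplace's rule of succession.
    """
    return Fraction(heads + 1, heads + tails + 2)


# Before any flips the assignment is 1/2.
print(prob_heads(0, 0))

# After a few heads in a row the assignment for heads rises above 1/2.
print(prob_heads(3, 0))

# After 20 heads and 10 tails the assignment has moved close to 2/3.
print(prob_heads(20, 10), float(prob_heads(20, 10)))
```

The assignment for the next flip depends on the flips already seen, which is exactly what it means for the flips not to be logically independent.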

The bond trader was wrong.

Bond rating agencies assign probabilities in the form of ratings. The ratings are based on what they know about the bond issuer, the economy and so on. As they learn new things, they review, and sometimes update, their ratings—just like you would do if you were flipping a coin.

Bonds are trickier than bent coins though—it is as if the coin keeps bending in new ways. For this reason the frequency of defaults for any given rating will vary. Between 1970 and 1999, the default rates for speculative grade securities rated by Moody’s dropped from 9 percent in 1970, down to around 1 percent between 1971 and 1981. They climbed up to over 10 percent by 1991, dropped to 2 percent, and slowly started climbing again. With every major swing, there were systemic changes in the economy. In 1991, for example, junk bond financing was very tight and overall macroeconomic conditions were weak, causing higher defaults of junk bonds.

If we treated bond defaults of similarly-rated bonds as independent events with known probabilities we would, in essence, be deciding that we cannot learn anything from the defaults that would help us make useful inferences about bonds that have not yet defaulted. And that would be a very serious mistake.

Lance Armstrong doping case and bond defaults show challenges of probabilistic reasoning
http://misleadingindicators.com/?p=967
Mon, 28 Jan 2013 21:30:02 +0000, Phil Green

In an earlier blog post (here) we wrote that, given the evidence available to us at the time, Lance Armstrong was probably not guilty of doping. The main line of our argument was that hundreds of doping tests from certified laboratories using accepted procedures had not found dope. We restricted the evidence to this, in the same way a court of law restricts the evidence that can be used to determine the outcome of a trial.

Lance Armstrong. Photo: Brad Cambert/Shutterstock.com

We restricted the evidence for two reasons. First, the definition of “doping” is that an athlete fails the test on both halves of a split sample. If an athlete passes, anti-doping officials should leave him or her alone or athletes will never be able to rise above suspicion. Second, the evidence we excluded was weak anyway (the testimony of other athletes, and EPO tests that could not be reproduced).

We believe that we made the right inference—a restricted inference—about Armstrong at the time. In other words, it was the right inference to make with the information we had and chose to use. Then Armstrong confessed to doping. So we were inferentially right and factually wrong. There is no contradiction here—it happens all the time, from flipping coins to rating bonds.

Suppose someone hands you a bent coin and tells you that he just flipped it 1,000 times. It came up 1/3 of the time one way (either heads or tails) and 2/3 the other way (either tails or heads). But, crucially, he does not tell you which. Now, he asks, what probability would you assign to getting a head in the next flip? (We’re assuming for the sake of this argument that there is no attempt to control or direct the outcome of the coin flipping: that is, it is flipped in the usual, vigorous way.)

What would you answer? You are missing some key information. You have no choice but to answer ½—even though you know that if you flip the coin another 1,000 times you will not get some number of heads close to 500 but closer to either 333 or 667! The assignment of a ½ probability is inferentially right even though you know that it does not correspond to the outcome you will get if you flip the coin many times.

There is no need to flip the bent coin another 1,000 times before updating your probability assignment of ½. After each flip you can—should—update your assignment of probability. It won’t take long for you to reach a new probability assignment of either 1/3 or 2/3. That does not make the original ½ assignment wrong—it was the right conclusion at the time.
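
This updating can be written out explicitly. The sketch below assumes exactly two hypotheses, that the coin lands heads either 1/3 or 2/3 of the time, each given equal weight before any flips; treating the reported frequencies as exact is a simplification, and the function names are hypothetical.

```python
from fractions import Fraction

ONE_THIRD = Fraction(1, 3)
TWO_THIRDS = Fraction(2, 3)


def posterior_heads_biased(heads, tails):
    """Posterior weight on the hypothesis 'heads comes up 2/3 of the time'."""
    like_heads_biased = TWO_THIRDS ** heads * ONE_THIRD ** tails
    like_tails_biased = ONE_THIRD ** heads * TWO_THIRDS ** tails
    # The equal prior weights cancel out of the ratio.
    return like_heads_biased / (like_heads_biased + like_tails_biased)


def prob_next_head(heads, tails):
    """Probability assigned to heads on the next flip after the observed flips."""
    w = posterior_heads_biased(heads, tails)
    return w * TWO_THIRDS + (1 - w) * ONE_THIRD


# No flips yet: the two hypotheses balance out.
print(prob_next_head(0, 0))

# After 14 heads and 6 tails the assignment has moved toward 2/3.
print(float(prob_next_head(14, 6)))
```

With equal numbers of heads and tails the assignment stays at ½; once the flips start to favour one side, it moves toward 1/3 or 2/3.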

When rating agencies rate bonds they use similar reasoning. When they give an initial rating they use all the information available to them at the time and come up with a rating such as AAA. As the conditions change for the bond issuer and new information comes to light—financial problems, economic problems—the rating agency updates the rating.

Suppose the agency downgrades the bond to AA+. Then two weeks later the bond issuer defaults and declares bankruptcy. Was the rating wrong? (Assume there is no corruption or collusion and that it is an honest rating). No—it was the right inference at the time.

There is a quotation attributed to a variety of people, including Keynes, Samuelson and Churchill, that sums things up: “When the Facts Change, I Change My Mind. What Do You Do, Sir?”

“The mental health effects of any given disaster are related to the intensity of the exposure of the event. Sustaining personal injury and experiencing the injury or death of a loved one in the disaster are particularly potent predictors of psychological impairment.”

The research paper from which the above quote was taken was published shortly after Hurricane Sandy, and was based on “data from 284 reports of disaster-related Post Traumatic Stress Disorder” papers. Theodore Dalrymple in PJ Media rephrased the paper’s conclusion: “In other words those who suffer more suffer more.” Why is such a completely obvious conclusion published in a supposedly respectable medical journal?

Lord Kelvin’s lecture to the Institution of Civil Engineers in 1883 has become an indictment of sorts of those who attempt to discriminate between the measurable and the unmeasurable, and has encouraged people to measure everything they manage. He said,

“I often say that when you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind.”

People wrongly interpret Kelvin’s comment to mean, “If you cannot measure it, you are ignorant.” But this is not what he meant. Kelvin was talking specifically about the physical sciences. He was making a passing remark, not stating a general principle of measurement that was applicable to fields outside the physical sciences. False interpretation of his remark has caused a lot of damage; some call it the Curse of Kelvin. Lord Kelvin (or William Thomson, his given name) was more concerned with the existence of intelligible physical properties than with their measurability. He knew that you have to be very careful about what inferences or conclusions you can draw from measurements.

His remark has also fooled people into believing that measurements can put a patina of scientific respectability on “research” that is frequently just the painful elucidation of the obvious, such as the medical research referred to above.

Business folks often fall into the same trap. But others, including entrepreneurs like Steve Jobs, launch great new business ventures on hunches and their own vision without measurement; great leaders have inspired their followers to untold feats without measurement through inspiring visions.

Though Lord Kelvin can be faulted for infelicitous language, we cannot blame him for what can be called “physics envy” (after that Freudian nonsense, penis envy). Physics envy actually does exist, and its pedigree goes back to the early 19th century when Newtonian physics reigned supreme and enjoyed a row of magnificent feats such as the finding of the planet Neptune by calculation. The world was imagined as clockwork driven by Newton’s laws.

The budding social sciences thought themselves part of that world of numbers; the world of measurables and predictability. The very aspiration for the status of being called a science was bound up in that positivist hope: If we can express it in numbers, it is science. The telling concept mécanique sociale expresses these hopes for the social sciences succinctly. It all foundered, of course. And we now know why: irreducible complexity (non-linearity), and the lack of sharp, well defined concepts inherent in the enterprise. Unfortunately it did not dampen the physics envy-driven enthusiasm for achieving scientific status that is found in business literature and softer scientific fields. The frequent result: endless, undigested, and indigestible numbers, questionable conclusions, or vapid trivialities stating the obvious.

The American social scientist Daniel Yankelovich described the descent into measurement hell that results from the curse of Kelvin:

“The first step is to measure whatever can be easily measured. This is OK as far as it goes. The second step is to disregard that which can’t be measured or give it an arbitrary quantitative value. This is artificial and misleading. The third step is to presume that what can’t be measured easily isn’t very important. This is blindness. The fourth step is to say that what can’t be easily measured doesn’t really exist. This is suicide.”

Core earnings: Is new metric misleading?
http://misleadingindicators.com/?p=950
Mon, 03 Dec 2012 13:33:13 +0000, Phil Green

Brian Milner, a journalist at the Globe and Mail’s Report on Business, picked up on our blog post on core earnings and cited it in this article this morning.

Core earnings don’t get to the heart of profitability
http://misleadingindicators.com/?p=940
Tue, 20 Nov 2012 20:51:49 +0000, Phil Green

A couple of weeks ago Manulife Financial, a large Canadian insurance company, introduced a new metric called “core earnings.” They reported a net loss of $227 million in the third quarter, but $556 million in core earnings.

The problem with core earnings as an indicator is that it attempts to determine the revenue from the main or principal business, as opposed to the supposedly ‘minor’ or ‘secondary’ business of exceptional items. The goal is to get rid of the ambiguity caused by earnings statements that report or exclude “exceptional,” or “special” items. But these items are all too frequent.

Manulife is joining many other firms reporting core earnings. The company’s report said:

“In the third quarter of 2012, we are introducing core earnings, a new metric, to help investors better understand our long-term earnings capacity and enterprise value. Core earnings measure the underlying profitability of the business and remove mark-to-market accounting driven volatility as well as a number of items that are material and exceptional in nature. While this metric is relevant to how we manage our business and offers a consistent methodology, it is not insulated from macro-economic factors which can have a significant impact. In the third quarter of 2012, Manulife generated $556 million of core earnings.”

Exceptional items have a disturbing habit of not being so exceptional. Benjamin Graham, in his classic book The Intelligent Investor, described the case of Alcoa back in 1968. The company had a number of special charges in their annual report, including the building of a wall. Graham says “the alert investor might ask himself how does it happen that there was a virtual epidemic of such special charge-offs appearing after the close of 1970, but not in previous years? In some cases they might be availed of to make subsequent earnings appear nearly twice as large as in reality.”

What defines a special item? In a large corporation there are many special events occurring in any given year. They are part of the ordinary episodic life of business in tumultuous markets. As the Roman poet Ovid said, “Nothing continues the same for long.” Perhaps many business people believe they know what the True State of Their Business Affairs should look like, and any event that is not part of the True State should be cleansed so you can see how much money the business was supposed to make.

Accounting rules state that extraordinary items are supposed to be both unusual and infrequent. But they are now so frequently reported that accounting professors conduct large statistical studies of them. One study analyzed 63,875 firm-years (one company for one year) between 1988 and 2002 and found that, on average, 19 percent reported special items, ranging from a high of 51 percent in low-accrual firms to 12 percent in high-accrual firms. They concluded that “special items reflect underlying economics and are indicative of firms that have over-invested in strategies that have not worked.”

If 19 percent of firms in a typical year report special items, which reflect underlying economics and strategies that have not worked, they are not very special. The September 11 attacks were extraordinary events, certainly unusual and hopefully infrequent, in the minds of most of us, yet the US Financial Accounting Standards Board did not allow companies to treat losses they incurred from these attacks as special items.

Earnings are, strictly speaking, not a measurement at all. A measurement has a clear standard to measure against. There is tremendous discretion in determining earnings. When reporting earnings there is leeway, for example, in determining when to book revenue, choosing a depreciation rate, deciding how to treat inventory and inventory write-downs, and reporting pension fund growth. It is thus more of a characterization than a measurement.

There is much less discretion in cash flow as an indicator (see these posts by MarketBeaters here and here), although there is still some leeway in determining it. Consistent positive cash flow may be, for this reason, a good indicator of the wealth-generating ability of a company. If a company does not generate positive cash flow it could go belly up even though it has (positive) earnings.

The usefulness of core earnings as an indicator of profitability will only be proven with the test of time.

Philip Green and George Gabor are co-authors of misLeading Indicators: How to Reliably Measure Your Business, published by Praeger. www.misleadingindicators.com

]]>http://misleadingindicators.com/?feed=rss2&p=9400The biggest storm? That’s hot air.http://misleadingindicators.com/?p=927
http://misleadingindicators.com/?p=927#commentsThu, 01 Nov 2012 01:25:01 +0000Phil Greenhttp://misleadingindicators.com/?p=927After every big storm there are always attempts to rank it to see how it compares to other biggies. That certainly happened with this week’s “Frankenstorm” Sandy (for example here and here). There are so many ways to rank storms that there will likely be some way every major storm will earn an impressive rank. But such rankings are often as easy to blow apart as a trailer park in a tornado.

The total value of damage is typically used to rank storms. Terrence Corcoran at the Financial Post smashes that one apart here. If you use the total economic damage, as the US Weather Service does, and adjust it for inflation, you get what Corcoran calls a misleading indicator (excellent choice of words!). To avoid being misled, you need to estimate the damage a storm would have caused had it landed in the same place today, when there is more property in harm’s way, rather than the damage it caused when it actually landed. Professor Roger Pielke at the University of Colorado has done just such a calculation (here) and found that the most damaging storm to land within 50 miles of Sandy’s landfall was in 1938. The trouble is that such calculations involve a lot of educated guesses. They are not measurements taken from the storms themselves.
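The gap between an inflation-only adjustment and a "what would it cost today" adjustment can be sketched in a few lines. Everything here is an illustrative assumption: the three-factor formula (inflation, real wealth per person, population in the affected area) is one simple way to normalize, and the 1938 figures are made up rather than taken from Corcoran or Pielke.

```python
def inflation_adjusted(damage, inflation_factor):
    """Historic damage restated in today's dollars only."""
    return damage * inflation_factor


def normalized(damage, inflation_factor, wealth_factor, population_factor):
    """Rough estimate of what the same storm would cost if it landed today.

    wealth_factor and population_factor are educated guesses, not measurements.
    """
    return damage * inflation_factor * wealth_factor * population_factor


# Hypothetical storm from 1938 (all numbers invented for illustration)
damage_1938 = 0.25   # billions of dollars at the time
inflation = 16.0     # price level today / price level then
wealth = 4.0         # real wealth per person today / then
population = 2.0     # people in the affected area today / then

inflation_only = inflation_adjusted(damage_1938, inflation)
todays_equivalent = normalized(damage_1938, inflation, wealth, population)

print(inflation_only)      # 4.0
print(todays_equivalent)   # 32.0
```

The same storm ranks very differently depending on which of the two numbers is used, and the second depends entirely on the guessed factors.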

Philip Eden, past president of the Royal Meteorological Society, writing in the Daily Telegraph (Sept 30 2012) pointed out that there are many different ways to measure how “big” a storm is. There is the central pressure of the depression—the lower the worse. You could measure the total amount of rainfall or snowfall, or the amount of rain or snow in a three-day period. You could compare the maximum wind speeds, the maximum or minimum temperatures, or the height of the storm surge. Then there is always the size of the storm, in miles from edge to edge.

Every storm is different from other storms in one way or another. That’s what makes storms so frightening, fascinating, and hard to compare.

]]>http://misleadingindicators.com/?feed=rss2&p=9270Why it is so easy to be misled by pollshttp://misleadingindicators.com/?p=910
http://misleadingindicators.com/?p=910#commentsThu, 11 Oct 2012 16:46:29 +0000Phil Greenhttp://misleadingindicators.com/?p=910Pat Caddell, the former adviser to President Jimmy Carter, accused the mainstream media (here) of using polls they know are inaccurate as weapons against Republicans. “When I have polls that have the preference of Democrats over Republicans higher than it was in 2008, which was a peak Democratic year, I know I am dealing with a poll that shouldn’t be reported,” he said.

Unskewedpolls.com claims it has found a way to determine how skewed (or inaccurate) polls are. On average, it says, polls taken between the 10th and 30th of September were skewed 6.1% in favour of Obama. Its own poll is online, and you can find it here. Its methodology page says it is “all about getting it right and putting out accurate numbers on where the political races and other questions of public opinion really stand as of now.”

While we cannot rule out the possibility that there are some shady pollsters in the dirty business of politics, there are many common misconceptions about how polls work and how to interpret them. With less than a month left until the US election, many people are glued to the polls, attacking the polls or trying to explain the polls. No wonder: polls not only try to measure voting intentions, they can change them. And people are easily misled by them. This is not because polls are generally faulty, but because people do not understand polling numbers or how pollsters sample.

How does unskewedpolls.com know that polls are “accurate”? You can check the accuracy of a thermometer by sticking it in ice water and boiling water. But with what do you calibrate an opinion poll? Comparing the polls with election results is hardly a calibration, because people change their minds between the poll and the ballot.

We blogged here about the Alberta election last April, in which none of the 24 opinion polls done before the election came within 3% of the final election result, despite the claim that polls were accurate to within 3% (or so) 19 times out of 20.

That does not mean the Alberta polls were inaccurate in the sense that a thermometer that reads too low or high is inaccurate. It comes down to a misunderstanding of what the “19 times out of 20” means. It is a probability statement about the sampling method.

There are two alternative ways to explain it.

Suppose a pollster surveys 1,000 people and finds 48% support Obama and 47% support Romney. Those three numbers (1,000; 47% and 48%) are used to calculate how close (e.g. 3%) the results will be in an imaginary long sequence of surveys of the same population, 95% of the time (i.e. 19 times out of 20). But populations only stay the same over long periods in statisticians’ imaginations. Real populations keep changing their minds. Surveys are snapshots of an ever changing landscape subject to violent earthquakes. The pollsters are only telling you something about the quality of the camera lens that takes the snapshot with the 19 times out of 20 business.
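For readers who want to see where the "3%" comes from, here is a minimal sketch of the usual textbook calculation. It assumes simple random sampling and the normal approximation; real pollsters use more elaborate designs, so treat the function as an illustration rather than any pollster's actual method.

```python
import math


def margin_of_error(share, sample_size, z=1.96):
    """Half-width of the usual 95% interval for a sampled proportion.

    z = 1.96 corresponds to "19 times out of 20" under the normal approximation.
    """
    return z * math.sqrt(share * (1.0 - share) / sample_size)


moe = margin_of_error(0.48, 1000)
print(f"{moe:.1%}")  # 3.1%
```

Note what goes into the calculation: the sample size and the observed share, and nothing about whether people will change their minds tomorrow.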

Alternatively, it means that in some future survey of the same population that uses the same survey method, there is a 95% probability that the estimate from the survey will be within plus or minus three percent of the value in the whole population.

The key word there is ‘future’. It does not tell you there is a 95% probability that the estimate from the survey is within plus or minus three percent of the value in the population, either as it was when you sampled it or as it is now. It could be right on, or it could be off by 10%. That is because usually you don’t know how well a particular sample of people represents the population, and because people are always changing their minds. Indeed, changing minds is exactly the point of election campaigns, and Albertans appear to have changed theirs.

So it turns out that the 95% number is really only a general guide to the precision of a polling method in a static population, not a guide to a particular sample in a changing population.
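A small simulation makes the distinction concrete. It is only a sketch under stated assumptions: simple random sampling, a hypothetical 48% level of support at the time of the poll, and a hypothetical drop to 42% by election day. The function name and all figures are invented for illustration.

```python
import math
import random


def share_within_margin(true_at_poll, true_at_comparison, n=1000, polls=2000, seed=1):
    """Fraction of simulated polls whose estimate lands within the stated
    95% margin of the value they are later compared against."""
    rng = random.Random(seed)
    margin = 1.96 * math.sqrt(true_at_poll * (1 - true_at_poll) / n)
    hits = 0
    for _ in range(polls):
        # Draw one poll of n people from the population as it was at poll time.
        supporters = sum(rng.random() < true_at_poll for _ in range(n))
        estimate = supporters / n
        if abs(estimate - true_at_comparison) <= margin:
            hits += 1
    return hits / polls


static = share_within_margin(0.48, 0.48)    # nobody changes their mind
shifted = share_within_margin(0.48, 0.42)   # support drops before the vote

print(f"static population:  {static:.0%} of polls within the margin")
print(f"shifted population: {shifted:.0%} of polls within the margin")
```

In the static case roughly 19 polls out of 20 land within the margin, as advertised. Once the population moves, the same sampling method, with the same stated precision, almost never does.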

Some critics attack polls because of the way the samples are taken. This blog post on PJ Media reports that only 9% of Americans answer polling questions today. Charlie Martin says this makes the polls “nearly useless.”

Martin says that such problems mean that the “sample isn’t really random.” This is a common misconception: there is no such thing as a random sample. Once you have a sample, it is what it is. Suppose I give you two samples and say one is “random” and one is not, but don’t tell you which is which. You look at the two samples and they are identical in all respects. Could you tell them apart? No. The same is true of representative samples.

There is such a thing as a random sampling method, but only a reckless pollster would use a purely random sampling method, because purely random sampling throws out all the information you need to get a representative sample.

A low response rate would only make polls useless if the 9% of Americans who answered pollsters’ questions were unrepresentative of the Americans who voted (for example, if Democrats were more inclined to answer pollsters than Republicans) and you did not know that. If you knew, and knew by how much, you could correct the numbers. The problem, rather, is that such self-selected samples are nearly impossible to evaluate.
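Here is a sketch of what "correcting the numbers" would look like if the response rates were known. The rates and shares are hypothetical, and the whole point of the paragraph above is that in practice they are not known.

```python
def corrected_share(observed_share, response_rate_group, response_rate_others):
    """Undo a known difference in response rates between one group and everyone else.

    Each respondent is weighted by the inverse of their group's response rate.
    """
    weighted_group = observed_share / response_rate_group
    weighted_others = (1.0 - observed_share) / response_rate_others
    return weighted_group / (weighted_group + weighted_others)


# Hypothetical: Democrats answer 12% of the time, Republicans 6% of the time,
# and two thirds of the people who answered are Democrats.
share = corrected_share(2 / 3, 0.12, 0.06)
print(f"{share:.0%}")  # 50%
```

A sample that looks two-thirds Democratic would, under these assumed rates, come from an evenly split population. Change the assumed rates and the "corrected" answer changes with them.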

A representative sampling method creates miniatures of the population whose voting intentions are close to the population as a whole. Pollsters spend big bucks to devise such methods. But whatever method they devise, the people will still change their minds, and they cannot measure when they are going to do that.

Small mistakes can make indicators very misleading. A chemical plant once discovered a simple error in a laboratory procedure that gave them slightly high readings of water concentration in samples of their final product taken from rail cars. Millions of dollars of product had been downgraded and sold at lower prices because of this error. It was a simple fix.

The hole in the ozone layer went undiscovered for years. When very low ozone readings started coming in from the satellite that was monitoring ozone over the Antarctic, the satellite’s software flagged them as possible instrument errors and set them aside. But ground-based readings also showed low ozone. Eventually scientists figured out what was going on. They were not instrument errors, and the ozone layer was decreasing.

A reality check often reveals these errors. I cannot ride my bike at 88 km/h; that is an obvious conflict with reality. Usually it is not so obvious. The engineers were suspicious about so many downgraded rail cars because it did not correspond with their view of what the plant was capable of producing. Ozone scientists had conflicting measurements of ozone.

The problem with my bike computer took a few days to figure out, but it was quick to fix. Nothing in the instructions helped me out. The computer measures the speed by tracking a little magnet fixed onto a spoke. Suddenly it dawned on me that I had not removed the magnet from the previous computer when I put on the new one.
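The arithmetic behind the faulty reading is simple enough to sketch. This assumes the computer counts every magnet pass as one full wheel revolution; the wheel circumference and the true speed of 44 km/h are hypothetical values chosen for illustration.

```python
def displayed_speed_kmh(true_speed_kmh, magnets_on_wheel, wheel_circumference_m=2.1):
    """Speed shown by a computer that assumes one magnet pass per wheel revolution."""
    metres_per_second = true_speed_kmh / 3.6
    revolutions_per_second = metres_per_second / wheel_circumference_m
    pulses_per_second = revolutions_per_second * magnets_on_wheel
    # The computer treats every pulse as one full revolution of the wheel.
    return pulses_per_second * wheel_circumference_m * 3.6


print(round(displayed_speed_kmh(44, magnets_on_wheel=1)))  # 44
print(round(displayed_speed_kmh(44, magnets_on_wheel=2)))  # 88
```

The instrument itself works perfectly. The error is in the setup, which is why nothing in the instructions helps.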

Make sure your measurements and indicators pass a reality check, and never trust measurements blindly. For a reality check to work, you need a healthy suspicion about every measurement. Putting too much faith in numbers can be very costly.

]]>http://misleadingindicators.com/?feed=rss2&p=9010Rare eventshttp://misleadingindicators.com/?p=883
http://misleadingindicators.com/?p=883#commentsWed, 12 Sep 2012 18:46:18 +0000Phil Greenhttp://misleadingindicators.com/?p=883A lecture announcement from the Department of Mathematics and Statistics at Dalhousie University said that “this year our Distinguished Speaker is Professor Srinivasa Varadhan,” who will give a talk on RARE EVENTS on October 11. The abstract of the talk says that “we often have to make a quantitative assessment how rare event (sic) really is. A precise measurement of the probability which is very small is important. We will review some examples and methods for carrying this out.”

Photo: JohannesCompaan/iStockphoto

The false belief that probability is “out there”, something tangible and physical like temperature or weight that can be measured (instead of assigned based on all available information, of which data is only one component), is unfortunately all too common among statisticians, and has spread to many other fields. Its application in business has contributed to disasters such as the recent spectacular losses of many banks, which “measured” risk, and what they thought were rare events, with a technique they call “value at risk”.

Notably, however, this belief has not spread to physics, and for good reason.

Physicist E.T. Jaynes said that the theories that lead to attempts to measure risk require “utter contempt” for the known laws of physics.

Probability theorist Bruno de Finetti said that “The abandonment of superstitious beliefs about the existence of Phlogiston, the Cosmic Ether, Absolute Space and Time, … of Fairies and Witches, was an essential step along the road to scientific thinking. Probability, too, if regarded as something endowed with some kind of objective existence, is no less a misleading misconception, an illusory attempt to exteriorize or materialize our true probabilistic beliefs.”

It would be good if talks by experts who were supposed to understand probability and risk, and why they cannot be measured, were not “rare events.”

Armstrong was tested for drugs perhaps hundreds of times during his cycling career. Every single test pronounced him clean. That is strong evidence against doping.

According to the USADA, “Numerous witnesses provided evidence to USADA based on personal knowledge acquired, either through direct observation of doping activity by Armstrong, or through Armstrong’s admissions of doping to them that Armstrong used EPO, blood transfusions, testosterone and cortisone during the period from before 1998 through 2005, and that he had previously used EPO, testosterone and hGH through 1996.” In other words, the evidence is what some cyclists and others said. Personal testimony is weak evidence for doping.

The USADA announced in June this year that, based on blood samples collected in 2008 and 2009, “scientific data showed Mr. Armstrong’s use of blood manipulation including EPO or blood transfusions during Mr. Armstrong’s comeback to cycling in the 2009 Tour de France.” How strong is that evidence?

According to a 2008 report in Science Daily, the test for EPO is not very reliable. Researchers administered EPO to eight volunteers. Blood samples were taken from each volunteer after the initial phase of doping and sent to two labs that were certified to test for EPO. One lab pronounced them all clean, the other reported that they were all doped. As the doping went on and more samples were taken, there were more similarly contradictory results. So the “scientific data”, it would seem, is very weak evidence indeed.

Lance Armstrong. Photo: Brad Camembert/Shutterstock.com

As we have blogged here and here, people do not use measurements in isolation when reaching some conclusion. They also use their background information. If the weight they place on their background information is much greater than the weight they place on the measurement, they will ignore the measurement altogether, or get another.

This happens in business all the time when people do not trust a measurement. Typically they demand another sample when some measurement shows a product or process to be defective or out-of-range. There may be some perfectly valid reason for not having faith in a measurement or instrument, or in the person who takes the measurement.

Suppose someone does an experiment to test a Mrs Smith for extrasensory perception. They provide data showing an extraordinarily large probability that she does indeed have ESP. According to probability theorist E.T. Jaynes, it would be quite natural not to believe the claim anyway, because there are many other possible explanations for the results, such as poor experimental design, bad data collection, or outright fraud.

But there was not a valid reason for the USADA to distrust the dope measurements. The tests were done by certified labs and were therefore, without evidence to the contrary, reliable.

Sometimes background information is quantitative and reliable, and new evidence contradicts it. Suppose several ore samples from a property being explored for a mine show it contains small amounts of gold (a low grade). That is the background information. Then a new drilling program begins. The first sample has large amounts of gold in it, indicating a high-grade ore. It is normal in that situation to initially place more weight on the background information and less weight on the new sample. If more measurements are consistent with the high-grade sample, the weight placed on the background gradually decreases. Thus the measurements, if they point in the same direction, should eventually carry far more weight than the background information.
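One simple way to picture this shifting of weight is a weighted average in which the background counts as a fixed number of earlier samples. That is a simplifying assumption (it treats every sample as equally reliable), and the grades and sample counts below are hypothetical.

```python
def updated_grade(background_grade, background_samples, new_grades):
    """Combine background information with new samples.

    Returns the combined grade estimate and the weight still carried
    by the background information.
    """
    total = background_samples + len(new_grades)
    weight_on_background = background_samples / total
    estimate = (background_grade * background_samples + sum(new_grades)) / total
    return estimate, weight_on_background


# Hypothetical: five earlier samples averaged 1.0 g/t; new drilling keeps returning 8.0 g/t.
for count in (1, 5, 20):
    estimate, weight = updated_grade(1.0, 5, [8.0] * count)
    print(f"{count:2d} new samples: estimate {estimate:.1f} g/t, "
          f"weight on background {weight:.0%}")
```

After one high-grade sample the background still carries most of the weight. After twenty consistent samples it carries only a fifth.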

Is there any background information that the USADA should use when reaching a conclusion based on the hundreds of doping tests performed on Armstrong? None. Even if they personally believed Armstrong to be guilty of doping, that personal belief should carry no weight at all in reaching their conclusion. To put any weight on such a belief, when a man’s reputation is at stake, is prejudice. The negative test results were all that was needed to throw out charges against Armstrong.

All the weight should be put on the hard evidence of the doping tests which showed Armstrong had not been cheating. The testimony and the EPO tests should be given virtually no weight.

The USADA procedure and conclusion amounts to evidence not that Armstrong was guilty, but that the USADA has invincible prior beliefs, due perhaps either to Armstrong’s extraordinary performance, or to a witch hunt by a careerist bureaucrat, that are well nigh uninfluenced by any data.

]]>http://misleadingindicators.com/?feed=rss2&p=8740Obama’s chained inaccuracyhttp://misleadingindicators.com/?p=867
http://misleadingindicators.com/?p=867#commentsThu, 16 Aug 2012 20:32:50 +0000Phil Greenhttp://misleadingindicators.com/?p=867(This article by Philip Green and George Gabor first appeared as a special to the Financial Post on August 16 2012).

The Obama administration is considering adopting the “chained consumer price index” as the principal measure of inflation to which increases in payments such as Social Security would be tied. The Chained-CPI is lower than the CPI, so adjustments to payouts will fall behind price inflation, reducing the deficit. The administration and some members of Congress tout improved accuracy as a reason for the change. That claim is misleading. The prime reason is a self-serving desire to save money by stealth.

In 2003 the U.S. Bureau of Labor Statistics devised the Chained-CPI because it wanted an additional index that was closer to a cost-of-living index than a price index. In a pure price index, the basket of goods remains fixed and the index tracks the increase of prices in the basket. Consumers, though, have limited budgets, so they replace products whose prices go up with products whose prices are stable or go down. By doing so, they incur a decrease, or at least a change, in their quality of life. Their consumption behaviour means that a pure price index does not track what they spend as they adapt to prices. The Chained-CPI attempts to track their spending on the items in the basket, rather than the prices of specific things in the basket.

Politicians and the financial press are understandably and justifiably celebrating the prospects of lower spending based on an “improved” CPI. The apparently lower inflation figure, however, will increase tax bracket creep, and thus taxes. It will also overstate GDP growth.

The proponents of the Chained-CPI justify the change by saying it is more “accurate.” Bloomberg says the “more accurate gauge of U.S. inflation … would yield immediate savings,” to the U.S. government: up to $300-billion within a decade. Payments for Social Security benefits that are tied to the CPI would decrease by $112-billion from 2012 to 2021. Social Security is the largest single expenditure in the U.S. budget.

“I don’t see how anybody can argue against having accurate formulas,” said Senator Mike Crapo (R). The co-chairs of President Obama’s National Commission on Fiscal Responsibility and Reform said in the “co-Chair’s Proposal” that “current measures of inflation overestimate increases in cost of living” and that “adopting a more accurate measure of inflation would achieve savings government-wide.” The editors of Bloomberg say the Chained-CPI is “a more exact measure that accounts for the substitutions consumers make when a product’s price goes up.”

The justification of “accuracy” does not make sense when talking about inflation. It makes sense to talk about the accuracy of measurement instruments, such as a thermometer, because temperature is a real physical thing. The definition of “temperature” is clear, universally accepted, and firmly rooted in reality. You can check the accuracy of a thermometer by sticking it in ice water to get 0°C and boiling water to get 100°C.

Inflation is another story. You cannot do such an indisputable check on the accuracy of an inflation index. With any inflation index, the definition is the measurement “instrument” so it can’t possibly be inaccurate. It can, however, clash with a whole slew of other possible definitions according to which it will be deemed inaccurate. You cannot compare it to physical reality. You might be able to argue it is more practical because it is a better representation of reality. But be prepared for someone to disagree because they will see reality differently.

There is nothing wrong with having both a cost-of-living index and a consumer price index. The difference between the two is an indicator of the loss of standard of living caused by consumer price inflation.

But pretending that a cost-of-living index can replace a price index would be like government meteorologists changing the way they report the temperature to adjust for consumers substituting autumn coats for winter ones.

Bloomberg explains the chained-CPI with an example of the kind of consumer behaviour it captures: shoppers stop buying Granny Smith apples because their price goes up and switch to cheaper Red Delicious apples, or switch to oranges if both become too dear. A consumer price index would track the increase in the price of Granny Smith apples; a cost-of-living index tracks what they spend on fruit within the constraints of their limited budgets. One tracks the price of apples, the other the cost of buying whatever fruit consumers can afford. Politicians with backbones could make the case that in austere times increases in Social Security payments should track the chained-CPI rather than the CPI, because in austere times you cannot expect to maintain your quality of life.
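The apple example can be put in numbers. The sketch below compares a fixed-basket index with a Törnqvist index, a formula of the kind chained indexes rely on to account for substitution. The prices and quantities are hypothetical, and the two functions are a simplified illustration, not the Bureau of Labor Statistics' actual procedure.

```python
import math


def fixed_basket_index(p0, p1, q0):
    """Cost of the original basket at new prices, relative to old prices."""
    cost_then = sum(p0[k] * q0[k] for k in q0)
    cost_now = sum(p1[k] * q0[k] for k in q0)
    return cost_now / cost_then


def tornqvist_index(p0, p1, q0, q1):
    """Price change weighted by average spending shares before and after."""
    spend0 = {k: p0[k] * q0[k] for k in p0}
    spend1 = {k: p1[k] * q1[k] for k in p1}
    total0, total1 = sum(spend0.values()), sum(spend1.values())
    log_index = 0.0
    for k in p0:
        avg_share = 0.5 * (spend0[k] / total0 + spend1[k] / total1)
        log_index += avg_share * math.log(p1[k] / p0[k])
    return math.exp(log_index)


# Hypothetical shopper: Granny Smith prices jump 50%, Red Delicious prices stay put.
p0 = {"granny_smith": 1.00, "red_delicious": 1.00}
p1 = {"granny_smith": 1.50, "red_delicious": 1.00}
q0 = {"granny_smith": 10, "red_delicious": 10}   # apples bought before
q1 = {"granny_smith": 4, "red_delicious": 16}    # apples bought after switching

fixed = fixed_basket_index(p0, p1, q0)
chained = tornqvist_index(p0, p1, q0, q1)
print(f"fixed basket: {fixed - 1:.1%} inflation")   # 25.0%
print(f"with substitution: {chained - 1:.1%} inflation")
```

Both numbers are computed correctly from the same prices. They differ because they answer different questions, which is the point: neither is more "accurate" than the other.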

Economist John Williams, who runs a website called Shadowstats.com, says that base political motives are at work. He told us “the reasons for the CPI changes have been stated clearly by political Washington: to reduce the deficit by cutting cost of living adjustments to Social Security.”

When survival strategies in the face of price inflation are part of the way the government measures inflation, then distortion rather than accuracy seems to be the more apt description.

]]>http://misleadingindicators.com/?feed=rss2&p=8670What measuring Olympic athletic performance can teach businesshttp://misleadingindicators.com/?p=855
http://misleadingindicators.com/?p=855#commentsFri, 10 Aug 2012 02:00:59 +0000Phil Greenhttp://misleadingindicators.com/?p=855How should one measure the performance of countries at producing Olympic medal winners, and of the Olympic athletes themselves? Some members of the press are having some fun answering that question—and there are some lessons for business too.

The National Post played around with various ways of adding up the medal count to compare countries. You can add up the total number of medals each country wins, or you can give each medal (gold, silver or bronze) a score of 3, 2 or 1 and add up the scores. Or you could just count the gold medals. The Economist looks at country athletic greatness another way: how many athletes does it take a country to win a medal? The fewer athletes it takes, the better that country is at winning medals. In other words, use an indicator of medal-winning efficiency. Each indicator tells you something different, and the countries rank differently depending on the method.

The Economist asks what measure could be used to measure Olympic “greatness” of athletes. Naturally that depends on how you define greatness. While Michael Phelps has won more medals than any other Olympian, the Economist points out that if you count the number of medals Olympic athletes win as an indicator of greatness, you essentially “guarantee that the greatest Olympian will always be a swimmer,” because there are so many swimming events swimmers can enter compared to other sports. You could measure the longevity of athletes: over how many Olympics they win medals. Or versatility: in how many sports they win medals.
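To see how much the choice of indicator matters, here is a short sketch that ranks three made-up countries by each of the four methods. The countries and their medal counts are hypothetical.

```python
countries = {
    "A": {"gold": 5, "silver": 15, "bronze": 30, "athletes": 500},
    "B": {"gold": 20, "silver": 10, "bronze": 5, "athletes": 400},
    "C": {"gold": 2, "silver": 3, "bronze": 5, "athletes": 40},
}


def total_medals(c):
    return c["gold"] + c["silver"] + c["bronze"]


def weighted_score(c):
    # Gold counts 3, silver 2, bronze 1.
    return 3 * c["gold"] + 2 * c["silver"] + c["bronze"]


def gold_only(c):
    return c["gold"]


def athletes_per_medal(c):
    return c["athletes"] / total_medals(c)


def ranking(score, higher_is_better=True):
    """Country names ordered from best to worst under the given indicator."""
    return sorted(countries, key=lambda name: score(countries[name]),
                  reverse=higher_is_better)


print("total medals:      ", ranking(total_medals))                 # ['A', 'B', 'C']
print("3-2-1 score:       ", ranking(weighted_score))               # ['B', 'A', 'C']
print("gold medals:       ", ranking(gold_only))                    # ['B', 'A', 'C']
print("athletes per medal:", ranking(athletes_per_medal, False))    # ['C', 'A', 'B']
```

Each of the three countries comes first under at least one indicator, using exactly the same results.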

The real problem is defining whatever it is you are trying to measure. When it comes to measuring athletic performance, there is pretty much an infinite number of ways to do so. As in business performance indicators, all indicators are misleading indicators in one way or another. So choose your measurement poison, and make sure you know its side effects.

In these examples, they are trying to measure something that is defined by the suggested measurement process. All such instrumental “definitions” suffer from the countless other definitions that sound just as reasonable. But what, in the end, is actually being measured?

]]>http://misleadingindicators.com/?feed=rss2&p=8550Dillusions of importancehttp://misleadingindicators.com/?p=845
http://misleadingindicators.com/?p=845#commentsWed, 01 Aug 2012 02:36:02 +0000Phil Greenhttp://misleadingindicators.com/?p=845Economist Friedrich Hayek, in his 1974 Nobel lecture, criticised what he called the “scientistic” approach to economics. His criticism applies today to those who over-emphasize the importance of measurement in management. He said:

Unlike the position that exists in the physical sciences, in economics and other disciplines that deal with essentially complex phenomena, the aspects of the events to be accounted for about which we can get quantitative data are necessarily limited and may not include the important ones. While in the physical sciences it is generally assumed, probably with good reason, that any important factor which determines the observed events will itself be directly observable and measurable, in the study of such complex phenomena as the market, which depend on the actions of many individuals, all the circumstances which will determine the outcome of a process…will hardly ever be fully known or measurable. And while in the physical sciences the investigator will be able to measure what, on the basis of a prima facie theory, he thinks important, in the social sciences often that is treated as important which happens to be accessible to measurement.

V. E. (Andy) Anderson, who had once been a senior executive with Tennessee Eastman, a division of Kodak at the time, said that when he was the manager of one of its large chemical plants, he would clandestinely watch employees coming to and from work through the plant’s gate at shift change. If they came in with feet dragging and left with a bouncy gait, he knew that they dreaded work and were happy to leave it, indicating there was trouble coming. There was no practical way to get that information through measurement. Management author and professor Peter Drucker wrote that senior executives know that a company is bound for extinction when it cannot attract or hold able people, another very difficult thing to quantify.

The best approach is to determine what things are important to your business, and then decide which of these can and cannot be measured. You do not need to be able to measure something to manage it, despite popular management clichés. Just because something can be measured does not necessarily make it important, or even useful.

Hayek went on to say:

Why should we, however, in economics, have to plead ignorance of the sort of facts on which, in the case of a physical theory, a scientist would certainly be expected to give precise information? It is probably not surprising that those impressed by the example of the physical sciences should find this position very unsatisfactory and should insist on the standards of proof which they find there.

The reason is the complexity that exists in the social sciences, including business. Hayek made it clear that he wanted “to avoid giving the impression that I generally reject the mathematical method in economics” because of “their value as a description of particular situations.” The same applies to measurement in business. It can be a useful way to describe particular situations. But its limitations should be recognized.

]]>http://misleadingindicators.com/?feed=rss2&p=8450Investors beware: mining resource estimation methods give inconsistent resultshttp://misleadingindicators.com/?p=813
http://misleadingindicators.com/?p=813#commentsFri, 27 Jul 2012 18:11:37 +0000Phil Greenhttp://misleadingindicators.com/?p=813After the Bre-X scandal in 1996, when fake measurements of gold wiped out billions of dollars in shareholder value of the Canadian mining company, the mining industry developed new standards for measuring mineral resources. While they are a vast improvement, the words and methods the industry uses to describe resources still leave room for the uninitiated to be misled.

Resources are determined by independent geologists called Qualified Persons, or QPs, according to the standards. They describe the “measured,” “indicated” and “inferred” resources. The trouble is, different Qualified Persons can come up with very different numbers.

In April this year, Endeavour Silver purchased the El Cubo mine from Aurico Gold. Shortly afterward Endeavour looked at Aurico’s data and updated Aurico’s resource estimates (see Aurico’s 2011 annual report). There were significant drops in Endeavour’s estimates of measured, indicated and inferred resources compared to Aurico’s, using the same data.

Resource | Aurico, Dec 31/11 (oz.) | Endeavour, Jun 1/12 (oz.) | Change (%)
Silver, measured | 701,000 | 640,000 | -9%
Silver, indicated | 6,231,000 | 3,790,000 | -39%
Silver, inferred | 26,682,000 | 11,410,999 | -57%
Gold, measured | 12,000 | 13,000 | 8%
Gold, indicated | 257,000 | 63,500 | -75%
Gold, inferred | 548,000 | 220,000 | -60%
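The Change column is the Endeavour estimate relative to the Aurico estimate. As a quick reality check on the table, the percentages can be recomputed from the ounce figures (the dictionary layout and function name are just for illustration):

```python
# (Aurico estimate, Endeavour estimate) in ounces, copied from the table above
estimates = {
    ("silver", "measured"): (701_000, 640_000),
    ("silver", "indicated"): (6_231_000, 3_790_000),
    ("silver", "inferred"): (26_682_000, 11_410_999),
    ("gold", "measured"): (12_000, 13_000),
    ("gold", "indicated"): (257_000, 63_500),
    ("gold", "inferred"): (548_000, 220_000),
}


def percent_change(old, new):
    """Change from old to new, as a whole-number percentage."""
    return round((new - old) / old * 100)


changes = {key: percent_change(old, new) for key, (old, new) in estimates.items()}
for (metal, category), change in changes.items():
    print(f"{metal:6s} {category:9s} {change:+d}%")
```

The recomputed values match the published column: -9, -39, -57, +8, -75 and -60 percent.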

The words “measured,” “indicated,” and “inferred” can be misleading because, although they rely on measurements, they are not themselves measurements. They are estimates based on measurements, or more accurately, they are inferences, in the sense that a scientist makes an inference based on data and background information. Furthermore, different QPs can use different techniques to make their inferences.

The three ways of describing the mineral resources divide up the resource into portions. A “measured” resource is that part of a mineral resource in which a QP estimates the quantity, density, shape, and grade of a mineral with a high level of confidence. It is based on the resource’s geological and grade continuity. An “indicated” resource is estimated with a reasonable level of confidence. The spacing of the mineral samples is insufficient to confirm geological and grade continuity, but good enough to assume it. An “inferred” resource is estimated with a low level of confidence because of the lower quality information about grade continuity.

The Canadian Institute of Mining says that the “industry practices employed by qualified persons vary since different qualified persons will estimate and classify a mineral resource estimate for the same mineral project using different techniques.”

The measurements that go into these estimates depend on the drilling and sampling density, the drilling and sampling methods, assay quality and many other factors. So no two QPs will have the same set of data from which to estimate the resources from two different mines. This makes perfect consistency between mines hard to achieve.

What about consistency for the same mine? A desirable feature of scientific inference is that two people, with the same data and background information should make the same inference with the same confidence. Probability theorist E.T. Jaynes said that an ideal inferential robot should always reason consistently, meaning that with the same evidence and the same state of knowledge in two problems, it will reach the same conclusions.

When Endeavour looked at Aurico’s data, it used more “conservative estimation parameters” than Aurico had used, consistent with those it used in some of its other mines. It got the different results shown in the table.
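To see how two estimators can start from identical data and still report different resources, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the sample grades, the tonnage per sample, and the two parameters (a cutoff grade and a cap on high grades) are invented, and real resource estimation uses far more elaborate methods than this. It is not Aurico’s or Endeavour’s procedure.

```python
# Hypothetical assay grades (grams per tonne) from the same drill samples.
# These numbers are invented for illustration; they are not real mine data.
SAMPLE_GRADES = [0.4, 0.9, 1.2, 2.5, 3.1, 0.7, 12.0, 1.8]
TONNES_PER_SAMPLE = 1000.0  # assumed tonnage represented by each sample


def estimate_metal(grades, cutoff, cap):
    """Estimate contained metal in grams from sample grades.

    cutoff: samples below this grade are treated as waste and excluded.
    cap: grades above this value are cut down to it, limiting outliers.
    """
    total = 0.0
    for grade in grades:
        if grade < cutoff:
            continue  # below cutoff, so not counted as resource
        total += min(grade, cap) * TONNES_PER_SAMPLE
    return total


# Same data, two sets of (hypothetical) estimation parameters.
liberal = estimate_metal(SAMPLE_GRADES, cutoff=0.5, cap=20.0)
conservative = estimate_metal(SAMPLE_GRADES, cutoff=1.0, cap=5.0)

print(f"liberal parameters:      {liberal:,.0f} g")
print(f"conservative parameters: {conservative:,.0f} g")
```

With these made-up inputs the conservative parameters give the lower estimate, even though not a single measurement changed. The difference comes entirely from the choices made in the inference.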

No matter how qualified they are, QPs do not reason like an ideal inferential robot. The new standards of disclosure are an improvement, but investors should be sure they know what they mean.

This time it’s different: stock market indicators
http://misleadingindicators.com/?p=803
Sat, 30 Jun 2012 22:34:37 +0000, Phil Green

There is no shortage of indicators in the stock market. And anyone who can find reliable leading indicators that give reasonable predictions of what the markets are going to do in the future will quickly get very rich.

Many people try, and fail. That does not mean there is necessarily anything wrong with the indicators themselves. Take this chart produced by Crestmont Research. It shows secular bull and bear markets and draws some general conclusions about them, such as “Secular bull market periods have always started when P/Es were below average, and secular bear markets have never ended when P/Es were above average.”

A misleading indicator is one from which an unreasonable, unwarranted or plain wrong inference is made. Often people make lousy inferences because the indicators themselves are misleading—there is no shortage of those either. But it is just as easy to make a lousy inference from a good indicator through faulty reasoning.

This is perhaps especially true with financial indicators when people are looking for some way to get rich. How? By hunting for regularities, finding some, and extrapolating them only to find that they no longer apply because of the “this time is different” effect.
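The hunt for regularities can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the “returns” and the “indicators” are pure random noise generated with a fixed seed, and the function name `hunt_for_regularity` and its parameters are hypothetical. The sketch searches many indicators for the one that best matches returns in the first half of the data, then checks how that same indicator does in the second half.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def hunt_for_regularity(n_indicators=200, n_periods=40, seed=0):
    """Pick the indicator that best fits the past, then test it on the future.

    All series are random noise, so any pattern found is a coincidence.
    Returns (in-sample correlation, out-of-sample correlation).
    """
    rng = random.Random(seed)
    returns = [rng.gauss(0, 1) for _ in range(n_periods)]
    indicators = [
        [rng.gauss(0, 1) for _ in range(n_periods)]
        for _ in range(n_indicators)
    ]
    half = n_periods // 2
    # Search: correlation of each indicator with returns in the first half.
    in_sample = [pearson(ind[:half], returns[:half]) for ind in indicators]
    best = max(range(n_indicators), key=lambda i: abs(in_sample[i]))
    # Extrapolate: the same indicator against the second half.
    out_of_sample = pearson(indicators[best][half:], returns[half:])
    return in_sample[best], out_of_sample


found, later = hunt_for_regularity()
print(f"best in-sample correlation: {found:+.2f}")
print(f"same indicator afterwards:  {later:+.2f}")
```

Because the winner is chosen from hundreds of candidates, its in-sample correlation looks impressive even though there is nothing to find. There is no reason for it to persist in the second half, which is the “this time is different” effect in miniature.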

You get what you measure
What you measure is what you get
If you don’t measure it, you can’t manage it
Tell me how I’m going to be measured and I’ll tell you how I’ll perform
You cannot improve what you can’t measure
Garbage in, garbage out
If you don’t measure it, it’s just a hobby

“These clichés are true,” according to Forrest Breyfogle III in his book “The Integrated Enterprise Excellence System.”

Anyone who believes all of them to be true ignores an old lesson from some of the greatest scientists: just because you can obtain numbers from measuring does not mean the thing you think you are measuring actually exists. What then are you managing?

Unfortunately, many managers who believe these clichés measure and create all sorts of indicators without asking whether they indicate what they think they indicate. Often they do not.

The only one of the above clichés that even hints at that is “garbage in, garbage out”.

The real problem is not that you can’t manage something if you can’t measure it. The problem is to strike the right balance between what should be measured and what is unmeasurable or is best left to observation, contemplation, or intuition.

Once you have decided that something should be measured, the next problem is to figure out, first, whether you are measuring the thing you think you are measuring, and second, whether you are making reasonable inferences from your measurements.

History is full of stories of leaders accomplishing great feats without recourse to measurement. Can you imagine George Washington consulting his dashboard and evaluating his key performance indicators before deciding to cross the Delaware?