Posts categorized "Chapter 4"

I want to point your attention to recent coverage of Numbers Rule Your World in Chance News. If you are using my book for a class or a book club, their writeups serve well as reading guides for several chapters.

Chance News has been an extremely valuable resource for statistics teachers for years. They pick up news stories and other interesting items and ask intriguing questions, but don't offer answer keys. Most of the questions are open-ended, as most statistics questions are. We owe much gratitude to the volunteers who have kept the site alive all these years.

News has leaked that the reigning MVP of baseball's National League, Ryan Braun, who plays for the Brewers, failed a drug test during the post-season. (See for example here.) He's appealing, and we don't know the outcome yet.

But let's parse the statements made to the media.

From Braun's powerful agency (my italics):

There are highly unusual circumstances surrounding this case which will support Ryan's complete innocence and demonstrate there was absolutely no intentional violation of the program.

Sounds like the same defense as every other accused athlete: not disputing that the banned substance was present, but claiming innocent reasons for its presence. The Tour de France champion Contador, for instance, blamed a piece of steak in a most unlikely scenario (see here). As I argued in Chapter 4 of Numbers Rule Your World, this line of defense does not dispute the testing lab's work, so what we really need is a foolproof lie-detector test; unfortunately, such a thing doesn't exist.

From the same agent:

Ryan has impeccable character and no previous history.

"No previous history" presumably means no past positive tests. But this doesn't mean as much as they want it to mean. All anti-doping tests are calibrated to minimize false positives, and as a result, a lot of dopers go undetected. While no past positives means something, it doesn't really mean that much.

Baseball commissioner Bud Selig made a similar misstatement when he claimed:

The use of steroids and amphetamines amongst today's players has greatly subsided and is virtually nonexistent, as our testing results have shown.

The testing results (few positive findings) do not prove this unless one turns a blind eye to the false negative problem.

***

This next fact means quite a lot more:

[Braun] becomes the first reigning MVP to test positive for performance-enhancing drugs, though many award-winners through the early part of this century -- Barry Bonds, Roger Clemens, Alex Rodriguez and Jason Giambi, to name a few -- either tested positive for steroids or were strongly linked to them through legal proceedings or investigations.

The irony is that winning the MVP is a strong indicator of steroid usage.

Alberto Contador is the elite cyclist and 2010 Tour de France winner who failed a test for clenbuterol; his case is still being litigated with seemingly endless delays. Contador's lawyers have claimed that the clenbuterol came from an imported steak, brought into France by a friend, that came from a cow allegedly injected illegally with the substance. (I previously wrote about this case here.)

If that were true, poor Contador is the unluckiest athlete in the world.

***

The case is supposedly coming to an end, although it has dragged on and on. We were told recently that his defense may be boosted by events in Mexico this year.

In the Under-17 World Cup tournament, held earlier in Mexico, 109 out of 208 samples collected from players tested positive for clenbuterol! The anti-doping labs have decided not to prosecute these cases because Mexican farmers have been known to use clenbuterol, and all these athletes may have been victims of food "poisoning".

According to "conventional wisdom" such as this, the Mexican situation should bolster Contador's case: it shows that his explanation is credible.

***

On the contrary, this situation actually makes the anti-doping labs look good.

First, it shows the effectiveness of the test for clenbuterol. The test performs as expected. It detects the substance.

Second, what happens in Mexico does not translate to Europe. If it did, we should have seen half of the athletes in the Tour de France failing the test, but that didn't happen. Not even close. In addition, European farmers are banned from using clenbuterol, and in 2008-9, only one out of 83,000 samples tested positive for it. (That's why Contador must be extremely unfortunate if his story were indeed true.)

As I concluded in Chapter 4 of Numbers Rule Your World, it's not the job of anti-doping labs to do lie detection; their job is to do chemistry. There is no argument that clenbuterol was found in his sample. The debate is over how the clenbuterol got there; that debate can almost never be settled by a lab test. We need a powerful polygraph, but we don't have one.

If Contador's positive result is upheld, there is a small chance that he would have been falsely accused. But in my view, the anti-doping world needs more false positives, not fewer. See my previous post here.

Officials now recommend that healthy men forgo routine screening for prostate cancer. See here. I discussed this issue before.

A few points of interest:

The clinical trials have mixed findings on the benefits. The harm of unnecessary treatment is real. The harms of false positives outweigh the benefits.

Men at risk of getting prostate cancer are not covered by this recommendation. It applies only to healthy men.

"Of the men who had three or four rounds of screening, the false positive rate was 12 to 13 percent." I think this means that among men who don't have prostate cancer, 12 to 13 percent will be told they have cancer three or four consecutive times! This level of inaccuracy is quite astounding.

Dr. Herbert Lepor (urology professor): "The notion that prostate cancer is not a threat to the well being of men is simply wrong." I'm shocked that an academic at NYU could equate a criticism of a detection method with a denial of an illness, without missing a beat.

"Critics say the studies were not long enough to show a benefit." Let us do a thought experiment. We start screening men from 21 years old. Each year, we issue large numbers of false positive results. By the time these men become at risk for prostate cancer, a large proportion of them would have received at least one positive result. This screening policy will surely catch most of those who would develop the cancer. Now imagine, that we didn't have a screening technology, we just randomly tell a certain proportion of men they are positive. This "false screening" will also identify most of those afflicted.

Further, unless prostate cancer has some kind of very long incubation period, waiting longer shouldn't improve the screening accuracy. The study referenced in the article also tracked patients for 10 years. This type of criticism is not serious unless it is made before the test results are issued.

Note also that false positive errors are invisible. Once the little tumors are removed, it is not possible to know whether they would have developed into anything malignant. Invisible errors mean the people committing them can't be held to account. By contrast, false negatives are decidedly public. For more, see Chapter 4.

Having written about false negatives in steroid testing leads me naturally to ask: how serious is the false negative problem in criminal justice?

It's pretty clear that in law, we also try to minimize false positives, which means that there are criminals who evade justice. I wonder if, and how, one could assess the probability of a criminal not getting caught. Has anyone seen or heard of such studies?

It would, of course, not be easy to measure such a thing, since one may need a lot of confessions. But there may be smart ways around it. For instance, maybe a large business has tracked the total value of goods stolen from its stores over a period of time, as well as the value of goods taken by those caught red-handed.

Lance Armstrong is being accused of doping by former teammates, many of whom directly helped him to win 7 Tour de France championships. According to this Yahoo! report, Armstrong's response, via Twitter, was:

20+ year career. 500 drug controls worldwide, in and out of competition. Never a failed test. I rest my case.

***

I don't know if Armstrong doped or not. Given that his racing days are years in the rear-view mirror, there is little chance we will ever have direct evidence either way.

However, as I pointed out in Chapter 4 in Numbers Rule Your World, "never a failed test" is not a great basis on which to rest one's case!

We have quite a few examples of athletes who never failed any drug test during their competitive careers but later confessed to doping. Marion Jones and Bjarne Riis are two examples I used in the book. Why is this the case?

The sad truth of steroid testing is that most dopers do not test positive. A recent example (discussed here) illustrates that about 50% of dopers would pass the test -- and that was measured in a controlled laboratory experiment. The reason for such high false negative rates is that the anti-doping labs want to minimize the chance of a false positive error. The underlying statistics dictate a trade-off between false positives and false negatives; the harder one tries to eliminate false positives, the more false negative results will be produced!
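This trade-off can be illustrated with a toy calculation. The two normal distributions below are invented for illustration (real anti-doping assays are far more complex); the point is only that moving the detection threshold trades one error for the other.

```python
from statistics import NormalDist

# Assumed, illustrative distributions of a biomarker score:
# clean athletes centered at 100, dopers centered at 120.
clean = NormalDist(mu=100, sigma=10)
doper = NormalDist(mu=120, sigma=10)

for threshold in (110, 120, 130):
    false_pos = 1 - clean.cdf(threshold)  # clean athletes flagged
    false_neg = doper.cdf(threshold)      # dopers who pass
    print(f"threshold {threshold}: FP {false_pos:.1%}, FN {false_neg:.1%}")
```

Pushing the threshold high enough to make false positives negligible leaves roughly half (or more) of the dopers passing, which is exactly the calibration the labs have chosen.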

The media's treatment of drug testing is inconsistent. On the one hand, they faithfully report the colorful stories of athletes who fail drug tests pleading their innocence. (I have written about the Spanish cyclist Alberto Contador here.) On the other hand, they unquestioningly report athletes who claim "hundreds of negative tests" prove their honesty. Putting these together implies that the media believe negative test results are highly reliable while positive test results are unreliable.

The reality is just the opposite. When an athlete tests positive, it's almost sure that he/she has doped. Sure, most of the clean athletes will test negative but what is often missed is that the majority of dopers will also test negative.

We don't need to do any computation to see that this is true. In most major sports competitions, the proportion of tests declared positive is typically below 1%. If you believe that the proportion of dopers is higher than 1%, then it is 100% certain that some dopers got away. If you believe 10% are dopers, then at least 9 out of 10 dopers will test negative!
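The arithmetic behind that claim fits in a few lines. The 10% doper share is an assumption, as in the text; the 1% positive rate is the typical figure cited above.

```python
# Back-of-envelope: 10% of athletes dope (assumed), but only 1% of
# tests come back positive. Even if every positive is a true doper,
# most dopers must be testing negative.
athletes = 1000
dopers = athletes * 10 // 100      # 100 dopers
positives = athletes * 1 // 100    # 10 positives, at most all dopers
escaped = dopers - positives       # 90 dopers test negative
print(f"{escaped / dopers:.0%} of dopers got away")  # 90%
```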

***

While researching the book, I learned that trying to catch dopers is extraordinarily hard. Here are some reasons why a doping athlete could get a negative test result:

he's using a drug that is also produced naturally by the body, which means that the test needs to detect "unnatural" levels of the chemical, rather than the presence of a foreign substance

he's using a new drug that has no test yet

he's using "masking agents" that hide the performance enhancing drugs

he's used steroids during training but not during competition (many sports don't conduct out-of-season testing, and even if they do, you can't possibly test all athletes all the time)

he's received a "therapeutic use exemption" (I'm not sure why the sports bodies have never disclosed which athletes have been allowed to use which drugs based on TUE)

he's following a drug schedule that attempts to evade testing

Those are not the only reasons. Notice that none of these tactics is replicable in a lab, so the accuracy rates reported by labs are almost surely overly optimistic.

***

Armstrong's latest accuser is Tyler Hamilton, who features in my book. When I first started writing, he had failed one test, got banned, came back, and failed another test. Even after the second failed test, he had maintained his innocence. By the time I finished writing, he had come back yet again and failed a third test, upon which he retired.

One other statistical point of note: test results for a given individual are not necessarily "independent"! It's not too surprising that people like Hamilton kept failing tests while other dopers like Marion Jones kept passing tests. A failed test indicates that the doping program of that athlete isn't foolproof, and we should expect that athlete to have a higher chance of failing again in the future. A negative test, by contrast, may indicate that the doping program is robust against the testing regime. (In Jones's case, she was using a new designer steroid that took years for the test labs to notice.)

It appears that the IT world today considers spam detection a solved problem; one doesn't hear many complaints about spam. Yet spam used to infiltrate only our emails; it now appears in many formats, including spam websites that adulterate search-engine results and spam comments that pollute blogging communities.

Many types of methods have been deployed to fight spam. Fighting spam is a statistical problem: algorithms try to predict whether each incoming email or comment is spam or not.

Bayesian spam filters, which determine what features of a piece of text are predictive of spam, are popular; the buttons that let users "report spam" or "report not spam" are integral to such algorithms, enabling them to learn from experience. Blacklists that block IP addresses are another method; a blacklist is a type of heuristic, essentially a set of fixed rules.
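As a rough sketch of the Bayesian idea, here is a toy word-based filter. The training texts and the scoring rule are invented for illustration; this is not how any particular product (Typepad included) actually works.

```python
from collections import Counter
import math

# Labeled examples, as if supplied by "report spam" / "not spam" buttons.
spam_texts = ["cheap shoes sale", "cheap viagra online sale"]
ham_texts = ["statistics lecture notes", "correlation is not causation"]

def word_counts(texts):
    c = Counter()
    for t in texts:
        c.update(t.split())
    return c

spam_words, ham_words = word_counts(spam_texts), word_counts(ham_texts)
spam_total, ham_total = sum(spam_words.values()), sum(ham_words.values())
vocab = set(spam_words) | set(ham_words)

def spam_score(text):
    # Log-odds of spam vs. ham with add-one smoothing; positive => spam.
    score = 0.0
    for w in text.split():
        p_spam = (spam_words[w] + 1) / (spam_total + len(vocab))
        p_ham = (ham_words[w] + 1) / (ham_total + len(vocab))
        score += math.log(p_spam / p_ham)
    return score

print(spam_score("cheap shoes"))       # positive: looks like spam
print(spam_score("statistics notes"))  # negative: looks legitimate
```

Each "report spam" click updates the word counts, which is how the filter learns from experience.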

***

I have been bothered by Typepad's seeming ineptitude in spam filtering for a long time. Certain comments that are obviously spam do not get filtered automatically. These are comments that appear to be extremely easy to identify via some simple heuristics. Namely, if the name of the person writing the comment is the name of a pharmaceutical drug or a brand of shoes or fashion, classify the comment as spam.

Imposing such rules will get rid of the following types of spam comments that recently have been "false negatives" under Typepad's so-called spam filtering product:

There is someone that is coming or passing away in your life around the clock, so you may lose sight of those seen, and forget those remembered. There is gain and loss in your life, so you may catch sight of those unseen, and remember those forgotten. Nevertheless, doesn‘t the unseen exist for sure? Will the remembered remain for ever?

bike clothing said:

reaction of officials in Boston who advised residents to stop drinking tap water for fear of possible contamination which has not been verified.

louboutin shoes said:

My readers will note that I disapprove of how most statistics textbooks discuss the correlation/causation issue: you are told what not to do, but left with no idea of what to do.

The blog recently got flooded with 55 spam comments in a single day, coming from "louboutin", "louboutin shoes", and "christian louboutin sale". I just don't understand how any spam filtering algorithm worth its name can miss such obvious cases. It is extremely unusual for someone's first name to be louboutin and last name to be shoes. If someone has such a name, I'd argue (and sorry to be mean) that the comment deserves to be expunged.

***

Every time I provide feedback to Typepad, their IT staff recommends that I manually block those IP addresses (the blacklist method). It appears that they don't see this as an obvious failure; they certainly don't realize that by the time I block an address, the failure to identify the spam has already occurred. So I'm mystified. If you know anything about spam filters, please address these questions:

Is it operationally difficult to add some heuristics to the algorithm?

Is there something wrong with rules that assume no one is named "viagra" or "cheap viagra" or "cheap jordans"?
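The kind of rule in question could be sketched in a few lines. The brand list below is a stand-in; a real filter would maintain a far longer, regularly updated one.

```python
# Heuristic sketch: flag a comment as spam when the commenter "name"
# contains a drug or fashion brand term. The term list is illustrative.
BRAND_TERMS = {"viagra", "louboutin", "jordans"}

def looks_like_spam(commenter_name: str) -> bool:
    words = commenter_name.lower().split()
    return any(w in BRAND_TERMS for w in words)

print(looks_like_spam("louboutin shoes"))           # True
print(looks_like_spam("christian louboutin sale"))  # True
print(looks_like_spam("Joran E."))                  # False
```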

The New York Times recently printed a valuable article discussing the PSA screening test for prostate cancer. I previously discussed the statistics of this test here.

The entire article is worth reading. Here, I summarize the key issues:

Prostate cancer is a slow-growing cancer, which means that many men die with it, rather than from it. Autopsies will therefore identify prostate cancer in many men who did not die because of it. Had these men received treatment for prostate cancer during their lifetimes, they would still have died the same way.

Detecting prostate cancer late in life is rather useless because many men would have died from other causes before these cancers could have killed them.

Screening tests always generate many false positives. The proportion of positive results that are false depends on the proportion of those tested who actually have the ailment: there are more false positive errors when the tests are given to people who are less likely to have it. Thus, younger men taking PSA tests will suffer a higher chance of a false positive.

If you are given the knowledge that there is a chance you have prostate cancer (even if it is a small chance, say 30%), you may elect to do a biopsy to buy yourself "peace of mind". Likewise, the doctors may want to insure themselves against future negligence lawsuits, or may simply want to earn the fees associated with the surgeries.

Since most of those being treated are false positives, the treatment will have zero benefit for them, and some unlucky patients will suffer side effects.

False positive is a laboratory term. Your test result is a positive, period, not a "false" positive. You must make the decision whether to treat it or not - and risk side-effects - before you find out if it is a false positive. (If you know it is a false positive, then you already know you don't have cancer, which means you don't need to be screened in the first place.)

The PSA screening test is a multi-billion-dollar industry.
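The point above about false positives depending on prevalence can be illustrated numerically. The sensitivity, false positive rate, and prevalence figures below are all assumed for illustration, not taken from the article.

```python
# Assumed test: 90% sensitivity, 10% false positive rate, applied to
# populations where the disease is more or less common.
sensitivity = 0.90
fp_rate = 0.10

def share_of_positives_that_are_false(prevalence):
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * fp_rate
    return false_pos / (true_pos + false_pos)

for prevalence in (0.20, 0.05, 0.01):  # older vs. younger populations
    share = share_of_positives_that_are_false(prevalence)
    print(f"prevalence {prevalence:.0%}: {share:.0%} of positives are false")
```

With the same test, the rarer the disease among those screened, the larger the share of positives that are false, which is why screening young, healthy men is so problematic.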

These are the same issues raised during the mammography controversy. One ought to be consistent and gender-neutral in the way one approaches each.

***

While the article is great, the title is not. In the print edition, it is called "Screening prostates at any age". The article is primarily concerned with whether older men should be screened at all.

Here comes the promised second installment to my recent post on anti-doping in which I argue that we should pay a lot more attention to false negatives. Here's the last paragraph:

For me, the difficult question in the statistics of anti-doping is whether the current system is too lenient to dopers. If the risk of getting caught is low, the deterrence value of drug testing is weak. In order to catch more dopers, we have to accept a higher chance (than 0.1%) of accusing the wrong athletes. That is the price to be paid.

Note that this post has been contributed to the Statistics Forum (link), a new blog sponsored by the American Statistical Association and edited by Andrew Gelman, and is reprinted here. You can click on the link above or scroll down to read the full post.

A reader of my blog, Joran E., pointed me to this great article (by Ross Tucker) covering one of the newer anti-doping measures (the biological passport), which links to this recent NYT article on two Italian cyclists found guilty of doping. While researching this latest development, I came across the latest legal maneuvers in the case of Alberto Contador, the Spanish cyclist and multiple winner of the Tour de France who tested positive after last year's victory and subsequently blamed a contaminated steak (I mentioned his case here last year).

Anti-doping provides a perfect backdrop to revisit all five statistical concepts that form the spine of my book, Numbers Rule Your World.

***

The most potent forms of doping these days are human growth hormone (HGH), EPO, and similar compounds that occur naturally in the body, so labs must seek to separate dopers from people who have "natural highs". By contrast, for compounds that don't occur naturally, such as the clenbuterol that ensnared Contador, even minute amounts can be proof of wrongdoing.

In order to know what level of a compound is "unnatural", statisticians need to establish what is natural. This is the concept behind Chapter 1: we calculate the "average" (natural) value, but focus on examining variations around the average.

Admitting that the natural value is not uniform across all people, statisticians determine different averages for different "types" of people; the simplest such subgroups would be male/female and age groups. The biological passport takes this idea to the extreme: each individual athlete is tracked over time to establish his or her own average. This puts into practice the concept behind Chapter 3, which is to avoid lumping together things that are different.

The cases of the two Italian cyclists are the first in which athletes have been punished based on evidence from the biological passport. Previously, the enforcers needed a failed drug test or a police bust to convict dopers.

Readers who like the materials in the Conclusion chapter (as related to Chapter 4) on the hematocrit test should definitely read Tucker's article for details on how the biological passport works.

***

The developments in the Contador case are very discouraging: the Spanish cycling federation showed an unwillingness to expose the biggest star of the sport, first by assessing a one-year ban (when the norm is two years), and most recently, overturning that shortened punishment. USADA, the US anti-doping body, expressed its concern here, and the reversal of the ban is under appeal.

The Spanish authority accepted the Contador camp's explanation of unintentional consumption of tainted beef as the reason for testing positive. Statisticians who believe in the logic of hypothesis testing will find such a conclusion absurd.

Let's walk through how we apply the logic as described in Chapter 5 of Numbers Rule Your World to this situation. Assuming that Contador did not dope, what is the chance that minute amounts of clenbuterol would be found in his body? Unfortunately for Contador and other athletes failing this drug test, the chance is vanishingly small.

Like most accused dopers, his camp did not challenge the presence of clenbuterol; they merely offered an alternative theory for why it was there. A large number of coincidences had to occur in order for their theory to be believed: beef had to be taken from Spain into France to serve Contador, and only Contador (not any of his teammates); a different source of beef must have been used on other days during the Tour on which Contador ate beef (since he tested negative on most other days of the tour); he was one unlucky fellow since anti-doping tests have high false negative rates in general, and he managed to test positive on that one time he ate the contaminated beef; he was also extremely unlucky since Europe banned the use of clenbuterol to raise cows in the 1990s, and the beef he ate on that one occasion had to have come from an unscrupulous farmer violating the ban.

Statisticians would politely listen to all that and declare "rare is impossible". It's much easier to believe that he was doping. (We would admit that there is a minuscule chance that the conclusion is incorrect; the chance is precisely that of those coincidences occurring.)
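As a back-of-envelope illustration of how those coincidences compound: the 1-in-83,000 figure is the rate of positive European clenbuterol samples mentioned earlier on this blog, while the other two probabilities are pure assumptions, made up for illustration only.

```python
# "Rare is impossible": multiplying a few (partly assumed) probabilities.
p_tainted_beef = 1 / 83_000   # tainted meat in Europe (cited rate)
p_story_details = 0.10        # assumed: beef imported, served only to him
p_caught_that_day = 0.50      # assumed: tests positive on the one exposure

joint = p_tainted_beef * p_story_details * p_caught_that_day
print(f"chance of the innocent story: about 1 in {round(1 / joint):,}")
```

Whatever the exact inputs, the product of several small probabilities is far smaller still, which is why the alternative explanation (doping) is so much easier to believe.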

***

Why does the scientific process disintegrate into this sort of he-said-she-said argument?

The concept behind Chapter 2 proves useful here. The statistical model that links the biological passport and/or the drug test to doping is one based on correlation, not causation. The passport or drug test does not provide direct evidence of doping (unlike a police bust). But as I point out in the book, correlational evidence can be powerful, and has been profitably used in all kinds of decisions. Because clenbuterol is not produced naturally in the human body, this test result is very close to causal evidence; it's less secure for things like EPO and HGH.

It's just more complicated when causal evidence is unavailable because people can now advance all sorts of hypotheses to explain the correlation. We then get story time, a phenomenon I frequently discuss on my blog. I'm happy to hear the stories but one must seek evidence to support these stories.

In the Contador case, for example, I'd like to see evidence that the steak was eaten, receipts from the vendor who imported the beef, documentation of which farm raised the cow, inspection of the farm to confirm that it used clenbuterol, traceback of beef from that farm to find the presence of clenbuterol, etc. In none of the reports on this case have I seen any of this evidence, and more disturbingly, the supporters of Contador don't appear to be asking any such questions. (See, for instance, Christian Josi on Huffington Post.)

***

Why would statisticians accept the chance of falsely accusing a clean athlete, however small that chance is? This is because we know that there is no such thing as a perfect test. The only test that will never yield a false positive is the test that never issues any positive results!

We already accept this type of situation in the Western legal system. The "beyond reasonable doubt" criterion in the courts does not guarantee no wrongful convictions. In fact, thanks to the work of groups such as the Innocence Project, we know that some unfortunate people are wrongfully convicted, sometimes serving long sentences for grievous crimes they did not commit.

As explained in my book (which I won't repeat here), the real issue in anti-doping is not about false positives but about false negatives. I fear that the entire system is so lenient toward dopers that they would take the (small) risk of detection. I'll make a case for this in a future post.