The GTx Cancer Muscle-Wasting Drug Studies Will Fail. Here's Why

MEMPHIS, Tenn. (TheStreet) -- Before the end of the current quarter, GTx (GTXI) is expected to announce top-line results from two phase III studies of a cancer-related muscle wasting drug known as enobosarm. And once again, one of my favorite fund manager sources is short GTx because he expects the enobosarm studies to fail.

This is the same fund manager who was most recently short Ziopharm (ZIOP) and Celsion (CLSN) in anticipation of negative clinical trial data. He was correct both times. Past performance doesn't always predict future results, but his track record is pretty damn good.

GTx shares are up 54% year to date, trading most recently at $6.96. Analysts who follow the company are bullish and have been telling clients to expect positive enobosarm study results. In other words, this is the ideal situation to make a contrarian bet on GTx, which is what my fund manager source is doing. He's prohibited by his fund's rules from being quoted by name.

Let's go over the short thesis on GTx and enobosarm, but first, some important background information.

Enobosarm is a selective androgen receptor modulator (SARM), a class of drugs designed to mimic the positive effects of testosterone or steroids but without the side effects or safety risks that can come with prolonged exposure.

GTx is developing enobosarm as a treatment for cancer cachexia, a progressive loss of weight and muscle mass which afflicts cancer patients. There are no currently approved treatments for cancer cachexia, but not for lack of trying. Over the years, several different classes of drugs, including steroids and testosterone, have been tried but none have demonstrated a significant clinical benefit. The challenge is finding a drug that will both increase muscle mass and improve muscle performance or function.

FDA will not approve a drug for cancer cachexia based on an increase in muscle mass alone. The drug must also improve a patient's life by demonstrating an increase in muscle function.

The design of GTx's two phase III cancer cachexia studies is fairly simple. Patients undergoing first-line chemotherapy treatment for non-small cell lung cancer with an accompanying diagnosis of cachexia are randomized to receive a daily, 3 mg dose of enobosarm or a placebo. After three months, the patients are assessed on two co-primary endpoints: 1) change in lean body mass (measured using an x-ray scan); and 2) physical function, defined as a 10 percent or greater improvement in the stair climb power test.

Enobosarm has to meet both co-primary endpoints with statistical significance for the studies to be successful.
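To make the success rule concrete, here is a minimal sketch of the two ideas just described: the 10 percent responder definition for stair climb power and the requirement that both co-primary endpoints hit significance. The function names and the 0.05 significance threshold are assumptions for illustration; the article does not state the alpha level or how GTx's statistical plan is written.

```python
def is_responder(baseline_watts: float, followup_watts: float,
                 threshold: float = 0.10) -> bool:
    """Return True if stair climb power improved by at least `threshold`
    (10 percent by default) relative to the patient's own baseline."""
    if baseline_watts <= 0:
        raise ValueError("baseline power must be positive")
    relative_change = (followup_watts - baseline_watts) / baseline_watts
    return relative_change >= threshold


def study_succeeds(p_lean_body_mass: float, p_physical_function: float,
                   alpha: float = 0.05) -> bool:
    """Co-primary rule: BOTH endpoints must be statistically significant.
    alpha=0.05 is an assumed threshold, not taken from the article."""
    return p_lean_body_mass < alpha and p_physical_function < alpha


if __name__ == "__main__":
    # Hypothetical patient: 50 watts at baseline, 56 watts at three months.
    print(is_responder(50.0, 56.0))
    # Hypothetical p values: lean body mass wins, physical function misses.
    print(study_succeeds(0.01, 0.20))
```

The second call is the scenario the short thesis is betting on: a win on lean body mass alone still counts as a failed study.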

Signals pointing to the failure of phase III clinical trials can often be found by digging through data from previously conducted clinical trials. In the case of enobosarm, GTx conducted a phase IIb study several years ago which enrolled 159 patients with assorted forms of cancer (a subset had lung cancer). All the patients entered the study with a diagnosis of cachexia. As reported by GTx, treatment with enobosarm dosed at 1 mg and 3 mg increased lean body mass and improved muscle function compared to placebo.

Enobosarm's benefit was greatest at the 3 mg dose and in the subset of lung cancer patients, so this dose and patient group were used in the ongoing phase III studies.

Read the Lancet Oncology paper closely and you'll discover that the study was wildly skewed in favor of enobosarm over placebo. Patients with more advanced disease or less muscle function at baseline were randomized disproportionately to the control (placebo) arm, which artificially elevated the effect of enobosarm.

A close read of the study also reveals that a good number of patients were outliers, meaning they performed either really well or really poorly on muscle function tests. Coincidentally or not, the study's endpoints were assessed using mean (average) values, not the median, which in this case, worked out incredibly well for enobosarm relative to placebo.

But when the effect of outlier patients is minimized by analyzing data using median values -- the more widely accepted method -- enobosarm's benefit over placebo disappears.

Take a look at this slide, which comes from GTx and depicts results from the phase IIb study isolated only for the non-small cell lung cancer patients. Remember, this is the group of cancer patients in which GTx says enobosarm had the most benefit, which is why the ongoing phase III studies are only enrolling lung cancer patients.

Sixty-one lung cancer patients were enrolled in this study out of 159 total, or 38 percent. But notice only half the enrolled lung cancer patients -- 31 to be exact -- were included in the efficacy analysis.

According to the Lancet Oncology article, there were 10 patients and 13 patients with lung cancer randomized to the placebo and 3 mg enobosarm arms of the study, respectively. This means GTx claims positive results for enobosarm in lung cancer based on data from roughly 15 percent of patients in the phase IIb study.
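The arithmetic behind these subgroup figures is easy to check. The short sketch below uses only the patient counts quoted above; the variable names are mine.

```python
# Patient counts as quoted in the article.
total_enrolled = 159      # all patients in the phase IIb study
lung_enrolled = 61        # lung cancer patients enrolled
lung_in_efficacy = 31     # lung cancer patients in the efficacy analysis
lung_placebo = 10         # lung cancer patients randomized to placebo
lung_3mg = 13             # lung cancer patients randomized to 3 mg enobosarm

lung_share = lung_enrolled / total_enrolled
analyzed_share = lung_in_efficacy / lung_enrolled
comparison_share = (lung_placebo + lung_3mg) / total_enrolled

print(f"Lung cancer share of enrollment: {lung_share:.1%}")        # 38.4%
print(f"Lung patients in efficacy analysis: {analyzed_share:.1%}")  # 50.8%
print(f"Placebo + 3 mg lung patients vs. all enrolled: {comparison_share:.1%}")  # 14.5%
```

The last figure, 23 of 159 patients, is the roughly 15 percent cited above.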

Why would enobosarm work better against cachexia in lung cancer patients than in other forms of cancer? GTx hasn't offered an explanation but the Lancet Oncology paper shows clearly that the lung cancer patient subgroup was skewed in enobosarm's favor.

Six of ten, or 60 percent, of the lung cancer patients randomized to placebo entered the study with stage IV disease -- the most aggressive and advanced stage of lung cancer. By comparison, five of 13 lung cancer patients, or 38 percent, randomized to the 3 mg enobosarm treatment had stage IV disease.

The placebo lung cancer patients were a lot sicker and therefore had greatly reduced physical (muscle) function compared to the enobosarm patients. This difference easily explains the enobosarm benefit on the stair climb power test observed in the phase IIb study.

Looking at the Lancet Oncology paper again, 39 percent of patients enrolled in the phase IIb study had colon cancer -- the second-largest group -- yet here the playing field was tilted against enobosarm. Of the colon cancer patients randomized to placebo, 53 percent had stage IV disease, compared to 75 percent with stage IV disease in the 3 mg enobosarm arm.
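A quick way to see the tilt in each tumor type is to express the stage IV imbalance as a gap in percentage points between arms. This sketch uses the lung cancer counts and the colon cancer percentages quoted above (the colon cancer counts are not given here, so the rates are entered directly). The helper name and sign convention are my own.

```python
def stage4_gap_points(placebo_rate: float, drug_rate: float) -> float:
    """Stage IV gap in percentage points.
    Positive: placebo arm is sicker (tilt favors the drug).
    Negative: drug arm is sicker (tilt works against the drug)."""
    return (placebo_rate - drug_rate) * 100


# Lung cancer: 6 of 10 on placebo vs. 5 of 13 on 3 mg enobosarm.
lung_gap = stage4_gap_points(6 / 10, 5 / 13)

# Colon cancer: 53 percent on placebo vs. 75 percent on 3 mg enobosarm.
colon_gap = stage4_gap_points(0.53, 0.75)

print(f"Lung cancer stage IV gap:  {lung_gap:+.1f} points")   # +21.5
print(f"Colon cancer stage IV gap: {colon_gap:+.1f} points")  # -22.0
```

The two gaps are about the same size but point in opposite directions, which is the pattern described above.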

Now we know why GTx claims enobosarm "works" in lung cancer but not in colon cancer. The actual data show this efficacy signal to be a mirage.

Let's take a closer look at the data on the co-primary endpoints of the enobosarm phase IIb study. Here's the chart depicting change in total lean body mass, taken from the Lancet Oncology paper:

Not much to say here except increasing lean body mass is relatively easy to do, but won't get a cachexia drug approved by the FDA. Note the p values in the chart compare each arm against itself (baseline vs. Day 113). It's also noted in the fine print that the increases in lean body mass between the 1 mg vs. placebo arms and 3 mg vs. placebo arms were also statistically significant (p values of 0.006 and 0.041, respectively).

This chart, again from the Lancet Oncology paper, depicts data for the second co-primary endpoint -- change in stair climb test at day 113:

I'm going to focus discussion on stair climb power since that's the test being used in the phase III studies. Again, there was a statistically significant improvement in both the 1 mg and 3 mg enobosarm arms when each is compared against its own baseline. (p values of 0.0008 and 0.0006, respectively.) The change in the placebo arm was not statistically significant.

But notice there is no mention of statistical significance or p values for a comparison between the enobosarm patients and those treated with placebo with respect to stair climb power. Had this inter-group difference been statistically significant, GTx would have mentioned it.

Remember above when I said GTx manipulated the phase IIb data using mean values (instead of medians) to make enobosarm look more effective than it really is? Here's how that works.

First, this is how the authors of the Lancet Oncology paper describe the observed muscle function benefit:

This looks impressive. The 3 mg dose of enobosarm produced an almost 22 percent mean improvement in stair climb power compared to just a 4.7 percent mean improvement for the placebo patients.

The stair climb power data in the chart right above depicts the same benefit in a different way. Note the mean 2.21-watt improvement in stair climb power for placebo patients, which is much smaller than the mean 16.81-watt improvement for the enobosarm 3 mg patients.

But it's impossible to interpret these so-called improvements in muscle function without first knowing the relative strength (or weakness) of each patient group before the study began.

This is where it gets interesting, because the mean baseline stair climb power is not disclosed anywhere in the Lancet Oncology paper. If there was an imbalance in mean stair climb power at baseline, it could explain away the observed enobosarm benefit.

Sure enough, that's exactly what happened in the phase IIb study. Mean baseline stair climb power -- at the start of the study -- was 46 watts for placebo patients compared to 80 watts for the 3 mg enobosarm patients. These numbers can be calculated using the data available in the Lancet Oncology paper.
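The calculation itself isn't shown here. One plausible way to back into baseline figures like these, and this is an assumption rather than a description of the actual method, is to divide each arm's mean absolute change in watts by its mean percentage change. That shortcut is only approximate, because a mean of individual percentage changes is not the same thing as a ratio of means. It also depends on rounded inputs: "almost 22 percent" is entered below as 22.

```python
def implied_baseline_watts(mean_change_watts: float,
                           mean_pct_change: float) -> float:
    """Approximate baseline power implied by an absolute change and a
    percentage change: baseline ~= change / (percent / 100)."""
    if mean_pct_change == 0:
        raise ValueError("percentage change must be non-zero")
    return mean_change_watts / (mean_pct_change / 100.0)


# Figures quoted in the article (rounded).
placebo_baseline = implied_baseline_watts(2.21, 4.7)      # about 47 watts
enobosarm_baseline = implied_baseline_watts(16.81, 22.0)  # about 76 watts

print(f"Implied placebo baseline:        {placebo_baseline:.0f} watts")
print(f"Implied 3 mg enobosarm baseline: {enobosarm_baseline:.0f} watts")
print(f"Ratio: {enobosarm_baseline / placebo_baseline:.2f}x")
```

With these rounded inputs the sketch lands near, not exactly on, the 46 and 80 watts cited above. A percentage slightly under 22 would push the enobosarm figure closer to 80. Either way, the implied gap between the two arms at baseline is large.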

In other words, the mean muscle function of enobosarm patients entering the study was almost twice that of the placebo patients. No wonder the enobosarm patients performed better on the stair climb power test over time -- they had a huge head start.

Take another look at the chart above depicting stair climb power. Notice that in the placebo arm of the study, there's a big difference between the mean value (2.21 watts) and the median value (11.34 watts). I don't want to get too bogged down in mathematical concepts, but this means many of the patients in the placebo arm performed very poorly on the baseline stair climb power test. They're outliers.

By comparison, the mean stair climb power of patients in the 3 mg enobosarm arm (16.81 watts) is higher than the median value (12.84 watts). In other words, many of the patients here performed really well on the stair climb power test. They're outliers in the other direction.
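For readers who want to see how a handful of outliers can pull a mean away from a median, here is a small illustration. The numbers are entirely hypothetical, invented to mimic the pattern described above; they are not patient data from the phase IIb study.

```python
from statistics import mean, median

# HYPOTHETICAL changes in stair climb power (watts), made up for illustration.
# "Placebo-like" arm: most patients improve modestly, two decline sharply.
placebo_like = [-40.0, -25.0, 8.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
# "Drug-like" arm: most patients improve modestly, two improve a lot.
drug_like = [9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 30.0, 45.0]

for label, changes in (("placebo-like", placebo_like), ("drug-like", drug_like)):
    print(f"{label:>12}: mean {mean(changes):5.2f} W, median {median(changes):5.2f} W")

# Gap between arms, measured two ways.
mean_gap = mean(drug_like) - mean(placebo_like)
median_gap = median(drug_like) - median(placebo_like)
print(f"Gap by mean:   {mean_gap:.2f} W")
print(f"Gap by median: {median_gap:.2f} W")
```

In this made-up example the typical patient in each arm changes by about the same amount, so the gap by median is small. The gap by mean is several times larger because two low outliers drag one arm down and two high outliers lift the other.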

You should see now why GTx chose to analyze the phase IIb co-primary endpoints using mean values. Including the outliers in the analysis tilts the entire study in favor of enobosarm. The drug's benefit looks impressive on paper but it's a false signal.

Think about it this way: Cancer drug studies use median overall survival for a reason, not mean overall survival.

A larger, better-balanced study -- presumably like the phase III studies which enroll a total of 600 patients -- will be a more level playing field where enobosarm's true benefit to cancer cachexia patients will be more modest, or disappear altogether, relative to placebo.

Strong evidence of enobosarm's lack of benefit for cancer cachexia patients can actually be seen in the phase IIb study by analyzing the stair climb power endpoint using median values. Calculating results this way minimizes the effect of the outlier patients.

The median improvement in stair climb power for placebo patients: 7.5 percent.

Adam Feuerstein writes regularly for TheStreet. In keeping with company editorial policy, he doesn't own or short individual stocks, although he owns stock in TheStreet. He also doesn't invest in hedge funds or other private investment partnerships.