Better Tools Needed to Measure Treatment Outcome

The need for better tools, as well as better use of existing tools, to measure treatment response in clinical trials was a principal focus of the 46th annual NIMH-sponsored NCDEU (New Clinical Drug Evaluation Unit) meeting, held June 12-15 in Boca Raton, Fla. Improved clinical research techniques are needed to better separate treatment effect from placebo response, to distinguish between active comparators, and to facilitate development of novel treatments, according to several presenters at the conference.

In the workshop titled "Enhancing Precision in Clinical Trials," Mark Rapaport, MD, of the Cedars-Sinai Medical Center in Los Angeles, challenged clinical researchers to ensure the validity of the measures that they use to establish treatment outcomes. He asked workshop participants to discuss the selection, development, and refinement of methods to evaluate acute and long-term outcomes of interventions for mood and anxiety disorders, as well as efforts to discern treatment effect while minimizing risks to participating subjects.

The Hamilton Rating Scale for Depression (HAM-D) was one of the established instruments revisited in discussions on improving the validity of clinical investigations. Although it has long been considered a standard for measuring severity of affective symptoms, its use to assess therapeutic intervention end points has been questioned. The problem, according to Ellen Frank, PhD, of the University of Pittsburgh, has less to do with the instrument than with its application.

"I knew Max Hamilton," Frank recounted. "I even had the privilege of training with him on his now iconic instrument and hearing him talk about what he had in mind when he developed it, and it had nothing to do with how we are using it today."

Frank contrasted the scale's current use--as a gauge of symptom change in inpatient and outpatient populations with mild to severe depression--with its original use to measure relative severity of symptoms in patients typically hospitalized for severe depression or melancholia. Researchers have been reluctant to stop using the instrument, according to Frank, because of its prominence and because it links new research to past studies.

Frank agreed with Rapaport on the need for new end points in studies of depression. The new measures should include improvement in function, she asserted. "What patients and patient advocacy groups tell us they want out of treatment are a home, meaningful relationships, and satisfying work," Frank noted.

With academic and pharmaceutical industry researchers now recognizing the need for more sensitive measures of interventions for depression, Frank suggested that a choice be made to direct resources toward developing either a single broad instrument or multiple scales for different forms of depression. In addition, Frank urged increased adoption of new technology for the development of better assessment instruments and processes.

One attempt to improve upon the HAM-D--undertaken by a collaboration of researchers from several pharmaceutical manufacturers and universities--was described in another section of the NCDEU conference by Nina Engelhardt, PhD, of MedAvante, Inc, in New Jersey. Engelhardt explained that the GRID-HAM-D offers a standardized scoring system that incorporates intensity and frequency of depressive symptoms into the severity score.

Engelhardt reported on validity testing of the GRID-HAM-D total and item scores drawn from a sample of 150 outpatients with depression. Inter-rater reliability was comparable for the GRID-HAM-D, the structured interview guide for the HAM-D, and the unstructured interview Guy version of the HAM-D. Engelhardt declared that the GRID-HAM-D was as reliable as the current HAM-D, with the advantages of a standardized scoring system, integrated conventions, and an interview guide.

"These features may provide specific benefits for typical raters who have less clinical assessment experience than the highly experienced raters in this study," Engelhardt indicated.