There are many ways to assemble numbers about your costs and outcomes (effectiveness and benefits). Some may be more useful than others, depending on your program and your funding situation. Each method of assembling cost and outcome data, together with its possible determinants, serves a slightly different purpose. Graphs usually show the best picture of the costs paid for treatment and the results of treatment. Tables and ratios give a simpler picture but may bury or eliminate important information that could change your decision about which treatment procedures to use or which programs to fund. Mathematical models, which are wonderfully complex, are beyond the scope of this manual.

Measure Effectiveness

Traditionally, effectiveness is analyzed one measure at a time to provide a comprehensive picture of what a program is doing with its resources. To analyze overall effectiveness, standardized outcome measures can be multiplied by a weighting to show their relative importance. These standardized, importance-weighted measures of effectiveness then can be combined into an overall measure of effectiveness. This method of integrating specific measures of effectiveness into one overall measure was described by Yates and associates (1979) in a cost-effectiveness analysis of residential treatments for predelinquent youth.

Weight Effectiveness Measures

Even with the same units or same scale, the same improvement on two measures may not be valued the same. For example, many counselors would view a 50-percent improvement in drug-free days to be more important than a 50-percent improvement in gainfully employed days.

The importance of each measure can be rated by persons involved in funding decisions. For instance, eight staff and two community representatives could rate each measure of effectiveness for importance on a 10-point scale, with "1" meaning "much less important than the other measures" and "10" meaning "much more important than the other measures." A 10-point scale has the advantage of not providing a midpoint that can be used to say "as important as other measures." Having an even number of rating points forces raters to decide whether a measure is less or more important than the other measures. Table 15 shows a simple example with two measures of outcome and two raters.

Table 15. Importance Ratings on Two Outcome Measures

| Rater | Outcome Measure | Importance Rating (1-10) |
|---|---|---|
| A | Drug-free days | 10 |
| | Employed Days | 6 |
| B | Drug-free days | 8 |
| | Employed Days | 4 |

Because some raters might use one part of the scale more than others, the first step in standardizing the ratings is to standardize ratings between raters. This sounds more complicated than it is. Find the average rating for a rater, and then divide each rater's ratings by his or her average rating. The result will be numbers slightly greater or less than 1.00 (table 16). An importance number (weighting) greater than 1.00 indicates that the measure is considered more important than the other measures. An importance weighting less than 1.00 indicates that the measure is considered less important than the other measures.

Table 16. Importance Weighting Index

| Rater | Outcome Measure | Importance Rating (1-10) | Importance Weighting |
|---|---|---|---|
| A | Drug-free days | 10 | 10 ÷ 8 = 1.25 |
| | Employed Days | 6 | 6 ÷ 8 = 0.75 |
| | Average | (10 + 6) ÷ 2 = 8 | (1.25 + 0.75) ÷ 2 = 1.00 |
| B | Drug-free days | 8 | 8 ÷ 6 = 1.33 |
| | Employed Days | 4 | 4 ÷ 6 = 0.67 |
| | Average | (8 + 4) ÷ 2 = 6 | (1.33 + 0.67) ÷ 2 = 1.00 |

Do the same calculations for each rater. Now average these importance weightings across all raters. The result is the average importance weighting, a numeric consensus on how important each measure is (table 17).

Table 17. Average Importance Weighting

| Rater | Outcome Measure | Importance Weighting |
|---|---|---|
| A | Drug-free days | 1.25 |
| | Employed Days | 0.75 |
| B | Drug-free days | 1.33 |
| | Employed Days | 0.67 |
| Average for all raters | Drug-free days | (1.25 + 1.33) ÷ 2 = 1.29 |
| | Employed Days | (0.75 + 0.67) ÷ 2 = 0.71 |
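
The two standardizing steps above can be sketched in a few lines of Python. This is only an illustration using the ratings from table 15; the function names `rater_weightings` and `average_weightings` are made up for this example and are not part of any published procedure.

```python
# Ratings from table 15: rater -> {outcome measure: importance rating, 1-10}
ratings = {
    "A": {"Drug-free days": 10, "Employed days": 6},
    "B": {"Drug-free days": 8, "Employed days": 4},
}


def rater_weightings(rater_ratings):
    """Divide each rating by the rater's own average rating."""
    average = sum(rater_ratings.values()) / len(rater_ratings)
    return {measure: rating / average for measure, rating in rater_ratings.items()}


def average_weightings(all_ratings):
    """Average the per-rater weightings for each outcome measure."""
    per_rater = [rater_weightings(r) for r in all_ratings.values()]
    measures = per_rater[0].keys()
    return {m: sum(w[m] for w in per_rater) / len(per_rater) for m in measures}


weights = average_weightings(ratings)
for measure, weight in weights.items():
    print(f"{measure}: {weight:.2f}")
```

Because each rater's weightings average 1.00, the averaged weightings also average 1.00, whatever part of the scale each rater prefers.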

You are now ready to calculate the composite measure of overall effectiveness. Multiply the average importance weighting for a measure by the effectiveness value for that measure, add up the products, and divide by the number of measures (table 18). The result is a single composite index of program effectiveness for the patient. You can then average across patients to describe the effectiveness of the program as a whole.

Table 18 shows that without importance weightings, Patients A and B would show the same improvement. With importance weightings, Patient A's improvement on the more important outcome measure becomes clear. Table 18 also shows that, after averaging across patients and outcome measures, the original average percentage change in outcome measures (60 percent) reappears. The effect of averaging across outcome measures is to cancel out the effects of importance weightings. So, why did we do all those weightings to begin with? These importance-weighted outcome measures are important to retain when examining cost-outcome relationships per patient. That way, the processes, procedures, and resources that increase the more important outcome of drug-free days will be given more weight.

Table 18. Composite Index of Program Effectiveness

| Patient | Outcome Measure | Percentage Change | Average Importance Weighting | Importance-weighted Percent Change |
|---|---|---|---|---|
| A | Drug-free days | 110% | 1.29 | 141.9% |
| | Employed Days | 10% | 0.71 | 7.1% |
| | Average | 60% | 1.00 | 74.5% |
| B | Drug-free days | 10% | 1.29 | 12.9% |
| | Employed Days | 110% | 0.71 | 78.1% |
| | Average | 60% | 1.00 | 45.5% |
| Average for Program | Drug-free days | 60% | 1.29 | 77.4% |
| | Employed Days | 60% | 0.71 | 42.6% |
| | Average across outcome measures | 60% | 1.00 | 60% [(77.4 + 42.6) ÷ 2] |
| | Average across patients | | | 60% [(74.5 + 45.5) ÷ 2] |
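
As a rough check on table 18, the composite index can be computed as follows. This sketch assumes the rounded weightings from table 17 and takes the composite to be the average of the weighted percentage changes, as the table does; `composite_index` is a name invented for this example.

```python
# Average importance weightings from table 17 (rounded to two decimals).
weights = {"Drug-free days": 1.29, "Employed days": 0.71}

# Percentage change per patient and outcome measure, from table 18.
percent_change = {
    "A": {"Drug-free days": 110, "Employed days": 10},
    "B": {"Drug-free days": 10, "Employed days": 110},
}


def composite_index(changes, weights):
    """Average of the importance-weighted percentage changes for one patient."""
    weighted = [changes[m] * weights[m] for m in weights]
    return sum(weighted) / len(weighted)


per_patient = {p: composite_index(c, weights) for p, c in percent_change.items()}
program_average = sum(per_patient.values()) / len(per_patient)

for patient, index in per_patient.items():
    print(f"Patient {patient}: {index:.1f}%")
print(f"Program: {program_average:.1f}%")
```

The per-patient values differ even though both patients have the same unweighted average, which is the information worth keeping when you relate costs to outcomes patient by patient.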

These importance weights are, of course, subjective. They are no more subjective, however, than most psychological measures. "Subjective" does not mean that these measures are inherently bad or that they cannot be used. Importance weights simply describe what most people do when they read program evaluation reports with several measures of effectiveness, according to psychological theory backed by laboratory research (Anderson and Shanteau 1970). Most people do not attach equal importance to each of the many effectiveness measures available in most program evaluations. The ratings recommended here just make explicit the psychology of the readers of an evaluation report. If one measure of effectiveness is being ignored, that will be apparent in the ratings.

Within specific interest groups, these importance weights may be quite similar. If a policymaker wishes to use a particular set of importance weights, that certainly can be done. Or, representatives of the various interest groups each can be polled for their weightings of the importance of different measures. Better yet, perhaps the different degrees to which each of these outcomes predicts long-term abstinence from substance abuse or long-lived contributions to society could be measured by statistical analyses of large data sets and then applied as importance weights here. Cost-effectiveness analyses, such as those reported by Yates and associates (1979), have included these importance weights obtained by surveying treatment staff for ratings of the relative importance of different outcome measures.

Newman Tables

One way to measure outcomes so that they reflect the contribution of the program is to compare outcome measures before versus after treatment, or before versus after treatment begins. For example, the number of criminal acts could be compared for 1-year periods before and after treatment begins. Another way is to construct a Newman table that displays the number of patients who began treatment at one level of functioning (or other outcome measure) and ended treatment at another level.

A basic form of the Newman table is displayed in table 19. The numbers in the table show the number of patients who began treatment at one level of functioning (indicated by the labels in the rows) and ended treatment at another level of functioning (indicated by the labels in the columns). The numbers in parentheses show the percentages of all 90 patients that are in each cell of the table.

Table 19 shows that 11 patients who began treatment at the "Poor" level ended treatment at the same level and 2 patients got worse. Functioning improved for 71 (45 + 9 + 17) patients. Generally, Newman tables segment outcome measures into more levels (5 to 10) both before and after treatment. These more detailed tables provide many more cells, enabling a more specific analysis of the effects of treatment for patients who begin treatment at different levels of functioning (Carter and Newman 1976; Davis and Yates 1982).

Table 19. Sample Newman Table

| Pretreatment functioning | Posttreatment: Poor | Posttreatment: OK | Posttreatment: Great |
|---|---|---|---|
| Poor | 11 (12%) | 45 (50%) | 9 (10%) |
| OK | 2 (2%) | 5 (6%) | 17 (19%) |
| Great | 0 | 0 | 1 (1%) |
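
The counts discussed for table 19 can be reproduced with a short tally. The sketch below enters the cell counts directly; with real data you would more likely build the same counter from one (pretreatment, posttreatment) pair per patient. The level names and their ordering are taken from the table.

```python
from collections import Counter

LEVELS = ["Poor", "OK", "Great"]  # ordered from lowest to highest functioning

# Cell counts from table 19, keyed by (pretreatment, posttreatment) level.
cells = Counter({
    ("Poor", "Poor"): 11, ("Poor", "OK"): 45, ("Poor", "Great"): 9,
    ("OK", "Poor"): 2, ("OK", "OK"): 5, ("OK", "Great"): 17,
    ("Great", "Great"): 1,
})

total = sum(cells.values())
rank = {level: i for i, level in enumerate(LEVELS)}

# A patient improved if the posttreatment level ranks above the pretreatment level.
improved = sum(n for (pre, post), n in cells.items() if rank[post] > rank[pre])
worsened = sum(n for (pre, post), n in cells.items() if rank[post] < rank[pre])
unchanged = total - improved - worsened

print(f"total={total} improved={improved} worsened={worsened} unchanged={unchanged}")
for (pre, post), n in cells.items():
    print(f"{pre} -> {post}: {n} ({100 * n / total:.0f}%)")
```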

Before-after comparisons of outcome measures may exaggerate the impact of treatment, since treatment often is started because patient behavior is becoming progressively more severe. Without treatment, patient behavior might have continued to worsen, of course, but it also could have ebbed in severity because of processes unrelated to treatment. Criminal behavior, for example, seems to decline as patients grow older. Before-after comparisons of the same outcome measures for individuals who did not receive treatment, but who are as similar as possible to the patients, can help discern how much of the apparent improvement in patient behavior is due to other factors.

Calculate Procedure Dose

By examining the time and other resources used for each procedure, it may be possible to assess the degree to which a procedure was implemented. Some procedures, such as certain individual and group therapies, may have no set criteria that indicate whether the procedure was fully implemented. The extent to which these procedures are implemented is indeterminate. The dose for open-ended procedures can be calculated for the month by minutes of implementation or number of sessions (of standard duration). More hours spent in individual therapy indicate a larger dose of that therapy.

If a set duration or set number of sessions is prescribed as the desired, complete implementation of the procedure, the dose of the procedure actually received by the patient can be captured as a percentage. The percentage of procedure implementation can be calculated for each patient by dividing the total time (or other measure of procedure implementation) by the amount of time the procedure was supposed to be implemented.

Other procedures, such as drug education and more formally prescribed therapies, may be characterized with checklists of specific steps, points, presentations, demonstrations, or other specific operations to be performed by service providers. The degree to which an operations-based procedure actually was implemented can be calculated according to the percentage of points on the checklist that were addressed or the number or duration of presentations delivered versus desired. This requires a carefully operationalized description of the procedure.

Some procedures, such as methadone maintenance, can be described in absolute terms according to the total dose received for the month. Some providers may find it more useful to specify the extent to which methadone maintenance was delivered according to the actual versus ideal number of days that methadone was received.

Even if patients are prescribed different amounts of specific treatment procedures, the percentage implementation of each procedure can be calculated easily for each patient and then averaged across patients for each procedure. Sample calculations for these procedures are described in table 20.

There are other ways to calculate the average implementation of a procedure. The averaging shown in table 20 considered each procedure to be equal. The percentages shown in the rightmost column in the last four rows of the table could be weighted by the amount of time that patients were supposed to spend in the procedure.

Table 20. Sample Calculations for Dose of Operations-Based Procedures

| Patient | Procedure | Prescribed hours | Actual hours | Percent procedure implemented |
|---|---|---|---|---|
| A | Group counseling | 10 | 9 | 90% (9 ÷ 10 × 100) |
| | Relapse prevention | 10 | 6 | 60% (6 ÷ 10 × 100) |
| | Individual counseling | 4 | 4 | 100% (4 ÷ 4 × 100) |
| | Case management | 2 | 2 | 100% (2 ÷ 2 × 100) |
| B | Group counseling | 10 | 11 | 110% (11 ÷ 10 × 100) |
| | Relapse prevention | 5 | 3 | 60% (3 ÷ 5 × 100) |
| | Individual counseling | 5 | 4 | 80% (4 ÷ 5 × 100) |
| | Case management | 2 | 2 | 100% (2 ÷ 2 × 100) |
| Average | Group counseling | 10 [(10 + 10) ÷ 2] | 10 [(9 + 11) ÷ 2] | 100% [(90% + 110%) ÷ 2] |
| | Relapse prevention | 7.5 [(10 + 5) ÷ 2] | 4.5 [(6 + 3) ÷ 2] | 60% [(60% + 60%) ÷ 2] |
| | Individual counseling | 4.5 [(4 + 5) ÷ 2] | 4 [(4 + 4) ÷ 2] | 90% [(100% + 80%) ÷ 2] |
| | Case management | 2 [(2 + 2) ÷ 2] | 2 [(2 + 2) ÷ 2] | 100% [(100% + 100%) ÷ 2] |
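
The percentage implementation can also be computed directly from the prescribed and actual hours, which guards against slips in hand arithmetic. The following is a minimal sketch using the hours in table 20; the data layout and the function name `percent_implemented` are assumptions made for this example.

```python
# (prescribed hours, actual hours) per patient and procedure, from table 20.
hours = {
    "A": {"Group counseling": (10, 9), "Relapse prevention": (10, 6),
          "Individual counseling": (4, 4), "Case management": (2, 2)},
    "B": {"Group counseling": (10, 11), "Relapse prevention": (5, 3),
          "Individual counseling": (5, 4), "Case management": (2, 2)},
}


def percent_implemented(prescribed, actual):
    """Dose received as a percentage of the dose prescribed."""
    return actual / prescribed * 100


per_patient = {
    patient: {proc: percent_implemented(*h) for proc, h in procs.items()}
    for patient, procs in hours.items()
}

# Average the percentages across patients, procedure by procedure.
procedures = list(hours["A"])
average = {
    proc: sum(per_patient[p][proc] for p in per_patient) / len(per_patient)
    for proc in procedures
}

for proc, pct in average.items():
    print(f"{proc}: {pct:.0f}%")
```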

To do this, multiply the number of hours prescribed for each procedure by the percentage implementation of that procedure, add up the products, and divide by the total number of hours prescribed across all procedures. Working with the averages for Patients A and B, shown in the bottom four rows of table 20, the weights would be the average prescribed hours for the four procedures (10, 7.5, 4.5, and 2), and the divisor would be their total (24 hours).
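
The weighted average described above can be sketched as follows. The example works from the prescribed and actual hours in table 20 rather than from the percentage column, and the helper `mean` and the variable names are assumptions made for illustration.

```python
# (prescribed, actual) hours for Patients A and B, from table 20.
hours = {
    "Group counseling": [(10, 9), (10, 11)],
    "Relapse prevention": [(10, 6), (5, 3)],
    "Individual counseling": [(4, 4), (5, 4)],
    "Case management": [(2, 2), (2, 2)],
}


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Average prescribed hours and average percent implemented per procedure.
avg_prescribed = {p: mean([pre for pre, _ in h]) for p, h in hours.items()}
avg_percent = {p: mean([act / pre * 100 for pre, act in h]) for p, h in hours.items()}

# Weight each procedure's percentage by its average prescribed hours.
weighted_sum = sum(avg_prescribed[p] * avg_percent[p] for p in hours)
total_prescribed = sum(avg_prescribed.values())
weighted_average = weighted_sum / total_prescribed

unweighted_average = mean(list(avg_percent.values()))
print(f"weighted: {weighted_average:.1f}%  unweighted: {unweighted_average:.1f}%")
```

Weighting by prescribed hours lets procedures that take up more of the treatment plan count for more, so a shortfall in a 10-hour procedure lowers the result more than the same shortfall in a 2-hour procedure.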

You could begin using graphs to explore cost-outcome relationships by graphing data for individual patients for one time period. Outcomes are on the vertical axis because that, traditionally, is the place for the variable of primary interest. Costs are on the horizontal axis because that, traditionally, is the axis for the variable that is thought to determine the variable of primary interest. Cost and outcome data can be assembled in a table or spreadsheet with columns for costs and outcomes and rows for each patient (table 21).

Table 21. Treatment Cost and Drug-Free Days per Patient

| Patient | Costs of treatment for February | Drug-free days in February |
|---|---|---|
| A | $512 | 23 |
| B | $716 | 19 |
| C | $632 | 20 |
| etc. | | |

Cost-Outcome Newman Tables

Cost information also can be combined with outcome data in a Newman table. Generally, the total or average cost of moving patients from one level of functioning to another (or of maintaining the same level of functioning) is displayed as shown in table 22.

Table 22. Cost-Outcome Newman Table

| Pretreatment functioning | Posttreatment: Poor | Posttreatment: OK | Posttreatment: Great |
|---|---|---|---|
| Poor | 11 ($798) | 45 ($643) | 9 ($890) |
| OK | 2 ($152) | 5 ($672) | 17 ($801) |
| Great | 0 | 0 | 1 ($139) |
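
A cost-outcome Newman table also can be summarized numerically. The sketch below assumes that each dollar figure in table 22 is the average cost per patient in that cell, so that multiplying by the number of patients recovers the total spent on the cell; the summary statistics chosen here are only examples.

```python
# Table 22 cells: (pretreatment, posttreatment) -> (patients, average cost per patient).
cells = {
    ("Poor", "Poor"): (11, 798), ("Poor", "OK"): (45, 643), ("Poor", "Great"): (9, 890),
    ("OK", "Poor"): (2, 152), ("OK", "OK"): (5, 672), ("OK", "Great"): (17, 801),
    ("Great", "Great"): (1, 139),
}
rank = {"Poor": 0, "OK": 1, "Great": 2}

total_patients = sum(n for n, _ in cells.values())
total_cost = sum(n * avg for n, avg in cells.values())

# Total spent on patients whose level of functioning rose during treatment.
improved_cost = sum(
    n * avg for (pre, post), (n, avg) in cells.items() if rank[post] > rank[pre]
)

print(f"average cost per patient: ${total_cost / total_patients:.2f}")
print(f"share of spending on patients who improved: {100 * improved_cost / total_cost:.0f}%")
```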

These cost-outcome indices do not reflect what could be high variability among individuals in the value of resources that were devoted to their treatment. Within each outcome cell, there may be considerable variability in patients' responses to treatment. Having only three levels of functioning also is a problem. For instance, a patient might begin treatment at the low end of the "OK" level of functioning, end at the high end of the "OK" level of functioning, and be tallied in table 22 as "no change."

Including cost information in a Newman table may confuse some funders or lead them to hasty and erroneous decisions. A funder might, for example, express amazement that an average of $672 was spent on the five patients who began and ended treatment at the same "OK" level. Why, it might be asked, was all that money spent yet no change was produced?

Several answers are possible. One alluded to earlier is that change did occur, but the outcome measure was not sensitive enough to detect the change. Another answer is that, without treatment, these patients' functioning would have deteriorated. A control group might allow for a more direct test of this explanation, although the typical course of patients' addictive behavior may suggest that most will get worse without treatment.

Yet another answer is that it may indeed be possible that treatment did not work for these five patients. Although definite costs were involved, the outcomes of treatment cannot be guaranteed. Many other factors affect patient functioning. Should treatment be responsible for factors that are beyond the reach of treatment to change?

A final, possibly more constructive answer to hard questions about funds spent with no apparent result is that, although most patients seem to benefit from treatment, we need to find out what procedures and processes combined to produce outcomes that were less than what we wanted.

Cost-Outcome Ratios: Patient Level

Several types of ratios can be calculated for programs, all of which may be interesting but which represent with varying degrees of accuracy the cost-effectiveness or cost-benefit of the program.

At the patient level, cost-outcome ratios can be formed by dividing the total cost of treating the patient by one outcome measure, or by a composite outcome measure. For example, if the cost of treatment for Patient A for one month is $80 and the change in drug-free days is 110 percent, the simple cost per percentage point of change in drug-free days is $80 ÷ 110 = $0.73 for the month.

If we had divided the cost of a month of treatment by the number of drug-free days during the month (say, 23 verified drug-free days), we would have another type of cost/outcome ratio ($80 ÷ 23 = $3.48 per drug-free day). This would probably underestimate the cost of a drug-free day, unless it can be assumed that there could be no drug-free days without treatment.

These ratios can be calculated for each patient for a specific period of treatment, such as the first 3 months, or for the entire treatment. They also can be averaged for patients, like any other statistic. However, the ratios do not describe the cost-outcome relationship over a wide range of costs. For example, finding a cost-outcome ratio of $0.73 per percentage change in drug-free days for Patient A might make some people think that $0.73 x 100 = $73 per month would produce a 100-percent change in drug-free days for Patient A. This is unlikely to be the case.

A patient-level benefit/cost ratio also could be calculated by dividing the total value of criminal behaviors and social services avoided by the cost of treatment during an appropriate period. If the sum of benefits for these cost savings was $160 for Patient A, and the cost of treatment was $80, the benefit/cost ratio would be 2.0. The question of what an appropriate period is requires discussion. Basically, the cost should be for the period of treatment to which the cost-savings outcomes can be attributed.
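
The three patient-level ratios worked through above can be written as two small functions. This is a sketch using the Patient A figures from the text; the function names are invented for this example, and the zero checks are an added precaution rather than part of the method.

```python
def cost_per_unit(cost, outcome):
    """Cost-outcome ratio: dollars per unit of outcome."""
    if outcome == 0:
        raise ValueError("ratio is undefined when the outcome is zero")
    return cost / outcome


def benefit_cost_ratio(benefits, cost):
    """Dollars of benefit per dollar of treatment cost."""
    if cost == 0:
        raise ValueError("ratio is undefined when the cost is zero")
    return benefits / cost


cost = 80.0  # one month of treatment for Patient A
print(f"${cost_per_unit(cost, 110):.2f} per percentage point of change")
print(f"${cost_per_unit(cost, 23):.2f} per drug-free day")
print(f"benefit/cost ratio: {benefit_cost_ratio(160.0, cost):.1f}")
```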

Cost-Outcome Ratios: Program Level

The average patient-level cost-effectiveness ratio is a measure of program cost-effectiveness. Sometimes, however, a few patients may have especially low or high cost-effectiveness ratios that may throw off the average. In such cases, other statistics, such as the middle cost-effectiveness ratio for all patients (the median) or the most common cost-effectiveness ratio for patients (the mode) might represent the cost-effectiveness of the program better than the average.

Another approach to measuring the cost-effectiveness of a program is to divide the total cost of the program by an outcome measure, or a composite outcome measure. For example, if the total cost of the aftercare substance abuse treatment program is $5,400 and the total number of drug-free days generated for all patients during the period covered by that cost was 100, the cost would be $5,400 ÷ 100 = $54 per drug-free day.
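
To see how a few extreme patients can throw off the average, compare the mean and the median of a set of patient-level ratios. The ratios below are hypothetical values made up for this illustration; only the $5,400 and 100 drug-free days come from the text.

```python
import statistics

# Hypothetical patient-level ratios in dollars per drug-free day; the last
# patient is an outlier with very few drug-free days.
ratios = [22.0, 25.0, 27.0, 30.0, 31.0, 240.0]

print(f"mean:   ${statistics.mean(ratios):.2f}")
print(f"median: ${statistics.median(ratios):.2f}")

# Program-level ratio from the totals given in the text.
program_cost = 5400
total_drug_free_days = 100
print(f"program: ${program_cost / total_drug_free_days:.2f} per drug-free day")
```

Here the single outlier more than doubles the mean while leaving the median close to the typical patient. The mode is usually informative only after ratios are rounded or grouped into ranges, since exact dollar ratios rarely repeat.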

Cost-effectiveness ratios could be calculated for the other outcome measures as well by dividing the program cost by the effectiveness measure. That assumes, however, that the total cost of treatment was devoted to drug-free days. It would be far more accurate to calculate the portion of program resources that were directed specifically toward the generation of drug-free days. Cost-procedure-process-outcome analysis does this. A composite cost-effectiveness ratio could, however, be calculated by dividing the total program cost by the composite effectiveness index.

As the past few paragraphs implied, calculation of cost-effectiveness ratios is not always simple, and the results can be misleading. Table 23 describes these ratios, notes whether they are true cost-outcome ratios or another type of ratio, and also notes their sensitivity to how cost and outcome are defined and measured.

Table 23. Cost-Outcome Ratios

| Cost measure | Outcome measure | Resulting ratio | A cost-outcome ratio? | Sensitivities |
|---|---|---|---|---|
| Program level of specificity | | | | |
| Cost of treating all patients who began treatment during a particular period (say, the second year of operations) | Number of patients who stayed drug free for a specific period after treatment and who began treatment during the same period for which the costs are measured | Average cost per successfully treated patient (criterion for success: staying drug free for the specific period) | Yes | Very sensitive to length of drug-free period and definition of "after treatment" |
| Cost of treating all patients who began treatment during a particular period (say, the second year of program operations) | Drug-free days. Does not distinguish between one patient being drug free for a long period and many patients being drug free for short periods | Average cost per drug-free day | Yes | Sensitive to definition of drug-free day and when outcome measurement begins |
| Cost of treating all patients who began treatment during a particular period (say, the second year of program operations) | Monetary benefits (cost savings, income production) of treatment | Total benefit/cost ratios | Yes | Sensitive to assumptions made when estimating monetary value of different outcomes (e.g., cost of a theft) |
| Individual patient level of specificity | | | | |
| Cost of treatment for individual patients | Number of drug-free days for individual patients | Average cost per drug-free day, calculated separately for each patient, then averaged across patients | Yes | Does not distinguish between long and short periods of abstinence. Does not indicate variability of treatment cost-effectiveness between patients |
| Cost of treatment for individual patients | Total monetary value of treatment outcomes for individual patients | Dollars spent per dollar produced | Yes | Sensitive to assumptions made when estimating monetary value of different outcomes |
| False cost-outcome ratios | | | | |
| Cost of treating all patients who began treatment | Number of patients beginning treatment | Average cost per patient of all patients who began treatment | No | This is a useful measure of cost but not of outcome. |
| Cost of treating all patients who completed treatment | Number of patients who completed treatment | Average cost per successful patient | No | Including only the cost for patients who completed treatment underestimates the cost |
| Cost of operating the treatment program for a short period, say one month | Number of patients seen one or more times by the program during the same short period | Average cost per patient per short period | No | This is a potentially useful measure of cost, sometimes called slot cost, but it involves no outcome data. |

Many of these ratios also can be calculated for individual patients, and then averaged. Generally, a cost-outcome ratio is more accurate and more representative of program functioning if it is calculated for individual patients and then averaged. Mathematically, the average of the patient-level ratios can differ from the ratio obtained by dividing the average cost per patient by the average outcome per patient.
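
The difference between the two calculations is easy to see with the three patients listed in table 21. The following sketch treats cost per drug-free day as the ratio of interest; the variable names are chosen for this example.

```python
# Cost and drug-free days for February, from table 21.
patients = {"A": (512, 23), "B": (716, 19), "C": (632, 20)}

# Ratio for each patient, then averaged across patients.
ratios = [cost / days for cost, days in patients.values()]
mean_of_ratios = sum(ratios) / len(ratios)

# Average cost divided by average outcome (the same as total cost / total days).
total_cost = sum(cost for cost, _ in patients.values())
total_days = sum(days for _, days in patients.values())
ratio_of_means = total_cost / total_days

print(f"mean of patient ratios: ${mean_of_ratios:.2f} per drug-free day")
print(f"ratio of averages:      ${ratio_of_means:.2f} per drug-free day")
```

The two figures differ because averaging the patient-level ratios gives every patient equal weight, whereas dividing the averages gives more weight to patients with more drug-free days.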

Costs per Procedure or Process

The most complete and useful cost-outcome analysis includes information about treatment procedures and about internal patient processes that occur between costs and outcomes. These analyses necessarily are performed at the level of individual patients. This level of detail enables us to see clearly what treatment interventions affect which processes in which patients. This information can help you decide how to deliver the most effective treatment to the most people given your budget constraints.