These numbers should look familiar. Return to the table of probabilities: the differences between adjacent cumulative probabilities equal the individual category probabilities (conditional on the covariates). The important thing to note is that in this example, since white2=0, the only component shifting the cumulative probabilities is the set of cutpoints. This raises a more general point: since the regression component, x’β, is the same at every cutpoint, changes in the cumulative probability curves are due solely to the cutpoints. This is what is known as the “parallel slopes” assumption.
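The parallel-slopes idea can be sketched numerically. A minimal Python illustration, using a hypothetical β and hypothetical cutpoints chosen purely for illustration (not the estimates below):

```python
from statistics import NormalDist

# Hypothetical values for illustration only (not this example's estimates):
Phi = NormalDist().cdf        # standard normal CDF (the probit link)
beta = -1.0                   # slope on a single covariate x
cutpoints = [-0.5, 0.1, 0.5]  # cutpoints tau_1, tau_2, tau_3

cum = {x: [Phi(tau - beta * x) for tau in cutpoints] for x in (0, 1)}
for x, probs in cum.items():
    print(f"x={x}: cumulative probs = {[round(p, 3) for p in probs]}")

# On the latent scale every curve shifts by exactly -beta when x moves
# from 0 to 1: (tau_j - beta*1) - (tau_j - beta*0) = -beta for all j,
# because beta carries no j subscript. That common shift is the
# "parallel slopes" assumption.
```

Because β has no j subscript, the J−1 cumulative curves move in lockstep; only the cutpoints distinguish them.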

Let’s turn now to ordinal probit. Here I estimate a model with the same covariate as before. The coefficients are now expressed on the scale of the inverse cumulative normal distribution, Φ^(-1) (the probit link, i.e. the normal quantile function).
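To be concrete about the link: Φ^(-1) is the functional inverse of the normal CDF, not the reciprocal 1/Φ. A quick standard-library Python check:

```python
from statistics import NormalDist

# Phi^(-1) is the quantile function: it undoes the CDF.
z = NormalDist()       # standard normal
p = z.cdf(1.0)         # Phi(1.0), about .8413
x = z.inv_cdf(p)       # Phi^(-1)(.8413...) recovers 1.0
print(p, x)
```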

. oprobit aff2 white2

Iteration 0:  log likelihood = -2256.8146
Iteration 1:  log likelihood = -2118.8459
Iteration 2:  log likelihood = -2118.7591

Ordered probit estimates                        Number of obs  =      2128
                                                LR chi2(1)     =    276.11
                                                Prob > chi2    =    0.0000
Log likelihood = -2118.7591                     Pseudo R2      =    0.0612

------------------------------------------------------------------------------
        aff2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      white2 |  -1.019296   .0612999   -16.63   0.000    -1.139441   -.8991502
-------------+----------------------------------------------------------------
       _cut1 |  -.4874818   .0552574          (Ancillary parameters)
       _cut2 |   .1004991   .0548217
       _cut3 |   .4979683   .0557226
------------------------------------------------------------------------------

I can use Stata to compute the probabilities (note again that I must supply one new variable name for each outcome category):

. predict pr1 pr2 pr3 pr4
(option p assumed; predicted probabilities)
(26 missing values generated)

For illustrative purposes, let’s tabulate them by the value of the race indicator:

. table pr1 white2

----------------------
Pr(aff2== |   white2
       1) |    0      1
----------+-----------
 .3129585 |  563
 .7025726 |       1,896
----------------------

. table pr2 white2

----------------------
Pr(aff2== |   white2
       2) |    0      1
----------+-----------
 .1660268 |       1,896
 .2270675 |  563
----------------------

. table pr3 white2

----------------------
Pr(aff2== |   white2
       3) |    0      1
----------+-----------
 .0668006 |       1,896
 .1507208 |  563
----------------------

. table pr4 white2

----------------------
Pr(aff2== |   white2
       4) |    0      1
----------+-----------
    .0646 |       1,896
 .3092532 |  563
----------------------

I could also compute these probabilities using Stata as a calculator. For the case when white2=1, we obtain:

. display norm(_b[_cut1]-_b[white2]*1)
.70257259

. display norm(_b[_cut2]-_b[white2]*1)-norm(_b[_cut1]-_b[white2]*1)
.16602683

. display norm(_b[_cut3]-_b[white2]*1)-norm(_b[_cut2]-_b[white2]*1)
.06680057

. display 1-norm(_b[_cut3]-_b[white2]*1)
.06460001

Verify that these agree with Stata’s computations. (They do).
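The same hand computation can be replicated outside Stata. A Python sketch, with the coefficient and cutpoints copied from the oprobit output above (Φ from the standard library's NormalDist):

```python
from statistics import NormalDist

# Estimates copied from the oprobit output; the case shown is white2 = 1.
Phi = NormalDist().cdf
b_white2 = -1.019296
cut1, cut2, cut3 = -.4874818, .1004991, .4979683

xb = b_white2 * 1
p1 = Phi(cut1 - xb)              # P(aff2 = 1)
p2 = Phi(cut2 - xb) - Phi(cut1 - xb)
p3 = Phi(cut3 - xb) - Phi(cut2 - xb)
p4 = 1 - Phi(cut3 - xb)          # P(aff2 = 4)
print(p1, p2, p3, p4)
# Agrees with Stata's .70257259, .16602683, .06680057, .06460001
# to display precision.
```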

One last little thing. There is an equivalence between the binary versions of these models and the ordinal versions. To “prove” this, return to logit. Let’s apply ologit to a binary dependent variable:

. ologit hihi white2

Iteration 0:  log likelihood =   -673.904
Iteration 1:  log likelihood = -660.43261
Iteration 2:  log likelihood = -659.29047
Iteration 3:  log likelihood = -659.28661

Ordered logit estimates                         Number of obs  =      2459
                                                LR chi2(1)     =     29.23
                                                Prob > chi2    =    0.0000
Log likelihood = -659.28661                     Pseudo R2      =    0.0217

------------------------------------------------------------------------------
        hihi |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      white2 |  -.8732477   .1561835    -5.59   0.000    -1.179362   -.5671336
-------------+----------------------------------------------------------------
       _cut1 |   1.857531   .1233321          (Ancillary parameter)
------------------------------------------------------------------------------

Now let’s apply logit:

. logit hihi white2

Iteration 0:  log likelihood =   -673.904
Iteration 1:  log likelihood = -660.43261
Iteration 2:  log likelihood = -659.29047
Iteration 3:  log likelihood = -659.28661

Logit estimates                                 Number of obs  =      2459
                                                LR chi2(1)     =     29.23
                                                Prob > chi2    =    0.0000
Log likelihood = -659.28661                     Pseudo R2      =    0.0217

------------------------------------------------------------------------------
        hihi |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      white2 |  -.8732477   .1561835    -5.59   0.000    -1.179362   -.5671336
       _cons |  -1.857531   .1233321   -15.06   0.000    -2.099257   -1.615804
------------------------------------------------------------------------------

What is the difference? The slope coefficient is identical; in the ordered logit model, however, Stata reports _cut1 = 1.857531 where logit reports _cons = -1.857531. That is, _cut1 = -b0. Why?
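One way to see the connection numerically: with only two categories, the ordered logit gives P(y = high) = 1 − Λ(τ1 − xβ) = Λ(xβ − τ1), while the binary logit gives P(y = 1) = Λ(b0 + xβ); the two coincide exactly when τ1 = −b0. A Python sketch using the estimates from the two outputs above (Λ is just the logistic CDF written out):

```python
from math import exp

# Estimates copied from the two outputs above:
beta = -.8732477    # identical in logit and ologit
cons = -1.857531    # logit intercept b0
cut1 =  1.857531    # ologit cutpoint, equal to -b0

def Lam(z):         # logistic CDF
    return 1 / (1 + exp(-z))

for x in (0, 1):
    p_logit  = Lam(cons + beta * x)       # logit:  P(hihi = 1)
    p_ologit = 1 - Lam(cut1 - beta * x)   # ologit: P(top category)
    print(x, p_logit, p_ologit)           # identical in each row
```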