I'm examining some binary response data using a nonparametric approach. I
want to test the parametric approach. For each observation in the data, I
estimated a probability of response: call it
Pr[response][observation(i)].
I have the observed response for the data, call it
ObservedResponse[observation(i)].
I created a new table with 10 cells (rows) and three columns. In the first
cell, the first column counts the number of observations where the estimated
Pr[response][observation(i)] is in the interval [0,0.1), the second column
is the sum of the probabilities in this category, and the third column is
the count of observations where a response occurred. If I divide the second
and the third column by the first column, I have an average probability for
the range and an average frequency of response for the range. The other 9
cells of the table are constructed similarly, one for each further interval
of width 0.1. There are more than 20 observations in each cell in the new
table. I want to test the hypothesis that average frequency = average
probability. A regression model
average frequency = c + k * average probability + error
does not reject the hypothesis that c = 0 and k = 1. Is there a better test
strategy?
Would some chi-square variant work here?
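
For concreteness, the table and the regression described above could be
sketched roughly as follows in Python with NumPy. This is only an
illustration under assumptions: the data are simulated rather than my real
data, and the names calibration_table, prob, and response are made up for
this example.

```python
import numpy as np

def calibration_table(prob, response, n_bins=10):
    """Per-cell count, sum of probabilities, and number of responses.

    Hypothetical helper: cell j covers [j/n_bins, (j+1)/n_bins), and a
    probability of exactly 1.0 is placed in the last cell.
    """
    prob = np.asarray(prob, dtype=float)
    response = np.asarray(response, dtype=float)
    # Cell index 0..n_bins-1 for each observation
    idx = np.minimum((prob * n_bins).astype(int), n_bins - 1)
    count = np.bincount(idx, minlength=n_bins)                         # column 1
    prob_sum = np.bincount(idx, weights=prob, minlength=n_bins)        # column 2
    resp_count = np.bincount(idx, weights=response, minlength=n_bins)  # column 3
    return count, prob_sum, resp_count

# Simulated data (an assumption for illustration): responses are drawn
# with exactly the stated probabilities, so the hypothesis holds.
rng = np.random.default_rng(0)
prob = rng.uniform(0.0, 1.0, size=2000)
response = rng.random(2000) < prob

count, prob_sum, resp_count = calibration_table(prob, response)

# Divide columns 2 and 3 by column 1, skipping any empty cells
filled = count > 0
avg_prob = prob_sum[filled] / count[filled]
avg_freq = resp_count[filled] / count[filled]

# Least-squares fit of: average frequency = c + k * average probability
k, c = np.polyfit(avg_prob, avg_freq, 1)
print("c =", c, "k =", k)
```

With real data, prob and response would be replaced by the estimated
probabilities and the observed 0/1 responses. The fit treats every cell
equally even though the cells hold different numbers of observations.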
Richard Palmer