An ROC curve shows the TPR as a function of FPR. Neither of these measures exists in the context of regression, so there is no such thing as ROC curves for regression.
– Marc Claesen, Mar 5 '14 at 16:46

3 Answers

I don't have enough reputation to comment on Matt's answer, so I'm adding this as an "answer". Maybe I am wrong, but you can use regression as a classifier, like a logit/probit model, if you have a binary outcome (y variable). Then your "knob", as Matt called it, would be the threshold at which you choose to count your y* (your continuous prediction from, e.g., a linear regression) as y = 1. You can then use this threshold for an ROC curve.

Edit: I agree with the (*) edit of Matt's answer.

Example: there is a continuous variable x and a binary variable y. What you can do is run an ordinary regression of y on x. Then you compute your model's prediction for each individual as a function of x, calling these predictions y*. Then you look for a threshold c such that
$y_{\text{prediction}} = \begin{cases} 1 & \text{if } y^* > c \\ 0 & \text{otherwise} \end{cases}$

Then you can use this c for an ROC analysis. (Sorry for my bad formatting, it is my first post here.)
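The idea above can be sketched in a few lines of Python. This is a minimal illustration, not production code: the labels and continuous predictions y* are made up, and in practice y* would come from your fitted regression.

```python
# Hypothetical sketch: treat a regression's continuous predictions y*
# as classification scores by thresholding at c, then compute one
# (FPR, TPR) point per threshold. Data below is invented for illustration.

def roc_point(y_true, y_star, c):
    """Classify y* > c as 1, then return (FPR, TPR) for that threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, y_star):
        pred = 1 if s > c else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return fpr, tpr

y_true = [0, 0, 1, 1, 1]                 # binary outcomes
y_star = [0.1, 0.4, 0.35, 0.8, 0.9]      # continuous predictions from the regression

# Sweeping the threshold c traces out the ROC curve
for c in (0.0, 0.3, 0.5, 0.7):
    print(c, roc_point(y_true, y_star, c))
```

Sweeping c from the smallest to the largest prediction moves you from the (1, 1) corner of ROC space toward (0, 0), which is exactly the "knob" Matt describes.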

Unless I am missing something, this is basically shoe-horning a classification problem into a regression problem and then evaluating it using classification metrics.
– Marc Claesen, Mar 5 '14 at 17:46

Agreed; maybe I should make my footnote a little more general. In fact, I think some of the early credit modeling stuff worked like this--linear regression with the output "clamped" to within a certain range, followed by a decision rule.
– Matt Krause, Mar 5 '14 at 17:47

At least I can comment on my own answer :) @MarcClaesen: I don't say it is the best way to do it. But in our Econometrics class we still had it under "you can do it, with the advantage of applying a familiar tool to a new subject; now let's move on to more elaborate methods for this problem -> logit/probit".
– user2075339, Mar 5 '14 at 17:57

By the way, you should have enough reputation to comment everywhere now. Welcome to Cross Validated!
– Matt Krause, Mar 5 '14 at 20:39

A (binary) classification task has a small set of possible outcomes: you either correctly detect/reject something or you don't. The ROC curve measures the trade-off between these (specifically, between the false positive rate and the true positive rate). In this setting, there's no notion of "close-but-not-quite-right", but there is often a "knob" you can turn to increase your true positive rate (at the expense of more false positives too), or vice versa.

Regression typically(*) makes continuous predictions. With so many possible outcomes, it's vanishingly unlikely that the model will make an exact prediction (imagine predicting Amazon's annual sales down to the penny--it's not going to happen). There also isn't a TP/FP trade-off.

Instead, people measure a regression model's performance using a loss function, which describes how good or bad a certain amount of error is. For example, a common loss function is the mean squared error: $\frac{1}{N}\sum_{i=1}^{N} (\textrm{obs}_i - \textrm{pred}_i)^2$. This penalizes large errors heavily but tolerates smaller errors more.
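The mean-squared-error formula above can be computed directly; here is a minimal sketch with made-up observation and prediction values, just to show the quadratic penalty at work.

```python
# Minimal sketch of the mean-squared-error loss described above.
# The obs/pred values are invented for illustration.

def mse(obs, pred):
    """Mean squared error: (1/N) * sum of squared residuals."""
    n = len(obs)
    return sum((o - p) ** 2 for o, p in zip(obs, pred)) / n

obs  = [3.0, -0.5, 2.0, 7.0]
pred = [2.5,  0.0, 2.0, 8.0]
print(mse(obs, pred))  # the 1.0 residual contributes four times as much as each 0.5
```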

* In some cases, regression can be converted into a classification problem by adding a decision rule. For example, logistic regression, despite the name, is often used as a classifier. The "bare" logistic regression output is the probability that an example (i.e., a feature vector) belongs to the positive class: $P(\textrm{class} = + \mid \textrm{data})$.

However, you could use a decision rule to assign that example to a class. The obvious decision rule is to assign it to the more likely class: the positive one if the probability is at least one half, and the negative one otherwise. By varying this decision rule (e.g., an example is in the positive class if $P(\textrm{class}=+) > \{0.25, 0.5, 0.75, \textrm{etc.}\}$), you can turn the TP/FP knob and generate an ROC curve.
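To make the varying decision rule concrete, here is a small sketch. The predicted probabilities stand in for the output of a hypothetical fitted logistic regression; the labels are likewise invented for illustration.

```python
# Sketch of the decision rule described above: given predicted
# probabilities P(class=+|data) from a (hypothetical) logistic model,
# vary the cutoff to trade true positives against false positives.

probs  = [0.10, 0.30, 0.45, 0.60, 0.80, 0.95]  # made-up model outputs
labels = [0,    0,    1,    0,    1,    1]      # made-up true classes

def rates(probs, labels, cutoff):
    """Apply the rule P(class=+) > cutoff and return (FPR, TPR)."""
    tp = sum(p > cutoff and y == 1 for p, y in zip(probs, labels))
    fp = sum(p > cutoff and y == 0 for p, y in zip(probs, labels))
    pos = sum(labels)
    neg = len(labels) - pos
    return fp / neg, tp / pos

for cutoff in (0.25, 0.5, 0.75):
    print(cutoff, rates(probs, labels, cutoff))
```

Each cutoff yields one (FPR, TPR) pair; plotting the pairs over a fine grid of cutoffs gives the ROC curve.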

All that said, for most regression tasks, where you're predicting something continuous, ROC analysis is an odd choice.

Logistic regression is only a classification technique in conjunction with a rule that assigns a predicted probability greater than some threshold $x \in [0,1]$ to one outcome or the other. On its own, the logistic regression model is inference on the latent variable Pr(class membership = 1).
– Sycorax, Mar 5 '14 at 17:25

That's a fair point. Interestingly, you could say something similar about Naive Bayes (or many other maximum-likelihood classifiers). How do you feel about the edited version?
– Matt Krause, Mar 5 '14 at 17:58

I agree with the edit overall. However, I'm not sure what you mean by "you could assign an example to the more likely class if you wanted to perform a classification task instead." What example? Why assign it to a class? I think if you delete that text, it's a great addition to your answer.
– Sycorax, Mar 5 '14 at 18:18

I'm not sure where I picked this up, but I tend to call one "row" of the data matrix an example. For example, Fisher's Iris data set has 3 classes (the species of flower), 50 examples per class, and each example has 4 attributes (the length/width of the petal and sepal).
– Matt Krause, Mar 5 '14 at 18:26

(+1) After the rewrite, I understood -- it makes sense to me now. Regardless, good answer.
– Sycorax, Mar 5 '14 at 18:29

This is too late to answer the original question, but it's something I've been interested in, and while searching for a method of implementing ROC curves for regression I came across the following paper, which may be of some use to others wondering the same thing.