3 MAP vs Accuracy
Average precision is the average of the precision scores at the rank positions of each relevant document.
Ex: [figure: example ranking and its average precision]
Mean Average Precision (MAP) is the mean of the average precision scores over a group of queries.
A machine learning algorithm optimizing for accuracy might learn a very different model than one optimizing for MAP.
Ex: [figure: example ranking] has average precision of about 0.64, but a max accuracy of 0.8, vs. 0.6 in the ranking above.
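
The definitions on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the function names are hypothetical, a ranking is assumed to be given as a list of 0/1 relevance labels in rank order, and every relevant document is assumed to appear somewhere in the list.

```python
def average_precision(labels):
    """Average of precision@k over the ranks k that hold a relevant document.

    labels: 1 (relevant) or 0 (non-relevant), listed in rank order.
    Assumes all relevant documents for the query appear in the list.
    """
    hits, total = 0, 0.0
    for rank, is_relevant in enumerate(labels, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0


def mean_average_precision(rankings):
    """Mean of the average precision scores over a group of queries."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


if __name__ == "__main__":
    print(average_precision([1, 0, 1]))
    print(mean_average_precision([[1, 0, 1], [0, 1]]))
```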

7 Adapting to Average Precision
Let x denote the set of documents/query examples for a query.
Let y denote a (weak) ranking (each $y_{ij} \in \{-1, 0, +1\}$).
Same objective function: [equation]
Constraints are defined for each incorrect labeling y' over the set of documents x.
The joint discriminant score for the correct labeling must be at least as large as that of the incorrect labeling plus the performance loss.

8 Adapting to Average Precision
Minimize [equation]
subject to [equation]
where [equation]
and [equation]
The sum of slacks upper-bounds the MAP loss.
After learning w, a prediction is made by sorting on $w^T x_i$.
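
The equations on this slide did not survive extraction. As a hedged reconstruction, the block below writes out the usual margin-rescaled structural SVM program (in the style of Tsochantaridis et al. 2005), stated as a minimization and specialized to a pairwise joint feature map. The symbols $C$, $\xi_q$, $\Psi$, $\Delta$, $C^x$, $C^{\bar{x}}$ and the query index $q$ are assumptions introduced here, not notation confirmed by the slides.

```latex
\begin{aligned}
\min_{w,\ \xi \ge 0}\quad
  & \frac{1}{2}\lVert w \rVert^2 + \frac{C}{n}\sum_{q=1}^{n}\xi_q \\
\text{subject to}\quad
  & \forall q,\ \forall y' \ne y^{(q)}:\quad
    w^T \Psi\bigl(x^{(q)}, y^{(q)}\bigr) \;\ge\;
    w^T \Psi\bigl(x^{(q)}, y'\bigr) + \Delta\bigl(y^{(q)}, y'\bigr) - \xi_q \\
\text{where}\quad
  & \Psi(x, y') = \frac{1}{|C^x|\,|C^{\bar{x}}|}
    \sum_{i \in C^x}\ \sum_{j \in C^{\bar{x}}} y'_{ij}\,(x_i - x_j) \\
\text{and}\quad
  & \Delta(y, y') = 1 - \mathrm{AvgPrec}(y')
\end{aligned}
```

Here $C^x$ and $C^{\bar{x}}$ stand for the relevant and non-relevant documents of a query. Under this form each slack $\xi_q$ is at least the loss of the predicted ranking for query $q$, which is why the sum of slacks bounds the MAP loss, and $w^T \Psi(x, y')$ is largest when documents are sorted by $w^T x_i$.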

10 Too Many Constraints!
For Average Precision, the true labeling is a ranking where the relevant documents are all ranked in the front, e.g., [figure: perfect ranking]
An incorrect labeling would be any other ranking, e.g., [figure: imperfect ranking]
This ranking has Average Precision of about 0.8, with $\Delta(y, y') \approx 0.2$.
There is an exponential number of rankings, and thus an exponential number of constraints!
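
To make the growth concrete, the sketch below counts the candidate rankings for a single query. The function name is hypothetical. It reports both the number of full orderings and the smaller number of distinct relevant/non-relevant patterns, which is the count that matters for average precision and which still grows exponentially.

```python
from math import comb, factorial


def count_rankings(num_relevant, num_non_relevant):
    """Return (full orderings, distinct relevant/non-relevant patterns)."""
    n = num_relevant + num_non_relevant
    orderings = factorial(n)          # every permutation of the documents
    patterns = comb(n, num_relevant)  # which rank positions hold relevant docs
    return orderings, patterns


if __name__ == "__main__":
    for r, s in [(2, 3), (10, 10), (20, 80)]:
        orderings, patterns = count_rankings(r, s)
        print(f"{r} relevant, {s} non-relevant: "
              f"{orderings:.3e} orderings, {patterns:.3e} patterns")
```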

11 Structural SVM Training
STEP 1: Solve the SVM objective function using only the current working set of constraints.
STEP 2: Using the model learned in STEP 1, find the most violated constraint from the exponential set of constraints.
STEP 3: If the constraint returned in STEP 2 is more violated than the most violated constraint in the working set by some small constant, add that constraint to the working set.
Repeat STEPS 1-3 until no additional constraints are added. Return the most recent model that was trained in STEP 1.
STEPS 1-3 are guaranteed to loop for at most a polynomial number of iterations. [Tsochantaridis et al. 2005]
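
The three steps can be sketched as a generic working-set loop. This is an illustration under stated assumptions, not the authors' implementation: the solver, the separation oracle, and the violation measure are passed in as hypothetical callables, and the toy problem at the bottom (minimize w^2/2 subject to w >= c for each c) stands in for the real quadratic program only to make the loop runnable.

```python
def cutting_plane_train(solve, find_most_violated, violation,
                        epsilon=1e-3, max_iters=1000):
    """Working-set loop: solve, query the oracle, add the constraint, repeat."""
    working_set = []
    model = solve(working_set)                     # STEP 1 (empty working set)
    for _ in range(max_iters):
        candidate = find_most_violated(model)      # STEP 2
        worst_in_set = max(
            (violation(model, c) for c in working_set), default=0.0)
        if violation(model, candidate) <= worst_in_set + epsilon:
            break                                  # no constraint added: stop
        working_set.append(candidate)              # STEP 3
        model = solve(working_set)                 # back to STEP 1
    return model, working_set


# Toy stand-in (an assumption for illustration): each number c encodes the
# constraint w >= c, and the objective is w^2 / 2 with non-negative c.
ALL_CONSTRAINTS = [1.0, 2.5, 3.0, 0.5]


def toy_solve(working_set):
    return max(working_set, default=0.0)  # smallest w satisfying the set


def toy_violation(w, c):
    return max(0.0, c - w)


def toy_oracle(w):
    return max(ALL_CONSTRAINTS, key=lambda c: toy_violation(w, c))


if __name__ == "__main__":
    print(cutting_plane_train(toy_solve, toy_oracle, toy_violation))
```

In the toy run a single constraint ends up in the working set even though four exist, which mirrors the idea that most constraints are dominated by a few important ones.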

12 Illustrative Example
Original SVM Problem:
  Exponential number of constraints.
  Most are dominated by a small set of “important” constraints.
Structural SVM Approach:
  Repeatedly finds the next most violated constraint...
  ...until the set of constraints is a good approximation.

16 Finding Most Violated Constraint
Structural SVM is an oracle framework.
It requires a subroutine to find the most violated constraint.
The subroutine depends on the formulation of the loss function and the joint feature representation.
Exponential number of constraints!
An efficient algorithm exists in the case of optimizing MAP.

17 Finding Most Violated Constraint
Observation:
  MAP is invariant to the order of documents within a relevance class.
  Swapping two relevant or two non-relevant documents does not change MAP.
  The joint SVM score is optimized by sorting by document score, $w^T x$.
  The problem reduces to finding an interleaving between two sorted lists of documents.

22 Finding Most Violated Constraint
Start with the perfect ranking.
Consider swapping adjacent relevant/non-relevant documents.
Find the best feasible ranking of the non-relevant document.
Repeat for the next non-relevant document.
Never want to swap past the previous non-relevant document.
Repeat until all non-relevant documents have been considered.
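
The procedure above can be sketched as follows. This is a reconstruction under stated assumptions, not the released software: all function names are hypothetical, the objective is taken to be (1 - average precision) plus a pairwise score term normalized by the number of relevant/non-relevant pairs, and each class is assumed to contain at least one document. The per-swap gain comes from decomposing the average precision loss over (relevant, non-relevant) pairs. A brute-force search over all interleavings is included only as a cross-check for tiny inputs.

```python
from itertools import combinations


def average_precision(labels):
    hits, total = 0, 0.0
    for rank, is_relevant in enumerate(labels, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def joint_objective(labels, rel, non):
    """(1 - AP) + pairwise score term for an interleaving of two sorted lists."""
    pair_sum, seen_rel, seen_non = 0.0, 0, 0
    for is_relevant in labels:
        if is_relevant:
            for j in range(len(non)):
                # -1 if non-relevant doc j is ranked above this relevant doc
                sign = -1.0 if j < seen_non else 1.0
                pair_sum += sign * (rel[seen_rel] - non[j])
            seen_rel += 1
        else:
            seen_non += 1
    return (1.0 - average_precision(labels)) + pair_sum / (len(rel) * len(non))


def most_violated_ranking(rel_scores, non_scores):
    """Greedy interleaving: place each non-relevant doc independently."""
    rel = sorted(rel_scores, reverse=True)
    non = sorted(non_scores, reverse=True)
    R, N = len(rel), len(non)
    insert_before = []  # index of the relevant doc each non-relevant doc precedes
    for j in range(N):
        best_k, best_gain, gain = R, 0.0, 0.0  # k == R: stay below all relevant
        for i in range(R - 1, -1, -1):         # swap upward past relevant doc i
            a, b = i + 1, j + 1                # 1-based indices within each class
            loss_gain = (b / (a + b) - (b - 1) / (a + b - 1)) / R
            score_gain = -2.0 * (rel[i] - non[j]) / (R * N)
            gain += loss_gain + score_gain
            if gain > best_gain:
                best_k, best_gain = i, gain
        insert_before.append(best_k)
    labels, j = [], 0
    for i in range(R):
        while j < N and insert_before[j] <= i:
            labels.append(0)
            j += 1
        labels.append(1)
    labels.extend([0] * (N - j))
    return labels, rel, non


def brute_force_best(rel, non):
    """Exhaustive maximum of the objective; exponential, for checking only."""
    n = len(rel) + len(non)
    best = float("-inf")
    for positions in combinations(range(n), len(rel)):
        labels = [1 if p in positions else 0 for p in range(n)]
        best = max(best, joint_objective(labels, rel, non))
    return best


if __name__ == "__main__":
    labels, rel, non = most_violated_ranking([0.9, 0.2, 0.1], [0.8, 0.3, 0.0, -0.5])
    print(labels, joint_objective(labels, rel, non), brute_force_best(rel, non))
```

Placing each non-relevant document independently works in this sketch because the gain of a swap never increases for lower-scored non-relevant documents, so the chosen positions come out in order and no document needs to move past the previous non-relevant one.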

23 Quick Recap
SVM Formulation:
  SVMs optimize a tradeoff between model complexity and MAP loss.
  Exponential number of constraints (one for each incorrect ranking).
  Structural SVMs find a small subset of important constraints.
  Requires a sub-procedure to find the most violated constraint.
Find Most Violated Constraint:
  Loss function is invariant to re-ordering of relevant documents.
  SVM score imposes an ordering of the relevant documents.
  Finding an interleaving of two sorted lists.
  Loss function has certain monotonic properties.
  Efficient algorithm.

27 Moving Forward
Approach also works (in theory) for other measures.
Some promising results when optimizing for NDCG (with only 1 level of relevance).
Currently working on optimizing for NDCG with multiple levels of relevance.
Preliminary MRR results not as promising.
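
For reference, the two other measures named on this slide can be sketched as below. This uses one common textbook form of NDCG (linear gain, log2 rank discount) and the reciprocal rank of a single query, so the exact variants used in the experiments are an assumption, and the function names are hypothetical. A loss for either measure could be formed as 1 minus the score, in the same way as for average precision.

```python
import math


def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


def reciprocal_rank(labels):
    """1 / rank of the first relevant document; MRR averages this over queries."""
    for rank, is_relevant in enumerate(labels, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


if __name__ == "__main__":
    print(ndcg([0, 1]), reciprocal_rank([0, 0, 1]))
```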

28 Conclusions
Principled approach to optimizing average precision (avoids difficult-to-control heuristics).
Performs at least as well as alternative SVM methods.
Can be generalized to a large class of rank-based performance measures.
Software available at