Optimal Matching and Social Sciences

Abstract: This working paper reflects on the conditions required to use optimal matching (OM) in the social sciences. Despite its striking success in biology, optimal matching was invented to solve problems in computer science, not in biology: OM is a family of distance concepts originating in information and coding theory, where it is known under various names, among them Hamming distance and Levenshtein distance. As a consequence, the success of this method in biology has nothing to do with any alleged similarity between the way it operates and biological processes; it stems from choices of parameters suited to the kind of materials and questions biologists face. As materials and questions differ in the social sciences, OM cannot be imported directly from biology. The very basic fact that sequences of social events are made not of biological matter but of events and time is crucial for the adaptation of OM: insertion and deletion operations warp time and are to be avoided if information regarding the social regulation of the timing of events is to be fully recovered. A formulation of substitution costs that takes advantage of the social structuration of time is proposed for sequences sharing the same calendar: dynamic substitution costs can be derived from the series of transition matrices describing social sub-rhythms. An application to the scheduling of work is proposed: using data from the 1985-86 and 1998-99 French time-use surveys, twelve types of workdays are uncovered. Their interpretability and quality, assessed visually through aggregate and individual tempograms and box plots, seem satisfactory.
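To make the idea of dynamic substitution costs concrete, the following minimal Python sketch compares equal-length sequences position by position, with no insertions or deletions, and derives a substitution cost at each time point from transition rates estimated on the sample. The cost rule used here (a constant minus the observed transition probabilities between the two states around time t), the function names, and the toy states "W" (work) and "H" (home) are all illustrative assumptions, not necessarily the exact formulation developed in the paper.

```python
from collections import Counter


def hamming(x, y):
    """Plain Hamming distance: number of positions where x and y differ."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(x, y))


def transition_probs(sequences, t):
    """Estimate p(state at t+1 = b | state at t = a) from the sample."""
    pair_counts = Counter((s[t], s[t + 1]) for s in sequences)
    from_counts = Counter(s[t] for s in sequences)
    return {(a, b): n / from_counts[a] for (a, b), n in pair_counts.items()}


def dynamic_costs(sequences):
    """Return cost(t, a, b), a substitution cost that varies with time t.

    Assumed rule (hypothetical): the more often a and b follow each other
    around time t in the sample, the cheaper it is to substitute them.
    """
    length = len(sequences[0])
    trans = [transition_probs(sequences, t) for t in range(length - 1)]

    def cost(t, a, b):
        if a == b:
            return 0.0
        # Transition matrices into t (from t-1) and out of t (to t+1);
        # only one of them exists at the first and last time points.
        neighbours = [trans[k] for k in (t - 1, t) if 0 <= k < length - 1]
        total = sum(p.get((a, b), 0.0) + p.get((b, a), 0.0) for p in neighbours)
        return 2.0 * len(neighbours) - total

    return cost


def dynamic_hamming(x, y, cost):
    """Sum of time-dependent substitution costs; time is never warped."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return sum(cost(t, a, b) for t, (a, b) in enumerate(zip(x, y)))


# Toy sample: four three-period "days" on a shared calendar.
sample = ["WWH", "WHH", "HHH", "WWW"]
cost = dynamic_costs(sample)
d_plain = hamming("WWH", "WHH")
d_dynamic = dynamic_hamming("WWH", "WHH", cost)
```

Because only substitutions are used, two sequences are always compared at the same clock times, which is what preserves information on the timing of events. A Levenshtein-type distance would additionally allow insertions and deletions, shifting events along the time axis.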