The proliferation of data breach disclosure (security breach notification) laws has prompted a flurry of lawsuits filed by alleged victims of identity theft against corporations that suffer a breach. Using data collected from Westlaw and PACER, we perform docket analysis on a sample of data breach lawsuits filed from 1999 to 2010. This method of empirical legal research involves collecting, mining, and coding relevant data from court documents (such as complaints and judicial rulings). While much economic and legal scholarship has been written about data breaches, breach disclosure legislation, and the difficulties that consumers face in breach litigation, to our knowledge this is the first research that attempts to empirically analyze the lawsuits themselves.

In this working paper, we present preliminary results showing that the trend in known lawsuits appears generally to follow (and lag) the trend in reported data breaches. Since about mid-2006, the time taken for plaintiffs to organize and file a complaint has been steadily increasing, though the time to dispose of these suits has been steadily decreasing. On average, the overall duration of a data breach lawsuit is 15 months. We also find that the settlement rate of data breach lawsuits in our sample (26%) is substantially lower than estimates found in other legal scholarship (67%). The average number of records lost is statistically much higher for known lawsuits than for the sample of all reported breaches (9.5 million compared with 340,000). Financial institutions are over-represented in breach litigation relative to the sample of known breaches, while government agencies and educational institutions are under-represented. Finally, we use a probit regression to estimate the probability that a data breach will result in a lawsuit, and a multinomial logit model to examine which characteristics of a lawsuit affect its outcome.