How to Make Living Wills Work: A Big Data Solution to a Big Data Problem

Data lakes -- massive repositories of information that consolidate structured, unstructured, batched and streaming data into a single table -- could enable integration of different banks' living wills, providing transparency that would help identify hidden risks in the financial system.

Mark Herman, Booz Allen Hamilton

The "living wills" now being submitted by large financial firms contain the detailed and highly critical data federal regulators would use to dismantle the banks' operations and protect the country's financial system, should one or more firms liquidate in a severe crisis. However, it is one thing to have that data, and quite another to be able to use it effectively to create stability out of turmoil.

While the Dodd-Frank financial reform law that mandated these living wills is well designed, the capability to use living wills effectively is hampered by current computing technologies. This is, essentially, a big data problem. Current technologies -- developed long before big data arrived on the scene -- allow us to analyze only a small portion of a big data set at any one time. Information tends to be stored in discrete data structures, or data silos, which are not easy to connect.

Although a single living will can be analyzed as a stand-alone entity, bringing together even just two living wills for a combined analysis -- essential to understanding the related impacts of bank failures -- would require a painstaking, time-intensive data-preparation process. More than 100 financial firms are required to have living wills. Linking them all together for analysis, to understand the myriad interconnections and interdependencies, would simply be technically infeasible with current tools. And yet that deep insight is what the spirit of Dodd-Frank requires -- and what is needed to prevent the kind of cascading failure that led to the 2008 financial crisis.

This challenge requires the ability to consolidate an organization's entire repository of information, so that it is all connected, and immediately available for analysis. Thanks to recent advances in data science, the task can be accomplished through a new approach that brings together multiple sources of information in what is known as the "data lake." An industry-recognized term, the data lake is a massive repository of information that consolidates data -- including structured, unstructured, batched and streaming -- into a single table. The data lake eliminates the once-siloed, cumbersome data-preparation process, making information easily accessible to the analysts responsible for mining it.

Because the data lake can hold an almost unlimited amount of information, it would enable regulators to integrate all 100-plus living wills, and more if necessary. And with the advanced analytics that sit on top of the data lake, so to speak, the interdependencies of the banks would become transparent to regulators.

This transparency would also help regulators identify hidden risk that may be building across the financial system. Some experts believe that a major cause of the financial crisis was that a number of large banks were relying on the same questionable risk model. With the data lake, regulators would be able to see when an institutional practice might be creating collective risk -- even if individual banks cannot recognize it within their own organizations.
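To make the idea concrete, here is a minimal sketch of the kind of cross-bank query this implies. It assumes each living will has already been reduced to simple records naming the bank and the risk model it relies on. The bank names, model names, field names and the `shared_risk_models` helper are all hypothetical, invented purely for illustration.

```python
from collections import defaultdict

# Hypothetical, made-up records standing in for entries extracted from
# living wills. None of this is real data.
records = [
    {"bank": "Bank A", "risk_model": "Model X"},
    {"bank": "Bank B", "risk_model": "Model X"},
    {"bank": "Bank C", "risk_model": "Model Y"},
    {"bank": "Bank D", "risk_model": "Model X"},
]


def shared_risk_models(records, min_banks=2):
    """Return the risk models used by at least `min_banks` distinct banks."""
    banks_by_model = defaultdict(set)
    for record in records:
        banks_by_model[record["risk_model"]].add(record["bank"])
    return {
        model: sorted(banks)
        for model, banks in banks_by_model.items()
        if len(banks) >= min_banks
    }


# Flag any model that three or more banks depend on.
flagged = shared_risk_models(records, min_banks=3)
```

No single bank's records reveal the concentration. It shows up only when the records sit side by side.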

And because the data lake can be continuously updated, regulators would have the ability to see in real time, or near real time, exactly what is happening when a firm unwinds -- what impact its actions are having on other banks, and how that effect is rippling through the larger financial system. Regulators would be able to take quick action to prevent a firm's problems from spreading to others -- before it is too late -- and perhaps set the firm on a different path to keep it alive.

By building models and scenarios into the data lake, regulators would have the ability to quickly answer any number of "what if?" questions, whether they are stress testing living wills, or supervising their execution during an actual liquidation.
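A "what if?" question of this kind can be sketched as a toy contagion model. It assumes the combined data yields two things: each bank's loss-absorbing capital, and how much each bank stands to lose if a given counterparty fails. The failure rule (a bank fails once its accumulated losses reach its capital), the `simulate_failure` function and all of the figures are illustrative assumptions, not a description of any regulator's actual model.

```python
def simulate_failure(exposures, capital, initial_failure):
    """Return the set of banks that fail after `initial_failure` collapses.

    exposures[creditor][debtor] is the amount the creditor loses if the
    debtor fails. A creditor fails once its accumulated losses reach its
    capital (a deliberately simple, assumed rule).
    """
    failed = {initial_failure}
    frontier = [initial_failure]
    losses = {bank: 0.0 for bank in capital}
    while frontier:
        debtor = frontier.pop()
        for creditor, owed in exposures.items():
            if creditor in failed:
                continue
            loss = owed.get(debtor, 0.0)
            losses[creditor] += loss
            if loss and losses[creditor] >= capital[creditor]:
                failed.add(creditor)
                frontier.append(creditor)
    return failed


# Hypothetical figures, in arbitrary units.
capital = {"A": 10, "B": 5, "C": 8, "D": 20}
exposures = {
    "B": {"A": 6},          # B loses 6 if A fails
    "C": {"A": 3, "B": 5},  # C loses 3 if A fails and 5 if B fails
    "D": {"C": 4},          # D loses 4 if C fails
}

what_if_a_fails = sorted(simulate_failure(exposures, capital, "A"))
```

In this toy example the failure of A brings down B directly and C only indirectly, through the combined losses from A and B. That second-order effect is invisible when each living will is analyzed on its own.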

Moving to a new approach such as the data lake requires a new mindset about data analytics. Many organizations tend to focus on computer infrastructure and data storage -- how each might be expanded, for example, as analytic needs change. But if we are to solve big data "problems" like the living wills, we must focus not just on bigness but on diversity, finding ways to harness and extract value from all of the data that we are collecting.