If more data is better than less, what could be better than having access to … all the data?

Since around the beginning of the year, I’ve been researching it up with Bankable Frontier Associates, focusing on low income savers. There are two particular multi-year engagements going on, called InFocus and GAFIS, both supported by the Bill & Melinda Gates Foundation. As part of the engagements, we get data dumps from financial institutions around the world – 4 for InFocus and 5 for GAFIS. This includes client information, account information, records of every single transaction within the analysis window, and running account balance data.

Literally all the data on savers' accounts that we could lay our hands on from the institutions' MIS.

My job is to beat this data till it decides to play nice and cough up useful information. (Given the effort that goes into cleaning and harmonizing data that often involves millions of accounts and hundreds of millions of transactions per institution, I assure you that this characterization is not overly dramatized!) This information is combined with all manner of other data, such as financial statements, demand-side surveys (i.e., surveys of the clients themselves), qualitative interviews, etc.

I should note that these are the sanitized cliff-note versions of the voluminous reports the individual institutions get. Part of the deal for their engaging with BFA was the preservation of confidentiality, which means the vast majority of the analysis and recommendations cannot be shared publicly. Still, what is in these three Notes should give a decent idea of both the theoretical basis and the general thrust of the analytics supporting the projects.

I’ll take a closer look at some of the things that I found fascinating in upcoming posts. In the meantime, dive into the Notes and lemme know what you find most interesting!