Plaid – A Modern API for Card Data

Our work in payment data cuts across many areas. Who has what data? How is it best monetized? How can it be put to use for better decision making? How can it be augmented and enhanced?

It’s in this last area of “enhancements” that we’ve become fascinated with a new startup called Plaid. They are making the first serious attempt we’ve seen in a while to rethink the quality of transaction data.

Every consumer knows how confusing it can be to open up a statement or view their recent activity and find a bunch of transactions for purchases that are not obvious at first glance or maybe not even obvious after a detailed investigation. What is this? Is this a gas station? Was that me or somebody else using my card? And that’s with human reasoning. It gets a lot worse when you try to couple that level of transaction data quality with analytics and third party apps.

The root cause is the minimal amount of data that is available to describe the merchant in the card systems. The merchant name is often truncated to fit the fixed formats of card messaging. Magnolia Home Theater becomes MHT. Then there are merchant category codes (MCCs) that are defined by the card networks but actually set by the merchant’s acquirer. The codes standardize the category descriptions, but it’s the acquirers that force fit each merchant into exactly one of the categories that’s supposed to define the merchant’s primary line of business.

Think about that lovely meal you had eating out last night. From the perspective of your card issuer, you spent money at a merchant with a truncated name, located in a particular zip code, that has been assigned to one of two categories:

5812 – Eating Places, Restaurants

5814 – Fast Food Restaurants

Your card issuer also knows how much you spent with that merchant. From this minimal information the issuer (or their analytics provider) can figure out how many times a week you eat out, how much you spend, how often you frequent this merchant, etc. Service providers like Yodlee can provide third parties with access to this data as long as the card holder provides permission and their online banking credentials. You see this Yodlee service used all the time by companies like Personal Capital, BillGuard, and Outright.
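To make the issuer's view concrete, here is a minimal Python sketch of the kind of analysis that is possible from nothing more than a truncated name, a zip code, an MCC, and an amount. The record layout, field names, and sample transactions are invented for illustration; they are not an actual issuer or Yodlee format.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date

# The two dining MCCs mentioned in the text above.
DINING_MCCS = {"5812", "5814"}


@dataclass(frozen=True)
class RawTransaction:
    # Hypothetical record layout; all field names are assumptions.
    posted: date
    merchant_name: str  # truncated name as it appears on the statement
    zip_code: str
    mcc: str
    amount: float


def dining_summary(transactions):
    """Summarize eating-out behavior using only the minimal card data."""
    dining = [t for t in transactions if t.mcc in DINING_MCCS]
    if not dining:
        return {"visits": 0, "total_spend": 0.0,
                "visits_per_week": 0.0, "top_merchant": None}
    first = min(t.posted for t in dining)
    last = max(t.posted for t in dining)
    # Treat any span shorter than a week as one week to avoid inflated rates.
    weeks = max((last - first).days / 7, 1.0)
    top_merchant, _ = Counter(t.merchant_name for t in dining).most_common(1)[0]
    return {
        "visits": len(dining),
        "total_spend": round(sum(t.amount for t in dining), 2),
        "visits_per_week": round(len(dining) / weeks, 2),
        "top_merchant": top_merchant,
    }


# Invented sample data; the last-but-one row uses a non-dining MCC.
transactions = [
    RawTransaction(date(2013, 6, 3), "TAPAS BAR 0042", "94041", "5812", 48.50),
    RawTransaction(date(2013, 6, 7), "QK BURGER #118", "94041", "5814", 9.25),
    RawTransaction(date(2013, 6, 10), "TAPAS BAR 0042", "94041", "5812", 52.00),
    RawTransaction(date(2013, 6, 12), "MHT", "94040", "5732", 199.99),
    RawTransaction(date(2013, 6, 17), "TAPAS BAR 0042", "94041", "5812", 40.25),
]

summary = dining_summary(transactions)
print(summary)
```

Note what the sketch cannot tell you: what kind of restaurant "TAPAS BAR 0042" is, where exactly it sits, or how to contact it. That gap is what the rest of this post is about.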

Plaid’s big idea is that it should be possible to take this bare bones transaction data and enhance it so that it has more relevance and meaning for third party apps. Essentially, to transform the data to be more usable in today’s market.

The basic model is the same as Yodlee’s in that third party app developers utilize the Plaid API to access statement data once the consumer provides the app with permission and login credentials to access their account. But instead of simply raw statement data, Plaid returns a “cleaned up” merchant name, a hierarchical category mapping, and assorted meta data about the merchant.

The category mapping replaces the notion of flat merchant category codes. Instead of “5812” the app sees the merchant category as a three-tiered structure of “Food & Drink -> Restaurants -> Spanish” for example. In addition to the Plaid categories, the API calls also return the FourSquare category mappings, the AmEx category mappings, and the factual MCC data.
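One way an app might work with a tiered category, as opposed to a flat code, is sketched below in Python. The `Category` class and its `is_within` method are hypothetical names chosen for this illustration and do not reflect Plaid's actual response format; only the "Food & Drink -> Restaurants -> Spanish" example comes from the description above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Category:
    # Tiers ordered from broadest to most specific (hypothetical structure).
    path: Tuple[str, ...]

    def is_within(self, *prefix: str) -> bool:
        """True if this category falls under the given broader tiers."""
        return self.path[:len(prefix)] == prefix

    def __str__(self) -> str:
        return " -> ".join(self.path)


spanish = Category(("Food & Drink", "Restaurants", "Spanish"))

print(spanish)
print(spanish.is_within("Food & Drink"))                 # broadest tier
print(spanish.is_within("Food & Drink", "Restaurants"))  # two tiers deep
print(spanish.is_within("Shops"))                        # unrelated branch
```

The benefit over a flat code is that an app can ask questions at whatever level of detail it needs: a budgeting app may only care about the top tier, while a restaurant recommender may care about the cuisine.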

The meta data is perhaps the most interesting enhancement. Here Plaid will make a best effort attempt to provide the merchant contact information (phone number and website) and location data (both street address and geolocation). Because this is not always possible for every purchase, they also provide the app with a meta data confidence score.
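Because the meta data is best effort, an app would probably want to gate what it shows on the confidence score. The Python sketch below illustrates that idea under stated assumptions: the `MerchantMeta` fields, the 0.0 to 1.0 scale, and the 0.8 threshold are all invented for this example and are not taken from Plaid's documentation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MerchantMeta:
    # Hypothetical container for the meta data described above.
    phone: Optional[str] = None
    website: Optional[str] = None
    street_address: Optional[str] = None
    latitude: Optional[float] = None
    longitude: Optional[float] = None
    confidence: float = 0.0  # assumed scale: 0.0 (no trust) to 1.0 (certain)


def display_address(meta: MerchantMeta, min_confidence: float = 0.8) -> Optional[str]:
    """Return the street address only when the meta data is trusted enough."""
    if meta.street_address is None or meta.confidence < min_confidence:
        return None
    return meta.street_address


# Invented sample values for illustration only.
confident = MerchantMeta(street_address="123 Example St", confidence=0.95)
uncertain = MerchantMeta(street_address="456 Sample Ave", confidence=0.40)

print(display_address(confident))
print(display_address(uncertain))
```

The right threshold is an application decision: a map pin shown in the wrong place may be worse than no pin at all, while a low-confidence website link may be harmless.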

Glenbrook’s Take

It will be interesting to see how shopping apps and offer targeting services make use of the Plaid service. Enhancing transaction data for contemporary use in today’s market seems to make a lot of sense. But it won’t be easy.

Plaid’s service is currently restricted to accounts held at American Express, Bank of America, Chase, Citi, US Bank, USAA, and Wells Fargo. This most likely indicates that Plaid has gone back to square one with a bank-by-bank scraping approach to get the raw statement data. These are certainly the major credit card issuers, but we live in a country with 14,000-plus financial institutions, each of which issues debit cards.

That said, Plaid’s focus on developer tools and needs should help accelerate the pace of innovation with how payment data is used in the industry.

If you are interested in these issues, please join Scott Loftesness and me at our next Data in Payments Workshop, being held July 23rd in Mountain View, CA. We’ll be exploring who intrinsically knows what, how that knowledge is monetized, and the various techniques that are used to gain access to payment data.

2 Responses to “Plaid – A Modern API for Card Data”

Great post, had not come across these guys before and to be clear, data WILL be the new commodity in payments when the auth itself becomes free. This looks beautifully well executed, a Stripe for transaction meta-data if you will.

However, I’m somewhat intrigued that Plaid’s slant is not toward big data from a retailer or bank perspective, but toward enabling data for individual consumers to be available to other apps. I certainly see the value but wonder about the popularity.

How easily will this scale on the willingness of customers to share financial data with third parties (“opt in”)? My expectation was always that the strength of data in payments would come from “big data” opportunities of bulk enablement/access.

I’ve spoken to great card-linked offers startups wrestling with the legal impediment of consumer opt-in, and have the impression that those that solved it by going directly to the issuers for across-the-board access are making quicker progress.

Perhaps Plaid will go that route once issuers see the value of their proposition, though one expects the issuers will want their share of the reward for doing so, this is something they could ultimately do themselves and add more data to.

However businesses like Ethoca in other functions (in this case pooling of early issuer fraud alerts for Merchants) have shown there is a big market for creating more agile tech over the top of the data produced by legacy issuer platforms.

One thought – is Plaid simply scraping data from online issuer portals using user credentials? It seems that way. Models like this rarely scale, are prone to source provider changes and always pivot if given the chance (i.e. first hand access).

Kudos for the Plaid team for a great idea and building a great API to serve this role but one would hope the opportunity to pivot into something better supported and more scalable presents itself – more likely issuers paying to bring in-house.

I think the Stripe analogy is very fitting. In some ways, Plaid is to Yodlee as Stripe is to PayPal. In both cases, a 2010ish update to a 2000ish idea. But working in the same basic framework. It’s interesting that Stripe and Plaid both view the developer as the customer. That’s a very contemporary notion as well.