One of the key issues when you’re looking at any big company is working out what its constituent parts are – because these days a company of any size is pretty much never a single legal entity, but a web of companies, often spanning multiple jurisdictions.

Sometimes this is done because the company’s operations are in different territories, sometimes because the company is a conglomerate of different companies – an educational book publisher and a financial newspaper, for example. Sometimes it’s done to limit the company’s tax liability, or for other legal reasons (e.g. to benefit from a jurisdiction’s rules & regulations compared with the ‘parent’ company’s jurisdiction).

Whatever the reason, getting a handle on the constituent parts is pretty tricky, whether you’re a journalist, a campaigner, a government tax official or a competitor, and making it public is trickier still, meaning the same research is duplicated again and again. And while we may all ultimately want to surface in detail the complex cross-shareholdings between the different companies, that goal is some way off, not least because it’s not always possible to discover the shareholders of a company.

So you must make do with reading annual reports and trawling company registries around the world, and hoping you don’t miss any. We like to think OpenCorporates has already made this quite a bit easier, meaning that a single search for Tesco returns hundreds of results from around the world, not just those in the UK, or some other individual jurisdiction. But what about where the companies don’t include the group in the name, and how do you surface the information you’ve found for the rest of the world?

The solution to both, we think, is Corporate Groupings, a way of describing a grouping of companies without having to say exactly what legal form that relationship takes (it may be a subsidiary of a subsidiary, for example). In short, it’s what most humans (i.e. non-tax-lawyers) think of when they think of a large company – whether it’s an HSBC, Halliburton or HP.

We’ve also decided to link these corporate groupings to Wikipedia entries. Why? First, it’s the right thing to do from a community perspective, since linking to the Wikipedia entries increases their authority. The entries on Wikipedia are also the best narrative descriptions of the companies we’ve seen, and we don’t want to reinvent the wheel by trying to duplicate them.

Second, it bridges the gap between the fuzzy representations of the companies in Wikipedia and the legal representations in OpenCorporates. [It’s worth noting that this depends critically on both Wikipedia and OpenCorporates being openly licensed.]

So how do you add a company to a corporate grouping? We wanted to make this as easy as possible, taking no more than 10 seconds in most cases, and under 30 seconds including the signup/login process. That meant avoiding the messy and painful email/password process and instead authenticating users using their Twitter accounts (we’ll add other options in the future). As a bonus, it was easy to add in some social networking sweetness – when you add a company to a corporate grouping you get the option of tweeting that information to the wider world without any extra work.

So here are the three steps for adding a company to a corporate grouping:

Step 1: When you find a company without a corporate grouping, click the ‘add new’ link below the Corporate Grouping subheading. If you haven’t done this before, you’ll be asked to log in using your Twitter account (important: we don’t get to see your password, and you can deauthorize OpenCorporates at any time from your Twitter account). Otherwise, the link should automatically change into a field for entering text.

Step 2: Type in the name of the group, in this case BP (it doesn’t matter whether it’s upper or lower case), then click the grouping that comes up (in this case there’s only one). If there isn’t already a Corporate Grouping for that group, you’ll be asked if you want to add one.

Step 3: Press the button ‘Add & tweet about this’ and that’s it.

The company is now associated with that group, and a tweet will appear from you announcing it (hint: click ‘add silently’ if you don’t want a tweet sent).

Less than 10 seconds, and we’re well on our way to mapping this thing called BP.

Finally, we’ve made it super easy to get the data out – as JSON, XML, or even as an RSS feed, so you can subscribe to a particular grouping and see what new companies are added to it. (p.s. we’ll do an RDF version too if anyone wants to help put it together.)
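To give a feel for what working with the JSON version might look like, here is a minimal Python sketch. It doesn’t call the live site: the payload, its field names (`corporate_grouping`, `companies`, `jurisdiction_code`, `company_number`) and the two example companies are all assumptions made up for illustration, so check the real output before relying on any of them. The second function mimics what the RSS feed gives you, by comparing two snapshots of a grouping to see what has been added.

```python
import json

# Made-up payload for illustration only. The field names and the two
# companies are assumptions, not the actual OpenCorporates output.
SAMPLE_JSON = """
{
  "corporate_grouping": {
    "name": "BP",
    "companies": [
      {"name": "Example Subsidiary One Ltd", "jurisdiction_code": "gb", "company_number": "00000001"},
      {"name": "Example Subsidiary Two Inc", "jurisdiction_code": "us_de", "company_number": "0000002"}
    ]
  }
}
"""


def companies_in_grouping(raw):
    """Return (jurisdiction_code, company_number, name) tuples from the payload."""
    grouping = json.loads(raw)["corporate_grouping"]
    return [
        (c["jurisdiction_code"], c["company_number"], c["name"])
        for c in grouping["companies"]
    ]


def newly_added(previous, current):
    """Return companies in the current snapshot that were not in the previous one."""
    seen = set(previous)
    return [company for company in current if company not in seen]


members = companies_in_grouping(SAMPLE_JSON)
for jurisdiction, number, name in members:
    print(f"{jurisdiction}/{number}: {name}")
```

Keying each company on its jurisdiction plus company number, rather than on its name, is a deliberate choice here: names are reused across jurisdictions, while the pair should identify one legal entity.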

This is, it should be stressed, an attempt at a genuinely difficult problem, and we’d welcome feedback, but as a way of distributing the job of researching and storing a corporate group’s constituent companies, we think it’s pretty cool.

p.s. It’s worth mentioning that although the whole process is built using some sexy Ajax/JavaScript, it also works with JavaScript turned off and should therefore be usable by those with accessibility issues, or those using screen readers.

Any chance of automating some of the fact-collection? National company registries could perhaps be convinced to give out data? Are you taking data exports from any country?
How do you deal with circular situations, i.e. Firm A owns 50% of B, which owns 50% of A? (Most loops would be longer and more subtle though :)

Scraping is as automated as is possible for the most part. We use the UK Companies House API, and are talking with other company registries, but many are in thrall to the companies that would rather buy the data (and so sustain their rent-seeking business models), and so the idea of opening up the data is not one that comes naturally.

Re the circular issue, that’s one of the reasons why we’ve gone with the fuzzy concept of Corporate Groupings, which sidesteps this issue completely, and also allows connections to be drawn where the exact path is not known.
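As a rough illustration of the difference (a sketch only, not how OpenCorporates stores things internally), compare an ownership graph with a flat grouping. The names `ownership`, `has_cycle`, `grouping` and `in_same_grouping` are hypothetical and exist only for this example.

```python
# Ownership as a directed graph: each owner maps to the firms it holds shares in.
# Firm A owns part of Firm B, which in turn owns part of Firm A.
ownership = {"Firm A": ["Firm B"], "Firm B": ["Firm A"]}


def has_cycle(graph):
    """Depth-first search reporting whether the ownership graph contains a loop."""
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # we came back to a firm still being explored
        visiting.add(node)
        found = any(visit(child) for child in graph.get(node, []))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in graph)


# A grouping is just a named set of members, with no direction and no
# percentages, so there is nothing for a loop to form in.
grouping = {"name": "Example Group", "members": {"Firm A", "Firm B"}}


def in_same_grouping(group, first, second):
    """True if both firms have been added to the grouping."""
    return first in group["members"] and second in group["members"]


print(has_cycle(ownership))
print(in_same_grouping(grouping, "Firm A", "Firm B"))
```

Anything that walks the ownership graph has to guard against loops like this one, whereas membership of a grouping is a simple set lookup that works even when the path between two firms is unknown.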