a blog about data, entrepreneurship, and software

Introduction

A screenshot of the AngelList main page

AngelList is probably the largest open network of startups, founders, and investors. It also provides a nice API that lets others, like me, play with the data. I have been having fun analyzing the dataset since January and wanted to share the results a bit more formally, so I will be organizing the methodology and results as a series of posts instead of tweets.

Understanding investors has multiple benefits.

You can see trends in markets, which matters not only for identifying pain points but also for pivoting your existing business or ideas.

You can target more relevant investors, perhaps even a lead investor.

And more…

Methodology

Investors

Investors were selected from the full list of users by keeping those with a “startup role” of “past_investor”.
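As a rough illustration of this filtering step, here is a minimal sketch. The record layout (a `startup_roles` list on each user) and the function name `filter_investors` are assumptions made for the example, not the actual shape of the AngelList API response.

```python
# Hypothetical user records; the real AngelList API response layout may differ.
users = [
    {"name": "investor 1", "startup_roles": ["past_investor", "founder"]},
    {"name": "founder 1", "startup_roles": ["founder"]},
    {"name": "investor 2", "startup_roles": ["past_investor"]},
]


def filter_investors(users, role="past_investor"):
    """Keep only the users whose startup roles include the given role."""
    return [user for user in users if role in user.get("startup_roles", [])]


investors = filter_investors(users)
```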

Primary Locations and Meta Location

Each investor’s primary location was taken to be the first entry in the “locations” attribute.

Meta locations were determined by manually merging primary locations into broader regions.

As a result, some investors’ locations may be inconsistent or misrepresented.
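The two location rules could be sketched roughly as follows. The merge table is purely illustrative (the actual manual mapping is not listed in this post), and the names `META_LOCATIONS`, `primary_location`, and `meta_location` are made up for the example.

```python
# Illustrative merge table only; the actual manual mapping is not given here.
META_LOCATIONS = {
    "San Francisco": "Silicon Valley",
    "Palo Alto": "Silicon Valley",
    "New York City": "NYC/Boston",
    "Boston": "NYC/Boston",
}


def primary_location(user):
    """Return the first entry of the "locations" attribute, or None if empty."""
    locations = user.get("locations") or []
    return locations[0] if locations else None


def meta_location(user):
    """Map the primary location to its merged region, if one is defined."""
    primary = primary_location(user)
    return META_LOCATIONS.get(primary, primary)
```

Note that an investor listing `["Palo Alto", "Austin"]` would be counted under Silicon Valley here, which is one way the misrepresentation mentioned above can creep in.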

Connections

Connections are drawn by counting the companies that two investors have both invested in. For example, if “investor 1” and “investor 2” both invested in “company A” and “company B”, there will be a link between them with a weight of 2.
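The weighting rule can be sketched as below. This is a minimal version that assumes the investments are already available as a mapping from investor name to a set of company names; the input layout and the function name `co_investment_weights` are assumptions for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def co_investment_weights(investments):
    """Return {(investor_a, investor_b): number of companies both invested in}."""
    # Invert the mapping: company -> set of investors who backed it.
    backers = defaultdict(set)
    for investor, companies in investments.items():
        for company in companies:
            backers[company].add(investor)

    # Every pair of investors sharing a company gains one unit of weight.
    weights = Counter()
    for investors in backers.values():
        for pair in combinations(sorted(investors), 2):
            weights[pair] += 1
    return weights


# Hypothetical data mirroring the example above.
investments = {
    "investor 1": {"company A", "company B"},
    "investor 2": {"company A", "company B", "company C"},
    "investor 3": {"company C"},
}
weights = co_investment_weights(investments)
```

With this data, “investor 1” and “investor 2” are linked with weight 2, while “investor 1” and “investor 3” share no company and get no link at all.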

The Network

Results

Centrality of investors versus followers and number of companies invested

Sized by betweenness centrality score, colored by number of companies invested

Scatterplot of betweenness centrality score and number of companies invested

Sized by betweenness centrality score, colored by number of followers

Scatterplot of betweenness centrality score and number of followers

Both the number of followers and the number of companies invested in show some correlation with the betweenness centrality score. The correlation with the number of companies invested in is expected, since the network was generated from co-investments.
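For readers unfamiliar with the measure, betweenness centrality counts how often a node sits on the shortest paths between other pairs of nodes. The sketch below computes it with Brandes' algorithm on a tiny made-up graph. It ignores link weights and does not normalize the scores, which may well differ from the settings used for the charts above, and the adjacency layout and node names are hypothetical.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Unnormalized betweenness for an unweighted, undirected graph (Brandes)."""
    scores = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Breadth-first search from the source, counting shortest paths.
        order = []
        predecessors = {node: [] for node in adjacency}
        path_count = {node: 0 for node in adjacency}
        distance = {node: -1 for node in adjacency}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbor in adjacency[node]:
                if distance[neighbor] < 0:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
                if distance[neighbor] == distance[node] + 1:
                    path_count[neighbor] += path_count[node]
                    predecessors[neighbor].append(node)

        # Walk back from the farthest nodes, accumulating path dependencies.
        dependency = {node: 0.0 for node in adjacency}
        for node in reversed(order):
            for pred in predecessors[node]:
                share = path_count[pred] / path_count[node]
                dependency[pred] += share * (1 + dependency[node])
            if node != source:
                scores[node] += dependency[node]

    # Each undirected pair was counted once from each endpoint.
    return {node: score / 2 for node, score in scores.items()}


# Hypothetical graph: one hub investor linking three otherwise unconnected ones.
graph = {
    "hub": {"la_1", "la_2", "sv_1"},
    "la_1": {"hub"},
    "la_2": {"hub"},
    "sv_1": {"hub"},
}
centrality = betweenness_centrality(graph)
```

In this toy graph the hub lies on the only path between every pair of the other three investors, so it gets a score of 3 while the others get 0.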

Giant cluster of Silicon Valley investors

Closer look at the central cluster of Silicon Valley investors

I don’t know whether the AngelList data is skewed toward Silicon Valley investors, or whether many investors list SV as their primary location even if they don’t live there, but SV investors make up a large majority and are very central. They are well connected to pretty much every group and intermingled with the second largest group, NYC/Boston investors (teal).

David McClure and 500 Startups, because of their number of investments, have the highest betweenness centrality scores and rank at the top on pretty much every other centrality measure.

Investors within the Silicon Valley region

Within Silicon Valley, there are no distinct sub-groups based on smaller regions.

Silicon Valley investors acting as hubs

Brad Holden, a Silicon Valley investor, is positioned to connect many Los Angeles-based investors.

There are many examples of SV investors acting as hubs to other regional groups of investors. The most prominent is Brad Holden (bottom right), who connects a very well-connected group of Los Angeles investors.

Joshua Baer is a Texas-based investor who connects many investors in the same region.

Another pattern is an investor who is based outside Silicon Valley but has made many investments with SV investors, and so acts as a hub for regional investors. Joshua Baer (top center) and Bill Boebel are both based in Texas but have many co-investment connections with SV investors, and they connect other Texas-based investors.

Ideas for Further Analysis

I wish I had been able to get some temporal information to do more advanced analysis, such as:

Groups of investors acting as flocks: how do certain attributes of investors inform or motivate other investors to act together?