Eric Sun, a software engineer on the entities team, outlined some of the challenges in a blog post on Thursday.

Some of the factors that go into forming those connections for the end user include removing junk Pages, matching Wikipedia data, and manually labeling the largest nodes and categories (e.g., movies).

Sun explained further:

Every day, the entities graph grows with new nodes and connections at a pace far greater than our small team can manually inspect. Therefore, we concentrate on building multi-pronged systems that are scalable and that will improve the graph over time. Current efforts are focused on cleaning up the long tail via back-end machine learning as well as expanding crowdsourcing efforts to allow all of our users to contribute corrections and additions to our entity pages.

Background and further technical detail on the path developers have taken to reach this point with the Entity Graph are available now on the Facebook Engineering Blog.