"Why," you might ask, "does the world need a robot that can read web pages -- particularly when we still lack robots that can make pancakes, wash the dog, or clean up after the robots that make pancakes and wash dogs?"

The answer: when we have a robot that can read web pages, its human masters can do amazing and significant things with data extracted by said robot. Things like repurposing content from any web page onto any device; parsing product information from any e-commerce site; building a master calendar for Planet Earth featuring every event listed on any page on the web. That kind of stuff.

We're building this robot using computer vision, natural language processing and machine learning. These are buzzwords for other companies, but not for Diffbot. We live and breathe this stuff -- just as our robot doesn't.

Who are we? We're PhD engineers who have won numerous AI competitions, built search technology at Microsoft, Yahoo and Powerset, and founded one of the pioneering web search engines. We are also a group of investors -- including the original backer of Google, the founder of Earthlink, top executives at Twitter and Facebook, and the chair of the MIT Media Lab and Creative Commons -- who've put a combined $2M of our own money into the company and who contribute our experience building web-scale systems.

And we're working on something that has the potential to fundamentally change the way the web works.

We currently analyze over 100M pages per month in real time for our clients, working behind the scenes to power many of the web's largest sites, such as StumbleUpon and Instapaper. That's a lot, but in many respects we're just getting started.