Dan Lecocq

Seattle, United States

Member since May 6, 2014

Dan is an engineer and cowboy coder with a background in big data and distributed systems. He has extensive experience with profiling, optimization, asynchronous network I/O, and pushing huge volumes of work through a pipeline reliably and efficiently.

Simple command-line dispatch of Python functions. Users regularly want to invoke small, simple Python functions from the command line, so I wrote this tool, which has become one of Moz's most popular repos.
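To illustrate the general idea of dispatching from a command line to a Python function, here is a minimal sketch. It is not the repo's actual API: the `task` decorator, the `TASKS` registry, the `dispatch` helper, and the `--key=value` convention are all hypothetical names and choices made for this example.

```python
# Hypothetical registry mapping task names to functions (illustrative only).
TASKS = {}


def task(func):
    """Register a function so it can be looked up by name."""
    TASKS[func.__name__] = func
    return func


@task
def greet(name, greeting="Hello"):
    """Return a greeting for the given name."""
    return f"{greeting}, {name}!"


def dispatch(argv):
    """Call the task named by argv[0] with the remaining arguments.

    Arguments written as --key=value become keyword arguments;
    everything else is passed positionally, as strings.
    """
    if not argv or argv[0] not in TASKS:
        raise SystemExit("available tasks: " + ", ".join(sorted(TASKS)))
    args, kwargs = [], {}
    for arg in argv[1:]:
        if arg.startswith("--") and "=" in arg:
            key, value = arg[2:].split("=", 1)
            kwargs[key] = value
        else:
            args.append(arg)
    return TASKS[argv[0]](*args, **kwargs)
```

Wired to `sys.argv[1:]`, a call such as `dispatch(["greet", "world", "--greeting=Hi"])` would correspond to typing `greet world --greeting=Hi` at a shell prompt.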

A rich queueing system for Redis, used for production services both at Moz and elsewhere. It uses Redis's Lua scripting support to implement complex atomic queueing operations. It consists of a Lua core (https://github.com/seomoz/qless-core) plus Ruby (https://github.com/seomoz/qless) and Python (https://github.com/seomoz/qless-py) bindings.

Fast simhash in Python. It supports maintaining a set of documents and finding near-duplicates within it at high speed. It consists of our underlying library simhash-cpp (https://github.com/seomoz/simhash-cpp) and the surrounding Python bindings.
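For readers unfamiliar with simhash, the following is a minimal pure-Python sketch of the general technique, not the library's implementation or API. Whitespace tokenization, the MD5-based token hash, and the function names `simhash` and `hamming_distance` are assumptions made for this example. Documents whose fingerprints differ in only a few bits are treated as near-duplicates.

```python
import hashlib


def simhash(text, bits=64):
    """Compute a simhash fingerprint of `text` (bits must be <= 64 here)."""
    counts = [0] * bits
    for token in text.lower().split():
        # Hash each token to a 64-bit integer (first 8 bytes of MD5; an assumption).
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        # Each token votes +1 or -1 on every bit position.
        for i in range(bits):
            counts[i] += 1 if (h >> i) & 1 else -1
    # A bit is set in the fingerprint where the votes are positive.
    fingerprint = 0
    for i, count in enumerate(counts):
        if count > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")
```

Comparing every pair of fingerprints this way is quadratic in the number of documents; avoiding that cost is the kind of problem a dedicated library is meant to solve.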

Web page content extraction. This is the implementation supporting some published work (http://dl.acm.org/citation.cfm?id=2487828) where we separate the main content of web page articles and blog posts from the other components (navigation, headers, footers, etc.).