We’ve been working on a compiler for our own language named "lzsql". It’s for our nginx-based web service platform that drives our data product lz.taobao.com. Our "lzsql" compiler can now emit lua code that has passed lots of real world tests.

We can now decide whether to run a sql query at a remote mysql node or at the nginx core, all in the lzsql language.

For "local sql queries", we’ve implemented a full-fledged sql engine in pure lua. It’s damn fast, especially using LuaJIT. 6k q/s for a single nginx worker process is not uncommon in our benchmark.

And we’ve introduced a type system in our language so that it can handle sql quoting rules automatically. The typechecker ensures that an lzsql variable with a specific type is used correctly in the context of the sql query. Sql is part of the lzsql language itself, so the compiler always knows where each variable lands in a query. Therefore, sql injection cannot happen.
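To make the idea a bit more concrete, here is a rough Python sketch (not our compiler, which is written in Perl and emits Lua) of how a declared variable type can pick the quoting rule. The `text` type appears in the example further down; the `int` type and the names `quote_text`, `quote_int`, and `render` are made up for illustration, and the escaping shown is deliberately simplified compared to what a real mysql client library does.

```python
import re

def quote_text(value):
    # Assumed rule for `text`: escape backslashes and single quotes,
    # then wrap the result in single quotes (simplified mysql-style literal).
    s = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return "'" + s + "'"

def quote_int(value):
    # Hypothetical `int` type: emitted bare, but only after strict validation.
    if not re.fullmatch(r"-?[0-9]+", str(value)):
        raise TypeError("not an integer: %r" % (value,))
    return str(int(value))

# Maps a declared type name to the function that renders a value of that type.
QUOTERS = {"text": quote_text, "int": quote_int}

def render(template, types, values):
    # Replace each $name with its value, quoted according to its declared type.
    def substitute(match):
        name = match.group(1)
        return QUOTERS[types[name]](values[name])
    return re.sub(r"\$(\w+)", substitute, template)
```

With this sketch, rendering `name = $pattern` with a `text` value such as `x' or '1'='1` yields a single escaped string literal, while passing `1; drop table cats` for an `int` variable raises a `TypeError` instead of reaching the query.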

We mostly use the "local sql engine" for merging data from completely different data sources, like those from both mysql and a non-relational data source. We do have some non-relational data sources like our real time stats services and other Java-powered web services from other departments of Taobao.com.

Here’s a small example:

text $pattern; location $mysql_node;

@a := select count(id) as count from cats where name contains $pattern group by park at $mysql_node;

@b := select count(id) as count from other_service.some_api($pattern) group by park;

return (@a union all @b);

In this sample, "other_service.some_api" is a non-blocking call to some remote non-relational data source. And the first SQL query runs on a remote mysql node specified by the variable $mysql_node while the last two both run directly in the nginx core by our sql engine written in Lua.
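For readers who want to picture the part that runs locally, it boils down to grouping rows that came back from a service call and concatenating two result sets. Here is a minimal Python sketch of just that, assuming rows are plain dictionaries; the helper names `count_group_by` and `union_all` and all the row data are invented for illustration, and our real engine is written in Lua and is far more general.

```python
def count_group_by(rows, key):
    # Roughly: select count(id) as count from <rows> group by <key>
    # Rows whose id is None are not counted, mirroring SQL count(column).
    counts = {}
    for row in rows:
        counts.setdefault(row[key], 0)
        if row.get("id") is not None:
            counts[row[key]] += 1
    return [{"count": n} for n in counts.values()]

def union_all(left, right):
    # UNION ALL keeps every row from both inputs, duplicates included.
    return list(left) + list(right)

# Pretend this result set came back from the remote mysql node (@a).
a = [{"count": 2}, {"count": 1}]

# Pretend these rows came back from the non-relational service call.
api_rows = [
    {"id": 7, "park": "north"},
    {"id": 8, "park": "north"},
    {"id": 9, "park": "south"},
]
b = count_group_by(api_rows, "park")  # computed locally (@b)

result = union_all(a, b)
```

The point of the sketch is only that once both sources are reduced to rows with the same columns, merging them is cheap and needs no database at all.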

The .lzsql source file is compiled down to (very compact) Lua code before deploying to our production servers. Because it is a true compiler that runs ahead of time, its own speed does not matter much, so we use Perl 5, one of the not-so-fast scripting languages, to implement the whole compiler (approximately 3k lines of hand-written code). Perl modules like Moose and Parse::RecDescent have made the compiler construction process quite enjoyable 🙂

In the future, the lzsql compiler is also expected to optimize the sql queries automatically for a specific remote sql engine, like mysql’s.

The lzsql compiler will eventually be released under an opensource license with the name "RestyScript" once we decouple our specific business logic from the compiler. For now, we hardcode some business logic into the compiler for the sake of convenience. We’re going to move it into compiler plugins or language extensions and make the lzsql toolchain itself more general.

My intern students have become very productive since they started using the lzsql language 😉 The old system they’re replacing is written in tons of ugly php code, oh well 😉 We’ve cut the codebase size by 90% and also got 20 to 30 times faster 😀

We’re also wrapping our heads around VoltDB, a really nice in-memory database. And we’re looking forward to rewriting our "real time stats services" mentioned above using VoltDB and Erlang, Lua, or something similar. An nginx upstream module for the VoltDB binary protocol is also on chaoslawful’s and my TODO list.

The only sad part regarding VoltDB is that it’s written in Java, but it’s not a very big issue for us. It has some ugly limitations regarding its sql and interfaces, but we can work around those details at the level of our lzsql language and just use it (combined with Java) as the runtime.

I’m delighted to announce the v0.0.12 release of ngx_drizzle, a non-blocking upstream module that helps nginx talk directly to mysql, drizzle, and sqlite3 servers (with an optional connection pool). The project source repository and homepage are on GitHub:

The chunkin module adds HTTP 1.1 chunked input support for Nginx without the need to patch the Nginx core.

Behind the scenes, it registers an access-phase handler that will eagerly read and decode incoming request bodies when a "Transfer-Encoding: chunked" header triggers a 411 error page in Nginx. For requests that are not in the chunked transfer encoding, this module is a "no-op".
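For anyone unfamiliar with what "decoding" a chunked body involves, here is a small Python sketch of the wire format only. It is not how the module itself works (the module is written in C and decodes the body incrementally inside Nginx); the function name `decode_chunked` is made up, and the sketch assumes the whole body is already in memory and ignores chunk extensions and trailers.

```python
def decode_chunked(data: bytes) -> bytes:
    # Each chunk is "<hex size>[;extensions]\r\n<payload>\r\n";
    # a chunk of size 0 marks the end of the body.
    body = bytearray()
    pos = 0
    while True:
        eol = data.index(b"\r\n", pos)            # end of the size line
        size_field = data[pos:eol].split(b";", 1)[0].strip()
        size = int(size_field, 16)                # chunk sizes are hexadecimal
        pos = eol + 2
        if size == 0:
            break                                 # last chunk; trailers are ignored here
        chunk = data[pos:pos + size]
        if len(chunk) != size or data[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("truncated or malformed chunk")
        body += chunk
        pos += size + 2                           # skip payload and its trailing CRLF
    return bytes(body)
```

For example, the bytes `4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n` decode to the 9-byte body `Wikipedia`; malformed input raises a `ValueError`.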

To enable the magic, just turn on the chunkin config option and define a custom 411 error_page using chunkin_resume, like this: