MySQL Row Change Event Extraction and Publish

This talk introduces a MySQL row change extraction system we built at Google. We have modified the MySQL 5.1 source code to generate row-level change logs without using triggers. The system continuously extracts row change events from a large-scale replicated MySQL server cluster, converts these events sequentially into self-contained protocol buffer records, and then stores them in permanent GFS storage. The resulting real-time row change event streams are also replicated to various data centers and fed to various proprietary applications. Because these streams offer a complete and ordered sequence of transaction changes, they can be used for many purposes. Many applications use them to track changes to databases, and, combined with simple filters, they are used to maintain derived views of MySQL databases. Because each row change record contains both the old and the new table entry data, many applications can avoid costly periodic full-table extractions. Another use case is to maintain database replicas in a non-relational storage infrastructure. Applications with high read bandwidth but low transaction consistency requirements can read from the non-relational storage infrastructure instead of from the MySQL servers.
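
To make the derived-view idea concrete, here is a minimal Python sketch of how a consumer might apply an ordered stream of row change records, each carrying the old and new row data, through a simple filter. It is an illustration only, not the system's actual record format or API: the `RowChange` class, the `apply_changes` function, the `orders` table, and the `id` key column are all hypothetical names, and a plain dataclass stands in for the protocol buffer records.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional

Row = Dict[str, object]


@dataclass(frozen=True)
class RowChange:
    """Hypothetical stand-in for one self-contained row change record."""
    seq: int                # position in the ordered change stream
    table: str              # table the change applies to
    old: Optional[Row]      # row before the change (None for an insert)
    new: Optional[Row]      # row after the change (None for a delete)


def apply_changes(
    view: Dict[object, Row],
    changes: Iterable[RowChange],
    table: str,
    predicate: Callable[[Row], bool],
    key: str = "id",
) -> Dict[object, Row]:
    """Update a derived view in place from changes assumed to arrive in order.

    The view holds only rows of `table` that satisfy `predicate`.
    """
    for change in changes:
        if change.table != table:
            continue  # simple filter: ignore other tables
        if change.old is not None:
            # The old image tells us which entry to retire; no table scan needed.
            view.pop(change.old[key], None)
        if change.new is not None and predicate(change.new):
            view[change.new[key]] = change.new
    return view


# Example: maintain a view of open orders.
changes = [
    RowChange(1, "orders", None, {"id": 1, "status": "open"}),
    RowChange(2, "orders", None, {"id": 2, "status": "open"}),
    RowChange(3, "users", None, {"id": 1, "name": "a"}),
    RowChange(4, "orders", {"id": 1, "status": "open"}, {"id": 1, "status": "closed"}),
    RowChange(5, "orders", {"id": 2, "status": "open"}, None),
    RowChange(6, "orders", None, {"id": 3, "status": "open"}),
]
open_orders = apply_changes({}, changes, "orders", lambda row: row["status"] == "open")
```

Because every record includes the old row image, the consumer can keep the view current incrementally rather than re-extracting the full table.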