HowTos & Experiments

Exporting MySQL query results using the CSV engine

MySQL Storage Engines

MySQL supports a quite large number of different storage engines. Each has different characteristics (transactional or non-transactional, foreign key support etc) and table file format. Some of them show very special behaviour and I believe that are worth noting because they can enable a DBA to easily perform various tasks:

CSV

Stores data in the very popular CSV format that can also be processed by common spreadsheet software. There are limitations though.

Federated

This engine can be used for setting up a table representation from a remote MySQL server. You’ll need to supply a connection string in the format:

mysql://user_name[:password]@host_name[:port_num]/db_name/tbl_name

Blackhole

Does not store any data, but checks & executes DML and DDL statements consistently. Can be used for setting up a relay server for replication.

For this post, I am using the CSV engine in combination with the Federated engine. By storing the query results in a table which uses the CSV engine we have actually performed a quite easy and cool export. In addition, using the Federated engine enables us to combine results from a remote server. In our example, each database stores different records based on values of the foreign key field user_id, let’s say because we are implementing sharding and we need aggregated results. Thus, without using any external programming language with a MySQL API to combine resultsets and then write the rows in CSV format we simply let MySQL server do some magic.

Database & Table setup

First we setup the 2 databases we need, named federated_1 & federated_2.

CREATE DATABASE federated_1;
CREATE DATABASE federated_2;

We create a table named t2 in federated_2 database with the following structure:

We insert some sample data to federated_1.t1 and federated_2.t2 tables. Notice that we are concatenating the database name for the description field so that we can easily recognize local and remote records:

-- Insert some random sample data in table t1
USE federated_1
INSERT INTO t1(id, description, user_id) VALUES(NULL, CONCAT(DATABASE(), "_", MD5(CAST(RAND() AS CHAR))), 2);
-- you can repeat this INSERT command as many times as you want, it'll just double the records each time
INSERT INTO t1(id, description, user_id)
SELECT NULL, CONCAT(DATABASE(), "_", MD5(CAST(RAND() AS CHAR))), 2 FROM t1;
INSERT INTO t1(id, description, user_id)
SELECT NULL, CONCAT(DATABASE(), "_", MD5(CAST(RAND() AS CHAR))), 2 FROM t1;
-- Insert some random sample data in table t2
USE federated_2
INSERT INTO t2(id, description, user_id) VALUES(NULL, CONCAT(DATABASE(), "_", MD5(CAST(RAND() AS CHAR))), 1);
-- you can repeat this INSERT command as many times as you want, it'll just double the records each time
INSERT INTO t2(id, description, user_id)
SELECT NULL, CONCAT(DATABASE(), "_", MD5(CAST(RAND() AS CHAR))), 1 FROM t2;
INSERT INTO t2(id, description, user_id)
SELECT NULL, CONCAT(DATABASE(), "_", MD5(CAST(RAND() AS CHAR))), 1 FROM t2;

Getting the job done

We’ll need a UNION of the records of the 2 tables for our “report” and we are creating a table which uses the CSV storage engine from that resultset. Notice the limitations of the CSV engine in the comments. I am using DATE_FORMAT to show that you can use MySQL functions to transform your data at your will before exporting:

If you open the table_combined.CSV file in an editor you’ll see the following contents:
4,"federated_1_275b5bf71cc676897ecb6ffa11b72eb3",2,"2012-11-04 18:17:04"
2,"federated_1_890e1409f7a007139670d5f1925b9bb0",2,"2012-11-04 18:17:02"
1,"federated_1_97b64f11bca0e439e8f485428f4eb760",2,"2012-11-04 18:16:29"
3,"federated_1_c08581747c9321659e5da35639c08736",2,"2012-11-04 18:17:04"
366,"federated_2_04106e4c4f37cfd6ce7a86217b9eec99",1,"2012-11-04 18:23:34"
...

…and we’re ready! We saw how MySQL server storage engines can help us get resultsets from remote servers and export them in a friendly format without needing a scripting language.