The problem

Recently, I came across what started as a straightforward problem: In MySQL I was JOINing from Table A to Table B on a’s primary key, and wanted to display a list of all items in Table A, each followed by a list of all associated items in Table B.

My initial thought was to hand-roll my own JSON, so that the result of the query would have a few columns with the results of Table A, then a column with the JSON-encoded data from Table B. I realize that one should never hand-roll their own JSON, but there isn’t a native JSON encoder in MySQL, so I had to make do.

It was easy enough to come up with a basic JSON format using the following. I’ve done some pretty aggressive tabbing and newlines here to try to make the layers of functions as straightforward as possible; different arguments to each function align vertically.

However, if b.name has a quotation mark (or a variety of other characters), this creates invalid JSON. How do we ensure that we always create valid JSON, even when there are special characters or unicode/multi-byte characters?

The solution

Ensuring we always have valid JSON was surprisingly easy – we can hexadecimal-encodeb.name in the MySQL query, which ensures that its character set is 0-9 or A-F, so it will always be properly contained within the quotation marks around it. We then convert it back into a normal string in the application code.

So let’s say b.name was something like '☃' (because I know people are always naming their things [unicode Snowman]), the resulting hexadecimal representation is 27E2988327. Opaque, but definitely JSON-safe!

Note that we also use COALESCE in case there are no associated items in Table B; otherwise CONCAT('[', NULL, ']' gives us NULL, not [].

All we have left to do is convert the hexadecimal representation back into text in the application. Here’s a function in PHP that does the trick: