Menu

MySQL 8.0 Labs: JSON aggregation functions

In MySQL 5.7 we introduced JSON functionality into the MySQL Server. This work included the introduction of a JSON data type, virtual columns and a set of approximately 20 SQL functions that allow you to manipulate and search JSON data on the server side.

The JSON functionality has been well received, and in MySQL 8.0 we have plans to improve it in a number of ways. This post outlines improvements to the SQL functions with the addition of aggregate functions.

TL;DR

Starting with MySQL 8.0 (lab release)* two new aggregation functions were added and can be used to combine data into JSON arrays/objects:

mysql>SELECT grp,JSON_ARRAYAGG(`key`)AS`keys_grouped`FROM t1 GROUP BY grp;

+------+------------------+

|grp|keys_grouped|

+------+------------------+

|g1|["key1","key2"]|

|g2|["key3"]|

+------+------------------+

2rows inset(0,00sec)

mysql>SELECT JSON_OBJECTAGG(`key`,val)AS`key_val`FROM t1;

+--------------------------------------------+

|key_val|

+--------------------------------------------+

|{"key1":"v1",

"key2":"v2",

"key3":"v3"}

|

+--------------------------------------------+

1row inset(0,00sec)

mysql>SELECT grp,JSON_OBJECTAGG(`key`,val)AS`key_val_grouped`FROM t1 GROUP BY grp;

+------+------------------------------+

|grp|key_val_grouped|

+------+------------------------------+

|g1|{"key1":"v1","key2":"v2"}|

|g2|{"key3":"v3"}|

+------+------------------------------+

2rows inset(0,00sec)

Now for the less impatient:

Let’s think about this scenario: you have a database which contains both structured and semi-structured data and you’ve decided to adopt the EAV model (Entity–Attribute–Value). The tables will look more or less like this:

You have a product table which contains the common attributes:

1

2

3

4

5

6

7

CREATE TABLE`product`(

`id`int(11)NOTNULLAUTO_INCREMENT,

`name`varchar(120)DEFAULTNULL,

`manufacturer`varchar(120)DEFAULTNULL,

`price`int(11)DEFAULTNULL,

PRIMARY KEY(`id`)

)ENGINE=InnoDB DEFAULTCHARSET=latin1;

Then you have the attribute table containing all the non-common attributes that a product might have:

1

2

3

4

5

6

CREATE TABLE`attribute`(

`id`int(11)NOTNULLAUTO_INCREMENT,

`name`varchar(120)DEFAULTNULL,

`description`varchar(256)DEFAULTNULL,

PRIMARY KEY(`id`)

)ENGINE=InnoDB DEFAULTCHARSET=latin1;

With some possible entries:

1

2

3

4

5

6

7

8

9

10

11

INSERT INTO attribute(id,name)VALUES

(1,"color"),

(2,"material"),

(3,"style"),

(4,"bulb_type"),

(5,"usage"),

(6,"cpu_type"),

(7,"cpu_speed"),

(8,"weight"),

(9,"battery_life"),

(10,"fuel_type");

And finally the value table which combines the product key, the attribute key with the actual value.

1

2

3

4

5

6

CREATE TABLE`value`(

`prod_id`int(11)NOTNULL,

`attribute_id`int(11)NOTNULL,

`value`text,

PRIMARY KEY(`prod_id`,`attribute_id`)

)ENGINE=InnoDB DEFAULTCHARSET=latin1;

Now let’s insert a few products and their attributes:

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

INSERT INTO product(id,name,manufacturer,price)VALUES

(1,"LED Desk Lamp","X",26);

INSERT INTO value VALUES

(1,1,"black"),

(1,2,"plastic"),

(1,3,"classic"),

(1,4,"LED"),

(1,5,"Indoor use only");

INSERT INTO product(id,name,manufacturer,price)VALUES

(2,"Laptop","Y",800);

INSERT INTO value VALUES

(2,1,"blue"),

(2,6,"quad core"),

(2,7,"3400 mhz"),

(2,8,"2,1 kg"),

(2,9,"9h");

INSERT INTO product(id,name,manufacturer,price)VALUES

(3,"Grill","Z",300);

INSERT INTO value VALUES

(3,1,"black"),

(3,8,"5 kg"),

(3,10,"gas");

If you need to select complete products that combines all the attribute keys and values as JSON object, though combining the structured data (in product) and the semi-structured data (in attribute and in value): you can use the JSON_OBJECTAGG aggregation function:

mysql> SELECT JSON_ARRAYAGG(DISTINCT d) FROM t;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ‘DISTINCT d) FROM t’ at line 1

Ok. I thought you were refering to SELECT DISTINCT JSON_*AGG. At the moment JSON_ARRAYAGG does not support DISTINCT option but this is something that we can consider implementing. JSON_OBJECTAGG does not need it because MySQL’s JSON implementation will, by default, skips the key-value pair insertion if the key is duplicate. Thanks for the suggestion! Please consider filling in a feature request at https://bugs.mysql.com/