Welcome to db2Dean’s web
site. I’m Dean Compher, an IBM IT Specialist who, along with my team,
helps customers and prospective customers with DB2 on Linux, UNIX and Windows
(LUW) technical questions and issues. As this page makes painfully clear,
I am a DBA and not a web designer, but I would be happy to get your DB2
questions answered or talk to you about the great features of DB2 or IBM
Integration Products. If you are looking at a new database solution or
want to compare us to your existing database vendor, please do not hesitate to
contact me about getting a presentation or just to ask questions. My
e-mail address is dean@db2Dean.com.

I am located in Utah and
primarily serve DB2/LUW and dashDB customers in the Western US, but I can
forward requests to my peers in other technology and geographic areas as
well. My team also covers Big Data and Informix products. There
are questions that I get on a regular basis, and I will write articles relating
to them here. I hope that you find them useful. I also welcome
suggestions for future content. Click here for more information about
me.

If your organization is using JSON more and more, you may
want to start storing those documents in Db2 where you can query by
particular fields in the documents or just get entire documents when
needed. For some time now, Db2 has allowed you to store and query JSON
documents in the database. Since the JSON format is quite widely used
to communicate between processes and organizations, it is frequently useful
to store it. In Db2, you insert the entire document into a column in a
table and Db2 provides functions to query values, arrays or objects in the
document. There are also functions to generate JSON syntax from
relational data. In this article I discuss how to get started adding
JSON data to a table and querying it, since I believe that is most
useful.

In this article I’ll provide some examples that will allow
you to begin trying the JSON features using some of the more common
functions. This is only a sampling of what can be done, so I’ll provide
links at the end so that you can see all of the available functions and read
more about them.

I’ll start by discussing a JSON document that comes from
the solar power inverter on my house. I query it with a REST API
command (using cURL) and get back a JSON document. All data is in a JSON
object labeled “energy” (the key of a key/value pair) and the value has a
number of other keys including “timeUnit”, “unit” and “values”. The
value of the “values” key is an array of JSON objects that have the keys “date”
and “value”. The “{}” symbols denote a JSON object and “[]” denotes an
array.

Example 1. JSON Document from Solar Inverter

{"energy":{"timeUnit":"HOUR",
           "unit":"Wh",
           "measuredBy":"INVERTER",
           "values":[
             {"date":"2019-02-04 08:00:00","value":211.0},
             {"date":"2019-02-04 09:00:00","value":352.0},
             {"date":"2019-02-04 10:00:00","value":477.0},
             {"date":"2019-02-04 11:00:00","value":557.0},
             {"date":"2019-02-04 12:00:00","value":1038.0},
             {"date":"2019-02-04 13:00:00","value":1311.0},
             {"date":"2019-02-04 14:00:00","value":831.0}
           ]}}

JSON functions have been available since Db2 10.1 fix pack
1. However, the International Organization for Standardization (ISO) recently
released standards for JSON functions, and as of Db2 11.1.4.4 we have
incorporated those standards into the database. In all of my examples I will
use the new ISO functions.

If I have this table:

Example 2. Table for JSON documents

CREATE TABLE json.energy_json
  (SEQ INT NOT NULL GENERATED ALWAYS AS IDENTITY,
   READING_TS TIMESTAMP NOT NULL WITH DEFAULT,
   JSON_FIELD VARBINARY(2000))

The document in Example 1 can be inserted into the Example
2 table with this command:
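Here is a sketch of that statement; following the notes below, the whole
document goes in as a single string and is converted to binary with the
JSON_TO_BSON function:

INSERT INTO json.energy_json (json_field)
  VALUES (JSON_TO_BSON('{"energy":{"timeUnit":"HOUR","unit":"Wh","measuredBy":"INVERTER","values":[{"date":"2019-02-04 08:00:00","value":211.0},{"date":"2019-02-04 09:00:00","value":352.0},{"date":"2019-02-04 10:00:00","value":477.0},{"date":"2019-02-04 11:00:00","value":557.0},{"date":"2019-02-04 12:00:00","value":1038.0},{"date":"2019-02-04 13:00:00","value":1311.0},{"date":"2019-02-04 14:00:00","value":831.0}]}}'))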

1. I created the column for my JSON documents with the VARBINARY data type
and use the JSON_TO_BSON function when inserting the document.

2. Columns can instead be created with a variety of character and binary
data types including VARCHAR, BLOB, CLOB, VARBINARY and others. See the
“Overview” link in Table 1 below.

3. You will need to use BLOB and CLOB data types if your document exceeds
the page size of the tablespace.

4. Documents can be stored as native JSON strings in a character data type
or in BSON, which is a binary representation of JSON. BSON has some major
performance benefits when selecting values within the document or operating
on the document with the built-in functions. It can also save some space if
you don’t use adaptive compression in the database. However, there is some
overhead when converting documents to BSON and when converting back to
JSON. So if all you are doing is dropping the documents into the table and
retrieving them as-is, then storing them as a VARCHAR or CLOB may be a
better choice.

5. I use the JSON_TO_BSON function to convert the document to a binary
format to take advantage of the BSON format benefits.

6. In the insert example, I string the document all together because that’s
typically how JSON documents arrive. It is OK to use the indented format in
the insert if you like.

7. I created the table in the schema JSON, but you can use any schema name
you like or skip the schema and create the table in the default schema.

Frequently, your JSON document will be in a file so here
is an example of using the INGEST command to insert the same document from a
file called dailyenergy2019-02-04.json:

INGEST FROM FILE dailyenergy2019-02-04.json FORMAT POSITIONAL
  ($field1 POSITION(1:2000) CHAR(2000))
  RESTART OFF
  INSERT INTO json.energy_json (json_field)
    VALUES (JSON_TO_BSON(CAST($field1 AS VARCHAR(2000))))

Please note that the CAST function is required where shown
in this example or you will get an error.

Whether you INSERT or INGEST the data, here is a basic way
to query the JSON field, showing part of the output:

select seq, BSON_TO_JSON(json_field) as my_doc from json.energy_json

SEQ MY_DOC
--- ---------------------------------------------------------------
423 { "energy" : { "timeUnit" : "HOUR", "unit" : "Wh", "measuredBy" …

Notice that I used the BSON_TO_JSON function to convert
the data back to a readable format. If I had not done that, I would
just have seen a string of weird characters. Getting the entire string
may or may not be all that useful, but it is probably more interesting to just
select the fields that interest you. To get one value, you can use the JSON_VALUE
function. Here is an example of retrieving values for the “unit” and
“measuredBy” fields in the document in Example 1.

SELECT
  JSON_VALUE(json_field, '$.energy.unit' RETURNING CHAR(20)) AS unit,
  JSON_VALUE(json_field, '$.energy.measuredBy' RETURNING CHAR(20)) AS device
FROM JSON.energy_json

UNIT DEVICE
---- --------
Wh   INVERTER

In this select statement I called the JSON_VALUE function
twice to get two different values from the document. Notice that the
second argument I provide to the function in each case starts with
“$.energy.” This is how you traverse the various levels of the hierarchy
to indicate the specific fields you want. Refer back to Example 1 and
notice that “unit” and “measuredBy” are both inside the “energy” label.
It is also very important to use the correct case. For example,
“measuredBy” and “measuredby” are not the same.

My JSON example also contains an array. Here, I show
how to select the “date” and “value” fields of the fourth array element,
which I call fourth_date and fourth_value respectively in the output. In JSON
the first index into an array is 0, so to get the fourth element of an array,
you specify 3 as shown here.

SELECT
  JSON_VALUE(json_field, '$.energy.unit' RETURNING CHAR(20)) AS unit,
  JSON_VALUE(json_field, '$.energy.measuredBy' RETURNING CHAR(20)) AS device,
  JSON_VALUE(json_field, '$.energy.values[3].date' RETURNING TIMESTAMP) AS fourth_date,
  JSON_VALUE(json_field, '$.energy.values[3].value' RETURNING DECIMAL(5,1)) AS fourth_value
FROM JSON.energy_json

UNIT DEVICE   FOURTH_DATE         FOURTH_VALUE
---- -------- ------------------- ------------
Wh   INVERTER 2019-02-04 11:00:00        557.0

If you prefer to get your values from a table function
then the JSON_TABLE function is for you. This example provides output
similar to the query above, except that I also include the fifth element of
the array:
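Here is one way to write such a query; this is a sketch, with fifth_date and
fifth_value as my own column names, and it assumes the ERROR ON ERROR clause
at the end, which Db2 requires:

SELECT t.*
  FROM json.energy_json e,
       JSON_TABLE(e.json_field, 'strict $'
         COLUMNS(unit         CHAR(20)     PATH '$.energy.unit',
                 device       CHAR(20)     PATH '$.energy.measuredBy',
                 fourth_date  TIMESTAMP    PATH '$.energy.values[3].date',
                 fourth_value DECIMAL(5,1) PATH '$.energy.values[3].value',
                 fifth_date   TIMESTAMP    PATH '$.energy.values[4].date',
                 fifth_value  DECIMAL(5,1) PATH '$.energy.values[4].value')
         ERROR ON ERROR) AS t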

The 'strict $' string must be the second
parameter passed to the function. That’s currently the only value
allowed.

If you don’t want to construct the input to these
functions every time you want to query the JSON data, you can create views on
the data in the documents. To do that you just use the functions when
creating the views and then you just run normal select statements to see the
data. For example, you could create a view based on the JSON_TABLE
example above like this:
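For instance, a sketch along these lines would work; the view name is my own
choice, and the ERROR ON ERROR clause is assumed to be required as in the
JSON_TABLE discussion above:

CREATE VIEW json.energy_readings AS
  SELECT e.seq, e.reading_ts, t.*
    FROM json.energy_json e,
         JSON_TABLE(e.json_field, 'strict $'
           COLUMNS(unit         CHAR(20)     PATH '$.energy.unit',
                   device       CHAR(20)     PATH '$.energy.measuredBy',
                   fourth_date  TIMESTAMP    PATH '$.energy.values[3].date',
                   fourth_value DECIMAL(5,1) PATH '$.energy.values[3].value')
           ERROR ON ERROR) AS t

After that, a plain SELECT * FROM json.energy_readings returns the document
fields as ordinary columns.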

As I noted earlier, if you are going to do much with
functions like JSON_TABLE and JSON_VALUE, whether directly or within views,
then you will want to store your documents in BSON format since it performs
better. If you typically query your JSON data by certain fields, then
adding indexes can make your queries even faster. Here is an example of
adding an index on the date of the first reading of the day:

CREATE INDEX JSON.IX_ENERGY_DATE ON json.ENERGY_JSON
  (JSON_VALUE(json_field, '$.energy.values[0].date' RETURNING DATE))

This index will make queries like either of the following
run faster if you have a significant number of rows in the table:
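For example, sketches like these would qualify; note that the expression in
the WHERE clause must match the index expression exactly, including the
RETURNING data type, for the index to be considered:

SELECT seq, reading_ts
  FROM json.energy_json
 WHERE JSON_VALUE(json_field, '$.energy.values[0].date' RETURNING DATE)
       = '2019-02-04'

SELECT count(*)
  FROM json.energy_json
 WHERE JSON_VALUE(json_field, '$.energy.values[0].date' RETURNING DATE)
       BETWEEN '2019-02-01' AND '2019-02-28'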

In addition to using SQL to select and manipulate JSON
data, there are other interfaces that are likely to be more familiar to
developers who regularly use NoSQL interfaces. These include the Java
API and a MongoDB listener. The latter allows those familiar with the
MongoDB client to use it to connect to DB2 and store and retrieve JSON
documents. There is also a JSON command line shell you can use to
administer objects used by the other interfaces. See the Client
access to JSON documents page for more information. JSON documents
inserted through these interfaces can be queried and updated by the new ISO
functions described in this article.

George Baklarz and Paul Bird are writing a book on using
JSON in Db2. I’ve read a draft of it and it looks great. I’ll
update this article with a link to it when it becomes available.

There are a number of reasons you may want to use the Db2
JSON features. One of the first that comes to mind is the ability for
developers to be productive and use only JSON interfaces they are used to,
while at the same time providing SQL access to the same data. One
possible use for the relational interface is to do business intelligence
reporting using tools that work best using SQL queries to access data.
Another is to do ELT processing where you load the raw JSON data into staging
tables, and use the functions described here to copy the data into Db2 BLU
column organized tables for blazingly fast BI queries.

Another use case is where your organization gets data from
the outside in JSON documents and you want to keep them for a while for
auditing. When there is a problem downstream with the use of that data
it is helpful to be able to review the particular documents to see if the
problem was with what was sent or with the way you processed the data you
received. So, if you just drop all of the documents into a Db2 table as
they arrive you can easily manage them and quickly find the ones of interest
with simple queries as shown in this article. You can then test with
those particular documents and send back copies of the problem documents if
you find that they do not conform to agreed-upon formats. Further, if
you timestamp the rows as I show in my examples here, you can delete old rows
with simple delete queries when enough time has passed.

***

I’m sure that there are quite a few other ways that using
the JSON features of Db2 would be a benefit. If you discover any
creative uses of JSON in Db2 please post them on my Facebook Page or my db2Dean
and Friends Community along with any other comments you may
have.