MaxDB series: …just SQL

2006/04/12by admin

Dear MySQL users, MaxDB users and friends,

The previous “please wait” posting was caused by a new guideline on series postings. The new guideline says that postings must be written one week before they get published. The week between the writing and the publication is used for corrections. I’m writing these lines on April, 4th but the article won’t be published before Wednesday April, 12th. We hope that in the time between writing and publishing we can catch most of the Gotchas to further improve the quality of the series. However, this series will continue to be more like a collection of blog postings than a book, a magazine article or any other “official documentation”. We try to do our best, but unfortunately we cannot apply the same amount of quality assurance and proof-reading to the MaxDB series that we do for magazine articles, official documentation or official class materials.

In this issue

In the past, we recognized that most MaxDB users do have some SQL knowledge. Hardly any of the typical MaxDB users seem the be beginners in the field of databases. Therefore SQL basics are not of interest. For example, most users who asked us SQL questions in the past did know what kind of joins exist, what the difference between a right join, left join and a full join are and how to define tables. Also, there seems to be no demand on describing the basics of views as many articles did when MySQL 5.0 introduced views to the MySQL server.

MaxDB users seem to be more interested in the limits and additional features over the SQL92 Entry Level compatible SQL dialect implemented by MaxDB. For this reason, we will often provide you only with brief overview information on some topics. If you need to details, check the SQL Tutorial chapter in the MaxDB documentation. Other online resources like the SQL Course and the SQL Course 2 may also be worth reading if you want to learn the very basics. Those who like printed books better and prefer reading on a sunny balcony, can check the book listing on the MySQL Developer Zone.

Let us start the discussion of the MaxDB SQL dialect with something very basic: data types. In general you should always select the smallest data type for a column that can hold all values you want to store in the column. Try to keep your data as compact as possbile to save costy disk-read operations and to hold the most records in as short a buffer cache as possible.

For comparisons with MySQL data types, check what mappings have been choosen for the MaxDB plugin of the MySQL Migration Toolkit. The mappings are desribed in two german language blog postings. Although the texts are written in german, no german language knowledge is required to understand the tables given in the articles on mapping strings, BLOB/LONG, numeric types and temporal types.

MaxDB is using a packed decimal storage format for INTEGER, FLOAT and FIXED values. This means that if you store the same values in three columns of the three data types (with comparable precisions), you will get the same performance when doing comparisons.

As you might have noticed there is no UNSIGNED INTEGER data type in MaxDB like in MySQL. It is not needed, because MaxDB supports check constraints. The check contraint syntax is accepted by MySQL and does not lead to any errors, but MySQL leaves it to the client to check the validity of values.

In cases where you are using the same check constraints for multiple tables, consider defining a domain. A domain is a value range definition which consists of a data type definition, an optional default specification and an optional constraint definition. Using domains instead of individual definitions in each table helps you to ensure that certain columns always accept the same value ranges.

MaxDB enables you to define an alternative name (a synonym) for a table. The synonym can be made visible to other users using the keyword PUBLIC. There is not counterpart to synonyms in MySQL. However, you can emulate the feature on MySQL using a view.

It seems less know that MaxDB features function-based indexes. Only user-defined functions can be used with a function-based index, but this is no severe restriction. It’s simple to use a user-defined function as a wrapper for a build-in function as shown in the following example. The example gives a solution to a common pitfall for users with a MySQL background. MaxDB does always perform case sensitive string comparisons whereas old MySQL versions did case insensitive comparisons. As of version 4.1 of the MySQL server you can use a collation to control the comparison rules. However, MaxDB has nothing comparable: ‘ALbert’ = ‘Albert’ and ‘ALbert’ LIKE ‘Albert’ both evaluate to false.

For a case insensitive search, you have to convert both operands to upper or lower cases, for example: SELECT * FROM PEOPLE WHERE UPPER(first_name) = UPPER(‘ALbert’). It would not be a big deal to use such a query if the query would still profit from an index on the
column first_name. But this is not the case. A normal index on the column first_name cannot be used efficiently, because it stores the original value of the first name – ‘Albert’ – and not a upper case version of value – ‘ALBERT’. All values stored in the index must be converted to upper case before the comparison against UPPER(‘ALbert’) can be made. No extra conversion of index values would be needed if the results of the function UPPER(first_name) would have been stored in the index. This is what a function-based index does.

Note: A bug in the current version of MaxDB 7.6 leads to a wrong cost value calculation for function-based indexes. The optimizer will not use any function-based indexes. The problem will be fixed in 7.6.00.26. You can expect to see a new MaxDB version during the next few weeks that contains the fix.

Sequences are number generators. For example, number generators can be used to create primary key values or to record the insert order of the rows of a table by adding an extra column to the table that gets filled with sequential numbers from a sequence.

Let us assume you have a table of registered users in the database and you want to keep track of the order in which users have registered themselves. The username serves as a natural primary key for the table. The column registration_timestamp contains a timestamp with the date of the registration. Although the timestamp includes microseconds it still could theoretically happen that two users register at exactly the same point in time. Therefore another column has been added which gets filled with sequential numbers from the sequence registered_users_seq. When doing the insert, <schema_name>.<sequence_name>.NEXTVAL gets used to retrieve the next number from the sequence. The current number – the last number returned for a <schema_name>.<sequence_name>.NEXTVAL call – can be retrieved using <schema_name>.<sequence_name>.CURRVAL. Note that you must specify the schema when calling the number generator.

You cannot use <schema_name>.<sequence_name>.NEXTVAL for the default specifition of a table column. But MaxDB has something that is roughly comparable to AUTO_INCREMENT in MySQL: SERIAL. A column that has a DEFAULT SERIAL definition will be automatically assigned a sequential number if no other value is given for the column. Note that you can have only one DEFAULT SERIAL column definition per table and that you cannot update the values assigned by the database. This makes SERIAL especially useful for columns tracking the insert order of records as you cannot fake any other orders afterwards by updating the columns with DEFAULT SERIAL values.

Result sets can be limited to the first n results by adding a ROWNO predicate to the WHERE clause of a SELECT statement. ROWNO is in some
ways similar to LIMIT in MySQL, but it is not as powerful as LIMIT. Using ROWNO you can only limit a result set to the first n records. You cannot use ROWNO to get a list of the 1th to the 20th records of the result set. In case you need this, use cursors instead. On the other hand you can add the ROWNO to the list of columns of the result set which has no counterpart in MySQL.

SELECT ROWNO, * FROM registered_users WHERE ROWNO < = 3

Plans exist to add full LIMIT (TOP) support to future versions of MaxDB.

A cursor is a named result table. Functions exists to drop the named result table and to access its records in sequential and random order. The functions to access cursors are provided by the programming interfaces, for example JDBC, DBI and ODBC. But cursors can also be accessed using SQL commands. It is a bit difficult to demonstrate the use of the SQL commands. The SQL Studio uses ODBC to communicate with MaxDB. ODBC does not support the SQL commands. Therefore, we have wrapped a simple database procedure around the SQL commands for
working with cursors. This way, the commands are run “inside” the server. Alternatively we could have used a non-ODBC client (e.g. sqlcli), but we assume that most readers are using some sort of ODBC-based GUI to access MaxDB.

When you are new to database procedures, pay special attention to all loops inside a procedure. Ensure that the loops will not run indefinetly but will terminate. Hint: $RC is set not only by FETCH, but also by INSERT!

In the example a loop gets used to fetch the 1st, 3rd, 5th and so on user name from the table registered_users to insert it into another table. Instead of calling FETCH NEXT twice to skip one record, you could calculate an offset value and use FETCH POS(offset) instead. Note that inside a database procedure you do not need to open a cursor explicitly. Please check the manual on further important syntax details.

CREATE DBPROC BASIC_CURSOR_EXAMPLE AS
VAR username CHAR(64);
DECLARE users_cursor CURSOR FOR
SELECT username FROM DBADMIN.registered_users ORDER BY registration_order;
FETCH NEXT users_cursor INTO :username;
WHILE $RC = 0 DO BEGIN
INSERT INTO DBADMIN.selected_users(username) VALUES (:username);
FETCH NEXT users_cursor INTO :username;
FETCH NEXT users_cursor INTO :username;
END;
CLOSE users_cursor;

The hottest aspects of MaxDB cursors is that they can be used for recursive queries. Recursion is particulary helpful to traverse tree structures. A classical example of a tree structure is the reporting structure of a company: Ulf reports to Patrik, Patrik reports to Kaj and Kaj reports to Marten. It is difficult to retrieve a list of all employees reporting to Kaj using classical SQL and you have to use a “trick” like “Nested Sets”. With MaxDB, however, you can create a recursive cursor that builds an initial result set and adds new records to it until it does not find any further hits, then returning the entire result set to you.

The syntax of a recursive cursor reflects the three steps explained below. A recursive cursor consists of an initial select, a recursive select and a final select. When the cursor gets executed and the names result table gets built, the initial SELECT is executed first. We will show step by step how it works, and explain the complete example below.

DECLARE report_cur CURSOR FOR WITH RECURSIVE
emp(emp_name) AS (SELECT employee FROM DBADMIN.employees WHERE boss = :boss_name
UNION ALL SELECT employee FROM DBADMIN.employees INNER JOIN emp ON employees.boss = emp.emp_name)
SELECT emp_name FROM emp ORDER BY emp_name

The example shows how to find all people reporting to Kaj. In the example, the initial select is used to retrieve all employees that report directly to Marten.

Original data set (table employees)

Result set of the initial select (reference name of the intermediate result set used during recursion)

The initial select identifies Kaj as the person reporting directly to Marten and adds Kaj to an intermediate result set. The intermediate resultset has been given the reference name emp. This reference name can be used by the recursive query which is executed next. The recursive query joins intermediate resultset with the original data set and looks for all people reporting to Kaj. It finds Patrik. The UNION ALL part of the recursive cursor statement adds Patrik to the intermediate result. As there have been results, the execution of the recursive query is repeated. This time, the query looks for all people reporting to Patrik and adds Ulf. As there have been results, the execution of the recursive query is repeated, but there is nobody reporting to Ulf. No more records are added to the intermediate result set and MaxDB continues with the final step, the final select.

This issue has focused on several interesting details of the MaxDB SQL dialect. It might have looked like a “random” selection and in some ways it has been such. The next issue will also be on the SQL dialect, but it has only one topic: transactions. We will repeat the basic properties of transactions, explain about isolation levels, locking, subtransactions and say some words about deadlock detection.

Meanwhile you could check the manual for system triggers. System triggers are called after the start of a database instance… Check it out!