Thursday, January 14, 2010

When you identify a theme with an unusually high "database time", it's time to look for database fixes. While typical database query optimization tips and tricks still apply, spatial queries have some unique characteristics, especially when the base table of your 'troubling' theme contains a large number of rows.From my own observations over the years, in many cases the "database time" of a MapViewer theme is actually database idle time or wait time. To be more specific, it is time the database spent waiting on disk I/Os. This is probably nothing shocking given that many MapViewer theme queries can fetch a lot of data in terms of pure volume (as measured in total bytes), especially when polygon or polyline type geometries are involved. Compared with a theme query, your typical shopping cart query that fetches a product listing is almost negligible as far as result data volume is concerned. With a huge base table, chances are the database buffer cannot easily cache all the data, so it often needs to read physical disk blocks in order to fetch and return data for a query.

When diagnosing database bottlenecks such as disk I/O waits , there are two powerful tools that come with Oracle database 10g and later versions. They are AWR (Advanced Workload Repository) and ADDM (Automated Database Diagnostic Monitor). I found them invaluable in finding out why your MapViewer application is running slow, as they can not only identify I/O bottlenecks, but also many other kind of bottlenecks in the database. Note however they do not replace other tuning tools such SQL Tuning Adviser. In fact ADDM will often suggest you run SQL Tuning Adviser on specific SQL queries it deemed problematic.

Let's jump right into action already. The following are the basic steps of using these two tools.

Step 1 Login as DBA, then create a "before" snapshot of the database: SQL> exec dbms_workload_repository.create_snapshot;

Step 2 Run your MapViewer application load tests. When done,

Step 3 Create an "after" snapshot of the database, using the same command: SQL> exec dbms_workload_repository.create_snapshot;

With these two snapshots, you can now execute the AWR and ADDM scripts to generate very informative and detailed reports on what happened inside the database during these two snapshots. Keep in mind that the stats gathered by AWR and ADDM cover all database activities for all sessions, not just sessions created by MapViewer to perform theme queries.

Step 4 Generate an AWR report.To create an AWR report, execute the following as DBA:

SQL> @$ORACLE_HOME/rdbms/admin/awrrpti.sql

When prompted, use HTML as the report format, and pick the two snapshots that you just created.

After exiting SQL*Plus, you will see an HTML file containing the AWR report. Open it in your browser, and you will be greeted with a rich set of database activity reports.One of the interesting tables is titled "Top 5 Timed Foreground Events". For instance, this is the result from a recent AWR report generated for my spatial database:

As you can see, it shows that the database spent a whopping 96.55% of time waiting for User I/Os (db file sequential reads) ! You can also check out several other tables such as "SQL ordered by Elapsed Time" and "SQL ordered by User I/O Wait Time" et al. It is also quite easy to find out which SQL is causing the most User I/Os, so on and so forth. Of course, your report may not show user I/O as an issue, but you get the idea.

While AWR is very useful in providing you with a detailed break down of the database events and timings, it is best complemented by the adviser tool, ADDM, which reads off the same set of stats between the two snapshots, but can provide you with concrete advices on how to fix the bottlenecks in your database.

Step 5 Generate an ADDM report.

While you can always use EM to view the ADDM report, here we just want to generate a simple text report using SQL*Plus by running the following command as DBA:

SQL> @$ORACLE_HOME/rdbms/admin/addmrpt.sql;

Again this script will prompt you for the before and after snapshots' IDs. Then it generates a .lst file which you can open with any text editor. In this text report you can find some very insightful suggestions. For instance, in one of my reports it detected that the disk containing the database data file of a large theme's base table has a below-average I/O speed, and suggested file striping as one way to reduce I/O waits. This is exactly the kind of advices that can immediately provide a performance boost.

While I have been stressing the negative impact of heavy user I/Os, please note that your situation may very well be a different one. For instance you may not have an I/O bottleneck at all, so please be open minded when reading these two reports and be prepared to wrestle with other kind of issues within your database.

Reorganizing your large tables

Let's say you indeed find yourself in a similar situation where you see a lot of User I/O wait events, especially on your large tables (those with hundreds of thousands or millions of rows). Then how do you tackle this issue? One of the most obvious remedies is to partition both your data tables and associated spatial indexes, if they are not already partitioned.

The other tip, which will be described in detail now, is to reorganize the table rows based on the proximity of the geometry column. The basic idea is to store rows in neighboring database blocks if their geometries are close to each other spatially. When stored this way, rows that are to be returned for a theme query will more likely be found in neighboring data blocks on the disk, thus significantly reducing disk reads. This re-org will help most spatial queries with predicates such as SDO_FILTER or SDO_RELATE. MapViewer for instance always uses SDO_FILTER for normal map panning/zooming and WMS type map requests, thus can benefit from such a table wide re-org, sometimes tremendously.

Lets first see what happens if you do not reorganize your table. Lets say each disk block has capacity for 100 rows. In a ‘uniform but random storage’ case, for any given query window, each "hit" block may only have 10 rows that are within this window. Now lets say a particular map request’s query window interacts with 1000 rows according to the spatial index. Assuming the table is so big and the incoming query windows change so often that the database data buffer is basically useless. What the database will end up doing then is reading 100 blocks from the disk in order to fetch these 1000 rows based on their ROWIDs produced by the spatial index. Reading so many blocks for a single query is very expensive. Now imagine 20 concurrent incoming map requests, each having a different viewport, and you can easily imagine why the database spends majority of its time waiting for disk reads. As a result, the overall throughput of MapViewer plummets.

Script for Reorganizing a large table

So how do you reorganize row storage? The following script is actually documented in the “Pro Oracle Spatial for Oracle Database 11g” book, Chapter 16. It uses a linear tessellation key to re-order the geometries stored in a table. For those of you don’t have access to this book, here are the (slightly modified to fix several typos in the book) scripts you can copy and use.

First, create a package in the MDSYS schema (it won't work in any other database schema!) :

Now re-run your MapViewer tests and see if this reorganization has helped its overall performance. In one internal and not so rigorous experiment, after reorganizing a large table using the above approach, we observed as much as 300% increase in overall MapViewer throughput.

In summary, AWR and ADDM are two of the more useful tools at your disposal when diagnosing high database times, and pay special attention to excessive database reads. This blog also assumes that your typical MapViewer theme is a pre-defined geometry theme with relatively simple query conditions. If your theme uses super complex dynamic SQL queries, then chances are you have more than disk reads to worry about. In any case, database tuning in general, and Spatial SQL tuning in particular, is a very deep topic in itself, and I'm really in no position to offer any authoritative advice; so please don't blame me if my little tip above did not actually reduce your database time :)

Wednesday, January 13, 2010

This is the first of a series of blogs on MapViewer performance tuning.

Overview

When your MapViewer application performs poorly, there are many possible causes. Maybe your application is trying to (unnecessarily) display way too many features on a map or layer. Maybe your database query (for dynamic themes) is too complicated and poorly optimized. Or maybe the database is doing too many disk I/Os when fetching the result set for a theme. It could also be due to a poorly configured MapViewer in-memory cache, an overwhelmed J2EE container (overloaded with many other applications), or just a very slow middle-tier box. It could be all of the above. The symptom however is always the same: when a user clicks on the map to pan or zoom, she will wait a long time before the map completely refreshes itself.

So how to find the true bottleneck? Where do you start when diagnosing a slow mapping application? The short answer, is to first identify the most time consuming theme(s) in your application, by looking at the log file.

To do so set MapViewer's logging to the "finest" level, restart MapViewer, and run the application just long enough (or manually submit a single XML map request) to generate some meaningful logs. When scanning the freshly created MapViewer log file, for each theme you should see several log records like the following:

In the first log record, pay special attention to the last number, 2679ms. This number (in milliseconds) indicates how long it took the database to execute the theme query, fetch result rows from data blocks, and transmit the result set over the network to MapViewer. For simplicity we will call this the "database time". Likewise let's call the combined rendering and labeling time indicated by the second and third log records the "MapViewer time"; after all that is how long MapViewer actually spent doing something useful instead of just waiting for and receiving data from the database.

In many cases, the database time will be the dominant one, so you will need to poke around the database and speed things up there. We will get into it later.

If the MapViewer time is high, we will need to find out why. One recommended approach is to run a CPU profiling of the MapViewer instance. Again we will provide more details in a later post.

Reducing these two types of time will have a positive impact on your application's performance regardless of which MapViewer API you are using.

Note that starting with MapViewer 11g R1 there is another way to easily spot themes with excessive "database time". To do so you must first log in as the MapVieweradmin user. Then manually open the /mapviewer/admin.html page in your browser. You will see a "Top theme queries" text area with a Submit button underneath it. Click on the submit button to see a list of top (database) time-consuming queries. It's a good idea to first click the "Reset top theme queries" button so that MapViewer clears any existing list of themes and starts gathering fresh stats for you.

Sometimes you may find that all of your themes are loaded and rendered quickly, but your application still takes a long time to display the map. Let's simply call this the "user time", since it is the time a user spent staring at a spinning clock or progress bar. When the sum of the "database time" and "MapViewer time" accounts for only a small portion of the "user time", we suggest you use a FireFox addon called "FireBug" to find out what exactly is happening on the wire and in your browser. For instance, is your Oracle Maps application overwhelming the remote web server with too many HTTP requests for map tile and FOI images? Maybe you should look into using the Whole-Image option for your FOI layer(s) to avoid transmitting many small pieces from MapViewer to the client every time a user pans or zooms the map. We will provide more tips like this in a later blog as well.