Add "where" to your life

Comparing Oracle GeoRaster with PostGIS WKT Raster (II)

September 1, 2010

In the previous post, I installed and configured Oracle GeoRaster, and did some basic stuff: load a shapefile, load raster data, view these data and export them, using PL/SQL and Java. We compared some points I consider Oracle GeoRaster’s weak points with PostGIS WKT Raster.

PREVIOUS STEPS

We’re going to use OpenJUMP, to see the results of our queries. So, we first install it. Easy task. But now, we need a plugin that allows OpenJUMP to retrieve geographic information from our Oracle Database. The plugin I used was JUMP DB Query Plugin.

The instructions are pretty clear, so, I only had to:

Unzip the plugin files into the OpenJUMP lib/ext folder

Download and unzip the correct Oracle JDBC driver for my Oracle Enterprise Database version (11.1.0.6.0) in the OpenJUMP lib/ext folder.

Run openjump.bat

And that’s all. Ready to start.

Loading shapefiles (caribou distribution shapefiles)

I used the cariboupoint table generated in PostGIS, just as described in the tutorial. Using pgsql2shp, I dumped the table data into a shapefile.

Dumping cariboupoints table

Then, using shp2sdo again, I converted my data to SDO format:

Transforming cariboupoints SHP to SDO format

Of course, I could create the data using a PL/SQL procedure, just like Pierre did in his tutorial. But the procedure was created only for providing a shapefile to work with, in case you don’t have access to real caribou distribution data. We can use the same file.

Visualizing the caribou point in OpenJUMP

Now, from OpenJUMP, run the query select geom from cariboupoints. The result, here:

Viewing cariboupoints table from Oracle in OpenJUMP

Loading raster data

As we showed in the previous post, we need to insert 4 GeoTIFF files, covering the whole area of Spain. Specifically, 4 SRTM 90m DEM files. There are 2 more files, covering the area of Canary Islands, but they are not continuous with the rest of the data. As Oracle GeoRaster uses a one-georeference-by-layer georeference schema instead of one-georeference-by-raster one, you can’t store non-continuous raster data in the same raster data table. For the purposes of this tutorial, we only need the 4 files covering Spain area.

So, as we saw in previous post, we first reformat and reblock GeoTIFF images, to ensure they don’t have BSQ interleaving:

-co “BLOCKXSIZE=50″, -co BLOCKYSIZE=50”: Dimensions for tiles. The same used in WKT Raster tutorial.

Further information about GeoTIFF driver for GDAL here. Now, we have to repit this with every single file we’re going to load. Once done, we can load the files in Oracle GeoRaster, using SDO_GEOR.importFrom procedure as follows:

The option ‘blocksize=(50,50)‘ is not really needed. As we can read in the Oracle Spatial Developer’s guide, storage parameters section, “If you specify neither blocking nor blockSize, default values are derived from the source GeoRaster object: that is, if the original data is not blocked, the data in the output GeoRaster object is by default not blocked; and if the original data is blocked, the data in the output GeoRaster object is blocked with the same blocking scheme”. As long as we specified block size when reformatting GeoTIFF files, we wouldn’t need to use that option now.

The option ‘spatialExtent=TRUE‘ will be necessary later. Anyway, if we omit this option, we can generate the spatial extent of the raster objects by SDO_GEOR.generateSpatialExtent procedure, once raster data are loaded. Further information on generating and setting spatial extents in the Oracle Georaster Developer’s Guide, section 3.6

We provide, as additional arguments, the World File, with georeference information.

Now, we can see the extent coordinates in the correct reference system:

GeoRaster spatial extent ok

As you can see, the extent of the raster is represented as a polygon in SDO_GEOMETRY format. For me, looks complicated… If you want to know more about the meaning of all fields of the SDO_GEOMETRY object, you can check it here.

What is the reason of keeping a representation format so complicated? Ok, it could be my impression. After all, I’m new in this world. Anyway, I started a thread on PostGIS devel list, asking for this. One important topic that came up in conversation is Oracle has a topological storage model. But if you want to work with this model, you must use the SDO_TOPO package. And there’s probably no one tool able to work with this model, apart from Oracle itself, of course.

Take into account, in this particular case, the SDO_GEORASTER table is composed of 4 objects, because we had 4 GeoTIFF files. So, the extent is calculated for each object, not for the whole coverage. If we want only one object, we should:

The last thing we have to do with raster data is… create and index. As we can read in Oracle GeoRaster developer’s guide, section 3.7, “the most important index you can create on a GeoRaster object is a spatial index on the spatial extent (footprint) geometry of the GeoRaster object”. And we need to take one more thing into account… According to the documentation (chapter 18), one of the prerequisites for creating an spatial index over a geometry column is “The USER_SDO_GEOM_METADATA view must contain an entry with the dimensions and coordinate boundary information for the table column to be spatially indexed”. And again, in the documentation (chapter 2, section 2.8): “Spatial users are responsible for populating these views. For each spatial column, you must insert an appropriate row into the USER_SDO_GEOM_METADATA view. Oracle Spatial ensures that the ALL_SDO_GEOM_METADATA view is also updated to reflect the rows that you insert into USER_SDO_GEOM_METADATA.” .

Of course, you need to install all the PostGIS stuff. You have an useful tutorial on how to do this here (in Portuguese, but I think it’s easy to follow. If you find it difficult, you can read WKT Raster documentation or ask in the PostGIS users list.

So far, I find too difficult the simple operation of loading GeoTIFF data in Oracle GeoRaster, comparing it with the same operation done with PostGIS WKT Raster. Of course you can always ask in Oracle Spatial forum (register required). The questions I’ve asked were quickly answered. Really great people in these forums.

Another important difference between both extensions are the indexes. Oracle GeoRaster creates spatial indexes over the footprint of raster data, and WKT Raster creates GiST indexes over the raster data itself. Just a newbie guessing: as far as I know, the operations you can do with raster data itself in Oracle GeoRaster are rather limited. Spatial operations, like intersections, can only be done with the MBR of the data, so, that’s the reason the indexes are created over the footprint. On the other hand, in PostGIS WKT Raster, you can do really heavy operations with raster data, like ST_DumpAsPolygons. Then, create an index over these raster data has more sense.

And if you want to interact with real raster data in Oracle GeoRaster? You can use the functionalities provided by SDO_GEOR package or directly attack the data blobs, using DBMS_LOB package. But this last option is a blind attack over raw binary data.

About the spatial extent of the raster data, in WKT Raster the spatial extent of a raster is calculated during the loading process, using GDAL provided raster dimensions, and calculating the georeferenced coordinates for the upper and lower corners. But in this case, the spatial extent of the raster is the enclosing geometry in its model space coordinate system. In Oracle GeoRaster, this is not necessarily in this way. Better idea? Worse idea? Honestly, I don’t know. An expert may surely provide and smarter point than mine. Reference: Oracle Georaster Developer’s Guide, section 3.6

Making buffers around the caribou points

Next step is to make 1km buffers around the caribou points, and reproject our buffers to WGS84, as our raster coverage. We’ll later use the buffers as sampling window for calculating statistics over an area of the GeoRaster data using SDO_GEOR.generateStatistics. We use a sampling window with Oracle GeoRaster because we can’t intersect vector and raster data. The closer operation is the one performed by SDO_GEOR.generateStatistics, or SDO_GEOR.subset. This is, select a window to clip an area of the raster and get statistics or raster data over this area.

Problem: The cariboupoints table doesn’t have a valid SRID. I forgot to specify it when dumping the table from PostGIS. We can set the SRID now:

Cariboupoints table update

We have to update the metadata too:

Cariboupoints metadata update

Problem: As we can read in SDO_GEOR.generateStatistics documentation, the sampling window must be rectangular, so, we can’t create round buffers around the cariboupoints. My solution: use the MBR of the buffers as sampling window.

Round buffers with MBR

This is the SQL code for creating the table with rectangular buffers around cariboupoints elements:

And in PostGIS?: As in Oracle Spatial, when you create a new table, you can create a GiST index over the spatial column (or a R-Tree index, but GiST indexes are recommended). However, I think the process in PostGIS is simpler: you only have to create the table and then the index. In Oracle, you first have to update the metadata view, as seen. And you need to know not only the srid of your spatial data (easy to get) but the dimensions limits, and provide a tolerance, one concept you doesn’t need to necesarily know in Oracle’s context.

Let’s continue. Now, we show in OpenJump 2 layers:

One layer with the extent of our rasters (we can’t polygonize them with Oracle GeoRaster, like in PostGIS WKT Raster). SRID 4326.

Intersecting the caribou buffers with the elevation rasters

Big Problem: As said, you can’t intersect vector raster data with Oracle Database. Oracle Spatial has some spatial operators, and a geometry package, and you can perform some spatial operations. But Oracle GeoRaster doesn’t have this capability.

The solution adopted was to:

Calculate what are the buffers that intersects the spatialExtent of the GeoRaster objects.

Using these buffers as sampling windows, generate statistics for each pair buffer-georaster object, and store these statistics in a table

Using the statistics generated, calculate the weighted mean elevation of the raster areas delimited by the buffers, and store the results in another table.

-- Intersects spatial extent of the rasters with the buffers
for intersection in
(select t.id, t.geom, r.image_id as rid from spain_images r,
cariboupoint_buffers_wgs t
where sdo_geom.sdo_intersection(r.image.spatialExtent, t.geom, 0.005)
is not null)

The code can be improved, as Jeffrey Xie suggested in this post, by using the result of the intersection operation as sampling window. Anyway, this code runs very fasts. Less than 2 minutes. Only 212 buffers of the 814 intersects with the raster extents. Here, we can see a screenshot of the raster data and the intersecting buffers only (212). We can compare it with the previous OpenJUMP screenshot, with all the points (814).

To really intersects vector and raster data (NOT vector with raster MBRs), we have to polygonize the raster, and then intersects these polygons with the vector data. This is a really heavy operation. In Oracle, you only check intersection betweem MBR, and only 212 of the 814 buffers intersect with the raster

The sampling windows used in Oracle GeoRaster for intersection are rectangular. The buffers intersected with the raster data in WKT Raster are not rectangular.

Summarizing the elevation values for each buffer and exporting the table to CSV

As Pierre did, our last step is to compute the weighted mean elevation raster. We’ll do in four steps:

Create a table to store the mean elevation data. The SQL code is:

create table means_table(id number, rid number, mean number)

Fill the table. We have a problem here, because the varray field (the statistics obtained by SDO_GEOR.generateStatistics) must be cast to a nested table to access its fields, like we can read here. The PL/SQL code to do this is:

Export the data to a CSV file. We need more PL/SQL for this. In this page there is an excellent method to dump query results in a csv file.

Conclussions

With Oracle GeoRaster you can do a basic raster/vector overlay analysis: compute pixel value statistics on areas delimited by vector polygons. But, in my opinion, it’s a hard task, because of non-intuitive tools and operations, and the fact Oracle GeoRaster wasn’t thought for spatial analysis, but mainly for raster data storage. And even using it for raster data storage, I think it’s a bit difficult. On the other hand, with PostGIS WKT Raster, you can perform this kind of operations in an easier and more intuitive way, at least for a non-experienced user, like me.

Next post: comparing the ideal raster-on-database support with both extensions: Oracle GeoRaster and PostGIS WKT Raster

By the time you wrote your previous post you probably was not aware that there is now a driver for GeoRaster in GDAL too. Then you can time how long it would take to load and extract a Geotiff from GeoRaster and from WKT Raster. That would be interesting.

Besides that I would suggest you considering comparing PostGIS WKT Raster with other implementation of raster on PostGIS. like ArcSDE, TerraLib and Rasdaman. There are designed for PostGIS only and unfortunately the GDAL TerraLib driver is not completed (my fault) but that would be interesting study too. Don’t you think so?

Thanks for the information about GDAL GeoRaster driver. I knew about its existence, but I thought it wasn’t finished yet. For sure loading and extracting data with GDAL driver would be an interesting test (BTW, I will continue working on GDAL WKT Raster driver after FOSS4G).

Anyway, I wanted to use the default loader/exporter for each implementation, and that is PL/SQL&Java for Oracle GeoRaster and gdal2wktraster&GDAL for PostGIS WKT Raster.

About other implementations, yes, you’re right. It would be interesting to compare PostGIS WKT Raster with them. Particularly, I’m very interested in RasDaMan. But for this time, I only wanted the closer implementation to WKT Raster, in the sense I was looking for a “raster-as-native-db-data-type” implementation, and that was Oracle GeoRaster.

Thanks again for your suggestions for further studies. I’d like to work on them as soon as possible

I think your arguments are valid about difficulties for loading to GeoRaster. But in the same way that WKT Raster is relaying on GDAL to state how easy it is to load a raster, you might as well give the same benefit to GeoRaster. So please try:

I could not understand your point here: “As Oracle GeoRaster uses a one-georeference-by-layer georeference schema instead of one-georeference-by-raster one”. I guess there is a little bit of misconception here. See, the GeoRaster doc says: “Layer is a logical concept in the GeoRaster data model. Layers are mapped to bands.”. I have a sense that you are talking about the MBR of each blocks on a RDT record. Right?

You can have GeoRaster objects on records of the same table with different georeferences. What you can’t do is to create index if they are in different SRID (reference system/projection) on the same table. So it shouldn’t be a problem to add those two other raster files.

Vector to Raster querying, cropping, grouping and zonal statistics are just a small part of what raster analysis is, right? It is important but is not all of it. Just a thought.

You might like to take a look at this book on Spatial Database. I am thinking about translating it to English, but for you, I guess it won’t be necessary:

Ok, I’ll test the GDAL driver loading and exporting GeoRaster data, and the driver for loading and exporting WKT Raster data, to compare both in same conditions.

About the one-georeference-by-layer argument, actually just now I’m re-reading about this topic. You’re right: the GeoRaster object georeferenciation is referred to the MBR of RDT records. My (wrong) point was: “because all RDT records related with a given GeoRaster object have the same georeferenciation, you need more than one GeoRaster object (more than one georaster table record) to store non rectangular raster coverages, and hence, more than one raster data table: one for each georaster object”.

But the fact is you can store all your raster blocks in the same RDT, even when they don’t belong to the same georaster object (because they’re identified with rasterid, rasterdatatable properties of georaster object).

And even if your raster coverage is non-rectangular, irregular, you can use only one RDT. The only “problem” is you’ll have to fill non-existent data with empty blocks. Am I right? In that case, I’ll update the article.

BTW, thanks for the book. I think I’ll can understand the most of it :-). And I’d be pleased if you could attend my presentation, of course