Σχόλια 0

Το κείμενο του εγγράφου

Spatio-temporal objects to proxy aPostgreSQL table1. Das neue IfGI-Logo 1.6 LogovariantenLogo für den Einsatz in internationalen bzw.englischsprachigen Präsentationen.Einsatzbereiche: Briefbogen, Visitenkarte,Titelblätter etc.Mindestgröße 45 mm BreiteifgiifgiInstitute for GeoinformaticsUniversity of MünsterifgiInstitut für GeoinformatikUniversität MünsterLogo für den Einsatz in nationalen bzw.deutschsprachigen Präsentationen.Einsatzbereiche: Briefbogen, Visitenkarte,Titelblätter etc.Mindestgröße 45 mm BreiteDieses Logo kann bei Anwendungeneingesetzt werden, wo das Logo besondersklein erscheint.Einsatzbereiche: Sponsorenlogo,Power-PointGröße bis 40 mm BreiteEdzer PebesmaNovember 17,2012AbstractThis vignette describes and implements a class that proxies data setsin a PostgreSQL database with classes in the spacetime package.Thismight allow access to data sets too large to t into R memory.Contents1 Introduction 12 Setting up a database 23 A proxy class 34 Selection based on time period and/or region 35 Closing the database connection 46 Limitations and alternatives 41 IntroductionMassive data are dicult to analyze with R,because Robjects reside in memory.Spatio-temporal data easily become massive,either because the spatial domaincontains a lot of information (satellite imagery),or many time steps are available(high resolution sensor data),or both.This vignette shows how data residingin a data base can be read into R using spatial or temporal selection.In case the commands are not evaluated because CRAN packages cannotaccess an external data base,a document with evaluated commands is foundhere.This vignette was run using the following libraries:R> library(RPostgreSQL)1R> library(spacetime)2 Setting up a databaseWe will rst set the characteristics of the database1R> dbname ="postgis"R> user ="edzer"R> password ="pw"Next,we will create a driver and connect to the database:R> drv <- dbDriver("PostgreSQL")R> con <- dbConnect(drv,dbname=dbname,user=user,password=password)It should be noted that these rst two commands are specic to PostgreSQL;fromhere on,commands are generic and should work for any database connectorthat uses the interface of package DBI.We now remove a set of tables (if present) so they can be created later on:R> dbRemoveTable(con,"rural_attr")R> dbRemoveTable(con,"rural_space")R> dbRemoveTable(con,"rural_time")R> dbRemoveTable(con,"space_select")Now we will create the table with spatial features (observation locations).For this,we need the rgdal function writeOGR,which by default creates anindex on the geometry:R> data(air)R> rural = as(rural,"STSDF")R> p = rural@spR> sp = SpatialPointsDataFrame(p,data.frame(geom_id=1:length(p)))R> library(rgdal)R> OGRstring = paste("PG:dbname=",dbname,"user=",user,+"password=",password,sep ="")R> writeOGR(sp,OGRstring,"rural_space",driver ="PostgreSQL")Second,we will write the table with times to the database,and create anindex to time:R> df = data.frame(time = index(rural@time),time_id = 1:nrow(rural@time))R> dbWriteTable(con,"rural_time",df)R> idx ="create index time_idx on rural_time (time);"R> dbSendQuery(con,idx)Finally,we will write the full attribute data table to PosgreSQL,along withits indexes to the spatial and temporal tables:R> idx = rural@indexR> names(rural@data) ="pm10"#lower caseR> df = cbind(data.frame(geom_id = idx[,1],time_id = idx[,2]),rural@data)R> dbWriteTable(con,"rural_attr",df)1It is assumed that the database is spatially enabled,i.e.it understands how simple featuresare stored.The standard for this from the open geospatial consortium is described here.23 A proxy classThe following class has as components a spatial and temporal data structure,but no spatio-temporal attributes (they are assumed to be the most memory-hungry).The other slots refer to the according tables in the PostGIS database,the name(s) of the attributes in the attribute table,and the database connection.R> setClass("ST_PG",representation("ST",+ space_table ="character",+ time_table ="character",+ attr_table ="character",+ attr ="character",+ con ="PostgreSQLConnection"))Next,we will create an instance of the new class:R> rural_proxy = new("ST_PG",+ ST(rural@sp,rural@time),+ space_table ="rural_space",+ time_table ="rural_time",+ attr_table ="rural_attr",+ attr ="pm10",+ con = con)4 Selection based on time period and/or regionThe following two helper functions create a character string with an SQL com-mand that for a temporal or spatial selection:R>.SqlTime = function(x,j) {+ stopifnot(is.character(j))+ t =.parseISO8601(j)+ t1 = paste("",t$first.time,"",sep ="")+ t2 = paste("",t$last.time,"",sep ="")+ what = paste("geom_id,time_id",paste(x@attr,collapse =","),sep =",")+ paste("SELECT",what,"FROM",x@attr_table,"AS a JOIN",x@time_table,+"AS b USING (time_id) WHERE b.time >=",t1,"AND b.time <=",t2,";")+ }R>.SqlSpace = function(x,i) {+ stopifnot(is(i,"Spatial"))+ writeOGR(i,OGRstring,"space_select",driver ="PostgreSQL")+ what = paste("geom_id,time_id",paste(x@attr,collapse =","),sep =",")+ paste("SELECT",what,"FROM",x@attr_table,+"AS a JOIN (SELECT p.wkb_geometry,p.geom_id FROM",+ x@space_table,"AS p,space_select AS q",+"WHERE ST_Intersects(p.wkb_geometry,q.wkb_geometry))",+"AS b USING (geom_id);")+ }The following selection method selects a time period only,as dened by themethods in package xts.A time period is dened as a valid ISO8601 string,e.g.2005-05 is the full month of May for 2005.3R> setMethod("[","ST_PG",function(x,i,j,...,drop = TRUE) {+ stopifnot(missing(i)!= missing(j))#either of them present+ if (missing(j))+ sql =.SqlSpace(x,i)+ else+ sql =.SqlTime(x,j)+ print(sql)+ df = dbGetQuery(x@con,sql)+ STSDF(x@sp,x@time,df[x@attr],as.matrix(df[c("geom_id","time_id")]))+ })[1]"["R> pm10_20050101 = rural_proxy[,"2005-01-01"]R> summary(pm10_20050101)R> summary(rural[,"2005-01-01"])R> pm10_NRW = rural_proxy[DE_NUTS1[10,],]R> summary(pm10_NRW)R> summary(rural[DE_NUTS1[10,],])Clearly,the temporal and spatial components are not subsetted,so do not re ectthe actual selection made;the attribute data however do;the following selectionstep\cleans"the unused features/times:R> dim(pm10_NRW)R> pm10_NRW = pm10_NRW[T,]R> dim(pm10_NRW)Comparing sizes,we see that the selected object is smaller:R> object.size(rural)R> object.size(pm10_20050101)R> object.size(pm10_NRW)5 Closing the database connectionThe following commands close the database connection and release the driverresources:R> dbDisconnect(con)R> dbUnloadDriver(drv)6 Limitations and alternativesThe example code in this vignette is meant as an example and is not meantas a full- edged database access mechanism for spatio-temporal data bases.Inparticular,the selection here can do only one of spatial locations (entered asfeatures) or time periods.If database access is only based on time,a spatiallyenabled database (such as PostGIS) would not be needed.For massive databases,data would typically not be loaded into the databasefrom R rst,but from somewhere else.4An alternative to access fromR large,possibly massive spatio-temporal databases for the case where the data base is accessible through a sensor observationservice (SOS) is provided by the R package sos4R,which is also on CRAN.5