Package ‘GEOmetadb’

December 12, 2013

Type Package
Title A compilation of metadata from NCBI GEO
Version 1.22.0
Date 2011-11-28
Depends GEOquery, RSQLite
Author Jack Zhu and Sean Davis
Maintainer Jack Zhu <zhujack@mail.nih.gov>
biocViews Infrastructure
Description The NCBI Gene Expression Omnibus (GEO) represents the largest public repository of microarray data. However, finding data of interest can be challenging using current tools. GEOmetadb is an attempt to make access to the metadata associated with samples, platforms, and datasets much more feasible. This is accomplished by parsing all the NCBI GEO metadata into a SQLite database that can be stored and queried locally. GEOmetadb is simply a thin wrapper around the SQLite database along with associated documentation. Finally, the SQLite database is updated regularly as new data is added to GEO and can be downloaded at will for the most up-to-date metadata. GEOmetadb paper: http://bioinformatics.oxfordjournals.org/cgi/content/short/24/23/2798.
URL http://gbnci.abcc.ncifcrf.gov/geo/
License Artistic-2.0

R topics documented:

GEOmetadb-package
columnDescriptions
geoConvert
getBiocPlatformMap
getSQLiteFile
Index

GEOmetadb-package    Query NCBI GEO metadata from a local SQLite database

Description

The NCBI Gene Expression Omnibus (GEO) represents the largest public repository of microarray data. However, finding data of interest can be challenging using current tools. GEOmetadb is an attempt to make access to the metadata associated with samples, platforms, and datasets much more feasible. This is accomplished by parsing all the NCBI GEO metadata into a SQLite database that can be stored and queried locally. GEOmetadb is simply a thin wrapper around the SQLite database along with associated documentation. Finally, the SQLite database is updated
regularly as new data is added to GEO and can be downloaded at will for the most up-to-date metadata.

Details

Package: GEOmetadb
Type: Package
Version: 1.1.5
Date: 2008-09-09
License: Artistic-2.0

Author(s)

Jack Zhu and Sean Davis
Maintainer: Jack Zhu <zhujack@mail.nih.gov>

References

http://meltzerlab.nci.nih.gov/apps/geo, http://gbnci.abcc.ncifcrf.gov/geo/

Examples

    if (file.exists("GEOmetadb.sqlite")) {
      a <- columnDescriptions()[1:5, ]
      b <- geoConvert("GPL97", "GSM")
    } else {
      print("Use getSQLiteFile() to get a copy of the GEOmetadb SQLite file and then rerun the example")
    }

columnDescriptions    Get column descriptions for the GEOmetadb database

Description

Searching the GEOmetadb database requires some knowledge of the database structure and its column descriptions. This function returns the descriptions of all columns in all tables in the database.

Usage

    columnDescriptions(sqlite_db_name = "GEOmetadb.sqlite")

Arguments

sqlite_db_name    The filename of the GEOmetadb SQLite database file

Value

A three-column data.frame including TableName, FieldName, and Description.

Author(s)

Sean Davis <sdavis2@mail.nih.gov>

References

http://meltzerlab.nci.nih.gov/apps/geo

Examples

    if (file.exists("GEOmetadb.sqlite")) {
      columnDescriptions()[1:5, ]
    } else {
      print("You will need to use the getSQLiteFile() function to get a copy of the SQLite database file before this example will work")
    }

geoConvert    Cross-reference between GEO data types

Description

A common task is to find all the GEO entities of one type associated with another GEO entity (e.g., find all GEO samples associated with GEO platform 'GPL96'). This function provides a very fast mapping between entity types to facilitate queries of this type.

Usage

    geoConvert(in_list, out_type = c("gse", "gpl", "gsm", "gds", "smatrix"),
               sqlite_db_name = "GEOmetadb.sqlite")

Arguments

in_list    Character vector of GEO entities to convert from.
out_type    Character vector of GEO entity types to which to convert.
sqlite_db_name    The filename of the GEOmetadb SQLite database file

Value

A list
of data.frames.

Author(s)

Jack Zhu <zhujack@mail.nih.gov>

References

http://meltzerlab.nci.nih.gov/apps/geo, http://gbnci.abcc.ncifcrf.gov/geo/

Examples

    if (file.exists("GEOmetadb.sqlite")) {
      geoConvert("GPL96", out_type = "GSM")
    } else {
      print("Run getSQLiteFile() to get a copy of the GEOmetadb SQLite file and then rerun the example")
    }

getBiocPlatformMap    Get mappings between GPL and Bioconductor microarray annotation packages

Description

Query the gpl table and get GPL information for a given list of Bioconductor microarray annotation packages. Note that GEOmetadb does not currently contain all the mappings, but we are trying to construct a relatively complete list.

Usage

    getBiocPlatformMap(con, bioc = "all")

Arguments

con    Connection to the GEOmetadb.sqlite database
bioc    Character vector of Bioconductor microarray annotation packages, e.g. c('hgu133plus2', 'hgu95av2'). 'all' returns all mappings.

Value

A six-column data.frame including GPL title, GPL accession, bioc_package, manufacturer, organism, data_row_count.

Author(s)

Jack Zhu <zhujack@mail.nih.gov>, Sean Davis <sdavis2@mail.nih.gov>

References

http://meltzerlab.nci.nih.gov/apps/geo

Examples

    if (file.exists("GEOmetadb.sqlite")) {
      con <- dbConnect(SQLite(), "GEOmetadb.sqlite")
      getBiocPlatformMap(con)[1:5, ]
      getBiocPlatformMap(con, bioc = c("hgu133a", "hgu95av2"))
      dbDisconnect(con)
    } else {
      print("You will need to use the getSQLiteFile() function to get a copy of the SQLite database file before this example will work")
    }

getSQLiteFile    Download and unzip the most recent GEOmetadb SQLite file

Description

This function is the standard method for downloading and unzipping the most recent GEOmetadb SQLite file from the server.

Usage

    getSQLiteFile(destdir = getwd(), destfile = "GEOmetadb.sqlite.gz")

Arguments

destdir    The destination directory of the downloaded file
destfile    The filename of the downloaded file. This filename should end in ".gz" as the unzipping assumes that is the case

Value

Prints some diagnostic information to the screen. Returns the local filename for use
later.

Author(s)

Sean Davis <sdavis2@mail.nih.gov>

References

http://meltzerlab.nci.nih.gov/apps/geo, http://gbnci.abcc.ncifcrf.gov/geo/

Examples

    ## Not run:
    geometadbfile <- getSQLiteFile()

Index

Topic IO
    geoConvert
    getSQLiteFile
Topic database
    columnDescriptions
    geoConvert
    getBiocPlatformMap
    getSQLiteFile
Topic package
    GEOmetadb-package

columnDescriptions
geoConvert
GEOmetadb (GEOmetadb-package)
GEOmetadb-package
getBiocPlatformMap
getSQLiteFile
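
The examples in this manual share one pattern: check that the SQLite file exists, open a connection, run a query, and disconnect. The following is a minimal sketch of that pattern as a reusable helper. The name with_geometadb is hypothetical and not part of GEOmetadb; the sketch assumes the DBI and RSQLite packages are installed and that the database file was downloaded earlier with getSQLiteFile().

```r
# Hypothetical helper (not part of GEOmetadb): run `fun` against a
# connection to the local database and always close the connection.
with_geometadb <- function(fun, sqlite_db_name = "GEOmetadb.sqlite") {
  if (!file.exists(sqlite_db_name)) {
    # Same guard as the manual's examples: no file, nothing to query.
    message("Run getSQLiteFile() to get a copy of the GEOmetadb SQLite file first.")
    return(invisible(NULL))
  }
  con <- DBI::dbConnect(RSQLite::SQLite(), sqlite_db_name)
  # Disconnect even if `fun` raises an error.
  on.exit(DBI::dbDisconnect(con), add = TRUE)
  fun(con)
}

# List the tables in the database (NULL if the file is missing).
tables <- with_geometadb(function(con) DBI::dbListTables(con))

# Look up the GPL record mapped to one Bioconductor annotation package.
platforms <- with_geometadb(function(con) {
  GEOmetadb::getBiocPlatformMap(con, bioc = "hgu133a")
})
```

Registering the disconnect with on.exit() means a failed query does not leave the database connection open, which the inline dbConnect()/dbDisconnect() pairs in the examples above do not guarantee.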