Python API Reference for AllegroGraph 3.2

This is a description of the Python Application Programmer's Interface (API) to AllegroGraph RDFStore™ version 3.2 from Franz Inc.

The Python API offers convenient and efficient
access to an AllegroGraph server from a Python-based application. This API provides methods for
creating, querying and maintaining RDF data, and for managing the stored triples.

The Python API deliberately emulates the Aduna Sesame API to make it easier to migrate from Sesame to AllegroGraph. The Python API has also been extended in ways that make it easier and more intuitive to use than the Sesame API.

A repository contains RDF data that can be queried and updated.
Access to the repository can be acquired by opening a connection to it.
This connection can then be used to query and/or update the contents of the
repository. Depending on the implementation of the repository, it may or may
not support multiple concurrent connections.

Please note that a repository needs to be initialized before it can be used
and that it should be shut down before it is discarded/garbage collected.
Forgetting the latter can result in loss of data (depending on the Repository
implementation)!

Constructor

catalog is the Catalog object that created this Repository using Catalog.getRepository().

repository_name is the name of a repository from Catalog.listRepositories().

access_verb is one of the following:

Repository.RENEW clears the contents of an existing repository before opening. If the indicated repository does not exist,
it creates one.

Repository.OPEN opens an existing repository, or throws an exception if the repository is not found.

Repository.ACCESS opens an existing repository, or creates a new one if the repository is not found.

Repository.CREATE creates a new repository, or throws an exception if one by that name already exists.
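The four access verbs differ only in how they treat an existing or a missing repository. The sketch below models that behavior over a plain dictionary. It is an illustration of the rules listed above, not the library's implementation: apply_access_verb is a hypothetical function, the verbs are plain strings rather than the real Repository constants, and the exception types are chosen for the example.

```python
RENEW, OPEN, ACCESS, CREATE = "RENEW", "OPEN", "ACCESS", "CREATE"


def apply_access_verb(store, name, verb):
    """Model an access verb on `store`, a dict mapping names to triple lists."""
    exists = name in store
    if verb == OPEN:
        if not exists:
            raise LookupError("repository not found: " + name)
    elif verb == CREATE:
        if exists:
            raise ValueError("repository already exists: " + name)
        store[name] = []
    elif verb == ACCESS:
        store.setdefault(name, [])  # open if present, create if missing
    elif verb == RENEW:
        store[name] = []  # clear if present, create if missing
    else:
        raise ValueError("unknown access verb: " + repr(verb))
    return store[name]


store = {"agraph_test": [("s", "p", "o")]}
kept = list(apply_access_verb(store, "agraph_test", ACCESS))
cleared = list(apply_access_verb(store, "agraph_test", RENEW))
```

In this model ACCESS leaves the existing triple in place, while RENEW returns an empty repository.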

Example: Best practice is to create the Repository through Catalog.getRepository(), which supplies the specialized arguments needed by the Repository constructor.

myRepository = catalog.getRepository("agraph_test", accessMode)

Methods

addFederatedTripleStores(self, tripleStoreNames)

Make this repository a federated store that includes the stores named in the tuple tripleStoreNames. This call must precede the call to 'initialize'. It
may be called multiple times. It returns the modified Repository object.

getConnection(self)

Creates a RepositoryConnection object that can be used for querying and
updating the contents of the Repository. Returns the RepositoryConnection object.

getDatabaseName(self)

Returns a string containing the name of this Repository.

getValueFactory(self)

Return a ValueFactory for this store. This is present for Aduna Sesame compatibility, but in the Python API all ValueFactory functionality has been duplicated or subsumed in the RepositoryConnection class. It isn't necessary to manipulate the ValueFactory class at all.

indexTriples(self, all=False)

Indexes the triples of the Repository. all defaults to False; if True it reindexes all triples; if False it indexes only new triples. (Duplicated in the RepositoryConnection class for Python user convenience.)

initialize(self)

A Repository must be initialized before it can be used. Returns the initialized Repository object.

isWritable(self)

Checks whether this Repository is writable, i.e. if the data contained in this store can be changed.

Register an inlined datatype. predicate is the URI of a predicate used in the triple store. datatype may be one of: XMLSchema.INT, XMLSchema.LONG, XMLSchema.FLOAT, XMLSchema.DATE, and XMLSchema.DATETIME. nativeType may be "int", "datetime", or "float".

You must supply nativeType and either predicate or datatype.

If predicate is supplied, then object arguments to triples with that predicate will use an inlined encoding of type nativeType in their internal representation. If datatype is supplied, then typed literal objects with a datatype matching datatype will use an inlined encoding of type nativeType. (Duplicated in the RepositoryConnection class for Python user convenience.)

You must supply either the uri parameter or both namespace and localname. Register a predicate uri (or alternately generate the URI by concatenating namespace+localname). This tells the Repository to index text keywords from string values of this predicate in the triple store. This makes text search possible on this predicate. (Duplicated in the RepositoryConnection class for Python user convenience.)

shutdown(self)

Shuts the Repository down, releasing any resources that it keeps hold of.
Once shut down, the store can no longer be used.

The RepositoryConnection class is the main interface for updating data in and performing queries on a Repository. By default, a RepositoryConnection is in autoCommit mode, meaning that each operation corresponds to a single transaction on the underlying triple store. autoCommit can be switched off, in which case it is up to the user to handle transaction commit/rollback. Note that care should be taken to always properly close a RepositoryConnection after one is finished with it, to free up resources and avoid unnecessary locks.
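One way to make sure a connection is always closed is the standard library's contextlib.closing, which calls close() on exit from the with block. The sketch below assumes only the getConnection(), size() and close() methods described in this reference. count_statements is a hypothetical helper, and the MagicMock object is a stand-in so the example runs without a server.

```python
from contextlib import closing
from unittest.mock import MagicMock


def count_statements(repository):
    """Open a connection, read the repository size, and always close it."""
    with closing(repository.getConnection()) as conn:
        return conn.size()


# Stand-in repository; replace with an initialized Repository object.
repository = MagicMock(name="Repository")
repository.getConnection.return_value.size.return_value = 3

total = count_statements(repository)
```

The connection is closed whether size() returns normally or raises.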

Several methods take a vararg argument that optionally specifies one or more contexts on which the method should operate. (A context is the URI of a subgraph.) A vararg parameter is optional and can be left out of the method call completely. In that case the method either operates on the context of a supplied statement (if one of the method parameters is a statement or collection of statements), or operates on the repository as a whole, ignoring context entirely. A vararg argument may also be null (cast to Resource), meaning that the method operates only on those statements which have no associated context.

Loads a file into the triple store. Note that a file can be loaded into only one context.

filepath identifies the file to load.

context is an optional context URI (subgraph URI), defaulting to None. If None, the triple(s) will be added to the null context (the default or background graph).

base is the baseURI to associate with loading a file. Defaults to None.

format is RDFFormat.NTRIPLES or RDFFormat.RDFXML. Defaults to None.

serverSide indicates whether the filepath refers to a file on the client computer or on the server. Defaults to False.

addStatement(self, statement, contexts=None)

Add the supplied Statement to the specified contexts of the repository. contexts defaults to None, which adds the statement to the null context (the default or background graph).

addTriple(self, subject, predicate, object, contexts=None)

Adds a single triple to the repository. subject, predicate and object are the three values of the triple. contexts is an optional list of context URIs to add the triple to, defaulting to None. If None, the triple will be added to the null context (the default or background graph).
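As an illustration, the sketch below builds the three values with createURI() and createLiteral() and then stores them with addTriple(). add_person_age, the example namespace and the "age" predicate are hypothetical names chosen for the example, and the MagicMock object stands in for a real RepositoryConnection so the code runs without a server.

```python
from unittest.mock import MagicMock


def add_person_age(conn, namespace, name, age):
    """Add one triple (person, age predicate, age literal) to the null context."""
    subject = conn.createURI(namespace=namespace, localname=name)
    predicate = conn.createURI(namespace=namespace, localname="age")
    obj = conn.createLiteral(str(age))
    conn.addTriple(subject, predicate, obj)  # contexts=None: null context
    return subject, predicate, obj


# Stand-in connection; replace with repository.getConnection().
conn = MagicMock(name="RepositoryConnection")
add_person_age(conn, "http://example.org/people/", "alice", 42)
```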

Add the supplied triples_or_quads to this repository. Each triple can be a list or a tuple of Values. context is the URI of a subgraph, which will be stored in the fourth field of the "triple," defaulting to ALL_CONTEXTS. If ntriples is True, then the triples or quads are assumed to contain valid ntriples strings, and they are passed to the server with no conversion. The default value is False.

clear(self, contexts=None)

Removes all statements from the designated list of contexts (subgraphs) in the repository. If contexts is None (the default), it clears the repository of all statements.

clearNamespaces(self)

Remove all namespace declarations from the current environment.

close(self)

Closes the connection in order to free up resources.

createBNode(self, nodeID=None)

Creates a new blank node with the given node identifier. nodeID defaults to None. If nodeID is None, a new, unused node ID is generated.

createLiteral(self, value, datatype=None, language=None)

Create a new literal with value. datatype if supplied, should be a URI, in which case value should be a string. You may optionally include an RDF language attribute. datatype and language default to None.

createRange(self, lowerBound, upperBound)

Create a compound literal representing a range from lowerBound to upperBound.

createStatement(self, subject, predicate, object, context=None)

Create a new Statement object using the supplied subject, predicate and object and associated context, which defaults to None. The context is the URI of a subgraph.

createURI(self, uri=None, namespace=None, localname=None)

Creates a new URI object from the supplied string representation(s). uri is a string representing an entire URI. namespace and localname are combined to create a URI. If two non-keyword arguments are passed, it assumes they represent a namespace/localname pair.
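The argument rule for createURI() can be modeled in a few lines. The function below is a hypothetical illustration of the rule as described here. It returns a plain string, whereas the real method returns a URI object and may differ in details such as error handling.

```python
def resolve_uri(uri=None, namespace=None, localname=None):
    """Return the full URI string implied by createURI-style arguments."""
    if uri is not None and namespace is not None and localname is None:
        # Two positional arguments: treat them as (namespace, localname).
        namespace, localname = uri, namespace
        uri = None
    if uri is not None:
        return uri
    if namespace is None or localname is None:
        raise ValueError("supply uri, or both namespace and localname")
    return namespace + localname


full = resolve_uri("http://example.org/people/alice")
pair = resolve_uri("http://example.org/people/", "alice")
keyword = resolve_uri(namespace="http://example.org/people/", localname="alice")
```

All three calls produce the same URI string.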

export(self, handler, contexts=ALL_CONTEXTS)

Exports all triples in the repository to an external file. handler is either an NTriplesWriter() object or an RDFXMLWriter() object. The export may be optionally confined to a list of contexts (default is ALL_CONTEXTS). Each context is the URI of a subgraph.

Exports all triples that match subj, pred and/or obj. May optionally includeInferred statements provided by RDF++ inference (default is False). handler is either an NTriplesWriter() object or an RDFXMLWriter() object. The export may be optionally confined to a list of contexts (default is ALL_CONTEXTS). Each context is the URI of a subgraph.

getContextIDs(self)

Return a list of context URIs, one for each subgraph referenced by a quad in the triple store. Omits the default context because its ID would be null.

Gets all statements with a specific subject, predicate and/or object from the repository. The result is optionally restricted to the specified set of named contexts (default is ALL_CONTEXTS). A context is the URI of a subgraph. May optionally includeInferred statements provided by RDF++ inference (default is False). Returns a JDBCResultSet that enables Values, strings, etc. to be selectively extracted from the result, without the bulky overhead of the OpenRDF BindingSet protocol.

Gets all statements with a specific subject, predicate and/or object from the repository. The result is optionally restricted to the specified set of named contexts (default is ALL_CONTEXTS). A context is the URI of a subgraph. Returns a RepositoryResult iterator that produces a 'Statement' each time that 'next' is called. May optionally includeInferred statements provided by RDF++ inference (default is False). Takes an optional limit on the number of statements to return.

getStatementsById(self, ids)

Return all statements whose triple ID matches an ID in the list of ids.

getValueFactory(self)

Returns the ValueFactory object associated with this RepositoryConnection.

indexTriples(self, all=False)

Indexes the triples of the repository. all defaults to False; if True it reindexes all triples; if False it indexes only new triples.

isEmpty(self)

Returns True if size() is zero.

prepareBooleanQuery(self, queryLanguage, queryString, baseURI=None)

Parse queryString into a Query object which can be executed against the RDF storage. queryString must be an ASK query. queryLanguage is one of SPARQL, PROLOG, or COMMON_LOGIC. baseURI optionally provides a URI prefix (defaults to None). Returns a Query object. The result of query execution will be True or False.

prepareGraphQuery(self, queryLanguage, queryString, baseURI=None)

Parse queryString into a Query object which can be executed against the RDF storage. queryString must be a CONSTRUCT or DESCRIBE query. queryLanguage is one of SPARQL, PROLOG, or COMMON_LOGIC. baseURI optionally provides a URI prefix (defaults to None). Returns a Query object. The result of query execution is an iterator of Statements/quads.

prepareTupleQuery(self, queryLanguage, queryString, baseURI=None)

Embed queryString into a Query object which can be executed against the RDF storage. queryString must be a SELECT query. queryLanguage is one of SPARQL, PROLOG, or COMMON_LOGIC. baseURI optionally provides a URI prefix (defaults to None). Returns a Query object. The result of query execution is an iterator of tuples.
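The prepare-then-evaluate pattern for a SELECT query might look like the sketch below. It assumes that the returned query object has an evaluate() method that can be called without arguments and that its result can be iterated. select_rows is a hypothetical helper, the string "SPARQL" is a placeholder for the real query language constant, and the MagicMock object stands in for a RepositoryConnection so the example runs without a server.

```python
from unittest.mock import MagicMock


def select_rows(conn, query_language, query_string):
    """Prepare a SELECT query, evaluate it, and return all rows as a list."""
    query = conn.prepareTupleQuery(query_language, query_string)
    result = query.evaluate()
    return list(result)  # consume the iterator completely


# Stand-in connection that returns two canned rows.
conn = MagicMock(name="RepositoryConnection")
conn.prepareTupleQuery.return_value.evaluate.return_value = iter(
    [("alice",), ("bob",)]
)

rows = select_rows(conn, "SPARQL", "SELECT ?s WHERE { ?s ?p ?o }")
```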

Register an inlined datatype. predicate is the URI of a predicate used in the triple store. datatype may be one of: XMLSchema.INT, XMLSchema.LONG, XMLSchema.FLOAT, XMLSchema.DATE, and XMLSchema.DATETIME. nativeType may be "int", "datetime", or "float".

You must supply nativeType and either predicate or datatype.

If predicate is supplied, then object arguments to triples with that predicate will use an inlined encoding of type nativeType in their internal representation. If datatype is supplied, then typed literal objects with a datatype matching datatype will use an inlined encoding of type nativeType.

remove(self, arg0, arg1=None, arg2=None, contexts=None)

Calls removeTriples() or removeStatement(). Best practice would be to avoid remove() and use removeTriples() or removeStatement() directly.

arg0, arg1, and arg2 may be the subject, predicate and object of a triple.

contexts is an optional list of contexts, defaulting to None.

removeNamespace(self, prefix)

Remove the namespace associated with prefix.

removeQuads(self, quads, ntriples=False)

Remove enumerated quads from this repository. Each quad can be a list or a tuple of Values. If ntriples is True (default is False), then the quads are assumed to contain valid ntriples strings, and they are passed to the server with no conversion.

removeQuadsByID(self, tids)

tids contains a list of triple IDs (integers). Remove all quads with IDs that match.

removeStatement(self, statement, contexts=None)

Removes the supplied Statement(s) from the specified contexts (default is None).

removeTriples(self, subject, predicate, object, contexts=None)

Removes the triples with the specified subject, predicate and object
from the repository, optionally restricted to the specified contexts (defaults to None).

setNamespace(self, prefix, namespace)

Define (or redefine) a namespace associated with prefix.

size(self, contexts=ALL_CONTEXTS)

Returns the number of (explicit) statements that are in the specified contexts in this repository. contexts defaults to ALL_CONTEXTS, but can be a tuple of context names from getContextIDs().

Free Text Search Methods

The following RepositoryConnection method supports free-text indexing in AllegroGraph.

You must supply either the uri parameter or both namespace and localname. Register a predicate uri (or alternately generate the URI by concatenating namespace+localname). This tells the Repository to index text keywords from string values of this predicate in the triple store. This makes text search possible on this predicate.

Note that text search is implemented through a SPARQL query using a "magic" predicate called fti:search. See the AllegroGraph Python API Tutorial for an example of how to set up this search.

Prolog Rule Inference Methods

These RepositoryConnection methods support the use of Prolog rules in AllegroGraph.

addRules(self, rules, language=None)

Add a sequence of one or more rules (in ASCII format).
If the language is QueryLanguage.PROLOG, rule declarations start with '<-' or '<--'. The former appends a new rule; the latter overwrites any rule with the same predicate. language defaults to QueryLanguage.PROLOG.
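A rule declaration could be passed to addRules() as in the sketch below. The clause syntax inside the rule, the ex: prefix and the example.org namespace are illustrative assumptions and are not taken from this reference. install_rules is a hypothetical helper, and the MagicMock object stands in for a RepositoryConnection so the example runs without a server.

```python
from unittest.mock import MagicMock

# One rule declared with '<--', which overwrites any existing rule that
# has the same predicate. The body clauses are assumed, illustrative syntax.
RULES = """
(<-- (woman ?person)
     (q ?person !ex:sex !ex:female)
     (q ?person !rdf:type !ex:person))
"""


def install_rules(conn, rules):
    """Register the namespace used by the rules, then add the rules."""
    conn.setNamespace("ex", "http://example.org/ontology/")
    conn.addRules(rules)  # language defaults to QueryLanguage.PROLOG


# Stand-in connection; replace with repository.getConnection().
conn = MagicMock(name="RepositoryConnection")
install_rules(conn, RULES)
```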

createEnvironment(self, name)

Repositories use a current environment, which is a container for namespaces and Prolog rules. Rules and namespaces defined in this environment persist across user sessions. Every server-side repository has a default environment that is used when no environment is specified. name is a string label.

deleteEnvironment(self, name)

Delete an environment. This causes all rule and namespace definitions for this
environment to be lost.

deleteRule(self, predicate, language=None)

Delete rule(s) with predicate named predicate. If predicate is None, delete
all rules from the current environment. language defaults to QueryLanguage.PROLOG.

listEnvironments(self)

List the names of environments currently maintained by the system.

loadRules(self, file, language=None)

Load a file of rules into the current environment. file is assumed to reside on the client machine. language defaults to QueryLanguage.PROLOG.

setEnvironment(self, name)

Choose an environment for execution of a Prolog query. Call deleteEnvironment() to start with a fresh (empty) environment.

setRuleLanguage(self, queryLanguage)

queryLanguage is QueryLanguage.PROLOG.

Geospatial Reasoning Methods

These RepositoryConnection methods support geospatial reasoning.

createBox(self, xMin=None, xMax=None, yMin=None, yMax=None)

Create a rectangular search region (a box) for geospatial search. This method works for both Cartesian and spherical coordinate systems. xMin, xMax may be used to input latitude. yMin, yMax may be used to input longitude.

createCircle(self, x, y, radius, unit=None)

Create a circular search region for geospatial search. This method works for both Cartesian and spherical coordinate systems. radius is the radius of the circle expressed in the designated unit, which defaults to the unit assigned to the coordinate system. x and y locate the center of the circle and may be used for latitude and longitude.

createCoordinate(self, x=None, y=None, lat=None, long=None)

Create a coordinate point in a geospatial coordinate system. Must include x and y, or lat and long. Use this method to create the object value for a location triple.

Create a spherical coordinate system for geospatial location matching. unit can be 'degree', 'mile', 'radian', or 'km'. scale should be your estimate of the size of a typical search region in the latitudinal direction. latMin and latMax are the bottom and top borders of the coordinate system. longMin and longMax are the left and right sides of the coordinate system.

createPolygon(self, vertices, uri=None, geoType=None)

Create a polygonal search region for geospatial search. The vertices are saved as triples in AllegroGraph. vertices is a list of (x, y) pairs such as [(51.0, 2.00),(60.0, -5.0),(48.0,-12.5)]. uri is an optional subject value for the vertex triples, in case you want to manipulate them. geoType is 'CARTESIAN' or 'SPHERICAL', but defaults to None.

Create a Cartesian coordinate system for geospatial location matching. scale should be your estimate of the Y size of a typical search region. unit must be None. xMin and xMax are the left and right edges of the rectangle. yMin and yMax are the bottom and top edges of the rectangle.

Social Network Analysis Methods

The following RepositoryConnection methods support Social Network Analysis in AllegroGraph.

dropNeighborMatrix(self, name)

Destroy the neighbor matrix named 'name'.

dropSNAGenerator(self, name)

Destroy the generator named 'name'.

listNeighborMatrices(self)

Return a list of the names of registered neighbor matrices.

listSNAGenerators(self)

Return a list of the names of registered SNA generators.

rebuildNeighborMatrix(self, name)

Recompute the set of edges cached in the neighbor matrix named 'name'.

Construct a neighbor matrix named 'name'. The generator named 'generator' is applied
to each URI in 'group_uris' (a collection of fullURIs or qnames (strings)),
computing edges to max depth 'max_depth'.

Create (and remember) a generator named 'name'.
If one already exists with the same name, redefine it.
'subjectOf', 'objectOf' and 'undirected' expect a list of predicate URIs, expressed as
fullURIs or qnames, that define the edges traversed by the generator.
Alternatively, instead of an adjacency map, one may provide a 'generator_query',
that defines the edges.

The Query class is
non-instantiable. It is an abstract class from which the three query subclasses are derived. It is included here because of its methods, which are inherited by the subclasses.

A query on a Repository that can be formulated in one of the
supported query languages (for example SPARQL). It allows one to
predefine bindings in the query to be able to reuse the same query with
different bindings.

Source: \AllegroGraphDirectory\python\franz\openrdf\query\query.py.

Constructor

Query(self, queryLanguage, queryString, baseURI=None)

queryLanguage is one of QueryLanguage.SPARQL, QueryLanguage.PROLOG, or QueryLanguage.COMMON_LOGIC.

Methods

evaluate_generic_query(self)

Evaluate a SPARQL or PROLOG or COMMON_LOGIC query. If SPARQL, it may be a 'select', 'construct', 'describe' or 'ask' query. Return an appropriate response. (Best practice is to use (and evaluate) one of the more specific query subclasses.)

getBindings(self)

Retrieves the bindings that have been set on this query in the form of a dictionary.

getDataset(self)

Returns the current dataset setting for this query.

getIncludeInferred(self)

Returns whether or not this query will return inferred statements (if any
are present in the repository).

removeBinding(self, name)

Removes the named binding so that it has no value.

setBinding(self, name, value)

Binds the named attribute to the supplied value. Any value that was previously bound to the specified attribute will be overwritten.

setBindings(self, dict)

Sets multiple bindings using a dictionary of attribute names and values.
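Bindings make it possible to prepare a query once and evaluate it several times with different values. The sketch below uses only setBinding(), removeBinding() and an evaluate() method assumed to take no arguments. run_for_each is a hypothetical helper, and the MagicMock object stands in for a prepared query so the example runs without a server. With a real query, the bound values would be URI or literal objects rather than strings.

```python
from unittest.mock import MagicMock


def run_for_each(query, variable, values):
    """Evaluate one prepared query once per value bound to `variable`."""
    results = []
    for value in values:
        query.setBinding(variable, value)  # overwrites the previous binding
        results.append((value, list(query.evaluate())))
    query.removeBinding(variable)  # leave the query unbound afterwards
    return results


# Stand-in query that returns one canned row per evaluation.
query = MagicMock(name="TupleQuery")
query.evaluate.side_effect = [[("alice", 42)], [("bob", 37)]]

results = run_for_each(query, "name", ["alice", "bob"])
```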

setCheckVariables(self, setting)

If true, any variable in the SELECT clause that is not referenced in a triple pattern
is flagged.

setContexts(self, contexts)

Assert a set of contexts (a list of subgraph URIs) that filter all triples.

setDataset(self, dataset)

Specifies the dataset against which to evaluate a query, overriding any dataset that is specified in the query itself.

setIncludeInferred(self, includeInferred)

Determines whether results of this query should include inferred statements (if any inferred statements are present in the repository). The default setting is 'true'.

This subclass is used with SELECT queries. Use the RepositoryConnection object's prepareTupleQuery() method to create a TupleQuery object. The results of the query are returned in a RepositoryResult iterator that yields a sequence of bindingSets.

Execute the embedded query against the RDF store. Return
an iterator that produces for each step a tuple of values
(resources and literals) corresponding to the variables
or expressions in a 'select' clause (or its equivalent).
If 'jdbc', returns a JDBC-style iterator that minimizes the
overhead of creating response objects.

This subclass is used with CONSTRUCT and DESCRIBE queries. Use the RepositoryConnection object's prepareGraphQuery() method to create a GraphQuery object. The results of the query are returned in a RepositoryResult iterator that yields a sequence of Statements.

A RepositoryResult object is a result collection of objects (for example, Statement objects) that can be iterated over. It keeps an open connection to the backend for lazy retrieval of individual results. Additionally it has some utility methods to fetch all results and add them to a collection.

By default, a RepositoryResult is not necessarily a (mathematical) set: it may contain duplicate objects. Duplicate filtering can be switched on, but this should not be used lightly as the filtering mechanism is potentially memory-intensive.

A RepositoryResult needs to be closed after use to free up any resources (open connections, read locks, etc.) it has on the underlying repository.

Constructor

tripleIDs defaults to False. If True, the RepositoryResult object contains the triple IDs of the matching triples.

Example: Best practice is to allow a querySubclass.evaluate() method to create and return the RepositoryResult object. There is no reason for the Python application programmer to create a RepositoryResult object directly.

Methods

Switches on duplicate filtering while iterating over objects. The RepositoryResult will keep track of the previously returned objects in a java.util.Set and on calling next() will ignore any objects that already occur in this Set.

Caution: use of this filtering mechanism is potentially memory-intensive.
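The memory cost comes from remembering every distinct object that has already been returned. The generator below is a hypothetical model of that mechanism in plain Python, not the library's implementation. It works on any iterable of hashable items.

```python
def filter_duplicates(results):
    """Yield each distinct item once, in first-seen order."""
    seen = set()  # grows with the number of distinct results
    for item in results:
        if item not in seen:
            seen.add(item)
            yield item


rows = [("alice", "age", "42"), ("alice", "age", "42"), ("bob", "age", "37")]
unique_rows = list(filter_duplicates(rows))
```

The seen set holds one entry per distinct result, so a result with millions of distinct rows keeps millions of entries in memory until iteration ends.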

asList(self)

Returns a list containing all objects of this RepositoryResult in
order of iteration. The RepositoryResult is fully consumed and
automatically closed by this operation.

addTo(self, collection)

Adds all objects of this RepositoryResult to the supplied collection. The
RepositoryResult is fully consumed and automatically closed by this
operation.

Constructor

Statement(self, subject, predicate, object, context=None)

subject, predicate, object are the values of a typical triple.

context is the optional URI of the subgraph of the repository.

Example: Best practice is to allow the RepositoryConnection.createStatement() method to create and return the Statement object. There is no reason for the Python application programmer to create a Statement object directly.

stmt1 = conn.createStatement(alice, age, fortyTwo)

Methods

getContext(self)

Returns the value in the fourth position of the stored tuple (the subgraph URI).

getObject(self)

Returns the value in the third position of the stored tuple.

getPredicate(self)

Returns the value in the second position of the stored tuple.

getSubject(self)

Returns the value in the first position of the stored tuple.

setQuad(self, string_tuple)

Stores a string_tuple of a triple or quad. This method is called only by an internal method of the RepositoryResult class. There is no need for a Python application programmer to use it.

A ValueFactory is a factory for creating URIs, blank nodes, literals and Statements. In the AllegroGraph Python interface, the ValueFactory class can be regarded as obsolete. Its functions have been subsumed by the expanded capability of the RepositoryConnection class. It is documented here for the convenience of anyone porting an application from Aduna Sesame.

Constructor

Example: Best practice is to allow the Repository constructor to generate the ValueFactory automatically at the same time that the Repository object is created. There is no reason for a Python application programmer to attempt this step manually.