Toward a Basic Profile for Linked Data

A collection of best practices and a simple approach for a Linked Data
architecture

W3C defines a wide range of standards for the Semantic Web and Linked Data suitable for many
possible use cases. While using Linked Data as an application integration technology in the
Application Lifecycle Management (ALM) domain, IBM has found that there are often several possible
ways of applying the existing standards, yet little guidance is provided on how to combine them.
This article presents the motivating background and a proposal for a Basic Profile for
Linked Data.

Martin Nally, an IBM Fellow, is Vice President and the Chief Technology Officer for the Rational software division of IBM. Martin joined IBM in 1990 with 10 years of prior industry experience. He has held several architecture and development positions in IBM, including lead architect and developer for IBM VisualAge/Smalltalk and VisualAge/Java. Martin was one of a team of three that launched the IBM project that later became the Eclipse framework. He then led the architecture, design, and development of WebSphere Studio, which evolved into Rational Application Developer. More recently, he has been one of the champions behind moving the Rational portfolio to a web-based architecture and was instrumental in creating Open Services for Lifecycle Collaboration, an integration architecture, and Jazz technology, a set of common services used to combine IBM and non-IBM tools to create an integrated system.

Steve Speicher is an IBM Senior Technical Staff Member who focuses on Rational change management solutions and integrations. He is the lead for the Open Services for Lifecycle Collaboration (OSLC) Core and Change Management topic areas, which deliver open HTTP REST and Linked Data specifications, as well as implementations within the Rational change management products. Steve formerly worked in emerging standardization efforts in healthcare and compound documents (W3C).

Motivation

There is interest in using Linked Data technologies for more than one purpose. We have seen
interest in it to expose information -- public records, for example -- on the Internet in a
machine-readable format. We have also seen interest in using it for inferring new information
from existing information, for example in pharmaceutical applications or IBM Watson™
(see the Resources section for links to more information). The
IBM® Rational® team has been using Linked Data as an architectural model and
implementation technology for application integration.

Rational software is a vendor of software development tools, particularly those that support
the general software development process, such as bug tracking, requirements management, and
test management tools. Like many vendors that sell multiple applications, we have seen strong
customer demand for better support of more complete business processes (in our case, software
development processes) that span the roles, tasks, and data addressed by multiple tools. This
demand has existed for many years, and our industry has tried several different architectural
approaches to address the problem. Here are a few:

Implement some sort of application programming interface (API) for each application, and
then, in each application, implement "glue code" that exploits the APIs of other
applications to link them together.

Design a single database to store the data of multiple applications, and implement each of
the applications against this database. In the software development tools business, these
databases are often called "repositories."

Implement a central "hub" or "bus" that orchestrates the broader business process by
exploiting the APIs described previously.

A discussion of the failings of each of these approaches is beyond the scope of this article,
but it is fair to say that, although each one of those approaches has its adherents and can
point to some successes, none of them is wholly satisfactory. So, as an alternative, over the
last five years we have been exploring the use of Linked Data as an application integration
technology. We have shipped a number of products using this technology and are generally
pleased with the result. We have more products in development that use these technologies, and
we are also seeing a strong interest in this approach in other parts of our company.

Although we are pleased -- even passionate -- about the results that we have seen using Linked
Data as an integration technology, we have found successful adoption to be difficult. It
has taken us several years of experimentation to achieve the level of understanding that we
have today. We have made some costly mistakes along the way, and we see no immediate end to
the challenges and learning that lie before us.

As far as we can tell, there are not many people who are trying to use Linked Data technologies
in the ways that we are using them, and the little information that is available on best
practices and pitfalls is widely dispersed. We believe that Linked Data has the potential to
solve some important problems that have frustrated the IT industry for many years, or at least
to make significant advances in that direction. But this potential will be realized only if we
can establish and communicate a much richer body of knowledge about how to exploit these
technologies. In some cases, there also are gaps in the Linked Data standards that need to be
addressed.

To help with this process, we would like to share information about how we are using these
technologies, the best practices and anti-patterns that we have identified, and the
specification gaps that we have had to fill. These best practices and anti-patterns can be
classified according to (but are not limited to) the following categories:

Resources

A summary of the HTTP and RDF standard techniques and best practices that you should use,
and anti-patterns you should avoid, when constructing clients and servers that read and
write Linked Data

Containers

Defines resources that allow new resources to be created using HTTP POST and existing
resources to be found using HTTP GET

Paging

Defines a mechanism for splitting the information in large resources into pages that can
be fetched incrementally

Validation

Defines a simple mechanism for describing the properties that a particular type of
resource must or may have

The following sections provide details regarding this proposal for a Basic Profile for Linked
Data.

Open Services for Lifecycle Collaboration (OSLC)

The OSLC Core v2 specification defines some of these patterns and anti-patterns,
although perhaps not in an ideal way. This proposal can provide the basis for a simpler and
more standards-aligned way for future OSLC specifications.

Terminology

These definitions are based on W3C's Architecture of the World Wide Web and Hyper-text Transfer
Protocol, HTTP/1.1 (see Resources).

Link

A relationship between two resources when one resource (representation) refers to the
other resource by means of a URI. (reference: WWWArch)

Linked Data

Defined by Tim Berners-Lee as four rules:

Use URIs as names for things.

Use HTTP URIs so that people can look up those names.

When someone looks up a URI, provide useful information, using the standards
(RDF*, SPARQL).

Include links to other URIs so that they can discover more things. (reference: LinkedData)

Specification

An act of describing or identifying something precisely or of stating a precise
requirement

Basic Profile

A specification that identifies the components needed from other
specifications and adds clarifications and patterns

Client

A program that establishes connections for the purpose of sending requests (reference: HTTP)

Basic Profile Client

A client that adheres to the rules defined in the Basic Profile

Server

An application program that accepts connections in order to service requests by sending
back responses

Note: Any given program can be capable of being
both a client and a server. Our use of these terms refers only to the role being performed
by the program for a particular connection, rather than to the program's capabilities in
general. Likewise, any server can act as an origin server, proxy, gateway, or tunnel,
switching behavior based on the nature of each request (reference: HTTP).

Basic Profile Resources

Basic Profile Resources are HTTP Linked Data resources that conform to simple patterns and
conventions. Most Basic Profile Resources are domain-specific resources that contain data for
an entity in a domain, and that domain can be commercial, governmental, scientific, religious,
or another type. A few Basic Profile Resources are defined by the Basic Profile specifications
and are cross-domain. All Basic Profile Resources follow the rules of Linked Data previously cited in
the Terminology section:

Use URIs as names for things.

Use HTTP URIs so that people can look up those names.

When someone looks up a URI, provide useful information, using the standards (RDF*,
SPARQL).

Include links to other URIs so that people can discover more things.

Basic Profile adds a few rules. Some of these rules could be thought of as clarification of the
basic Linked Data rules.

Basic Profile Resources are HTTP resources that can be created, modified, deleted,
and read using standard HTTP methods. (Clarification or extension of
Linked Data Rule 2.) Basic Profile Resources are created by HTTP POST (or PUT) to an
existing resource, deleted by HTTP DELETE, updated by HTTP PUT or PATCH, and "fetched"
using HTTP GET. Additionally, Basic Profile Resources can be created, updated, and deleted
by using SPARQL Update.

Basic Profile Resources use RDF to define their
states. (Clarification of Linked Data Rule 3.) The state of a Basic Profile
Resource (in the sense of state used in the REST architecture) is defined by a
set of RDF triples. Binary resources and text resources are not Basic Profile Resources
since their states cannot be easily or fully represented in RDF. XML resources might or
might not be suitable as Basic Profile Resources. Some XML resources are really
data-oriented resources encoded in XML that can be easily represented in RDF. Other XML
documents are essentially marked up text documents that are not easily represented in RDF.
Basic Profile Resources can be mixed with other resources in the same application.

You can request an RDF/XML representation of any Basic Profile
Resource. (Clarification of Linked Data Rule 3.) The resource might have
other representations, as well. These could be other RDF formats, such as Turtle, N3, or
NTriples, but non-RDF formats such as HTML and JSON would also be popular additions, and
Basic Profile sets no limits.

Basic Profile clients use Optimistic Collision Detection during
update. (Clarification of Linked Data Rule 2.) Because the update process
involves getting a resource first, and then modifying it and later putting it back on the
server, there is the possibility of a conflict (for example, another client might have
updated the resource since the GET action). To mitigate this problem, Basic Profile
implementations should use the HTTP If-Match
header and HTTP ETags to detect collisions.
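The collision check described above can be sketched without a network. The following is a minimal illustration, not the behavior of any particular product: `InMemoryServer`, its methods, and the `PreconditionFailed` exception are hypothetical stand-ins for a server that compares the If-Match value against the current ETag and answers with HTTP 412 when they differ.

```python
import hashlib


class PreconditionFailed(Exception):
    """Stands in for an HTTP 412 (Precondition Failed) response."""


class InMemoryServer:
    """Hypothetical stand-in for a Basic Profile server; not a real HTTP API."""

    def __init__(self):
        self._store = {}  # url -> representation (str)

    @staticmethod
    def _etag(body):
        # Derive the ETag from the content so that any change yields a new tag.
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()[:16]
        return '"' + digest + '"'

    def create(self, url, body):
        self._store[url] = body
        return self._etag(body)

    def get(self, url):
        body = self._store[url]
        return body, self._etag(body)

    def put(self, url, body, if_match):
        # Reject the update if the resource changed since the client's GET.
        if if_match != self._etag(self._store[url]):
            raise PreconditionFailed(url)
        self._store[url] = body
        return self._etag(body)


server = InMemoryServer()
url = "http://example.org/bugs/1"  # hypothetical resource URL
server.create(url, "title: crash on start")

body_a, etag_a = server.get(url)  # client A reads
body_b, etag_b = server.get(url)  # client B reads the same state

server.put(url, "title: crash on startup", if_match=etag_a)  # A updates first

try:
    server.put(url, "title: crash at start", if_match=etag_b)  # B is now stale
    collision_detected = False
except PreconditionFailed:
    collision_detected = True
```

In this sketch the second client would need to GET the resource again, reapply its change, and retry the PUT with the fresh ETag.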

Basic Profile Resources use standard media types. (Clarification of
Linked Data Rule 3.) Basic Profile does not require and does not encourage the definition
of any new media types. A Basic Profile goal is that any standards-based RDF or Linked
Data client be able to read and write Basic Profile data, and defining new media types
would prevent that in most cases.

Basic Profile Resources use standard vocabularies. Basic Profile
Resources use common vocabularies (classes, properties, and so forth) for common concepts.
Many websites define their own vocabularies for common concepts such as resource type,
label, description, creator, last modification time, priority, enumeration of priority
values, and so on. This is usually viewed as a good feature by users who want their data
to match their local terminology and processes, but it makes it much harder for
organizations to subsequently integrate information in a larger view. Basic Profile
requires all resources to expose common concepts using a common vocabulary for properties.
Sites can choose to additionally expose the same values under their own private property
names in the same resources. In general, Basic Profile avoids inventing property names
where possible. Instead, it uses ones from popular RDF-based standards, such as the RDF
standards themselves, Dublin Core, and so on. Basic Profile invents property URLs where no
match is found in popular standard vocabularies. Note: A number of
recommended standard properties for use in Basic Profile Resources are listed below.

Basic Profile Resources set rdf:type explicitly. A resource's
membership in a class extent can be derived implicitly or indicated explicitly by
a triple in the resource representation that uses the rdf:type predicate and the URL of
the class. In RDF, there is no requirement to place an rdf:type
triple in each resource, but this is a good practice, because it makes a query more useful
in cases where inferencing is not supported. Remember also that a single resource can have
multiple values for rdf:type. For example, the DBpedia entry for Barack Obama has dozens
of rdf:types. Basic Profile sets no limits to the number of types a resource can
have.

Basic Profile Resources use a restricted number of standard data
types. RDF does not define data types to be used for property values, so
Basic Profile lists a set of standard datatypes to be used in Basic Profile:

Basic Profile clients expect to encounter unknown properties and
content. Basic Profile provides mechanisms for clients to discover lists of
expected properties for resources for particular purposes, but it also assumes that any
given resource might have many more properties than those listed. Some servers will
support only a fixed set of properties for a particular type of resource. Clients should
always assume that the set of properties for a resource of a particular type at an
arbitrary server might be open, in the sense that different resources of the same type
might not all have the same properties, and the set of properties that are used in the
state of a resource is not limited to any predefined set. However, when dealing with Basic
Profile Resources, clients should assume that a Basic Profile server might discard triples
for properties of which it has no prior knowledge. In other words, servers can restrict
themselves to a known set of properties, but clients cannot. When doing an update using
HTTP PUT, a Basic Profile client must preserve all property values retrieved by using HTTP
GET. This includes all property values that it doesn't change or understand. (Use of HTTP
PATCH or SPARQL Update rather than HTTP PUT for updates avoids this burden for
clients.)
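The PUT rule above can be illustrated with a small sketch. It assumes the client holds the fetched state as a set of (subject, predicate, object) tuples; the `update_property` helper, the bug URL, and the `severity` predicate are hypothetical names chosen for the example.

```python
DCTERMS_TITLE = "http://purl.org/dc/terms/title"


def update_property(triples, subject, predicate, new_value):
    """Return a new triple set with one property replaced and all others kept."""
    kept = {t for t in triples if not (t[0] == subject and t[1] == predicate)}
    kept.add((subject, predicate, new_value))
    return kept


bug = "http://example.org/bugs/1"  # hypothetical resource
fetched = {
    (bug, DCTERMS_TITLE, "Crash on start"),
    # A property this client does not understand; it must survive the update.
    (bug, "http://example.org/vocab#severity", "high"),
}

# The body to send with HTTP PUT: the changed title plus everything else as fetched.
to_put = update_property(fetched, bug, DCTERMS_TITLE, "Crash on startup")
```

A client that rebuilt the representation only from the properties it knows would silently drop the `severity` triple on PUT, which is the failure this rule is meant to prevent.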

Basic Profile clients do not assume the type of a resource at the end of a
link. Many specifications and most traditional applications have a
"closed model," by which we mean that any reference from a resource in the specification
or application necessarily identifies a resource in the same specification (or a
referenced specification) or application. In contrast, the HTML anchor tag can point to
any resource addressable by an HTTP URI, not just other HTML resources. Basic Profile
works like HTML in this sense. An HTTP URI reference in one Basic Profile Resource can, in
general, point to any resource, not just a Basic Profile Resource. There are numerous
reasons to maintain an open model like HTML's. One is that it allows data that has not yet
been defined to be incorporated in the web in the future. Another reason is that it allows
individual applications and sites to evolve over time. If clients assume that they know
what will be at the other end of a link, then the data formats of all resources across the
transitive closure of all links must be kept stable across version
upgrades.

A consequence of this independence is that client
implementations that traverse HTTP URI links from one resource to another should always
code defensively and be prepared for any resource at the end of the link. Defensive coding
by client implementers is necessary to allow sets of applications that communicate through
Basic Profile to be independently upgraded and flexibly extended.
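One way to code defensively is to decide what to do with a link target only after looking at the response, rather than assuming an RDF resource. The sketch below is an illustration under assumptions: the `classify_link_target` function, its three return labels, and the particular set of RDF media types are choices made for this example, not part of the Basic Profile.

```python
# Media types this hypothetical client knows how to parse as RDF.
RDF_MEDIA_TYPES = {
    "application/rdf+xml",
    "text/turtle",
    "text/n3",
    "application/n-triples",
}


def classify_link_target(status, content_type):
    """Decide how to treat whatever was found at the end of a link."""
    if status != 200:
        return "unavailable"
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = (content_type or "").split(";")[0].strip().lower()
    if media_type in RDF_MEDIA_TYPES:
        return "rdf"
    # Anything else (HTML, images, unknown formats) is kept as an opaque link.
    return "opaque"
```

For example, `classify_link_target(200, "image/png")` yields `"opaque"`, so the client can still display or follow the link without trying to parse it.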

Basic Profile servers implement simple validations for Create and
Update. Basic Profile servers should try to make it easy for programmatic
clients to create and update resources. If Basic Profile implementations associate a lot
of very complex validation rules that need to be satisfied for an update or creation to be
accepted, it becomes difficult or impossible for a client to use the protocol without
extensive additional information specific to the server that needs to be communicated
outside of the Basic Profile specifications. The recommended approach is for servers to
allow creation and updates based on the sort of simple validations that can be
communicated programmatically through a Shape (see the Constraints section). Additional checks that are required to implement more
complex policies and constraints should result in the resource being flagged as requiring
more attention, but should not cause the basic Create or Update action to fail.

It is possible that some applications or sites will have very strict
requirements for complex constraints for data and that they are unable or unwilling to
even temporarily allow the creation of resources that do not satisfy all of those
constraints. Those applications or sites need to be aware that, as a consequence, they
might be making it difficult or impossible for external software to use their interfaces
without extensive customization.

Basic Profile Resources always use simple RDF predicates to represent
links. By always representing links as simple predicate values, Basic
Profile makes it very simple to know how links will appear in representations and also
makes it very simple to query them. When there is a need to express properties on a link,
Basic Profile adds an RDF statement with the same subject, object, and predicate as the
original link, which is retained, plus any additional "link properties." Basic Profile
Resources do not use "inverse links" to support navigation of a relationship in the
opposite direction, because this creates a data synchronization problem and complicates a
query. Instead, Basic Profile assumes that clients can use queries to navigate
relationships in the opposite direction from the direction supported by the underlying
link.
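The two ideas in this rule can be sketched with plain tuples. The sketch assumes that the added "RDF statement" is a standard reified statement (rdf:Statement with rdf:subject, rdf:predicate, and rdf:object); the article does not show the exact triples, so treat that as an interpretation. The helper names, the `blocks` predicate, and the statement URI are hypothetical.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCTERMS = "http://purl.org/dc/terms/"


def add_link_properties(triples, link, statement_id, properties):
    """Keep the simple link and describe it with an additional RDF statement."""
    s, p, o = link
    triples.add(link)  # the original link is retained as a plain triple
    triples.update({
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", s),
        (statement_id, RDF + "predicate", p),
        (statement_id, RDF + "object", o),
    })
    for predicate, value in properties.items():
        triples.add((statement_id, predicate, value))


def subjects_linking_to(triples, predicate, target):
    """Navigate a link backwards by query instead of storing an inverse link."""
    return sorted(s for s, p, o in triples if p == predicate and o == target)


graph = set()
blocks = "http://example.org/vocab#blocks"  # hypothetical link predicate
link = ("http://example.org/bugs/1", blocks, "http://example.org/bugs/2")
add_link_properties(
    graph,
    link,
    "http://example.org/bugs/1#link1",  # hypothetical statement URI
    {DCTERMS + "created": "2011-11-01T00:00:00Z"},
)

blockers = subjects_linking_to(graph, blocks, "http://example.org/bugs/2")
```

Because the plain triple is always present, a query for the link works the same way whether or not link properties exist.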

Common properties

The tables that follow list properties from well-known RDF vocabularies that are recommended
for use in Basic Profile Resources. Basic Profile requires none of them, but a specification
based on Basic Profile might require one or more of these properties for a particular type of
resource.

Commonly used namespace prefixes

| Prefix | Namespace URI |
|---|---|
| dcterms | http://purl.org/dc/terms/ |
| rdf | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
| rdfs | http://www.w3.org/2000/01/rdf-schema# |
| bp | http://open-services.net/ns/basicProfile# |
| xsd | http://www.w3.org/2001/XMLSchema# |
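As a small aid for reading the tables that follow, this sketch expands a prefixed name such as `dcterms:title` into its full URI using the prefixes listed above. The `expand` function and the `PREFIXES` dictionary are hypothetical helper names.

```python
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "bp": "http://open-services.net/ns/basicProfile#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand(name):
    """Expand a prefixed name like 'dcterms:title' into a full URI."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown prefix in {name!r}")
    return PREFIXES[prefix] + local
```

With these definitions, `expand("rdf:type")` gives `http://www.w3.org/1999/02/22-rdf-syntax-ns#type`.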

From Dublin Core

URI: http://purl.org/dc/terms/

| Property | Range | Comment |
|---|---|---|
| dcterms:contributor | dcterms:Agent | The identifier of a resource (or blank node) that is a contributor of information. This resource can be a person or group of people or, possibly, an automated system. |
| dcterms:creator | dcterms:Agent | The identifier of a resource (or blank node) that is the original creator of the resource. This resource can be a person or group of people or, possibly, an automated system. |
| dcterms:created | xsd:dateTime | The creation timestamp. |
| dcterms:description | rdf:XMLLiteral | Descriptive text about the resource represented as rich text in XHTML format. Should include only content that is valid and suitable inside an XHTML <div> element. |
| dcterms:identifier | rdfs:Literal | A unique identifier for the resource. Typically read-only and assigned by the service provider when a resource is created. Not typically intended for end-user display. |
| dcterms:modified | xsd:dateTime | Date on which the resource was changed. |
| dcterms:relation | rdfs:Resource | The URI of a related resource. This is the predicate to use when you do not know what else to use. If you know what kind of relationship it is, use a more specific predicate. |
| dcterms:subject | rdfs:Resource | Should be a URI (see dbpedia.org). From Dublin Core: "Typically, the subject will be represented using keywords, key phrases, or classification codes. Recommended best practice is to use a controlled vocabulary. To describe the spatial or temporal topic of the resource, use the Coverage element." |
| dcterms:title | rdf:XMLLiteral | A name given to the resource. Represented as rich text in XHTML format. Should include only content that is valid inside an XHTML <span> element. |

From RDF

URI: http://www.w3.org/1999/02/22-rdf-syntax-ns#

| Property | Range | Comment |
|---|---|---|
| rdf:type | rdfs:Class | The type or types of the resource. Basic Profile recommends that the rdf:type(s) of a resource be set explicitly in resource representations to facilitate query with non-inferencing query engines. |

Basic Profile Container

Many HTTP applications and sites have organizing concepts that partition the overall space of
resources into smaller Containers. Blog posts are grouped into blogs, wiki pages are grouped
into wikis, and products are grouped into catalogs. Each resource created in the application
or site is created within an instance of one of these Container-like entities, and users can
list the existing artifacts within one. There is no agreement across applications or sites,
even within a particular domain, on what these grouping concepts should be called, but they
commonly exist and are important. Containers answer two basic questions:

To which URLs can I POST to create new resources?

Where can I GET a list of existing resources?

In the XML world, Atom Publishing Protocol (APP) has become popular as a standard for answering
these questions. APP is not a good match for Linked Data, but this Basic Profile shows how
the same problems that are solved by APP for XML-centric designs can be solved by a simple
Linked Data usage pattern with simple conventions for posting to RDF Containers. We call
RDF Containers that you can POST to "Basic Profile Containers." Here are some of their
characteristics:

Clients can retrieve the list of existing resources in a Basic Profile Container.

New resources are created in Basic Profile Containers by POSTing to them.

Any resource can be POSTed to a Basic Profile Container. A resource does not have to be a
Basic Profile Resource with an RDF representation to be POSTed to a Basic Profile
Container.

After POSTing a new resource to a Container, the new resource will appear as a member of
the Container until it is deleted. A Container can also contain resources that were added
through other means, for example through the user interface of the site that implements
the Container.

The same resource can appear in multiple Containers. This happens commonly if one
Container is a "view" onto a larger Container.

Clients can get partial information about a Basic Profile Container without retrieving a
full representation of all of its contents.

The representation of a Basic Profile Container is a standard RDF Container representation that
uses the rdfs:member predicate. For
example, if you have a Container with the URL http://example.org/BasicProfile/container1, it
might have the representation shown in Listing 1.
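As a rough illustration of the kind of representation described here (a sketch, not a reproduction of the article's listing), the following builds the triples for a Container and prints them in N-Triples form. The member URLs are hypothetical, and typing the Container as rdfs:Container is an assumption based on the rdfs:member convention.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def container_triples(container_url, member_urls):
    """One type triple for the Container plus one rdfs:member triple per member."""
    triples = [(container_url, RDF_TYPE, RDFS + "Container")]
    triples += [(container_url, RDFS + "member", m) for m in member_urls]
    return triples


def to_ntriples(triples):
    # Simplification: every object in this sketch is a URI, so all terms use <...>.
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


container = "http://example.org/BasicProfile/container1"
members = [
    "http://example.org/BasicProfile/member1",  # hypothetical member
    "http://example.org/BasicProfile/member2",  # hypothetical member
]
representation = to_ntriples(container_triples(container, members))
print(representation)
```

Each member appears as the object of a single rdfs:member triple whose subject is the Container, which keeps the membership easy to query.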

The Basic Profile does not recognize or recommend the use of other forms of an RDF Container,
such as Bag and Seq, because they are not friendly to query. This follows standard Linked Data
guidance for RDF use.

The Basic Profile recommends the use of a set of standard Dublin Core properties with
Containers. The subject of triples using these properties is the Container itself.

rdfs:Container domain properties

| Property | Occurs | Range | Comment |
|---|---|---|---|
| dcterms:title | zero or one | rdf:XMLLiteral | A name given to the resource. Represented as rich text in XHTML format. Should include only content that is valid inside an XHTML <span> element. |
| dcterms:description | zero or one | rdf:XMLLiteral | Descriptive text about the resource represented as rich text in XHTML format. Should include only content that is valid and suitable inside an XHTML <div> element. |
| dcterms:publisher | zero or one | dcterms:Agent | An entity responsible for making the Basic Profile Container and its members available. |
| bp:containerPredicate | exactly one | rdfs:Property | The predicate of the triples whose objects define the contents of the Container. |

Retrieving non-member properties

The representation of a Container that has many members will be large. When we looked at our
use cases, we saw that there were several important cases where clients needed to access only
the non-member properties of the Container. (The dcterms properties listed in this page might
not seem important enough to warrant addressing this problem, but we have use cases that add
other predicates to Containers, such as for providing validation information and associating
SPARQL endpoints.) Because retrieving the whole Container representation to get
this information is onerous, we were motivated to define a way to retrieve only the non-member
property values. We do this by defining a corresponding resource for each Basic Profile
Container, called the "non-member resource," which has a state that is a subset of the state
of the Container. The non-member resource's HTTP URI can be derived in the following way:

If the HTTP URI of the Container is {url}, then the HTTP URI of the related non-member resource
is {url}?non-member-properties. The representation of {url}?non-member-properties is identical
to the representation of {url}, except that the membership triples are missing. The subjects
of the triples will still be {url} (or whatever they were in the representation of {url}), not
{url}?non-member-properties. Any server that does not support non-member resources should
return an HTTP 404 (Not Found) error when a non-member resource is requested.

This approach is analogous to using HTTP HEAD rather than HTTP GET. The difference is that HTTP
HEAD is used to fetch the response headers for a resource, as opposed to requesting the entire
representation of a resource using HTTP GET. Listing 1 shows an example.
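The URL derivation and the "same state minus membership triples" rule can be sketched as follows. This is a minimal illustration under assumptions: state is modeled as a list of (subject, predicate, object) tuples, rdfs:member is taken as the membership predicate, and the function names, title value, and member URLs are hypothetical.

```python
RDFS_MEMBER = "http://www.w3.org/2000/01/rdf-schema#member"
DCTERMS_TITLE = "http://purl.org/dc/terms/title"


def non_member_url(container_url):
    """Derive the URL of the non-member resource from the Container URL."""
    return container_url + "?non-member-properties"


def non_member_state(triples, container_url, membership_predicate=RDFS_MEMBER):
    """Return the Container's state without its membership triples."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if not (s == container_url and p == membership_predicate)
    ]


container = "http://example.org/BasicProfile/container1"
full_state = [
    (container, DCTERMS_TITLE, "Example container"),
    (container, RDFS_MEMBER, "http://example.org/BasicProfile/member1"),
    (container, RDFS_MEMBER, "http://example.org/BasicProfile/member2"),
]
subset = non_member_state(full_state, container)
```

Note that the subject of the remaining triples is still the Container URL, not the URL returned by `non_member_url`.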

Design motivation and background

The concept of non-member resources has not been especially controversial, but using the URL
pattern {url}?non-member-properties to identify them has been. Some people feel it's an
unacceptable intrusion into the URL space that is owned and controlled by the server that
defines {url}. A more practical objection is that servers respond unpredictably to URLs that
they do not understand, especially those that have a ? character in them. For example, some
servers will return the resource identified by the portion of the URL that precedes the ? and
simply ignore the rest.

This problem could perhaps be mitigated by using a character other than ? in the URL pattern.
An alternative design that was discussed uses a header field in the response header of {url}
to allow the server to control and communicate the URL of the corresponding
non-member-resource. Presence or absence of the header field would let clients know whether
the non-member resource is supported by the server.

The advantages of this approach are that it does not impinge on the server's URL space and
that it works predictably for servers that do not understand the concept of a
non-member-resource.

The disadvantages are that it requires two server round-trips, a HEAD and a GET, to
retrieve the non-member resources, and it requires the definition of a custom HTTP header,
which, to some people at least, seems comparatively heavyweight.

Additional considerations

Basic Profile Containers should provide guidance in these situations:

Whether dcterms:modified, the ETag, or both change when Container membership changes, so
that Containers can be cached effectively

When there are membership limitations (typically, a resource will only be part of a single
Container, although there might be exceptions)

Basic Profile validation and constraints

Basic Profile resources are RDF resources, and RDF has the happy characteristic that "it can
say anything about anything." This means that, in principle, any resource can have any
property and there is no requirement that any two resources have the same set of properties,
even if they have the same type or types. In practice, though, the properties that are set on
resources usually follow regular patterns that are dictated by the uses of those resources.
Although a particular resource might have arbitrary properties, when viewed from the
perspective of a particular application or use case, the set of properties and property values
that are appropriate for that resource in that application will often be predictable and
constrained. For example, if a server has resources that represent software products and bugs,
a client might want to know what properties software products and bugs have on that server,
whether for displaying information in tabular formats, creating and updating resources, or
other purposes. The Basic Profile Validation and Constraints specification aims
to capture information about those properties and constraints.

The distinction between the resource and the use cases that it participates in is important to
us. Traditional technologies such as relational databases constrain the total set of
properties that an entity can have. In the Basic Profile, we aim only to define the properties
that a resource can have when viewed through the lens of a particular application or use case,
while retaining the ability of the same resource to have an arbitrary set of properties to
support other applications and use cases.

The set of properties that a resource can or will have is not necessarily linked to its type,
but exploiting the pattern where resources of the same type have the same properties is a very
traditional approach that supports the development of many useful applications. Sometimes,
knowledge of types and properties for the application is hard-coded in software, but there are
many cases where it is desirable to represent this knowledge in data. The Basic Profile
provides resource types called Shape and PropertyConstraint
to represent this data.

Note on the relationship of Shape to other standards: Although we're all very familiar from relational databases and object-oriented
programming with the model where the valid properties are constrained by the type, it is not
the "natural" model of RDF, nor is it the model of the natural world. The familiar model says
that if you are of type X, you will have these properties that will have values of certain
types. RDF and, to a large degree, the natural world work the other way around; if you have
these properties, you must be of type X. We are not aware of any OWL or RDFS construct that
lets you say "from the perspective of application X, resources with an RDF type of Y will have
the list of properties Z," or that lets you constrain the types of the values of these
properties.

Class: PropertyConstraint

URI: http://open-services.net/ns/basicProfile#PropertyConstraint

bp:PropertyConstraint domain properties

| Property | Occurs | Range | Comment |
|---|---|---|---|
| rdfs:label | zero or one | rdfs:Literal | A human-readable name for the subject. (from rdfs) |
| rdfs:comment | zero or one | rdfs:Literal | A description of the subject resource. (from rdfs) |
| bp:constrainedProperty | exactly one | rdfs:Property | The URI of the predicate being constrained. |
| bp:rangeShape | zero or one | bp:Shape | A bp:Shape that describes the rdfs:Class that is the range of the property. |
| bp:allowedValue | zero or many | range of the subject | A value allowed for the property. If there are both bp:allowedValue elements and a bp:AllowedValues resource, then the full set of allowed values is the union of both. |
| bp:AllowedValues | zero or many | bp:AllowedValues | A resource with allowed values for the property being defined. |
| bp:defaultValue | zero or one | range of the object | A default value for the property. |
| bp:occurs | exactly one | rdfs:Resource | Must be one of these four: http://open-services.net/ns/basicProfile#Exactly-one, http://open-services.net/ns/basicProfile#Zero-or-one, http://open-services.net/ns/basicProfile#Zero-or-many, or http://open-services.net/ns/basicProfile#One-or-many |
| bp:readOnly | zero or one | Boolean | true if the property is read-only. If not set or set to false, then the property is writable. Providers should declare a property read-only when changes to the value of that property will not be accepted on PUT. Consumers should note that the converse does not apply: Providers may reject a change to the value of a writable property. |
| bp:maxSize | zero or one | Integer | For String properties only, specifies maximum characters allowed. If not set, then there is no maximum or maximum is specified elsewhere. |
| bp:valueType | zero or one | rdfs:Resource | For literals, see XSD Datatypes. |

It is debatable whether we should have a separate bp:PropertyConstraint class with a property
on it called bp:constrainedProperty, or whether it would be better to use rdf:Property and
simply define new predicates with rdf:Property as the domain.

Important: Whichever approach is chosen, do not use rdfs:range to express these constraints,
because its semantics are different. rdfs:range lets a reasoner infer the type of a value; it
does not state what values an application expects a property to have.

Class: bp:AllowedValues

URI: http://open-services.net/ns/basicProfile#AllowedValues

bp:AllowedValues domain properties

| Property | Occurs | Range | Comment |
| --- | --- | --- | --- |
| bp:allowedValue | zero or many | same as range of owning property | Allowed value |

Class: bp:Shape

URI: http://open-services.net/ns/basicProfile#Shape

bp:Shape domain properties

| Property | Occurs | Range | Comment |
| --- | --- | --- | --- |
| dcterms:title | zero or one | rdf:XMLLiteral | Title |
| bp:describedClass | exactly one | rdfs:Class | Class described |
| bp:propertyConstraints | zero or one | rdf:List | The list of PropertyConstraints for properties of this Shape. The domains of the PropertyConstraints must be compatible with the describedClass. |
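To make the three classes above more concrete, the following Python sketch models a Shape and its PropertyConstraints in memory and checks a resource against them. It is an illustration only, not part of the Basic Profile: the names `check`, `OCCURS_BOUNDS`, and the dict-based resource layout are assumptions made for the example, and it assumes that the bp:occurs values live in the same namespace as the classes. Only bp:occurs and bp:allowedValue are checked.

```python
from dataclasses import dataclass, field

# Assumed namespace, taken from the class URIs given in the article.
BP = "http://open-services.net/ns/basicProfile#"

# (minimum, maximum) number of values for each bp:occurs value; None = unbounded.
OCCURS_BOUNDS = {
    BP + "Exactly-one": (1, 1),
    BP + "Zero-or-one": (0, 1),
    BP + "Zero-or-many": (0, None),
    BP + "One-or-many": (1, None),
}


@dataclass
class PropertyConstraint:
    constrained_property: str    # bp:constrainedProperty
    occurs: str                  # bp:occurs
    allowed_values: tuple = ()   # bp:allowedValue; empty is treated as "unrestricted" (assumption)
    read_only: bool = False      # bp:readOnly (recorded, not checked here)


@dataclass
class Shape:
    described_class: str         # bp:describedClass
    property_constraints: list = field(default_factory=list)  # bp:propertyConstraints


def check(shape, resource):
    """Return a list of violation messages for `resource`.

    `resource` is a hypothetical dict that maps predicate URIs to lists of values.
    Properties that the shape does not mention are accepted, because a resource
    may carry an arbitrary set of additional properties.
    """
    problems = []
    for pc in shape.property_constraints:
        values = resource.get(pc.constrained_property, [])
        low, high = OCCURS_BOUNDS[pc.occurs]
        if len(values) < low or (high is not None and len(values) > high):
            problems.append(
                f"{pc.constrained_property}: {len(values)} value(s) violates {pc.occurs}"
            )
        if pc.allowed_values:
            for value in values:
                if value not in pc.allowed_values:
                    problems.append(f"{pc.constrained_property}: {value!r} is not allowed")
    return problems


if __name__ == "__main__":
    title = "http://purl.org/dc/terms/title"
    defect_shape = Shape(
        "http://example.org/ns#Defect",  # hypothetical class URI
        [PropertyConstraint(title, BP + "Exactly-one")],
    )
    print(check(defect_shape, {title: ["Crash on save"]}))
    print(check(defect_shape, {}))
```

An empty list means that no constraint was violated. The sketch deliberately ignores bp:rangeShape, bp:valueType, and bp:maxSize to stay short.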

Validation semantics

Validation semantics are defined by mapping the property and class definitions to SPARQL ASK
queries. This allows the constraints to be declared in RDF while relying on the existing
SPARQL ASK specification for their meaning.
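The article does not spell out the mapping to SPARQL ASK. As one possible reading, the following Python sketch generates ASK queries for the bp:occurs part of a single PropertyConstraint. Each generated query is true when the constraint is violated for a given resource. The function name `violation_queries` and the choice to phrase the queries as violations are assumptions of this sketch, not something the Basic Profile defines.

```python
# Assumed namespace, taken from the class URIs given in the article.
BP = "http://open-services.net/ns/basicProfile#"

KNOWN_OCCURS = {"Exactly-one", "Zero-or-one", "Zero-or-many", "One-or-many"}


def violation_queries(resource_uri, property_uri, occurs_uri):
    """Return SPARQL ASK queries that evaluate to true on a violation.

    An empty list means that the bp:occurs value imposes no cardinality limit.
    """
    kind = occurs_uri.rsplit("#", 1)[-1]
    if kind not in KNOWN_OCCURS:
        raise ValueError(f"unknown bp:occurs value: {occurs_uri}")

    queries = []
    if kind in ("Exactly-one", "One-or-many"):
        # Violated when the resource has no value at all for the property.
        queries.append(
            f"ASK {{ FILTER NOT EXISTS {{ <{resource_uri}> <{property_uri}> ?v }} }}"
        )
    if kind in ("Exactly-one", "Zero-or-one"):
        # Violated when the resource has two distinct values for the property.
        queries.append(
            f"ASK {{ <{resource_uri}> <{property_uri}> ?v1, ?v2 . "
            f"FILTER (!sameTerm(?v1, ?v2)) }}"
        )
    return queries


if __name__ == "__main__":
    for query in violation_queries(
        "http://example.org/defects/1",       # hypothetical resource
        "http://purl.org/dc/terms/title",
        BP + "Exactly-one",
    ):
        print(query)
```

A resource satisfies the occurrence constraint when every generated query returns false against the graph that holds its triples.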

Associating Shapes and Containers

It is useful to be able to specify for a Container what types of members it will return and
accept, plus what properties it expects to be used with resources of those types. To enable
this, the Basic Profile defines two new Container properties, which are shown in Table 9.

Table 9. rdfs:Container domain properties

| Property | Occurs | Range | Comment |
| --- | --- | --- | --- |
| bp:createShape | zero or many | bp:Shape | One or more Shapes that provide information on the expected data formats of resources that can be POSTed to the Container to create new members. |
| bp:readShape | zero or many | bp:Shape | One or more Shapes that provide information on the expected data formats of resources that can be found as members of the Container. Containers often add properties of their own to POSTed and PUT resources (creation date, modification date, creator), and it's useful for clients to know what these might be. |

Basic Profile paging

It sometimes happens that a resource is too large to reasonably transmit its representation in
a single HTTP response. A client might anticipate that a resource will be too large (for
example, a client tool that accesses defects might assume that an individual defect will
usually be of sufficiently constrained size that it makes sense to request all of it at once,
but that the list of all the defects ever created will typically be too big). Alternatively, a
server might recognize that a resource that has been requested is too big to return in a
single message.

To address this problem, Basic Profile Resources can support a technique
called paging that enables clients to retrieve representations of resources
one page at a time. For every resource with a URL of {url}, a Basic
Profile implementation might define a companion resource with a URL of
{url}?firstPage. The meaning of this resource is: the first page of
{url}. Clients that anticipate that a particular resource will be
too large might, instead, fetch this alternative resource. Servers that determine that a
requested resource is too large might respond with a 302 redirect message, directing the
client to the firstPage resource.
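The server-side decision described above can be sketched in a few lines of Python. The function name `plan_response` and the `max_triples` threshold are hypothetical; the Basic Profile does not say how a server decides that a resource is too large.

```python
def plan_response(url, triple_count, max_triples=1000):
    """Return (status, headers) for a GET on `url`.

    `max_triples` is an arbitrary illustrative threshold. When the resource is
    larger than that, the client is redirected to the companion firstPage resource.
    """
    if triple_count > max_triples:
        return 302, {"Location": f"{url}?firstPage"}
    return 200, {}


if __name__ == "__main__":
    print(plan_response("http://acme.com/BasicProfile/container/1", 1_000_000_000))
```

A client that already expects a large resource can skip the redirect and request `{url}?firstPage` directly.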

The representation of {url}?firstPage will contain a subset of the
triples that define the state of the resource with a URL of {url}.
The triples are unmodified, so the subject of the triples will be whatever it was in the
representation of {url}, typically
{url}, not {url}?firstPage. In addition,
the representation of {url}?firstPage will include a few triples
with a subject of {url}?firstPage. Examples are triples with
predicates of bp:nextPage, dcterms:description, and so on.

For example, if you have a Basic Profile Container with the URL of
http://acme.com/BasicProfile/container/1, it might have the following representation (in
Turtle notation):

This representation has a billion triples and over 90 billion characters, which might be a bit
big. Assuming that the implementation that backs this resource supports paging, a client can
choose instead to GET the related resource: http://acme.com/BasicProfile/container/1?firstPage.
The representation of this latter resource would look like this:

As you can see, the representation of this smaller firstPage resource contains the first 100
triples that you would have had in the representation of the large resource in exactly the
same form -- the same subject, predicate, and object -- as in the representation of the large
resource. In addition, it contains another triple with a subject that is the firstPage
resource itself, not the bigger resource, that provides the URL of a third resource that will
contain the following page of triples from the bigger resource. The format of the URLs of the
second and subsequent pages (if they exist) is not defined by the Basic Profile; a Basic
Profile implementation can use whichever URL it pleases. Note that, although this example
shows the triples in a precise order for purposes of simplicity and clarity of the example,
there is no concept of ordering of triples in RDF, so the triples can be in any order, both
within and across pages. An obvious restriction is that all triples that reference the same
blank node, either as subject or object, need to be in the same page (this is simply an
observation on how RDF works, not a Basic Profile policy or limitation).
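The blank-node restriction is the only part of page construction that needs some care. The following Python sketch shows one way a server could split triples into pages while keeping all triples that are connected through blank nodes on the same page. It is an illustration under stated assumptions: triples are plain `(subject, predicate, object)` string tuples, blank nodes are the terms that start with `_:`, literals are written with their quotes so that they can never start with `_:`, and the name `paginate` is hypothetical.

```python
def paginate(triples, page_size):
    """Split `triples` into pages of roughly `page_size` triples each.

    Triples that share a blank node, directly or through a chain of blank
    nodes, always land on the same page. Such a group is never split, so a
    page may exceed `page_size` when one group is larger than the limit.
    """
    parent = {}  # union-find forest over blank node labels

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def blank_nodes(triple):
        subject, _, obj = triple
        return [term for term in (subject, obj) if term.startswith("_:")]

    # Pass 1: connect blank nodes that occur in the same triple.
    for triple in triples:
        nodes = blank_nodes(triple)
        for node in nodes:
            find(node)
        if len(nodes) == 2:
            parent[find(nodes[0])] = find(nodes[1])

    # Pass 2: build groups in first-seen order.
    groups_by_root = {}
    ordered_groups = []
    for triple in triples:
        nodes = blank_nodes(triple)
        if not nodes:
            ordered_groups.append([triple])  # no blank node: free to go anywhere
            continue
        root = find(nodes[0])
        if root not in groups_by_root:
            groups_by_root[root] = []
            ordered_groups.append(groups_by_root[root])
        groups_by_root[root].append(triple)

    # Pass 3: pack whole groups into pages.
    pages, current = [], []
    for group in ordered_groups:
        if current and len(current) + len(group) > page_size:
            pages.append(current)
            current = []
        current.extend(group)
    if current:
        pages.append(current)
    return pages


if __name__ == "__main__":
    container = "<http://acme.com/BasicProfile/container/1>"
    example = [
        (container, "rdfs:member", "<http://acme.com/members/1>"),
        (container, "ex:owner", "_:b1"),       # ex: is a made-up prefix
        (container, "rdfs:member", "<http://acme.com/members/2>"),
        ("_:b1", "ex:name", '"Ann"'),
    ]
    for number, page in enumerate(paginate(example, 2), start=1):
        print(number, page)
```

A real server would also add the page's own triples, such as bp:nextPage, to each page; that step is left out here.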

As illustrated above, when a page is returned, it will include the triple:

<url of current page> bp:nextPage <url of next page>

You can tell that you are on the last page when the <url of next page> is bp:nilPage.

By the time a client follows a bp:nextPage link, there might no longer be a next page. The
Basic Profile server implementation in this case must respond with an empty page with
bp:nextPage set to bp:nilPage.
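A client can therefore read a paged resource with a simple loop that follows bp:nextPage until it reaches bp:nilPage. The Python sketch below shows that loop. The `fetch_triples` callable is a hypothetical stand-in for an HTTP GET plus RDF parsing, and the full URIs for bp:nextPage and bp:nilPage assume that both terms live in the same namespace as the Basic Profile classes.

```python
# Assumed URIs, built from the namespace used for the Basic Profile classes.
BP_NEXT_PAGE = "http://open-services.net/ns/basicProfile#nextPage"
BP_NIL_PAGE = "http://open-services.net/ns/basicProfile#nilPage"


def read_all_pages(url, fetch_triples):
    """Collect the triples of the resource at `url`, one page at a time.

    `fetch_triples(page_url)` is a hypothetical callable that returns the
    (subject, predicate, object) triples found in the representation of a page.
    """
    resource_triples = set()   # a set, because repeated triples are harmless
    visited = set()
    page_url = f"{url}?firstPage"

    while page_url != BP_NIL_PAGE:
        if page_url in visited:
            raise RuntimeError(f"paging loop detected at {page_url}")
        visited.add(page_url)

        next_url = None
        for subject, predicate, obj in fetch_triples(page_url):
            if subject == page_url:
                # Triples about the page itself are not part of the resource state.
                if predicate == BP_NEXT_PAGE:
                    next_url = obj
            else:
                resource_triples.add((subject, predicate, obj))

        if next_url is None:
            raise ValueError(f"page {page_url} has no bp:nextPage triple")
        page_url = next_url

    return resource_triples


if __name__ == "__main__":
    base = "http://acme.com/BasicProfile/container/1"
    # In-memory stand-in for a server; the second page URL is made up.
    fake_server = {
        base + "?firstPage": [
            (base, "rdfs:member", "http://acme.com/members/1"),
            (base + "?firstPage", BP_NEXT_PAGE, base + "?p=2"),
        ],
        base + "?p=2": [
            (base, "rdfs:member", "http://acme.com/members/2"),
            (base + "?p=2", BP_NEXT_PAGE, BP_NIL_PAGE),
        ],
    }
    print(sorted(read_all_pages(base, fake_server.__getitem__)))
```

Because the client treats the format of the second and later page URLs as opaque, it works with whatever URLs the server chooses.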

The Basic Profile permits {url}?pageSize={n} as an alias for
{url}?firstPage. Because it is just an alias, it has exactly the
same meaning and behavior. A Basic Profile server implementation can (but is not obliged to)
adjust the number of triples on the first and subsequent pages based on the value of
n.

Note that pagination is defined only for resources with states that can be expressed in RDF as
a set of RDF triples. Pagination is undefined for resources with states that cannot be
represented in RDF. Pure binary resources, encrypted resources, or digitally signed resources
might be examples. The representation of a page is defined by, first, paginating the
underlying triples that express the state of the resource being paginated, and then performing
whatever standard mapping is used to map from each page of triples to the requested
representation. In other words, we do not paginate the representations; we paginate the RDF
resource state itself and then create the representations of each page in whatever media type
is requested. This provides a general specification for pages for both RDF and non-RDF
representations. Examples of non-RDF representations are HTML and JSON.

Instability of paging

Because HTTP is a stateless protocol and Basic Profile servers manage resources that can change
frequently, Basic Profile clients should assume that resources can change as
they page through them using the bp:nextPage mechanism. Nevertheless, each triple of the resource that exists when
the first page is returned and is not subsequently deleted during the paging interaction must
be included on at least one page. (Including the same triple more than once is permissible --
identical triples are always discarded in RDF -- but servers need to ensure that the same
triple is not returned multiple times with different object values.) Triples that are added
after the first page is returned might or might not be included in subsequent pages by a
server.

Class: bp:Page

URI: http://open-services.net/ns/basicProfile#Page

Table 10. bp:Page properties

Conclusion

We believe that getting to a simple Basic Profile will enable broader adoption of Linked Data
principles for application integration. Additional development of some of the concepts will be
necessary to complete such a profile. The intention of this article is to initiate the
much-needed development of specifications that will fill this gap.

Read An update on RDF
concepts and some ontologies to review recent updates to the Resource
Description Framework (RDF) concept specification and the implications of those updates for
the Semantic Web and the Linked Data movement.
