S3XML

S3XML is a data exchange format for Sahana Eden.

S3XML is a meta-format and does not specify any particular data elements. The interface introspects the underlying data model, so the constraints defined in the data model also apply to S3XML documents.

Conventions

Name Space

In the current implementation of S3XML, no name space identifier shall be used. Where a name space identifier for the native S3XML format is needed (e.g. when embedding S3XML in other XML), it shall be:

xmlns:s3xml="http://eden.sahanafoundation.org/wiki/S3XML"
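As a rough illustration of the embedding case, the following Python sketch wraps a namespaced S3XML tree in a host document using the standard library. The function name, the host element name and the choice to namespace-qualify the child elements are assumptions made for this example, not something the format prescribes.

```python
import xml.etree.ElementTree as ET

# Name space identifier given in the text above
S3XML_NS = "http://eden.sahanafoundation.org/wiki/S3XML"


def embed_s3xml(host_root_tag, resource_name):
    """Return a host XML document (hypothetical) with an embedded S3XML tree."""
    # Ask ElementTree to serialize the name space with the "s3xml" prefix
    ET.register_namespace("s3xml", S3XML_NS)
    host = ET.Element(host_root_tag)
    # Assumption: embedded elements are qualified with the S3XML name space
    s3xml = ET.SubElement(host, f"{{{S3XML_NS}}}s3xml")
    ET.SubElement(s3xml, f"{{{S3XML_NS}}}resource", {"name": resource_name})
    return ET.tostring(host, encoding="unicode")


document = embed_s3xml("envelope", "xxx_yyy")
```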

Character Encoding

Generally, XML documents can specify their character encoding in the XML header:

<?xml version="1.0" encoding="utf-8"?>

Sources in non-XML formats (JSON, CSV) used with S3XML on-the-fly conversion/transformation are expected to be UTF-8 encoded.

Exported data are always UTF-8 encoded.

Import Sources

There are 3 different ways to specify or submit data sources for import:

Files on the Server

A source file in the server file system can be specified using the filename URL variable:

Remote Files

A source file can be specified by its URL using the fetchurl URL variable:

PUT http://<server>/<controller>/<resource>.xml?fetchurl=<url>

Multiple files can be specified as a comma-separated list of URLs:

PUT http://<server>/<controller>/<resource>.xml?fetchurl=<url>,<url>

Supported protocols are http, ftp and file, where file is interpreted in the server file system context. URLs of different protocols can be mixed.

The specified URLs must be accessible either without authentication, or (if you specify credentials in the URLs) they must support unsolicited HTTP basic authentication, because the interface does not handle HTTP 401 retries.
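A client could assemble such a request URL along the following lines. This is a minimal Python sketch: the helper name is hypothetical, and percent-encoding the source URLs assumes that the server decodes the query string in the usual way.

```python
from urllib.parse import quote, urlsplit

# Protocols listed as supported in the text above
SUPPORTED_PROTOCOLS = {"http", "ftp", "file"}


def build_fetch_request_url(server, controller, resource, source_urls):
    """Build the request URL for importing one or more remote source files."""
    if not source_urls:
        raise ValueError("at least one source URL is required")
    encoded = []
    for url in source_urls:
        scheme = urlsplit(url).scheme
        if scheme not in SUPPORTED_PROTOCOLS:
            raise ValueError(f"unsupported protocol: {scheme!r}")
        # Percent-encode so that "?" and "&" inside a source URL are not
        # read as part of the outer query string (assumption, see lead-in)
        encoded.append(quote(url, safe=""))
    # Multiple sources are joined with commas
    return f"http://{server}/{controller}/{resource}.xml?fetchurl={','.join(encoded)}"


url = build_fetch_request_url(
    "example.org", "pr", "person",
    ["http://a.example/x.xml", "ftp://b.example/y.xml"],
)
```

The resulting URL would then be sent with PUT, as in the examples above.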

Request Attachments

Source files can also be attached to a multipart-request. In this case the file extension of the source file must match the request URL file extension. Multiple files can be attached.
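Before sending a multipart request, a client may want to check the extension rule itself. The following Python sketch does so; the function name is hypothetical, and comparing extensions case-insensitively is an assumption of this example.

```python
import posixpath
from urllib.parse import urlsplit


def extension_matches(request_url, attachment_name):
    """Check that an attachment has the same file extension as the request URL."""
    # Extension of the request URL path, e.g. ".xml" (query string ignored)
    url_ext = posixpath.splitext(urlsplit(request_url).path)[1].lower()
    # Extension of the attached source file
    file_ext = posixpath.splitext(attachment_name)[1].lower()
    return bool(url_ext) and url_ext == file_ext


ok = extension_matches("http://example.org/pr/person.xml", "people.xml")
mismatch = extension_matches("http://example.org/pr/person.xml", "people.csv")
```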

Multiple Sources

Where multiple sources are specified or attached, they are first converted and transformed one-by-one and then combined into a single element tree before import.

Duplicate Resolution

The S3XML Importer does not handle duplicates within the same source. As the order of elements in the resulting element tree is not defined, and the last update time attribute is not mandatory in source elements, there is no predictable rule of precedence.

Records in the source must not be split into fragments; each record must be submitted as a single element. The Importer does not merge fragments of a record, and which of the fragments would finally be imported is not predictable.
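Since the Importer does not resolve duplicates within a source, a client may want to detect them before submitting. The following Python sketch reports records that appear more than once in a source tree. The function name is hypothetical, and it identifies records by uuid or, failing that, by tuid.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_keys(source_xml):
    """Return (table name, key) pairs that occur more than once in the source."""
    root = ET.fromstring(source_xml)
    counts = Counter()
    for resource in root.iter("resource"):
        # Identify the record by its uuid, or by its tuid if there is no uuid
        key = resource.get("uuid") or resource.get("tuid")
        if key:
            counts[(resource.get("name", ""), key)] += 1
    return sorted(pair for pair, n in counts.items() if n > 1)


source = (
    '<s3xml>'
    '<resource name="xxx_yyy" uuid="a"/>'
    '<resource name="xxx_yyy" uuid="a"/>'
    '<resource name="xxx_yyy" uuid="b"/>'
    '</s3xml>'
)
duplicates = duplicate_keys(source)
```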

Source elements using unique keys are automatically matched with existing records. Where the match is ambiguous (e.g. a set of keys matching multiple existing records), the import element will be rejected as invalid. For certain resources, the server may have additional duplicate finders and resolvers configured. How these resolvers handle duplicates can differ from resource to resource.

The duplicate resolution strategy in standard import mode is to update the existing record with the values from the source record. In synchronization mode the default strategy is to accept/keep the newest data (the last update time attribute is mandatory in this case).
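The two default strategies can be summarized in a small decision function. This is only an illustrative Python sketch, not the Importer's code: the function name is hypothetical, and keeping the existing record when both timestamps are equal is an assumption.

```python
from datetime import datetime, timezone


def should_update(existing_modified_on, incoming_modified_on, synchronization=False):
    """Decide whether a matched existing record is updated from the source."""
    if not synchronization:
        # Standard import mode: always update with the source values
        return True
    if incoming_modified_on is None:
        # The last update time is mandatory in synchronization mode
        raise ValueError("modified_on is mandatory in synchronization mode")
    # Synchronization mode: keep whichever data are newest
    # (assumption: on a tie, the existing record is kept)
    return incoming_modified_on > existing_modified_on


older = datetime(2011, 1, 1, tzinfo=timezone.utc)
newer = datetime(2011, 1, 2, tzinfo=timezone.utc)
```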

XML Format

Document Types and Structure

S3XML defines 3 types of documents:

Document Type           | Description
Schema Documents        | describe the data schema for a resource
Field Options Documents | describe the currently acceptable options for fields in a record
Data Documents          | provide the current contents (data) of resources

Schema Documents

Schema documents describe the data schema for a resource. Clients can use these documents e.g. for automatic generation of forms.

Schema documents can be retrieved from Sahana Eden by sending an empty GET request (i.e. without source) to the create.xml method of a resource, e.g.:

The URL query parameter ?options=true adds a list of field options to those fields where options are defined. Combined with the parameter &reference=true, options for foreign key references are included as well.

The URL query parameter ?meta=true includes the meta fields (as <meta> elements). In data documents, the meta fields appear as attributes of the <resource> element.
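The query parameters described above could be combined by a client as in the following Python sketch. The helper name is hypothetical, and the URL layout (http://<server>/<controller>/<resource>/create.xml) is an assumption modelled on the request pattern used elsewhere on this page.

```python
from urllib.parse import urlencode


def schema_document_url(server, controller, resource,
                        options=False, reference=False, meta=False):
    """Build a URL for requesting the schema document of a resource."""
    params = []
    if options:
        params.append(("options", "true"))
        if reference:
            # Only meaningful in combination with options=true
            params.append(("reference", "true"))
    if meta:
        params.append(("meta", "true"))
    # Assumed URL layout, see the lead-in
    url = f"http://{server}/{controller}/{resource}/create.xml"
    return f"{url}?{urlencode(params)}" if params else url


schema_url = schema_document_url("example.org", "pr", "person",
                                 options=True, reference=True)
```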

Field Options Documents

Field options documents describe the currently acceptable options for fields in a record. Clients can use these documents e.g. for automatic generation and/or client-side validation of forms.

Field options documents can be requested from Sahana Eden by sending a GET request to the options.xml method of a resource, e.g.:

Foreign key references of component records to their primary record will not be exported, and where they appear in import sources, they will be ignored.

Components of components are not allowed (maximum depth 1), and where they appear in import sources, they will be ignored.

References

Foreign key references (except those linking components to their primary record) are represented by <reference> elements.

Foreign keys can be importable UIDs (uuid attribute), which are imported and also used to find and/or link to existing records in the database, or temporary UIDs (tuid attribute), which are not imported but only used to find records within the current tree. If a <resource> element with a matching UID key attribute is found in the same tree, it will be imported automatically.

References inside referenced elements will be resolved (unlimited depth) and also be imported. Circular references will be detected and properly resolved.
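How a reference can be matched to a <resource> element in the same tree is sketched below in Python. This is a simplified illustration, not the Importer's algorithm: the function name and the table names in the sample are hypothetical, and multi-references and database lookups are left out.

```python
import xml.etree.ElementTree as ET


def resolve_references(source_xml):
    """Return (key, table name of the target or None) for every reference."""
    root = ET.fromstring(source_xml)
    # Index all resource elements in the tree by their uuid and tuid
    index = {}
    for resource in root.iter("resource"):
        for attr in ("uuid", "tuid"):
            key = resource.get(attr)
            if key:
                index[(attr, key)] = resource
    resolved = []
    for reference in root.iter("reference"):
        target = None
        for attr in ("uuid", "tuid"):
            key = reference.get(attr)
            if key and (attr, key) in index:
                target = index[(attr, key)]
                break
        key = reference.get("uuid") or reference.get("tuid")
        resolved.append((key, None if target is None else target.get("name")))
    return resolved


source = (
    '<s3xml>'
    '<resource name="xxx_child" tuid="c1"><reference tuid="p1"/></resource>'
    '<resource name="xxx_parent" tuid="p1"/>'
    '</s3xml>'
)
links = resolve_references(source)
```

Because each reference is looked up in a flat index rather than followed recursively, circular references do not cause endless loops in this sketch.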

Multi-references (list:reference type in web2py) use a list of UID keys separated by vertical bars (pipe characters), like uuid=|uid1|uid2|uid3|. The leading and trailing vertical bars must be present.
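The multi-reference notation can be produced and parsed with a few lines of Python. The function names below are hypothetical; the sketch only covers the attribute value itself.

```python
def encode_multi_reference(uids):
    """Join UID keys into the |uid1|uid2|...| notation."""
    for uid in uids:
        if "|" in uid:
            raise ValueError(f"UID must not contain '|': {uid!r}")
    return "|" + "|".join(uids) + "|"


def decode_multi_reference(value):
    """Split a |uid1|uid2|...| value into a list of UID keys."""
    # Both the leading and the trailing separator must be present
    if len(value) < 2 or not (value.startswith("|") and value.endswith("|")):
        raise ValueError("leading and trailing separators are required")
    inner = value[1:-1]
    return inner.split("|") if inner else []


encoded = encode_multi_reference(["uid1", "uid2", "uid3"])
decoded = decode_multi_reference(encoded)
```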

If a <resource> element is nested inside the <reference>, either or both of the UID keys can be omitted. Where both keys are used, however, they must match. Multiple embedded <resource> elements are allowed for multi-references.

<resource>

This element represents a record (in data documents) or a database table (in schema documents).

<s3xml>
<resource name="xxx_yyy">
...
</resource>
</s3xml>

Parent elements:

<s3xml>, <resource>, <reference>

Child elements:

<resource>, <data>, <field>

Contents:

empty

Attributes:

Name          | Type     | Description                                                 | Mandatory?
name          | string   | the name of the database table                              | yes
uuid          | string   | a unique identifier for the record                          | no*
tuid          | string   | a temporary unique identifier for the record                | no*
created_on    | datetime | date and time when the record was created                   | no**
modified_on   | datetime | date and time when the record was last updated              | no, default: time of the request** ***
created_by    | string   | email-address of the user who created the record            | no
modified_by   | string   | email-address of the user who last updated the record       | no
owned_by_user | string   | email-address of the user who owns the record*****          | no
owned_by_role | string   | name of the user group who collectively own the record***** | no
mci           | integer  | master-copy-index                                           | no, default: 2*** ****

(*) Records will be identified within the input file by their uuid, or, if no uuid is specified, by their tuid.

(**) as YYYY-MM-DDTHH:mm:ssZ, always UTC

(***) the last update date/time and mci are required in synchronization

(****) the master copy index specifies how often a record has been copied across sites, see below

(*****) record ownership will be retained if the record owners can be matched against existing users/user groups

The uuid will be stored in the database together with the record. If uuid is present and matches an existing record in the database, then this record will be updated. If there's no match or no uuid specified in the resource element, then the importer will create a new record in the database (and automatically generate a uuid if required).
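The create-or-update rule can be expressed as a short Python sketch. The function name is hypothetical, and generating the new identifier with uuid4 is an assumption, since the text does not say how the Importer generates UUIDs.

```python
import uuid as uuidlib


def plan_import(resource_uuid, existing_uuids):
    """Return ("update" or "create", uuid to store) for a resource element."""
    if resource_uuid and resource_uuid in existing_uuids:
        # The uuid matches an existing record: update that record
        return ("update", resource_uuid)
    # No match or no uuid given: create a new record, generating a uuid
    # where none was supplied (assumption: a random UUID)
    return ("create", resource_uuid or str(uuidlib.uuid4()))


existing = {"uid-1", "uid-2"}
update_plan = plan_import("uid-1", existing)
create_plan = plan_import("uid-9", existing)
generated_plan = plan_import(None, existing)
```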

The mci - master-copy-index - indicates how often this record has been copied across sites:

when importing a new record the mci value is always *imported* as-is from the source

when updating a record, the mci of the database record remains unchanged

the mci of a record is *exported* as its current database value + 1.

the repository first creating a record sets mci=0 in the database record, which appears as mci=1 in the exported XML.

a copying site then imports mci=1 into its database, which appears as mci=2 in its export XML, and so forth...

The mci can be used to filter records by whether or not they originated at a repository. If there's a fixed set of synchronization paths between a number of Sahana Eden instances, the mci can be used for conflict resolution. If the mci is not specified, it defaults to 2.
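The mci rules can be traced with a small Python sketch that follows one record from its origin to a copying site. The function names are hypothetical; the values follow the rules listed above.

```python
DEFAULT_MCI = 2  # used when the source element carries no mci


def export_mci(db_mci):
    """The mci is exported as the current database value + 1."""
    return db_mci + 1


def import_mci(source_mci=None, existing_db_mci=None):
    """Return the mci stored in the database after importing an element."""
    if existing_db_mci is not None:
        # Update of an existing record: the database value stays unchanged
        return existing_db_mci
    # New record: take the source value as-is, or the default if missing
    return DEFAULT_MCI if source_mci is None else source_mci


origin_db = 0                          # set by the repository creating the record
origin_xml = export_mci(origin_db)     # mci in the origin's export
copy_db = import_mci(origin_xml)       # stored at the copying site
copy_xml = export_mci(copy_db)         # mci in the copying site's export
```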

(*) If the field is for file upload, a url attribute should be provided to specify the location of the file. The importer will try to download and store the file (file transfer) from that URL (pull). It is also possible to send the file together with the HTTP request; in this case the filename must be specified instead of the url (push). The push variant for uploads is meant for peers which do not support pulling for some reason (e.g. mobile phones). Normal servers would always provide a URL for download in order to allow the consuming site to decide which files to download and when (saves bandwidth).

The text node in the data element provides a human-readable representation of the field value.

The value attribute contains a JSON representation of the field value, retaining the original data type (i.e. strings must be double-quoted) except for date, time and datetime values, which are to be represented as simple strings in the respective standard format (no double quotes). The standard format for datetime values is YYYY-MM-ddTHH:mm:ssZ (ISO format, UTC), date shall be represented as YYYY-MM-dd, and time as HH:mm:ss.
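The encoding rule for the value attribute can be sketched in Python as follows. The function name is hypothetical, and treating timezone-naive datetimes as already being in UTC is an assumption of this example.

```python
import json
from datetime import date, datetime, time, timezone


def encode_value(value):
    """Encode a field value for the value attribute of a data element."""
    # datetime must be checked before date, since it is a subclass of date
    if isinstance(value, datetime):
        if value.tzinfo is not None:
            value = value.astimezone(timezone.utc)
        # Assumption: naive datetimes are already UTC
        return value.strftime("%Y-%m-%dT%H:%M:%SZ")
    if isinstance(value, date):
        return value.strftime("%Y-%m-%d")
    if isinstance(value, time):
        return value.strftime("%H:%M:%S")
    # All other types keep their JSON representation (strings double-quoted)
    return json.dumps(value)


text_value = encode_value("abc")
number_value = encode_value(5)
datetime_value = encode_value(datetime(2011, 3, 4, 5, 6, 7, tzinfo=timezone.utc))
```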

data elements representing passwords can contain the clear text password in the value attribute, or the encrypted password in the text node. Where a clear text password is given as value attribute, it will be stored encrypted, otherwise the password will be stored as-is. Note that clear-text representation of passwords will be accepted by the interface, but will never be exported.