CMIS

CMIS (Content Management Interoperability Services) is the OASIS specification for content management interoperability. It allows clients and servers to communicate over HTTP (REST with JSON or AtomPub) using a unified domain model. The latest published version is CMIS 1.1.

Nuxeo supports CMIS through the following modules:

The Apache Chemistry OpenCMIS library (an Apache project to which Nuxeo is a contributor), which is a general-purpose Java library allowing developers to easily write CMIS clients and servers,

Specific Nuxeo OpenCMIS connector bundles, allowing the Nuxeo Platform to be used as a CMIS server with the help of OpenCMIS. The CMIS connector is included in the Nuxeo Platform by default.

Usage

The following documentation uses http://localhost:8080/nuxeo as the URL of the Nuxeo server but you can replace it with http://NUXEO_SERVER/nuxeo if you have another instance available.

You can access the different services from the following URLs:

Browser Binding root URL: http://localhost:8080/nuxeo/json/cmis

AtomPub service document: http://localhost:8080/nuxeo/atom/cmis

JSON

The Browser Binding (JSON) endpoint is recommended, as it is faster and has more features than the other endpoints.

You can use a CMIS 1.1 Browser Binding (JSON) client and point it at http://localhost:8080/nuxeo/json/cmis.

To check the returned JSON from the command line, request this URL with curl or wget, supplying your Nuxeo credentials.

The tidy command below formats XML, so it applies to the XML returned by the AtomPub endpoint. Pipe that output through it if you want a readable result:

... | tidy -q -xml -indent -wrap 999
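For a scripted alternative to the command-line check, the following is a minimal sketch that requests the Browser Binding root URL and re-indents the JSON it returns. It uses only the Python standard library. The function names and the credentials passed to them are illustrative assumptions, not part of the Nuxeo documentation, and the sketch assumes the server accepts HTTP Basic authentication.

```python
import base64
import json
import urllib.request

# Root URL taken from the text above.
BROWSER_BINDING_URL = "http://localhost:8080/nuxeo/json/cmis"


def build_request(url, user, password):
    """Return a GET request carrying an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def pretty(json_text):
    """Re-indent a JSON document so it is readable in a terminal."""
    return json.dumps(json.loads(json_text), indent=2, sort_keys=True)


def fetch_root(user, password, url=BROWSER_BINDING_URL):
    """Fetch the endpoint (requires a running server) and return readable JSON."""
    with urllib.request.urlopen(build_request(url, user, password)) as response:
        return pretty(response.read().decode("utf-8"))
```

Against a running server, `print(fetch_root("myUser", "myPassword"))` would print the repository description, with the placeholder credentials replaced by real ones.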

Notes

The searchAllVersions=true parameter is mandatory if you want results equivalent to what you see in Nuxeo (which often contains mostly private working copies).

To fetch custom metadata, you must restrict the selection to document types that contain the metadata. For example, if you have a metadata property "custom" in a document type "mytype", then your query must select from mytype rather than from a generic type such as cmis:document.
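As an illustration of both notes, here is a minimal sketch that builds a Browser Binding query URL for the "custom" property of "mytype" with searchAllVersions=true. The `cmisselector=query` and `q` parameters come from the CMIS 1.1 Browser Binding. The repository URL below assumes a repository named default and should normally be read from the JSON returned by the root URL. The helper name and the exact statement are assumptions made for this example.

```python
from urllib.parse import urlencode

# Assumed repository URL; in practice, read it from the root endpoint's response.
REPOSITORY_URL = "http://localhost:8080/nuxeo/json/cmis/default"

# Selecting FROM mytype (not cmis:document) makes the "custom" property available.
STATEMENT = "SELECT cmis:objectId, custom FROM mytype"


def build_query_url(repository_url, statement, search_all_versions=True):
    """Return a GET URL that runs a CMISQL statement through the Browser Binding."""
    params = {
        "cmisselector": "query",
        "q": statement,
        "searchAllVersions": "true" if search_all_versions else "false",
    }
    # urlencode percent-encodes the statement (spaces, colons, commas).
    return f"{repository_url}?{urlencode(params)}"


query_url = build_query_url(REPOSITORY_URL, STATEMENT)
print(query_url)
```

The resulting URL can be requested with any HTTP client that sends valid Nuxeo credentials.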

From Java Code Within a Nuxeo Component

To create, delete or modify documents, folders and relations, use the regular Nuxeo CoreSession API. To perform CMISQL queries (for instance to perform JOINs, which are not supported by the default NXQL query language), see the page Using CMISQL from Java.

Capabilities

The Nuxeo OpenCMIS connector implements the following capabilities from the specification:

Navigation Capabilities

Get descendants supported: Yes

Get folder tree supported: Yes

Order By supported: Custom

Object Capabilities

Content stream updates: PWC only

Changes: Object IDs only

Renditions: Read

Filing Capabilities

Multifiling supported: No

Unfiling supported: No

Version-specific filing supported: No

Versioning Capabilities

PWC updatable: Yes

PWC searchable: Yes

All versions searchable: Yes

Query Capabilities

Query: Both combined

Joins: None (inner and outer if org.nuxeo.cmis.joins=true)

Type Capabilities

Create property types: No

New type settable attributes: None

ACL Capabilities

ACLs: Manage

ACLs propagation: Propagate

Supported permissions: Repository

Model Mapping

The following describes how Nuxeo documents are mapped to CMIS objects and vice versa.

Only Nuxeo documents including the "dublincore" schema are visible in CMIS.

Complex properties are not visible in CMIS by default, as this notion does not exist in CMIS. However, if the server is configured to do so, they can be exposed as JSON-encoded strings (since Nuxeo 7.1, see NXP-14474).

Proxy documents are visible in CMIS if the system property org.nuxeo.cmis.proxies is set to true. This property is available since Nuxeo 8.3 and Nuxeo 7.10-HF08. It defaults to true since Nuxeo 9.1 and to false in earlier versions (see NXP-17313 and NXP-21828).

Secondary content streams are not visible as renditions. Only the Nuxeo thumbnail and renditions explicitly made available through the Nuxeo RenditionService are visible.

Documents in the Nuxeo trash (those whose nuxeo:isTrashed is true) are not visible in CMIS, unless an explicit query using the nuxeo:isTrashed property is done.

This mapping may change to be more comprehensive in future Nuxeo Platform versions.
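The trash rule above can be illustrated with a small sketch that builds a CMISQL statement containing an explicit nuxeo:isTrashed predicate. The helper name and the selected columns are assumptions chosen for this example. Only the use of nuxeo:isTrashed in the WHERE clause comes from the text.

```python
def trash_statement(type_name="cmis:document", trashed=True):
    """Build a CMISQL statement with an explicit nuxeo:isTrashed predicate."""
    flag = "true" if trashed else "false"
    return (
        "SELECT cmis:objectId, cmis:name, nuxeo:isTrashed "
        f"FROM {type_name} WHERE nuxeo:isTrashed = {flag}"
    )


# Without the predicate, trashed documents are filtered out of the results.
print(trash_statement())
```

The statement can then be submitted through any of the CMIS bindings.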

Nuxeo-Specific System Properties

In addition to the system properties defined in the CMIS specification under the cmis: prefix, the Nuxeo Platform adds some additional properties under the nuxeo: prefix:

nuxeo:isTrashed: To access the trashed state of a document. By default, only non-trashed documents are returned in CMISQL queries, unless an explicit nuxeo:isTrashed predicate is specified in the WHERE clause of the query.

nuxeo:isVersion: To distinguish between archived (read-only revision) and live documents (that can be edited).

nuxeo:lifecycleState: To access the lifecycle state of a document.

nuxeo:secondaryObjectTypeIds: Makes it possible to access the facets of a document. Those facets can be static (as defined in the type definitions) or dynamic (each document instance can have declared facets).

nuxeo:contentStreamDigest: The low-level MD5 or SHA1 digest of blobs stored in the repository. The algorithm used to compute the digest depends on the configuration of the BinaryManager component of the Nuxeo repository.

nuxeo:isCheckedIn: For live documents, distinguishes between the checked-in and checked-out state.

nuxeo:parentId: Like cmis:parentId but also available on Document objects (which is possible because the Nuxeo Platform does not have direct multi-filing).

nuxeo:pathSegment: The last path segment of the document (ecm:name in NXQL).

nuxeo:pos: The position of an object in its containing folder, if that folder is ordered, or null otherwise.

All these properties can be used as regular CMIS properties and in a CMISQL query (in a SELECT, WHERE or ORDER BY clause where relevant), except for nuxeo:contentStreamDigest which can only be read in query results or by introspecting the properties of the ObjectData representation of a document.
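Once nuxeo:contentStreamDigest has been read from a query result, one possible use is to check a local copy of the content against it. The sketch below assumes the property holds a hex-encoded digest of the raw blob bytes and that you know which algorithm (MD5 or SHA1) the BinaryManager is configured with. Both function names are hypothetical.

```python
import hashlib


def blob_digest(content, algorithm="md5"):
    """Return the hex digest of raw blob bytes using MD5 or SHA1."""
    if algorithm not in ("md5", "sha1"):
        raise ValueError(f"unsupported algorithm: {algorithm}")
    return hashlib.new(algorithm, content).hexdigest()


def matches_server_digest(content, server_digest, algorithm="md5"):
    """Compare a local digest with the value read from nuxeo:contentStreamDigest."""
    return blob_digest(content, algorithm) == server_digest.lower()


# Example with an in-memory blob instead of a downloaded content stream.
print(matches_server_digest(b"hello", "5d41402abc4b2a76b9719d911017c592"))
```

If the comparison fails, check first that the algorithm passed to the helper matches the repository configuration.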

Use Cases

Document Capture Integration with Ephesoft

Ephesoft is a document capture and data extraction solution that automatically classifies documents of any type and extracts data from them. Ephesoft has a CMIS interface, which eases integration with Nuxeo.

Ephesoft has a CMIS import plugin and a CMIS export plugin so that it can ingest documents stored in Nuxeo to extract information and send back the extraction results to Nuxeo.

CMIS Import

Ephesoft monitors a specified folder for new files (as a hot folder) using a cron job, and processes any new document in an Ephesoft batch. Ephesoft uses a "technical" Nuxeo property to tag the document as processed, so that the same document is not processed twice (for example, a custom property called invoice:status changes from To process to Processed).

In the picture above:

Parameter | Value | Description
Server URL | http://localhost:8080/nuxeo/atom/cmis | Nuxeo CMIS URL
Username | ephesoft | Username of the technical account used to connect Ephesoft to Nuxeo. This user needs WRITE permission on the documents.
Password | mySecretPassword | Password of the technical account
Repository Id | default | Generally default. You can read it from the file downloaded when you open the CMIS server URL in a web browser.
File Extension | pdf;tif | Cannot be changed
Folder | default/domain/workspaces/folder1 | Path of the hot folder. The initial / should not be written.
Property | invoice:status | Property used by Ephesoft to check which documents have been processed
Value | To process | Each document with invoice:status=To process will be sent to Ephesoft
New Value | Processed | When Ephesoft processes a document, it updates invoice:status to Processed
CMIS Version | 1.1 | Version of the CMIS implementation
Enabled | true | Activates the CMIS import

Don't forget to activate the CMIS import by uncommenting the <import resource="classpath:/META-INF/applicationContext-dcma-mail-import.xml" /> line of the applicationContext.xml file as the CMIS import is disabled by default.

With the standard CMIS import addon, you can't keep the original Nuxeo property values from the input to the output. One possible workaround is to check if the binary you're processing has any file property (contained for example in the EXIF or IPTC format) that you can use to perform a synchronization process. The best practice is to use the CMIS REST API instead of the CMIS Import addon to get more flexibility.
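To make the tagging mechanism described at the start of this section concrete, here is a minimal in-memory sketch of one polling cycle. It is not Ephesoft's implementation. The documents are plain dictionaries standing in for the contents of the hot folder, and the function name and `extract` callback are hypothetical. Only the property name and its two values come from the example configuration.

```python
STATUS_PROPERTY = "invoice:status"
TO_PROCESS = "To process"
PROCESSED = "Processed"


def run_import_cycle(documents, extract):
    """Process every document still tagged TO_PROCESS, then retag it.

    Returns the names of the documents handled in this cycle.
    """
    handled = []
    for document in documents:
        if document.get(STATUS_PROPERTY) != TO_PROCESS:
            continue  # already processed, or never flagged for capture
        extract(document)
        document[STATUS_PROPERTY] = PROCESSED
        handled.append(document["name"])
    return handled


# In-memory stand-ins for the contents of the hot folder.
folder = [
    {"name": "invoice-1.pdf", STATUS_PROPERTY: TO_PROCESS},
    {"name": "invoice-2.tif", STATUS_PROPERTY: PROCESSED},
]
first = run_import_cycle(folder, extract=lambda doc: None)
second = run_import_cycle(folder, extract=lambda doc: None)
print(first, second)
```

Because the status is updated as part of each cycle, a second cycle over the same folder finds nothing left to handle, which is the purpose of the technical property.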

CMIS Export

When a document is processed in Ephesoft, it means the platform has classified the document and extracted its information. Instead of exporting the binary files to a filesystem folder, along with their XML documents (corresponding to the field properties), you can export them into Nuxeo with the CMIS export addon. The configuration takes two steps:

Activate the CMIS Export in your Ephesoft batch class modules

Map the Ephesoft property fields to the Nuxeo property fields in the DLF-attributes-mapping.properties file