A content repository is a server or a set of services used to store, search, access and control content. It provides these services to specialist content applications such as document management, web content management, image storage and retrieval, records management, and other applications that must store and retrieve large amounts of content. The services include content storage or import, content classification, security on content objects, control through content check-in and check-out, and content query. Major vendors of content repositories include Documentum, IBM, FileNet, Interwoven, Vignette, OpenText, and Microsoft through its SharePoint system.

Content repositories have been around for at least 15 years and are generally built on relational databases. Metadata about the content, such as descriptive information, process information, security, classification and relationships between content objects, is stored in the database for quick and flexible retrieval and to simplify the many ways in which this information might be used. The actual content can be as simple as HTML and pictures for a website, but in an enterprise it is more often office documents, scanned images, XML and streaming media. Content may be stored in the database as binary large objects, for large content such as complex office documents, images or rich media, or it may be stored in files to simplify storage management, streaming of rich media, and content transformations. Databases provide the transaction control and recoverability required when adding, updating or deleting this information. What distinguishes content management from other typical database applications is the level of control exercised over individual content objects and the ability to search content.
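The split between metadata, content bodies and check-in/check-out control described above can be illustrated with a minimal sketch. It assumes an in-memory SQLite database standing in for the relational store; the class name `TinyRepository`, the exception `CheckoutError`, and the table layout are hypothetical and do not come from any vendor's product.

```python
import sqlite3


class CheckoutError(Exception):
    """Raised when a check-out or check-in violates the lock rules."""


class TinyRepository:
    """Hypothetical toy repository: metadata and BLOB versions in one database."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.executescript("""
            CREATE TABLE objects (
                id INTEGER PRIMARY KEY,
                name TEXT NOT NULL,
                content_type TEXT NOT NULL,
                locked_by TEXT              -- NULL means not checked out
            );
            CREATE TABLE versions (
                object_id INTEGER NOT NULL REFERENCES objects(id),
                version INTEGER NOT NULL,
                body BLOB NOT NULL,
                PRIMARY KEY (object_id, version)
            );
        """)

    def import_content(self, name, content_type, body):
        # "with self.db" commits on success and rolls back on an exception,
        # so the metadata row and the first version are stored atomically.
        with self.db:
            cur = self.db.execute(
                "INSERT INTO objects (name, content_type) VALUES (?, ?)",
                (name, content_type))
            self.db.execute(
                "INSERT INTO versions VALUES (?, 1, ?)", (cur.lastrowid, body))
        return cur.lastrowid

    def latest(self, object_id):
        # Returns (version, body) for the newest version, or None.
        return self.db.execute(
            "SELECT version, body FROM versions WHERE object_id = ? "
            "ORDER BY version DESC LIMIT 1", (object_id,)).fetchone()

    def checkout(self, object_id, user):
        with self.db:
            # The lock is taken only if nobody else currently holds it.
            cur = self.db.execute(
                "UPDATE objects SET locked_by = ? "
                "WHERE id = ? AND locked_by IS NULL", (user, object_id))
            if cur.rowcount != 1:
                raise CheckoutError("object is missing or already checked out")
        return self.latest(object_id)[1]

    def checkin(self, object_id, user, body):
        with self.db:
            row = self.db.execute(
                "SELECT locked_by FROM objects WHERE id = ?",
                (object_id,)).fetchone()
            if row is None or row[0] != user:
                raise CheckoutError("object is not checked out by this user")
            version = self.latest(object_id)[0] + 1
            self.db.execute(
                "INSERT INTO versions VALUES (?, ?, ?)",
                (object_id, version, body))
            self.db.execute(
                "UPDATE objects SET locked_by = NULL WHERE id = ?",
                (object_id,))
        return version
```

A real repository would add classification, security and query services on top of this, and might keep large bodies in files rather than BLOBs; the sketch only shows how the database's transactions keep the lock, the metadata and the new version consistent.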
Access to these services requires wrapping the calls in security to prevent unauthorized access or changes to content or its metadata. This security is finer-grained than SQL security and has complex relationships to other objects such as people and folders, so it requires a more sophisticated mechanism than SQL security provides. The hierarchical way in which content is used, found, controlled and accessed, such as working in folder structures and hierarchical classifications, requires a different type of API than even SQL:2003 currently provides. The search paradigm introduced by internet search engines such as Google means that the highly structured SQL SELECT statement is not adequate for end-user requirements, so most repositories add an external full-text search engine to supplement database queries of structured metadata. As a result of these complex requirements, the business logic of a content repository can be as large as or larger than the database itself.

Almost all content repository vendors provide proprietary service interfaces and APIs to encapsulate the breadth of functionality required. Despite attempts over the last 10 years to standardize these APIs, progress has been made only in the last two years. In 2005, the Java community adopted the JSR-170 standard interface, although only a couple of the major vendors have adopted it. The AIIM iECM effort has only just begun, although it has widespread participation from all the major vendors.
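The folder-based security and the combination of structured metadata queries with full-text search described above can be sketched in a few lines. This is an illustration under stated assumptions, not any vendor's mechanism: the `Node` class, the rule that an object without its own ACL inherits from the nearest ancestor folder, and the naive word matching that stands in for an external full-text engine are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical folder or document in a repository hierarchy."""
    name: str
    parent: "Node | None" = None
    acl: dict = field(default_factory=dict)       # principal -> set of permissions
    metadata: dict = field(default_factory=dict)  # structured, database-style fields
    text: str = ""                                # body handed to the full-text engine


def effective_permissions(node, principals):
    """Use the nearest ACL found while walking up the folder tree (assumed rule)."""
    while node is not None:
        if node.acl:
            perms = set()
            for principal in principals:
                perms |= node.acl.get(principal, set())
            return perms
        node = node.parent
    return set()


def search(documents, principals, metadata_filter, words):
    """Return names of readable documents matching both metadata and text."""
    hits = []
    for doc in documents:
        # Security is applied per object, so nothing unreadable leaks into results.
        if "read" not in effective_permissions(doc, principals):
            continue
        # Structured part: exact matches on metadata, as a SQL WHERE clause would do.
        if any(doc.metadata.get(k) != v for k, v in metadata_filter.items()):
            continue
        # Full-text part: naive word matching stands in for an external engine.
        tokens = set(doc.text.lower().split())
        if all(word.lower() in tokens for word in words):
            hits.append(doc.name)
    return hits
```

For example, a document in a folder whose ACL grants access only to an HR group is invisible to a search run by ordinary staff, even though the root folder grants staff read access, because the nearer ACL takes precedence in this sketch.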

17/06/12

WebDAV: Web-based Distributed Authoring and Versioning
- IETF standards for locking, browsing and authoring
- Extensions for version control, security and searching
- Most vendors support basic capabilities
JSR-170: Java Content Repository (JCR) API
- Java API for accessing any repository
- An object-database model independent of data model
- Most implementations are open source, few commercial
JSR-283: Next generation of JCR
- A lot of vendor participation
- Covers a lot of weaknesses of JSR-170
iECM: Interoperable Enterprise Content Management
- Active participation of Documentum, IBM, FileNet, Microsoft
- Currently moving from use cases to specification
Where is the SQL of Enterprise Content Management?
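As an illustration of the WebDAV locking capability listed above, the following sketch builds the pieces of a `LOCK` request as defined in the WebDAV specification (RFC 4918). The helper name `build_lock_request`, the example path and the owner address are hypothetical; the sketch only constructs the request and does not contact a server.

```python
import xml.etree.ElementTree as ET

DAV = "DAV:"  # XML namespace used by WebDAV elements


def build_lock_request(path, owner_href, timeout_seconds=600):
    """Return (method, path, headers, body) for an exclusive write lock."""
    ET.register_namespace("D", DAV)
    lockinfo = ET.Element(f"{{{DAV}}}lockinfo")
    scope = ET.SubElement(lockinfo, f"{{{DAV}}}lockscope")
    ET.SubElement(scope, f"{{{DAV}}}exclusive")
    locktype = ET.SubElement(lockinfo, f"{{{DAV}}}locktype")
    ET.SubElement(locktype, f"{{{DAV}}}write")
    owner = ET.SubElement(lockinfo, f"{{{DAV}}}owner")
    ET.SubElement(owner, f"{{{DAV}}}href").text = owner_href
    body = ET.tostring(lockinfo, encoding="utf-8", xml_declaration=True)
    headers = {
        "Content-Type": 'application/xml; charset="utf-8"',
        "Timeout": f"Second-{timeout_seconds}",  # requested lock lifetime
        "Depth": "0",                            # lock this resource only
    }
    return "LOCK", path, headers, body
```

The returned values could be passed to `http.client.HTTPConnection.request(method, path, body=body, headers=headers)` against a WebDAV-capable repository; a server that grants the lock replies with a lock token that the client then presents when it writes the document back.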

Open source is now acceptable in F1000
- 73% of F1000 are using or will use open source
- Costs of distribution and support are dropping dramatically
ECM is recognized as “must have”, but expensive
- Regulation and information explosion
- Fastest growing category of enterprise software
ECM vendors are alienating customers & channels
- Enterprise software model is broken
- Pressure on customers to upgrade
- Vendors are taking margin away from partners by selling their own services
New technologies available via open source
- Open source provides better infrastructure for free
A unique window of opportunity
- Next in the stack after OS (Linux), DBMS (MySQL) and App Server (JBoss)

Ease of Use
- Use Editor, Application, Portal of Choice
- Use Search of Choice: Google
- CIFS, FTP, WebDAV
Secure
- User Independent of Client
- No Way to Bypass
Developer Productivity
- One Model for all Clients
- ECM: Document Lifecycle, Version, Audit, Compliance

Scale in Information
- Complex search, structure & classification of information
- Change control and dependency can tax RDBMS
Scale in Activity
- Complex information per activity with dynamic views
- Updates with full object-level security
Scale in People
- Writing in gigabytes and terabytes
- Up to 100,000s of readers
Scale in Geography
- Sharing of information across continents
- Collaboration must be real-time

Predictions for ECM
1. ECM will standardize, commoditize and the business model will change
2. ECM will become simpler, lighter-weight, and much easier
3. ECM will deploy new technologies to scale to dynamically serve the enterprise and beyond
4. ECM will decentralize, federate and integrate with the rest of the enterprise
5. Open source will become a powerful force for change in ECM

Scale Requirements for an Enterprise Platform
- Scale in Information: Complex search, structure & classification of information
- Scale in Activity: Complex information per activity with dynamic views with full object-level security
- Scale in People: Up to 100,000s of readers and writers of gigabytes and terabytes
- Scale in Geography: Sharing of information across continents in real-time

Summary
- Time is ripe for open source in enterprise content management
- Open source brings the community into the development, support and service process
- Open source changes the sales and price dynamics of the industry
- Open source brings back the innovation process into the industry