Pluggable Storage

Considerable thought, discussion, design, and prototyping have focused on what a DSpace 2.0 asset store should look like, and in particular on whether and how such an asset store should model archival information packages (AIPs) more substantially than the current architecture does. This model is conceived as a DSpace asset store API/interface. One big advantage of such an interface is the ease of writing implementations for different storage systems (file-based, grid, etc.) and plugging them into DSpace. Note, however, that this benefit comes simply from having an interface, not from having an AIP-aware one in particular.

Modeling an AIP asset store is very important (and hard), and consensus has proven difficult to achieve. As a result, the benefits of pluggable storage for DSpace have been held hostage to agreement on an AIP model. The work this page describes attempts to circumvent this problem by refactoring the existing asset storage system to drive a very thin API wedge between the storage manager and the actual storage back-end. It conspicuously does not attempt to model an AIP, only the low-level storage primitives. If successful, it should make attaching different storage solutions to DSpace far easier than it is today. It should also feed the AIP asset store design process by providing insights into the powers and limits of different storage systems.

I call this API a BitStore to emphasize that it is not an AIP model, and have provided a refactored BitstreamStorageManager that uses this interface in place of the direct calls it previously made into the file store or SRB store. In addition, I have provided three implementations of the interface:

DSBitStore - this is simply the current DSpace file system store.

SRBBitStore - the existing SRB store.

S3BitStore - an asset store using Amazon's Simple Storage Service. NB: this is a commercial (not free) service.
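
To make the idea of a thin wedge concrete, here is a minimal Java sketch. It is not the prototype's actual interface: the method names (put, get, remove), the InMemoryBitStore back-end, and the StorageManagerSketch class are all hypothetical, chosen only to show how a storage manager could depend on low-level primitives alone and know nothing about the back-end behind them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interface: names and signatures are illustrative only,
// not the API of the prototype described on this page.
interface BitStore {
    void put(String id, InputStream in) throws IOException;
    InputStream get(String id) throws IOException;
    void remove(String id) throws IOException;
}

// Toy back-end that keeps bytes in memory; a real one might wrap a
// file system, SRB, or S3 behind the same three calls.
class InMemoryBitStore implements BitStore {
    private final Map<String, byte[]> blobs = new ConcurrentHashMap<>();

    @Override
    public void put(String id, InputStream in) throws IOException {
        blobs.put(id, StorageManagerSketch.readAll(in));
    }

    @Override
    public InputStream get(String id) throws IOException {
        byte[] data = blobs.get(id);
        if (data == null) {
            throw new IOException("No such bitstream: " + id);
        }
        return new ByteArrayInputStream(data);
    }

    @Override
    public void remove(String id) {
        blobs.remove(id);
    }
}

// Stand-in for the storage manager: it talks only to the interface,
// so swapping the back-end requires no change here.
class StorageManagerSketch {
    private final BitStore store;

    StorageManagerSketch(BitStore store) {
        this.store = store;
    }

    // Stores the content under a freshly generated id and returns that id.
    String store(byte[] content) throws IOException {
        String id = UUID.randomUUID().toString();
        store.put(id, new ByteArrayInputStream(content));
        return id;
    }

    byte[] retrieve(String id) throws IOException {
        try (InputStream in = store.get(id)) {
            return readAll(in);
        }
    }

    void delete(String id) throws IOException {
        store.remove(id);
    }

    // Drains a stream into a byte array.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }
}
```

In this sketch, constructing the manager with a different BitStore implementation is the only change needed to move to another storage system.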

Another advantage of this approach is modularity: we will no longer have to include all the code and required libraries for (e.g.) Storage Resource Broker unless we actually want to use it. These storage modules also present some new use-cases for the Add-on mechanism work, since they are optional modules, but not separate applications.
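
One way such modularity could work is to name the store implementation in configuration and load it reflectively, so that only the chosen module and its libraries need to be on the classpath. The following is a sketch under assumptions, not the prototype's mechanism: the StoreBackend interface, the FileBackend class, the StoreLoader helper, and the `assetstore.class` property key are all hypothetical.

```java
import java.util.Properties;

// Hypothetical minimal store contract, reduced to one method for brevity.
interface StoreBackend {
    String name();
}

// Hypothetical back-end; an SRB or S3 module would live in its own jar
// and only be needed if the configuration names it.
class FileBackend implements StoreBackend {
    @Override
    public String name() {
        return "file";
    }
}

final class StoreLoader {
    private StoreLoader() {
    }

    // Instantiates the class named by the (hypothetical) "assetstore.class"
    // property. Throws ClassCastException if that class is not a StoreBackend.
    static StoreBackend load(Properties config) throws ReflectiveOperationException {
        String className = config.getProperty("assetstore.class");
        if (className == null) {
            throw new IllegalArgumentException("assetstore.class is not set");
        }
        Class<?> cls = Class.forName(className);
        return cls.asSubclass(StoreBackend.class)
                  .getDeclaredConstructor()
                  .newInstance();
    }
}
```

Because the class is resolved by name at run time, nothing in the core code refers to an optional back-end directly.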

Detailed notes on using each store will follow. Note that this is prototype code, not production quality. Feedback is welcome, including suggestions for other possible store implementations (e.g. an RDBMS store). (See the Discussion page for some thoughts on an RDBMS implementation.)