EPrints Bazaar

EPrints includes GUI tools for creating new Bazaar packages and selecting the source files.

As part of this change, a new system-level lib/plugins directory has been added, allowing extensions to be installed globally. This reflects how mod_perl works, which has a single namespace for all repositories. By default, plugins installed in lib/plugins are disabled, so they can be enabled on a per-repository basis.

Search engine plugins

Searches are now executed through the plugins layer with a new Search plugin type. EPrints 3.3 comes with support for one new search engine: Xapian. Xapian is a probabilistic search engine that supports boolean queries. To use Xapian you must install the Search::Xapian Perl library and perform a full re-index. The Xapian index will be written to archives/[repoid]/var/xapian/.

The (default) Internal search engine plugin has support for prefixed terms in simple search e.g. "title:(american eagle) birds". This allows users to specify more complex search queries in a similar fashion to e.g. Google. The available prefixes are based on the field id listed in the simple search configuration.
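
As a rough illustration of how such a query breaks down, the following Python sketch splits a simple-search string into prefixed and unprefixed parts. It is not the Internal plugin's parser (EPrints itself is written in Perl); the function name split_prefixed_terms and the regular expression are assumptions made for this example only.

```python
import re

# Hypothetical pattern: a field prefix followed by either a parenthesised
# group or a single word, e.g. title:(american eagle) or title:eagle.
PREFIXED = re.compile(r'(\w+):(?:\(([^)]*)\)|(\S+))')


def split_prefixed_terms(query):
    """Return (prefixed, plain).

    prefixed is a list of (field, [terms]) pairs; plain is the list of
    terms that carry no field prefix.
    """
    prefixed = []
    for match in PREFIXED.finditer(query):
        field = match.group(1)
        text = match.group(2) if match.group(2) is not None else match.group(3)
        prefixed.append((field, text.split()))
    # Whatever remains once the prefixed parts are removed has no prefix.
    plain = PREFIXED.sub(' ', query).split()
    return prefixed, plain


prefixed, plain = split_prefixed_terms('title:(american eagle) birds')
print(prefixed)  # [('title', ['american', 'eagle'])]
print(plain)     # ['birds']
```

In this reading, "american" and "eagle" are restricted to the title field while "birds" is left to the default simple search behaviour.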

XSLT-based Imports, Exports and Citation Styles

If XML::LibXSLT is installed, import and export plugins and citation styles can now be written in XSLT (XML stylesheets). XSLT export plugins support "templating" to add headers and footers to the export.

Flexible object support

EPrints already has a very flexible metadata scheme based on primitive types (ints, dates etc.). In 3.3 this flexibility is enhanced by allowing the easy creation of user-defined dataset classes. Objects in these flexible datasets can be browsed, searched and viewed in a similar way to eprints or users. They may also be referred to by existing objects (item-referencing) or behave as children of a parent object (as documents are to eprints).

SWORD 2.0/CRUD support

The EPrints /id/ URIs are now CRUD-aware, including content negotiation. To content-negotiate for JSON, request an /id/ URI with an Accept: application/json header.
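
A minimal Python sketch of a JSON content-negotiation request, assuming the standard HTTP Accept header is what selects the representation. The host name and eprint id are hypothetical, and the request is only built, not sent.

```python
from urllib.request import Request

# Hypothetical repository host and eprint id, for illustration only.
ID_URI = "http://repository.example.org/id/eprint/1"


def json_request(uri):
    """Build (but do not send) a GET request asking for a JSON representation."""
    return Request(uri, headers={"Accept": "application/json"}, method="GET")


req = json_request(ID_URI)
print(req.get_method(), req.full_url)
print(req.get_header("Accept"))
# To actually send it, pass req to urllib.request.urlopen().
```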

Improved workflow and document management

AJAX-based upload, deletion and updates of documents. Files now upload automatically and have a progress bar that works in all current browsers.

Document "actions" are now individual plugins allowing extensions to add actions to individual documents in the workflow. Supplied actions include conversion, unpacking (.zip and .tar.gz), additional files and metadata extraction.

Import workflows

Import plugins can now have a sibling Screen plugin that provides the interface for using that importer. This is demonstrated by the ISI Web of Knowledge plugin that (if SOAP::ISIWoK is available) provides a search query and tool for importing items.

Flexible Login and Registration

User authentication and registration are now controlled through sub-classes of Screen::Login and Screen::Register. A Bazaar package is provided to enable OpenID single sign-on support. Internal (default) login and registration can be trivially disabled by disabling the Internal sub-classes.

Scheduled tasks

Support for scheduled tasks has been added to the EPrints indexer. See Plugin::Event for an example. A tools/schedule tool has been added to allow easy creation of indexer events (including scheduled tasks).

Tasks can be scheduled similarly to cron, down to a per-minute resolution. The indexer provides no guarantees for when tasks may actually happen, just that they will not occur more frequently than the scheduled times.
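
To make "similarly to cron" concrete, here is a small Python sketch of per-minute schedule matching. It assumes a conventional five-field cron layout (minute, hour, day of month, month, day of week) and supports only "*", single numbers and comma-separated lists; the actual syntax accepted by tools/schedule is not given here, and the function names are hypothetical.

```python
from datetime import datetime


def field_matches(spec, value):
    """Match one cron-style field: '*', a number, or a comma-separated list."""
    if spec == "*":
        return True
    return value in {int(part) for part in spec.split(",")}


def is_due(schedule, now):
    """True if `now` falls in a minute selected by a 5-field schedule.

    Fields: minute hour day-of-month month day-of-week (0 = Sunday).
    Simplification: all five fields must match.
    """
    minute, hour, day, month, weekday = schedule.split()
    return (
        field_matches(minute, now.minute)
        and field_matches(hour, now.hour)
        and field_matches(day, now.day)
        and field_matches(month, now.month)
        and field_matches(weekday, now.isoweekday() % 7)
    )


# A task scheduled for 02:30 every day.
print(is_due("30 2 * * *", datetime(2012, 1, 15, 2, 30)))  # True
print(is_due("30 2 * * *", datetime(2012, 1, 15, 2, 31)))  # False
```

Because a schedule selects whole minutes, a task can become due at most once per matching minute; when the indexer actually runs it is a separate matter.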

Misc.

Multipart field type: same behaviour as Compound but, for multiple-valued fields, stores all parts in a single table (Name is now a sub-class)

System:: classes for abstracting different system behaviours (i.e. making MSWin32 work)

Support for importing OpenXML bibliography format

Upgrade Notes

Back up your database before installing this update. While upgrading has been tested, you should always back up your repository before installing updates that change your database schema.

This is a new branch release that will make significant changes to your repository's database and configuration. Upgrading repositories may require fixes to appearance and/or configuration to continue working correctly.

Thumbnails are no longer double-linked: the thumbnail document itself contains relations to its parent but not vice versa. This is to reduce the occurrence of corrupted thumbnail document linkages.

text-type fields will now use the longtext MySQL type, which can accommodate a full UTF-8 65k entry.
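
The arithmetic behind this can be checked quickly, assuming the concern is MySQL's TEXT type (the previous column type is not stated here). TEXT holds at most 65,535 bytes, whereas a UTF-8 character may occupy several bytes, so 65k non-ASCII characters do not fit. A Python sketch:

```python
TEXT_MAX_BYTES = 2**16 - 1       # MySQL TEXT limit: 65,535 bytes
LONGTEXT_MAX_BYTES = 2**32 - 1   # MySQL LONGTEXT limit: 4,294,967,295 bytes


def utf8_size(text):
    """Number of bytes the string occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


ascii_entry = "a" * 65535
euro_entry = "\u20ac" * 65535    # euro sign: 3 bytes per character in UTF-8

print(utf8_size(ascii_entry))                       # 65535
print(utf8_size(euro_entry))                        # 196605
print(utf8_size(euro_entry) <= TEXT_MAX_BYTES)      # False
print(utf8_size(euro_entry) <= LONGTEXT_MAX_BYTES)  # True
```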

The magic _fulltext_ field name is replaced by documents in eprint__rindex (that is, terms from the document sub-objects).

The screens for user profiles and saved searches have been replaced by the generic datasets workflow screens.

/cgi/search/XX no longer treats XX as a search definition but as a dataset id. Instead, use /cgi/search/archive/XX and add XX to the list of Internal-supported search types:

    $c->{plugins}{"Search::Internal"}{params}{search} = [qw( simple/* advanced/* xx/* )];

The default and secure templates are now language neutral and live below /lib/templates/ - upgrading repositories should consider migrating their existing templates. Along with other improvements, the new templates include an ie6.css that may be necessary for IE 6 clients to work correctly.

CSS-defined colours have been refactored and are now located in /lib/static/style/auto/colors.css. The general look-and-feel has been simplified.

Antiword is no longer used for indexing, so it is no longer required (although it is still used for the doc → pdf 'toy' converter).