Convert EPrints data to Hello World data

This subroutine "converts" an eprint object by getting its title and using it in a Hello, World! message.
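As a rough illustration, such a subroutine might look like the sketch below. The wording of the message and the fallback for a missing title are assumptions made for this example, not the original plugin's text; get_value is the same EPrints call used later in this tutorial.

```perl
use strict;
use warnings;

# Sketch only: "convert" an eprint by wrapping its title in a greeting.
# The message text and the '(untitled)' fallback are illustrative assumptions.
sub convert_dataobj
{
    my( $plugin, $dataobj ) = @_;

    # Read the title metadata field from the eprint object.
    my $title = $dataobj->get_value( 'title' );
    $title = '(untitled)' unless defined $title;

    return "Hello, World! This eprint is called: $title\n";
}
```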

Testing the Hello World plugin

Save the HelloWorld.pm file and then restart the Web server, e.g.:

service httpd restart

Why do I need to restart the Web server? EPrints uses mod_perl, which loads all Perl modules at startup, so whenever these modules change they need to be reloaded.

The Hello World export plugin handles lists of eprints and single eprints. Therefore, EPrints displays it in the list of export plugins on the search results page:

Selecting the Hello World export plugin from the search results page

When the Hello World export plugin is activated, the convert_dataobj subroutine is applied to every item in the list to produce the result:

The output of the Hello World export plugin

Walkthrough: Using existing plugins to build new plugins

Walkthrough: Deposit activity plugin

Imagine we want to create an export plugin that will take a group of eprints (or a single eprint) and output a CSV file listing who deposited the eprints and the dates on which they were deposited.
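A minimal sketch of the start of such a plugin is given here. It assumes the usual EPrints 3 convention of inheriting from EPrints::Plugin::Export and setting the options in the constructor; the display name 'Depositor Activity' and the 'text/csv' MIME type are choices made for this example.

```perl
package EPrints::Plugin::Export::DepositorActivity;

use strict;
use warnings;

# Inherit from the base export plugin class supplied by EPrints.
our @ISA = ( 'EPrints::Plugin::Export' );

sub new
{
    my( $class, %opts ) = @_;

    # Let the parent class build the plugin object first.
    my $self = $class->SUPER::new( %opts );

    $self->{name}     = 'Depositor Activity';                # display name (assumed)
    $self->{accept}   = [ 'list/eprint', 'dataobj/eprint' ]; # lists and single eprints
    $self->{visible}  = 'all';                               # anyone may use it
    $self->{suffix}   = '.csv';                              # filename extension
    $self->{mimetype} = 'text/csv';                          # MIME type of the output

    return $self;
}
```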

The plugin's constructor creates a filter object and sets a number of configuration constants:

name - The name of the filter

accept - A list detailing what the filter will take as inputs. In this case, a list of eprints or a single eprint. It is possible to write filters for dataobj types 'eprint', 'user', 'subject', 'history', 'access' and '*' (all).

visible - Who can see this filter. In this plugin it is set to 'all' so that anyone can use it. It could be set to 'staff' to only allow repository staff to use it. If set to 'API' then the filter is not available through the web interface.

suffix - Appended to the URL to create a filename extension.

mimetype - Should be set to the correct MIME type for the output of the filter.

Note that 'name' and 'accept' are essential. These allow the filter to register itself with EPrints.

We will be extracting the username of the depositor, so we need to use 'EPrints::DataObj::User'.

Conversion

The 'output_dataobj' function takes a dataobj (in our case an eprint object) and returns a Perl scalar which will be the output. We are going to extract some data from the dataobj using EPrints API calls.

Retrieving the username takes a little fancy footwork because the eprint object stores the depositor's userid rather than the username. We need to create a user object from that userid and get the username from it.

We use '$dataobj->get_value' to retrieve metadata from the eprint (or user) objects.
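The lookup described above could be sketched roughly as follows. It assumes the eprint fields are named 'userid' and 'datestamp', that the plugin holds its session in $plugin->{session}, and that 'use EPrints::DataObj::User;' appears at the top of the plugin file. CSV quoting is deliberately left out of this sketch.

```perl
use strict;
use warnings;

# Sketch only: build one unquoted "username,datestamp" line for an eprint.
# Assumes "use EPrints::DataObj::User;" at the top of the plugin file.
sub output_dataobj
{
    my( $plugin, $dataobj ) = @_;

    # The eprint records who deposited it as a userid.
    my $userid = $dataobj->get_value( 'userid' );

    # Build a user object from the userid, then read the username from it.
    my $user = EPrints::DataObj::User->new( $plugin->{session}, $userid );
    my $username = defined $user ? $user->get_value( 'username' ) : undef;
    $username = '' unless defined $username;

    # When the eprint was deposited.
    my $datestamp = $dataobj->get_value( 'datestamp' );
    $datestamp = '' unless defined $datestamp;

    return "$username,$datestamp\n";
}
```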

As we're outputting in CSV, we need to do a little normalisation.
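One common way to normalise a value for CSV is to double any embedded double quotes and wrap the whole value in double quotes. The helper name csv_field below is hypothetical, and this is only one possible normalisation, not necessarily the one the plugin originally used.

```perl
use strict;
use warnings;

# Quote one value for CSV: double embedded quotes, then wrap in quotes.
# The name csv_field is made up for this example.
sub csv_field
{
    my( $value ) = @_;

    $value = '' unless defined $value;
    $value =~ s/"/""/g;

    return '"' . $value . '"';
}

# Prints: "jsmith","2007-03-01 10:15:00"
print join( ',', csv_field( 'jsmith' ), csv_field( '2007-03-01 10:15:00' ) ), "\n";
```

Quoting every field keeps commas inside values, for example in the time part of a formatted date, from being read as column separators.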

Put it in a Module

Put all this into a file called 'DepositorActivity.pm' and save the file into the 'eprints3/perl_lib/EPrints/Plugin/Export/' directory. Don't forget to add this to the bottom of the file:

1;

Before you can use the plugin, you must restart the Web server so that EPrints loads it.

Adding Column Headings

The 'output_dataobj' function runs on a single eprint. If the plugin runs over a list of eprints (we've given it that capability), the default behaviour is to run 'output_dataobj' on every eprint in the list and concatenate the results.

The output_list function is what handles the lists. It takes the plugin object ($plugin) and a hash (%opts) as arguments. The %opts hash contains the list. It could also contain a filehandle. When writing 'output_list', you need to check for the filehandle and, if it is present, print to it. If it's not present, return the results as a scalar.

Here is an output_list function that will add column headings to our CSV file.
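A sketch of such a function follows. It assumes the list is passed as $opts{list} and offers a get_records method, and that an optional filehandle arrives as $opts{fh}; the column names are illustrative.

```perl
use strict;
use warnings;

# Sketch only: write a heading row, then one row per eprint in the list.
sub output_list
{
    my( $plugin, %opts ) = @_;

    my @parts;

    # Send a chunk to the filehandle if one was given, otherwise collect it.
    my $emit = sub {
        my( $part ) = @_;
        if( defined $opts{fh} )
        {
            print { $opts{fh} } $part;
        }
        else
        {
            push @parts, $part;
        }
    };

    # Column headings first (names are illustrative).
    $emit->( "\"username\",\"datestamp\"\n" );

    # Then the normal per-eprint output for every item in the list.
    foreach my $dataobj ( $opts{list}->get_records )
    {
        $emit->( $plugin->output_dataobj( $dataobj, %opts ) );
    }

    # Nothing to return when everything was printed to the filehandle.
    return if defined $opts{fh};

    return join( '', @parts );
}
```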

More Complex List Processing

output_list can be used to do more than simply concatenate the results from output_dataobj. For example, the plugin above will output a table containing one entry for every eprint, showing the depositor and the deposit date. Perhaps this could be made more useful by changing the table so that it contains a row for each user that deposited an eprint, with three columns: userid, number of deposits, and datestamp of the latest deposit.

For readability, output_list is shown without filehandle handling. If this were a real filter, IT WOULD BE NECESSARY!

Firstly, an auxiliary function that will return a CSV-normalised username. It's similar to the output_dataobj function above, so should be easy to understand.

Note that because this plugin can take a single eprint as well as a list of eprints, you must have an output_dataobj function that will do something sensible. However, bear in mind that search results are always a list, even if there's only one result.
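The per-user summary described in this section boils down to a small aggregation, sketched here in plain Perl so that it can be tried without an EPrints installation. The function name summarise_deposits and the column headings are hypothetical. In a real output_list, the userid and datestamp pairs would come from each eprint in the list. The sketch assumes numeric userids and datestamps in a 'YYYY-MM-DD HH:MM:SS' style, which compare correctly as strings.

```perl
use strict;
use warnings;

# Sketch only: one CSV row per depositor from [ userid, datestamp ] pairs.
sub summarise_deposits
{
    my( @deposits ) = @_;

    my( %count, %latest );
    foreach my $deposit ( @deposits )
    {
        my( $userid, $datestamp ) = @{$deposit};

        $count{$userid}++;

        # Keep the most recent datestamp seen for this user.
        if( !defined $latest{$userid} || $datestamp gt $latest{$userid} )
        {
            $latest{$userid} = $datestamp;
        }
    }

    my $csv = "\"userid\",\"deposits\",\"latest\"\n";
    foreach my $userid ( sort { $a <=> $b } keys %count )
    {
        $csv .= "\"$userid\",\"$count{$userid}\",\"$latest{$userid}\"\n";
    }

    return $csv;
}

# Example data: user 1 deposited twice, user 2 once.
print summarise_deposits(
    [ 1, '2007-01-10 09:00:00' ],
    [ 2, '2007-02-01 12:30:00' ],
    [ 1, '2007-03-05 16:45:00' ],
);
```

With the example data this prints a heading row followed by one row for user 1 (2 deposits, latest 2007-03-05 16:45:00) and one for user 2 (1 deposit, latest 2007-02-01 12:30:00).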