Preconditions

You should have read and understood the documentation about the JobManager, especially the configuration of workers and workflows if you want to create new workers.

You should have at least a basic understanding of the OSGi framework and OSGi services. For links to introductory articles and tutorials see [1]. For a fairly comprehensive overview of OSGi see [2].

Project templates

Before writing your own worker, we recommend that you take a look at the sample workers. You can get them by importing the bundles from the examples directory of SMILA's repository into your workspace and using them as templates:

Adding the bundles to the configuration

config.ini

To start the bundle in the built application, add the following line to SMILA.application/configuration/config.ini as the second-to-last line:

org.eclipse.smila.integration.worker@4:start, \

(Strictly speaking, it does not matter where exactly you add your bundle in the file, as long as the syntax is correct: the end of every line except the last one must be escaped.)

launcher

You also have to adapt your launcher:

Click on Run configurations...

Select the OSGi Framework-->SMILA configuration

On the Bundles page, check the box next to org.eclipse.smila.integration.worker, leave Start Level on default, and set Auto-Start to true.

Click Apply

Scale up

Finally you should add the scale up limits (see ScaleUp) to the cluster configuration file (if you use the standard simple clusterconfig service, you will find the configuration file as org.eclipse.smila.clusterconfig.simple/clusterconfig.json).

For example, add the following snippet to the existing entries in the workers map to limit the scale up of the worker to a maximum number of concurrent tasks (make sure that the worker name is the same as in workers.json). If you do not add your worker's scale up limit here, the worker is limited to one concurrent task.

Example to limit the worker HelloWorldWorker to a maximum of 4 concurrent tasks:

"HelloWorldWorker":{"maxScaleUp":4},

Running

You should now test your workspace setup to make sure that everything works with the prepared example bundles.

Run the application

Select "Run" -> "Run Configurations" or "Debug Configurations"

Select "OSGi Frameworks" -> "SMILA".

Click "Run" or "Debug" and SMILA should start just like when started from the command line.

When starting the SMILA.launch in Eclipse, you should see something like the following output in the console window:

This shows that the HelloWorldWorker has done something. Of course, the test also contains an assertion so that it will fail when the attribute does not have the expected value.

Create your own worker

Use template

The easiest way to create a new worker is by implementing it in the bundle org.eclipse.smila.integration.worker (see Project Templates). There you can just place your new worker beside the HelloWorldWorker example worker, or replace it. Things you have to do when renaming the bundle/package or creating your own worker bundle are described later on.

Bundle dependencies

The dependencies of the bundle are managed by the OSGi framework and have to be configured explicitly in the MANIFEST.MF file so that the OSGi framework can resolve them (in the correct versions) when the services are started.

To create a worker that reads and writes Records, we need at least the following bundles imported as packages (see META-INF -> "Dependencies" -> "Imported Packages"):

This is already configured. If access to other packages is needed, just extend the MANIFEST.MF file in section "Imported Packages" accordingly.

Worker Implementation Java Class

Create a worker class which implements org.eclipse.smila.taskworker.Worker. Have a look at the example worker org.eclipse.smila.integration.worker.HelloWorldWorker that comes with the SDK in the org.eclipse.smila.integration.worker bundle. You must implement two methods:

getName() must return a unique name for your worker. Exactly the same name (case sensitive) must be used later in the worker descriptions and workflow definitions.

perform() does the actual work. It is called with a TaskContext object that provides access to the task properties, input and output objects, and counters.

Thread safety: Make sure that your worker implementation is thread-safe! Otherwise you cannot use it for scale up, because several tasks may then be processed by the same worker instance concurrently.
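The thread safety requirement can be illustrated with a minimal, self-contained sketch. The classes GreetingWorkerSketch and ThreadSafetyDemo are hypothetical and do not use the SMILA API; the perform() method here takes a plain String instead of a TaskContext. The point is only the pattern: keep per-task state in local variables and use thread-safe types for anything shared between calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

final class ThreadSafetyDemo {
  /** Calls perform() from several threads at once and returns the number of calls counted. */
  static long runConcurrently(GreetingWorkerSketch worker, int threads, int tasksPerThread)
      throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Callable<Void>> jobs = new ArrayList<>();
      for (int t = 0; t < threads; t++) {
        jobs.add(() -> {
          for (int i = 0; i < tasksPerThread; i++) {
            worker.perform("task " + i);
          }
          return null;
        });
      }
      for (Future<Void> result : pool.invokeAll(jobs)) {
        result.get(); // rethrows any exception raised inside a job
      }
    } finally {
      pool.shutdown();
    }
    return worker.processedCount();
  }

  public static void main(String[] args) throws Exception {
    // 4 threads mirror a scale up limit of 4 concurrent tasks.
    System.out.println(runConcurrently(new GreetingWorkerSketch(), 4, 1000));
  }
}

/** Hypothetical worker-like class; it does NOT implement SMILA's Worker interface. */
final class GreetingWorkerSketch {
  // Shared between concurrent calls, so it must be a thread-safe type.
  private final AtomicLong processed = new AtomicLong();

  String getName() {
    return "GreetingWorkerSketch";
  }

  /** All per-task state lives in local variables, so concurrent calls cannot interfere. */
  String perform(String input) {
    StringBuilder greeting = new StringBuilder("Hello ");
    greeting.append(input);
    processed.incrementAndGet();
    return greeting.toString();
  }

  long processedCount() {
    return processed.get();
  }
}
```

If the counter were a plain long field incremented with ++, concurrent calls could lose updates and the total would no longer be reliable.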

OSGi Declarative Service

Every worker must be declared as an OSGi Declarative Service (DS) in order to be registered properly to the worker framework. To configure your worker as DS, you have to add an appropriate XML file to the folder org.eclipse.smila.integration.worker/OSGI-INF.

The file can be created either manually or using the Component Definition wizard.

The file describes (1) the interface that the worker has to implement (and through which it will be accessed in the OSGi application by means of dependency injection), (2) the class that is the concrete implementation of that interface, (3) the services that it references (our simple worker does not reference any; a description follows later on), and (4) the name of the service.

To describe your own worker, just create a copy of the OSGI-INF/helloworldworker.xml file in the same directory. Then change at least the "name" attribute of the root element and the "class" attribute of the "implementation" element.

When you don't need the HelloWorldWorker anymore you may want to remove at least its component definition file from the bundle. Otherwise, it will always be running and asking for tasks in the final deployment. While it should not really be a problem, it causes some unnecessary overhead that can easily be avoided.

You should check that your component definition is included in the build and listed as a Service-Component: your MANIFEST.MF should contain a line such as Service-Component: OSGI-INF/*.xml, and the bin.includes entry of the build.properties file should contain OSGI-INF/.
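If you want to verify the manifest entry programmatically, for example in a build check, a small sketch like the following may help. It uses only the standard java.util.jar.Manifest class; the class name ServiceComponentCheck and the inline sample manifest are made up for illustration, and a real check would read the META-INF/MANIFEST.MF of your bundle instead.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.jar.Manifest;

final class ServiceComponentCheck {
  /** Returns the value of the Service-Component header, or null if the manifest has none. */
  static String serviceComponentHeader(InputStream manifestStream) throws IOException {
    Manifest manifest = new Manifest(manifestStream);
    return manifest.getMainAttributes().getValue("Service-Component");
  }

  public static void main(String[] args) throws IOException {
    // Inline sample manifest (hypothetical content); note the trailing line break.
    String sample = "Manifest-Version: 1.0\n"
        + "Bundle-SymbolicName: org.eclipse.smila.integration.worker\n"
        + "Service-Component: OSGI-INF/*.xml\n";
    InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
    String header = serviceComponentHeader(in);
    System.out.println(header == null
        ? "Service-Component header is missing"
        : "Service-Component: " + header);
  }
}
```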

Register your worker in jobmanager configuration

These are the steps to use your new worker with the jobmanager framework.

Worker definition

Edit workers.json in the <WORKSPACE>/SMILA.application/configuration/org.eclipse.smila.jobmanager folder and add the definition for the new worker.

Important: The name in the worker definition has to be the same that is returned by the getName() method in the worker implementation!

For the example worker HelloWorldWorker we want to use one input slot and one output slot. We use recordBulks as the data object type because we want to modify (bulks of) records with this worker:

Workflow definition

To use your worker in a workflow, you have to add a new workflow or change an existing one. You can either use the jobmanager API to add a workflow definition to the running system, or you can edit workflows.json in the <WORKSPACE>/SMILA.application/configuration/org.eclipse.smila.jobmanager folder and add or change a workflow there.

This example is a test workflow that uses the HelloWorldWorker to manipulate all records that were pushed into the system using the bulkbuilder. Because it is not useful on its own, we did not add it to SMILA.application/configuration/org.eclipse.smila.jobmanager/workflows.json, but it is used in the unit test bundle org.eclipse.smila.integration.worker.test: The test case reads the output bulk created by the HelloWorldWorker to check whether the worker has been running.

Bucket definition

If you want to use a new persistent bucket for your workflow (see jobmanager documentation), you have to add it via the jobmanager API or add it to the configuration: Edit buckets.json in the SMILA.application/configuration/org.eclipse.smila.jobmanager folder and create the desired bucket.

Here's an example from the test bundle org.eclipse.smila.integration.worker.test for the workflow above that makes the final bucket helloWorldExportBucket persistent. For the unit test, the output bucket of the worker must be persistent so that the test case can still read the result records when the workflow has ended. Otherwise the jobmanager would remove the transient object immediately after the HelloWorldWorker has finished.

{
"name":"helloWorldExportBucket",
"type":"recordBulks"
}

Activate the Worker

Add the worker bundle to the configuration and set the scale up limit as described above if you haven't done so already.

Testing

Use the launcher

If everything was done correctly and you start the SMILA.launch in Eclipse, you should see the same output as described above, but for your own worker.

You should also check whether your new workflow definition is visible in [3].
If not, you may have mistyped a worker name or made a similar mistake in the definition. If there is no workflow at all, the workflows.json file has invalid syntax.

Create worker unit test

You can use the test bundle template org.eclipse.smila.integration.worker.test to add a test for your worker. Have a look at the example test class org.eclipse.smila.integration.worker.test.TestHelloWorldWorker that comes with the SDK.

All configuration files for the test are in org.eclipse.smila.integration.worker.test/configuration. This is similar to SMILA.application/configuration, but contains only the configuration files necessary to run the tests, not all files needed by a complete system. Also, some configuration files may differ from those in SMILA.application, e.g. some components may be configured with smaller limits to make tests run quicker. However, if you create a new worker, you must add its description to the workers.json in the test bundles and define persistent buckets and workflows required to run the test. Additionally make sure that the config.ini contains the names of your worker bundles and those of services your worker needs to access.

To start the test in Eclipse, you have to copy the launch configuration for TestHelloWorldWorker and adapt it to your new test class.

Manually installing the worker in SMILA

In the following we describe the steps to deploy your worker manually to an existing SMILA installation.

Create a feature project

A feature project is a container project that defines the plug-ins needed for a specific feature. In our case the feature is to provide a worker, so only one plug-in is included in it. It can also be reasonable to bundle all worker plug-ins that are needed to make SMILA handle a specific scenario into one feature, so that a single deployment includes all necessary plug-ins.

If you need to create your own feature project, you can use Eclipse's New... wizard:

New --> Plug-in Development --> Feature Project

Enter a Project name (e.g. org.eclipse.smila.integration.feature)

Fill in other feature properties to describe the new feature ('Version' should match the version of your new worker bundle)

Next

Select your worker bundle

Finish

Deploy your worker feature

Now you can export your custom bundles to files that can be deployed into SMILA:

Select your feature project

Right-click on it

Click on Export...

Select Plug-in Development --> Deployable features

Next

Select your new worker feature

Select a destination folder. If you are re-exporting after changes (especially after renames), you should first delete the destination folder.

Click Finish

After that you will find plugins and features directories in your destination directory that contain the deployable software. The export process produces two additional files, artifacts.jar and contents.jar, which are not needed for our purposes.

Advanced How To's

How to access another OSGi Service inside your Worker

SMILA comes with many components that offer APIs for different purposes. Sometimes you may want to access such an API inside your worker. With the concept of OSGi Declarative Services (DS) this is just a matter of configuration.

Example: Reading all cluster nodes

Suppose we want to know the names of all cluster nodes in our worker. This is possible via the ClusterConfigService API. Here are the steps to access this API in your worker:

Precondition: We assume you already configured your worker as OSGi Declarative Service as described before.

To use the ClusterConfigService you have to import the appropriate package org.eclipse.smila.clusterconfig in the MANIFEST.MF/Dependencies (see "Bundle Dependencies")

Configure ClusterConfigService as referenced service in the service description xml (OSGI-INF/...):

Now, the OSGi framework will automatically set the SimpleClusterConfigService (which implements the interface ClusterConfigService) in your worker at startup via the specified method. So the ClusterConfigService API will be accessible at runtime:

...
List<String> clusterNodes = _ccs.getClusterNodes();
...
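The field _ccs used above has to be set by a bind method that the DS runtime calls. The following is a minimal sketch of such a worker class, not the actual SMILA sample code. To keep it self-contained, ClusterConfigService is declared here as a stand-in interface reduced to the single getClusterNodes() call; in a real worker you would import org.eclipse.smila.clusterconfig.ClusterConfigService instead. The method names setClusterConfigService and unsetClusterConfigService are assumptions and must match whatever bind and unbind methods your component definition names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ClusterNodesDemo {
  public static void main(String[] args) {
    ClusterAwareWorkerSketch worker = new ClusterAwareWorkerSketch();
    // In SMILA the DS runtime performs this call; here we inject a stub by hand.
    worker.setClusterConfigService(() -> Arrays.asList("node1", "node2"));
    System.out.println(worker.clusterNodes());
  }
}

/** Stand-in for the real ClusterConfigService interface, reduced to one method. */
interface ClusterConfigService {
  List<String> getClusterNodes();
}

/** Hypothetical worker fragment showing only the service reference handling. */
final class ClusterAwareWorkerSketch {
  // volatile: the bind method may run on a different thread than the task processing.
  private volatile ClusterConfigService _ccs;

  /** Bind method (assumed name), called when the service becomes available. */
  public void setClusterConfigService(ClusterConfigService ccs) {
    _ccs = ccs;
  }

  /** Unbind method (assumed name), called when the service goes away. */
  public void unsetClusterConfigService(ClusterConfigService ccs) {
    if (_ccs == ccs) {
      _ccs = null;
    }
  }

  /** Returns the cluster node names, or an empty list while no service is bound. */
  List<String> clusterNodes() {
    ClusterConfigService ccs = _ccs; // read once to avoid a race with unbind
    return ccs == null ? Collections.<String>emptyList() : ccs.getClusterNodes();
  }
}
```

Injecting the service through a setter also makes the worker easy to test, because a test can pass in a stub without starting an OSGi framework.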

How to add / access a configuration for your Worker

You can add a worker configuration, e.g. a property file, by adding it to the application configuration.

Example: Adding a property file "myWorker.properties" and access it in the worker

To add a worker configuration create an appropriate folder in the application configuration and place the property file there:

SMILA.application/configuration/MY_BUNDLE_NAME/myWorker.properties

The easiest way to access the configuration in your worker is via the org.eclipse.smila.utils.config.ConfigUtils class.

To use this class you have to import the appropriate package org.eclipse.smila.utils.config in the MANIFEST.MF/Dependencies (see "Bundle Dependencies")

For the following example code you should also import org.apache.commons.io.
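The parsing side of this can be sketched with plain Java. The example below is an assumption-laden illustration, not SMILA sample code: the class WorkerConfigSketch and the property keys greeting and maxRecords are hypothetical, and the input stream is created from an inline string. In a worker, the stream for myWorker.properties would be obtained through ConfigUtils; check that class for the exact method to call.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class WorkerConfigSketch {
  /** Loads a property file from the given stream and closes the stream afterwards. */
  static Properties load(InputStream configStream) throws IOException {
    try (InputStream in = configStream) {
      Properties props = new Properties();
      props.load(in);
      return props;
    }
  }

  public static void main(String[] args) throws IOException {
    // Hypothetical content of myWorker.properties.
    String sample = "greeting=Hello\nmaxRecords=100\n";
    Properties props =
        load(new ByteArrayInputStream(sample.getBytes(StandardCharsets.ISO_8859_1)));
    // Supply a default so that a missing key does not break the worker.
    int maxRecords = Integer.parseInt(props.getProperty("maxRecords", "10"));
    System.out.println(props.getProperty("greeting") + " / " + maxRecords);
  }
}
```

The sketch closes the stream with try-with-resources; helper methods from org.apache.commons.io are an alternative way to do the same.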

Exception Handling and Logging

Exception Handling:

There are four possible outcomes of a worker:

SUCCESSFUL: The perform() method returns without exception. This is interpreted by the Worker Manager as a successful task execution so that it will finish the task with a SUCCESSFUL task completion status. All open output data objects will be committed. If this fails, it will continue as explained below, depending on the exception type. The task result includes all counters produced by the task execution so that they can be aggregated by the Job Manager in the job run data.

RECOVERABLE_ERROR: The perform() method aborts with org.eclipse.smila.taskworker.RecoverableTaskException. This will be interpreted as a temporary failure when accessing input data or writing output data to DOS. The result is that the task will be finished with a RECOVERABLE_ERROR task completion status and the Job Manager will usually reschedule the task for a later retry. Any produced counters will be ignored by the Job Manager in the job run data.

POSTPONE: The perform() method aborts with org.eclipse.smila.taskworker.PostponeTaskException. This means that the worker cannot yet perform the task for some reason but it should be resubmitted again later. The task will be re-added to the “todo queue” of this worker and it will be delivered again later (but quite soon, usually). Such tasks have the POSTPONE task completion status.

FATAL_ERROR: The perform() method aborts with any other exception (including all exceptions of type RuntimeException). This will be interpreted as an indicator that the input data cannot be processed at all, for example, because it is corrupted or contains invalid values. Such tasks will be finished with a FATAL_ERROR completion status and will not be rescheduled. Any produced counters will be ignored by the Job Manager in the job run data.
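The mapping from exception type to task outcome described above can be summarized in a small sketch. This is an illustration of the rules only, not the Worker Manager's actual code: the nested exception classes are stand-ins for the SMILA classes of the same simple name, the Outcome enum and the classify method are hypothetical, and the commit of output objects after a successful perform() is not modelled.

```java
import java.util.concurrent.Callable;

final class OutcomeDemo {
  enum Outcome { SUCCESSFUL, RECOVERABLE_ERROR, POSTPONE, FATAL_ERROR }

  /** Stand-in for org.eclipse.smila.taskworker.RecoverableTaskException. */
  static class RecoverableTaskException extends Exception {
    RecoverableTaskException(String message) {
      super(message);
    }
  }

  /** Stand-in for org.eclipse.smila.taskworker.PostponeTaskException. */
  static class PostponeTaskException extends Exception {
    PostponeTaskException(String message) {
      super(message);
    }
  }

  /** Runs the given perform() body and maps how it ended to an outcome. */
  static Outcome classify(Callable<Void> perform) {
    try {
      perform.call();
      return Outcome.SUCCESSFUL;
    } catch (RecoverableTaskException e) {
      return Outcome.RECOVERABLE_ERROR; // temporary failure, task may be retried
    } catch (PostponeTaskException e) {
      return Outcome.POSTPONE; // task goes back to the todo queue
    } catch (Exception e) {
      return Outcome.FATAL_ERROR; // anything else, including RuntimeException
    }
  }

  public static void main(String[] args) {
    System.out.println(classify(() -> null));
    System.out.println(classify(() -> {
      throw new RecoverableTaskException("store temporarily unavailable");
    }));
    System.out.println(classify(() -> {
      throw new PostponeTaskException("not ready yet");
    }));
    System.out.println(classify(() -> {
      throw new IllegalStateException("corrupt input");
    }));
  }
}
```

For a worker author the practical consequence is to throw the recoverable exception only for failures that a retry can plausibly fix, and to let genuinely invalid input fail with any other exception so that the task is not rescheduled.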

Logging:

You can use the log4j logging that comes with SMILA in your worker too. Your logging output will be logged in the standard smila.log.

Please be sure that your OSGi component definition file is included in the MANIFEST.MF file in the Service-Component section! Otherwise the service component will not be recognized and thus not be started.

Please be sure that the OSGI-INF/ folder is included in your build.properties

test bundle:

Adapt the test bundle to the changes:

change the name of the test bundle and the Java package (Refactor/Rename, as described above for the worker bundle itself).

correct the imported packages in the code and the MANIFEST.MF (if not done correctly by refactoring)