Scheduling and load balancing are two closely knit concepts. Most people use the term “Load Balancing” in the context of network load balancing – for example, https://en.wikipedia.org/wiki/Load_balancing_(computing). That Wikipedia article describes (network) load balancing in the context of a web server farm. But what options are available today if what you need is a grid for analytic/scientific computations? We do a lot of batch processing in the hedge fund and investment banking space, from (S)FTP file transfers to/from brokers (deal/execution submissions, allocations) to derivatives pricing, risk, stress and PnL calculations (real time, day-end, month-end). Many firms still implement their infrastructure from scratch.

Wikipedia offers a preliminary survey under two separate categories: “Job Scheduler” and “Grid Computing Software/Middle”.

The question remains: how do these tools measure up, and against what set of criteria?

Adding to the confusion, should you decide to build your own batch processing/grid infrastructure, many Open Source libraries are available. Many support scheduling but not load balancing, and vice versa. Others are simply too immature or lack a following, which you can tell from the number of downloads, the number of broken links, and the absence of documentation. For example, NGrid supports load balancing but not scheduling. Quartz.NET, on the other hand, supports both scheduling and “Clustering”, but with specific limitations: the job must be coded in .NET, and it must implement the “IJob” interface (less restrictive than NGrid, where you need to subclass “GObject”).
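To make the distinction concrete: a scheduler decides when a job runs, while a load balancer decides where it runs. The following is a minimal Python sketch of the "where" half only, using a greedy least-loaded assignment. The job names, cost estimates and node names are hypothetical, and this is not how NGrid or Quartz.NET distribute work internally.

```python
import heapq


def assign_jobs(jobs, worker_names):
    """Greedy least-loaded assignment.

    jobs: list of (job_name, estimated_cost) pairs (costs are assumed estimates).
    worker_names: list of worker identifiers.
    Returns a dict mapping each job name to the worker chosen for it.
    """
    # Heap of (current_load, worker_name); the lightest worker is always on top.
    heap = [(0.0, name) for name in worker_names]
    heapq.heapify(heap)

    placement = {}
    for job_name, cost in jobs:
        load, worker = heapq.heappop(heap)      # pick the least-loaded worker
        placement[job_name] = worker
        heapq.heappush(heap, (load + cost, worker))
    return placement


# Hypothetical jobs with made-up cost estimates.
jobs = [("pricing", 5), ("risk", 3), ("pnl", 2), ("stress", 4)]
placement = assign_jobs(jobs, ["node-a", "node-b"])
print(placement)
```

Nothing in this sketch knows about time: it would have to be paired with a scheduler (cron-style triggers, calendars, dependencies) to cover the "when" half.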

The objective of this article is to explore what options we have from both Commercial and Open Source spaces.

We Don’t Need a Scheduler for Everything

Before we explore our options further, I want to first establish that while scheduling and load balancing are closely knit concepts, we do NOT need a scheduler for everything.

Fig 1. Real-time updates of derivatives sensitivities are an example where we don’t need a scheduler

A “Market Data Feed Adapter/Server” may be listening on a socket (Bloomberg Desktop API, for example) and publish arriving ticks to a Message Bus that is accessible only to the firm’s applications within the intranet environment. A calculation grid monitoring the Message Bus picks up the newly published market data, runs its calculation, and publishes the result back to the Message Bus. Clients, desktop or web, subscribe to updates from the Message Bus. In this scenario, you do NOT need a scheduler; what you need is a Message Bus, RabbitMQ for example. If your calculation grid is built in .NET, say, your primary concern should be integrating jobs implemented in different languages: unmanaged C++, Java, Perl scripts, .NET.
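The event-driven flow described above can be sketched in a few lines of Python. This is a toy, in-process stand-in for a real broker such as RabbitMQ. The `MessageBus` class, the topic names, and the "exposure" calculation are all hypothetical and only illustrate that work is triggered by arriving messages rather than by a clock.

```python
from collections import defaultdict


class MessageBus:
    """Toy in-process publish/subscribe bus (illustrative stand-in for a broker)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Deliver synchronously to every handler registered for this topic.
        for handler in list(self._subscribers[topic]):
            handler(message)


def make_calculator(bus, position_size):
    """Return a tick handler that publishes a recalculated figure on every tick."""
    def on_tick(tick):
        # Placeholder arithmetic, not a real pricing or sensitivity model.
        exposure = position_size * tick["price"]
        bus.publish("risk.updates", {"symbol": tick["symbol"], "exposure": exposure})
    return on_tick


bus = MessageBus()
received = []                                   # plays the role of a client screen

bus.subscribe("market.ticks", make_calculator(bus, position_size=100))
bus.subscribe("risk.updates", received.append)  # client subscribes to results

# The feed adapter publishes a tick; the calculation runs immediately.
bus.publish("market.ticks", {"symbol": "XYZ", "price": 10.5})
print(received)
```

No trigger, calendar, or cron expression appears anywhere: the arrival of the tick is what starts the calculation.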

Fig 2. Day-end/Month-end processing is an example where we DO need a scheduler

A typical day-end batch includes mark-to-market, PnL and risk calculations, stress/scenario analysis, and aggregation of position-level data to different levels (book, account, strategy, country, etc.). In this case, we need a scheduler.
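One reason a day-end batch needs a scheduler is that its jobs depend on each other. The sketch below uses Python's standard `graphlib` module (Python 3.9+) to group jobs into waves, where every job in a wave could be handed to a load balancer at the same time. The job names and the dependency graph are assumptions made up for illustration, not a prescribed batch layout.

```python
from graphlib import TopologicalSorter

# Hypothetical day-end jobs, each mapped to the set of jobs it depends on.
dependencies = {
    "mark_to_market": set(),
    "pnl": {"mark_to_market"},
    "risk": {"mark_to_market"},
    "stress": {"risk"},
    "aggregate": {"pnl", "risk", "stress"},
}


def run_in_waves(deps):
    """Group jobs into waves; jobs in the same wave have no unmet dependencies."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()                      # raises CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make the output stable
        waves.append(ready)
        sorter.done(*ready)               # a real scheduler would mark jobs done as they finish
    return waves


waves = run_in_waves(dependencies)
for number, wave in enumerate(waves, start=1):
    print(number, wave)
```

A production scheduler adds what this sketch leaves out: time-based triggers, calendars (day-end versus month-end), retries, and persistence of job state.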

As mentioned, Wikipedia is a good starting point for getting a grasp of what tools are available today if you need batch processing and load balancing capability in your firm, or in the new application you’ll be building. In the following passage, we will try to make detailed comparisons using the following criteria:

Cost estimates and Time-to-Delivery (Case when you decide to build your own Scheduler/Grid)

Platform Compatibility

Scheduling facilities

Load Balancing facilities

Built-in ERP adapters

Built-in ETL commands (And Open Source libraries available, from the perspective of a .NET developer, if you build your own)

Yellow boxes are modules you’d need to build – i.e. components that are not included with the application. Green boxes are “Nice to Haves”.

Summary

We have explored four options available today.

Build your own (we have also described the number of modules you need to build in order to “connect the dots” and the Open Source libraries available, in particular Quartz.NET + RabbitMQ, from the perspective of a .NET developer. They also have the most polished GUI: parent-child jobs are displayed in a flow chart diagram).

Applied Algo ETL Suite (commercial scheduler + load balancer, best suited for anyone who does a lot of number crunching. Its persistence mechanism, which automatically stores processed data, from FTP transfers to the output of a time series analysis, is particularly geared toward quantitative/scientific analysis. Applied Algo ETL Suite also bundled the most).

Schedulix (Open Source scheduler + load balancer, with a support contract available from independIT. It has everything you need for general IT automation purposes, but no built-in ERP adapters or ETL commands).

This article presented only four alternatives among the many options available from both the Open Source and commercial spaces. Readers are welcome to submit additions via the comments below. Please, however, use the comments to make suggestions, not to market your product.
