Sunday, November 23, 2008

Empty Spaces

Apache River hasn't progressed much since the last time I looked at it, but I liked the bare-bones reference implementation aspect. GigaSpaces seems a bit thick for my tastes and seems to be tightly coupled with their application server. I thought Blitz JavaSpaces might be a better fit, especially if I could use their fault-tolerant edition.

I was able to get Blitz up and running, then configured it to do unicast discovery to a pre-existing Jini registrar without a problem. I was having problems getting my client to connect in its security context, so I decided to dig a little deeper. As I did, I also kept an eye towards fault tolerance, but found that branch seemingly suspended. I later found a post from the author indicating he didn't really see a good motivation for moving forward with his fault-tolerance work:

In my spare moments I've been doing a re-implementation but the fact of the matter is that it's not a trivial problem to solve (though I believe I do have a solution). And here's the rub, this work doesn't pay the bills which means that it's going to take a long time to implement because I have to do a day's work first. For those who don't know, most of Blitz has been written during time between periods of employment - not over weekends and evenings as you might expect. This presents me with a problem - users seem to want this feature but I'm struggling to see doing this as a good thing. Here's some of my reasons:

I'd be building a significant feature which will, judging by demand, make a lot of money for those who use it but zilch for me.

Not only do I earn nothing from this venture but I have to earn a significant amount of cash just to allow me time to develop the feature. Basically, I'd be financing everybody else's money making ventures.

One of GigaSpaces key value adds is the clustering/replication feature - they are fully commercial and need to earn a crust plus they're one of only a few credible companies that can provide corporate grade support for JavaSpaces. Were I to do this work for Blitz I'd maybe be damaging the market I've been helping to create.

Right now, I feel like the price of this piece of work is too high for me personally and for others in the commercial JINI world (and like it or not they are an important element in any future success for JINI). I can see why Blitz users might want this feature - they can avoid paying Gigaspaces a license for starters.

So... it seems like the development of an enterprise-ready Blitz isn't in the cards. Casually strolling through Wikipedia's definition of a tuple space brought up Fly Object Space, a tuple space that is not a JavaSpace implementation. While it doesn't fit into the Jini realm I know and love, it is a more minimalistic implementation of an object space that fits my desire for something smaller and to-the-point. It doesn't appear to support replication or fail-over on the non-commercial level, but I'm checking to see if there are plans to support it on a commercial level.

It's tough. I need an object space that has a minimalistic implementation, has a small footprint and can at least run active/passive for fault tolerance. I might have to dust off my old Terracotta instance and try out SemiSpace.

10 comments:

One of the things we have tried hard to do is keep the simplicity of the JavaSpaces API and make JavaSpaces development even simpler to configure and develop.

Most of the things that we added recently were along those lines, i.e. the Spring integration (OpenSpaces) is a way to make JavaSpaces programming extremely simple for those who are already familiar with POJO-driven design and dependency injection. Most of the work was added to shield the developer from worrying about Jini configuration and lookup service setup. This level of abstraction was added as an additional layer and doesn't block you from using a plain Jini configuration and setup if you wish to (why you would choose to do that is a different question).

Note that the sole purpose of the FREE community edition is to give users a way to leverage the simplicity enhancements that we added over the years and to use our product as a simpler reference implementation that happens to be enterprise-ready by the nature of the customers and applications we have been working with over the years.

I would encourage you to give it a try. In any case I would be very interested in your feedback if for whatever reason you still find our software too complex for your needs.

I have tried out the community edition, and it is a more minimalistic implementation. However it's limited to a single instance and one isn't allowed to build any clustering or fault-tolerant capabilities onto it.

I do have the XAP version and did run through the plain JavaSpaces tutorial - that's currently what I've switched my JavaSpaces implementation over to. I was definitely pleased to see a straight-JavaSpaces implementation ready to go.

The big barrier isn't necessarily with me per se - I'll run anything under the sun. But finding convincing reasons to have an IT operations dept. install yet another application server can be tough, especially when they already have five bazillion vendor & homebrew components to manage. If I send them a fairly simple component that just requires you to type "RUN" that's more likely to be accepted. If I send them a whole app server to configure & deploy they'll (rightly) balk.

The community edition would definitely be my preference, even considering Blitz, SemiSpace or FlySpace. But without fault-tolerance I need to evaluate other alternatives. That leaves SemiSpace (and I'm a bit wary of Terracotta clustering) and GigaSpaces XAP.

When it comes to associative memory implementation for a clustered environment, GigaSpaces definitely seems to be the most enterprise-ready implementation. In my ideal fantasy world I'd just be able to get JavaSpaces w/ replication a la carte.

It is possible that if you offered the GigaSpaces solution for EC2 to an IT operations dept, they may have a lot less concern. That solution allows you to pay a few dimes per hour - use it when you need the compute power and drop it when you are through. The setup is literally "one-click": you select the EC2 AMI and then deploy it.

Thanks for the prompt response and detailed feedback. It definitely sheds some light on your experience. I appreciate it!

1. "If I send them a fairly simple component that just requires you to type "RUN" that's more likely to be accepted. If I send them a whole app server to configure & deploy they'll (rightly) balk."

I'm still not sure I follow. Why can't you just bundle our libraries just as you would do with the JavaSpaces tutorial and embed it with your application server? Many of our customers are actually using our product in conjunction with other containers for caching purposes, messaging or parallel processing.

Since most app servers were not designed for scale-out and are relatively expensive and complex, we offer the full XAP, which lets you run GigaSpaces completely independently. If you're using one of the open-source containers such as Jetty or Glassfish, we have a fairly strong integration that makes those OSS alternatives equivalent to any of the high-end alternatives. I refer to this approach as "Compatible but Independent". The reason I mention all that is that some organizations are trying to move away from bloated app servers, so in many cases they will be happy to replace them with something that gives them better scaling, performance and simplicity while also saving the associated cost, especially these days when everyone is facing economic pressure.

Anyway if that is not the case in your org you can stick with the simple embedded libraries that I mentioned above.

2. I understand your comment about our community edition. You should note that even with this limitation you can run the space as a persistent space to ensure recoverability in case of a failure. If you need replication and full clustering support then you should use the standard edition, which is a fairly low-cost edition (pricing-wise it is fairly comparable with most open-source alternatives on the market).

As Owen said in his response, you can also consider the use of EC2 and our cloud solution for both simplicity and a pay-per-use model.

Anyway, I apologize for the relatively long response (bad habit I can't get rid of :)) – I hope that I was able to address some of your concerns.

That's very true about embedding your Jini libraries. I'm currently tearing apart the XAP scripts & libraries to see how each one ticks, and I do indeed see the Jini factories & libraries that closely follow the reference implementations now. I had to dig a bit to find them; it seems like the simple JavaSpaces examples aren't included with the distro anymore, but after some digging in your Confluence wiki I was able to find them.

Currently having object streaming issues when trying to do just bare-bones JavaSpaces... we'll see if I can work past those.

Was able to get a simple, minimalistic JavaSpaces client written. The one caveat was that the "plain JavaSpaces tutorial" doesn't quite work. With some help I was able to get the registrar to hand me a JavaSpace instance using the discovery managers (LookupDiscoveryManager & ServiceDiscoveryManager), but using the raw LookupLocator to get a registrar (verbatim to the tutorial) just gave me object serialization errors.

Believe it or not I'm actually not wanting to abstract - bare metal Jini is a good thing in my instance. What I really am trying to do is simply keep a high write volume stateful object in a tuple space to keep track of a workflow that is being managed by a series of asynchronous processes. So a series of messages being passed back and forth between asynchronous components comprises a conversation, and the state of this conversation is (perhaps) managed by an object space. Due to the heavy amount of writes and tightened isolation level, it seems like object spaces are a good fit.
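To make the take/update/write idea concrete, here is a minimal sketch of how a conversation's state might be kept in a space. It does not use Jini or the JavaSpace interface at all: TinySpace, ConversationState and the "conv-1" id are hypothetical stand-ins living purely in memory, assumed only for illustration. The point is that a take removes the entry, so only one process can hold the state at a time, and every other writer blocks until it is written back.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class ConversationDemo {

    /** Hypothetical workflow-state entry; immutable, so every update is a fresh write. */
    static final class ConversationState {
        final String conversationId;
        final int messagesSeen;

        ConversationState(String conversationId, int messagesSeen) {
            this.conversationId = conversationId;
            this.messagesSeen = messagesSeen;
        }
    }

    /** Hypothetical in-memory stand-in for a space; NOT the JavaSpace interface. */
    static final class TinySpace {
        private final List<ConversationState> entries = new ArrayList<>();

        synchronized void write(ConversationState entry) {
            entries.add(entry);
            notifyAll(); // wake any thread blocked in take()
        }

        /** Removes and returns the first match, or null on timeout or interrupt. */
        synchronized ConversationState take(Predicate<ConversationState> template, long timeoutMillis) {
            long deadline = System.currentTimeMillis() + timeoutMillis;
            while (true) {
                for (int i = 0; i < entries.size(); i++) {
                    if (template.test(entries.get(i))) {
                        return entries.remove(i);
                    }
                }
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    return null;
                }
                try {
                    wait(remaining);
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
        }
    }

    /** Each worker records its messages by taking the state, bumping it, and writing it back. */
    static int run(int workers, int messagesPerWorker) {
        TinySpace space = new TinySpace();
        space.write(new ConversationState("conv-1", 0));
        Predicate<ConversationState> template = s -> s.conversationId.equals("conv-1");

        List<Thread> threads = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            Thread t = new Thread(() -> {
                for (int m = 0; m < messagesPerWorker; m++) {
                    ConversationState current = space.take(template, 5_000);
                    if (current == null) {
                        return; // timed out; this sketch simply gives up
                    }
                    space.write(new ConversationState(current.conversationId, current.messagesSeen + 1));
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", ex);
            }
        }
        ConversationState result = space.take(template, 1_000);
        return result == null ? -1 : result.messagesSeen;
    }

    public static void main(String[] args) {
        System.out.println("messages recorded: " + run(4, 250));
    }
}
```

With four workers recording 250 messages each, no update should be lost, because the entry is out of the space for the whole time a worker is modifying it. A real space would add leases and transactions on top of this, which the sketch leaves out.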

"What I really am trying to do is simply keep a high write volume stateful object in a tuple space to keep track of a workflow that is being managed by a series of asynchronous processes"

If this is what you are looking for, you should look into the polling containers. Polling containers enable you to choose different event model strategies without changing the code (take/write, read/write, notify, with tx, without tx, with timeout...). You can easily choose the template through configuration. Based on various benchmarks there isn't any performance degradation associated with it. The other benefit is that your code remains pure POJO, so you can easily switch to another implementation at some other point in time if you wish. The DataExample is a good reference that shows how you can manage workflow in this way. In this case there is a feeder that sends events, and the events are marked as "new". The first event listener listens for new events and, once they are processed, marks them as processed and writes them back to the space (via the polling container). This triggers another service that listens for all processed events, and so on. You can also take a look at the advanced section of the JavaSpaces tutorial to see a full-fledged workflow example.
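The feeder/listener flow described above can be sketched in plain Java. This is not the OpenSpaces polling container API or the DataExample itself: the Space, Event and pollStage names, and the "done" status for the second stage, are hypothetical and assumed only to show the shape of the pattern, where each stage polls for events in one status and writes them back in the next.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class PollingSketch {

    /** Hypothetical event entry; the status field acts as the template to match on. */
    static final class Event {
        final int id;
        final String status;

        Event(int id, String status) {
            this.id = id;
            this.status = status;
        }
    }

    /** Hypothetical in-memory space with a non-blocking take. */
    static final class Space {
        private final List<Event> entries = new ArrayList<>();

        synchronized void write(Event event) {
            entries.add(event);
        }

        /** Removes and returns one event with the given status, or null if none exists. */
        synchronized Event takeIfExists(String status) {
            for (Iterator<Event> it = entries.iterator(); it.hasNext();) {
                Event event = it.next();
                if (event.status.equals(status)) {
                    it.remove();
                    return event;
                }
            }
            return null;
        }

        synchronized int count(String status) {
            int n = 0;
            for (Event event : entries) {
                if (event.status.equals(status)) {
                    n++;
                }
            }
            return n;
        }
    }

    /** One polling stage: take every event in fromStatus and write it back as toStatus. */
    static int pollStage(Space space, String fromStatus, String toStatus) {
        int handled = 0;
        Event event;
        while ((event = space.takeIfExists(fromStatus)) != null) {
            // Real business logic would run here before the write-back.
            space.write(new Event(event.id, toStatus));
            handled++;
        }
        return handled;
    }

    static String runPipeline(int eventCount) {
        Space space = new Space();
        for (int id = 0; id < eventCount; id++) {
            space.write(new Event(id, "new")); // the feeder
        }
        pollStage(space, "new", "processed");  // first listener
        pollStage(space, "processed", "done"); // second listener, triggered by the write-back
        return "new=" + space.count("new")
                + " processed=" + space.count("processed")
                + " done=" + space.count("done");
    }

    public static void main(String[] args) {
        System.out.println(runPipeline(3));
    }
}
```

The stages here run one after another for simplicity. In a container each stage would presumably poll on its own thread, with the status to match supplied through configuration rather than hard-coded.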

Note that you can always call our API and start your application yourself if you don't want to run within our containers. Or simply call the Integrated processing unit container which is essentially an extension of the Spring Application context that has built-in space and cluster awareness.

I will definitely look at those options and pass the info along to the other architects I'm working with. We were talking about such options earlier this morning and I'm sure they'll be interested in the polling containers.