Discussions

Whirlycache 1.0 has been released. Whirlycache is a fast, configurable in-memory object cache for Java. Like other caches, it can be used to cache objects that would otherwise have to be created by querying a database or by another expensive procedure. The authors claim it's faster than any other cache they've seen.

One thing to consider with Whirlycache is that it creates a thread to maintain the cache, which may not be a good idea in a servlet container, despite its intended use in web environments.

I've been using it with great success, both with servlets and elsewhere in the application stack. I like having cache pruning in a separate thread, as it doesn't block application threads when they need to access the cache.

Please provide programmatic configuration support via IoC. A magic XML file on the classpath and a static API are no longer acceptable. What if I want different subsystems within my application to have different caches and policies?

The design is fundamentally different from EHCache in several ways, so I wouldn't start the comparisons there.

Your question was about what you should do if you want different parts of your application to have different caches and policies:

Whirlycache has supported multiple named caches, each with its own policies, since the first public release. So there's nothing further to say there.

And if you don't like configuring caches with an XML file, then don't: you can do it programmatically by passing CacheConfiguration objects to the CacheManager, another feature that Whirlycache has had for quite a while.
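The per-subsystem setup described above can be sketched in plain Java. Note that the class and method names below are simplified stand-ins invented for illustration; Whirlycache's real API goes through CacheManager and CacheConfiguration, and its exact signatures should be checked against its own documentation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative stand-in: each subsystem registers its own named cache
// with its own policy. These names are ours, not Whirlycache's.
class CacheConfig {
    final String name;
    final int maxSize;
    final long ttlMillis;

    CacheConfig(String name, int maxSize, long ttlMillis) {
        this.name = name;
        this.maxSize = maxSize;
        this.ttlMillis = ttlMillis;
    }
}

class CacheRegistry {
    private final Map<String, CacheConfig> configs = new ConcurrentHashMap<>();

    // Programmatic creation: no XML file on the classpath required.
    void createCache(CacheConfig cfg) {
        configs.put(cfg.name, cfg);
    }

    CacheConfig getConfig(String name) {
        return configs.get(name);
    }
}
```

A session cache might be small with a short TTL while a catalog cache is large and long-lived, each registered independently by the subsystem that owns it.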

Even though I can understand these statements, I find the two assumptions combined quite risky when implementing business-sensitive processes.

I successfully used EHCache in a 24x7 high-performance application. Performance was really an issue, so the cache was the main datastore, with the DB instance acting only as a disaster-recovery mechanism if necessary. The Oracle sales guy was quite unhappy with our architectural choice, by the way!

Even though we tried to estimate the maximum data input rate during the architecture phase, we could never be sure that, for some infrequent reason, this rate would not increase by a factor of 10, exceeding the ability of the process to take data out of the cache after due computation. No joke here: I have seen systems go down because an administrator ran a data-injection process at the full power of a big Sun server to reinject a full week of data.

Not such an infrequent mistake, by the way: administrators can be under a lot of pressure, without enough information to analyse the overall impact of the processes they are running.

Fortunately, EHCache provides a virtually unlimited disk-overflow mechanism (at least as I remember it), so that if the cache ever reached its maximum-objects limit, you could be sure the data would kindly overflow to disk until the cache was purged of obsolete objects and returned to a normal state where all the data fits in memory. Getting back to the normal state easily took a couple of hours in this particular case.

If I understand statements 3 and 4 correctly, I could not have used Whirlycache, as it would have consumed all the available JVM memory until the whole JVM crashed.

Am I right? If so, is there a hook in the API that would provide a graceful shutdown of the JVM, or something similar, to avoid a disaster caused by lack of memory? Put another way: for what kinds of projects do you suggest using Whirlycache, and when do you suggest using another, more fault-tolerant cache library?

Whirlycache's limits are "soft" in that a .store() into the cache will never fail due to the cache being full. For an OOM situation like the one you describe to occur, data would have to fill RAM before the tuning thread ran (at a configurable interval) to prune the cache.

So yes, Whirlycache does not fully prevent this situation the way it sounds like EHCache does by paging to disk when full. Different applications have different requirements, and for the situations you describe, Whirlycache might not be the best choice.
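The "soft limit plus background tuner" idea can be sketched in plain Java. This is a minimal illustration of the mechanism described above, not Whirlycache's actual code; all class and method names here are invented.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a "soft limit" cache: store() never fails, and a background
// tuner thread periodically prunes the map back under the ceiling.
class SoftLimitCache {
    private final Map<String, Object> map = new ConcurrentHashMap<>();
    private final int ceiling;

    SoftLimitCache(int ceiling, long tuneIntervalMillis) {
        this.ceiling = ceiling;
        Thread tuner = new Thread(() -> {
            while (true) {
                try {
                    Thread.sleep(tuneIntervalMillis);
                } catch (InterruptedException e) {
                    // interrupt() can be used to force an early prune
                }
                prune();
            }
        }, "cache-tuner");
        tuner.setDaemon(true); // don't keep the JVM alive just for the tuner
        tuner.start();
    }

    // A store never fails: size may exceed the ceiling until the tuner runs.
    void store(String key, Object value) {
        map.put(key, value);
    }

    Object retrieve(String key) {
        return map.get(key);
    }

    int size() {
        return map.size();
    }

    // Crude eviction: drop arbitrary entries until back under the ceiling.
    // A real tuner would apply an LRU/LFU/FIFO policy instead.
    synchronized void prune() {
        Iterator<String> it = map.keySet().iterator();
        while (map.size() > ceiling && it.hasNext()) {
            it.next();
            it.remove();
        }
    }
}
```

The window between stores and the next tuner run is exactly where the OOM risk discussed in this thread lives: nothing stops the map from growing past the ceiling in the meantime.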

Peter,

Thanks for your answer. Certainly each application has specific requirements, and I can imagine that the speed advantage of Whirlycache can be an important deciding factor in some situations: heavily loaded web sites with mostly read-only data, for instance.

As a suggestion, would it be possible to add to your API some way to register a callback function to handle near-disaster situations without sacrificing too much speed? It would give the developer the possibility of handling limit situations, either by refusing to cache more data or by properly logging and then shutting down the JVM.

Maybe a simple add-on like this would allow Whirlycache to be valuable in more architectures, I believe. I know some Sun administrators who really dislike the JVM because the JVM daemon does not crash but stays idle in the system in an unrecoverable out-of-memory state until somebody kills it.

I am not really sure of the interest of this, but I personally do not feel comfortable with systems whose limits are too fuzzy. Of course, the interest would depend on the main usage of Whirlycache.

"As a suggestion, would it be possible to add to your API some way to register a callback function to handle near-disaster situations without sacrificing too much speed? It would give the developer the possibility of handling limit situations, either by refusing to cache more data or by properly logging and then shutting down the JVM."

Actually, you can accomplish this already with a little elbow grease.

If you implement Cachable in your cache keys, you will get callbacks. You can then subclass CacheDecorator to get a handle to the tuner thread. Your Cachable keys would then need to check the cache size in the onStore event and interrupt the tuner thread to force it to run and prune the cache.

Or just throw a runtime exception in onStore; it could be whatever the business rules dictate.
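The callback approach can be sketched in plain Java. Cachable and onStore are real Whirlycache names mentioned above, but the interface and classes below are simplified stand-ins modeled on the description, not copied from the library; the hard-limit exception is the business rule, not library behavior.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified stand-in for the onStore callback approach described above.
interface Cachable {
    void onStore(SizedCache cache);
}

class CacheFullException extends RuntimeException {
    CacheFullException(String msg) { super(msg); }
}

class SizedCache {
    private final Map<Cachable, Object> map = new ConcurrentHashMap<>();

    int size() { return map.size(); }

    void store(Cachable key, Object value) {
        key.onStore(this); // callback fires before the insert
        map.put(key, value);
    }
}

// A key that refuses new entries past a hard limit, per the
// "throw a runtime exception in onStore" suggestion.
class BoundedKey implements Cachable {
    private final String id;
    private final int hardLimit;

    BoundedKey(String id, int hardLimit) {
        this.id = id;
        this.hardLimit = hardLimit;
    }

    @Override
    public void onStore(SizedCache cache) {
        if (cache.size() >= hardLimit) {
            throw new CacheFullException(id + ": cache at hard limit " + hardLimit);
        }
    }
}
```

Instead of throwing, the same onStore hook could interrupt the tuner thread to force an early prune, as suggested above.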

Thanks for the info; it is important, at least for the kind of projects I work on.

I can spend hours and hours in meetings trying to persuade managers that we do not need a full-featured RDBMS every time we have to manage some data, and all these efforts can be defeated by an OOM in the middle of a process, with the manager or administrator then claiming that our code cannot handle data with the same robustness as an RDBMS, and we lose the next project...

Hi Chris,

Is there any benchmark that shows what to expect from the cache versus going directly to the DB engine?

Thanks,
Tjiputra

Tjiputra,

I have no benchmark for this project, sorry, as we quickly established that we did not need an RDBMS according to the business requirements. It seemed obvious that an in-memory Java cache would be orders of magnitude more efficient at serving objects than an RDBMS would be, via O/R mapping and then JDBC queries over the network. Just as RAM and CD-ROM speeds cannot be compared, at least if you do not want to pay a fortune for the hardware and software to run the RDBMS instance.

I haven't seen a way of handling disk overflow that I'm particularly impressed by. At some point, an application's ability to perform well becomes a function of the speed of the disk and the type of data structures the serialized data is stored in.

As soon as you start hitting disk, performance takes a massive hit, as you can probably expect. Is it better to start overflowing to disk and potentially cause the entire application to grind to a screeching halt, or simply not to offer the feature?

I'm not saying that disk overflow is a bad idea... just that I haven't seen an impressive way to handle it.

If you need to store 15GB of data in a cache and you only have 2GB of memory, then you can't really use Whirlycache for that. In that situation, you *may* actually be better off spooling to a database and keeping the in-memory stuff as fast as it can be. But that's not a given either way.

If you want disk overflow, start talking to us on the dev mailing list. We're nice people and will listen to your ideas.

Philipp,

I am a nice person too, am I not? ;o)

Just a couple of things.

First, and to be clear, I am not attacking Whirlycache, and I have no personal interest in EHCache, although I have happily used it in some projects. I am really trying to evaluate the potential of Whirlycache, as it seems to be a good, fast product. One more tool in my toolbox, potentially. And I believe TSS is still a technical community, one that can handle technical questions.

If I were not interested in Whirlycache, I would not have bothered going to your website and asking technical questions based on what I read there.

I never said I needed EHCache to have disk overflow; I am just thinking of a way to avoid OOM problems in the context of using Whirlycache. You understand that as a software developer/architect, I need to understand the technical details a bit, so that I know the strengths of each product and use it where appropriate.

Regarding the disk-overflow feature, I really use it as exactly that: an overflow, not a part of the cache itself. The idea is to keep all the data in JVM memory via the cache for ultra-fast access. But if for some reason the data flow exceeds what was planned, a brutal OOM is sometimes unacceptable. Overflowing to disk then avoids a complete failure of the system until the situation returns to normal, at the price of a slowdown during the extraordinary situation compared with a fully in-memory cache; but the price of a JVM OOM error can be much higher.

It all depends on the target application, of course.

So, still in the spirit of avoiding OOM errors, I did not ask for but rather suggested some callback function to warn of a potential cache/memory overflow. But this might not be necessary, or may even be a stupid idea; I am not that smart :o)

OOM exceptions suck. I agree with you on this. However, if I give you an object, how do you tell how big it is? You can't. It could be 8 bytes or 80MB in memory, and you have no way of telling in Java.

Fundamentally, you have to understand how your application will perform in order to know where the pressure points are.

But your idea can be implemented quite easily with the current release. You can add an instance of a Policy class whose job is to perform normal evictions from the cache as required, but do a full clear() if free memory drops below a threshold. You could do this yourself in about 10 minutes if you pull down the source tree.
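A minimal sketch of that Policy idea, assuming nothing about Whirlycache's actual Policy interface (the interface and memory check below are simplified inventions; only the "clear on low memory" behavior comes from the suggestion above):

```java
import java.util.Map;

// Sketch: normal evictions plus a panic clear() when free heap drops
// below a threshold. Interface and names are ours, not Whirlycache's.
interface EvictionPolicy {
    void performEvictions(Map<?, ?> cache);
}

class LowMemoryClearPolicy implements EvictionPolicy {
    private final long minFreeBytes;

    LowMemoryClearPolicy(long minFreeBytes) {
        this.minFreeBytes = minFreeBytes;
    }

    @Override
    public void performEvictions(Map<?, ?> cache) {
        Runtime rt = Runtime.getRuntime();
        // Free heap = currently free + headroom the heap can still grow into.
        long free = rt.freeMemory() + (rt.maxMemory() - rt.totalMemory());
        if (free < minFreeBytes) {
            cache.clear(); // panic: dump everything rather than risk an OOM
            return;
        }
        // ...otherwise, perform the normal (LRU/FIFO/etc.) evictions here.
    }
}
```

Losing the whole cache on a clear() is drastic, but for a cache (as opposed to a datastore) it only costs rebuild time, which is usually preferable to a dead JVM.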

Or, like I suggested, join the mailing lists and let's talk about your ideas there some more.

"OOM exceptions suck. I agree with you on this. However, if I give you an object, how do you tell how big it is? You can't."

I sometimes can tell its maximum size, for instance; it depends on the business requirements of the target application. In that situation, I can tell the maximum memory needed by the cache, since I control both the maximum size of a single entry and the maximum number of entries in RAM.
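When both bounds are known, the worst-case heap budget is simple arithmetic. All the figures below are invented for illustration, and the per-entry overhead is a rough guess:

```java
// Worst-case heap budget when entry size and entry count are both bounded.
// Figures are made up for illustration.
class CacheBudget {
    static final long MAX_ENTRIES = 100_000;      // cap on objects kept in RAM
    static final long MAX_ENTRY_BYTES = 4 * 1024; // known upper bound per cached object
    static final long OVERHEAD_BYTES = 64;        // rough per-entry key/map overhead (a guess)

    static long worstCaseBytes() {
        return MAX_ENTRIES * (MAX_ENTRY_BYTES + OVERHEAD_BYTES);
    }

    public static void main(String[] args) {
        // For these figures: 100,000 * 4,160 bytes = 416,000,000 bytes (~396 MiB)
        System.out.println(worstCaseBytes() / (1024 * 1024) + " MiB worst case");
    }
}
```

Sizing the heap above that worst case (plus headroom for the rest of the application) is what makes the "soft" limit safe in this scenario.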

"The job of this class would be to perform normal evictions from the cache as required, but that it would do a full clear() if memory dropped below a threshold. You could do this yourself in about 10 minutes if you pull down the source tree. Or, like I suggested, join the mailing lists and let's talk about your ideas there some more."

OK, so I understand there are solutions to avoid a dirty OOM. I also understand that I am talking about pushing the cache to its limits, but in a tough competition some little detail might earn you a lot of credit, or the opposite, so I think you understand my concern. I will join the mailing list in the near future.
