Login

In the third installment of this series we will add some more functionality to the cache that will alleviate some of the drawbacks that we discussed in earlier articles.

Refreshing the Cache

One of the issues with using cached data is that your cache can get out of sync with the database. This can lead to unpredictable behavior in your code and is the reason why we introduced the caching switch. Obviously that is not really a solution to the problem. The real solution would be to implement a mechanism that “detects” changes in the DB and automatically refreshes the cached data. Unfortunately, such a mechanism is really beyond the scope of this series, but we can at least implement cache-refreshing procedures so that, if you want to create a refreshing mechanism, you have everything ready.

We will create two different procedures, a procedure that refreshes one record, and another one that refreshes the whole cache. Let’s start with the first one, refreshing one record from the database. We can actually achieve this very simply by adding a new Boolean parameter p_force_db_read to the FUNCTION dept_data which defaults to false:

IF (l_dept_data.NAME IS NULL) THEN l_dept_data := read_from_db (p_dept_id);

IF ((l_dept_data.NAME IS NOT NULL) AND (caching)) THEN write_to_cache (p_dept_id, l_dept_data); END IF; END IF;

RETURN l_dept_data; END dept_data;

Now we will skip the cache read if either the caching is turned off, or if we set p_force_db_read to TRUE. This in turn will trigger a read from the DB, refreshing the cached record in the process. So now we just need to write our refresh procedure calling dept_data with the correct mode:

We will ignore the department data that gets returned from dept_data. It would probably be handy to have a function that does both, i.e. refreshes the cache and returns the found record, but you should not put that functionality in this procedure. The procedure is called refresh_cache, not refresh_cache_and_return_value. You should always be careful not to introduce “side effects” into your code. In the long run, this will ALWAYS lead to problems. If you really want a function that does this, create one:

And if you are thinking that this function now has the side effect of refreshing the cache, you are wrong. Well, actually you are half right, it does refresh the cache, but that is not a “side-effect,” but the desired effect. I am caching, am I not?

For the second form of refresh, we will create an overriding procedure that just loops over the cache content and calls refresh_cache for every record:

You publish these procedures in the package specification for people to use as they please. For example, the developer who is going to design the “keep in sync with DB” feature can use these procedures to achieve his goal.

{mospagebreak title=Limit the size of the cache}

Another drawback of caching is that it consumes memory. If you design your cache incorrectly, or if you pick the wrong tables to cache, you might actually bring the whole database to its knees, something that does not impress customers (I speak from experience). Big caches also tend to become slower. It is therefore useful to design for this up front and somehow limit the size of the cache. Unfortunately, the “optimal” size, i.e. the maximum size of the cache without causing memory issues, varies from system to system. Some customers have huge servers with massive amounts of RAM, some don’t. So you can only accommodate, not actually set the cache size (although we will default it to 1000 records).

When you limit the size of your cache, you must confront the following issue: what do I do when the cache is full? In this case, what if I already have 1000 records in the cache and now I want to cache a new record? The only option you have available is to overwrite an existing record. Actually, we are not really overwriting but deleting an existing record and creating a new one. But which one? Well, that is the subject of many debates and there are many selection algorithms out there that you are welcome to implement (MRU, LRU, LFU …). Again, those algorithms are out of the scope of this series. I am just going to remove a random record from the cache and then add the new one.

First we need a new global variable that holds the maximum cache size:

Basically this function runs through the cache a random number of times, at which point it stops and returns the index of the record it stopped at (remember that the index is not just an index number in our cache, but the Department Identifier). It’s probably not the finest piece of code ever created, but it works.

Now we modify the write_to_cache procedure to make sure we never exceed the maximum cache size:

Now your cache will never grow beyond the maximum size, and your customers can set the cache size to whatever is suitable for their system. It is sufficient to say that the smaller the maximum cache size, the less efficient your cache actually becomes, although that also depends on the amount of distinct values. In my case for example, there are only four distinct values in DEPTNO, so if I set the cache to four I’ll cover all of them, and there will never be any in and out swapping, truly an ideal cache size.

{mospagebreak title=Showing the cache content}

Another useful ability is to be able to see what is actually stored in the cache, so we will create a log_cache procedure:

I am sending the output of this procedure to the screen, but you might have another mechanism for this, e.g. maybe you log trace messages in a trace table. You can modify the procedure to use your own logging mechanism. It is always very handy to be able to peek into the cache, especially when you are debugging.

{mospagebreak title=Conclusion}

We have shown some techniques that can be used to manage the cache size and refresh records in the cache. If we add this to the Caching Architecture we get the following:

Create a cache.

Create a procedure to empty the cache.

Create a switch for the cache, a public procedure to “set” the switch and a function to verify what the switch is set to.

Create a read, write and query procedure.

Create Procedures to refresh the cache content.

Create a global variable to store the maximum cache size, a public procedure to “set” the size and a function to verify what the size is set to.

Create a data retrieval function.

Create a log procedure.

In the next part of the series we will add procedures that monitor the effectiveness of our cache.