Wednesday, August 25, 2010

Dealing with Kettle DBCache

I came from 4 weeks of summer holidays... First time in my life but it was beautiful to spend a lot of time with my family. I need to take some time for them more often.

Anyway, today I was doing my usual work with Kettle but something strange happened that makes me crazy. I added a new field to my table and then I came to Kettle to update the fields layout in my Database lookup step. When I tried to find out my new field from the fields list I remained surprised... No new field appeared in the fields list for my table. I cried because I thought that it magically disappeared by my table but after a rapid check with my database client I saw my field was there in my table and I was happy. So what happened?

Cache database metadata with DBCache

DBCache is a cache where recurring query results (about database metadata informations for example) are saved the first time you access them. That informations are persisted in a file called db.cache located in the .kettle directory below your home directory. Informations are saved in row format so you have a look inside the file or edit its content.

Every time, for example, you go through Kettle database explorer looking for a table layout, the first time you access table metadata that results are cached through DBCache so that second time you you go through the cache saving an access to the db. But if you forget that and you update your table DDL in any way (like me after a veeeery long period of holidays) you could be surprised seeing that your updates seems not to be caught by Kettle.

How can we clear the cache and have our table metadata updated next time I need to access them? You can go through the Tool menu and choose Database > Clear cache and your fresh set of table metadata will be get from the database.