Faster Nepomuk Queries

21 Aug 2012

Nepomuk has a very decentralized architecture where the different
components run as separate processes. They are all variants of the
same executable, nepomukservicestub, which loads the appropriate
service plugin. The main reason for doing this was stability: if one
of the components crashes, it doesn’t take all the other components
with it.
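
To make the stability argument concrete, here is a minimal sketch of the idea in Python. It is not Nepomuk's code: the service names and the one-line "services" are made up for illustration, and each one simply runs in its own interpreter process, much like each plugin runs in its own servicestub process.

```python
import subprocess
import sys


def run_service(source):
    """Run one hypothetical service in its own process; return its exit code."""
    completed = subprocess.run(
        [sys.executable, "-c", source],
        stderr=subprocess.DEVNULL,  # hide the traceback of the simulated crash
    )
    return completed.returncode


def run_all(services):
    """Run every service in turn and collect the exit codes by name."""
    return {name: run_service(source) for name, source in services.items()}


# Hypothetical services: "query" crashes, the other two exit normally.
SERVICES = {
    "storage": "pass",
    "query": "raise RuntimeError('simulated crash')",
    "other": "pass",
}

if __name__ == "__main__":
    for name, code in run_all(SERVICES).items():
        print(name, "exited with", code)
```

The crashing service reports a non-zero exit code, while the launcher and the remaining services are unaffected.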

Unfortunately, this architecture doesn’t hold up very well when the
different components need to communicate with one another. In that case
they need to use heavier mechanisms such as D-Bus or local sockets.
Another problem is the increased memory consumption, because each
process has its own internal cache (Nepomuk stuff) and other
KDE-specific stuff.

The Storage Service is responsible for managing the ontologies,
initializing Virtuoso, and other data management functions. The
Query Service exists for caching queries and running them in a separate
thread.

Now the Query Service obviously needs to access the Virtuoso database,
and for that it needs to go through the Storage Service. This
communication happens through a local socket, the same socket which all
other applications use to access Nepomuk.
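
The extra hop described above can be sketched roughly like this. Again, this is Python for illustration only and not how Nepomuk is implemented: `run_query`, `serve`, `SocketClient` and `time_queries` are all hypothetical names, and the "database" is just a function that echoes the query back. The point is only to contrast a direct in-process call with the same call routed through a local socket.

```python
import socket
import threading
import time


def run_query(query):
    """Stand-in for the database executing a query (hypothetical)."""
    return "result-for:" + query


def serve(conn):
    """Answer newline-terminated queries on conn until the peer closes it."""
    with conn, conn.makefile("r", encoding="utf-8") as reader:
        for line in reader:
            reply = run_query(line.rstrip("\n")) + "\n"
            conn.sendall(reply.encode("utf-8"))


class SocketClient:
    """Sends each query over a local socket pair instead of calling directly."""

    def __init__(self):
        self._sock, server_side = socket.socketpair()
        self._reader = self._sock.makefile("r", encoding="utf-8")
        self._thread = threading.Thread(
            target=serve, args=(server_side,), daemon=True
        )
        self._thread.start()

    def query(self, query):
        self._sock.sendall((query + "\n").encode("utf-8"))
        return self._reader.readline().rstrip("\n")

    def close(self):
        self._reader.close()
        self._sock.close()   # the server thread sees end-of-file and exits
        self._thread.join()


def time_queries(fn, count=1000):
    """Return the seconds taken to issue `count` queries through fn."""
    start = time.perf_counter()
    for i in range(count):
        fn("q%d" % i)
    return time.perf_counter() - start


if __name__ == "__main__":
    client = SocketClient()
    print("direct call :", time_queries(run_query))
    print("via socket  :", time_queries(client.query))
    client.close()
```

Both paths return the same answers; the socket path just pays for serialization, a system call and a thread wake-up on every query. The actual numbers will vary by machine, so none are quoted here.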

Last week, I finally merged the Query Service into the Storage Service.

I was aiming for a small decrease in memory usage and a slight
performance improvement on the queries. Boy, was I wrong! The
additional local socket seems to have been a huge bottleneck.

Here are some benchmarks for listing about 12,500 resources.

There are still many more performance improvements that can be made,
but this seemed like a good place to start :)