Zeitgeist is a project to keep a chronological log of events (things the user does or that otherwise happen): accessing/editing files (local, remote and web), having IM conversations, launching applications, etc.

On this page, we talk about making GtkRecentManager use Zeitgeist's logging backend instead of (or in addition to) its XBEL file, and about extending its API to support some concepts which would make it much more useful for Zeitgeist.

Summary

The idea is to keep GtkRecentManager as the simple logging API for apps that deal with "real file URIs". Applications handling special things like IM conversations or music can use the Zeitgeist API directly.

The logging and querying API will be kept as legacy for apps that use them (to generate recently-used menus and such). For other more sophisticated usage, the Zeitgeist API should be used directly.

Objectives

Get more details logged. Particularly, correct timestamps and a distinction between create, open, modify and (new) close events.

Get data from Zeitgeist, since it's expected to be more complete (i.e. not only from GNOME apps).

API Proposal (WIP)

Insertion

The following operation will replace gtk_recent_manager_add_item and gtk_recent_manager_add_full:

The methods return void so that the operation can be made asynchronous. Type can take the following values: _CREATED, _OPENED, _CLOSED and _MODIFIED. Data can have the following information: uri, display_name, mime_type, application.

Outstanding issue:

Do we need the timestamp parameter or can GTK+ just use the current time?

Retrieval

GList* get_latest_items_for_app(GtkRecentManager *manager, gchar* app, gint limit) //returning all items for app sorted by recently used where the app should be the name of the .desktop file, limit sets the count of items to be returned
GList* get_latest_items_for_mime_types(GtkRecentManager *manager, GList* mime_types, gint limit) //returning all items for mimetypes sorted by recently used, limit sets the count of items to be returned

The function gtk_recent_manager_get_items gets deprecated. We need to figure out how it would make sense to implement it in terms of get_latest_items_for_{app/mime_types}: requesting just the last 100 subjects of any kind may return too much noise (all of them may be websites, for instance).

Targeted retrieval

gtk_recent_manager_lookup_item(uri) -> GtkRecentInfo

The GtkRecentInfo structure isn't optimal for Zeitgeist, but I suppose for backwards compatibility we don't really want to change it.

Decision: Keep GtkRecentManager as the simple API for apps that deal in "real file URIs". Special stuff like IM conversations or music can use the Zeitgeist API directly.

Keep the querying APIs as legacy for apps that use them (to generate recently-used menus and such). For other more sophisticated usage, promote using the Zeitgeist API directly.

Right now GtkRecentManager doesn't deal with timestamps at all; they are magically generated in GBookmarkFile. Zeitgeist needs an accurate and finer-grained view of timestamps (bug).

Zeitgeist likes to have "event interpretations" that say what happened to a file - opened? closed? edited? So we need an API to specify that.

Problems with GtkRecentManager

Vague timestamps

GtkRecentManager doesn't provide a way for apps to specify the timestamps for its items, and it does not actually add timestamps automatically itself. Instead, it assumes that the underlying GBookmarkFile will do it. However, GBookmarkFile has a very simplistic way of handling this when the caller doesn't specify timestamps at all:

added/modified/visited timestamps start as a magic value of -1.

When a new item is added (one whose URI wasn't present in the bookmarks file), its added/modified timestamps are set to NOW. This is okay for a "save" action, but not for an "open" action, as the latter does not modify the data.

When an item is written out to the bookmarks file, any timestamps that remain as -1 are set and written as NOW.
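The three rules above can be modelled in a few lines. This is only a sketch of the behaviour as described here, not GBookmarkFile's actual code; the ItemStamps type and the stamps_* helpers are hypothetical names made up for illustration.

```c
#include <time.h>

/* Hypothetical stand-in for the per-item timestamps; not a GLib type. */
typedef struct {
    time_t added;
    time_t modified;
    time_t visited;
} ItemStamps;

#define STAMP_UNSET ((time_t) -1)

/* Rule 1: every timestamp starts out as the magic value -1. */
static void stamps_init(ItemStamps *s)
{
    s->added = s->modified = s->visited = STAMP_UNSET;
}

/* Rule 2: adding a URI that was not yet present stamps added/modified
 * with "now", whether the app opened or saved the file. */
static void stamps_on_new_item(ItemStamps *s, time_t now)
{
    s->added = now;
    s->modified = now;
}

/* Rule 3: on write-out, anything still unset is filled in with "now". */
static void stamps_on_write(ItemStamps *s, time_t now)
{
    if (s->added == STAMP_UNSET)    s->added = now;
    if (s->modified == STAMP_UNSET) s->modified = now;
    if (s->visited == STAMP_UNSET)  s->visited = now;
}
```

With this model, an item that was merely opened at time T and written out later ends up with "modified" equal to T and "visited" equal to the write-out time, neither of which describes what the user actually did.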

Clearly, Zeitgeist needs a more accurate view of timestamps for each event: it thinks in terms of "events", not "items".

Event information

Zeitgeist likes to have "event interpretations" that say what happened to a file. Was the file opened? saved (with modifications)? closed?

We need a way in the GtkRecentManager API to specify a few standard or common actions, ideally along with their timestamps ("opened at $time", "saved at $time").
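As a rough illustration of "opened at $time", one record per action could look like the sketch below. None of these names exist in GTK+: RecentAction, RecentRecord and recent_record_make are hypothetical, and the "negative timestamp means use the current time" convention is just one possible answer to the question of whether callers must pass a timestamp.

```c
#include <stdint.h>

/* Hypothetical action names, mirroring the values listed in the proposal. */
typedef enum {
    RECENT_ACTION_CREATED,
    RECENT_ACTION_OPENED,
    RECENT_ACTION_MODIFIED,
    RECENT_ACTION_CLOSED
} RecentAction;

/* One record per action on a URI, rather than one record per URI. */
typedef struct {
    const char  *uri;
    RecentAction action;
    int64_t      timestamp;   /* seconds since the epoch */
} RecentRecord;

/* Build a record. A negative timestamp means "the caller did not specify
 * one" (an assumption of this sketch); the supplied current time is used. */
static RecentRecord recent_record_make(const char *uri, RecentAction action,
                                       int64_t timestamp, int64_t now)
{
    RecentRecord r;
    r.uri = uri;
    r.action = action;
    r.timestamp = (timestamp < 0) ? now : timestamp;
    return r;
}
```

An app that knows exactly when it saved a file passes that time; an app that does not care passes -1 and gets the current time.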

Note: Zeitgeist also has "event manifestations", which answer the question "how did this come to happen?". For example, did something take place because of direct action from the user, or did the system do something on the user's behalf? For GtkRecentManager, we can probably always say that the user did it, or we can leave the manifestation field blank.

Usage in applications

Gedit

gedit-window.c has two simple wrappers, _gedit_recent_add() and _gedit_recent_remove(), for gtk_recent_manager_add_full() and gtk_recent_manager_remove_item(), respectively. They only use the URI and mime-type when adding items; everything else (application, groups) is the "obvious" info for gedit.

Items get added to the recent manager when a file is finished loading (look for _gedit_recent_add() in gedit-tab.c:document_loaded()), and when a file is finished saving (document_saved() in the same file).

Items get removed from the recent manager when a file fails to load (look for _gedit_recent_remove() in various places in gedit-tab.c). I'm not sure if removing items from the list is a good idea in this case.

Geany

This is practically the whole source for the Zeitgeist plugin for Geany - it is particularly enlightening as to what an API should look like: we can pass event interpretations as strings; we could have some #defines for them (what about language bindings?).

API discussion

Insertion

The Gedit plugin uses a subject interpretation of Interpretation.DOCUMENT, while the EOG plugin uses Interpretation.IMAGE. In the Desktop Summit, we decided to leave GtkRecentManager as an "easy API" for apps that deal in plain old files (i.e. not Evolution, which would represent emails in Zeitgeist events differently). For those apps, can the subject's interpretation simply be derived from the MIME-type, or do they really need to specify it by hand? Yes, it looks like we can derive this from the MIME-type.
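A minimal sketch of such a derivation follows. The function name and the returned labels are placeholders, not Zeitgeist identifiers; only the image case (EOG) and the document case (Gedit) come from the discussion above, and mapping every "text/*" type to a document is an assumption.

```c
#include <string.h>

/* Hypothetical helper: pick a subject interpretation label from a MIME type.
 * The strings are placeholders standing in for Interpretation.IMAGE and
 * Interpretation.DOCUMENT. */
static const char *interpretation_for_mime_type(const char *mime_type)
{
    if (mime_type == NULL)
        return "UNKNOWN";
    if (strncmp(mime_type, "image/", 6) == 0)
        return "IMAGE";
    if (strncmp(mime_type, "text/", 5) == 0)   /* assumption: text is a document */
        return "DOCUMENT";
    return "UNKNOWN";
}
```

A real table would need many more entries (office formats, audio, video), but the app would never have to name an interpretation by hand.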

We can probably always assume a subject manifestation of Manifestation.FILE_DATA_OBJECT (i.e. users of this API deal in plain old files), and an event manifestation of Manifestation.USER_ACTIVITY (i.e. not for automated notifications and such). Are these assumptions correct? Yes, FILE_DATA_OBJECT and USER_ACTIVITY are correct. RainCT: Not really, gedit can handle remote files (e.g. sftp://) and GtkRecentManager currently handles those correctly; deciding between local and remote files programmatically shouldn't be a problem though.
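Telling local from remote files can be as simple as looking at the URI scheme. The sketch below assumes that anything other than "file://" counts as remote and that the scheme is written in lower case; SubjectLocation and subject_location_for_uri are hypothetical names, not part of GTK+ or Zeitgeist.

```c
#include <string.h>

/* Hypothetical classification of a subject by where its data lives. */
typedef enum {
    SUBJECT_LOCAL_FILE,
    SUBJECT_REMOTE_FILE
} SubjectLocation;

/* Treat "file://" URIs as local and every other scheme (sftp://, http://,
 * ...) as remote. Assumes a lower-case scheme. */
static SubjectLocation subject_location_for_uri(const char *uri)
{
    if (strncmp(uri, "file://", 7) == 0)
        return SUBJECT_LOCAL_FILE;
    return SUBJECT_REMOTE_FILE;
}
```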

Events have a storage property. What do we do about the storage type? Gedit's plugin doesn't even specify it, while Geany always uses "net". RainCT: Just leave the field empty and Zeitgeist will know to do the right thing (if Geany is using "net" that's badly broken).

From the above, it looks like we could simply use the following, with some provision to specify the app-specific information like app identifier (gedit.desktop) and the MIME-type.

Queries would remain the same as they are right now, to return "items" rather than "events" (i.e. all accesses of foo.txt constitute an item, while Zeitgeist may be storing multiple events for that subject internally).

We can probably use a GtkRecentData in the API above, rather than the URI, to allow specifying the app name and MIME-type. We may need to deprecate some fields in that structure (groups? app_exec?).

Retrieval

Queries in the current API don't allow you to filter upfront which URIs are of interest to you. Currently, to find all files opened with gedit, one needs to:

This needs to be done differently. With a Zeitgeist backend, get_items will return a much bigger list than what we are used to from GtkRecentManager. We could limit it to "n" unique items, but that would not be fair: playing "n" items with Rhythmbox would result in the current API not returning any items for gedit. For that reason I suggest adding a new API.

GList* get_latest_items_for_app(GtkRecentManager *manager, gchar* app, gint limit) //returning all items for app sorted by recently used where the app should be the name of the .desktop file, limit sets the count of items to be returned
GList* get_latest_items_for_mime_types(GtkRecentManager *manager, GList* mime_types, gint limit) //returning all items for mimetypes sorted by recently used, limit sets the count of items to be returned

This API would allow us to actually make Zeitgeist do the calculations for us. It is quicker in response than iterating through a list and filtering out, since everything will be done directly in the Zeitgeist DB.
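The fairness problem with a global limit can be shown with a small client-side model. This is only an illustration of the semantics: with a Zeitgeist backend the filtering would happen in the database, and RecentItem and both helper functions are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified item: a URI and the .desktop file of the app. */
typedef struct {
    const char *uri;
    const char *app;
} RecentItem;

/* Per-app query: walk the list (sorted most recent first) and collect up to
 * `limit` items that belong to `app`. `out` must have room for `limit`
 * pointers. Returns the number of items found. */
static size_t latest_items_for_app(const RecentItem *items, size_t n_items,
                                   const char *app, size_t limit,
                                   const RecentItem **out)
{
    size_t found = 0;
    for (size_t i = 0; i < n_items && found < limit; i++) {
        if (strcmp(items[i].app, app) == 0)
            out[found++] = &items[i];
    }
    return found;
}

/* Global limit first, filter afterwards: only the newest `global_limit`
 * items of any app are even considered. */
static size_t filter_after_global_limit(const RecentItem *items, size_t n_items,
                                        const char *app, size_t global_limit,
                                        const RecentItem **out)
{
    size_t n = n_items < global_limit ? n_items : global_limit;
    return latest_items_for_app(items, n, app, n, out);
}
```

If the three newest items all come from Rhythmbox and the global limit is 3, the second function finds nothing for gedit, while the per-app query still returns gedit's most recent file.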

This naming is just awkward: 'add_create', 'add_access', 'add_leave' - confusing to put verbs next to each other like that. And I don't think we really want to introduce 'events' in this api at all. Events are something quite different in GTK+.

I also need to see self-contained documentation describing what these do, just referring to some Zeitgeist docs does not suffice.

Finally, I'd caution against an API that collects data without a way to remove the collected data.

Proposal from ebassi:

typedef enum {
CREATE, /* Document -> New */
OPEN, /* Document -> Open */
SAVE, /* Document -> Save, Save as? */
CLOSE /* Document -> Close */
} GtkRecentActivityType;
void gtk_recent_manager_add_activity (GtkRecentManager *, GtkRecentActivityType, GFile *, const char *display_name_or_null, GCancellable *, GError **);
int gtk_recent_activity_iter_init (GtkRecentActivityIter *, GtkRecentManager *, GList *mimetypes); /* for stuff from this app */
int gtk_recent_activity_iter_init_for_mimetypes (GtkRecentActivityIter *, GtkRecentManager *, GList *mimetypes); /* for stuff from anyone matching the mimetypes */
-- we don't use mime types: we use GContentType.
don't use GList, it's an awful data structure
you can use: GContentType **types, int *n_types
though I'd wager that people will use "text/*", "image/*", and something similar
why do you need the mime types there as well? just let the people check if it matches a content type during the iteration; it's not like it's going to be more efficient if it's done by the server or by the client app, and the client app encodes much more knowledge than the server can possibly do.
gboolean gtk_recent_activity_iter_next (GtkRecentActivityIter*, GFile*, GtkRecentInfo*);

I like the idea of using an enum for the event type ("interpretation" in Zeitgeist's parlance) instead of having a different function for each one. Maybe even macros with strings for extra extensibility.

Aren't we discussing the query API too much? This was supposed to be an easy API to put things into Zeitgeist. The query API is pretty much only for GTK+'s internals ("gimme the recent files for this File menu"); is there something that uses GtkRecentManager in a more exotic way? I.e. apps that need sophisticated queries are better off using the Zeitgeist API directly.

However, I do like the idea of a GFileEnumerator - probably makes things more consistent. Zeitgeist queries reply asynchronously but contain the whole reply in a single chunk; you don't get multiple callbacks - not sure how well the enumeration API would work with that scheme.

Matthias: I think all the stuff I wrote above explains the interpretation types; please tell me if it's not clear enough. They are about what happened to a certain file or URI.

From my perspective, GtkRecentManager is more or less a dead-end API in GTK+. It is designed for Win95-era File menus with a list of recent files; and these menus are not really something you expect to see in modern apps. So, I question somewhat the value of investing tons of effort into improving the recent manager implementation at this point. That being said, there's no harm in making GtkRecentManager write proper timestamps etc. But a zeitgeist dependency in GTK+ is not going to be acceptable. And a backend abstraction with loadable modules is even more effort/overhead.

Isn't DELETE missing from the activity types? Or are deletions not logged?

With regard to the lack of documentation for the proposed APIs, things that need to be documented include:

Are app authors expected to call this themselves, or does GTK+ call it behind the scenes? If so, where and when?

Are there consistency rules that have to be followed? E.g. can't close without a prior open.

What files are you expected to call this API for? (I guess ~/.myconfig doesn't qualify...)

I understand that there should be no Zeitgeist in GTK+; there is no place for it there. I also agree about the overhead of loadable modules. However, if the new GTK+ API can follow the scheme discussed above, it is easy for Zeitgeist to passively log any events inserted into GtkRecentManager. Deleting a file should be logged too. But moving a file around would then have to trigger it as well. I suggest not logging it.

The authors would have to call the methods themselves from within their application when a file is opened/modified/closed/created. You can't really enforce rules, because activities are sequential: at some point the file might be open, which would give it a greater open timestamp than close timestamp, and at another point it is the other way around. The only timestamp that needs to be smaller than all others is the "CREATED" timestamp.
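If that single rule were ever checked, it could look like the sketch below. The ActivityStamps type, the use of -1 for "never happened" and the function name are all assumptions made for illustration; the check also accepts a timestamp equal to the creation time.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical latest-known timestamps for one file; -1 means "never". */
typedef struct {
    int64_t created;
    int64_t opened;
    int64_t modified;
    int64_t closed;
} ActivityStamps;

/* The only ordering rule: nothing may be earlier than the creation time.
 * Open and close are deliberately not compared with each other. */
static bool activity_stamps_consistent(const ActivityStamps *s)
{
    if (s->created < 0)
        return true;   /* creation never logged, nothing to compare against */

    const int64_t others[] = { s->opened, s->modified, s->closed };
    for (size_t i = 0; i < sizeof others / sizeof others[0]; i++) {
        if (others[i] >= 0 && others[i] < s->created)
            return false;
    }
    return true;
}
```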

As for the files to which this would apply, the rule is "if you touched it, then it's logged". The UI can later decide whether it wants to display hidden files or not.