RDF Calendar Notes

Abstract

This is a rough, short writeup of some thoughts about use cases for RDF calendaring
systems. It discusses what we might use calendaring systems for in RDF, and raises some
general issues that need to be discussed.

Status of this Document

This is a draft!

Introduction

The ILRT uses a custom calendaring system based on an SQL database with only an HTML front
end. It does not allow syncing with PDAs or other calendar sources. In addition, one of us
(Dan) works with both W3C and ILRT and has a PDA, and so finds it time-consuming to work out
when he is supposed to be where. For these reasons we wanted to think about ways of merging
calendaring data from different sources. RDF was a good candidate for modelling calendar
events (cf. Connolly, Berners-Lee, ABC document).

We have built an implementation of a very simple system which merges RDF
data from different sources. This document is a rough outline of some of the discussions we
have had and the difficulties and solutions raised by the implementation.

1 Calendar Use Cases

1.1 Merging

This is the use case we implemented in Squish (described below). In this case we are simply
fetching all the information about events from different URLs, putting it all into one
queryable source and then making queries of it. There is no loss of data, and no source of
data supersedes another. There is no notion of identity of events.

This could be useful where access to calendar data is in different places or on different
devices, but is represented or representable in RDF. It is simple to implement (see
below).
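
As a rough illustration of merging in this sense, the sketch below pools triples from two
sources without any attempt to reconcile them. The triples, event names and predicate names
are made up for the example and are not taken from our data.

```python
# Hypothetical stand-ins for RDF already fetched from two different URLs.
# Each triple is (subject, predicate, object); every name here is illustrative.
work_calendar = [
    ("work:ev1", "label", "2000-10-02"),
    ("work:ev1", "description", "Project meeting"),
]
palm_calendar = [
    ("palm:ev7", "label", "2000-10-02"),
    ("palm:ev7", "description", "Project meeting"),
]


def merge_sources(*sources):
    """Pool the triples from every source, keeping each one exactly as given."""
    merged = []
    for source in sources:
        merged.extend(source)
    return merged


merged = merge_sources(work_calendar, palm_calendar)

# Both "Project meeting" events survive as separate subjects: nothing is lost,
# nothing is overridden, and nothing decides whether they are the same event.
subjects = {s for (s, p, o) in merged if p == "description"}
```

Because there is no notion of identity, the two entries above remain distinct even though a
person reading them would probably treat them as one meeting.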

Issues and difficulties

Where an event started and ended on different days, it was not found for every day it
covered. The query is a straightforward text match (and see metadata format below), so
without reasoning the calendaring system could not work out that the days in between the
start and the end were also included.
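
The problem can be sketched as follows, assuming typed dates are available (which they were
not in our demo). The function names are hypothetical and this is only one way of expressing
the missing inference.

```python
from datetime import date, timedelta


def days_covered(start, end):
    """Return every date from start to end, inclusive."""
    span = (end - start).days
    return [start + timedelta(days=offset) for offset in range(span + 1)]


def event_matches_day(start, end, day):
    """True if the event is in progress at any point on the given day."""
    return start <= day <= end


# A three-day event. A text match on its start and end labels alone
# would never find the day in the middle.
start, end = date(2000, 10, 2), date(2000, 10, 4)
text_labels = {start.isoformat(), end.isoformat()}
middle_day = date(2000, 10, 3)

found_by_text_match = middle_day.isoformat() in text_labels
found_by_range_check = event_matches_day(start, end, middle_day)
```

Here the text match misses the middle day while the range check finds it, which is the gap
described above.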

There is no 'order by' facility in the Squish query language we used, making display
cumbersome.

Large datasets (e.g. covering several weeks) were difficult to manage, so it was difficult
to have sufficient data to represent more than one day.

Password protection of RDF data URLs: SiRPAC, the RDF parser we used, cannot handle
password-protected sites, and so we had to prefetch the RDF content. This raised security
issues.

1.2 Syncing

This involves merging data from different sources, but implies that there is some mechanism
for reconciling conflicts between events; which in turn implies that there is a mechanism
for deciding when one event is the same as another.

Issues and difficulties

Syncing from diverse data sources is a difficult problem for several reasons.

What are the identity conditions for an event? - a very difficult question, philosophically
and practically.

Managing syncing in practice. Palms seem to manage syncing by specifying in the syncing
software whether the handheld should override the desktop or vice versa, implying a great
deal of hidden human control over the data. Where data is arriving from multiple sources
(3+) with multiple creators, it becomes more difficult to determine which data source is the
canonical one.

In general, the degree and granularity of human control of the syncing process needs to be
determined.

Proper syncing may also require reasoning (see below).

1.3 Reasoning

Lots of interesting calendar applications can't be made without something approaching
reasoning. For example, suppose you have two calendar files and you're trying to sync
them. For this you need to be able to say whether someone is busy or free within a certain
period of time. This means that you need to say something like: this person is free during a
period if no event starts within that period, and no event that started before the period is
still in progress when the period begins. The notions of 'before' and 'after' could, I
suppose, be built into a query language.
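
The busy/free condition amounts to an interval overlap test. A minimal sketch, assuming
events are available as (start, end) pairs of typed datetimes and treating an event that ends
exactly when the period begins as not conflicting (both assumptions are ours):

```python
from datetime import datetime


def overlaps(start_a, end_a, start_b, end_b):
    """True if the two half-open intervals [start, end) share any time."""
    return start_a < end_b and start_b < end_a


def is_free(events, start, end):
    """True if no (event_start, event_end) pair overlaps the period start..end."""
    return not any(
        overlaps(ev_start, ev_end, start, end) for ev_start, ev_end in events
    )


# Hypothetical data: one event from 10:00 to 11:00.
events = [(datetime(2000, 10, 2, 10, 0), datetime(2000, 10, 2, 11, 0))]

free_at_eleven = is_free(
    events, datetime(2000, 10, 2, 11, 0), datetime(2000, 10, 2, 12, 0)
)
free_at_half_ten = is_free(
    events, datetime(2000, 10, 2, 10, 30), datetime(2000, 10, 2, 11, 30)
)
```

Note that this only reports what the available data says; it does not address the open-world
issue raised below.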

Another potential place for reasoning is in the determination of whether some event is the
same event as another. It's not at all clear how to do this in a principled way, but you
could perhaps describe a heuristic for determining whether two events are the same: for
example, if they are co-located, the organizer is the same and most of the participants are
the same, or some such.
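
One possible reading of that heuristic is sketched below. The field names, the use of set
overlap and the 0.5 threshold for "most of the participants" are all assumptions chosen for
illustration, not a principled answer to the identity question.

```python
def probably_same_event(a, b, min_overlap=0.5):
    """Heuristic only: same location, same organizer, mostly the same people.

    Events are plain dicts with hypothetical keys 'location', 'organizer'
    and 'participants'. min_overlap is an arbitrary illustrative threshold.
    """
    if a["location"] != b["location"] or a["organizer"] != b["organizer"]:
        return False
    people_a, people_b = set(a["participants"]), set(b["participants"])
    everyone = people_a | people_b
    if not everyone:
        # No participant data on either side: fall back to location/organizer.
        return True
    shared = len(people_a & people_b) / len(everyone)
    return shared > min_overlap


# Hypothetical event descriptions from two sources.
from_work = {
    "location": "Bristol",
    "organizer": "libby",
    "participants": ["dan", "libby", "max"],
}
from_palm = {
    "location": "Bristol",
    "organizer": "libby",
    "participants": ["dan", "libby", "max", "jan"],
}
elsewhere = dict(from_palm, location="Boston")
```

With this data the first two descriptions are judged to be the same event (three of four
people shared), while the copy with a different location is not.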

This sort of use case would be good for when you don't want people to know what you are
doing, but do want them to know that you are busy. It would also be required for an
auto-scheduler.

Issues and difficulties

Not a closed world: the absence of an event in the data does not show that the person is
free.

Possible unarticulated assumptions about what 'not being busy' means: even if you are not
busy, you may not be available to take calls, not in the UK, only contactable by email, and
so on.

1.4 Auto-scheduler

I've given this a separate category, even though it is really an application of a reasoning
calendar, because it requires some more details. An auto-scheduler would try to schedule
meetings using RDF data about potential participants. It would need to know if they were
busy or free, in the right location or with sufficient time to get to the right location
for the requested type of meeting. It would need to calculate who was essential to the
meeting and who wasn't. It would need to know what formats were acceptable for the
meeting, what formats people were available in (IRC, email, phone, person), what the
defaults were for a person at any given time. It would need to know how to identify a
person and how to identify an event, and also have all the reasoning capacity about busy/free
periods described in the section above.

2 Other Issues

2.1 Privacy and Trust

We have implemented the aggregative calendar demo using data in the RDFWeb, bringing to our
attention the usual issues of privacy and trust. Since in RDFWeb anyone can say anything
about anyone, one could ascribe events to a person that they had no knowledge of
(particularly interesting once events are considered as potentially historical, see below).
This is a general problem with the trustworthiness of RDF data, and needs to be tackled as
part of the RDFWeb project.

Relatedly, there are privacy problems to do with personal safety and safety of property when
other people know where you are and what you are doing, especially in conjunction with other
information such as where you live and where you work. There needs to be control over the
input of data and the viewing and aggregation of data.

2.2 Merging with other information

Examples:
Events become history and may have ABC-related outputs (such as the documents produced as
the outcome of a meeting).
Events can generate automated travel-related information - your personal agent books you a
flight using RDFWeb data to determine your food/seating preferences.

2.3 Datatypes

In the example below, we matched labels on the events as a hack to circumvent the problem of
not having a date datatype. Similarly, without datatypes we cannot properly implement
'before' and 'after' questions.
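
To illustrate why plain labels are not enough for 'before' and 'after', the sketch below
compares two dates first as strings and then as typed values. The label format shown is
hypothetical and is not the one used in our data.

```python
from datetime import datetime

# Hypothetical date labels in day/month/year form (an assumption for this example).
earlier_label = "2/10/2000"    # 2 October 2000
later_label = "10/10/2000"     # 10 October 2000


def parse_label(label):
    """Turn a day/month/year label into a typed date."""
    return datetime.strptime(label, "%d/%m/%Y").date()


# Comparing the raw strings orders them character by character.
string_says_before = earlier_label < later_label

# Comparing typed dates orders them chronologically.
typed_says_before = parse_label(earlier_label) < parse_label(later_label)
```

The string comparison puts 2 October after 10 October, while the typed comparison gets the
order right.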

3 Potential Scenarios

I've got a work calendar (password protected) and a Palm, and I'd like to view the data in
the same place.

I'm going to a week long meeting, and I want to combine the schedule of the meeting with my
personal calendar.

I'm trying to schedule a meeting with three people without having complete information about
their calendar.

I'm trying to divide time spent on a project between various people.

I'm trying to schedule cinema trips with my friends.

I'm trying to divide up my TV watching, TiVO-style.

I'm trying to book plane tickets for someone else.

I want to find out who was at the meeting which produced the Dublin Core 15 elements, and
what documents were produced as a result of that meeting.

4 Squish Calendaring Demo

Squish is an RDF query language which uses an SQL-like syntax to navigate around an RDF graph
or graphs. There is currently an implementation in Java written by Libby. A Perl version is
planned. The query language is similar to R.V.Guha's RDF db query language.

Essentially it enables you to make a complex query of a graph rather than navigating using
an incremental, node-based or triple-based query interface. We have been using it for
various projects to rapidly prototype various sorts of RDF query applications using JSPs and
Tomcat.

Although you can use it with any RDF database which has a
triplesWhere(subject,predicate,object) or similar query interface, in practice it is too slow
to use with our SQL implementation. We also wanted to make sure that the calendar data
queried was up-to-date at the time of querying, so storage or caching were to be avoided. We
also wanted to query RDF data which was password protected, because of concerns about privacy
raised by members of the ILRT.
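
To give a flavour of the kind of interface meant here, the sketch below builds a tiny
in-memory store with a triplesWhere-style method and layers a two-pattern conjunctive query
on top of it. It is written in Python rather than Java for brevity, it is not the Squish
implementation, and the predicate names 'label' and 'description' are hypothetical.

```python
class TripleStore:
    """Minimal in-memory store of (subject, predicate, object) triples."""

    def __init__(self, triples):
        self.triples = list(triples)

    def triples_where(self, subject=None, predicate=None, obj=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        return [
            (s, p, o)
            for (s, p, o) in self.triples
            if subject in (None, s) and predicate in (None, p) and obj in (None, o)
        ]


def events_on(store, day_label):
    """Conjunction of two patterns: events with this label AND a description."""
    results = []
    for (event, _, _) in store.triples_where(predicate="label", obj=day_label):
        for (_, _, text) in store.triples_where(subject=event, predicate="description"):
            results.append((event, text))
    return results


# Hypothetical data: ev3 has a label but no description.
store = TripleStore([
    ("ev1", "label", "2000-10-02"),
    ("ev1", "description", "Project meeting"),
    ("ev2", "label", "2000-10-03"),
    ("ev2", "description", "Conference call"),
    ("ev3", "label", "2000-10-02"),
])

found = events_on(store, "2000-10-02")
```

Because both patterns must match, ev3 is silently dropped from the result. Each extra pattern
also means another pass over the store, which hints at why a query built from many
triplesWhere calls can become slow against a database.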

We therefore set up a JSP which downloaded the given URLs (supplied either as CGI parameters
or as specified in RDFWeb), prompted for username/password as required, and then queried the
data found using the following query:

The result was an unordered list of event descriptions and dates/times in a
java.sql.ResultSet table format. We matched these for each time and printed the result.
Events without a time had dummy times represented as '--:--', because Squish queries are
conjunctions, and so otherwise we would have needed to make two queries, one for events with
a time, one for other events.
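
Since the query language gave us no ordering, the rows had to be arranged after the query.
The sketch below shows one way this could be done, in Python rather than the JSP we used. The
row layout (date, time, description) and the choice to list untimed events first within each
day are assumptions for the example.

```python
DUMMY_TIME = "--:--"  # placeholder used for events that have no time


def order_rows(rows):
    """Sort (date, time, description) rows by date, then time.

    Rows carrying the dummy time are placed first within their day.
    Dates and times are assumed to be zero-padded strings so that
    string order matches chronological order.
    """
    return sorted(rows, key=lambda row: (row[0], row[1] != DUMMY_TIME, row[1]))


# Hypothetical unordered result rows.
rows = [
    ("2000-10-02", "14:00", "Review"),
    ("2000-10-02", DUMMY_TIME, "Holiday"),
    ("2000-10-02", "09:30", "Standup"),
    ("2000-10-01", "16:00", "Call"),
]

ordered = order_rows(rows)
```

The dummy time keeps every event in a single result set, and the sort key then decides where
untimed events appear in the display.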