Blog entries

During the last CubicWeb sprint, I was asked whether it was possible to customize
the search box that CubicWeb comes with. By default, you can use it
to type RQL queries, plain text queries or standard shortcuts
such as <EntityType> or <EntityType> <attrname> <value>.

Ultimately, all queries are translated to RQL since it's the only
language understood on the server (data) side. To transform the user
query into RQL, CubicWeb uses the so-called
magicsearch component, which in turn delegates to a number of
query preprocessors that are responsible for interpreting the
user query and generating the corresponding RQL.

The idea is simple: for each query processor, try to translate the
query. If it fails, try the next processor; if it succeeds,
we're done and the RQL query is executed.
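
The loop described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not CubicWeb's actual implementation: the translate_query function, the NoMatch exception and the two toy processors are all names made up for this example.

```python
class NoMatch(Exception):
    """Raised by a processor that cannot handle the query (hypothetical)."""


def entity_type_processor(uquery):
    # Toy shortcut: a single capitalized word is read as an entity type.
    if uquery.isalpha() and uquery[0].isupper():
        return 'Any X WHERE X is %s' % uquery, None
    raise NoMatch(uquery)


def full_text_processor(uquery):
    # Fallback: treat anything else as a plain text search.
    return 'Any X WHERE X has_text %(text)s', {'text': uquery}


def translate_query(uquery, processors):
    """Return (rql, args) from the first processor that succeeds."""
    for processor in processors:
        try:
            result = processor(uquery)
        except NoMatch:
            continue  # this processor failed, try the next one
        if result is not None:
            return result
    raise ValueError('no processor could translate %r' % uquery)


# Processors are listed in the order in which they should be tried.
PROCESSORS = [entity_type_processor, full_text_processor]

print(translate_query('Ticket', PROCESSORS))
print(translate_query('some words', PROCESSORS))
```

The first call is handled by the entity type shortcut; the second falls through to the plain text processor because the first one gives up on it.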

Now that the general mechanism is understood, here's an example
of code that could be used in a forge-based cube to add
a new search shortcut to find tickets. We'd like to use
the project_name:text syntax to search for tickets of
project_name containing text (e.g. pylint:warning).

Here's the corresponding preprocessor code:

    from cubicweb.web.views.magicsearch import BaseQueryProcessor

    class MyCustomQueryProcessor(BaseQueryProcessor):
        priority = 0  # controls order in which processors are tried

        def preprocess_query(self, uquery, req):
            """
            :param uquery: the query as sent by the browser
            :param req: the standard, omnipresent, cubicweb's req object
            """
            try:
                project_name, text = uquery.split(':')
            except ValueError:
                return None  # the shortcut doesn't apply
            return (u'Any T WHERE T is Ticket, T concerns P, P name %(p)s, '
                    u'T has_text %(t)s',
                    {'p': project_name, 't': text})

The code is rather self-explanatory, but here are a few additional comments:

the class is registered with the standard vregistry mechanism and should
be defined alongside the views

the priority attribute is used to sort and define the order
in which processors will be tried in the main processor loop

the preprocess_query method returns None or raises an exception if the
query can't be processed
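
To see what the shortcut produces, here is a small standalone sketch that copies the translation logic into a plain function, so that it runs without CubicWeb installed. The ticket_shortcut name is made up for this example; in a real cube the logic lives in preprocess_query as shown above.

```python
def ticket_shortcut(uquery):
    """Same translation logic as preprocess_query, without the base class."""
    try:
        project_name, text = uquery.split(':')
    except ValueError:
        # Raised when the query has no colon or more than one colon.
        return None  # the shortcut doesn't apply
    return ('Any T WHERE T is Ticket, T concerns P, P name %(p)s, '
            'T has_text %(t)s',
            {'p': project_name, 't': text})


print(ticket_shortcut('pylint:warning'))
print(ticket_shortcut('plain text search'))
print(ticket_shortcut('a:b:c'))
```

Note that the project name and the search text are returned as query arguments rather than pasted into the RQL string, and that a query with more than one colon is rejected because the split no longer yields exactly two parts.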

Drupal is a CMS written in PHP that is getting more and more visibility in the Semantic Web crowd. Several researchers from DERI have been using it as a test bed for their research projects and developed extensions to showcase their ideas. It is for example used to build the Semantic Web Dog Food site that archives the semantic web conferences and publishes them as Linked Open Data. The URL for this year's ISWC is http://data.semanticweb.org/conference/iswc/2009

This led me to read more about Drupal than I had previously had the incentive to. I have not had time to give it a try, but I skimmed the documentation and will try to compare it with CubicWeb from a software architecture point of view.

Drupal defines a Node as an information item. The CCK (aka Content Construction Kit) can be used to define new types of Nodes through a web interface. Nodes and the bits and pieces used to display them as HTML are not packed together in components. The Features extension is planning on getting these bits packaged.

If you are a Drupal user/developer and think I am not being fair to Drupal, please comment below.

On the other hand, CubicWeb implemented the concept of reusable components very early on. What is called a Node in Drupal is an Entity in CubicWeb. By design, CubicWeb does not have a web interface to define entities. The data model is part of the code. To efficiently maintain applications in production, changes to the data model must be tracked along with changes to the code. Data model changes imply migration procedures. In CubicWeb, all of this is versioned and made part of the components. Where Drupal needs to grow extensions like CCK and Features, CubicWeb has more advanced possibilities by design, for example the ability to develop full-featured applications by assembling components.

This was a very short comparison. I'm looking forward to getting a chance to discuss it with knowledgeable Drupal hackers.

Two days ago, the French government released thousands of data sets on http://data.gouv.fr/ under an open licensing scheme that allows people to access and play with them. Thanks to the CubicWeb semantic web framework, it took us only a couple of hours to put some of that open data to good use. Here is how we mapped the French railway system.

The Location object is used for both train stations and level crossings. It has a name (text information) and a latitude and a longitude (numeric information), and it can be linked to multiple FeatureType objects and to a DataGovSource. The FeatureType object is used to store the type of train station or level crossing and is defined by a name (text information). The DataGovSource object is defined by a name, a description and a uri used to link back to the source data on data.gouv.fr.
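
The shape of this data model can be sketched with plain Python dataclasses. This is not the CubicWeb schema used in the application, only an illustration of the three objects and their links; the attribute names source and feature_types and the sample values below are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataGovSource:
    name: str
    description: str
    uri: str  # links back to the source data on data.gouv.fr


@dataclass
class FeatureType:
    name: str  # the type of train station or level crossing


@dataclass
class Location:
    name: str
    latitude: float
    longitude: float
    source: DataGovSource
    # A location can be linked to several feature types.
    feature_types: List[FeatureType] = field(default_factory=list)


# Illustrative values only, not taken from the actual data sets.
source = DataGovSource(name='Sample data set',
                       description='Made-up example source',
                       uri='http://data.gouv.fr/')
station = Location(name='Sample station', latitude=48.0, longitude=2.0,
                   source=source,
                   feature_types=[FeatureType(name='train station')])
print(station.name, [ft.name for ft in station.feature_types])
```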

CubicWeb makes it possible to build complex applications by assembling existing components (called cubes). Here we used a cube that
wraps the Mapstraction and OpenLayers libraries to display information on maps using data from OpenStreetMap.

In order for the Location type defined in the data model to be displayable on a map, it is sufficient to write the following adapter: