Putting XML in LDAP with LDAPHttp

A software project is like a journey, and in this article I want to bring you along as a passenger. Foremost, I intend to describe the process of writing an application using my own LDAPHttp framework and gateway, a set of Java classes based on the Netscape/Mozilla LDAP SDK that provides a simple MVC abstraction for using directory database back ends through Java servlets. The suggested app involves reading news and weblog feeds to create new data, so I will also get to touch on parsing RSS. The actual functionality of this little example may seem limited (and the overall approach unorthodox), but I hope that by the time I'm done, the question of why will seem unimportant and the general idea of combining XML with LDAP-driven models will seem natural. This is not a how-to for LDAPHttp. A developer's guide will be forthcoming, I promise, but here I plan to skip the details in favor of the flavor.

The Briefest Introduction to Directories

The Lightweight Directory Access Protocol (LDAP) is an Internet standard for obtaining and manipulating data in directory databases over TCP/IP. LDAP descends from the X.500 standard, and an LDAP directory service is very different from a traditional RDBMS, yet it can perform many of the same tasks at comparable speeds, with lower complexity and overhead. Directories are predominantly used to centralize user-contact and account information, but can be distributed, replicated, and extended to satisfy a wide variety of needs. LDAP is defined by several public RFCs and is implemented in server products by many major vendors, including Sun, IBM, Oracle, Microsoft, and Novell. OpenLDAP is a free, open source directory client and server offering based historically on the code developed at the University of Michigan in the original LDAP project. Because of the open nature and maturity of the standard, hooks into LDAP are available in a plethora of operating systems, tools, and development languages. There are also lots of books on LDAP, including O'Reilly's own LDAP System Administration.

A central schema defines object classes and attributes for use in the associated directory database. The atomic unit of a directory, an entry, is essentially a uniquely identified collection of attributes, each of which may be assigned a value or values. Every entry participates in one or more object classes, which determine the attributes for which it must or may hold values. Directory databases are composed of entries organized in a hierarchical structure called a directory information tree (DIT). Figure 1 shows the portion of my DIT that pertains to this example:

Figure 1. An example DIT

The boxes represent both entries in the database and nodes in the tree. The labels are significant attribute/value pairs. Each entry is identified with a handle, called a distinguished name (or dn), which is built from its location in the DIT. For instance, the account for charlie in this tree is referenced by the dn uid=charlie,ou=Generic,ou=People,o=mentata.com. Entries can be described, imported, and exported in the LDAP Data Interchange Format (LDIF), which identifies the entry and then lists the pairs. Meet Charlie:

As you might guess, Charlie's game is journalism, not network security. This will be his project, so I'm giving Charlie (and thee) the keys to his own branch of entries in my DIT.

access to dn.subtree="ou=Reps,ou=Comments,ou=Expressions,o=mentata.com"
    by dn="uid=charlie,ou=Generic,ou=People,o=mentata.com" write
    by * read
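
To make the naming concrete, here is a minimal sketch of how a dn is assembled from an entry's place in the tree, and of what "inside the subtree" means for a rule like the one above. It uses the JDK's own javax.naming.ldap.LdapName class, not the Netscape/Mozilla SDK that LDAPHttp builds on, and the DnSketch class with its two helper methods is hypothetical. The directory server performs the real access check; this only illustrates the scope.

```java
import javax.naming.InvalidNameException;
import javax.naming.ldap.LdapName;
import javax.naming.ldap.Rdn;

// Hypothetical helper class, for illustration only.
class DnSketch {

    /** Builds a child dn by putting one attribute/value pair in front of the parent's dn. */
    static String childDn(String parentDn, String type, String value) {
        try {
            LdapName name = new LdapName(parentDn);
            // LdapName counts components from the root of the tree, so add()
            // appends the most specific component (leftmost in string form).
            name.add(new Rdn(type, value));
            return name.toString();
        } catch (InvalidNameException e) {
            throw new IllegalArgumentException("Not a valid dn: " + parentDn, e);
        }
    }

    /** True if entryDn is the base entry itself or lies anywhere beneath it. */
    static boolean inSubtree(String baseDn, String entryDn) {
        try {
            return new LdapName(entryDn).startsWith(new LdapName(baseDn));
        } catch (InvalidNameException e) {
            throw new IllegalArgumentException("Not a valid dn", e);
        }
    }

    public static void main(String[] args) {
        String reps = "ou=Reps,ou=Comments,ou=Expressions,o=mentata.com";
        String charlie = childDn("ou=Generic,ou=People,o=mentata.com", "uid", "charlie");
        System.out.println(charlie);
        // An entry under ou=Reps falls within the branch the rule covers.
        System.out.println(inSubtree(reps, "uid=CRA3," + reps));
        // Charlie's own account entry lives in a different branch.
        System.out.println(inSubtree(reps, charlie));
    }
}
```

Passing dns around as strings keeps the sketch short; a real application would more likely hold on to the parsed name objects.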

A Pundit's Plan

Charlie yearns to express himself, but wants to start small by writing editorial commentary on weblogs and news stories he finds on the Internet. Of course, he could post comments on all of these disparate sites, but Charlie wants his own empire, so he's decided to consolidate his work on his own presence with references back to the sources. His idea is to grab titles, links, and descriptions for selected items from news or weblog feeds, add his own two cents, and store the results as an entry in my directory. Somewhere between a reply and a rip, we'll call these entries Reps (RSS extracted postings?) and put them in the DIT under ou=Reps,ou=Comments,ou=Expressions,o=mentata.com.

Default schemas delivered with a directory server typically come from standards bodies, but this type of entry requires something unique. I happen to already have a custom OpenLDAP schema file that will do the trick, so we'll just reuse it and make a Rep just another expression. Once added, a Rep entry may look something like this:

dn: uid=CRA3,ou=Reps,ou=Comments,ou=Expressions,o=mentata.com
uid: CRA3
cn: India putting the boots to Microsoft.
businesscategory: ora
dnqualifier: 20030530161712Z
description: The Times of India is reporting: "President A P J
  Abdul Kalam on Wednesday urged Indian IT
  professionals to develop and specialise in open source code software
  rather than use proprietary solutions based on systems such as
  Microsoft Windows." Steve Mallett
link: http://www.oreillynet.com/pub/wlg/3248
content: India is smart about software quality, so this is
  quite an endorsement!
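
As a rough illustration of how feed fields and Charlie's remarks could come together in an entry like this one, the sketch below collects attribute/value pairs and prints them in LDIF form. The mapping of title to cn, and of the remarks to content, is inferred from the sample entry rather than taken from LDAPHttp, and the RepLdifSketch class and its methods are hypothetical. It omits businesscategory and dnqualifier, and it does none of the line folding or base64 encoding that a real LDIF writer needs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, for illustration only.
class RepLdifSketch {

    /** Adds one value to a possibly multi-valued attribute. */
    static void put(Map<String, List<String>> attrs, String name, String value) {
        attrs.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    /** Collects the pairs for a Rep; the attribute mapping is an assumption. */
    static Map<String, List<String>> repAttributes(String uid, String title, String link,
                                                    String description, String remarks) {
        Map<String, List<String>> attrs = new LinkedHashMap<>();
        put(attrs, "uid", uid);
        put(attrs, "cn", title);
        put(attrs, "description", description);
        put(attrs, "link", link);
        put(attrs, "content", remarks);
        return attrs;
    }

    /** Identifies the entry by its dn, then lists the pairs, one per line. */
    static String toLdif(String dn, Map<String, List<String>> attrs) {
        StringBuilder out = new StringBuilder("dn: ").append(dn).append('\n');
        for (Map.Entry<String, List<String>> attr : attrs.entrySet()) {
            for (String value : attr.getValue()) {
                out.append(attr.getKey()).append(": ").append(value).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, List<String>> attrs = repAttributes(
                "CRA3",
                "India putting the boots to Microsoft.",
                "http://www.oreillynet.com/pub/wlg/3248",
                "(description text taken from the feed item)",
                "India is smart about software quality, so this is quite an endorsement!");
        System.out.print(toLdif(
                "uid=CRA3,ou=Reps,ou=Comments,ou=Expressions,o=mentata.com", attrs));
    }
}
```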

Supporting the Gateway Theory

Charlie has no intention of hand-coding LDIF files any more than he wants to read RSS feeds himself, so some software is clearly required. Let's use mine :) The mentata.ldaphttp package defines a framework but doesn't deliver any functionality by itself. It includes a web gateway, which is composed of a second package, mentata.gateway, and a half-dozen servlets. Self-contained and simple, I like to consider the gateway an example LDAPHttp application that can solve many primitive directory needs, such as creating or searching entries. Rather than write a new application from scratch, we'll keep things easy and meet these requirements with the gateway.

The building blocks for LDAPHttp applications are contexts, packages of Java classes under mentata.ldaphttp in the class hierarchy visible to the JVM of the servlet container. I happened to be working on a context called forum for needs similar to Charlie's. Figure 2 contains a view of the relevant portion, where the necessary classes will reside:

Figure 2. My LDAPHttp forum package

A localContext class is required for all LDAPHttp context packages, and it's used to configure the controller with basic information about the target LDAP server and its DIT. Every other class in Figure 2 represents an LDAPHttp object, an abstraction for a type of entry. It's all about inheritance: a comment is a simple textual expression, while a solicitation would be a comment with a form for adding responses (more comments). Reps will therefore be solicitations that are created, in part, by reading an RSS feed. We will further subclass rep to define very small objects to represent the peculiarities of each source: in this case, news from The Register (registerrep), Lawrence Lessig's weblog (lessigrep), and the weblogs from O'Reilly Meerkat (orarep).
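
The inheritance chain just described can be pictured with a few plain Java classes. This is only a sketch of its shape: the class names are capitalized stand-ins for comment, solicitation, rep, and the three source-specific subclasses, and none of the constructors, fields, or methods (including sourceLabel) come from the actual LDAPHttp API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for the forum objects; not the real LDAPHttp classes.

/** A simple textual expression. */
class Comment {
    private final String text;
    Comment(String text) { this.text = text; }
    String text() { return text; }
}

/** A comment that can also collect responses, which are more comments. */
class Solicitation extends Comment {
    private final List<Comment> responses = new ArrayList<>();
    Solicitation(String text) { super(text); }
    void respond(Comment response) { responses.add(response); }
    List<Comment> responses() { return Collections.unmodifiableList(responses); }
}

/** A solicitation built partly from a feed item and partly from the author's remarks. */
abstract class Rep extends Solicitation {
    final String title;
    final String link;
    final String description;

    Rep(String title, String link, String description, String remarks) {
        super(remarks);
        this.title = title;
        this.link = link;
        this.description = description;
    }

    /** Each source-specific subclass supplies only what is peculiar to its feed. */
    abstract String sourceLabel();
}

class RegisterRep extends Rep {
    RegisterRep(String t, String l, String d, String r) { super(t, l, d, r); }
    String sourceLabel() { return "register"; }
}

class LessigRep extends Rep {
    LessigRep(String t, String l, String d, String r) { super(t, l, d, r); }
    String sourceLabel() { return "lessig"; }
}

class OraRep extends Rep {
    OraRep(String t, String l, String d, String r) { super(t, l, d, r); }
    String sourceLabel() { return "ora"; }
}

class ForumHierarchySketch {
    public static void main(String[] args) {
        Rep rep = new OraRep("India putting the boots to Microsoft.",
                "http://www.oreillynet.com/pub/wlg/3248",
                "(description text taken from the feed item)",
                "India is smart about software quality, so this is quite an endorsement!");
        rep.respond(new Comment("A reader's reply"));
        System.out.println(rep.sourceLabel() + ": " + rep.title
                + " (" + rep.responses().size() + " response)");
    }
}
```

Keeping the per-source subclasses this small means a new feed costs one short class, which is the point of pushing everything shared up into rep.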

The LDAPHttp gateway servlets all use a straightforward grammar for naming the context and object, as well as perhaps the identifier or attribute, germane to a request. Hence, the URL for the action in the HTML form to add a new entry with the create servlet is as simple as http://server/path/create/forum/orarep. The only other things we'll want to submit with the creation request are the ID for the particular reference article and Charlie's own remarks, of course. The rest will come right off of the Internet.
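
The naming grammar is easy to picture as a small parser. The sketch below splits the part of the URL that follows the gateway's base path into servlet, context, object, and an optional fourth segment for the identifier or attribute. The GatewayRequestSketch class is hypothetical, and how the real servlets read these pieces from a request is not covered here.

```java
// Hypothetical helper class, for illustration only.
class GatewayRequestSketch {
    final String servlet;
    final String context;
    final String object;
    final String extra; // identifier or attribute, or null when absent

    private GatewayRequestSketch(String servlet, String context, String object, String extra) {
        this.servlet = servlet;
        this.context = context;
        this.object = object;
        this.extra = extra;
    }

    /** Parses a path such as /create/forum/orarep (relative to the gateway's base path). */
    static GatewayRequestSketch parse(String path) {
        // Drop leading slashes so the first segment is the servlet name.
        String[] parts = path.replaceAll("^/+", "").split("/");
        if (parts.length < 3 || parts.length > 4) {
            throw new IllegalArgumentException(
                    "Expected servlet/context/object[/extra]: " + path);
        }
        for (String part : parts) {
            if (part.isEmpty()) {
                throw new IllegalArgumentException("Empty segment in: " + path);
            }
        }
        String extra = parts.length == 4 ? parts[3] : null;
        return new GatewayRequestSketch(parts[0], parts[1], parts[2], extra);
    }

    public static void main(String[] args) {
        GatewayRequestSketch request = parse("/create/forum/orarep");
        System.out.println("servlet=" + request.servlet
                + " context=" + request.context
                + " object=" + request.object);
    }
}
```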