An Introduction to LDAP

A year or two ago, it seemed like everyone had heard about LDAP, and quite
a few people were talking about it, but no one was really doing anything with
it.

That finally seems to be changing, which is especially good news for
administrators and developers. LDAP can play a vital role in networks of all
sizes, but like most new technology, it has suffered from a Catch-22: no one
uses it because it is not supported, and developers do not support it because
no one uses it.

To understand why and how LDAP is going to be such an important tool in the
life of a network administrator, you need to understand what problems LDAP was
developed to solve and how it solves them. That in turn means understanding
LDAP itself, both as a technology and as a tool.

Because the job of system administrator is so hard to separate from the job
of network administrator, and because the two overlap so often, especially in
smaller environments, I will generally refer to people in either role as
network administrators. I choose "network administrator" as the generic term
because a quality system or network administrator is concerned with the
entire network of devices or systems, not with individual nodes, and the term
places more stress on viewing the network as a whole.

In this introductory article, I hope to introduce LDAP and the concept
of online directories, and explain why you might want them and what you
can do with them. In later articles, I'll provide
a more in-depth technical explanation of how to use LDAP, along with some
example applications.

What is LDAP?

LDAP is the latest iteration in a rather lengthy development process beginning
with the X.500 directory specification and its corresponding Directory Access
Protocol (DAP) in the late 1980s and early 1990s. (For a more complete history
of LDAP, and more information in general, see "Understanding and Deploying
LDAP Directory Services" by T. Howes, M. Smith, and G. Good.)

DAP was a consistently difficult protocol to work with and implement, so
easier protocols were developed with most of its functionality but
significantly less complexity. Eventually, these protocols were passed on to
the IETF's OSI-DS working group and merged into the Lightweight Directory
Access Protocol (LDAP) specification, first published as RFC 1487 in 1993.
LDAP first gained widespread use in version 2, specified in RFC 1777.

LDAP is a protocol definition for accessing specialized databases called
directories. It is similar to SQL in that it defines a way of interacting
with databases without specifying a particular database. In fact, the
back-end for an LDAP directory is often a more general database, such as a
dbm-style store like LDBM or a relational database like Oracle.

Using LDAP to interact with a database does place constraints on that
database, both because of the assumptions the protocol makes and because a
directory has more specialized needs than a standard relational database. But
these constraints are necessary to gain all of the desired features of a
directory.

What is a directory, and why does my network need one?

Use of the word "directory" in this context may confuse people into thinking
that most networks currently don't make use of directories, or that LDAP will be
the only directory on the network. In actuality, directories are already a mainstay
of life, especially in the computer world. Most people are familiar with talking
about a phonebook or a map of the mall as a directory, but for some reason
insist upon using the term "database" in computing, even when directory would be
more specific and correct. As an example, I call Unix's system of storing user
information the "passwd database," but that database easily qualifies as a
directory.

At its most basic, a directory is any database specialized more for reading
than for writing: the phonebook only comes out once a year, the mall directory
only changes when stores change, and the passwd database is only updated when
user information changes -- but all of that information is read frequently.
The definition really does not get more specific than that, because so many
different information stores can qualify as directories, although generally
speaking a directory is much more likely to be searched than browsed. The
listing most often referred to as a directory in the technical industry, a
file system directory, also fits this definition: the directory is read
whenever a listed file is accessed in any way, but is only written when files
are created or destroyed. It is also far more likely to be read in a search
for a specific file than in a casual browse of the listing.
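
To make the passwd example concrete, here is a minimal sketch in Python that
treats passwd-style text as a directory: it is parsed once and then searched
by user name, never browsed. The sample records and the helper names
(parse_passwd, find_user) are made up for illustration; only the
colon-separated, seven-field layout comes from the standard Unix passwd
format.

```python
# Illustrative sample data in the usual passwd layout:
# name:password:UID:GID:GECOS:home:shell
SAMPLE_PASSWD = """\
root:x:0:0:root:/root:/bin/sh
alice:x:1001:100:Alice Example:/home/alice:/bin/sh
bob:x:1002:100:Bob Example:/home/bob:/bin/csh
"""

FIELDS = ("name", "password", "uid", "gid", "gecos", "home", "shell")


def parse_passwd(text):
    """Build a name -> record index from passwd-style text (hypothetical helper)."""
    directory = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(":")
        if len(parts) != len(FIELDS):
            continue  # skip malformed lines rather than guessing
        record = dict(zip(FIELDS, parts))
        directory[record["name"]] = record
    return directory


def find_user(directory, name):
    """Search for one entry by name; returns None if there is no such user."""
    return directory.get(name)


# Written rarely (parsed once here), read often (every lookup below).
directory = parse_passwd(SAMPLE_PASSWD)
user = find_user(directory, "alice")
print(user["home"])
```

The index is built once and consulted on every lookup, which is the
read-mostly, search-oriented access pattern described above.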

Almost anyone involved in the development or maintenance of networked
applications or services is already working with at least one directory: a
system for maintaining user information. Nearly all services require some
sort of authentication, which means each of them must maintain a user
directory. Because of this, the most common form of online directory is one
for user information. Directories are useful for a much larger variety of
information than that, though. For instance, Unix systems maintain
directories for services, groups, and many other data types; the Domain Name
System (DNS) is a very specialized global directory for host-name-to-address
correlation; and web directories used for navigating sets of web sites are
springing up everywhere.

If I'm already using directories, why switch to LDAP?

We have already seen that almost any network uses a variety of directories,
often for specific services. In database-speak, the directory data is not
normalized: many pieces of data are stored in more than one place, and thus
must be changed in more than one place when changes are necessary.

This is a problem for many reasons. The most obvious is that every time
any information in any of these directories gets changed, all of the other
directories must be hunted through to make that same change. This is not only
difficult, it is often completely unmanageable -- witness the ease with which
passwords for the same user in different services go out of sync.
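
The out-of-sync problem is easy to reproduce in miniature. The sketch below,
in Python, is purely illustrative: the two per-service stores, the shared
store, and the check_password helper are all hypothetical, and passwords are
kept in plain text only to keep the example short (a real service would store
hashes).

```python
# Hypothetical per-service user stores: each service keeps its own copy.
mail_users = {"alice": {"password": "old-secret"}}
ftp_users = {"alice": {"password": "old-secret"}}


def check_password(directory, user, password):
    """Return True if the given store has this user with this password."""
    entry = directory.get(user)
    return entry is not None and entry["password"] == password


# The password is changed in the mail service only; nobody remembers FTP.
mail_users["alice"]["password"] = "new-secret"
mail_accepts_new = check_password(mail_users, "alice", "new-secret")
ftp_accepts_new = check_password(ftp_users, "alice", "new-secret")
print("out of sync:", mail_accepts_new != ftp_accepts_new)

# Unified alternative: both services consult one shared directory, so a
# single change is visible to every service at once.
shared_directory = {"alice": {"password": "old-secret"}}
shared_directory["alice"]["password"] = "new-secret"
mail_ok = check_password(shared_directory, "alice", "new-secret")
ftp_ok = check_password(shared_directory, "alice", "new-secret")
print("consistent:", mail_ok and ftp_ok)
```

With separate stores, the update has to be repeated once per service; with
the shared store, there is only one copy to change.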

Beyond that, every time two services implement their own versions of the
same style of directory, there is significant redundant effort. Not only
did the developers for each service have to develop their own directory,
but now the managers of each service have to separately maintain the
directories -- a single user of both services will almost certainly have
a different user experience with each service, and it is nearly impossible
to centrally manage these multiple directories.

Security is an even worse problem. Probably every developer and administrator
is familiar with the headaches associated with user security. Are the
passwords secure? Is the transport secure? Has the user really proved his/her
identity? Did I accidentally leave a loophole to gain higher access? Will
the user directory always be available? When multiple services implement
separate directories for the same information, each of them must completely
cover the security issues on its own. Basic statistics virtually guarantee
that several separate directories will have more security holes than a single
unified one, and the services are also likely to have differing and even
conflicting security policies.

What you really want in a network is unification of directories, and this
is exactly what LDAP was designed for. With this unification, you get
data normalization, central management, consistent user experience,
consistent management and security policies, fewer security holes, and less
wasted development time.