LDAP for MySQL geeks

Authentication

Some tools (like slapadd against the config directory) rely on authentication provided by your operating system (Ubuntu for me, which I think is an odd duck for OpenLDAP).

OpenLDAP’s documentation is leery of Linux distributions. I can see why, in that Ubuntu has some quirks. You use sudo and let the OS do authentication. It also keeps most of the config in an LDAP directory, instead of a config file. Whatever.

Other tools will be used with the “directory manager” username and password. There is often a special rootdn user set up. The rootdn is magical and sits outside of the ACLs, kind of like the root user of MySQL, I guess.

The rootdn and the rootpw are set up specially. You can change these by changing the config and restarting the server. All other accounts are set up over LDAP. Again, like MySQL, but it takes a while to figure out.

Other tools, say ldapsearch, will be used as a user within the directory. These users are subject to Access Control Lists (ACL). Speaking of ACLs…

GRANT ALL ON db1.* TO 'homeslice'@'localhost';

ACLs are like MySQL GRANT… on a 5-hour energy drink!

There is a very rich DSL which lets you specify who has access to what. Access is very fine-grained. I hope to cover this in another blog post.

Primary Keys

Remember those gnarly-ass distinguishedNames (dn)? Think of these as primary keys.

This was a stumbling block, but the dn, base dn, and default scope are actually cool features.

When I first learned RDBMS systems and tried my hand at data modeling, the natural inclination was to use composite natural keys to make a primary key. In practice you often use artificial keys.

In a way, directory services went for composite natural primary keys, which are situated hierarchically. This is actually way more humane.

So a newbie in RDBMS will look for two probably unique attributes and combine them… I’ll do John Smith and concatenate their phone number, that should cover it. id=Fireman_JohnSmith

It’s a concatenation of how you get to that element in a hierarchical db.

You can use an arbitrary incrementing id. You could use uid=111 instead of cn=John Smith. The contrived example above just shows how natural keys are possible.
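To make the "dn as a hierarchical composite key" idea concrete, here is a small Python sketch that pulls a dn apart. The dn itself (ou=Firemen, dc=example, dc=com) is made up for illustration, and the parsing is deliberately naive: it assumes no escaped commas and no multi-valued RDNs, which real dns can contain.

```python
def split_dn(dn: str) -> list[tuple[str, str]]:
    """Split a dn into (attribute, value) pairs, most specific first.

    Simplification: assumes no escaped commas and no multi-valued RDNs.
    """
    rdns = []
    for part in dn.split(","):
        attr, _, value = part.strip().partition("=")
        rdns.append((attr, value))
    return rdns


def parent_dn(dn: str) -> str:
    """Drop the leftmost component to get the dn of the parent entry."""
    _, _, rest = dn.partition(",")
    return rest.strip()


# Hypothetical entry: the "key" is the path from the leaf up to the root.
dn = "cn=John Smith,ou=Firemen,dc=example,dc=com"
print(split_dn(dn))
print(parent_dn(dn))
```

Reading the dn right to left walks you down the tree, which is why it behaves like a composite natural key whose parts happen to be the entry's ancestors.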

mysqldump

slapcat is like mysqldump. It is low level and operates on the data directly. You can safely use this while slapd is online, at least with the bdb and hdb backends.

You can also craft ldapsearch queries to dump data, but this is slower and less complete.

How do I bulk load data?

LDIF is a data serialization format used throughout the command line tools. It is a bit like JSON or CSV dumps from MySQL… pretty cool.

ldapadd or slapadd can be used to bulk load LDIF data. slapadd is faster and operates directly on the datastore, but you must stop your server. ldapadd goes over the LDAP protocol and is safer, but slower.

Writing LDIF by hand? Beware – the LDIF parser (or standard?) totally blows. Whitespace is significant. Pythonistas rejoice, but the rules are actually unexpected. I mumble and make sure to put whitespace very carefully.
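If you would rather not place the whitespace by hand, generating LDIF from a script helps. Below is a rough Python sketch of the rules that bit me, as I understand them from the LDIF spec (RFC 2849): a long line is folded by starting the continuation line with exactly one space, a value that starts with a space, colon, or "<" (or ends with a space, or contains non-ASCII) is base64 encoded after a double colon, and a blank line ends a record. The function names and the sample entry are my own inventions, not part of any OpenLDAP tool.

```python
import base64


def needs_base64(value: str) -> bool:
    """True if the value cannot be written as a plain 'attr: value' line."""
    if not value:
        return False
    if value[0] in " :<" or value[-1] == " ":
        return True
    return any(ord(ch) > 127 or ch in "\0\r\n" for ch in value)


def fold(line: str, width: int = 76) -> str:
    """Fold a long line; every continuation line starts with ONE space."""
    chunks = [line[:width]]
    rest = line[width:]
    while rest:
        chunks.append(" " + rest[: width - 1])
        rest = rest[width - 1:]
    return "\n".join(chunks)


def attr_line(name: str, value: str) -> str:
    """Render one attribute, base64 encoding the value when required."""
    if needs_base64(value):
        encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
        return fold(f"{name}:: {encoded}")
    return fold(f"{name}: {value}")


def ldif_entry(dn: str, attrs: list[tuple[str, str]]) -> str:
    """Render one record; the trailing blank line terminates it."""
    lines = [attr_line("dn", dn)]
    lines += [attr_line(name, value) for name, value in attrs]
    return "\n".join(lines) + "\n\n"


# Hypothetical entry, just to exercise the rules above.
entry = ldif_entry(
    "cn=John Smith,ou=People,dc=example,dc=com",
    [
        ("objectClass", "person"),
        ("cn", "John Smith"),
        ("sn", "Smith"),
        ("description", " starts with a space"),
    ],
)
print(entry, end="")
```

The output can be fed to ldapadd or slapadd like any hand-written LDIF. This is a sketch of the serialization rules only; it does not validate the entry against your schema.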

InnoDB

Just as you can configure the backend store of MySQL, in slapd the backend is configurable. Typically data is stored in multiple Berkeley DB files. There are bdb and hdb flavors.

Hosting

A single MySQL install can host multiple databases. A single LDAP directory server can store multiple directory trees.

Schema

However, schema information is global and bleeds into different directory trees. This seems like a pre-web version of distributed systems. Ouch.

In RDBMS systems, when you do DDL actions they are sandboxed to the current database. This is not the case with OpenLDAP. You define a foo attribute, it is global. You define a bar objectclass in directory A, yep… it bleeds into directory B.

So this can be confusing when writing installation instructions. This is neither very agile nor sane. It’s like Dewey Decimal made it into the information age.

Would you like to create a new objectclass or attributetype? You’ll need to register an OID with the central authorities. Please mail one SASE to … As we learned in Everything Is Miscellaneous, this is horribly antiquated.

Your OID, which must be globally unique, will serve as your base OID and you’ll add more numbers to it to get a globally unique object identifier per attributetype or objectclass.
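Here is a tiny Python sketch of how I picture that numbering. Everything specific in it is an assumption: 99999 stands in for whatever enterprise number you are actually assigned under the 1.3.6.1.4.1 private enterprise arc, and the split into arc 1 for attributetypes and arc 2 for objectclasses is just one tidy convention, not a requirement.

```python
# Hypothetical base OID; replace 99999 with your assigned enterprise number.
BASE_OID = "1.3.6.1.4.1.99999"

# One possible convention for carving up your own arc (an assumption).
ATTRIBUTE_ARC = 1
OBJECTCLASS_ARC = 2


def make_oid(base: str, *arcs: int) -> str:
    """Append numeric arcs to a base OID, e.g. base.1.1."""
    for arc in arcs:
        if arc < 0:
            raise ValueError("OID arcs must be non-negative integers")
    return ".".join([base, *(str(arc) for arc in arcs)])


# Friendly name -> OID, for the foo attribute and bar objectclass.
schema_oids = {
    "foo": make_oid(BASE_OID, ATTRIBUTE_ARC, 1),
    "bar": make_oid(BASE_OID, OBJECTCLASS_ARC, 1),
}

for name, oid in schema_oids.items():
    print(name, oid)
```

Keeping a table like this in version control is a cheap way to make sure you never hand out the same number twice.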

You don’t have to look at OIDs very often. You can alias them to friendly names. But be aware of them.

Foreign Keys

MySQL has foreign key references. You can do the same thing by using attributes which are distinguishedName references.

You can use a dn for a value. Common attributetypes for this are the seeAlso or member attributes. This is super cool and like a foreign key or symbolic link.

By default LDAP doesn’t enforce referential integrity: you can add a dn reference to an entry that doesn’t exist, and deleting a record doesn’t purge the dn references that point to it.

There is a RefInt overlay available for providing referential integrity. Overlays are like extensions, and there are several available to add services in a performance-sensitive manner.

People are pretty comfortable enforcing referential integrity in the application or another layer these days, so it’s all good.
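As a sketch of what enforcing it in the application can look like, here is a Python toy that scans for dangling dn references. The directory is just an in-memory dict with made-up entries, not a real LDAP connection, and dns are compared as exact strings, which is stricter than the matching a real server does.

```python
# Toy in-memory "directory": dn -> attributes. All entries are made up.
directory = {
    "cn=John Smith,ou=People,dc=example,dc=com": {"cn": ["John Smith"]},
    "cn=Firemen,ou=Groups,dc=example,dc=com": {
        "cn": ["Firemen"],
        "member": [
            "cn=John Smith,ou=People,dc=example,dc=com",
            "cn=Nobody Here,ou=People,dc=example,dc=com",  # never existed
        ],
    },
}

# Attributes whose values are dns, i.e. our "foreign keys".
DN_VALUED = ("member", "seeAlso")


def dangling_refs(entries: dict) -> list[tuple[str, str, str]]:
    """Return (entry dn, attribute, target dn) for every broken reference."""
    missing = []
    for dn, attrs in entries.items():
        for name in DN_VALUED:
            for target in attrs.get(name, []):
                if target not in entries:
                    missing.append((dn, name, target))
    return missing


# Nothing stopped us from adding a reference to a dn that doesn't exist.
before = dangling_refs(directory)

# Deleting a record leaves the references to it behind.
del directory["cn=John Smith,ou=People,dc=example,dc=com"]
after = dangling_refs(directory)

print(len(before), "dangling before delete,", len(after), "after")
```

Against a live server you would do the same scan with search results instead of a dict, or let the RefInt overlay clean up on delete.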

Schema Migration

A pain point with RDBMS and web applications is schema evolution. Rolling out schema migrations to big data systems is a PITA. NoSQL databases are a current topic for many reasons, but this is one of the drivers.

An LDAP directory’s schema is even more rigid than an RDBMS schema. Reading the literature, one gets a sense that you should design it right the first time. In practice, it’s not that type of party.

Wanna change something? I haven’t found an easy-to-use DDL. You have to use ldapmodify and a DSL to remove attributetypes, then re-add them, etc. Remember, this affects every directory running under slapd. I also got a lot of errors, but maybe I fat-fingered something.

I imagine it is possible that using good Emergent Design methodology and auxiliary types might combat this issue. Following the open/closed principle and such. Good luck with that one on real world teams.

I’d advocate keeping the LDAP layer as thin as possible and using it only when appropriate. Data storage can be augmented with web services, RDBMS, and NoSQL backends.

Brutal Workflow

Please let me know of something that works better on Ubuntu’s OpenLDAP, but here is what I do to rapidly iterate a schema design:

The key is nuking all OpenLDAP config files as well as the low level bdb files.

DB client

I use something like DBVisualizer when working on relational databases. The equivalent is Apache Directory Studio. This app is great for poking around and learning LDAP concepts. It’s easier to use once you understand how directories work.

Conclusion

That should be enough to get the MySQL geek going on next steps with slapd.