Recently a SE Asian software development company was building an AD-integrated application for a customer, and they were running into some problems with manipulating group memberships. They had gone back and forth with the customer several times and after making no progress (and pissing off the customer), the software company concluded they needed to set up an AD test environment that mirrored the customer’s AD. All they had was an unhappy customer, a looming deadline, and an LDIF file. Lacking any AD experience, they asked me to help out.

Building AD test environments is a problem near and dear to my heart, going back to the first AD Disaster Recovery Lab Guido, Ulf, Jorge and I put together at Directory Experts Conference. Even though we eventually sorted out the mechanics of delivering the hands-on lab, the process of creating the AD environment for the lab was never straightforward and required a lot more hand-tweaking than I thought it should. Creating an AD test environment modelled on a production environment seemed like another flavour of the same problem. I’d been looking for an excuse to dig into PowerShell and the Hyper-V cmdlets, and my Filipino colleagues had provided just the spark I needed.

The tooling for building virtual environments has improved dramatically since that first DEC AD DR lab. Hyper-V has grown into a strong virtualization platform, PowerShell has become an extremely capable scripting language, and AD itself has added several improvements (e.g. to DCPROMO) to make it easier to automate. All the tools one would need to completely automate building an AD test environment seem to be in place.

The broad technical goals for my little project are as follows:

Accept LDIF files from the Config, Schema, and one or more domain NCs

Provision Hyper-V VMs for the domain controllers defined in the source Config NC. This has to be configurable to avoid creating too many DCs.

Run DCPROMO to create the domain controllers based on source domain NC information

Extend the schema based on the LDIF from the source Schema NC

Abstract, de-personalize, and load the source domain data, including users, groups, OUs, and group policy information

Do all of this in an entirely hands-off manner

The result should provide a mechanism where you can take LDIF files from a customer and produce a working Active Directory environment that mirrors the customer’s environment in all of the important aspects.

In this first post I’ll cover parsing LDIF files with PowerShell to produce something useful for automation. Subsequent posts will describe some of the new SYSPREP capabilities (needed to efficiently clone Hyper-V images), using remote PowerShell with VMs, the new DCPROMO PowerShell cmdlets, extending the schema, using PowerShell to create AD objects, and exporting and importing group policy with PowerShell.

TL;DR: Get the Bits

The Get-LDIFRecords cmdlet and the LDIFDistinguishedName class are packaged up in the LDIF PowerShell module. You can download the module at https://github.com/GilKirkpatrick/LDIFPowerShell. If you’re not looking to contribute to the source code, just click the “Download ZIP” button on the main project page. Copy the Module folder to wherever you keep your installable PowerShell modules, and use the Import-Module command to import the module into your PowerShell environment.

LDAP Data Interchange Format (LDIF)

LDIF (LDAP Data Interchange Format) is a text file format for representing LDAP directory entries and LDAP operations. It is defined in RFC 2849, and Microsoft provides the LDIFDE command-line program to manipulate LDIF files. Linux and Unix LDAP systems usually provide programs called ldapsearch, ldapmodify, and ldapdelete that do roughly the same thing.

LDIF is a straightforward format, but it is annoying to parse using a line-oriented scripting language. Each record in an LDIF file can be either an object record representing an entire directory entry, or an operation record representing an LDAP operation and its parameters. Here’s a sample taken from an LDIF dump of the Config NC (with some information removed for brevity):

Each LDIF record consists of multiple lines, each of the form <attribute name>: <value>.

The first line of each record is dn: <distinguished name>. It identifies the object, or the target of the operation.

Each record is terminated by a blank line or the end of file.

Lines that exceed 80 characters wrap and the subsequent line fragments each start with a space.

Multiple values of an attribute appear each on their own line.

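Binary values, and string values that cannot be written safely as plain text (for example those containing non-ASCII characters), use a double colon “::” and are encoded in Base-64.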
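The rules above can be sketched as a small parser. This is only an illustration of the format rules, not the code behind Get-LDIFRecords: the function name ConvertFrom-LdifText and the sample record are made up for the example, Base-64 values are simply decoded as UTF-8 text, and the dn is kept as a plain string.

```powershell
function ConvertFrom-LdifText {
    param([string[]]$Lines)

    # Pass 1: unfold wrapped lines. A line starting with a space continues the previous one.
    $logical = @()
    foreach ($line in $Lines) {
        if ($line.StartsWith(' ') -and $logical.Count -gt 0) {
            $logical[-1] = $logical[-1] + $line.Substring(1)
        } else {
            $logical += $line
        }
    }
    $logical += ''   # sentinel blank line so the final record is flushed

    # Pass 2: collect "name: value" lines into one hashtable per record.
    $record = @{}
    foreach ($line in $logical) {
        if ($line -eq '') {
            if ($record.Count -gt 0) { $record }   # a blank line ends the record
            $record = @{}
        } elseif ($line.StartsWith('#')) {
            continue                                # comment line
        } elseif ($line -match '^([^:]+)(::?) *(.*)$') {
            $name, $value = $Matches[1], $Matches[3]
            if ($Matches[2] -eq '::') {
                # Base-64 value, decoded here as UTF-8 text. Truly binary
                # attributes (e.g. objectGUID) would need the raw bytes instead.
                $value = [Text.Encoding]::UTF8.GetString([Convert]::FromBase64String($value))
            }
            if (-not $record.ContainsKey($name)) { $record[$name] = @() }
            $record[$name] += $value               # every attribute is stored as an array
        }
    }
}

# Hypothetical sample: two records, one wrapped line, one Base-64 value.
$sample = @'
dn: CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=example,DC=com
objectClass: top
objectClass: site
description:: SGVsbG8=
cn: Default-
 First-Site-Name

dn: CN=Sites,CN=Configuration,DC=example,DC=com
objectClass: top
objectClass: sitesContainer
'@ -split '\r?\n'

$records = @(ConvertFrom-LdifText -Lines $sample)
$records.Count                        # 2
$records[0].cn                        # Default-First-Site-Name
$records[0].objectClass -join ', '    # top, site
```

Unfolding first and grouping second keeps each pass trivial, which is most of what makes LDIF awkward in a line-at-a-time script.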

The distinguished name of each object is a string, and for this project I need to handle DNs not as strings but as objects in their own right with some special capabilities. In particular, I need to be able to get the RDN (relative distinguished name) and parent DN from a DN, and get the number of segments in a DN (you’ll see why later). Although it isn’t difficult to write this sort of code in PowerShell, I opted to write a module implementing an LDIFDistinguishedName class in C# just because it seemed easier.

Parsing LDIF Files with the Get-LDIFRecords Cmdlet

The Get-LDIFRecords cmdlet parses LDIF files and presents the LDIF records on the PowerShell pipeline so they can be processed by the usual PowerShell functions like Where and Select. Here’s the documentation for Get-LDIFRecords:

SYNOPSIS

Parses an LDIF file and produces a set of hashes on the pipeline corresponding to the LDIF records.

SYNTAX

DESCRIPTION

Get-LDIFRecords parses an LDIF file and produces a set of hashes in the PowerShell pipeline corresponding to the LDIF records in the file. Each record is a PowerShell hash containing name-value pairs corresponding to the attributes of the LDIF record. The value of each attribute is provided as an array of strings, one array entry per attribute value. The dn: (distinguished name) attribute is provided as an LDIFDistinguishedName object.

Note 1: Even attributes that have only a single value are provided as arrays, unless you use the AsScalar parameter.

Note 2: The dn entry is an object of class LDIFDistinguishedName, and not a string. The distinguishedName attribute however is an array containing a single string.

Note 3: The objectGUID entry is an object of class System.Guid.

PARAMETERS

-InputFile <String>

The name of the input LDIF file to process.

-AsScalar <String[]>

Because Get-LDIFRecords does not know a priori whether an attribute is single or multi-valued, it provides each attribute as an array of strings. This can be inconvenient in some cases where you know the attribute is defined in the schema as single valued. The AsScalar parameter is an array of attribute names that Get-LDIFRecords will process as single valued. If an attribute specified by AsScalar actually has multiple values in the LDIF file, Get-LDIFRecords will provide only the last value encountered in the LDIF file.
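To make the “last value wins” rule concrete, here is a minimal sketch of the same idea applied to a single record after parsing. The helper name ConvertTo-ScalarAttribute and the sample record are hypothetical; the cmdlet itself presumably does this while reading the file.

```powershell
# Hypothetical helper: collapse the named attributes of one record to scalars.
function ConvertTo-ScalarAttribute {
    param([hashtable]$Record, [string[]]$AsScalar)

    $result = @{}
    foreach ($key in $Record.Keys) {
        if ($AsScalar -contains $key) {
            $result[$key] = @($Record[$key])[-1]   # keep only the last value
        } else {
            $result[$key] = $Record[$key]          # leave the array untouched
        }
    }
    $result
}

# Made-up record in the shape described above: every attribute is an array.
$record = @{ sn = @('Smith'); mail = @('a@example.com', 'b@example.com') }
$flat = ConvertTo-ScalarAttribute -Record $record -AsScalar 'sn', 'mail'

$flat.sn      # Smith
$flat.mail    # b@example.com
```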

Using Get-LDIFRecords

We can parse LDIF files… so what? Well, because Get-LDIFRecords puts generic hash tables on the pipeline, we can use all the compositional goodness of PowerShell to do some cool things. For instance, let’s say we just want the user objects from our LDIF file.

Get-LDIFRecords domain.ldif | Where {$_.objectClass -eq 'user' }

Or let’s say you just want everyone’s first and last name, sorted by last name.
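One way this could look is sketched below. So that it runs without the module, the $records array is a hand-built stand-in for the output of Get-LDIFRecords domain.ldif (hashes whose values are arrays, as described above), and the names in it are invented. Projecting each hash into a [pscustomobject] is my choice here so that Sort-Object has real properties to work with.

```powershell
# Stand-in for: $records = Get-LDIFRecords domain.ldif
$records = @(
    @{ objectClass = @('top', 'person', 'organizationalPerson', 'user'); givenName = @('Roger'); sn = @('Smith') },
    @{ objectClass = @('top', 'person', 'organizationalPerson', 'user'); givenName = @('Ann');   sn = @('Jones') },
    @{ objectClass = @('top', 'organizationalUnit'); ou = @('Accounting') }
)

$names = $records |
    Where-Object { $_.objectClass -contains 'user' } |
    ForEach-Object {
        # Attribute values are arrays, so take the first (only) value of each.
        [pscustomobject]@{ GivenName = $_.givenName[0]; Surname = $_.sn[0] }
    } |
    Sort-Object Surname

$names    # Ann Jones first, then Roger Smith
```

With the real cmdlet, passing givenName and sn to the AsScalar parameter should remove the need for the [0] indexing.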

The possibilities are endless. By combining Get-LDIFRecords and the other PowerShell pipeline functions, you can get nearly the same flexibility as querying Active Directory directly with LDAP. You can see how this capability will play into building our AD test environment.

The LDIFDistinguishedName Class

This is a good place to talk about the LDIFDistinguishedName class that I use for the dn: value of each LDIF record. The LDIFDistinguishedName class provides methods to get at the components of an Active Directory distinguished name. Each DN consists of two parts: the Relative Distinguished Name (RDN) that identifies the object within its parent container, and the parent DN. For instance, the DN for a user might be something like this:

CN=Smith\, Roger,CN=Users,OU=Accounting,OU=Corp,DC=Megaco,DC=com

In this case the RDN is “CN=Smith\, Roger” (the comma inside the name is escaped with a backslash so it is not mistaken for a separator), and the parent DN is “CN=Users,OU=Accounting,OU=Corp,DC=Megaco,DC=com”. The LDIFDistinguishedName class provides methods to extract the RDN and a Parent method that returns another LDIFDistinguishedName object corresponding to the container’s DN. The LDIFDistinguishedName class includes several other methods for dealing with distinguished names.

LDIFDistinguishedName Members

DN

Returns the distinguished name as a string

RDN

Returns the Relative Distinguished Name of the distinguished name

ParentHierarchy

Returns an array of LDIFDistinguishedName objects representing the parent hierarchy of the distinguished name

Parent

Returns an LDIFDistinguishedName object representing the parent of the distinguished name

NameType

Returns the type (attribute) of the RDN. If the DN is “cn=foo,ou=bar”, NameType returns “cn”.

Name

Returns the name (value) of the RDN. If the DN is “cn=foo,ou=bar”, Name returns “foo”.

Depth

Returns an integer representing the depth in the naming hierarchy of the distinguished name. If the DN is “cn=foo,ou=bar”, Depth returns 2.
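To show what these members compute, here is a rough PowerShell approximation. It is not the C# class: Get-DnParts is a hypothetical helper, Parent comes back as a plain string rather than another object, and the split only handles commas escaped with a single backslash (it ignores quoted values and escaped backslashes).

```powershell
# Hypothetical helper that mimics a few LDIFDistinguishedName members.
function Get-DnParts {
    param([string]$DN)

    # Split on commas that are not preceded by a backslash.
    $segments = @($DN -split '(?<!\\),')
    # The first segment is the RDN; split it once into type and value.
    $type, $name = $segments[0] -split '=', 2

    [pscustomobject]@{
        DN       = $DN
        RDN      = $segments[0]
        Parent   = ($segments | Select-Object -Skip 1) -join ','
        NameType = $type
        Name     = $name
        Depth    = $segments.Count
    }
}

$dn = Get-DnParts 'CN=Smith\, Roger,CN=Users,OU=Accounting,OU=Corp,DC=Megaco,DC=com'
$dn.RDN       # CN=Smith\, Roger
$dn.Parent    # CN=Users,OU=Accounting,OU=Corp,DC=Megaco,DC=com
$dn.NameType  # CN
$dn.Depth     # 6
```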

The LDIFDistinguishedName class makes several aspects of building an AD test environment simpler. For instance, let’s say we want to build the OU and container hierarchy from the contents of an LDIF file. The problem you immediately run into is that you can’t guarantee the ordering of the records in the LDIF file, which means you might be in a position of trying to create an object before its parent container has been created. The usual solution is to make multiple passes over the set of containers until you get all of them created. The LDIFDistinguishedName class simplifies this problem:

The Where statement selects all of the LDIF records corresponding to containers and organizationalUnits. The Select statement is interesting: it selects the objectClass and dn properties, and adds a calculated property called “depth”, which is the depth in the hierarchy of the distinguished name. The Sort statement sorts the pipeline by the depth property so that the higher-level containers appear first. Finally, the ForEach-Object statement creates the container objects from the top level on down, using the objectClass, dn.Name, and dn.Parent properties from the pipeline. Note that because objectClass is an array, I reference the last element of the array using $_.objectClass[-1]. This should be the most specialized class, e.g. “organizationalUnit”. Although the ordering of the values of a multi-valued attribute is undefined by the LDAP specification, in my experience Active Directory always presents the values of the objectClass attribute from least-specialized (e.g. “top”) to most-specialized.
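The depth-sorting idea can be sketched without the module as follows. Everything here is an assumption made for illustration: the records are hand-built stand-ins with string DNs, the depth is computed with a simple split instead of the Depth member, and the last step only prints the New-ADObject commands it would run rather than executing them.

```powershell
# Stand-in records, deliberately out of order (a child OU comes before its parent).
$records = @(
    @{ dn = 'OU=Payroll,OU=Accounting,DC=Megaco,DC=com'; objectClass = @('top', 'organizationalUnit') },
    @{ dn = 'CN=Users,DC=Megaco,DC=com';                 objectClass = @('top', 'container') },
    @{ dn = 'OU=Accounting,DC=Megaco,DC=com';            objectClass = @('top', 'organizationalUnit') },
    @{ dn = 'CN=Roger,CN=Users,DC=Megaco,DC=com';        objectClass = @('top', 'person', 'organizationalPerson', 'user') }
)

$plan = $records |
    Where-Object { $_.objectClass[-1] -in 'container', 'organizationalUnit' } |
    ForEach-Object {
        # Split on unescaped commas; the segment count is the depth.
        $segments = @($_.dn -split '(?<!\\),')
        [pscustomobject]@{
            Type  = $_.objectClass[-1]
            Name  = ($segments[0] -split '=', 2)[1]
            Path  = ($segments | Select-Object -Skip 1) -join ','
            Depth = $segments.Count
        }
    } |
    Sort-Object Depth    # shallowest first, so parents precede their children

# Print the commands instead of running them against a directory.
$plan | ForEach-Object { "New-ADObject -Type $($_.Type) -Name '$($_.Name)' -Path '$($_.Path)'" }
```

Because a parent always has a smaller depth than its children, a single pass in depth order is enough and the multi-pass retry loop goes away.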