Sometimes I can be a bit short-sighted. AdFind and LDAP_SEARCH_S: Size Limit Exceeded

Yesterday I received an email from one of the faithful joeware customers about an issue that confused me when I first started reading. As I continued reading, I was reminded of my earlier, eviler, more maverick days… The days when men were men and sheep were scared, at least in New Zealand and parts of Australia and the Netherlands. Of course we are talking about the early years of Windows 2000… But let’s have Art explain a little about what he saw…

…we discovered an interesting issue with ADFIND last week in our SYS AD forest (systems testing). Turns out ADFIND and a number of our scripts stopped working for some reason. We are able to run the tool in two other forests that mirror SYS and ADFIND runs with no issues. When we attempt to perform a query with ADFIND in SYS it returns the below error:

You may think it is odd that I started chuckling, but I did because I knew for sure I was to blame. I had been stupid. Or at least severely myopic. But I continued reading anyway…

Something in the schema is clearly causing the size limit error. Recently we extended the schema in SYS for Exchange 2007 SP2, OCS 2007 R2, and SCCM 2007. These changes have NOT yet been introduced in our 2 other forests and are scheduled for implementation in April and May.

And further along the email…

Lab time. I took this into an isolated AD environment (vanilla install of win2k3 sp2 / default AD query values) and using ADSchemaAnalyzer exported our production AD schema (the one without Exchange 2007 SP2/OCS 2007 R2/SCCM 2007) and imported into my lab. ADFIND works as expected and returns the results for all queries. I then extended the schema for Exchange 2007 SP2. Immediately after the schema extension I am able to reproduce the issue and see the same size limit exceeded error:

and still further

Same failure, immediately after ADFIND loads the OIDs from the schema. I decided to change the MaxPageSize in Active Directory to 10,000, reboot and ADFIND works again. I change it back to 1000, reboot and ADFIND fails. After digging around the ADFIND help file I noticed the tool includes a switch ‘-dloid’:

And finally

ADFIND returns all the user objects in my isolated lab forest. I then ran the same query in our SYS forest and ADFIND and all our ADFIND dependent scripts start working again. Interesting that either adding the -dloid switch or changing the MaxPageSize in AD to 10,000 immediately fixes the issue. Changing the MaxPageSize to 10,000 will not be possible given MS recommendations and Exchange requirements for this value to be set at 1000. For the time being I’m using the DLOID switch.

Any idea what could be causing the failure without the DLOID switch? Is it something specific to the Exchange 2007 SP2 schema upgrade? Or did we exceed some sort of limit somewhere? Let me know if there are any additional logs we can gather for you.

As I mentioned, I knew pretty early on that I was at fault and every paragraph after that re-emphasized that it was my problem. I was happy to see the depth and quality of Art’s debugging of the issue. Great job Art. 🙂

The history behind the issue… back in the wild days of Windows 2000 when I first put AdFind together, I knew of no other way to determine how to decode various attributes than to have a set of mapping structures to identify the attributes when they were encountered. That way I knew which attributes to decode as time fields, which were SIDs, which were GUIDs, which were binary, etc. I would manually update the maps as I found the attributes. This worked ok initially but once I went north of 400 separate requests for people who wanted me to add “their” custom attributes I decided I needed to do something else and I implemented it in V01.09.00 in 2002.

That something else ended up as a routine that tore through various attributes in the schema looking for the types that I cared about and building the attribute maps dynamically on the fly. The -dloid switch avoids that process. Even if you use -dloid, there are still some attributes hardcoded into the maps so there will be some decoding that occurs, just not all of the decoding that could occur.
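To make the idea concrete, here is a minimal Python sketch of building a decoder map from schema entries. It is not AdFind's actual code: the function name, the `load_from_schema` flag (standing in for the effect of -dloid), the two hardcoded entries, and the sample data are all hypothetical. The only real-world facts assumed are the Active Directory `attributeSyntax` OIDs for DN (2.5.5.1), octet string (2.5.5.10), and SID (2.5.5.17).

```python
# Hypothetical sketch only; AdFind's real implementation is not shown in this post.

# Assumed mapping from AD attributeSyntax OIDs to a decoder name.
SYNTAX_TO_DECODER = {
    "2.5.5.1": "dn",       # distinguished name
    "2.5.5.10": "binary",  # octet string
    "2.5.5.17": "sid",     # security identifier
}

# A couple of attributes that stay in the map no matter what (illustrative).
HARDCODED = {"objectSid": "sid", "objectGUID": "guid"}


def build_attribute_map(schema_entries, load_from_schema=True):
    """Return {attribute name: decoder name}.

    With load_from_schema=False (the -dloid style behaviour), only the
    hardcoded entries are used and the schema is never walked.
    """
    attr_map = dict(HARDCODED)
    if load_from_schema:
        for entry in schema_entries:
            decoder = SYNTAX_TO_DECODER.get(entry["attributeSyntax"])
            if decoder is not None:
                # Hardcoded entries win over the generic syntax-based guess.
                attr_map.setdefault(entry["lDAPDisplayName"], decoder)
    return attr_map


# Made-up sample of what a schema query might return.
sample_schema = [
    {"lDAPDisplayName": "objectGUID", "attributeSyntax": "2.5.5.10"},
    {"lDAPDisplayName": "member", "attributeSyntax": "2.5.5.1"},
    {"lDAPDisplayName": "thumbnailPhoto", "attributeSyntax": "2.5.5.10"},
    {"lDAPDisplayName": "sIDHistory", "attributeSyntax": "2.5.5.17"},
    {"lDAPDisplayName": "cn", "attributeSyntax": "2.5.5.12"},
]

full_map = build_attribute_map(sample_schema)
minimal_map = build_attribute_map(sample_schema, load_from_schema=False)
```

In this sketch a custom attribute gets picked up as soon as it exists in the schema, which is the whole point of dropping the hand-maintained list.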

Anyway… the stupid assumption I made, and am obviously now quite embarrassed by, is that a non-paged search would be good enough to pull the attributes I wanted from the schema. Back when I did it I knew that paged searches were goodness and should be used but I never really thought it would be an issue for this specific query… I mean seriously, even in a default Windows Server 2003 schema, there are only 293 attributes that I pick up for my query, and with the default MaxPageSize of 1000 the failure doesn’t occur until the schema has more than 1000 attributes of type DN, binary, or SID… Lots of head room, right? Should be safe, right? Wrong… And this very reason is why we tell people to use paged queries when requesting information from Active Directory.

So I fixed the issue by changing the section that pulls the info from schema to use a paged query just like I should have used in the first place. Ran the new binary against my “hacked” test directory with a MaxPageSize of 10 entries and it worked like a champ. I sent the link to the beta binary to Art and he tested it in his environment and he is working A-OK again.
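For anyone who wants to see the difference between the two approaches, here is a small self-contained Python simulation. It does not talk to a real directory and it is not the AdFind fix itself: `FakeDirectory`, `SizeLimitExceeded`, and `paged_search` are hypothetical names, the cookie is a plain integer offset where a real server hands back an opaque value, and the non-paged search simply raises instead of returning a partial result alongside the error.

```python
# Simulation only: no real LDAP calls, all names here are made up.

class SizeLimitExceeded(Exception):
    """Stands in for the 'Size Limit Exceeded' result from the server."""


class FakeDirectory:
    def __init__(self, entries, max_page_size=1000):
        self.entries = list(entries)
        self.max_page_size = max_page_size

    def search(self):
        """Non-paged search: fails once the matches exceed MaxPageSize."""
        if len(self.entries) > self.max_page_size:
            raise SizeLimitExceeded("size limit exceeded")
        return list(self.entries)

    def search_page(self, page_size, cookie=0):
        """Paged search: return one page plus a cookie for the next page."""
        size = min(page_size, self.max_page_size)
        page = self.entries[cookie:cookie + size]
        next_cookie = cookie + size
        if next_cookie >= len(self.entries):
            next_cookie = None  # nothing left to fetch
        return page, next_cookie


def paged_search(directory, page_size=500):
    """Keep asking for pages until the server stops handing back a cookie."""
    results = []
    cookie = 0
    while cookie is not None:
        page, cookie = directory.search_page(page_size, cookie)
        results.extend(page)
    return results


# A "hacked" directory with MaxPageSize of 10, like the test described above.
hacked = FakeDirectory(range(1200), max_page_size=10)
all_entries = paged_search(hacked)
```

Calling `hacked.search()` here raises `SizeLimitExceeded`, while `paged_search(hacked)` comes back with all 1200 entries, ten at a time.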

If anyone else is having the issue and wants to try the beta, you can find the new betas for AdFind and AdMod at