"isPublic" = true if the document is available to all (all other fields are ignored if this is true)

"denyACLs” for users and groups specifically denied read access (takes precedence over all other ACL fields)

“allowACLs” for users and groups who are allowed to read the document (includes inherited ACLs, if any)

“parentACLs” for users and groups from a parent container (such as the database or space) that must also be checked before allowing access

3. All documents are then submitted to the indexer.

ACL fields are indexed as standard search engine fields.

Note that the connector may also need to handle incremental indexing, name mangling, ACL encoding and other processing, depending on the requirements of the content source and the search engine. See my “indexing ACLs” article for more details.
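To make the precedence among the four ACL fields concrete, here is a minimal sketch of how they combine into a single read decision. The class and function names are hypothetical, and the sketch assumes that an empty “parentACLs” field means there is no parent restriction; a real connector and search engine may encode this differently.

```python
from dataclasses import dataclass, field


@dataclass
class IndexedDocument:
    """A document as submitted to the indexer, carrying the four ACL fields."""
    doc_id: str
    is_public: bool = False
    deny_acls: set[str] = field(default_factory=set)
    allow_acls: set[str] = field(default_factory=set)
    parent_acls: set[str] = field(default_factory=set)


def can_read(doc: IndexedDocument, principals: set[str]) -> bool:
    """Return True if a user with these principals (user name plus groups) may read doc."""
    # isPublic short-circuits everything else: all other fields are ignored.
    if doc.is_public:
        return True
    # A deny entry takes precedence over the remaining ACL fields.
    if doc.deny_acls & principals:
        return False
    # The user must match at least one allow entry.
    if not (doc.allow_acls & principals):
        return False
    # Assumption: an empty parent_acls set means "no parent restriction".
    if doc.parent_acls and not (doc.parent_acls & principals):
        return False
    return True
```

In Early Binding the search engine evaluates this logic through the query filter rather than in application code, but the function is a handy reference when testing a connector's output.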

The Algorithm, Part 2: Query Path
Early Binding requires modifying the query before it is submitted to the search engine for execution. This can occur either inside or outside the search engine server. The algorithm for this is as follows:

1. Authenticate the query request

This is often performed by the web application that manages the user interface.

There are multiple technologies available for authentication, including:

Basic authentication: prompt for a username and password

Single Sign-on (SSO), using NTLM, NTLM2, Kerberos, or other tools such as SiteMinder

The result will be the username of the user who is submitting the request.

2. Gather all groups of which the user is a member (called “Group Expansion”), including:

Groups from LDAP or Active Directory

Groups from every content source indexed into the search engine

Nested groups (where groups are members of other groups)

A group cache is usually required to perform this with speed and reliability.
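Group expansion with nested groups amounts to a graph traversal over membership links. The sketch below is one way to do it, not a prescribed implementation: `GroupExpander` and its `lookup_direct_groups` callback are hypothetical names, and the callback stands in for whatever directory or content-source call returns the groups that a user or group belongs to directly. The cache here never expires, which a production cache would need to address.

```python
from collections import deque
from typing import Callable, Iterable


class GroupExpander:
    """Expands a user into all direct and nested groups, with a simple cache."""

    def __init__(self, lookup_direct_groups: Callable[[str], Iterable[str]]):
        # Hypothetical callback: returns the groups a user or group is a direct member of.
        self._lookup = lookup_direct_groups
        self._cache: dict[str, frozenset[str]] = {}

    def expand(self, user: str) -> frozenset[str]:
        if user in self._cache:
            return self._cache[user]
        seen: set[str] = set()
        queue = deque(self._lookup(user))
        # Breadth-first walk; the seen set also guards against membership cycles.
        while queue:
            group = queue.popleft()
            if group in seen:
                continue
            seen.add(group)
            queue.extend(self._lookup(group))
        result = frozenset(seen)
        self._cache[user] = result
        return result


# Example with in-memory data standing in for LDAP and content-source lookups.
direct = {
    "alice": {"engineering"},
    "engineering": {"staff"},
    "staff": {"engineering"},  # deliberate cycle
}
expander = GroupExpander(lambda name: direct.get(name, set()))
alice_groups = expander.expand("alice")
```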

3. Modify the query.

The query from the original user is modified to include a clause that filters out all documents for which the user does not have read access.
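As an illustration, the filter clause can be assembled from the username and the expanded groups. This is a sketch under several assumptions: it emits a Lucene-style boolean syntax, which may differ from your engine's; the function names are hypothetical; and it assumes the connector writes a placeholder token (here the made-up value `NO_PARENT`) into “parentACLs” for documents that have no parent restriction, so that the parent check can be expressed as a positive match.

```python
def quote_term(value: str) -> str:
    """Quote a principal as a phrase, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_security_filter(user: str, groups: set[str],
                          no_parent_token: str = "NO_PARENT") -> str:
    """Build a filter clause that matches only documents the user may read."""
    # The user's principals are the username plus every expanded group.
    principals = sorted({user, *groups})
    any_principal = " OR ".join(quote_term(p) for p in principals)
    # Assumed convention: documents without a parent restriction carry no_parent_token.
    parent_terms = " OR ".join(
        quote_term(p) for p in [*principals, no_parent_token]
    )
    return (
        "isPublic:true OR ("
        f"allowACLs:({any_principal}) "
        f"AND parentACLs:({parent_terms}) "
        f"AND NOT denyACLs:({any_principal}))"
    )


security_filter = build_security_filter("alice", {"eng"})
```

Where the search engine accepts a filter as a separate request parameter, passing the clause there instead of concatenating it onto the user's query text keeps user input from interfering with the filter's parentheses.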

4. Execute the query.

Once the query is correctly modified, it can be executed like any other search query.

5. Return the results

The standard search engine results are returned, including the total document count, facets (with facet values and counts) and search results with metadata fields.

Because the results have been filtered by the security filter, only documents to which the user has read access are returned and included in the counts.

It is critical to note that all of these steps must be performed in a secure area on the server, either in the user interface server or inside the search engine server itself. This is necessary to make it impossible for the user to tamper with the HTTP transaction URL (or other data) to give themselves more access rights than they would normally have.

So, we are well past the halfway point in our Graduate Course on document-level security in enterprise search. The next article in this series will address the indexing of Access Control Lists.