In this article, I'll move further on to make use of the technique to tackle the authorization and row based access control problem. This is an important target to conquer because a good solution to the problem would, potentially, be reusable in many future projects as most serious enterprise applications need such a sub-system one way or another.

Here are the design goals that I set, at the beginning, for such a sub-system, which I called AFCAS:

The sub-system should be a general purpose component so that it is re-usable without significant modifications. This implies also that it should be able to stand alone, with no dependency to the objects that it is supposed to secure access to.

The sub-system should support MS Windows like Access Control Lists (ACL), which will then allow definition of operations like Add, Delete, Update, View, etc for each of resources such as rows in database tables. This is actually strongly related to goal #1 as such a system would be broadly applicable.

The sub-system should have negligible effect on the design of the rest of the system.

The sub-system must support row-level secured access to database tables with a reasonable performance hit i.e. the number of records on the secured tables must not have a major performance impact, even on tables with multi-million rows.

The sub-system should enable a transparent solution for reporting and analysis tools. More specifically, the sub-system must enable automatic authorization filtering using database views so that our reporting tools are transparently row-level security enabled.

On top of these higher level goals, I set the following more specific requirements:

The sub-system should allow use of existing Active Directory Groups as the security principals. I.e. it should be able to use MS Windows identity of the users and also must support groups that have other groups as members. That feature is essential to support both enterprise level roles and application roles that have enterprise roles as members; and full support for nested Active Directory groups, in general.

The sub-system should support hierarchies for both Operations and Resources. This would make the component usable in more scenarios.

The sub-system should provide good support for off-line scenarios.

The following are considered unnecessary or too costly to implement:

Capability to express arbitrary authorization rules: That would mean allowing a scripting language or some kind of authorization rule plug-in architecture. Both of which adds major complexity without a lot gains. Besides, it is usually quite difficult to classify such rules as authorization aspects; they usually are core business aspects and can best be implemented as part of the business layer.[1] Considerable performance hit it may incur in the case of row based access was another reason for omission of this capability.

Capability to secure individual attributes of a given row: on the grounds that this is not a very common requirement and would make the design considerably more complicated. Furthermore, it is probably a better database design to factor-out such columns into their own dedicated tables.

An authentication component to complement authorization: Because authentication is considered best handled on the OS and/or development platform level and there are built-in support in all major platforms.

A labelling schema that would allow associating multiple authorization markers with the resources: Because such a system would require a label evaluation procedure on the where clause of select statements and hence create a major performance hit for large datasets (due to table scans instead of index look-ups). Furthermore, resource hierarchies can be used to remove the need for such a schema. See the section "Emulating Multiple Labels" for an example.

Goal #6 cannot be easily achieved without writing extended stored procedures in SQL Server that would return groups of a given user and vice versa. Even if we write such extended stored procedures, it would then lock the component to SQL Server, and more importantly, would create a significant performance hit (because the SQL engine may not be able to use indices and falls back to table scans if such stored procedures are used in WHERE clauses). So, I decided to use an agent to regularly replicate the Active Directory users and groups to the SQL database. This has also the nice side effect of allowing seamless mixing of Active Directory users and groups with application defined users and groups, which is a common scenario for public facing web applications: Use of Active Directory and integrated windows security for intranet and application level defined users for external parties.

The general question that an authorization subsystem has to answer is the following: Is the userP authorized to perform the operationO on the resourceR?

This question can be modelled as checking for existence of a tuple (P, O, R) from a repository, in our case an SQL database that stores authorization information in the form of such tuples. Obviously, the SQL required for checking the existence of such a tuple is quite easy to construct. If we assume that we represent the identities (primary keys) of the user with P, operation O, and resource with R and store such predicates in a table called AccessPredicate, then the SQL query in Listing 1 can be used to get the required answer: If it returns a non-empty set then the authorization is granted otherwise not.

Unfortunately, the solution to this problem is not that simple. As you might have already seen, there is a fundamental flaw in the query in Listing 1: What if the user itself is not granted access, but one of the groups that the user is a member of is? E.g. the user is a member of technicians group and the access was granted to that group[2].

Of course, the query would return an incomplete list in this case and hence incorrect authorization.

It is possible to deduce that the user indeed has the access rights by first querying all the groups the user has a membership to, then running the query in Listing 1 against all the groups that user is a member of, then returning the union of all query results while replacing all groups with P. If the resulting set contains the tuple (P, O, R) then the access is granted, otherwise not. Unfortunately, there is still a problem with this approach: What if the groups that the user are a member of are themselves members of other groups, and the access was actually granted to one those grand parent groups? In short, all such attempts are futile because SQL does not support querying such data which have recursive relations.

To solve this recursion problem and accommodate SQL's inability to express recursive relations, AFCAS relies on the storage of full closure of relational information of DAGs. The design here uses a slightly modified (using the deletion method described by Guozhu Dong et al. to minimize the impact on database size) version of the DAG closure storage technique that I described in my previous article, for all of principal, operation, and resource DAG structures. The technique allows us to use plain SQL to get answers to any question about group, operation, or resource recursive relationships. (Please refer to the section How to Use the Edge Table in that article and Figure 5 and Listing 4 there to see an example.) For example, the following SQL can be used to get all the memberships of the user Jale in Figure 5 of that article:

SELECT EndVertex
FROM Edge
WHERE StartVertex = 'Jale'

Listing 2: SQL for getting group memberships of user “Jale”; see article for more details

Therefore the authorization problem can now be reduced to creation of a query that somehow combines the queries in Listing 1 and Listing 2.

Please note that the query in Listing 1 implicitly assumes that operations and resources are simple objects without any of the recursive relationships that the users and groups have. That is certainly a big restriction, especially for the case of operations, because allowing a hierarchy of operations would certainly reduce the administration and programming efforts spent for authorization, tremendously. So, let us try to get a formula for the more general case that both the operations and the resources are allowed to have recursive relations, as well, like the simple hierarchies shown in Figure 1.

Assume that hierarchies of operations and resources also have the same semantics as the group hierarchies i.e. Operation O includes O1; Resource R includes R1, etc. Assume also that a grant is issued to group G for access to resource R for the operation O. Let’s denote that with the tuple (G, O, R). Then the existence of the tuple (G, O, R) implies the existence of tuples composed of all the possible combinations of the entities in Figure 1, like (G1, O, R), (G2, O, R), (U, O, R), etc. Therefore, any check for authorization for any of the children of G, O, or R must ensure checks for existence of tuples made of all of combinations of vertices in Figure 1. Conversely, we would have to form all combinations of direct and implied parent vertices of these entities, if we want to check just the existence of an implied or direct authorization tuple e.g. (U, O1, R2).

With the principles laid out in the previous paragraph, let us now construct the query that would list all direct or implied authorizations for the user U1 for the operation O1 on resource R2 i.e. the tuple (U1, O1, R2). Here is the resulting query in Listing 3:

Listing 3: The query for returning all direct or implied authorization grants for a given triple (U1, O1, R2)

Please note that, we assumed hierarchy for all entity types are expressed in the same Edge table, using a Source column to separate different types of entities. This query runs fast for a given tuple of (P, O, R). So, it is a good candidate for writing an IsAuthorized stored procedure or an SQL function. It, however, cannot be used for row-level security, as it is necessary to know, in advance, for which tuple we are looking for.

An efficient row level security mechanism can only be implemented on the database level because moving data to the client or middle-tier and then filtering unauthorized data has the following consequences:

It is not scalable: Although it may be acceptable to move all records from database to client or middle-tier for small sets of data, it just will not work if the data to be secured is more than several thousand records.

It will not allow reporting and analysis software work as intended: I am not sure if such reporting applications for ad-hoc reporting and analysis can even function as intended with row-level security requirements, if we do authorization checks in the middle-tier or the client.

It is less secure: When we want to secure data, we usually do it for some good reason. Allowing un-authorized data to be transferred to the client, would pose a security threat as the client software can easily be manipulated to reveal all data it receives from the server, or network sniffing techniques can be used.

Therefore, it is not an option to delegate responsibility of authorization checking to the middle-tier or the client for serious applications. Authorization checks must be done in database level.

However, our query in Listing 3 is not suitable for filtering out unauthorized rows, as noted before. It just allows us to determine whether a user is authorized for a specific resource or not. We need a mechanism that would list all the authorized resources for a given user.

A database view is a good candidate as it can be used to filter-out unauthorized rows from the resource tables. In fact, a very robust security wall can be constructed on the database level using this technique along with an integrated DB server authentication schema.

Listing 4 is the query that generalizes the query in Listing 3. It returns all direct and implied tuples for the given set of explicit authorization grants stored in the AccessPredicate table.

Please note that the query in Listing 4 also assumes that hierarchy of resources are stored in the same Edge table, but, obviously, that can be done on a separate table.

You may have already started to think that for most database applications resources are not hierarchies but a simple row in a table. For this case, we can simplify the query in Listing 4 and make it a bit faster. The simplified query is shown in Listing 5:

Listing 5: Query to return all explicit and implied authorizations with flat resources

A further simplification would be elimination of hierarchy from the operations. That makes a lot of sense if we already know that all the operations for viewing the data are not hierarchical and this view is to be used only for row level security. The resulting SQL code is in Listing 6:

Listing 7: Query to return all explicit and implied authorizations with flat operations

Once we formulated our FlatGrantListXXX views, it is now quite easy to filter out unauthorized resource rows by simply constructing a view for that specific resource that joins with FlatGrantListXXX and puts a where clause on the user identity and operation and a special view operation defined for each of the secured tables). There are two ways of achieving this, both with some advantages and disadvantages:

Use a stored procedure that would return the authorized rows, which expects the user identity as a parameter: This solution is applicable only if there is a middle-tier component that is the only passage to the database. All the users can share the same database connection string and hence a scalable solution due to possibility of connection pooling. However, it cannot be used if the users are required to have direct access to the database due to ad-hoc reporting and data analysis requirements as they can send just any identity they want to the server. Furthermore, ad-hoc reporting tools cannot effectively use stored procedures and make SQL joins in this scenario (they would have to pull all the data and then make joins on reporting server, which may not be feasible)

Use database integrated security and build secured views for each resource: This solution requires all the database access rights be removed from the underlying tables and update and delete operations should be made using stored procedures. The main advantage of this solution is its transparency to both the application code and to the ad-hoc reporting and analysis tools. The big disadvantage is scalability. The database server would have to maintain a connection for each of the clients, which may be a problem for a large number of concurrent users. It also has a higher administration overhead, and increases attack surface of the database. The technique also requires use of Kerberos delegation on windows networks, if the application design is multi-tier. I note this here because it seems that Kerberos delegation does not work flawlessly in Windows networks (at least for me, with a Windows Server 2003 network and SQL Server 2005)

The SQL code in Listing 8 shows how both of these options can be implemented.

Please note that the code in Listing 8 is not limited to for viewing rows. One can imagine that such code can be used, for example, on a delete screen where the user is presented with the list of items that he is authorized to delete, etc. All we need is just to define an operation and grant the relevant groups to execute the operation on the desired resources.

Please note that code in Listing 8 only answers the question of getting the data from the database. Row-level security must also have a pillar for supporting database updates and inserts. Obviously, there must be some mechanism for maintaining the data in AccessPredicate table as well as resource hierarchies. Unfortunately, there is no generic solution available to this problem as it is highly application dependent. However, AFCAS provides the basic facilities for maintaining the resource hierarchies and Access Predicates. Each application would then decide whether to provide explicit authorization grant/revoke use cases with devoted screens, or make automatic authorizations depending on the properties of the user and/or resources. A better place, arguably, for automatic resource hierarchy maintenance or automatic granting would be an application specific stored procedure that is used for insert/update operations on the resource tables. But, all that depends on particular scenario.

As we are finished with the database side, we can now turn to the application interface of the system. AFCAS targets mainly database applications with tables that need to be secured for controlled access. However, AFCAS is not bound to database applications that require row-level security. It has a very sophisticated authorization schema that can satisfy most authorization needs.

Obviously, AFCAS cannot make a lot of assumptions about the resources that it is going to secure. But, there has to be a link to the resources that is understood by both the client application and AFCAS itself. The AFCAS database uses a single string identifier for each of the resources it secures for efficiency reasons. This is safe, as each persisted object row must have a unique set of attributes that can usually be converted to a string without any problems. If a conversion is required to generate the string identifier, it must be defined by the clients of AFCAS. So, Listing 9 is the basic interface that AFCAS implements:

ResourceHandle is the abstract base class that encapsulates the logic to convert a given resource to a string identifier and vice versa. The users of AFCAS would have to write sub-classes for each of the resources they want to secure and register with AFCAS. Please see the attached source code for examples.

ResourceAccessPredicate is the structure that holds the tuple (PrincipalId, OperationId, ResourceId) data for specific access grant (or denial). Its definition is in Listing 10:

The class Operation serves a similar purpose and maintains properties such as Id, Name, and Description of a particular operation.

Listing 11 is the petty enumeration ResourceAccessPredicateType that I used to specify the type of access. As you see there, AFCAS supports explicit denials which are considered to have higher precedence than grants, as is in MS Windows's ACL. I initially thought that would be a useful feature to simplify administration. However, it turned out to be one of those not-so-required features and consequently I removed its database implementation because of its performance hit[3]. However, I still left it intact on the .NET code so that it can be turned on afterwards if it really is needed.

The IAuthorizationProvider interface is plain and simple to understand and use. It allows the dependent sub-systems to ask questions using string IDs of the principal (most probably the logged-in user), the operation, and optionally a ResourceHandle. The ResourceHandle is considered optional because some operations do not have a meaningful resource context. In such cases, NullResource.Instance should be set as a convention. Here is the expected way to use AFCAS:

Please note that IAuthorizationProvider does not define any method for the maintenance of users, groups, and operations. That is the job of an administration console component for AFCAS. An interface called IAuthorizationManager is defined for management of Users, Groups, and Operations, as well as their hierarchies. An implementation of the interface is also provided. Please refer to the attached source code for details.

The only missing part is an UI that would use IAuthorizationManager interface for an administration console, which is composed of the following parts:

User and Group Management: Relying simply on AD synchronization saves us from writing and administrating console applications for user/groups management, as standard Active Directory tools can be used for such purposes. But, it is still necessary to write such a console application for outwards facing internet applications with application level users. I did not write such a tool simply because the need has risen yet.

Operations Management: The list of operations for a given application is generally fixed when the application is released to the production. The operations list changes only if there are added/removed functionality, meaning changes to the source code. So, for most applications, there is no need for an administration console for the management of operations that is intended to be used by IT operations staff.

Resource Hierarchy Management: The hierarchy of resources is generally an application specific task and most applications have their own UI for management of resource hierarchies. IAuthorizationManager interface, however, provides functionality for maintaining the hierarchies. The UI should be written by client applications.

Granting/Revoking Access: This part is highly application dependent too, and most applications should create its own UI that would also display meaningful context information about the resources to be secured. The actual work is done by AFCAS thru its IAuthorizationManager implementation.

In short, the basic AFCAS component shown here completely avoids writing an administration console. However, there still exists a need for a management console of users and groups. I'll address the issue in the future whenever I have some time to spare for it.

All the important calculations are performed at the database level. Most of its authorization calculations are delegated to the database server in the forms of SQL queries that were presented in the previous sections. The reason is that SQL is better than any imperative language when it comes to performing set algebra which is the basis of all the work done by AFCAS.

The AFCAS database is composed of four tables:

Principal: Stores the basic information about both the users and groups.

Operation: Stores the names and codes of the operations that is to be secured, the data there, once setup, is mostly static

Edge: This table stores the DAG model for the Principals, Operations, and also Resources. See my previous article for a full explanation of its uses.

AccessPredicate: Stores the explicitly granted (ort denied) access rights. This is the association table to link resources to the access rights

The complete structure of the tables as well as the code for stored procedures and C# code can found in the attached zip file to this article.

As noted before, AFCAS aims to use existing Active Directory user and group accounts. We also noted that AFCAS would do all authorization calculations in the database using plain SQL queries. To remedy the problem of associating existing AD users and groups with the ones that are defined in the AFCAS database, I created a synchronization (only from AD to SQL database) module that would replicate users and groups from AD into SQL server. It uses the objectID property of AD users and groups as the pivot for synch comparison. The frequency of synchronization can be adjusted according to needs, down to minutes. For our case, a daily sync was just fine.

ASP.NET has a fairly usable authentication and authorization infrastructure whose use, by the way, is not actually limited to the web applications. Please refer to the MSDN topic Securing ASP.NET Web Sites for more information. However, its authorization part provides a lot less than AFCAS does, especially when it comes to low-level security. Therefore, it would be very nice to use the existing authentication part of ASP.NET security and replace authorization part with AFCAS.

It seems that integration of AFCAS into ASP.NET membership and role provider model is not difficult. As I noted before, I believe using platform facilities for authentication is the best path to go. So, I do not have any plans to replace authentication part. However, an AFCAS based RoleProvider can be quite useful and I am planning to write one when I have some time to spare for it.

Many organizations that have to deal with sensitive information have adopted a categorization schema that rates how sensitive the information is like “Not classified,” “Sensitive,” “Confidential,” and “Highly Confidential.” They have further divided the information into clusters to facilitate need-to-know principle, such as geography: User A is working only on Country X data, so he can be granted access only to Country X.

The resource nesting ability of AFCAS makes it quite easy to support such categorization schemas. The sample resource hierarchy in Figure 2 shows how it can be used to support such scenarios.

Please note the “Root” node: It enables administrators to grant access to “all” data for high level executives easily and without human error. See how the country data is kept in a separate branch of its own. Once the users are ready to save a “new document” to the system, they have to determine to which country the data belongs to and then the security classification level. They can then use the AFCAS API to put the appropriate relation between the classifications and the document.

It does not make sense to put a document under more than one security category, but it is conceivable that some documents would have to be categorized under more than one country like “Doc11” in Figure 2: It is not a problem for AFCAS. But, the users must be fully aware of security implications when an item is filed under more than one major category.

Figure 2 also emphasizes the importance of an automatic granting mechanism. Without such a mechanism any authorization schema would fail if the number of resources to be secured is in the order of tens of thousands. A good default policy would be to analyse the current application roles of the user and grant view rights to all these roles automatically at the time of committing brand new data. However, that is all application dependent and certainly is not applicable in many cases. Nonetheless, the requirement for setting up an access granting policy remains as a common issue.

There are a number of potential bottlenecks in the database design of AFCAS for applications that require very high throughput. One obvious place is the Edge table as it is used to store relation information for all of Principal, Operation, and Resources. In high throughput environments, race conditions may occur to get database locks on the Edge table.

One remedy for such problem would be maintenance of a separate Edge table for each entity: namely Principal, Operation, and each of the resources that the application wants to secure. Obviously, that would require changes in the views and also duplication of stored procedure code for closure maintenance operations in the database level.

One further performance improvement would be maintenance of separate Principal and Operation entities along with separate Edge tables for each of the resources. Principals and Operations would have to be maintained in a master copy and then replicated to the resource specific copies in regular intervals. This would further reduce possible race conditions and hence improve performance. It also avoids the need for duplication of DB code as it is only necessary to maintain the master copies of Principal and Operation tables. The rest is handled by replication code (which would do some smart replication such as replicating only a sub-set of applicable operations and allowed groups of users for a given resource)

AFCAS has proven to be a useful sub-system for my development, I am hoping the same for the readers. Bug reports, suggestions, and opinions are most welcome.

[1] For example, it is not clear whether the rule “application xxx is allowed to execute only 09:00-17:00” is in authorization domain or business domain. Under close scrutiny, many such complex authorization rules turn out to be just another set of business rules to implement.

[2] A popular approach is that the authorization rules are never changed: Once the roles in the system are defined, access is granted only to those groups that resemble the application roles at the very beginning. Afterwards, only the members of the groups are allowed to change. That would simplify the administration and auditing of authorization sub-system, as this policy keeps only one variable in the authorization equation.

[3] It is quite easy to incorporate explicit denials to the queries presented before. We just need to construct the set of grants (along with their implied tuples) and then remove all the explicit denials (along with their implied tuples) using standard SQL difference operation or simply a left outer join with a where clause. I removed it from the SQL views and from IsAuthorized stored procedure because of performance concerns.