I think that I'd like to first tackle 3) because I think it will lead down a road heading toward the other three.

Ok, so whenever we create any entity (an entity to which we want to attach permissions) we have to associate it with what we call a "Resource". The Resource in turn is what we assign the actual permissions to:

Each scope represents a different tier at which a set of Permissions (resource-actions) are granted.

1 (company scope) = this set of permissions is granted on every entity name within company primKey
2 (group scope) = this set of permissions is granted on every entity name within group primKey
3 (group_template scope) (let's ignore this one for now)
4 (individual scope) = this set of permissions is granted on entity name identified by primKey

So, the SQL to make the association between the entity (primKey) in question and the resourceId (in order to find permissions) looks like:

This means that given a Resource, we have N records in the Permission_ table. For example, if you grant VIEW, UPDATE, DELETE on some BlogEntry to a Role, you are creating 3 rows in the Permission_ table. One for each of VIEW, UPDATE, DELETE.

The SQL to make the association between the entity (primKey) in question and its Permissions looks like:

But would you ever do this in-line with another query??? That's not even the whole thing, we never associated the Role in there... It's simply NOT feasible. Which is why we never do it. Mind you this is about 30 lines shorter than pre permission algorithm 5.

Wouldn't it be better if we reduced ALL OF THIS to a single DB table and as few rows as possible???

The problem with that is flexibility. Different types of resources define different sets of resource-actions (distinct operations on the resource).

Resources have as few as 2 resource-actions and as many as 14 (Organization). The going average in the core is around 4.

The solution was to make a pattern which didn't have a restriction like you might have with a statically defined DB table definition with a fixed number of columns in which to store permissions. This has some drawbacks including pretty complex and ugly SQL queries.

On the other hand, how many resource-actions would we ever conceivably define on an entity? 15, 20, more, less? Also note that the SQL for checking permissions in the existing design is far from pretty.

Proposed Solution

I propose to make a hard limit of 30 possible actions on any given resource. I believe this would be more than sufficient to meet any requirements we might ever come up with.

Why 30?

Old school programmers out there might have a clue and it goes back to the good old days of low level programming languages... you know, the ones they still use today when things need to happen fast and with minimal cost???

I propose to store the whole list of resource-actions assigned to a given resource as a single 32 bit integer in a single int field.

"How would you do that?" you ask. Simple! bit masks.

In a single 32 bit integer we have 31 available bits to work with. We reserve the 31st bit for our own future purposes. 0 looks like this:

So setting permissions is simply a matter of ORing all the action masks together.

i.e.

int permissions = VIEW | ADD_MESSAGE | SUBSCRIBE;

For all you Java programmers not up to speed with low-level bitwise operations, | is bitwise OR and & is bitwise AND.

Checking for VIEW permission is as simple as

if ((permissions & 1) == 1) { /* has permission */ }

Checking for any permission is simply

if ((permissions & ACTION) == ACTION) { /* has permission */ }


This is just about the fastest operation you can perform on the CPU.
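Pulling the pieces above together, here is a minimal sketch of how the masks, the OR to set and the AND to check could fit into one class. VIEW = 1 follows the check shown above; the other mask values and the class and method names are assumptions made for illustration.

```java
// Hypothetical action masks: each action owns exactly one bit.
final class ActionMasks {
    static final int VIEW        = 1 << 0; // 1, as in the VIEW check above
    static final int ADD_MESSAGE = 1 << 1; // 2 (assumed value)
    static final int SUBSCRIBE   = 1 << 2; // 4 (assumed value)
    static final int DELETE      = 1 << 3; // 8 (assumed value)

    private ActionMasks() {
    }

    // Setting an action is a bitwise OR with its mask.
    static int grant(int permissions, int action) {
        return permissions | action;
    }

    // Checking an action is a bitwise AND followed by a comparison.
    static boolean has(int permissions, int action) {
        return (permissions & action) == action;
    }

    public static void main(String[] args) {
        int permissions = VIEW | ADD_MESSAGE | SUBSCRIBE;

        System.out.println(Integer.toBinaryString(permissions)); // 111
        System.out.println(has(permissions, VIEW));               // true
        System.out.println(has(permissions, DELETE));             // false
    }
}
```

Because each mask has a single bit set, the order in which actions are ORed together does not matter, and granting the same action twice is harmless.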

Now, the next thing we want to do is reduce the number of tables. So, first thing is that we should soon drop these tables: Groups_Permissions, OrgGroupPermission, and Users_Permissions as they are no longer used with the new permission algorithm.

Since we're reducing the Permission_ rows into single integers, we can pull that into the Resource_ table and denormalize the resource code into it as well. We end up with a single table like:

Notice we have a permissions_ int field to store "all" the permissions (hence plural permissions_). Also note that this table handles the ResourceCode and Resource_ table details, so we can drop those as well.

Also, now that we only ever assign permissions to Roles.. we don't even need to have a Roles_Permissions table. We can further reduce that to:

That eliminates Groups_Permissions, OrgGroupPermission, Roles_Permissions, Users_Permissions, ResourceCode, Resource_, and Permission_ and leaves us with Permissions_ which supplies exactly the same level of detail and control AND allows us to do in-line queries with pretty simple SQL, including the scope check.

1) when you do a permission check right now, you have an SQL query for each action you want to check. With a single integer holding ALL the permissions, the first permission check retrieves it from the DB. Any further checks can happen on that loaded int value!!!! That alone would be a massive optimization.
2) Memory costs are much lower.
3) bitwise operations are the lowest level, and thus the fastest operations you can perform
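To make point 1 concrete, here is a rough sketch of "load once, check many times". The class name, the mask values and the fakeQuery stand-in for the database read are all assumptions for illustration; a real version would read the permissions_ column instead.

```java
import java.util.function.IntSupplier;

// Hypothetical holder: reads the permissions integer once, then answers
// every further check from memory.
final class LoadedPermissions {
    // Illustrative masks; real values would come from the action definitions.
    static final int VIEW = 1 << 0;
    static final int UPDATE = 1 << 1;
    static final int DELETE = 1 << 2;

    // Counts simulated database round trips, for demonstration only.
    static int dbCalls = 0;

    private final int bits;

    LoadedPermissions(IntSupplier loader) {
        this.bits = loader.getAsInt(); // the only "query"
    }

    boolean has(int action) {
        return (bits & action) == action;
    }

    // Stand-in for a SELECT of the permissions_ column.
    static int fakeQuery() {
        dbCalls++;
        return VIEW | UPDATE;
    }

    public static void main(String[] args) {
        LoadedPermissions p = new LoadedPermissions(LoadedPermissions::fakeQuery);

        System.out.println(p.has(VIEW));   // true
        System.out.println(p.has(UPDATE)); // true
        System.out.println(p.has(DELETE)); // false
        System.out.println(dbCalls);       // 1
    }
}
```

Three checks, one simulated round trip; under the current design each of those checks would be its own query.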

Permission Inheritance

I have a plan for this... but this is already enough detail to digest, and covers 3 out of the 4 wish list issues I started with.

Let me know what you think if you manage to get all the way down here.

Although before going into the proposal itself, I'd like to stick up for our current implementation. I found it very complex the first time I started doing some extensions to it but over time I've come to like it a lot. It is complex but no more than needed for the flexibility it provides. In fact, after we implemented the algorithm 5 the performance is quite good (for a small sacrifice in flexibility). To be more specific the fact that there are 4 types of resources is what allows the system to have rules such as "Assign rights to edit blog post 123", as well as "Assign rights to edit *any* blog post in community A", and also "Assign rights to edit *any* blog post in the whole portal". Also note that only the first blog post will create all 4 resources, the following posts in the same community will only add one additional resource.

I'm saying all this because while I know you were just trying to explain why we keep searching for ways to improve it, after reading the first part of your post people may think that our current system is not good. And I don't think that's the case. Actually, the fact that to optimize it even further we have to go to binary operators shows that the system is quite optimized already.

Now to the interesting part: I think going to a binary model sounds like a good approach. The only alternative I can think of to optimize permission checks further is an indexing system, but I would prefer to avoid that since a sync problem could have very nasty effects. After a first read, there are only three suggestions that I'd like to make:

1) I would try to avoid the limit of 30 actions per resource. It may seem like a lot for the current applications but it's most probably going to bite us back in the future. What if we use a long instead of an int?
2) We probably don't want to denormalize ResourceCode for query performance issues. In any case we can test it.
3) Since the proposal implies removing support for algorithms 1-4 we have to provide a way to migrate before we can implement this.

Having said that I think we should go ahead and do a prototype with it. Hopefully we'll find that it will achieve significant performance improvements and it will justify the cost of the migration.

If I grant view permission for all blogs within a company, we just need one Resource with company scope. If we want to revoke view permission for a single blog entry, then a Resource with individual scope would be created.

This would avoid some confusion when you are editing permissions for a single blog entry, because even if the UI tells you that VIEW is not granted for this blog entry, there might be an existing permission with company scope that you are not aware of or forgot about.

... the fact that there are 4 types of resources is what allows the system to have rules such as "Assign rights to edit blog post 123", as well as "Assign rights to edit *any* blog post in community A", and also "Assign rights to edit *any* blog post in the whole portal".

The proposal does not remove the scopes! It includes complete support for those, which is exactly what the new table describes.

Also, it is never faster to do a join between two tables than to stay in the same table, so de-normalizing the tables is actually faster than the join.

Fully agreed, I was just trying to make clear why the scopes were needed to avoid people thinking that's something bad of our current system. It's useful and necessary and your proposal just optimizes it.

@Bruno. Note that our permission system does not support revoking (fortunately). Also, regarding the UI I fully agree that it should show the permissions that the user may have in higher level scopes to avoid confusion.

... after we implemented the algorithm 5 the performance is quite good (for a small sacrifice in flexibility). To be more specific the fact that there are 4 types of resources is what allows the system to have rules such as "Assign rights to edit blog post 123", as well as "Assign rights to edit *any* blog post in community A", and also "Assign rights to edit *any* blog post in the whole portal". Also note that only the first blog post will create all 4 resources, the following posts in the same community will only add one additional resource.

I'm saying all this because while I know you were just trying to explain why we keep searching for ways to improve it, after reading the first part of your post people may think that our current system is not good. And I don't think that's the case. Actually, the fact that to optimize it even further we have to go to binary operators shows that the system is quite optimized already.

I agree that our current system works and is not broken, and is vastly improved from previous versions. But it has severe limitations:

1) you cannot paginate lists having pre-checked to see if the current user has VIEW permission
2) each check of a specific action is a distinct DB call (assume zero caching). The new model proposes that even if caching were turned off, we could still eliminate all but the first DB call simply by hanging onto the object returned from that call and re-using it.
3) when checking across scopes, one must check each scope individually. This proposal can do all three at the same time in SQL without a join.

This proposal could allow permission checks in-line with the main query for single entities or collections
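As an in-memory stand-in for that single-table check, the sketch below ORs together every row that matches at company, group or individual scope. The scope numbers follow the list at the top of the post; the Row fields, the mask values and the method name are assumptions, and the entity-name column is left out for brevity (all rows are taken to be for the same entity name).

```java
import java.util.Arrays;
import java.util.List;

final class ScopedPermissions {
    static final int SCOPE_COMPANY = 1;
    static final int SCOPE_GROUP = 2;
    static final int SCOPE_INDIVIDUAL = 4;

    // Illustrative action masks.
    static final int VIEW = 1 << 0;
    static final int UPDATE = 1 << 1;

    // One row of the hypothetical single permissions table.
    static final class Row {
        final int scope;
        final long primKey;
        final long roleId;
        final int bits;

        Row(int scope, long primKey, long roleId, int bits) {
            this.scope = scope;
            this.primKey = primKey;
            this.roleId = roleId;
            this.bits = bits;
        }
    }

    // Combines all three scopes in a single pass over the rows.
    static int effective(
            List<Row> rows, long roleId, long companyId, long groupId, long entityId) {
        int bits = 0;
        for (Row row : rows) {
            if (row.roleId != roleId) {
                continue;
            }
            boolean matches =
                (row.scope == SCOPE_COMPANY && row.primKey == companyId)
                || (row.scope == SCOPE_GROUP && row.primKey == groupId)
                || (row.scope == SCOPE_INDIVIDUAL && row.primKey == entityId);
            if (matches) {
                bits |= row.bits;
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row(SCOPE_COMPANY, 1L, 10L, VIEW),         // VIEW on everything in company 1
            new Row(SCOPE_INDIVIDUAL, 123L, 10L, UPDATE)); // UPDATE on entity 123 only

        System.out.println(effective(rows, 10L, 1L, 5L, 123L)); // 3 (VIEW | UPDATE)
        System.out.println(effective(rows, 10L, 1L, 5L, 456L)); // 1 (VIEW only)
    }
}
```

The same predicate, expressed as a WHERE clause, is what would let the scope check run in-line with the main query.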

Reducing this to bits means that the memory footprint will be significantly smaller... the operations are faster and the data storage is smaller... Not to mention the queries are easier and require less String manipulation????

All it is is a pipe or an ampersand!!!!! Bitwise operations are being used right now by whatever application you are using to read this message. Why do Java devs always try to shy away from it? I don't get it.

It makes no sense to stretch the details that can be stored in a single int into 30 boolean or Char or, even worse, VARCHAR table columns... This makes no sense to me. Can someone give me a good reason?

I'm not afraid. I was just saying that if a resource has more than 30-some actions, we can have several permissions columns to do the job. So it's more of an extension of your idea that doesn't limit the number of actions and won't add too much complexity.

Realistically speaking if we provide two columns to start, this will be more than enough. That's 62+ actions.

A long should be big enough. If for some extreme reason we need more at some time, it won't be hard to extend to more than one column. But I can't imagine, as you said, any Resource using more than 10-15 max.
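One detail worth watching if the field becomes a long: in Java the mask has to be built with a long literal, because an int shift only uses the low five bits of the shift distance. A small sketch follows; the class name and the choice to keep the sign bit in reserve (mirroring the reserved bit in the int version) are assumptions.

```java
final class LongMasks {
    // Leaves the sign bit unused, mirroring the reserved bit in the int version.
    static final int MAX_ACTIONS = 63;

    private LongMasks() {
    }

    // Builds the mask for the action stored at the given bit index.
    static long mask(int index) {
        if (index < 0 || index >= MAX_ACTIONS) {
            throw new IllegalArgumentException("bit index out of range: " + index);
        }
        return 1L << index; // 1L, not 1: the shift must happen in 64 bits
    }

    static boolean has(long permissions, long action) {
        return (permissions & action) == action;
    }

    public static void main(String[] args) {
        System.out.println(1 << 40);  // 256, because an int shift wraps the distance to 40 % 32 = 8
        System.out.println(mask(40)); // 1099511627776

        long permissions = mask(0) | mask(40);
        System.out.println(has(permissions, mask(40))); // true
        System.out.println(has(permissions, mask(41))); // false
    }
}
```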

Having implemented many permission systems in the past using bit masks, I'd suggest you actually hide the details of bit extraction, as well as whether the database field is a long or an int. You'd do this by implementing methods encapsulating Ray's code:

This eliminates the easy to screw-up bitwise operations from permission checking sequences and you can change the size of the permissions_ field (or even the implementation) with minimal impact on working code. So you'd get code that looks like:
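One way such a wrapper might look, as a sketch only: the class, enum and method names here are illustrative assumptions, not the code from the original post. Callers work with named actions and never see a mask or the width of the underlying field.

```java
// Hypothetical wrapper that keeps all bit handling in one place.
final class PermissionSet {
    enum Action {
        VIEW(0), UPDATE(1), DELETE(2);

        private final long mask;

        // Each action is given an explicit bit index rather than relying on
        // ordinal(), so reordering the enum cannot change stored values.
        Action(int bit) {
            this.mask = 1L << bit;
        }
    }

    private long bits;

    PermissionSet(long bits) {
        this.bits = bits;
    }

    boolean has(Action action) {
        return (bits & action.mask) == action.mask;
    }

    PermissionSet grant(Action action) {
        bits |= action.mask;
        return this;
    }

    // The value that would be written to the permissions_ column.
    long toColumnValue() {
        return bits;
    }

    public static void main(String[] args) {
        PermissionSet permissions =
            new PermissionSet(0L).grant(Action.VIEW).grant(Action.DELETE);

        if (permissions.has(Action.VIEW)) {
            System.out.println("can view");
        }
        System.out.println(permissions.has(Action.UPDATE)); // false
        System.out.println(permissions.toColumnValue());    // 5
    }
}
```

Switching the stored field from int to long, or to several columns, would then only touch this class.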

Thanks for this implementation, everyone. Ray, nice work bringing a blast from the CS past back to improve the performance of our permissions system. Brian, you rock as always.

Now, I'd like us to get back to the main reason why we started improving the performance of the permissions system in the first place: to improve the user experience for Liferay administrators / end users.

Flexibility is neutral and whether it's good or bad depends on the audience. You can have a set of paints and brushes and tout the "flexibility" of these tools to allow an artist to paint any picture out there. But 95% of the public actually wants a beautiful finished painting (and I suppose they can hire an artist to change that painting if they don't like something about it). Developers, like the artists in this metaphor, might appreciate the flexibility of our current system, but our public wants a "finished" system that works for most scenarios (and has flexibility to be further configured).

Jorge makes the case that our permissions system is actually very good because of the flexibility it provides in scoping permissions at various levels. I don't think anyone's arguing with the usefulness of that. The problem is whatever the system does by default should be intuitive and sensible. And perhaps more to the point, the whole user experience of dealing with permissions has to be more intuitive and straightforward for that flexibility to be of any use.

If we stop here and say we've fixed permissions, we've missed the whole point. So let's count this as a win and get back to the heart of the matter, which are the issues raised here:

It seems to me that we have a number of thoughtful community members that have invested time into thinking about this issue and presenting real world problems with usage in production. That's something of tremendous value (and one of the main points of being open source) so let's really take advantage of this opportunity. I know engineers typically test a limited set of scenarios (testUser1, testUser2, testDoc1), so having these real world data points is valuable.

At the beginning of 2009 we talked about wanting to productize Liferay to provide something closer to a finished solution to the end user. And while it's difficult to wrestle through that because of the many different solutions people are building with Liferay, it seems that permissions is one area where we can make the default choices that make sense 90% of the time so that what users get when they first start up is easy.

Bitwise is really cool, and as a past faux-engineer I truly appreciate the application on an intellectual level. But a lot of people out there don't care unless they see:

Bitwise >> improved implementation >> improved performance and flexibility >> more intuitive things you can do / better behavior out of the box (e.g., don't show search results for which people have no rights)

But I think most of the work we have to do is actually not as tied to performance but just to a change in the system behavior...

I fully agree. The model that Ben describes in the thread you linked to is exactly what I had in mind and should be reasonably easy to implement. For the technically inclined here, my idea was to add a new element to the resources XML files to reflect the potential child resources. For example, for a DL folder we would have:

Based on this information, when editing the permissions of a folder there would be two new tables (maybe hidden with panels by default) that allow setting the "default" permissions of any child folder or file entry created underneath the current one.

The next step would be to modify the code that creates those resources so that the permissions configured previously are applied. Probably the easiest way to do this would be at the view tier. In order to do that we would need to modify the liferay-ui:input-permissions taglib to account for the preconfigured permissions of a parent first instead of going to the XML file directly for defaults.
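The lookup order described here (the parent's preconfigured permissions first, the XML defaults only as a fallback) could be sketched roughly as follows. The class name, the map-based storage and the resource and action names are all hypothetical; this says nothing about how the taglib itself is implemented.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Arrays;
import java.util.Map;
import java.util.Set;

final class DefaultPermissions {
    private DefaultPermissions() {
    }

    // Returns the default actions for a new child resource: whatever was
    // configured on the parent if present, otherwise the XML defaults.
    static Set<String> resolve(
            Map<String, Set<String>> parentDefaults,
            Map<String, Set<String>> xmlDefaults,
            String childType) {
        Set<String> fromParent = parentDefaults.get(childType);
        if (fromParent != null) {
            return fromParent;
        }
        return xmlDefaults.getOrDefault(childType, Collections.<String>emptySet());
    }

    public static void main(String[] args) {
        // Hypothetical defaults as they might be read from the resources XML.
        Map<String, Set<String>> xmlDefaults = new HashMap<>();
        xmlDefaults.put("file-entry", new HashSet<>(Arrays.asList("VIEW")));
        xmlDefaults.put("folder", new HashSet<>(Arrays.asList("VIEW")));

        // Hypothetical defaults configured on the parent folder.
        Map<String, Set<String>> parentDefaults = new HashMap<>();
        parentDefaults.put("file-entry", new HashSet<>(Arrays.asList("VIEW", "UPDATE")));

        System.out.println(resolve(parentDefaults, xmlDefaults, "file-entry").contains("UPDATE")); // true
        System.out.println(resolve(parentDefaults, xmlDefaults, "folder").contains("UPDATE"));     // false
    }
}
```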

I plan to start working on this after the merge of communities and organizations if there is nothing more urgent before that. In any case if someone is available to start earlier I would be happy to guide him.