Tuesday, September 19, 2006

RoleMiner: Mining Roles using Subset Enumeration

author = {Jaideep Vaidya, Vijayalakshmi Atluri, and Janice Warner},
title = {RoleMiner: Mining Roles using Subset Enumeration},
booktitle = {CCS '06: Proceedings of the 13th ACM Conference on Computer and Communications Security},
year = {2006},
location = {Virginia, USA},
publisher = {ACM Press},
address = {New York, NY, USA},

Jaideep Vaidya et al.'s work uses unsupervised clustering and has a number of references what have also used clustering approaches to extract roles.

The concept behind the proposed “RoleMiner” is unsupervised clustering that uses each set of user permissions as a starting cluster. Roles are then created by finding common sets of permissions between users and other common sets of permissions. The theory behind this approach is that a set of permissions assigned to a user could potentially be a role. Intersects between user permission sets could also be potential roles.

Two approaches are proposed: the CompleteMiner and the FastMiner. The CompleteMiner uses each permission set for each user as the initial possible set of roles. From this, it uses subset enumeration to find all possible subsets as possible roles. To find all possible subsets, the algorithm completes in exponential time. A more computationally feasible approach (quadratic runtime) is to only consider subsets from pairs of initial roles (FastMiner). Both these approaches find possible roles (permission clusters). An ordering algorithm determines which are the best roles identified and the ones most likely to be implemented for the best RBAC infrastructure. The ordering algorithm places a bias on subsets found early from the first few iterations of the CompleteMiner, thus producing results that both approaches produce similar candidate roles. Also in the used test sets, each user is given at most 3 roles so it is very likely that a role is found during the first iteration of the CompleteMiner, the FastMiner.

For simplicity, separation of duty (SoD) constraints and sessions are not considered. Would considering these attributes of RBAC generate an alteration to the way that roles are identified? Is it possible to consider these attributes? Using the definition of a session as being an activation of a set of permissions or a set of roles assigned to a particular user, sessions would have limited effect on role extraction if the algorithm was purely based on analysis of permission to user assignments. SoD constraints would require more consideration. Static SoD’s would simply deny certain users to be allowed certain permissions in conjunction. This becomes a problem if permissions within each role are switched on or off for each user (based on the constraint). Otherwise, roles that cannot be granted in conjunction will cause permissions to not be granted in conjunction and thus will not effect the role mining. I cannot see the immediate effects on extracting roles from user permission assignments when there are dynamic SoD constraints. Dynamic SoD constraints would only effect session activations. In general, different types of SoD constraints may require different approaches. The solution seems unsolvable without more domain knowledge.

An open issue with the approach addressed by the paper is the lack of a role prioritization after candidate roles have been identified. It is suggested that semantics of what is required be added. Their approach places a bias roles found in the initial merge between any two user permission sets as a role. Doing so justifies their FastMiner, the computationally feasible approach that only consider merges from the first iterations of the CompleteMiner. If each user is given multiple roles, more iterations of the subset enumeration must be performed in order to find the original roles. Also, just because these are the roles that existed originally, what shows that they are "good" roles?

What is not discussed is the creation of a role hierarchy. The ORCA approach that RoleMiner is based on identifies role hierarchies while finding roles with no permission overlaps. Permission overlaps in different roles is now permissible, however, no method for extracting hierarchy is discussed. Possible future work may be to incorporate hierarchy definition into this approach. Permission subsets from later iterations of the CompleteMiner could potentially be lower lever roles that are inherited for people who require a superset of the permissions of that role.

It would also be interesting to see if different classification techniques other than clustering can be applied to extract roles. Why have previous approaches only looked at clustering?

1 comment:

Ron said...

Hi Dana:

Your approach for role mining is in fact similar to the one used in Eurekify Sage software. This approach can use the SE-tree algorithm, which I have developed while in Academia (Machine Learning Conference 1993).

The problem is that this family of algorithms does not scale as-is to large RBAC role engineering assignments, such as the ones we usually encounter in Identity and Access Management projects. To deal with that, we have developed quite a lot of specializations, which I cannot disclose since they are trade secrets. We have patented the general approach but without disclosing the fine details that make it scale.

I think there is quite a lot to improve on these algorithms though, so I wish you the best, and I would be happy to read articles that you publish.

Best regards,


Dr. Ron Rymon
Founder, Eurekify