Iyer 2018
Session: Research Track - ABAC SACMAT’18, June 13-15, 2018, Indianapolis, IN, USA
Therefore, it is essential for an access control mining approach to be able to mine policies that may contain both positive and negative authorization rules. We note that although policy mining has been largely studied in the context of RBAC, due to the lack of support for negative authorization in RBAC, naturally no previous work in that area addresses this problem [18]. The work by Xu and Stoller [27, 26], in the context of ABAC, also supports mining only policies with positive rules.

In this paper, we propose an algorithm for mining ABAC policies that can extract positive as well as negative authorization rules from given access control information. Rather than designing an algorithm from scratch, we adopt an existing rule mining algorithm from the data mining literature, called PRISM [4], and extend it to capture positive and negative authorization rules simultaneously. Therefore, compared to previous work [27, 26], our algorithm provides a more systematic and less heuristic approach to mining access control rules that not only extracts negative authorizations but also performs better in terms of time. Our key contributions in this paper are as follows.

• We propose an algorithm for mining ABAC policies that, to the best of our knowledge, is the first approach in the access control mining literature to extract negative authorization rules in addition to positive authorization rules.
• We present a detailed approach to generate the authorization log that is needed as input to the mining algorithm, in case it is not readily available for a system.
• We implement a prototype and conduct experiments on the performance of our algorithm in terms of correctness and conciseness of the mined rules and the time taken to generate them. We demonstrate that, when the original (ground truth) policy includes negative authorization rules, our algorithm generates a concise set of rules, which is not possible using previous work [27, 26]. Moreover, our algorithm outperforms previous work in terms of time for both positive-only and positive-and-negative authorization policies.

The rest of the paper is organized as follows. Section 2 discusses the reference policy model that we use for specifying ABAC policies. In Section 3, we discuss the design goals and challenges, and formally define the ABAC policy mining problem. Section 4 describes the proposed ABAC policy mining algorithm in depth, along with its time complexity. We discuss an approach to generate a complete authorization log in Section 5. In Section 6, we run experiments to test our proposed algorithm and analyze the results. Section 7 discusses related work in the field of policy mining and how our approach is novel compared to previous contributions. Finally, in Section 8, we provide additional discussion and conclusions.

2 ABAC POLICY MODEL

In this section, we present a specification and authorization semantics for ABAC policies that will be the basis for defining the policy mining problem and its proposed solution in this paper.

2.1 Policy Specification

An ABAC policy usually contains a disjunctive set of rules comprising attribute expressions on users and resources, actions, and applicable decisions. In the rest of this section, we generally use uppercase letters to denote a set and lowercase letters to denote an element of a set.

Let U be the set of users in the system and R be the set of resources. A user is characterized by a set of attributes. Let UATTR be the set of user attributes. To get the value of an attribute for a user, we use the notation u.uattr, where uattr ∈ UATTR is the name of the user attribute. Like a user, a resource is characterized by a set of attributes, which is denoted as RATTR. Given a resource attribute rattr ∈ RATTR, the value of that attribute for a resource is denoted by r.rattr. Let the domain of an attribute be the set of all possible values that the attribute can take. The domain of an attribute attr ∈ UATTR ∪ RATTR is denoted by dom(attr). For simplicity, we consider only categorical attributes in this work. We assume that every user and resource in the system has a unique identifier attribute, denoted by uid and rid, respectively. Authorizations in an ABAC system are determined for actions requested by users over resources. Let ACT be the set of all possible actions in the system.

The main components of an ABAC rule are attribute expressions that together (in a conjunctive format) determine the sets of users and resources to which a rule applies. An attribute expression can be either an attribute-value pair or an attribute-attribute pair. An attribute-value pair specifies the value that a user or resource attribute must have for the given rule to be applicable. An attribute-value pair for a user attribute uattr and a value val is expressed as u.uattr = val, and that for a resource attribute rattr and a value val is denoted as r.rattr = val. The following are examples of attribute-value pairs:

• u.department = CS
• r.type = transcript

In the above examples, the first attribute-value pair indicates the set of users whose department is CS, while the second attribute-value pair indicates the set of resources whose type is transcript.

An attribute-attribute pair specifies a pair of user and resource attributes that need to match for the rule to be applicable. Formally, an attribute-attribute pair can be expressed as u.uattr = r.rattr, where uattr ∈ UATTR and rattr ∈ RATTR. An example of an attribute-attribute expression is as follows:

• u.department = r.department

In the above example, the attribute expression is satisfied by the set of users and resources for which the user department is the same as the resource department.

Similar to attribute expressions, in particular attribute-value pairs, an ABAC rule includes an action expression that is denoted by action = act, where act ∈ ACT. Such an expression determines the access requests to which a rule is applicable based on the requested action. Finally, each ABAC rule includes a rule effect, which is interpreted as granting applicable requests (PERMIT) or denying them (DENY). Given the abovementioned components, an ABAC rule is formally defined as a pair ⟨ϕ, d⟩ where ϕ is a conjunctive set of attribute expressions and action expression and d ∈ {PERMIT, DENY}
is the rule effect. We use the following grammar for rule specification in this paper:

rule ::= ⟨ϕ, d⟩
ϕ ::= exp [; exp]
exp ::= u.uattr = value |
        r.rattr = value |
        u.uattr = r.rattr |
        action = value
d ::= PERMIT | DENY

The following are some examples of ABAC rules:

• ⟨u.position = faculty; u.chair = true; r.type = transcript; u.department = r.department; action = read_transcript, PERMIT⟩
• ⟨u.position = manager; u.department = accounts; r.type = budget; action = approve, DENY⟩

The first example above is a positive rule, according to which a user who is a faculty member and chair of a department can perform the read_transcript operation on all the transcripts in his/her department. On the other hand, the second example is a negative rule, according to which a manager from the accounts department who tries to approve the budget of a project (the project is a resource in this case) will be denied access.

As mentioned earlier, an ABAC policy is a disjunctive set of rules. We denote the complete set of rules in an ABAC policy by ρ. Along with the authorization rules, an ABAC policy also includes a default decision and a conflict resolution strategy. A default decision applies when none of the rules in the policy is applicable to an access request. A conflict resolution strategy applies when there is an overlap between positive and negative authorization rules. In other words, if both PERMIT and DENY rules are applicable to an access request, then the conflict resolution strategy of the policy decides the final decision for that access request [15, 16, 11].

In the context of this paper, in order to avoid overcomplicating our discussion about policy mining, we assume DENY as the default decision and deny-overrides as the conflict resolution strategy.

2.2 Authorization Process

An authorization request is a tuple ⟨u ∈ U, r ∈ R, a ∈ ACT⟩ indicating the requesting user, the requested resource and the action, respectively. Given an ABAC policy as described in Section 2.1, an ABAC authorization system evaluates each policy rule's expressions based on the request and determines whether the rule is applicable to the request. There are three possible scenarios based on such a matching process:

• the access request matches one or more rules in the policy, all of which contain the same access decision; or
• the access request matches more than one rule in the policy but the access decisions of those rules are conflicting, i.e., include both PERMIT and DENY decisions; or
• the access request does not match any rule in the policy.

The first scenario is quite straightforward. In this case, the authorization decision is the one returned by the rule(s) in the policy. In the second scenario, the final authorization decision is resolved according to the conflict resolution strategy of the policy. For example, if the conflict resolution strategy of the policy is deny-overrides, then a single applicable DENY rule in the policy is sufficient to make the final authorization decision DENY. Finally, in the third scenario above, the default decision of the policy determines the authorization result. For example, if the default decision of the policy is DENY, then the system denies an access request if no rule in the policy is applicable to the request.

3 PROBLEM STATEMENT

In this section, we present our design considerations and challenges for mining ABAC policies that include both positive and negative rules, and formulate the ABAC policy mining problem.

As input to the policy mining process, we consider a low-level log of authorization decisions in a system, which indicates the authorization decision (PERMIT or DENY) for any given access by a user to a resource. Since our goal is to mine rules based on user and resource attributes, such a log needs to be accompanied (and augmented) by the attributes of the users and resources involved in the log entries. Such an access log may be accumulated by and retrieved from a working authorization system (e.g., as collected in audit logs or for the sole purpose of mining). Also, in some cases, we may have an already existing access control policy specified using other models such as role-based access control (RBAC [7]) or simple access control lists (ACL [20]). In such cases, based on the existing policy, we may generate the desired log using an authorization engine, or a log conversion process in the case of simple models such as ACL.

A central design consideration and contribution of this work is the coexistence of positive and negative rules in a mined policy. The ability to specify both positive and negative rules is desirable in cases such as handling simple exceptions (e.g., all but one department should be able to access a file) or implementing strict requirements for a group of users/resources (e.g., all employees on administrative leave should not be able to access any resource). This requirement brings new challenges for mining ABAC policies compared to mining positive rules only. In order to better discuss the challenges, we illustrate two sample abstract policies as Venn diagrams in Figure 1. Here, the universe represents all possible access requests and corresponding decisions in the system. Each policy rule is represented as a circle determining the set of access instances to which it is applicable. As such, various overlapping situations can exist among policy rules. An overlap area indicates access instances with multiple applicable rules. As explained in Section 2.2, overlap can lead to a conflict situation if rules result in different decisions. For example, Figure 1(a) represents partial overlaps between two positive rules as well as overlap of the negative rules with the positive rules. Here, negative rules are proper subsets of positive rules, which may be used to specify exceptions to some permissions. For example, it can be the case that every student except students in the CS department can view the course list:

• ⟨u.position = student; action = view_course_list, PERMIT⟩
• ⟨u.position = student; u.department = CS; action = view_course_list, DENY⟩

Figure 1(b) shows another example where a negative rule overlaps with multiple positive rules.
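The authorization semantics of Section 2.2 can be made concrete with the two course-list rules above. The following is a minimal sketch, assuming DENY as the default decision and deny-overrides as the conflict resolution strategy (as this paper does). The tuple encoding of expressions and the names applies and authorize are illustrative assumptions, not part of the paper's prototype.

```python
PERMIT, DENY = "PERMIT", "DENY"


def applies(exprs, user, resource, action):
    """Return True when every expression in the conjunction holds for the request."""
    for kind, left, right in exprs:
        if kind == "user":          # u.left = right
            ok = user.get(left) == right
        elif kind == "resource":    # r.left = right
            ok = resource.get(left) == right
        elif kind == "match":       # u.left = r.right
            ok = left in user and right in resource and user[left] == resource[right]
        elif kind == "action":      # action = right
            ok = action == right
        else:
            raise ValueError(f"unknown expression kind: {kind}")
        if not ok:
            return False
    return True


def authorize(policy, user, resource, action):
    """Evaluate a request under deny-overrides with DENY as the default decision."""
    effects = {effect for exprs, effect in policy
               if applies(exprs, user, resource, action)}
    if DENY in effects:
        return DENY      # one applicable DENY rule is enough (deny-overrides)
    if PERMIT in effects:
        return PERMIT    # only PERMIT rules are applicable
    return DENY          # no applicable rule: default decision


# The two course-list rules from the text, in the assumed encoding.
policy = [
    ([("user", "position", "student"),
      ("action", None, "view_course_list")], PERMIT),
    ([("user", "position", "student"),
      ("user", "department", "CS"),
      ("action", None, "view_course_list")], DENY),
]

print(authorize(policy, {"position": "student", "department": "EE"}, {}, "view_course_list"))
print(authorize(policy, {"position": "student", "department": "CS"}, {}, "view_course_list"))
```

A CS student matches both rules, so the request falls into the second scenario and is denied, while a request from a non-student matches no rule and is denied by default. A flat log cannot tell these two kinds of DENY apart, which is the difficulty discussed next.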
Figure 1: Policy spaces demonstrating PERMIT rules, conflicting DENY rules, and DENY space as default decision. (a) Conflicting DENY rules are proper subsets of PERMIT rules; and (b) DENY rules conflict with more than one PERMIT rule.

Looking at the above examples from the viewpoint of a policy mining algorithm, which sees only flat access log data, it is challenging to discover the rules when both positive and negative rules exist. The solution needs to distinguish DENY cases that result from applying negative rules from those that result from applying the default rule. Note that we consider DENY as the default decision in this paper, as explained in Section 2.1. In Figure 1, the non-shaded area outside of the rules represents cases to which the default policy applies, while the crossed areas represent cases in which a negative rule results in DENY. Figure 1(b) highlights another desired characteristic for our solution. Rather than trying to generate three different specific negative rules, each corresponding to one of the crossed DENY pieces that are cut out of the positive rules, we need to be able to detect that they belong to one more general negative rule.

Finally, a policy mining solution should strive to derive a policy that is as concise as possible, as concise policies are more manageable and easier to interpret. In terms of an ABAC policy, we would like to create fewer rules as well as rules with fewer expressions (more general rules). Previous work on mining ABAC policies [26] has adopted the notion of Weighted Structural Complexity (WSC), previously defined in the context of RBAC policy mining [19], as a metric for this purpose. We adopt the same notion here. Informally, the WSC of an ABAC policy is the sum of the weights of all of its rules, where each rule's weight is calculated as the weighted sum of the number of expressions in that rule. Mathematically, the WSC of an ABAC policy composed of the ruleset ρ is given as:

WSC(ρ) = ∑_{rule ∈ ρ} WSC(rule)

WSC(rule) = w_1 |α| + w_2 |β| + w_3 |γ|

where α, β and γ are, respectively, the sets of attribute-value pairs, attribute-attribute pairs, and action expressions in the rule's ϕ. Moreover, the w_i are user-specified weights that adjust the contribution of each kind of expression to the rule's conciseness.

Based on the abovementioned considerations, we define the ABAC policy mining problem as follows. The ABAC policy mining problem accepts a complete authorization log (augmented with attributes) as input and extracts an ABAC policy (Section 2.1) that is concise and consistent with the authorization log. Based on the notation discussed in Section 2.1, the authorization log is a set of records, each indicating the attribute values of a requesting user (UATTR), the attribute values of the requested resource (RATTR), an action (ACT), and the corresponding access decision (PERMIT or DENY). In this work, we assume a complete log as input, meaning that every potential combination of attribute values is provided in the log (or otherwise assumed and set to be equal to the default DENY decision). As the correctness criterion, the mined policy must be consistent with the input authorization log, i.e., the authorization of a log entry according to the mined policy and the semantics described in Section 2.2 must result in the same access decision as in the log entry. As the quality criterion, a solution to the mining problem must aim for policies that are as concise as possible. We quantify performance towards achieving this goal using the abovementioned WSC metric.

4 MINING ABAC POLICIES

In this section, we propose an approach for mining ABAC policies that include both positive and negative rules based on the policy model discussed in Section 2. Our proposed algorithm follows a systematic flow to mine optimal rules, as shown in Figure 2, avoiding heuristic and sub-optimal procedures as much as possible. The flow starts with mining positive rules, but also discovers conflicting negative rules simultaneously as a subprocess. In the following, we present our algorithm and analyze its time complexity.

4.1 Positive/Negative Rule Mining Algorithm

In order to mine attribute-based rules from authorization logs, we adopt concepts from a rule mining algorithm called PRISM [4].
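For readers unfamiliar with PRISM-style rule induction, the following is a minimal sketch of its core loop for a single target decision, restricted to attribute-value pairs. It is an illustrative simplification and not the algorithm proposed in this paper: it ignores attribute-attribute pairs and negative rules. The names best_pair and induce_rules, and the log representation as a list of (attributes, decision) tuples, are assumptions made for the example.

```python
def best_pair(rows, target, used):
    """Pick the unused (attr, val) maximizing P(target | attr = val).

    Ties on probability are broken in favour of larger coverage.
    Returns None when every attribute has already been used.
    """
    stats = {}  # (attr, val) -> [rows covered, covered rows with target decision]
    for attrs, decision in rows:
        for attr, val in attrs.items():
            if attr in used:
                continue
            counts = stats.setdefault((attr, val), [0, 0])
            counts[0] += 1
            counts[1] += decision == target
    if not stats:
        return None
    return max(stats, key=lambda p: (stats[p][1] / stats[p][0], stats[p][0]))


def induce_rules(rows, target="PERMIT"):
    """Greedily build conjunctive rules until no target instance remains."""
    rules, remaining = [], list(rows)
    while any(d == target for _, d in remaining):
        subset, rule = remaining, {}
        # Specialize the rule until it covers target instances only.
        while any(d != target for _, d in subset):
            pair = best_pair(subset, target, rule)
            if pair is None:
                break  # attributes exhausted: the log is not separable
            attr, val = pair
            rule[attr] = val
            subset = [(a, d) for a, d in subset if a.get(attr) == val]
        rules.append(rule)
        # Drop every instance covered by the new rule before mining the next one.
        remaining = [(a, d) for a, d in remaining
                     if not all(a.get(k) == v for k, v in rule.items())]
    return rules


# Toy log: only CS faculty are permitted.
log = [
    ({"position": "faculty", "department": "CS"}, "PERMIT"),
    ({"position": "faculty", "department": "EE"}, "DENY"),
    ({"position": "student", "department": "CS"}, "DENY"),
    ({"position": "student", "department": "EE"}, "DENY"),
]
print(induce_rules(log))
```

On the toy log the sketch recovers a single rule, the conjunction of position = faculty and department = CS. The algorithm described in the remainder of this section follows the same iterative scheme but additionally considers attribute-attribute pairs and conflicting DENY rules.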
[Figure 2: flowchart of the mining approach. Only node labels survive extraction: "Remove the subset from log"; "Add the PERMIT rule and the conflicting DENY rule to ruleset"; "Add the PERMIT rule to ruleset"; "Mined ABAC ruleset"; "Stop"; branch labels "Yes".]

The backbone of the PRISM algorithm is an induction strategy for finding the attribute-value pair, α_x, which yields the highest conditional probability for a particular classification, δ_n, that is, for which P(δ_n | α_x) is maximum. In the context of this paper, the conditional probability P(δ_n | α_x) is the probability of occurrence of a PERMIT or DENY decision, δ_n, for a given attribute expression, α_x.

At a high level, our ABAC policy mining algorithm works in an iterative manner, as shown in Algorithm 1. The input to the algorithm is an authorization log (augmented with user/resource attributes) as described in Section 3. The getLastCol function returns all the access decisions, in order, from the input dataset. The outer while loop runs until the log does not contain any PERMIT instances, to ensure that all positive rules have been mined. The inner while loop runs until a subset of the log contains only PERMIT instances or a conflicting DENY rule is encountered. Basically, the inner while loop is used to mine either a positive rule or a pair of positive and conflicting DENY rules. Within the inner loop, lines 7-17 return the attribute expression, either an attribute-value or an attribute-attribute pair, that yields the highest conditional probability for PERMIT. If equal probabilities are encountered, then the one with larger coverage is returned. An attribute expression has larger coverage than another if the number of instances in the dataset that contain the former is greater than that for the latter. The selected attribute expression is then added to the positive rule. The getInstances function returns the subset of the log containing those instances that satisfy the selected attribute expression. Within this subset, the existence of a conflicting DENY rule is checked using the findDenyRule function (Algorithm 4). If one exists, the conflicting DENY rule is added to the ruleset and the inner loop breaks. Otherwise, the inner loop repeats over the subset created in the previous iteration, until the subset comprises only PERMIT instances. Lines 24-25 add the PERMIT rule, which is created by taking the conjunction of all attribute expressions selected in the inner loop, to the ruleset, and remove all instances from the log that are covered by this rule. The generalizeDenyRules function (Algorithm 5) generalizes all the negative rules in the ruleset by removing redundant attribute expressions from those rules. The output of the ABAC policy mining algorithm is a set of positive and conflicting negative rules.

Algorithm 1: mineRules
Input: log (complete authorization log)
Output: List of rules
 1  decision_col ← getLastCol(log)
 2  while PERMIT ∈ decision_col do
 3      X ← log
 4      Y ← decision_col
 5      ϕ ← ∅
 6      while DENY ∈ Y do
 7          (attr, val, prob) ← findAttrValPair(X, PERMIT)
 8          coverage1 = length(getInstances(X, attr, val))
 9          (attr1, attr2, prob2) ← findAttrAttrPair(X, PERMIT)
10          coverage2 = length(getInstances(X, attr1, attr2))
11          if prob = prob2 and coverage1 > coverage2 then
12              (expr_LHS, expr_RHS) ← (attr, val)
13          else if prob2 < prob then
14              (expr_LHS, expr_RHS) ← (attr, val)
15          else
16              (expr_LHS, expr_RHS) ← (attr1, attr2)
17          ϕ.add(expr_LHS = expr_RHS)
18          X ← getInstances(X, expr_LHS, expr_RHS)
19          Y ← getLastCol(X)
20          deny_rule ← findDenyRule(X, ϕ, Y)
21          if deny_rule != null then
22              ruleset.add(deny_rule)
23              break
24      ruleset.add(⟨ϕ, PERMIT⟩)
25      log.removeInstances(ϕ)
26      decision_col ← getLastCol(log)
27  ruleset ← generalizeDenyRules(ruleset, log)

Algorithms 2 and 3 present two functions for returning the attribute expressions, attribute-value pair and attribute-attribute pair respectively, with the highest conditional probability for PERMIT. The inputs to both these functions are the same. The loop in the findAttrValPair function
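Lines 7-16 of Algorithm 1, which choose between the best attribute-value pair and the best attribute-attribute pair, can be sketched as follows. This is a hedged illustration only: the bodies of findAttrValPair and findAttrAttrPair are not reproduced in this excerpt, so the scoring helper below (conditional probability with coverage as a secondary key) is an assumption, as are the names score and choose_expression and the flat dictionary representation of log rows.

```python
PERMIT, DENY = "PERMIT", "DENY"


def score(rows, predicate, target=PERMIT):
    """Return (P(target | predicate), coverage) over the rows satisfying predicate."""
    covered = [r for r in rows if predicate(r)]
    if not covered:
        return 0.0, 0
    hits = sum(1 for r in covered if r["decision"] == target)
    return hits / len(covered), len(covered)


def choose_expression(rows, uattrs, rattrs, target=PERMIT):
    """Select the next expression, mirroring lines 7-16 of Algorithm 1.

    rows must be non-empty; uattrs and rattrs are lists of column names.
    """
    val_cands = [(a, v) for a in uattrs + rattrs
                 for v in sorted({r[a] for r in rows})]
    attr_cands = [(ua, ra) for ua in uattrs for ra in rattrs]

    def val_score(c):
        return score(rows, lambda r: r[c[0]] == c[1], target)

    def attr_score(c):
        return score(rows, lambda r: r[c[0]] == r[c[1]], target)

    best_val = max(val_cands, key=val_score)      # assumed findAttrValPair
    best_attr = max(attr_cands, key=attr_score)   # assumed findAttrAttrPair
    prob, coverage1 = val_score(best_val)
    prob2, coverage2 = attr_score(best_attr)

    if prob == prob2 and coverage1 > coverage2:   # line 11
        return ("value",) + best_val
    elif prob2 < prob:                            # line 13
        return ("value",) + best_val
    else:                                         # line 15
        return ("match",) + best_attr


# Toy log: access is permitted exactly when the departments match.
rows = [
    {"u.department": "CS", "r.department": "CS", "decision": PERMIT},
    {"u.department": "CS", "r.department": "EE", "decision": DENY},
    {"u.department": "EE", "r.department": "CS", "decision": DENY},
    {"u.department": "EE", "r.department": "EE", "decision": PERMIT},
]
print(choose_expression(rows, ["u.department"], ["r.department"]))
```

In the toy log no single attribute-value pair separates PERMIT from DENY, so the attribute-attribute pair u.department = r.department is selected. Note that, as written in Algorithm 1, a tie in both probability and coverage falls through to the attribute-attribute pair.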
the attribute expression yielding the highest conditional probability for DENY is selected. Lines 7-9 ensure that the selected attribute expression is added to the input rule only if the set of distinct values contained in attribute attr equals the domain of attr. The flag variable indicates whether any attribute expression was added to the input rule. A subset of the input dataset is created comprising all instances containing the selected attribute expression. The loop is then repeated on this subset, until it contains only instances of DENY. At this point, a negative rule is created by taking the conjunction of all attribute expressions selected in the loop. The mined negative rule is indeed a conflicting negative rule if it covers all DENY instances in the input dataset (lines 12-17).

After generating the initial ruleset from the access control log, the policy mining algorithm generalizes the DENY rules in the ruleset as specified in Algorithm 5. For every DENY rule in the ruleset, we initially check whether it is a subset of a generalized DENY rule, and if it is, it is removed from the ruleset. Otherwise, the DENY rule is generalized by removing its components (attribute expressions) one at a time. Each time a component is removed, we check whether the new DENY rule covers any PERMIT instances. If it does not, then the redundant component is removed from the original DENY rule. Finally, the generalized DENY rule is added to the ruleset. The getRuleCoverage function returns the set of instances covered by a rule in the access log.

4.2 Time Complexity

The time complexity of the ABAC policy mining algorithm (Algorithm 1) can be calculated as follows. Let n be the number of records or instances and d be the number of attributes in the access log. The outer loop runs as many times as the number of PERMIT instances in the log. So, the running time of the outer loop is O(n).

The inner loop runs as many times as the total number of attributes involved, including all the attribute expressions, within a particular PERMIT rule. In the worst case, a PERMIT rule can be formed by all attributes for attribute-value pairs and all combinations of attributes for attribute-attribute pairs. Since a rule cannot contain duplicate attribute expressions, the total number of attributes included in attribute-value pairs is d and that for attribute-attribute pairs is of the order d^2. So, the running time of the inner while loop is O(d^2).

Calculating the optimal attribute-value pair (line 7 in Algorithm 1; Algorithm 2) takes O(nd) in the worst case, when every attribute in the log contains n distinct values. Further, calculating the optimal attribute-attribute pair (line 9 in Algorithm 1; Algorithm 3) takes O(d^2) time. So, the total time taken for calculating the optimal attribute expression is O(nd).

Computing a conflicting DENY rule (line 20 in Algorithm 1; Algorithm 4) takes O(nd^3) time in total. This is because the while loop in Algorithm 4 takes O(d^2) time in the worst case, when a DENY rule contains all attributes for attribute-value pairs and all combinations of attributes for attribute-attribute pairs. Moreover, calculating the optimal attribute expression (line 4 in Algorithm 4) takes O(nd) time.

Generalizing the DENY rules (line 27 in Algorithm 1; Algorithm 5) consumes a total of O(nd) time. This is because the loop in Algorithm 5 runs for each attribute, within every DENY rule in the initial ruleset. Since rules represent instances of the access log, the number of DENY rules in the initial ruleset is of the order n.

The total running time of our ABAC policy mining algorithm is, therefore, O(n^2 d^5). The time complexity of our policy mining approach is much less than O(n^3), which is the worst case running time of [26] (details in Section 7). Suppose every attribute in the log contains exactly two values in its domain; then the total number of instances in a complete log, n, is 2^d. Besides, in a realistic application, the domains of attributes have more than two values. So, for large values of d, d^5 << m^d (= n), where m ∈ {2, 3, 4, ...} depending on the application.

5 GENERATING ACCESS LOGS

In this part, we discuss in detail the algorithm used for generating the log. The proposed algorithm can be used as a framework for generating synthetic logs, which can be utilized for various analysis purposes. For example, we use synthetic logs, generated from ABAC policies, for the evaluation of our policy mining approach, so that we have the ground truth when comparing the mined ABAC policy with the original ABAC policy.

Log generation is an important phase in our proposed approach, because the log output by this phase serves as the input for the ABAC policy mining algorithm. Our goal is to understand the behavior of an underlying access control model. So we consider all possible combinations of all possible users, resources and actions, in short all possible scenarios of access requests, while generating the log, to be able to interpret all possible operations of the underlying access control model.

The log generation algorithm works as follows. Using the set of user attributes and the domain of each user attribute, all possible users in the system are created by enumerating all possible combinations of values for user attributes. Similarly, all possible resources in the system are created using the set of resource attributes and the domain of each resource attribute. A unique identifier is allocated to each user and resource. Then, using the complete set of users, resources and actions in the system, all possible (user, resource, action) combinations are determined, to enumerate all possible access requests that can be created in the system. While generating the complete set of access requests, each request is evaluated against the given XACML policy to determine the access decision for that request, following which the access request and corresponding access decision are written to a file.

Each record in the log indicates whether a certain user can perform a certain action on a certain resource. In other words, each row in the log contains the tuple (A_u, A_r, Action, Decision), where A_u and A_r are, respectively, the sets of requesting user and requested resource attributes, Action is the requested action, and Decision ∈ {PERMIT, DENY} is the access decision corresponding to that access request.

5.1 Determining domains for each attribute

A challenge that we encountered while generating the log was to determine the domain for each user and resource attribute, because all possible combinations of values for user and resource attributes are created based on this domain information. In the context of this paper, we assume four types of columns or attributes: columns appearing only in the set of user attributes or resource
attributes (but not in both), referred to as usr-only attributes and res-only attributes respectively; columns that appear in the intersection of the user and resource attribute sets, referred to as usr-res attributes; a user column dependent on the resource identifier column, called a usr-foreign-key column; and a resource column dependent on the user identifier column, referred to as a res-foreign-key attribute.

The basic algorithm for defining the domain for all attributes is as follows:

Step 1: Determine the domain for all usr-only and res-only attributes.
Step 2: For every column c ∈ usr-res, repeat:
  – First determine the domain of c.
  – Then the domain of each uattr and rattr, where uattr ∩ rattr = c, is the same as the domain of c, that is, dom(uattr) = dom(rattr) = dom(c).
Step 3: If the set of user attributes contains a column c ∈ usr-foreign-key, then the domain of c, dom(c), is the set or subset of resource identifiers, as required by c.
Step 4: If the set of resource attributes contains a column c ∈ res-foreign-key, then the domain of c, dom(c), is the set or subset of user identifiers, as required by c.

6 EVALUATION

We have implemented the algorithms proposed in Section 4 and report our experimental evaluation in this section. As our evaluation approach, rather than starting from an access log, we conduct our experiments by generating an access log from an ABAC policy, and then mine policies based on the generated log. Such an approach ensures that we have access to ground truth policies (i.e., original policies) with which we can compare the results of our mining algorithm. We follow a systematic approach to generate a comprehensive as well as minimal log as proposed in Section 5.

We compare the performance of our algorithm with the previously proposed algorithm by Xu and Stoller [26], which we refer to as XSAM in the rest of this section. We should note that XSAM is only capable of mining positive attribute-based access control policies. Therefore, we conduct our experiments on both policies that contain only positive rules, and policies that include positive as well as conflicting negative rules.

6.1 Datasets

We perform our experiments on two policy datasets that we have adapted from [26] to include negative authorizations. The university policy, University, authorizes accesses to applications, gradebooks, rosters and transcripts, requested by students, faculty, applicants and staff in the registrar/admissions office. The project management policy, Project, controls accesses by accountants, auditors, planners and managers to tasks, schedules and budgets associated with projects.

In order to provide a fair assessment and comparison of our algorithm versus XSAM, we use two different versions of University

6.2 Implementation

Our log generation implementation works based on a list of all possible user attributes and resource attributes along with the domain of each attribute, a list of all possible actions, and an ABAC policy written in XACML 3.0 [6] (Section 5). Each policy in XACML comprises a set of rules, where each rule consists of a sequence of attribute expressions to determine which access requests the rule applies to and a rule effect to determine the access decision in case the rule is satisfied by an access request. Consistent with our policy model, we use the deny-overrides rule-combining algorithm for XACML policies. The log generation algorithm is implemented in Java (JDK 1.8). We use WSO2 Balana [25], an open-source XACML implementation, to determine the access decision corresponding to each access request for a given XACML policy. The policy mining algorithm is written in Python (Python 3.5). We performed each experiment 10 times and report the average time measurement in our experiments. The experiments were performed on a 64-bit Windows 10 machine with 12 GB RAM and an Intel Core i7-6700HQ processor.

Table 1 summarizes the access logs generated for the university and project management policies. We note that the same number of access requests were generated regardless of positive-only vs. positive/negative policy versions. |attr_u| is the number of user attributes, including the unique identifier attribute. Similarly, |attr_r| is the number of resource attributes, including the identifier attribute. |U|, |R| and |O| are, respectively, the total number of users, resources and actions in the system. |log| is the total number of records in the generated log, which is computed as |U| × |R| × |O|.

6.3 Experiments with Positive Authorizations

We first compare the performance of our policy mining algorithm with XSAM [26] on policies consisting of only positive authorizations. This can provide insight into how the performances compare on a mining problem that both approaches should be able to solve by design. We use the complete log as input to our algorithm, and provide only the PERMIT instances as input to XSAM since it works based on access control lists (ACLs).

The first four rows in Table 2 show the results of our algorithm and XSAM on the UniversityP and ProjectP policies. The table compares the approaches on the basis of quality, with respect to preciseness and conciseness, of mined rules and total time taken for execution. |ρ_orig+| and |ρ_orig−| are the number of positive and negative rules in the original ABAC policies, whereas |ρ_mined+| and |ρ_mined−| are the number of positive and negative rules in the mined ABAC policies. Further, WSC_orig and WSC_mined are, respectively, the WSC measure for original and mined policies. When calculating WSC for the experiments, we consider all user-specified weights w_i to be equal to one. Finally, Run time is the total time, in seconds, taken for mining ABAC policies from the given access control information.

As demonstrated in Table 2, both approaches, XSAM [26] and
our proposed work, perform exactly the same in terms of mining
and Project policies. Policies UniversityP and ProjectP contain
concise rules that are syntactically and semantically similar to the
only postive authorization rules, while policies UniversityPN and
original policy. However, our approach outperforms XSAM in terms
ProjectPN have both positive and negeative authorization rules.
of running time.
We have included the policies in Appendix A.
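To make the log size |U| × |R| × |O| from the discussion of Table 1 concrete, the following is a minimal Python sketch of producing a complete log from a ground-truth policy by enumerating every (user, resource, action) request and labeling it. The function names, the toy attribute data and the stand-in decide policy are hypothetical illustrations, not the authors' implementation or datasets.

```python
from itertools import product

def generate_complete_log(users, resources, actions, decide):
    """Return one (user, resource, action, decision) record per possible request."""
    return [(u, r, a, decide(u, r, a))
            for u, r, a in product(users, resources, actions)]

# Hypothetical toy data; attribute names and values are illustrative only.
users = [{"uid": "u1", "position": "faculty"},
         {"uid": "u2", "position": "student"}]
resources = [{"rid": "r1", "type": "gradebook"},
             {"rid": "r2", "type": "roster"},
             {"rid": "r3", "type": "transcript"}]
actions = ["read", "write"]

def decide(user, resource, action):
    # Stand-in policy (an assumption): faculty may read gradebooks,
    # every other request is denied.
    permitted = (user["position"] == "faculty"
                 and resource["type"] == "gradebook"
                 and action == "read")
    return "PERMIT" if permitted else "DENY"

log = generate_complete_log(users, resources, actions, decide)
print(len(log))  # 2 users * 3 resources * 2 actions = 12 records
```

Because the log grows with the product of the three set sizes, even modest increases in the number of users or resources enlarge the mining input considerably.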
Table 1: Details of the access logs created from original ABAC policies
Table 2: Comparison of our proposed algorithm with XSAM [26] for university and project management policies
Mining Alg.     Policy         |ρ_orig+|  |ρ_orig−|  |ρ_mined+|  |ρ_mined−|  WSC_orig  WSC_mined  Time (s)
XSAM            UniversityP    5          -          5           -           19        19         1540
Proposed work   UniversityP    5          -          5           -           19        19         936
XSAM            ProjectP       11         -          11          -           49        48         1328
Proposed work   ProjectP       11         -          11          -           49        48         896
XSAM            ProjectPN      11         4          20          -           67        4324       1370
Proposed work   ProjectPN      11         4          11          4           67        64         1032
XSAM*           UniversityPN   11         3          -*          -*          56        -*         7200+*
Proposed work   UniversityPN   11         3          11          3           56        53         1123
* XSAM [26] neither terminated nor produced any output for the UniversityPN policy, even after running for more than two hours.
6.4 Experiments with Positive and Negative Authorizations
Our second set of experiments is on policies consisting of both positive and negative authorization rules. The last four rows in Table 2 show the performance of the two approaches on the ProjectPN and UniversityPN policies. Our observations show that our approach precisely mines concise positive and negative rules for both policies, whereas XSAM [26] computes verbose rules that are identity-based rather than attribute-based. For example, in the case of ProjectPN, while our proposed algorithm mines a total of 15 rules with a WSC of 64, XSAM produces 20 rules with a significantly larger WSC (4332). Furthermore, in our experiments, XSAM was not able to terminate and produce an output for UniversityPN even after running for more than two hours. The results clearly demonstrate the need for mining negative authorization rules along with positive rules.
6.5.2 Comparison with XSAM. Our policy mining algorithm and the XSAM approach perform similarly when only positive authorization rules need to be mined. However, there is a significant difference when both positive and negative authorization rules are considered. In particular, when experimenting on policies containing negative authorizations, XSAM either does not terminate in reasonable time (it was stopped after two hours for UniversityPN) or produces verbose positive rules containing identifier attributes (which should be avoided for ABAC policies). In addition, our policy mining approach ran faster than XSAM in all of our experiments.
Although it may be argued that XSAM considers negative instances implicitly (i.e., any access not permitted is denied), it fails badly when considering both positive and negative rules, as demonstrated in our experiments. This is because the policies ProjectPN and UniversityPN are particularly hard to express using only positive rules, which emphasizes the need for explicitly mining negative authorization rules along with positive authorization rules.
WSC of the mined RBAC policy is minimized. Consistent with [26], we adopt the notion of WSC for measuring the complexity of mined ABAC policies.
The limitation of RBAC policy mining is that, to obtain an RBAC configuration from the given User Permission Assignments, role mining problems consider only positive authorizations, in terms of what permissions are assigned to users based on their roles. Our ABAC mining approach, on the other hand, considers both positive and negative authorizations while obtaining an ABAC policy.
Xu and Stoller were the first to introduce the concept of ABAC policy mining [26]. The motivation behind ABAC mining is to ease the burden of migrating to an ABAC framework from an existing access control paradigm by partially automating the migration process. At a high level, their policy mining algorithm works as follows. Initially, they generate an Access Control List (ACL), which they refer to as the User Permission Relation, from an ABAC policy and attribute data. Then their policy mining algorithm, while iterating over the tuples in the given User Permission Relation, selects a user permission tuple that is used as the seed for creating a candidate rule. This candidate rule is then generalized by replacing conjuncts in attribute expressions with constraints. The goal of their generalization process is to increase the coverage of the rule in terms of the additional tuples that can be covered by the rule in the User Permission Relation. The set of candidate rules, which altogether cover the entire ACL, is then optimized by removing redundant rules and merging pairs of rules. A rule is redundant if it covers only instances in the User Permission Relation that are already covered by some other rule. Two different rules having the same constraints are merged by taking the union of the conjuncts in those rules for every attribute. However, their algorithm does not deal with negative authorizations. Moreover, the ABAC policy mining algorithm presented in [26] is heavily heuristic and complicated to interpret. Importantly, its running time is cubic in the size of the ACL, whereas the time complexity of our ABAC mining approach is much less than cubic, as explained in detail in Section 4.2.
Recently, Medvet et al. [17] proposed an evolutionary, separate-and-conquer approach for mining ABAC policies, using the same policy language and case studies as in [26]. In their work, a new rule is generated and the set of access requests decreases to a smaller size during each iteration. Similar to Xu and Stoller [26] and unlike our proposed approach, their work is not capable of mining negative authorization rules. Moreover, there is not much difference in terms of performance compared to [26]. Therefore, we only compare our performance against [26].
Our policy mining algorithm is closely related to the PRISM rule mining algorithm [4] by Cendrowska. PRISM is an established data mining algorithm for inducing rules corresponding to a given dataset. It serves as a solution for the traditional data mining classification problem. Given a training dataset containing different classifications, PRISM outputs a set of modular rules, where each rule contains a combination of attribute-value pairs for arriving at a particular classification. To yield a set of disjunctive rules, PRISM uses an induction strategy for finding the attribute-value pair that delivers the most information about a particular classification. In other words, when determining a rule for a particular classification δ_n, PRISM finds the attribute-value pair α_x that gives the highest conditional probability for the classification δ_n, that is, PRISM selects the α_x for which the probability of occurrence of the classification δ_n, given the attribute-value pair α_x, is maximum. However, the limitation of PRISM in the context of this paper is that, for a particular rule, PRISM tends to find only attribute-value pairs, but not attribute-attribute pairs. As a result, when PRISM is run on the access control log, which serves as a suitable training dataset comprising the two classifications PERMIT and DENY, PRISM creates rules containing identifier attributes such as the unique user identifier attribute and the unique resource identifier attribute. Consequently, the output is verbose since it contains a large number of rules. Our policy mining algorithm, although based on PRISM, overcomes this drawback by also considering attribute-attribute pairs, along with attribute-value pairs, while constructing a rule for PERMIT or DENY.
8 DISCUSSIONS AND CONCLUSIONS
In this paper, we proposed an algorithm for mining ABAC policies capable of discovering both positive and negative authorization rules simultaneously. While previous approaches in the access control policy mining literature have focused on positive-only authorization rules (including more recent work on ABAC mining [26]), our work contributes to the area by discovering negative authorization rules as well. We evaluated our policy mining algorithm on logs generated from two synthetic but realistic policies. Our observations from the experiments show that the mined rules never reference identity-based attributes such as user identifier and resource identifier attributes. The results also demonstrate that the mined rules are equivalent to the original ABAC policy, and that the mined policies are concise relative to the originals. Furthermore, we demonstrated through experiments and theoretical analysis that our approach outperforms the previous ABAC mining algorithm [26].
Our mining algorithm attempts to mine positive and negative rules simultaneously. An alternative strategy would be to mine all possible positive rules first and then combine rules in a way that results in a more general set of positive and negative rules. However, such an alternative strategy will lead to many granular positive rules, which then need to be considered for generalization. Combining granular rules is itself a complex problem to solve optimally. We note that the previous ABAC mining approach [26] followed such a strategy (for positive rules only), but it relied on heuristics for generalization (by considering only pairs of rules). A main design objective of our mining approach was to avoid such heuristic, sub-optimal strategies.
Based on our experimental results and theoretical analysis, the proposed mining algorithm is feasible to employ in practice. The running time of the algorithm was on the order of minutes for the synthetic policies. Theoretically, the time complexity of our policy mining algorithm depends on the size of the complete log, which is exponential in the number of attributes. While we acknowledge this limitation, we note that it applies to any log mining algorithm that aims to avoid false positives/negatives. Moreover, we note that policy mining is inherently an offline and less time-sensitive task.
As future work, we plan to extend our approach to incorporate other ABAC features, such as support for numerical data and other relational operators such as subset in attribute expressions. We will also explore algorithmic improvements and more extensive quantitative analysis based on policies of different sizes.
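The PRISM selection step described in the related-work discussion, choosing the attribute-value pair α_x that maximizes the conditional probability of a classification δ_n, can be illustrated with a small Python sketch. This is only an approximation of that single step under stated assumptions: the record layout, the function name, the toy log and the tie-breaking by coverage are hypothetical, and the sketch covers neither the full PRISM procedure nor the attribute-attribute extension proposed in this paper.

```python
from collections import Counter

def best_attribute_value_pair(records, target, class_key="decision"):
    """Pick the (attribute, value) pair with the highest P(target | pair)."""
    totals, hits = Counter(), Counter()
    for rec in records:
        for attr, val in rec.items():
            if attr == class_key:
                continue
            totals[(attr, val)] += 1
            if rec[class_key] == target:
                hits[(attr, val)] += 1
    # Assumption: ties in probability are broken by the number of covered
    # target records (larger coverage wins).
    return max(totals, key=lambda p: (hits[p] / totals[p], hits[p]))

# Hypothetical toy log with two attributes and a PERMIT/DENY label.
records = [
    {"position": "faculty", "type": "gradebook", "decision": "PERMIT"},
    {"position": "faculty", "type": "roster", "decision": "PERMIT"},
    {"position": "student", "type": "gradebook", "decision": "DENY"},
    {"position": "student", "type": "roster", "decision": "DENY"},
]

print(best_attribute_value_pair(records, "PERMIT"))  # ('position', 'faculty')
```

In this toy log, position = faculty predicts PERMIT with probability 1, while each type value predicts it with probability 0.5, so the former is selected first.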
• ⟨u.adminRole = manager; r.type = budget; u.department = r.department; action = read, PERMIT⟩
• ⟨u.adminRole = manager; r.type = budget; u.department = r.department; action = approve, PERMIT⟩
• ⟨r.type = schedule; u.projectLed = r.project; action = read, PERMIT⟩
• ⟨r.type = budget; u.projectLed = r.project; action = read, PERMIT⟩
• ⟨r.type = schedule; u.projectLed = r.project; action = write, PERMIT⟩
• ⟨r.type = budget; u.projectLed = r.project; action = write, PERMIT⟩
• ⟨r.type = schedule; u.project = r.project; action = read, PERMIT⟩
• ⟨r.type = task; u.task = r.rid; action = setStatus, PERMIT⟩
• ⟨r.type = task; r.proprietary = false; u.project = r.project; u.expertise = r.expertise; action = read, PERMIT⟩
• ⟨r.type = task; r.proprietary = false; u.project = r.project; u.expertise = r.expertise; action = request, PERMIT⟩
• ⟨u.isEmployee = true; r.type = task; u.project = r.project; u.expertise = r.expertise; action = read, PERMIT⟩
• ⟨u.isEmployee = true; r.type = task; u.project = r.project; u.expertise = r.expertise; action = request, PERMIT⟩
• ⟨u.adminRole = auditor; r.type = budget; u.project = r.project; action = read, PERMIT⟩
• ⟨u.adminRole = accountant; r.type = budget; u.project = r.project; action = read, PERMIT⟩
• ⟨u.adminRole = accountant; r.type = budget; u.project = r.project; action = write, PERMIT⟩
• ⟨u.adminRole = accountant; r.type = task; u.project = r.project; action = setCost, PERMIT⟩
• ⟨u.adminRole = planner; r.type = schedule; u.project = r.project; action = write, PERMIT⟩
• ⟨u.adminRole = planner; r.type = task; u.project = r.project; action = setSchedule, PERMIT⟩
Table 3: ProjectP policy rules
• ⟨u.adminRole = manager; u.department = dept2; r.type = budget; action = read, DENY⟩
• ⟨u.adminRole = manager; r.type = budget; r.project = proj21; action = approve, DENY⟩
• ⟨u.adminRole = planner; u.department = dept3; u.expertise = testing; r.type = schedule; action = read, DENY⟩
• ⟨r.type = task; r.department = dept2, DENY⟩
Table 4: DENY rules in ProjectPN policy. PERMIT rules are the same as in ProjectP (Table 3)
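To illustrate how rules of this form combine attribute-value conditions (e.g., u.adminRole = manager) with attribute-attribute constraints (e.g., u.department = r.department), the following Python sketch evaluates a request against a few rules taken from Tables 3 and 4. The dictionary encoding, the function names and the sample users and resources are hypothetical. The sketch also assumes that a matching DENY rule overrides any matching PERMIT rule and that unmatched requests are denied; this conflict-resolution strategy is an assumption made for illustration.

```python
def matches(rule, user, resource, action):
    """True if every condition of the rule holds for this request."""
    user_ok = all(user.get(a) == v for a, v in rule.get("user", {}).items())
    res_ok = all(resource.get(a) == v
                 for a, v in rule.get("resource", {}).items())
    # Attribute-attribute constraints, e.g. u.department = r.department.
    con_ok = all(user.get(ua) is not None and user.get(ua) == resource.get(ra)
                 for ua, ra in rule.get("constraints", []))
    # A rule without an action applies to every action.
    act_ok = rule.get("action") in (None, action)
    return user_ok and res_ok and con_ok and act_ok

def decide(rules, user, resource, action):
    # Assumption: DENY overrides PERMIT; no matching rule means DENY.
    hits = {r["decision"] for r in rules if matches(r, user, resource, action)}
    if "DENY" in hits:
        return "DENY"
    return "PERMIT" if "PERMIT" in hits else "DENY"

rules = [
    {"user": {"adminRole": "manager"}, "resource": {"type": "budget"},
     "constraints": [("department", "department")],
     "action": "read", "decision": "PERMIT"},
    {"user": {"adminRole": "manager", "department": "dept2"},
     "resource": {"type": "budget"}, "action": "read", "decision": "DENY"},
    {"resource": {"type": "task", "department": "dept2"}, "decision": "DENY"},
]

manager1 = {"adminRole": "manager", "department": "dept1"}
manager2 = {"adminRole": "manager", "department": "dept2"}
budget1 = {"type": "budget", "department": "dept1"}
budget2 = {"type": "budget", "department": "dept2"}

print(decide(rules, manager1, budget1, "read"))  # PERMIT
print(decide(rules, manager2, budget2, "read"))  # DENY
```

The second request satisfies both the PERMIT rule and the DENY rule for managers in dept2, which is the kind of conflict a positive-only policy language cannot express compactly.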
• ⟨r.type = gradebook; u.courseTaken = r.course; action = readScore, PERMIT⟩
• ⟨r.type = gradebook; u.courseTaught = r.course; action = readScore, PERMIT⟩
• ⟨u.position = faculty; r.type = gradebook; u.courseTaught = r.course; action = assignGrade, PERMIT⟩
• ⟨u.position = student; r.type = transcript; u.uid = r.student; action = readTranscript, PERMIT⟩
• ⟨u.position = faculty; u.isChair = true; r.type = transcript; u.department = r.department; action = readTranscript, PERMIT⟩
Table 5: UniversityP policy rules
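As a rough sanity check on the WSC values reported in Table 2, the Python sketch below counts one unit for every attribute-value condition, attribute-attribute constraint and action in the UniversityP rules. Counting every element with weight one is an assumption made here for illustration (the formal WSC definition is given earlier in the paper), and the string encoding and function name are hypothetical. Under this assumption the total matches the WSC_orig value of 19 reported for UniversityP.

```python
# The five UniversityP rules from Table 5, without the PERMIT decision.
university_p = [
    "r.type = gradebook; u.courseTaken = r.course; action = readScore",
    "r.type = gradebook; u.courseTaught = r.course; action = readScore",
    "u.position = faculty; r.type = gradebook; u.courseTaught = r.course; "
    "action = assignGrade",
    "u.position = student; r.type = transcript; u.uid = r.student; "
    "action = readTranscript",
    "u.position = faculty; u.isChair = true; r.type = transcript; "
    "u.department = r.department; action = readTranscript",
]

def wsc_unit_weights(rules):
    """Sum the number of ';'-separated elements over all rules."""
    return sum(len([part for part in rule.split(";") if part.strip()])
               for rule in rules)

print(wsc_unit_weights(university_p))  # 3 + 3 + 4 + 4 + 5 = 19
```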
• ⟨r.type = gradebook; u.courseTaken = r.course; action = readMyScores, PERMIT⟩
• ⟨r.type = gradebook; u.courseTaught = r.course; action = addScore, PERMIT⟩
• ⟨r.type = gradebook; u.courseTaught = r.course; action = readScore, PERMIT⟩
• ⟨u.position = faculty; r.type = gradebook; u.courseTaught = r.course; action = changeScore, PERMIT⟩
• ⟨u.position = faculty; r.type = gradebook; u.courseTaught = r.course; action = assignGrade, PERMIT⟩
• ⟨u.isChair = true; r.type = gradebook; u.department = r.department; action = readScore, PERMIT⟩
• ⟨r.type = gradebook; u.courseTaken = r.course; action = addScore, DENY⟩
• ⟨r.type = gradebook; u.courseTaken = r.course; action = readScore, DENY⟩
• ⟨r.type = gradebook; u.courseTaken = r.course; action = changeScore, DENY⟩
• ⟨r.type = gradebook; u.courseTaken = r.course; action = assignGrade, DENY⟩
• ⟨u.department = registrar; r.type = roster; action = read, PERMIT⟩
• ⟨u.department = registrar; r.type = roster; action = write, PERMIT⟩
• ⟨u.position = faculty; r.type = roster; u.courseTaught = r.course; action = read, PERMIT⟩
• ⟨r.type = transcript; u.uid = r.student; action = read, PERMIT⟩
• ⟨u.position = student; u.department = dept1; r.type = transcript; action = read, DENY⟩
• ⟨u.isChair = true; r.type = transcript; u.department = r.department; action = read, PERMIT⟩
• ⟨u.department = registrar; r.type = transcript; action = read, PERMIT⟩
• ⟨r.type = application; u.uid = r.student; action = checkStatus, PERMIT⟩
• ⟨u.department = admissions; r.type = application; action = read, PERMIT⟩
• ⟨u.department = admissions; r.type = application; action = setStatus, PERMIT⟩
• ⟨u.department = admissions; r.type = application; r.department = dept2; action = read, DENY⟩
• ⟨u.department = admissions; r.type = application; r.department = dept2; action = setStatus, DENY⟩
Table 6: UniversityPN policy rules