Polymorphic Information-flow Guards — Guardies
Table of Contents
Caveat
This page presents a work that is sill in progress, and is not yet self-contained; in particular, some technical terms are not defined.
Presentation
This page presents Guardies, a prototype static analysis tool whose primary goal is to automatically detect leaks of confidential data in object-oriented programs. It is designed to target the specific challenges posed by the programming models often encountered on mobile platforms such as Android: object-oriented programming and message-passing with serialized structured data require the development of dedicated techniques to efficiently and precisely capture the flow of secret information within applications.
Note the actual name of the command-line executable for the tool is grds, which is a shorthand for Guardies.
Context
Guardies is currently developed in the Department of Computer Science of the University of Liverpool by Nicolas Berthier, in collaboration with Narges Khakpour from the Department of Computer Science and Media Technology at Linnaeus University.
This work started as a new approach to Narges' previous work in the PROSSES project.
Working Principles and Techniques
The working principles of Guardies are as follows:
- Input methods are given in an assembly-style object-oriented
language very similar to Soot's Jimple language, that closely
reflects Java bytecode. This allows Guardies to handle
already compiled and optimized applications (e.g.
jar
,apk
); - Each method is considered separately and its security semantics is encoded using a symbolic transition system. This encoding assigns security levels to variables and reachable portions of heap, and encodes the control-flow of the method under consideration. It involves a generic symbolic abstract heap domain to capture relations between reference variables, as well as direct and indirect information flow through object mutations;
- Information-flow properties (such as data confidentiality or integrity) are then translated into safety properties on the system;
- Lastly, a co-reachability analysis solver is used to infer a sound polymorphic information-flow guard for the method, that expresses circumstances upon which the method satisfies the in terms of its calling context (i.e. security levels and relations pertaining to its formal arguments).
Guardies relies on ReaTK to solve the symbolic discrete controller synthesis problems.
Download
The tool will be distributed under the terms of the GNU General Public License upon appearance of a first publication about the underlying approach.
Basic Work-flow
grds can receive as input a description of a complete class and
interface hierarchy (file .classes
), and a set of methods (files
.meth
) to be analyzed. It then computes one information-flow
guard per method (files .secsum
), that represents a pre-condition
on the security level of the calling context (pc
) and the
level of each argument, whose satisfaction guarantees the absence of
leak.
Figure 1: Basic workflow; dotted parts are not part of Guardies.
The technique that we employ within Guardies is modular in the
sense that, in order to compute a security summary for a method
m
that calls other methods, Guardies reuses the
guards inferred for the latter to construct the model for the former.
This way, ensuring data confidentiality in an application boils down
to solve multiple small symbolic analysis problems instead of a single
one that would consider all potential information flows within the
entire application at once.
The prototype supports further use cases. For instance, it also accepts security stubs that are concise descriptions of the behavior of third-party methods provided by the user (see Section Security Stubs for some details). It also features a means to compute (an over-approximation of) an application's callgraph from a given set of methods so as to: (i) compute a guard for each of the methods involved; and (ii) handle recursion.
Preliminary Notations
This Section gives some preliminary notations required for the detailed description of the results computed by Guardies.
In the descriptions below, \(\vec{c}\) denotes an object referenced by a
program variable c
(local or formal method argument), and
as a special case, \(\vec{o}\) denotes the instance object referenced
by this
in non-static methods.
Security Levels
Security levels belong to the set \(\{⊥,⊤\}\). \(⊥\) denotes low or
public level, and \(⊤\) denotes high or private level. \(⊔\) is
the least upper bound between two security levels (i.e. in our
case, \(a ⊔ b = ⊤\) iff \(a = ⊤\) or \(b = ⊤\)). The inferred
security summaries involve the security level of variables (noted
level(v)
for a primitive or reference variable
v
) as well as of potentially reachable objects (denoted
obj_level(r)
for a reference variable r
).
Furthermore, pc
denotes the security level of the calling
context for a method.
Synonyms
For convenience Guardies understands several synonyms for security levels, that can be used in place of \(⊤\) and \(⊥\) whenever a constant security level denotation is expected:
- \(⊥\) can be substituted with
L
,l
,B
, orb
; - \(⊤\) can be substituted with
H
,h
,T
, ort
.
Heap-related Relations & Abstract Heap Domains
Guardies makes use of a generic notion of symbolic abstract heap domain to represent heap abstractions as part of the encoding of security semantics. The chosen domain impacts the precision and cost of the analysis, as well as the set of relations that may appear in guards inferred by the tool.
Several kinds of symbolic variables pertaining to reference variables
may appear in guards inferred by Guardies. The descriptions
bellow assume the domain named "deep" is used, which involves the
following Heap-related relations. Given two reference variables
a
and b
, a~b
denotes that
\(\vec{a}\) and \(\vec{b}\) may be the same object (i.e.
a
and b
may alias). Similarly,
a.~>b
denotes that a field (or cell) of \(\vec{a}\) may
(transitively) reference \(\vec{b}\).
Example Analysis Results
This Section describes guards output by Guardies for simple methods. For clarity, support for exceptions is disabled for all examples unless explicitly stated.
Polymorphic Information-flow Guards
Let's assume given a complete class hierarchy, and a method
m
for which we want to compute a security summary.
Listing 1 presents a description of object types as
understood by Guardies, where class A
directly
extends java.lang.Object
and does not declare any field,
class B
inherits from A
and features a
primitive field f
of type int
, and class
C
are objects that may reference objects of the same class
through a field f
.
A <: java.lang.Object B <: A { int f; } C <: java.lang.Object { C f; }
Silent Method
Listing 2 shows the code of an example method m
as given to Guardies. m
is defined in class
A
; it receives as argument a reference b
to an
object of class B
and an integer i
, and
returns the sum between the field f
of \(\vec{b}\) and
i
.
int A:m (B b, int i) { int tmp; tmp = b.f; tmp = tmp + i; return tmp; }
For the method m
, Guardies computes the security
summary given in Listing 3. m
does not output
any value or object on any channel, nor does it call any method, so it
cannot leak any confidential data per se. Therefore, its guard
always holds (guard = tt
).
int A:m (B b, int i) { guard = tt; }
Outputs on Low Channel & Their Impact on Post-conditions
Consider now in Listing 5 the result for method
mo
given in Listing 4. mo
is
broadly equivalent to method m
except that it additionally
outputs the value of the field f
of its first argument on
a public channel (\(⊥\)).
int A:mo (B b, int i) { int tmp; tmp = b.f; output_⊥ (tmp); tmp = tmp + i; return tmp; }
int A:mo (B b, int i) { guard = pc ⊔ level(b) ⊔ obj_level(b) = ⊥; }
In this case, the guard indicates that to prevent (direct and indirect) leaks of confidential data:
m
should be called in a low-context (pc = ⊥
); and- reference argument
b
must not be or point to any confidential data (level(b) ⊔ obj_level(b) = ⊥
).
Security Stubs
When an application to analyze makes use of third-party code, or if some of its methods have not been completely implemented, then a set of security stubs can be used to concisely describe their behavior and still permit the inference of information-flow guards.
Syntax & Semantics
Sets of security stubs can be given to Guardies in files with
extension .secstubs
. Such a file associates method type signatures
with possibly empty sequences of statements. For each one of these
associations, these statements are used by the tool to construct
information-flow guards and effect transformers, which put together
form a summary for the method, in the following way:
- First, a default summary is built based on a few characteristics of the method (whether it is static or not, or whether it is a constructor, whether it returns a reference, and the type of its arguments)—See Section Defaults below;
- Then, the sequence of statements in the security stub is processed sequentially to iteratively amend the summary—See Section Statements;
- The resulting summary is then associated with the method and is later treated by Guardies as if it had been computed from an implementation of the method itself.
Defaults
As of Thursday 04 November 2021 (guardies.0.2.10
), the
default summary for a method corresponds to:
- a "silent" behavior w.r.t. output channels (i.e. the method produces no output, so its guard always holds);
- a "distrustful" behavior w.r.t. its arguments (i.e. the method does not assume their integrity);
- an "evil" behavior w.r.t. object mutations (i.e. the method may mutate every object it has been given access to, in the worst possible way) and returned values and references (i.e. the method may return any object reachable from its argument references whenever their types allow it).
When applicable, the security level of any returned value and object is the least upper bound of that of any of the arguments.
Listing 7 gives default summaries forged by Guardies from stubs given in Listing 6.
static int A:static_default (C, C, int) int A:default (C, C, int) C (A, C, int)
static int A:static_default (C a0, C a1, int a2) { guard = tt; ret(level()) = level(a0) ⊔ level(a1) ⊔ level(a2) ⊔ obj_level(a0) ⊔ obj_level(a1); ret(obj_level(a0)) = ⊤; ret(obj_level(a1)) = ⊤; ret(a0.~>a0) = tt; ret(a0.~>a1) = tt; ret(a1.~>a0) = tt; ret(a1.~>a1) = tt; } int A:default (C a0, C a1, int a2) { guard = tt; ret(level()) = level(a0) ⊔ level(a1) ⊔ level(a2) ⊔ level(this) ⊔ obj_level(a0) ⊔ obj_level(a1) ⊔ obj_level(this); ret(obj_level(a0)) = ⊤; ret(obj_level(a1)) = ⊤; ret(obj_level(this)) = ⊤; ret(a0.~>a0) = tt; ret(a0.~>a1) = tt; ret(a1.~>a0) = tt; ret(a1.~>a1) = tt; } C (A a0, C a1, int a2) { guard = tt; ret(obj_level(a0)) = ⊤; ret(obj_level(a1)) = ⊤; ret(obj_level(this)) = ⊤; ret(a1.~>a1) = tt; ret(a1.~>this) = tt; ret(this.~>a1) = tt; ret(this.~>this) = tt; }
Statements
We describe in Table 1 the various statements that can appear in security stubs.
Statement | Description: The method… | Validity |
---|---|---|
-<~ |
… does not perform any object mutation | any method |
@<~@ |
… may only mutate an object referenced by any of its arguments (excluding \(\vec o\)) with data of security level at most \(α\) | any method |
@<~ℓτ |
… may only mutate an object referenced by any of its arguments (excluding \(\vec o\)) with data of security level at most ℓ | any method |
.*<~@ |
… may only mutate \(\vec o\) with data of security level at most \(α\) | non-static |
.*<~ℓτ |
… may only mutate \(\vec o\) with data of security level at most ℓ | non-static |
*<~@ |
… may only mutate \(\vec o\) or an object referenced by any of its arguments with data of security level at most \(α\) | any method |
*<~ℓτ |
… may only mutate \(\vec o\) or an object referenced by any of its arguments with data of security level at most ℓ | any method |
return @ |
… returns one of its arguments (including \(\vec o\)), of security level at most \(α\) | non-void |
return new_ℓτ |
… returns a value, or reference and object, that is distinct from its arguments (and that does not depend on any them), of security level at most ℓ | non-void |
return new |
… returns a value, or reference and object, that is distinct from its arguments (and that does not depend on any them) | non-void |
return .*ℓτ? |
… returns a field of \(\vec o\) of security level at most ℓ \(⊔~\vec o\) | non-void-static |
return *ℓτ? |
… returns a field of \(\vec o\) or one of its arguments, of security level at most ℓ \(⊔~α~⊔~\vec o\) | non-void |
output_⊥ |
… may output any of its arguments (including \(\vec o\)) on a public channel | any method |
output_⊤ |
… may output any of its arguments (including \(\vec o\)) on a private channel | any method |
-! |
… does not assume integrity of any of its arguments (including \(\vec o\)) | any method |
@! |
… assumes integrity of all its arguments (including \(\vec o\)) | any method |
.*! |
… assumes integrity of \(\vec o\) | non-static |
Examples
Listings 8 and 10 illustrate the construction of summaries from stubs by means of amending the default.
/* None of the following methods perform any mutation */ int A:nice1 (C, C, int) { -<~; return new } int A:nice2 (C, C, int) { -<~; return @ } C A:nice_allocC (C, C, int) { -<~; return new } void A:nice_log (C, int) { -<~; output_⊥ }
int A:nice1 (C a0, C a1, int a2) { guard = tt; ret(level()) = ⊥; ret(obj_level(a0)) = obj_level(a0); ret(obj_level(a1)) = obj_level(a1); ret(obj_level(this)) = obj_level(this); ret(a0.~>a0) = a0.~>a0; ret(a0.~>a1) = a0.~>a1; ret(a1.~>a0) = a1.~>a0; ret(a1.~>a1) = a1.~>a1; } int A:nice2 (C a0, C a1, int a2) { guard = tt; ret(level()) = level(a0) ⊔ level(a1) ⊔ level(a2) ⊔ level(this) ⊔ obj_level(a0) ⊔ obj_level(a1) ⊔ obj_level(this); ret(obj_level(a0)) = obj_level(a0); ret(obj_level(a1)) = obj_level(a1); ret(obj_level(this)) = obj_level(this); ret(a0.~>a0) = a0.~>a0; ret(a0.~>a1) = a0.~>a1; ret(a1.~>a0) = a1.~>a0; ret(a1.~>a1) = a1.~>a1; } C A:nice_allocC (C a0, C a1, int a2) { guard = tt; ret(level()) = ⊥; ret(obj_level()) = (if ret(~a0) then obj_level(a0) else ⊥) ⊔ (if ret(~a1) then obj_level(a1) else ⊥) ⊔ (if ret(a0.~>) then obj_level(a0) else ⊥) ⊔ (if ret(a1.~>) then obj_level(a1) else ⊥); ret(obj_level(a0)) = obj_level(a0); ret(obj_level(a1)) = obj_level(a1); ret(obj_level(this)) = obj_level(this); ret(~a0) = ff; ret(~a1) = ff; ret(.~>a0) = tt; ret(.~>a1) = tt; ret(a0.~>) = ff; ret(a0.~>a0) = a0.~>a0; ret(a0.~>a1) = a0.~>a1; ret(a1.~>) = ff; ret(a1.~>a0) = a1.~>a0; ret(a1.~>a1) = a1.~>a1; } void A:nice_log (C a0, int a1) { guard = pc ⊑ ⊥ ∧ level(a0) ⊔ level(a1) ⊔ level(this) ⊔ obj_level(a0) ⊔ obj_level(this) ⊑ ⊥; ret(obj_level(a0)) = obj_level(a0); ret(obj_level(this)) = obj_level(this); ret(a0.~>a0) = a0.~>a0; }
The resulting summaries are given in Listings 9 and 11, respectively.
Observe for instance how the summary for method C A:nice_allocC (C, C, int)
indicates that the reference it
returns cannot alias any of its arguments of type C
.
/* The following method may mutate any of its arguments with public data */ void A:mutate_arg (C, int) { -<~; @<~⊥ } /* The following method may mutate any of its arguments with any data it recieves, and return anything */ A A:mutate_args (A, C, int) { -<~; @<~@; return * }
void A:mutate_arg (C a0, int a1) { guard = tt; ret(obj_level(a0)) = pc; ret(obj_level(this)) = obj_level(this); ret(a0.~>a0) = tt; } A A:mutate_args (A a0, C a1, int a2) { guard = tt; ret(level()) = pc ⊔ level(a0) ⊔ level(a1) ⊔ level(a2) ⊔ level(this) ⊔ obj_level(a0) ⊔ obj_level(a1) ⊔ obj_level(this); ret(obj_level()) = pc ⊔ level(a0) ⊔ level(a1) ⊔ level(a2) ⊔ level(this) ⊔ obj_level(a0) ⊔ obj_level(a1) ⊔ obj_level(this) ⊔ (if ret(~a0) then obj_level(a0) else ⊥) ⊔ (if ret(~this) then obj_level(this) else ⊥); ret(obj_level(a0)) = pc ⊔ level(a0) ⊔ level(a1) ⊔ level(a2) ⊔ level(this) ⊔ obj_level(a0) ⊔ obj_level(a1) ⊔ obj_level(this); ret(obj_level(a1)) = pc ⊔ level(a0) ⊔ level(a1) ⊔ level(a2) ⊔ level(this) ⊔ obj_level(a0) ⊔ obj_level(a1) ⊔ obj_level(this); ret(obj_level(this)) = obj_level(this); ret(~a0) = tt; ret(~this) = tt; ret(a1.~>a1) = tt; }
Supporting Exceptions
Note that users wanting to handle exceptions when processing .meth
files must give the corresponding translation flag (with
`-tf +exceptions
' on the command-line) before feeding
Guardies with a set of security stubs (similarly for taints…).
Doing so for the stubs of Listing 6 produces
default-w-exceptions.secsums.
Acknowledgments
- The visual presentation of this page is highly based on Solarized CSS.