# Mathematical models of information security

In this short article, the authors will present a number of mathematical models of information security. They will start with some of the earliest work which focused on access control. The unifying theme throughout the study is information flow. They will move from lattice-based models through to the more recent information theoretic and probabilistic models. These latter models recognise that, in most systems, some information flow is inevitable but that it is important to be able to quantify this.

Sep 05, 2017
0
0

Author(s): Chris Hankin

### Introduction

What is a model of information security? Models typically provide an abstract description of a system; the quality of a model is linked to its tractability and predictive power. Many of the models that have been proposed concentrate on flows of information in a system.

From this perspective, we will start by looking at one of the earliest access control models which is sometimes known as mandatory access control. This is a bi-partite model which consists of actors (called subjects) which can be permitted to or prevented from doing things to passive objects. The model of information flow is essentially unidirectional from the objects to the subjects. More recent access control models are considerably more refined but, to a large extent, this unidirectional flow persists. Non-interference models recognise that information more usually flows in both directions and we shall discuss these next. Finally, we will move away from considering all information flow as undesirable to a set of models which attempt to quantify the amount of information which flows.

Lattice theory plays a significant role in the modelling of access control and non-interference. Information theory has also played an important role in the mathematical foundations of information security but this really comes to the fore in the treatment of the quantitative models of information flow.

In the next section, we review the key definitions of lattice theory. We then present the Bell–LaPadula model of access control. In Section 4, we discuss non-interference and some of its generalisations. We then present key concepts of Shannon's information theory before presenting quantitative information flow. Finally, we conclude with a summary.

### Lattice theory

Partially ordered sets. A partial ordering on a set L is a relation : L × L →{ true, false} that is

Reflexive ∀l ∈ L: l ⊑ l,

Transitive ∀l 1 , l 2 , l 3 ∈ L: l 1 ⊑ l 2 ∧ l 2 ⊑ l 3 ⇒ l 1 ⊑ l 3, and

Anti-symmetric ∀l 1 , l 2: l 1 ⊑ l 2 ∧ l 2 ⊑ l 1 ⇒ l 1 = l 2.

A partially ordered set ( L, ) is a set L equipped with a partial ordering (sometimes written L ).

A subset Y ⊆ L has l ∈ L as an upper bound if ∀ l′ ∈ Y: l′ ⊑ l and as a lower bound if ∀l′ ∈ Y: l′ ⊒ l. A least upper bound l of Y is an upper bound of Y that satisfies l ⊑ l 0 whenever l 0 is another upper bound of Y; similarly, one can define a greatest lower bound. Note that subsets Y of a partially ordered set L need neither have least upper bounds nor greatest lower bounds but when they exist they are unique and they are denoted LY and ΠY.

Complete lattices: A complete lattice L = ( L, ) = ( L, ⊑, ⊔, ⊓, ⊥, ⊤) is a partially ordered set ( L, ) such that all subsets have least upper bounds as well as greatest lower bounds. Furthermore, = ⊔∅ = ⊓ L is the least or bottom element and = ⊓∅ =⊔L is the greatest or top element.

A more complete treatment of the theory of partial orders and lattices may be found in [1].

Security levels: The first models that we will consider use complete lattices of security classifications. The models are general but the lattices are often total orders. For example, the current UK document classification scheme is

but we will often work with the much simpler total order

The classic paper on a mathematical model of information security, based on general systems theory, is that of Bell and LaPadula which was first published in 1973 and re-published in 1996 [2]. Our presentation is based on the slight simplification of the model presented in Ryan's paper [3].

The main components of the model are sets of subjects (S) and objects (O), a lattice of security levels (L), a mapping which returns the classification of a subject or object (C: S ∪ O → L) and read and write access ({r, w}). The intention is that subjects are users or processes and objects are data.

The model is based on two rules:

No read up: A subject s is allowed read access to an object o if and only if C( s) dominates C( o) [i.e. C( s) ⊒ C( o)].

No write down – the * property: A subject s is allowed write access to an object o if and only if C( o) dominates C( s).

We will consider more general policies later but these rules effectively enforce an information flow rule that prevents information from flowing downwards in classification levels. Bell and LaPadula’s full model allows five different types of access: read-only access; append access; execute access; read/write access; and control access. Their system requires an access matrix which specifies which access rights each subject has for each object. The model involves the subject making access requests to the system which either allows the access or denies it.

The system may also have to deal with conflicting rules and ill-formed requests. This kind of issue comes to the fore in more recent access control frameworks such as eXtensible Access Control Markup Language (XACML) [4]. In the simplest setting, formal modelling of these types of situation gives rise to a four-valued or Belnap logic. The use of Belnap logic in access control has been studied in depth by [5,6].

### Non-interference

A criticism of the Bell–LaPadula model is that it concentrates too much on one-way flows of information, whereas information flow is often bidirectional. Non-interference is an idea proposed by Goguen and Meseguer [7] to characterise absence of any information flow. Non-interference is a property that is usually associated with a program or process. A simple characterisation of the property partitions the set of program variables or process channels into two disjoint sets High and Low. The assumption is that an observer can only observe low outputs from the program/process. The program has the non-interference property if there is no information flow from any high variable to a low output.

This is a more subtle property than might at first appear. In the program

there is an information flow from H to L because, if we observe that L has the value five, we know that H must be negative or zero.

Clark et al. study this problem as a program analysis in [8]. Non-interference and related information flow properties have been extensively studied by the process algebra community. The earlier cited survey by Ryan [3] is an excellent introduction to this approach.

MacLean introduces a more general approach to non-interference in [9]. He denotes the set of all high values assigned at time t asH t and similarly for L t . Non-interference properties can then be cast as a constraint on conditional probabilities, for example

Recall that conditional probability is defined as follows

where p( A ∩ B) is the probability of A and B co-occurring. The inclusion of L t −1 in the conditional ensures that nothing can be learned about H t −1 by combining L t and L t −1. MacLean's critique of this approach is that it rules out direct causal links between low and high variables but not that they are correlated because of some other low-level event. On the basis of this, he uses a more general property

where Hs is the sequence of values assumed by high-level objects in every state before t and similarly for Ls. Goguen and Meseguer’s non-interference property is then equivalent to

See the cited paper for a more detailed discussion of the implications of this formulation.

### Information theory

In some cases, information flow is unavoidable. When a password or PIN is checked, even if rejected, some information is inevitably returned to the person providing the password/PIN guess. This has led to an interest in quantifying the amount of information that is leaked. The most successful work has been based on concepts from information theory.

Information theory was originally proposed by Shannon [10] for the study of signal processing. In Shannon's theory, the fundamental measure of information in a channel is entropy. For our purposes, we will consider a discrete random variable, X, which can have a finite number of values according to a specified probability distribution. If we consider a communication channel to be associated with a random variable X which represents the possible messages that might be communicated, the entropy of the variable H( X) is defined as

where p( x) is the probability that X takes the value x. It should be clear that the value of H( X) is only 0 if we know with certainty which value X will take – i.e. p( x) = 1 for some value x and zero everywhere else. The entropy is maximal when every value of the random variable occurs with equal probability – the uniform distribution. In general, the entropy gives the expected number of bits necessary for an optimal encoding. The generalisation of this to give the joint entropy over a finite set of random variables is straightforward

The conditional entropy measures the entropy of a random variable X given that we know the value of some other random variable Y

We will also need the concept of mutual information, which is defined in the following way

with the obvious generalisation to conditional mutual information.

### Quantitative information flow

As we have discussed, non-interference is about checking information flows from inputs of a program to its outputs. Clark et al. [11] measure this information flow in the following way

where X and Y are the input and output variables of interest and Z is an additional input variable. This definition may appear more complicated than expected as I( X; Y) should adequately capture the shared information between X and Y. The authors motivate the need for the complex definition by the following example

In this example, the mutual information between X and Y, assuming no knowledge about the values (i.e. their entropy is 1) is 0: this is because the XOR operation provides perfect encryption of X using the key Z. The right-hand side of the alternative formulation of information flow expands in the following way

The fourth step in this derivation, which replaces H( X|Z) by H( X), is valid because X is independent of Z. Since we have no information about the value of X, we assume that its value is uniformly distributed over the possible values and H( X) = 1. If we know the values of Y and Z, then the value of X is also known, so H( X|Y, Z) = 0. Consequently, the mutual information is now 1 0 = 1.

Clark, Hunt and Malacaria show that, for deterministic programs, if the information leaked is 0, then the program satisfies non-interference. They have produced a number of papers developing these ideas and showing how they can be applied to a variety of different programming languages, for example [12].

Smith [13] reviews the different approaches and proposes an alternative basis for quantitative information flow which uses min-entropy as a measure for quantitative information flow. This approach is based on measuring the vulnerability of a random variable, V(X)

which is the worst-case probability that an adversary could guess the value of the random variable with one guess. The min-entropy, denoted H ( X), is defined as follows

Min-entropy corresponds to Shannon entropy when the values of the random variable are uniformly distributed but Smith claims that it gives a more meaningful measure of the information flow in other cases.

Conditional vulnerability can be defined in the expected way

and then conditional min-entropy is

Finally, the information leaked from Y to X is given by

### Conclusion

The first mathematical models of information security are over 40 years old. Whilst one of the earliest models focused on access control, there has been a common thread throughout the models that we have considered which has been about analysing information flows. From an information flow perspective, the access control models may be criticised for treating information flow as a unidirectional phenomenon whereas this is seldom the case. The non-interference-based approaches to information flow avoid the unidirectional trap but may be criticised for being too draconian and unrealistic in prohibiting all leakages. Much of the modern work on non-interference properties includes a notion of declassification which attempts to relax these restrictions and better model the real world [14]. We have concluded our brief survey with an introduction to the work on quantitative information flow. The various approaches based on information theory are in the ascendancy but see Di Pierro et al. [15] for an alternative approach based on probabilistic abstract interpretation.

The constraints of this short article have prevented a full treatment of the finer details of the many mathematical models of information security that have been proposed. We hope that we have whetted the appetite of the reader and that the bibliography will be sufficient to start an exploration of this fascinating topic.

### Acknowledgement

I am grateful to Pasquale Malacaria for his comments on this article and many years of enjoyable collaboration.

### References

1. Davey B. A. Priestley H. A.: ‘Introduction to lattices and order’ (Cambridge University Press, Cambridge, 2002, 2nd edn.).
2. Bell D. E. LaPadula L. J.: ‘Secure computer systems: a mathematical model’, MTR-2547, The MITRE Corporation , 1996, II.
3. Ryan P. Y. A.: ‘Mathematical models of computer security’. Proc. FOSAD, 2000 (LNCS, 2171), pp. 1–62.
4. XACML 3.0. Available at docs.oasis-open.org/xacml/3.0/xacml-3.0-core-spec-os-en.html.
5. Bruns G. Huth M.: ‘Access-control policies via Belnap logic: effective and efficient composition and analysis’. Proc. IEEE Symp. on Computer Security Foundations, 2008, pp. 163–176.
6. Hankin C. L. Nielson F. Nielson H. R.: ‘Advice from Belnap policies’. Proc. IEEE Symp. on Computer Security Foundations, Port Jefferson, US, July 2009, pp. 234–247.
7. Goguen J. A. Meseguer J.: ‘Security policies and security models’. Proc. IEEE Symp. on Security and Privacy, April 1982, pp. 11–20.
8. Clark D. J. Hankin C. L. Hunt L. S.: ‘Information flow for algol-like languages’, J. Comput. Lang., 2002, 28, pp. 3–28.
9. MacLean J.: ‘Security models and information flow’. Proc. IEEE Symp. on Security and Privacy, May 1990, pp. 180–187.
10. Shannon C. E.: ‘A mathematical theory of communication’, Mob. Comput. Commun. Rev., 2001, 5, (1), pp. 3–55 (doi: 10.1145/584091.584093).
11. Clark D. J. Hunt L. S. Malacaria P.: ‘Quantitative information flow, relations and polymorphic types’, J. Log. Comput, 2005, 15, (2), pp. 181–199 (doi: 10.1093/logcom/exi009).
12. Malacaria P.: ‘Assessing security threats of looping constructs’. Proc. ACM POPL 2007, Nice, France, January 2007, pp. 225–235.
13. Smith G.: ‘On the foundations of quantitative information flow’. Proc. FOSSACS 2009, York, UK, March 2009 (LNCS, 5504), pp. 208–302.
14. Sabelfeld A. Sands D.: ‘Dimensions and principles of declassification’. Proc. IEEE Symp. on Computer Security Foundations, June 2005, pp. 255–269.
15. Di Pierro A. Hankin C. L. Wiklicky H.: ‘Measuring the confinement of probabilistic concurrent systems’, Theor. Comput. Sci., 2005, 340, (1), pp. 3–56 (doi: 10.1016/j.tcs.2005.03.002).

### Chris Hankin

Professor, Imperial College London