
Lecture Notes

The Algebraic Theory of Information

Prof. Dr. Jürg Kohlas

Institute for Informatics IIUF
University of Fribourg
Rue P.-A. Faucigny 2
CH – 1700 Fribourg (Switzerland)
E-mail: [email protected]
http://diuf.unifr.ch/tcs

Version: November 24, 2010

Copyright © 1998 by J. Kohlas
Available from (Bezugsquelle): B. Anrig, IIUF

Preface

In today's scientific language, information theory is confined to Shannon's mathematical theory of communication, where the information to be transmitted is a sequence of symbols from a finite alphabet. An impressive mathematical theory has been developed around this problem, based on Shannon's seminal work. Whereas it has often been remarked that this theory, despite its importance, cannot capture all aspects of the concept of "information", little progress has been made so far in enlarging Shannon's theory.

In this study, we attempt to develop a new mathematical theory covering or deepening aspects of information other than those treated by Shannon's theory. The theory is centered around basic operations with information, like aggregation of information and extraction of information. Further, the theory takes into account that information is an answer, possibly a partial answer, to questions. This gives the theory its algebraic flavor, and justifies speaking of an algebraic theory of information.

The mathematical theory presented here is rooted in two important developments of recent years, although these developments were not originally seen as part of a theory of information. The first development is local computation, a topic starting with the quest for efficient computation methods in probabilistic networks. It was shown around 1990 that these computation methods are closely related to a certain algebraic structure, for which many models or instances exist. This provides the abstract base for generic inference methods covering a wide spectrum of topics like databases, different systems of logic, probability networks and many other formalisms of uncertainty. This algebraic structure, completed by the important idempotency axiom, is the basis of the mathematical theory developed here. Whereas the original structure has found some interest for computational purposes, it acquires its flavor as a theory of information only with the additional idempotency axiom. This leads to information algebras, the object of the present study.

Surprisingly, the second development converging to the present work is the Dempster-Shafer Theory of Evidence. It was proposed originally as an extension of probability theory for inference under uncertainty, an important topic in Artificial Intelligence and related fields. In everyday language one often speaks of uncertain information. In our framework this concept is rigorously captured by random variables with values in information algebras. This captures the idea that a source emits information which may be interpreted in different ways and which, depending on various assumptions, may result in different information elements. It turns out that this yields a natural generalization or extension of the Dempster-Shafer Theory of Evidence, thus linking this theory in a very basic way to our theory of information.

In this way the algebraic theory of information is thought of as a mathematical theory forming the rigorous common background of many theories and formalisms studied and developed in Computer Science. Whereas each individual formalism has its own particularities, the algebraic theory of information helps to understand the common structure of information processing.


Contents

1 Introduction

I Information Algebras

2 Modeling Questions
2.1 Refining and Coarsening: the Order Structure
2.2 Concrete Question Structures

3 Axiomatics of Information Algebra
3.1 Projection Axiomatics
3.2 Multivariate Information Algebras
3.3 File Systems
3.4 Variable Elimination
3.5 The Transport Operation
3.6 Domain-Free Algebra

4 Order of Information
4.1 Partial Order of Information
4.2 Ideal Completion of Information Algebras
4.3 Atomic Algebras and Relational Systems
4.4 Finite Elements and Approximation
4.5 Compact Ideal Completion
4.6 Labeled Compact Information Algebra
4.7 Continuous Algebras

5 Mappings
5.1 Order-Preserving Mappings
5.2 Continuous Mappings
5.3 Continuous Mappings Between Compact Algebras

6 Boolean Information Algebra
6.1 Domain-Free Boolean Information Algebra
6.2 Labeled Boolean Information Algebra
6.3 Representation by Set Algebras

7 Independence of Information
7.1 Conditional Independence of Domains
7.2 Factorization of Information
7.3 Atomic Algebras and Normal Forms

8 Measure of Information
8.1 Information Measure
8.2 Dual Information in Boolean Information Algebras

9 Topology of Information Algebras
9.1 Observable Properties and Topology
9.2 Alexandrov Topology of Information Algebras
9.3 Scott Topology of Compact Information Algebras
9.4 Topology of Labeled Algebras
9.5 Representation Theory of Information Algebras

10 Algebras of Subalgebras
10.1 Subalgebras of Domain-free Algebras
10.2 Subalgebras of Labeled Algebras
10.3 Compact Algebra of Subalgebras
10.4 Subalgebras of Compact Algebras

II Local Computation

11 Local Computation on Markov Trees
11.1 Problem and Complexity Considerations
11.2 Markov Trees
11.3 Local Computation
11.4 Updating and Retracting

12 Multivariate Systems
12.1 Local Computation on Covering Join Trees
12.2 Fusion Algorithm
12.3 Updating

13 Boolean Information Algebra

14 Finite Information and Approximation

15 Effective Algebras

16

17

III Information and Language

18 Introduction

19 Contexts
19.1 Sentences and Models
19.2 Lattice of Contexts
19.3 Information Algebra of Contexts

20 Propositional Information

21 Expressing Information in Predicate Logic

22 Linear Systems

23 Generating Atoms

24

IV Uncertain Information

25 Functional Models and Hints
25.1 Inference with Functional Models
25.2 Assumption-Based Inference
25.3 Hints
25.4 Combination and Focussing of Hints
25.5 Algebra of Hints

26 Simple Random Variables
26.1 Domain-Free Variables
26.2 Labeled Random Variables
26.3 Degrees of Support and Plausibility
26.4 Basic Probability Assignments

27 Random Information
27.1 Random Mappings
27.2 Support and Possibility
27.3 Random Variables
27.4 Random Variables in σ-Algebras

28 Allocation of Probability
28.1 Algebra of Allocation of Probabilities
28.2 Random Variables and Allocations of Probability
28.3 Labeled Allocations of Probability

29 Support Functions
29.1 Characterization
29.2 Generating Support Functions
29.3 Extension of Support Functions

30 Random Sets
30.1 Set-Valued Mappings: Finite Case
30.2 Set-Valued Mappings: General Case
30.3 Random Mappings into Set Algebras

31 Systems of Independent Sources

32 Probabilistic Argumentation Systems
32.1 Propositional Argumentation
32.2 Predicate Logic and Uncertainty
32.3 Stochastic Linear Systems

33 Random Mappings as Information
33.1 Information Order
33.2 Information Measure
33.3 Gaussian Information

References


Chapter 1

Introduction

Information is a central concept of science, especially of Computer Science. The well-developed fields of statistical and algorithmic information theory focus on measuring the information content of messages, statistical data or computational objects. There are however many more facets to information. We mention only three:

1. Information should always be considered relative to a given specific question. If necessary, information must be focused onto this question: the part of the information relevant to the question must be extracted.

2. Information may arise from different sources. There is therefore the need for aggregation or combination of different pieces of information. This ensures that the whole of the available information is taken into account.

3. Information can be uncertain, because its source is unreliable or because it may be distorted. Therefore it must be considered defeasible, and it must be weighed according to its reliability.

The first two aspects induce a natural algebraic structure of information. The two basic operations of this algebra are combination or aggregation of information, and focusing or extraction of information. Especially the operation of focusing relates information to a structure of interdependent questions. These notes are devoted to a discussion of this algebraic structure and its variants. They demonstrate that many well-known, but very different, formalisms, such as relational algebra and probabilistic networks among many others, fit into this abstract framework. The algebra allows for a generic study of the structure of information. Furthermore, it permits the construction of generic architectures for inference.

The third item above points to a very central property of real-life information, namely uncertainty. Clearly, probability is the appropriate tool to describe and study uncertainty. Information, however, is not only numerical. Therefore, the usual concept of random variables is not sufficient to study uncertain information. It turns out that the algebra of information is the natural framework to model uncertain information. This will be another important subject of these notes.

In the first part of these notes, the basic algebraic structure of information will be discussed. Questions are formally represented by their possible answers. However, the answers may be given at varying granularity. This introduces a partial order between questions which reflects refining and coarsening of questions. If two different questions are considered simultaneously, then the coarsest set of answers which allows one to obtain answers to both questions should be considered. In terms of the partial order of questions this means that the supremum of the questions should exist. Similarly, the infimum of two questions, representing the common part of the questions, should also exist. In other words, the system of questions considered in a particular context should form a lattice. The problem of modeling questions is discussed in Chapter 2. Often, in applications, questions are related to the unknown values of variables. In this case, a question is represented by the domain of the variable considered, or by the Cartesian product of the domains of a set of variables. The lattice of questions in this case can be identified with the lattice of subsets of the set of variables considered in the given context. More generally, it is often assumed that there is a universe spanning the realm of interest in a given situation. Specific questions are then represented by partitions of the universe. Again we then select a lattice of partitions to represent the system of questions considered in the context at hand. The existence of a global universe is not even necessary: more generally, a lattice of frames, linked by refining and coarsening relations, can be considered.

As mentioned above, information has a corpuscular nature; it comes in pieces. In one possible view, one may assume that any piece of information has a domain associated with it, which indicates the question to which the information refers. Then, of course, it must be possible to extract from it the part of the information referring to a coarser question. This is the operation of projection. Further, two or more pieces of information must be aggregated into a new, combined piece of information. Its domain is then the supremum of the domains of the parts. These operations lead to a two-sorted algebra, which satisfies some natural axioms, and which we call an information algebra. It will be shown in Chapter 3 that there are several variants of this algebraic structure. In particular, there is a related alternative, but essentially equivalent, point of view, where pieces of information are considered independently of the particular question they may relate to. It is convenient to have these two alternative views of labeled and domain-free information algebras. Depending on the problem or the application, either one or the other variant may be more convenient to work with. Several models of information algebras will be presented in this chapter, ranging from relational algebra to classical logic, passing by affine spaces related to systems of linear equations and convex sets, especially convex polytopes.

The idempotency axiom of an information algebra permits the introduction of a partial order reflecting the information content of pieces of information. With respect to combination, the algebra then becomes a semilattice. This permits several interesting extensions of the theory, which are examined in Chapter 4. One of these extensions is to introduce the concepts of finiteness and approximation: finite information elements are isolated, and conditions are formulated which allow all other elements to be approximated by finite elements. This leads to compact information algebras, a theory which is similar and closely related to algebraic lattices and domain theory. In our context, we need to differentiate between compactness in the labeled and the domain-free case. The former leads to the more general case of continuous information algebras, which are similar to continuous lattices. One way to compactify an information algebra is by ideal completion, which can be considered both in the labeled and in the domain-free view. Finally, the case of information algebras with atoms is discussed: it permits showing that information algebras are closely related to abstract versions of relational algebras. This establishes a link between information algebras and Boolean structures and algebras of subsets, a topic which will be pursued in Chapter 6.

Chapter 5 is devoted to mappings between information algebras. We study mappings which preserve the information order. Not surprisingly, it turns out that such mappings themselves form information algebras, and under some conditions of continuity even compact ones.

Many important examples of information algebras have a Boolean structure. The most prominent example is relational algebra. But the algebras associated with propositional and predicate logic also possess this Boolean structure. In Chapter 6 the impact of this particular structure is examined. Again the cases of labeled and domain-free algebras must be distinguished. In the latter case the information algebra is also a Boolean algebra, which is not true in the former case. The duality theory of Boolean algebras indicates that information also has a double interpretation: a disjunctive and a conjunctive one. This important, but mostly neglected, topic is further discussed in Chapter 8. Finally, it is well known that Boolean algebras can be embedded in set algebras. Similar results can also be obtained for Boolean information algebras.

An important issue with any kind of information is the question of independence of information. This has so far been studied in particular in the realm of probability theory and of relational databases, there especially with respect to normal forms. In Chapter 7 we take this subject up and discuss some aspects of it. One issue is the conditional independence of domains, which is especially simple in the case of domains related to sets of variables. But here we treat it in the more general setting of an arbitrary lattice of domains. The topic is important for structures like Markov trees or join trees, used in computation and inference (see Part II). It is also important for singling out special classes of information which factorize in some way, like normal forms in relational database theory.

In the last Chapter 8 ... Part II is devoted to local computation ... In Part III, information expressed in language is considered ... Finally, Part IV considers uncertain information and the different ways information may be represented ...


Part I

Information Algebras


Chapter 2

Modeling Questions

Information relates to questions. There may be several, often many, questions of interest. But these questions will be related somehow. Questions also have a certain granularity. Therefore, questions can be refined, and a question may be the refinement of several questions. Questions can further be coarsened, and a certain question can be the coarsening of several other questions. We first need to define a mathematical structure capturing the essentials of a family of related questions. This will be done abstractly in the next section. In the following section, several important examples of concrete forms of systems of questions will be given.

2.1 Refining and Coarsening: the Order Structure

The varying granularity of questions can be captured by a partial order between questions. So let, in an abstract setting, D be a set whose elements represent questions. The elements of D are also called domains, for a reason which will become clear later. Its elements are typically denoted by x, y, u, v, etc. We assume that a partial order ≤ is defined on D, where x ≤ y means that x is coarser than y, and y finer than x. So ≤ satisfies the following three conditions:

1. Reflexivity: x ≤ x.

2. Antisymmetry: x ≤ y and y ≤ x implies x = y.

3. Transitivity: x ≤ y and y ≤ z implies x ≤ z.

But we need more: given two questions x and y, we want to form the combined question, i.e. the coarsest question refining both x and y. Therefore we suppose that for every pair x, y ∈ D the supremum or join x ∨ y exists in D. Similarly, for a pair of questions x and y we also want to extract their common part, i.e. the largest coarsening of both. So we assume that for every pair x, y ∈ D the infimum or meet x ∧ y exists in D. In other words, we assume that D is a lattice.

A lattice D is called modular, if

∀x, y, z ∈ D: x ≥ z implies x ∧ (y ∨ z) = (x ∧ y) ∨ z.

Sometimes the lattice D is even distributive, i.e.

x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z),
x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z).

We do not assume these additional properties for the lattice of domains in general. But in some considerations we do assume them in order to get stronger results.

Sometimes, we may also require that D contains a largest element, the top element ⊤, representing the finest question in the family of questions considered. Sometimes we require further that D contains a least or bottom element ⊥, representing the empty question, i.e. the question with only one possible answer. But in the general case the existence of these particular elements in D is not assumed.

If D is a lattice and x ∈ D, then we may consider D≤x = {y ∈ D : y ≤ x}. This is clearly still a lattice, and one with a top element, namely x. Similarly D≥x = {y ∈ D : y ≥ x} is a lattice, as is Dx≤z = {y ∈ D : x ≤ y ≤ z}. The former has a bottom element x, and the latter has both a bottom element x and a top element z.

2.2 Concrete Question Structures

Here we sketch, as examples, a few particular question structures that are important in applications.

Example 2.1 Multivariate Structures: Consider a countable family of variables X1, X2, . . .. Associate with each variable Xi its domain Di, which represents the set of possible values of Xi. Define for any subset J of the natural numbers N

DJ = ∏_{j∈J} Dj.

To simplify we write Dj for D{j}, and we define D∅ to be a fixed one-element set (containing only the empty tuple). Then D = {DJ : J ⊆ N} is a lattice, where DJ ≤ DK if J ⊆ K, and

DJ ∨ DK = DJ∪K,
DJ ∧ DK = DJ∩K.


This lattice has as top element DN and as bottom element D∅. Note further that this lattice is distributive. The idea here is that a domain DJ represents the question "What are the values of the variables Xj, j ∈ J?". Then DJ is in fact the set of possible answers. The larger the set of variables we consider, the finer is the question we ask. Sometimes we consider also the sublattice associated with the finite subsets J, or the lattice associated with the subsets of some fixed set J. In all these cases we may as well identify D with the corresponding lattice of subsets of J.
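To make the lattice operations of this example concrete, here is a minimal Python sketch (our own illustration; all names are hypothetical). A domain DJ is represented by its index set J itself; join is union, meet is intersection, and DJ ≤ DK corresponds to J ⊆ K.

```python
# The lattice of multivariate domains, with D_J identified with the index set J.

def join(J, K):
    """Supremum D_J v D_K = D_{J u K}: the combined, finer question."""
    return frozenset(J) | frozenset(K)

def meet(J, K):
    """Infimum D_J ^ D_K = D_{J n K}: the common, coarser question."""
    return frozenset(J) & frozenset(K)

def coarser(J, K):
    """D_J <= D_K holds iff J is a subset of K."""
    return frozenset(J) <= frozenset(K)

# The question about variables {1, 2} combined with the one about {2, 3}:
assert join({1, 2}, {2, 3}) == {1, 2, 3}
assert meet({1, 2}, {2, 3}) == {2}
assert coarser(meet({1, 2}, {2, 3}), {1, 2})
```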

Example 2.2 Partition Structures: Consider any set U. A partition P of U is a family of pairwise disjoint nonempty subsets of U whose union equals U. The members of a partition are called blocks. A partition represents the question "To which block does the unknown element of U belong?". The set Part(U) of partitions of U is partially ordered by the relation P1 ≤ P2 if for every block B2 of P2 there is a block B1 of P1 such that B2 ⊆ B1. We note that with regard to partitions often the reverse order is considered, but this one corresponds better to our needs: a question is finer, the more blocks there are in the partition. The join of two partitions is defined as

P1 ∨ P2 = {B1 ∩ B2 ≠ ∅ : B1 ∈ P1, B2 ∈ P2}.

We may even generalize the join to arbitrary nonempty families of partitions,

⋁_{j∈J} Pj = {⋂_{j∈J} Bj ≠ ∅ : Bj ∈ Pj, j ∈ J}.

Now, every set S of partitions of U has as a lower bound the partition consisting of just one block, namely U itself. So

Sℓ = {P : (∀S ∈ S) P ≤ S}

is not empty, and ⋁Sℓ therefore exists. We claim that ⋁Sℓ is the infimum of S. In fact, ⋁Sℓ is a lower bound of S, and if P is another lower bound of S, it must be in Sℓ, hence P ≤ ⋁Sℓ. So Part(U) is a complete lattice, where the meet is defined as

⋀_{j∈J} Pj = ⋁{P : ∀j ∈ J, P ≤ Pj}.

We cite here a result from (Grätzer, 1978) which shows how to obtain the meet of a collection of partitions. If two elements a and b of U belong to the same block of a partition P of U, we write a ≡ b (mod P). Then:

Theorem 2.1 a ≡ b (mod ⋀_{i∈I} Pi) if, and only if, there exist an n ≥ 0, i0, . . . , in ∈ I and x0, . . . , xn+1 ∈ U such that a = x0, b = xn+1 and xj ≡ xj+1 (mod Pij) for 0 ≤ j ≤ n.
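As a hedged illustration (ours, not part of the notes), the following Python sketch computes the join of two partitions by intersecting blocks, and the meet along the lines of Theorem 2.1: two elements are merged whenever they are linked by a chain through blocks of the given partitions (a union-find closure).

```python
from itertools import product

def pjoin(P1, P2):
    """Join of two partitions: all nonempty pairwise intersections of blocks."""
    return {B1 & B2 for B1, B2 in product(P1, P2) if B1 & B2}

def pmeet(partitions):
    """Meet of a family of partitions, following Theorem 2.1: a and b share a
    block iff they are connected by a chain through blocks of the P_i."""
    universe = {a for P in partitions for B in P for a in B}
    parent = {a: a for a in universe}
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path compression
            a = parent[a]
        return a
    for P in partitions:
        for B in P:
            B = list(B)
            for a in B[1:]:
                parent[find(a)] = find(B[0])  # merge along every block
    classes = {}
    for a in universe:
        classes.setdefault(find(a), set()).add(a)
    return {frozenset(C) for C in classes.values()}

P1 = {frozenset({1, 2}), frozenset({3, 4})}
P2 = {frozenset({1, 3}), frozenset({2, 4})}
assert pjoin(P1, P2) == {frozenset({i}) for i in (1, 2, 3, 4)}
assert pmeet([P1, P2]) == {frozenset({1, 2, 3, 4})}
```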


We remark that the lattice Part(U) is neither distributive nor modular. But, besides a bottom element, the lattice also possesses a top element, the partition whose blocks are the single-element subsets of U. Sometimes we may be interested only in partitions with a finite number of blocks. Since the meet of two partitions has at most as many blocks as either of the partitions, and the join has at most as many blocks as the product of their numbers of blocks, these partitions form a lattice too, although not necessarily a complete one. This is called a lattice of finite partitions. Also, if U is infinite, this lattice has no top element.

The importance of partition lattices is that they are universal: every lattice can be embedded in some partition lattice, and every finite lattice can be embedded in a finite partition lattice (Grätzer, 1978). So, in some sense, we may represent all question structures by partition lattices. For example, the lattice of Cartesian products DJ considered in Example 2.1 is in fact a partition lattice with the universe

DN = ∏_{j=1}^{∞} Dj.

Example 2.3 Compatible Frames: Different partitions of a given universe reflect the different granularities considered in asking questions. However, a fixed universe is not really needed; there is no need to assume a finest question. In fact, consider any question whose possible answers form a set Θ. In an attempt to refine the analysis related to this question, we may partition every element of Θ into several finer subelements. This gives us a new set Λ together with a mapping τ : Θ → P(Λ) such that

1. ∀θ ∈ Θ, τ(θ) ≠ ∅,

2. θ′ ≠ θ′′ implies τ(θ′) ∩ τ(θ′′) = ∅,

3. ⋃_{θ∈Θ} τ(θ) = Λ.

We remark that the family {τ(θ) : θ ∈ Θ} is a partition of Λ, and Θ is therefore a representation of this partition. Such a map τ is called a refining of Θ; Λ is a refinement of Θ, and Θ is a coarsening of Λ. We shall write in the following τ : Θ → Λ instead of τ : Θ → P(Λ). Note that a refining on Θ can be extended to P(Θ) by defining, for any subset A of Θ,

τ(A) = ⋃_{θ∈A} τ(θ).

Further, if τ : Θ → Λ and µ : Λ → Ω are two refinings, then µ ∘ τ : Θ → Ω, defined by

(µ ∘ τ)(θ) = µ(τ(θ)),


is clearly also a refining.

We now consider a family F of sets. If Θ, Λ ∈ F, then we write Θ ≤ Λ if there is a refining τ : Θ → Λ. It is easy to verify that ≤ defines a partial order on F. We call F a family of compatible frames if F is a lattice under this order. Any lattice of partitions relative to some universe U represents a particular instance of a family of compatible frames. In fact, in Part(U), if P1 ≤ P2, then there is a refining τ : P1 → P2, and the order P1 ≤ P2 corresponds to the order induced by these refinings. Hence a family of compatible frames, as a lattice, is in general neither distributive nor modular. If it has a bottom element U, then the family of compatible frames represents a lattice of partitions with universe U.
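As a small illustration (ours, with hypothetical frames), a refining can be represented as a dict mapping each element of the coarser frame to its set of finer elements; the extension to subsets and the composition of refinings then read directly off the definitions.

```python
def extend(tau, A):
    """Extension of a refining to a subset A: tau(A) = union of tau(theta)."""
    return frozenset().union(*(tau[theta] for theta in A)) if A else frozenset()

def compose(mu, tau):
    """Composition of refinings: (mu o tau)(theta) = mu(tau(theta))."""
    return {theta: extend(mu, tau[theta]) for theta in tau}

# Hypothetical frames: half-years refined to quarters refined to months.
tau = {"H1": {"Q1", "Q2"}, "H2": {"Q3", "Q4"}}
mu = {"Q1": {1, 2, 3}, "Q2": {4, 5, 6}, "Q3": {7, 8, 9}, "Q4": {10, 11, 12}}
assert compose(mu, tau)["H1"] == {1, 2, 3, 4, 5, 6}
```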

Example 2.4 σ-Fields: A σ-field F ⊆ P(U) is a family of subsets of U which is closed under countable unions and intersections. The intersection ⋂_{i∈I} Fi of any family of σ-fields is still a σ-field. As is well known (Davey & Priestley, 1990a), such a ∩-structure induces a complete lattice under inclusion, whose meet is just intersection and whose join is defined by

⋁_{i∈I} Fi = ⋂ {F : F ⊇ ⋃_{i∈I} Fi},

the intersection ranging over σ-fields F. This gives a lattice of domains D, where a σ-field F ∈ D represents a question about an unknown element of U whose granularity allows as answers the sets in F. This structure is particularly important in probability theory.

Later, we shall see many more important examples of domain structures.


Chapter 3

Axiomatics of Information Algebra

Information comes in pieces, often from different sources. The pieces must be combined into an aggregated information. Further, pieces of information bear on specific questions. Relative to the questions of interest, the relevant information must therefore be extracted from the given pieces of information. This can be captured by algebraic structures allowing for the operations of combination and focusing of information. There are several closely related algebras which capture these elements. We start with a basic axiomatic system of an information algebra. Several interesting models of this structure will then be presented. Finally, further different, but equivalent, axiomatic structures will be derived.

3.1 Projection Axiomatics

At the beginning of Chapter 2 we said that information relates to questions. A structure of related and compatible questions is abstractly represented by a lattice D of domains. Consider now a domain x ∈ D. Pieces of information related to this domain will then abstractly be represented by elements φ from a set Φx. Let then

Φ = ⋃_{x∈D} Φx

be the set of all pieces of information over the whole family of domains. Suppose that three operations are defined in (Φ, D):

1. Labeling: d : Φ → D; φ ↦ d(φ),

2. Combination: ⊗ : Φ × Φ → Φ; (φ, ψ) ↦ φ ⊗ ψ,

3. Projection: ↓ : Φ × D → Φ; (φ, x) ↦ φ↓x, defined for x ≤ d(φ).


The labeling operation retrieves the domain of a piece of information φ ∈ Φ. The combination operation aggregates two pieces of information into a combined piece of information. Projection, finally, represents the operation of focusing a piece of information φ onto a smaller (coarser) domain x. Thus we have a two-sorted algebra (Φ, D) with one unary and two binary operations.

We impose now the following axioms on Φ and D:

1. Semigroup: Φ is associative and commutative under combination.

2. Labeling: ∀φ, ψ ∈ Φ with d(φ) = x, d(ψ) = y,

d(φ ⊗ ψ) = d(φ) ∨ d(ψ),

and ∀φ ∈ Φ and ∀x ∈ D with x ≤ d(φ),

d(φ↓x) = x.

3. Neutrality: ∀x ∈ D ∃ex ∈ Φx such that ∀φ ∈ Φx we have ex ⊗ φ = φ, and ∀y ∈ D, y ≤ x, we have ex↓y = ey.

4. Nullity: ∀x ∈ D ∃zx ∈ Φx such that ∀φ ∈ Φx we have zx ⊗ φ = zx, and ∀y ∈ D, x ≤ y, we have zx ⊗ ey = zy.

5. Projection: ∀φ ∈ Φ and ∀x, y ∈ D, x ≤ y ≤ d(φ) implies

φ↓x = (φ↓y)↓x.

6. Combination: ∀φ, ψ ∈ Φ with d(φ) = x, d(ψ) = y,

(φ ⊗ ψ)↓x = φ ⊗ ψ↓x∧y.

7. Idempotency: ∀φ ∈ Φ and x ∈ D, x ≤ d(φ) implies

φ ⊗ φ↓x = φ.

The semigroup axiom states that pieces of information may be combined in any sequence without changing the result. As a consequence we do not need to write parentheses when we combine several pieces of information. The labeling axiom says that the combined information has as domain the join of the domains of its factors: if two pieces of information on two domains x and y are aggregated, then the combined information pertains to the least refined question that refines both. Similarly, if a piece of information is focused onto a coarser question, the resulting information bears on the corresponding coarser domain. The neutrality axiom stipulates the existence of a neutral element in each domain x. Neutral elements represent the vacuous information relative to each question. The axiom further states that neutral elements project to neutral elements: vacuous information remains vacuous upon projection. The nullity axiom similarly requires the existence of a null element in each domain. Null elements represent contradictory ("impossible") information, which destroys any other information. The projection axiom guarantees that projection can be done in steps. The most powerful axiom is the combination axiom. It permits, in some cases, avoiding the large domains resulting from combination if a projection is performed afterwards. This has important consequences for computation, but is also important in many other respects. Idempotency, finally, tells us that combining a piece of information with a part of itself gives nothing new: the repetition of the same information yields nothing new.

A system (Φ, D) satisfying the axioms above is called a labeled informa-tion algebra.
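To fix the signature, here is a minimal Python sketch of the two-sorted interface (our own illustration; the class and method names are hypothetical, not from the notes). Concrete models such as the subset and relational algebras below can be written against it, and the axioms can be spot-checked on sample elements.

```python
from abc import ABC, abstractmethod

class LabeledInfoAlgebra(ABC):
    """Two-sorted signature (Phi, D): labeling d, combination, projection."""

    @abstractmethod
    def label(self, phi):
        """d(phi): the domain on which phi bears."""

    @abstractmethod
    def combine(self, phi, psi):
        """phi (x) psi: aggregate two pieces of information."""

    @abstractmethod
    def project(self, phi, x):
        """phi projected to a domain x <= d(phi)."""

    def check_idempotency(self, phi, x):
        """Spot-check of the idempotency axiom on one element and one domain."""
        return self.combine(phi, self.project(phi, x)) == phi
```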

On some occasions we shall later relax certain axioms. For instance, the idempotency axiom is not needed for developing efficient computational schemes in Part II on local computation. Also the existence of neutral or null elements is not always required. However, these axioms reflect what is expected in general when a general concept of information is modeled. Therefore, without explicit mention to the contrary, we assume in the sequel the system of axioms above.

We now derive a few very elementary consequences from these axioms. But first we define, for all φ ∈ Φ and y ∈ D with y ≥ d(φ),

φ↑y = φ ⊗ ey.

We call φ↑y the vacuous extension of φ to the domain y. In fact, as we shall see below, this operation transports φ from its original domain to the new, finer domain without adding information. The following lemma collects some results on vacuous extension and related issues.

Lemma 3.1

1. x ≤ y implies ex↑y = ey.

2. d(φ), d(ψ) ≤ z implies (φ ⊗ ψ)↑z = φ↑z ⊗ ψ↑z.

3. d(φ) = x, d(ψ) = y implies φ ⊗ ψ = φ↑x∨y ⊗ ψ↑x∨y.

4. ∀x, y ∈ D we have ex ⊗ ey = ex∨y.

5. d(φ) = x implies, ∀y ∈ D, φ ⊗ ey = φ↑x∨y.

6. d(φ) = x implies φ↑x = φ.

7. y ≤ d(φ) implies φ ⊗ ey = φ.

8. d(φ) ≤ x ≤ y implies φ↑y = (φ↑x)↑y.

9. d(φ) = x implies (φ↑x∨y)↓y = (φ↓x∧y)↑y.

10. d(φ) = x ≤ y implies (φ↑y)↓x = φ.

Proof. (1) By the definition of vacuous extension and the neutrality and idempotency axioms,

ex↑y = ex ⊗ ey = ey↓x ⊗ ey = ey.

(2) By the definition of vacuous extension and the neutrality axiom,

(φ ⊗ ψ)↑z = (φ ⊗ ψ) ⊗ ez = (φ ⊗ ez) ⊗ (ψ ⊗ ez) = φ↑z ⊗ ψ↑z.

(3) By the labeling and neutrality axioms and point (2) just proved, we have

φ ⊗ ψ = (φ ⊗ ψ) ⊗ ex∨y = (φ ⊗ ψ)↑x∨y = φ↑x∨y ⊗ ψ↑x∨y.

(4) follows from (3) with φ = ex and ψ = ey, together with (1):

ex ⊗ ey = ex↑x∨y ⊗ ey↑x∨y = ex∨y ⊗ ex∨y = ex∨y.

(5) By the neutrality axiom, point (4) and the definition of vacuous extension,

φ ⊗ ey = φ ⊗ ex ⊗ ey = φ ⊗ ex∨y = φ↑x∨y.

(6) follows since φ = φ ⊗ ex = φ↑x.

(7) If d(φ) = x, then y ≤ x implies x ∨ y = x. Hence, using point (4) above,

φ ⊗ ey = φ ⊗ ex ⊗ ey = φ ⊗ ex∨y = φ ⊗ ex = φ.

(8) Once more by point (4) above, using x ∨ y = y, we obtain

φ↑y = φ ⊗ ey = φ ⊗ ex∨y = (φ ⊗ ex) ⊗ ey = (φ↑x)↑y.

(9) Using point (5) above and the combination axiom, we obtain

(φ↑x∨y)↓y = (φ ⊗ ey)↓y = φ↓x∧y ⊗ ey = (φ↓x∧y)↑y.

(10) By the definition of vacuous extension, the combination and neutrality axioms, and x ∧ y = x,

(φ↑y)↓x = (φ ⊗ ey)↓x = φ ⊗ ey↓x∧y = φ ⊗ ey↓x = φ ⊗ ex = φ. □

Next we prove some results concerning null elements.

Lemma 3.2

1. ∀x, y ∈ D, x ≤ y implies zy↓x = zx.

2. ∀x, y ∈ D, x ≤ y = d(φ) and φ↓x = zx imply φ = zy.

3. ∀x, y ∈ D, zx ⊗ zy = zx∨y.

4. ∀φ ∈ Φ, x, y ∈ D, d(φ) = x implies φ ⊗ zy = zx∨y.

Proof. (1) From the nullity, combination and neutrality axioms, using x ∧ y = x, we derive

zy↓x = (zx ⊗ ey)↓x = zx ⊗ ey↓x = zx ⊗ ex = zx.

(2) By the idempotency and nullity axioms we have

φ = φ ⊗ φ↓x = φ ⊗ zx = φ ⊗ ey ⊗ zx = φ ⊗ zy = zy.

(3) The labeling, neutrality and nullity axioms give

zx ⊗ zy = (zx ⊗ zy) ⊗ ex∨y = (zx ⊗ ex∨y) ⊗ (zy ⊗ ex∨y) = zx∨y ⊗ zx∨y = zx∨y.

(4) This follows from the labeling, neutrality and nullity axioms:

φ ⊗ zy = (φ ⊗ ex∨y) ⊗ (zy ⊗ ex∨y) = φ↑x∨y ⊗ zx∨y = zx∨y. □

If the lattice D of domains is modular (e.g. distributive), then the combination axiom may be strengthened to the following result.

Lemma 3.3 Let D be modular. Then ∀φ, ψ ∈ Φ, d(φ) = x, d(ψ) = y and x ≤ z ≤ x ∨ y imply that

(φ ⊗ ψ)↓z = φ ⊗ ψ↓y∧z.

Proof. We have first

(φ ⊗ ψ)↓z = (φ ⊗ ψ)↓z ⊗ ez.

Since z ≤ x ∨ y implies z ∧ (x ∨ y) = z, and x ≤ z implies x ∨ z = z, the labeling and combination axioms, together with the associativity of combination, give us

(φ ⊗ ψ)↓z ⊗ ez = (φ ⊗ ψ ⊗ ez)↓z = (φ ⊗ ez) ⊗ ψ↓y∧z = (φ ⊗ ψ↓y∧z) ⊗ ez.

Now, the first factor in the last line has domain x ∨ (y ∧ z). By the modular law this equals (x ∨ y) ∧ z = z, so the factor ez is neutral. Thus we finally obtain

(φ ⊗ ψ)↓z = φ ⊗ ψ↓y∧z. □

At this point a first, basic example will clarify the concept of information algebras.


Example 3.1 Subsets in a System of Compatible Frames: Let F be a family of compatible frames. In accordance with the notation of this section we denote the frames in F by x, y, . . .. Let then Φx be the powerset of x for any x ∈ F, and

Φ = ⋃_{x∈F} Φx.

The idea is that a subset φ of x represents an information about the question represented by x: namely the information that the unknown element θ ∈ x belongs to φ. Define d(φ) = x if φ ∈ Φx. If x ≤ y, denote the refining of x to y by τx→y : x → y. If now d(φ) = x and d(ψ) = y, then the combination of these two subsets is defined by

φ ⊗ ψ = τx→x∨y(φ) ∩ τy→x∨y(ψ).

This means that both subsets φ and ψ are refined to the coarsest common finer frame x ∨ y and intersected there, so that d(φ ⊗ ψ) = d(φ) ∨ d(ψ).

If φ is a subset of y, and x ≤ y, then the projection of φ to x is defined by

φ↓x = {θ ∈ x : τx→y(θ) ∩ φ ≠ ∅}.

The idea behind this definition is the following: if we consider some θ ∈ x, we may ask which elements of y are compatible with θ, or, in other words, cannot be excluded if θ is assumed. These are exactly the elements of τx→y(θ). Now, given the subset φ ⊆ y, only those elements θ ∈ x which are compatible with elements of φ remain possible, hence those for which τx→y(θ) ∩ φ ≠ ∅.

This defines an algebra (Φ, F). It is left as an exercise to verify that the axioms of an information algebra are satisfied. Note that this example covers, among other things, subset algebras on lattices of partitions of some universe U.
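In Python (a sketch of ours, repeating the small extend helper for refinings from Section 2.2), combination refines both subsets to the common finer frame and intersects them, while projection keeps the coarse elements compatible with φ:

```python
def extend(tau, A):
    """tau(A) = union of tau(theta) for theta in A."""
    return frozenset().union(*(tau[t] for t in A)) if A else frozenset()

def combine(phi, tau_x, psi, tau_y):
    """phi (x) psi, where tau_x, tau_y refine the two frames to x v y."""
    return extend(tau_x, phi) & extend(tau_y, psi)

def project(phi, tau_x_to_y):
    """Project phi (a subset of the finer frame y) to the coarser frame x:
    keep those theta whose refinement still intersects phi."""
    return frozenset(t for t, image in tau_x_to_y.items() if image & phi)

# Frames x = {"even", "odd"} and y = {1,...,6}, linked by parity:
tau = {"even": {2, 4, 6}, "odd": {1, 3, 5}}
assert project({5, 6}, tau) == {"even", "odd"}
assert project({2, 4}, tau) == {"even"}
```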

An important special case is that of information algebras related to multivariate structures (Example 2.1 in Section 2.2). The following section is devoted to these particular algebras.

3.2 Multivariate Information Algebras

Multivariate structures for domains are of particular interest. Remember that the domain lattices D associated with multivariate structures are distributive. Thus, the extended combination property of Lemma 3.3 applies to them. The case of multivariate structures is rich in models. Here we introduce a few of them:


Example 3.2 Subsets in a Multivariate Structure: We refer to Example 2.1 in Section 2.2. There is a countable number of variables X1, X2, . . ., where each variable Xi has its domain Di, the set of its possible values. To a set {Xj : j ∈ J} of variables we associate the domain

DJ = ∏_{j∈J} Dj.

The elements x ∈ DJ are called J-tuples; they represent the possible common values of the variables Xj for j ∈ J. Note that a J-tuple can be seen as a mapping x on J such that x(i) ∈ Di for all i ∈ J. An information about the set of variables Xj can be given by a subset φ ⊆ DJ, i.e. a set of J-tuples which contains the unknown common value of the variables. So let ΦJ be the powerset of DJ, and

Φ = ⋃_{J⊆ω} ΦJ.

Let D be the lattice of all subsets of ω = {1, 2, . . .}. For φ ∈ ΦJ define the labeling d(φ) = J. In order to define combination and projection within Φ, we first define the projection x↓K of a J-tuple x, for K ⊆ J, as the K-tuple y such that y(i) = x(i) for i ∈ K. Then the projection of a J-information φ ∈ ΦJ to K ⊆ J is defined by

φ↓K = {x↓K : x ∈ φ}.

The combination of a J-information φ with a K-information ψ is defined as

φ ⊗ ψ = {x ∈ DJ∪K : x↓J ∈ φ, x↓K ∈ ψ}.

Thus we have a two-sorted algebra (Φ, D) with the unary operation d : Φ → D and the binary operations ⊗ : Φ × Φ → Φ and ↓ : Φ × D → Φ, the latter defined for K ⊆ d(φ). We leave it to the reader to verify that the axioms of an information algebra are satisfied. As a variant we may also take for D the lattice of finite subsets of ω.
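A direct Python transcription (ours; the representation is one hypothetical choice) of this subset algebra: a J-tuple is coded as a hashable set of (variable, value) pairs, and an information element is a set of such tuples together with its label J.

```python
from itertools import product

def tuples(domains, J):
    """All J-tuples over the given variable domains, coded as frozensets of
    (variable, value) pairs."""
    J = sorted(J)
    return {frozenset(zip(J, vals)) for vals in product(*(domains[j] for j in J))}

def restrict(x, K):
    """The projection x|K of a tuple x to the index set K."""
    return frozenset((j, v) for j, v in x if j in K)

def project(phi, K):
    """phi projected to K: restrict every tuple of phi."""
    return {restrict(x, K) for x in phi}

def combine(phi, J, psi, K, domains):
    """phi (x) psi: the (J u K)-tuples whose restrictions lie in phi and psi."""
    return {x for x in tuples(domains, J | K)
            if restrict(x, J) in phi and restrict(x, K) in psi}

D = {1: {0, 1}, 2: {0, 1}}
phi = {x for x in tuples(D, {1}) if dict(x)[1] == 1}              # "X1 = 1"
psi = {x for x in tuples(D, {1, 2}) if dict(x)[1] == dict(x)[2]}  # "X1 = X2"
chi = combine(phi, frozenset({1}), psi, frozenset({1, 2}), D)
assert project(chi, {2}) == {frozenset({(2, 1)})}                 # hence "X2 = 1"
```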

A very important example of an information algebra is provided by relational databases. In fact, an important reduct of relational algebra is an information algebra. It can be seen as a particular case of Example 3.2, but it is worthwhile to develop it in its own right.

Example 3.3 Relational Algebra. Let A be a finite set of symbols, called attributes. For each α ∈ A let Dα be a non-empty set, the set of all possible values of the attribute α. The set of attributes corresponds to the set of variables in the terminology above, except that we restrict it here to a finite set. The Dα are the domains of the variables.


Let x ⊆ A. An x-tuple is a function f with domain x and f(α) ∈ Dα for each attribute α ∈ x. The set of all x-tuples is denoted by Ex. The empty set of x-tuples (the empty relation) is denoted by ∅x. For any x-tuple f and a subset y ⊆ x, the restriction f[y] is defined to be the y-tuple g such that g(α) = f(α) for all α ∈ y. Tuples correspond to the J-tuples of Example 2.1, and restriction to projection.

A relation R over x is a set of x-tuples, i.e. a subset of Ex. The set of attributes x is called the domain of R; as usual we denote it by d(R). Relations represent subsets of domains of a group of variables. For y ⊆ d(R), the projection of R onto y is defined as follows:

πy(R) = {f[y] : f ∈ R}.

The join of a relation R over x and a relation S over y is defined as

R ⋈ S = {f ∈ Ex∪y : f[x] ∈ R, f[y] ∈ S}.

Clearly, the join is combination. In fact, it is easy to see that the fragment of relational algebra containing the two operations of projection and join satisfies the axioms of an information algebra. Here the axioms are expressed in the notation of relational algebra:

1. (R1 ⋈ R2) ⋈ R3 = R1 ⋈ (R2 ⋈ R3) (associativity),

2. R1 ⋈ R2 = R2 ⋈ R1 (commutativity),

3. d(R ⋈ S) = d(R) ∪ d(S) and d(πy(R)) = y (labeling),

4. R ⋈ Ex = R, if d(R) = x, and πy(Ex) = Ey, if y ⊆ x (neutrality),

5. R ⋈ ∅x = ∅x, if d(R) = x, and ∅x ⋈ Ey = ∅y, if x ⊆ y (nullity),

6. πx(R) = πx(πy(R)), if x ⊆ y ⊆ d(R) (projection),

7. πx(R ⋈ S) = R ⋈ πx∩y(S), if d(R) = x and d(S) = y (combination),

8. R ⋈ πx(R) = R, if x ⊆ d(R) (idempotency).
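As a hedged sketch (ours), here are the two operations with tuples again coded as frozensets of (attribute, value) pairs, using the equivalent "gluing" formulation of the join, together with a spot-check of the combination axiom (7) on a small instance:

```python
from itertools import product

def pi(y, R):
    """Projection: restrict every tuple of R to the attribute set y."""
    return {frozenset((a, v) for a, v in f if a in y) for f in R}

def natjoin(R, x, S, y):
    """Join of R (over x) and S (over y): glue tuples that agree on x n y."""
    return {f | g for f, g in product(R, S)
            if pi(x & y, {f}) == pi(x & y, {g})}

x, y = frozenset({"A", "B"}), frozenset({"B", "C"})
R = {frozenset({("A", 1), ("B", 2)}), frozenset({("A", 3), ("B", 4)})}
S = {frozenset({("B", 2), ("C", 5)})}
# Combination axiom (7): pi_x(R join S) = R join pi_{x n y}(S)
assert pi(x, natjoin(R, x, S, y)) == natjoin(R, x, pi(x & y, S), x & y)
```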

Relational algebra is thus a model of an information algebra, and a very fundamental one, as explained in the following Section 3.3. Of course relational algebra contains more operations than projection and join; in fact, it is also a model of a Boolean information algebra, as we shall see later (see Section 6.2).

Example 3.4 Idempotent Semirings. A rather general family of information algebras on multivariate structures can be derived from distributive lattices with a bottom element ⊥ and a top element ⊤. Let A be a distributive lattice in which we denote the meet by × and the join by +. In this view, a distributive lattice is a particular instance of a more general algebraic structure called a semiring. Semirings play an important role in theoretical Computer Science. In our context too, they prove to be a rich source of models of information algebras. The present example is a first hint at this fact. More examples in this framework can be found in (Pouly & Kohlas, 2010).

In what follows we consistently write a join over several elements of the lattice as ∑. For any subset J of ω consider the set DJ of J-tuples introduced in Example 3.2 above, and mappings ψ : DJ → A, whereby to each J-tuple x a value ψ(x) in A is assigned. Such a mapping is called an A-valuation on J. Let now ΨJ denote the set of all A-valuations on J, and

Ψ = ⋃_{J⊆ω} ΨJ.

Within Ψ we define the unary operation d by d(ψ) = J if ψ is a J-valuation. Further we define two binary operations as follows:

1. Combination: ⊗ : Ψ × Ψ → Ψ, (φ, ψ) ↦ φ ⊗ ψ, defined for x ∈ D_{d(φ)∪d(ψ)} by

(φ ⊗ ψ)(x) = φ(x↓d(φ)) × ψ(x↓d(ψ)).

2. Projection: ↓ : Ψ × D → Ψ, (ψ, K) ↦ ψ↓K, defined for K ⊆ d(ψ) and x ∈ DK by

ψ↓K(x) = ∑_{y∈D_{d(ψ)}: y↓K=x} ψ(y).

The defining equation for projection can also be written in the following way, if we decompose a tuple y in the domain J = d(ψ) into a subtuple x belonging to the domain K and a subtuple z belonging to the domain J − K, y = (x, z):

ψ↓K(x) = ∑_{z∈D_{J−K}} ψ(x, z).

We claim that the two-sorted algebra (Ψ, D) with the unary operation d and the two binary operations ⊗ and ↓ as defined above forms an information algebra. The semigroup and labeling axioms are immediate consequences of the definition and of the commutativity and associativity of the meet in the underlying distributive lattice A. The neutral A-valuation on J is the constant mapping eJ(x) = ⊤ (the neutral element for ×) for all x ∈ DJ. By the idempotency of the join in the lattice A we see that for any K ⊆ J = d(ψ),

eJ↓K(x) = ∑_{z∈D_{J−K}} eJ(x, z) = ⊤.

Thus we conclude that eJ↓K = eK, hence the neutrality axiom is satisfied too. The null A-valuation on J is the constant mapping zJ(x) = ⊥, the null element for multiplication. Then we obtain, for K ⊆ J,

(zK ⊗ eJ)(x) = zK(x↓K) × eJ(x) = ⊥ × ⊤ = ⊥.

Thus we see that zK ⊗ eJ = zJ, hence the nullity axiom also holds. Transitivity of projection means simply that we can sum out variables in two steps. The combination property also follows easily, since × distributes over + in the distributive lattice A. Finally, the idempotency axiom follows if we remember that × denotes the infimum. Thus

(ψ ⊗ ψ↓K)(x) = ψ(x) × ψ↓K(x↓K) ≤ ψ(x).

Conversely, for y ∈ DK and z ∈ D_{J−K},

(ψ ⊗ ψ↓K)(y, z) = ψ(y, z) × ∑_{z′∈D_{J−K}} ψ(y, z′) ≥ ψ(y, z) × ψ(y, z) = ψ(y, z),

since ψ(y, z) is a term of the sum in the first line and × is idempotent.

Now, there are many examples of distributive lattices which yield interesting information algebras. For instance the bottleneck algebra, defined on the set of real numbers augmented with +∞ and −∞, A = R ∪ {−∞, +∞}, takes max for the +-operation and min for the ×-operation. The element −∞ is the bottom element ⊥, and the top element is +∞. More generally, distributive lattices can be used to express qualitative degrees of membership of tuples in fuzzy sets. Further, Boolean algebras are distributive lattices. Elements of Boolean algebras can be seen as describing the parts of belief associated with tuples. These parts of belief may even be quantitatively measured by probabilities. Such systems will be discussed in Part IV on Uncertain Information. In particular, A-valuations with the binary Boolean algebra {0, 1} as A represent subsets of tuples, φ = {x : ψ(x) = 1}. This information algebra is essentially identical (isomorphic) to the subset algebra of Example 3.2.

Example 3.5 Convex Subsets in Euclidean Vector Spaces: Let Dj = R in a multivariate structure. Here we admit, to simplify, only finite sets J in the lattice D, so that the DJ are finite-dimensional real vector spaces Rn. A subset φ of DJ is convex if x, y ∈ φ implies λ·x + (1 − λ)·y ∈ φ for all 0 ≤ λ ≤ 1. Let now ΦJ be the family of convex subsets of DJ, and

Φ = ⋃_{J⊆ω, J finite} ΦJ.

Within the two-sorted system (Φ, D), the unary labeling operation d and the binary operations of combination and projection are defined as in the subset algebra of Example 3.2. We claim that the combination of two convex subsets is again convex, just as the projection of a convex subset is still convex, i.e. Φ is closed under combination and projection. It then follows automatically from Example 3.2 that (Φ, D) is an information algebra; in fact, a subalgebra of the algebra of subsets in the corresponding multivariate structure. Indeed, since addition in DJ is component-wise, we have, for x, y ∈ DJ and K ⊆ J, that

(λ·x + (1 − λ)·y)↓K = λ·x↓K + (1 − λ)·y↓K.

Hence, if d(φ) = K and d(ψ) = J, then for x, y ∈ φ ⊗ ψ consider λ·x + (1 − λ)·y. By the definition of combination, x↓K, y↓K ∈ φ, hence (λ·x + (1 − λ)·y)↓K = λ·x↓K + (1 − λ)·y↓K ∈ φ. Similarly we conclude that (λ·x + (1 − λ)·y)↓J ∈ ψ. Thus λ·x + (1 − λ)·y ∈ φ ⊗ ψ, which proves that φ ⊗ ψ is convex. Further, let x′, y′ ∈ φ↓K; then there exist x, y ∈ φ such that x′ = x↓K and y′ = y↓K. Then λ·x + (1 − λ)·y ∈ φ implies that λ·x′ + (1 − λ)·y′ ∈ φ↓K. This shows that φ↓K is convex.

In particular, n-dimensional intervals (open, closed, half-open) as well as convex polyhedra form a subalgebra of the algebra of convex sets.
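For the interval subalgebra, a box over J can be coded as a dict mapping each variable to an interval (lo, hi); combination intersects coordinatewise (missing coordinates behave like the vacuous interval), and projection simply drops coordinates. A sketch under this representation of ours:

```python
import math

def combine(phi, psi):
    """Intersect two boxes on the union of their variables."""
    out = {}
    for j in set(phi) | set(psi):
        lo1, hi1 = phi.get(j, (-math.inf, math.inf))
        lo2, hi2 = psi.get(j, (-math.inf, math.inf))
        out[j] = (max(lo1, lo2), min(hi1, hi2))
    return out

def project(phi, K):
    """Project a box: keep only the coordinates in K."""
    return {j: phi[j] for j in K}

phi = {1: (0.0, 2.0)}
psi = {1: (1.0, 3.0), 2: (0.0, 1.0)}
assert combine(phi, psi) == {1: (1.0, 2.0), 2: (0.0, 1.0)}
assert project(combine(phi, psi), {1}) == {1: (1.0, 2.0)}
```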

Further examples of information algebras are provided by the open, closed, compact and Borel sets in the real spaces Rn. We leave it to the reader to verify these claims.

3.3 File Systems

The example of relational algebra can be generalized in a more abstract framework. Let D be a lattice. A tuple system over D is a set T together with two operations

1. d : T → D,

2. ·[·] : T × D → T, defined for x ≤ d(f),

which satisfies the following axioms for f, g ∈ T and x, y ∈ D:

1. x ≤ d(f) implies d(f[x]) = x.

2. x ≤ y ≤ d(f) implies f[y][x] = f[x].

3. d(f) = x implies f[x] = f.

4. d(f) = x, d(g) = y and f[x ∧ y] = g[x ∧ y] imply that there exists an h ∈ T such that d(h) = x ∨ y and h[x] = f, h[y] = g.

5. d(f) = x and x ≤ y imply that there exists a g ∈ T such that d(g) = y and g[x] = f.


The elements of T are called tuples, d(f) is the domain of a tuple f, and f[x] is its projection to the domain x. Obviously, the mathematical tuples of ordinary relational algebra (see the previous Example 3.3) form a tuple system, where D is the power set lattice of the set of attributes A.

Let (T, D) be a tuple system with the operations d and ·[·]. Generalizing ordinary relations, we define a relation over x ∈ D to be a subset R of tuples such that d(f) = x for all f ∈ R. The domain of R then equals the domain of its tuples: d(R) = d(f) if f ∈ R. As in the former example, we define Ex to be the set of all tuples f ∈ T with d(f) = x, and ∅x as the empty set with d(∅x) = x. For a relation R and x ≤ d(R), the projection of R onto x is defined as

πx(R) = {f[x] : f ∈ R}.

Similarly, the join of two relations R and S with d(R) = x and d(S) = y is defined as follows:

R ⋈ S = {f ∈ T : d(f) = x ∨ y, f[x] ∈ R, f[y] ∈ S}.

If the set on the right is empty, we identify it with ∅x∨y. Let RT be the set of all relations in T. It forms an information algebra.

Theorem 3.1 (RT, D) is an information algebra with the operations of projection and join as combination.

Proof. (1) The join is commutative by definition. It is also associative,since f ∈ (R1 ./ R2) ./ R3 implies f [x ∨ y] ∈ R1 ./ R2 and f [z] ∈ R3, hencef [x] ∈ R1, and f [y] ∈ R2, if d(R1) = x, d(R2) = y and d(R3) = z. The sameresult is obtained, if f ∈ R1 ./ (R2 ./ R3).

(2) d(R ./ S) = d(R) ∪ d(S) holds by definition. So does d(πx(R)) = xif x ≤ d(R) and by axiom (1) for tuples.

(3) By definition πy(Ex) = f [y] : f ∈ Ex if y ≤ x. Now, if g ∈ Ey, thenby axiom (5) for tuples there exists a tuple f ∈ Ex such that f [y] = g, henceg ∈ πy(Ex) and therefore πy(Ex) = Ey. Clearly Ex ./ R = R if d(R) = x.So the neutrality axiom holds.

(4) The nullity axiom holds by definition.

(5) The projection axiom follows from the definition of projection and axiom (2) for tuples.

(6) In order to prove the combination axiom, assume first that f ∈ πx(R ⋈ S), where d(R) = x and d(S) = y. Then there exists a g ∈ R ⋈ S such that g[x] = f and g[x] ∈ R, g[y] ∈ S. This implies f ∈ R. Further, g[x ∧ y] = g[x][x ∧ y] = f[x ∧ y] and also g[x ∧ y] = g[y][x ∧ y]; since g[y] ∈ S, this shows that f[x ∧ y] ∈ πx∧y(S), hence f ∈ R ⋈ πx∧y(S).

Conversely, assume f ∈ R ⋈ πx∧y(S). This means that f ∈ R and f[x ∧ y] ∈ πx∧y(S). The latter implies that there is a g ∈ S such


that g[x ∧ y] = f[x ∧ y]. By property (4) of a tuple system, there is an h with d(h) = x ∨ y such that h[x] = f and h[y] = g. Thus h ∈ R ⋈ S and hence f = h[x] ∈ πx(R ⋈ S).

(7) The idempotency axiom finally follows directly from the definition of the join. □

We shall later encounter many instances of a tuple system. In fact, remarkably, an information algebra (Φ, D) is itself a tuple system.

Lemma 3.4 Let (Φ, D) be an information algebra. Then (Φ, D) is a tuple system with domain d and projection ↓.

Proof. We have to show that the five axioms for tuples are satisfied:

(1) follows from the labeling axiom.

(2) is the projection axiom.

(3) Using the axioms of an information algebra, assuming d(φ) = x, we see that φ↓x = (φ ⊗ ex)↓x = φ ⊗ ex↓x = φ ⊗ ex = φ. This is axiom (3) for tuples.

(4) Assume d(φ) = x, d(ψ) = y and φ↓x∧y = ψ↓x∧y. Then d(φ ⊗ ψ) = x ∨ y, and we claim that φ ⊗ ψ is the element whose existence is asserted in axiom (4) for tuples. Indeed,

(φ ⊗ ψ)↓x = φ ⊗ ψ↓x∧y = φ ⊗ φ↓x∧y = φ,

(φ ⊗ ψ)↓y = φ↓x∧y ⊗ ψ = ψ↓x∧y ⊗ ψ = ψ.

(5) Assume that d(φ) = x and x ≤ y. Then d(φ↑y) = y and (φ↑y)↓x = φ (lemma 3.1 (10)). So φ↑y is the element whose existence is asserted in axiom (5). □

3.4 Variable Elimination

In a multivariate framework we may replace projection by an alternative primitive operation. Let V be the countable set of variables {X1, X2, . . .} and let now d(φ) ⊆ V denote the set of variables defining the domain of φ. For an element φ ∈ Φ and a variable X ∈ d(φ) we define

φ−X = φ↓d(φ)−{X}. (3.1)

This operation is called variable elimination. If we replace projection by variable elimination, then we obtain a system (Φ, V) with the following operations:

1. Labeling: d : Φ → P(V); φ 7→ d(φ),

2. Combination: ⊗ : Φ × Φ → Φ; (φ, ψ) 7→ φ ⊗ ψ,

3. Variable Elimination: − : Φ × V → Φ; (φ, X) 7→ φ−X, for X ∈ d(φ).


We shall prove below that, if (Φ, D) is an information algebra, the following system of axioms holds in (Φ, V), and that this system is in fact equivalent to the previous one, based on projection. (A small computational illustration of these axioms follows the list below.)

1. Semigroup: Φ is associative and commutative under combination.

2. Labeling: ∀φ, ψ ∈ Φ, d(φ) = x, d(ψ) = y imply

d(φ ⊗ ψ) = d(φ) ∪ d(ψ),

and ∀φ ∈ Φ and ∀X ∈ d(φ), we have

d(φ−X) = d(φ) − {X}.

3. Neutrality: ∀x ∈ D ∃ex ∈ Φx such that ∀φ ∈ Φx we have ex ⊗ φ = φ, and ∀X ∈ x we have (ex)−X = ex−{X}.

4. Nullity: ∀x ∈ D ∃zx ∈ Φx such that ∀φ ∈ Φx we have zx ⊗ φ = zx, and ∀y ∈ D with x ≤ y we have zx ⊗ ey = zy.

5. Variable Elimination: ∀φ ∈ Φ and ∀X, Y ∈ d(φ) with X ≠ Y,

(φ−X)−Y = (φ−Y)−X.

6. Combination: ∀φ, ψ ∈ Φ with d(φ) = x, d(ψ) = y, and ∀Y ∈ y with Y ∉ x,

(φ ⊗ ψ)−Y = φ ⊗ ψ−Y.

7. Idempotency: ∀φ ∈ Φ and ∀X ∈ d(φ),

φ⊗ φ−X = φ.
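As a quick sanity check, the following Python fragment (an illustration of ours, not from the text) verifies the variable elimination, combination and idempotency axioms on a small instance of the subset algebra, with pieces of information represented as sets of attribute-value tuples:

def eliminate(phi, X):
    # phi^{-X}: drop the variable X from every tuple of phi.
    return {frozenset((v, a) for v, a in f if v != X) for f in phi}

def combine(phi, psi):
    # Join of two relations over possibly different variable sets.
    out = set()
    for f in phi:
        for g in psi:
            df, dg = dict(f), dict(g)
            if all(df[v] == dg[v] for v in df.keys() & dg.keys()):
                out.add(frozenset({**df, **dg}.items()))
    return out

phi = {frozenset({('X', 0), ('Y', 1)}), frozenset({('X', 1), ('Y', 1)})}
psi = {frozenset({('Y', 1), ('Z', 2)})}
chi = combine(phi, psi)

# Axiom (5): the order of elimination does not matter.
assert eliminate(eliminate(chi, 'X'), 'Z') == eliminate(eliminate(chi, 'Z'), 'X')
# Axiom (6): Z occurs in d(psi) but not in d(phi).
assert eliminate(chi, 'Z') == combine(phi, eliminate(psi, 'Z'))
# Axiom (7): idempotency.
assert combine(phi, eliminate(phi, 'X')) == phi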

Theorem 3.2 If (Φ, D) is an information algebra over D = P(V) and variable elimination is defined by (3.1), then the system of axioms above for variable elimination is satisfied.

Proof. Let (Φ, D) be an information algebra over D = P(V). We only have to prove axioms (2), (3) and (5) to (7) for variable elimination, the others being identical to those of a labeled information algebra.

Axiom (2) follows from the labeling axiom, since d(φ−X) = d(φ↓d(φ)−{X}) = d(φ) − {X} by the definition of variable elimination. Axiom (3) follows in the same way from the neutrality axiom.

Axiom (5) follows from the definition of variable elimination, axiom (2) just proved and the projection axiom of information algebras,

(φ−X)−Y = (φ↓d(φ)−{X})↓d(φ)−{X,Y} = φ↓d(φ)−{X,Y} = (φ↓d(φ)−{Y})↓d(φ)−{X,Y} = (φ−Y)−X.


To derive axiom (6), note that x ⊆ (x ∪ y) − {Y} ⊆ x ∪ y. Take then z = (x ∪ y) − {Y} and apply lemma 3.3 (remember that the lattice P(V) is distributive, hence modular),

(φ ⊗ ψ)−Y = (φ ⊗ ψ)↓(x∪y)−{Y} = φ ⊗ ψ↓y∩((x∪y)−{Y}).

Since Y ∈ d(ψ) = y and Y ∉ d(φ), we obtain y ∩ ((x ∪ y) − {Y}) = y − {Y}, hence

(φ ⊗ ψ)−Y = φ ⊗ ψ↓y−{Y} = φ ⊗ ψ−Y.

Axiom (7) finally follows from the definition of variable elimination and the idempotency axiom of information algebras. □

The axioms of variable elimination allow us to define unambiguously the elimination of two or more variables X1, X2, . . . , Xn ∈ d(φ) by

φ−{X1,X2,...,Xn} = (· · · ((φ−X1)−X2) · · ·)−Xn, (3.2)

independently of the sequence of variable eliminations actually used. Here are a few elementary results on variable elimination:

Lemma 3.5 1. ∀φ ∈ Φ, x ⊆ d(φ) implies

d(φ−x) = d(φ)− x.

2. ∀φ ∈ Φ, x, y ⊆ d(φ), x and y disjoint, implies

(φ−x)−y = φ−(x∪y).

3. ∀φ, ψ ∈ Φ, y ⊆ d(ψ), y ∩ d(φ) = ∅, implies

(φ⊗ ψ)−y = φ⊗ ψ−y.

4. ∀φ ∈ Φ, x ⊆ d(φ) implies

φ⊗ φ−x = φ.

Proof. (1) In order to prove this first item, we order the variables of the set x arbitrarily as X1, X2, . . . , Xn and apply definition (3.2) as well as axiom (2) of the system above repeatedly:

d(φ−x) = d((· · · ((φ−X1)−X2) · · ·)−Xn) = (· · · ((d(φ) − {X1}) − {X2}) · · ·) − {Xn} = d(φ) − x.


(2) Follows directly from the definition (3.2) of the elimination of subsets of variables.

(3) Follows from (3.2) and a repeated application of axiom (6) of the system above.

(4) By definition we have

φ⊗ φ−x = φ⊗ (. . . ((φ−X1)−X2) · · ·)−Xn .

If we apply idempotency axiom (7) above repeatedly, then we obtain

φ⊗ φ−x = φ⊗ φ−X1 ⊗ (φ−X1)−X2 ⊗ · · · ⊗ (. . . ((φ−X1)−X2) · · ·)−Xn .

A second, repeated application of the idempotency axiom (7), from right to left, to the right hand side of the last equation then finally yields that the right hand side equals φ. □

Since projection is defined for arbitrary sets x of variables, it cannot, in general, be recovered from variable elimination, since the latter operation is only defined for finite sets. However, there is an important special case where this inversion is possible: If the domain d(φ) of every element φ ∈ Φ is finite, then the information algebra (Φ, D) is called locally finite dimensional. In this case projection can conversely be expressed by variable elimination. For x ⊆ d(φ) we have

φ↓x = φ−(d(φ)−x). (3.3)

The following theorem now states that the two ways of defining information algebras (Φ, D) and (Φ, V) over variables are in fact equivalent for locally finite dimensional information algebras.

Theorem 3.3 If (Φ, V) satisfies the system of axioms for variable elimination given above, and if it is locally finite dimensional, then, with projection defined by (3.3), the algebra (Φ, D), where D is the lattice of finite subsets of V, is an information algebra.

Proof. We have to verify axioms (2), (3) and (5) to (7) of an information algebra.

In order to prove axiom (2), apply lemma 3.5 (1) to the definition of projection (3.3),

d(φ↓x) = d(φ−(d(φ)−x)) = d(φ) − (d(φ) − x) = x.

Axiom (3) follows from the neutrality axiom (3) above for variable elimination and the definition of projection (3.3).

In order to prove axiom (5), assume d(φ) = z and x ⊆ y ⊆ z. Then, by the definition of projection (3.3) and lemma 3.5 (2),

(φ↓y)↓x = (φ−(z−y))−(y−x) = φ−(z−x) = φ↓x.


Next assume that d(φ) = x and d(ψ) = y. Then, by the labeling axiom, we have d(φ ⊗ ψ) = x ∪ y. Thus, using the definition of projection and lemma 3.5 (3), we obtain, since (x ∪ y) − x = y − (x ∩ y),

(φ ⊗ ψ)↓x = (φ ⊗ ψ)−((x∪y)−x) = φ ⊗ ψ−(y−(x∩y)) = φ ⊗ ψ↓x∩y.

This is the combination axiom (6) for labeled information algebras. The idempotency axiom (7) for labeled information algebras finally follows from Lemma 3.5 (4). □

In certain cases variable elimination is a more natural operation than projection. Let us present two examples which illustrate this.

Example 3.6 Affine Spaces. Consider a countable set of variables X1, X2, . . ., each with values in a field F. For a finite subset s ⊆ ω = {1, 2, . . .}, a mapping x : s → F is called an s-vector over the field F, and the set of all s-vectors is denoted by Fs. If i ∈ s, the value x(i) ∈ F is called the i-th component of the s-vector x. Clearly, for any s, Fs is a linear space, with addition x + y defined as usual by (x + y)(i) = x(i) + y(i) and scalar multiplication λ · x defined by (λ · x)(i) = λ · x(i) for any scalar λ ∈ F. The projection x↓t of an s-vector x to a subset t ⊆ s is simply the restriction of x to t.

Consider now a system of linear equations over a set of variables s:

∑j∈s ai,jXj = ai,0, i = 1, . . . , m.

The set of solutions φ = {x ∈ Fs : ∑j∈s ai,jxj = ai,0, i = 1, . . . , m} determines an affine subspace of Fs. Of course the same affine space may be determined by different systems of linear equations. The solution set φ may also be empty. Otherwise we may assume that m ≤ |s| and that the m equations are linearly independent. Then the associated homogeneous system

∑j∈s ai,jXj = 0, i = 1, . . . , m

has |s| − m linearly independent solutions b1, . . . , b|s|−m. If b0 is a solution of the original system, then

φ = {x ∈ Fs : x = b0 + λ1 · b1 + · · · + λ|s|−m · b|s|−m, λ1, . . . , λ|s|−m ∈ F}

is the affine solution space of the system of equations.

For t ⊇ s, we may transform any system of linear equations over s into a system of linear equations over t,

∑j∈t ai,jXj = ai,0, i = 1, . . . , m, where ai,j = 0 for j ∈ t − s.


The corresponding affine space in Ft is denoted by φ↑t, if φ is the affine space associated with the original system. Similarly, if S(φ) is a system of linear equations over s whose affine space is φ, then S(φ)↑t denotes the same system, but over t.

Let now Φs be the set of affine spaces in Fs and

Φ = ⋃s⊆ω Φs.

In Φ we may define operations of combination and variable elimination which satisfy the axioms of an information algebra.

1. Combination: Assume φ and ψ are two affine spaces over s and t respectively, with associated systems of linear equations S(φ) and S(ψ). Then we define the combined affine space φ ⊗ ψ by

φ ⊗ ψ = φ↑s∪t ∩ ψ↑s∪t.

Note that this new affine space is determined by the system of linear equations

S(φ)↑s∪t ∪ S(ψ)↑s∪t.

In terms of systems of linear equations, combination is essentially the union of the systems of equations.

2. Variable Elimination: Let φ be an affine space in Fs, defined by a system S(φ) of linear equations over s. Select k ∈ s. We want to define the operation of elimination of the variable Xk from φ. There are three cases to distinguish:

(a) Either Xk appears in no equation with coefficient ai,k ≠ 0. Then let S(φ)−Xk be the system S(φ), but with the term 0 · Xk eliminated in each equation. This is a system over s − {k}.

(b) Or Xk appears in exactly one equation with a coefficient ai,k ≠ 0. Then let S(φ)−Xk be the system S(φ) with this equation eliminated, and with, in the remaining equations, the term 0 · Xk eliminated everywhere. Again S(φ)−Xk is a system over s − {k}.

(c) Else, select one equation with ai,k ≠ 0 and solve it for Xk,

Xk = ai,0/ai,k − ∑h∈s−{k} (ai,h/ai,k) · Xh.

Then, replace Xk in all other equations by the right hand side above, and regroup terms:

∑h∈s−{k} (aj,h − aj,k · ai,h/ai,k) · Xh = aj,0 − aj,k · ai,0/ai,k,  j = 1, . . . , m, j ≠ i.


The resulting system of equations S(φ)−Xk contains one equation less than S(φ) and is a system over s − {k}.

It is easy to verify that variable elimination corresponds to the projection φ−Xk of the affine space φ in Fs to the set of variables s − {k}. It is not difficult to verify that Φ, equipped with the two operations of combination and variable elimination, satisfies the axioms of an information algebra.
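A compact Python sketch of this elimination step (our illustration; function and variable names are ours) with exact rational arithmetic: if no equation contains Xk, the system is returned unchanged up to the dropped zero terms, and otherwise a pivot equation is solved for Xk, substituted into the remaining equations, and dropped.

from fractions import Fraction

# An equation sum_j a[j]*X_j = a0 is a pair (a, a0), a a coefficient dict.

def eliminate(system, k):
    pivot = next((eq for eq in system if eq[0].get(k, 0) != 0), None)
    if pivot is None:                    # case (a): X_k occurs nowhere
        return [(dict(a), a0) for a, a0 in system]
    pa, pa0 = pivot
    out = []
    for a, a0 in system:
        if a is pa:                      # the pivot equation is dropped
            continue
        c = a.get(k, 0) / pa[k]          # multiplier for the substitution
        new_a = {j: a.get(j, 0) - c * pa.get(j, 0)
                 for j in set(a) | set(pa) if j != k}
        out.append((new_a, a0 - c * pa0))
    return out

# X + Y = 3 and X - Y = 1: eliminating X leaves -2*Y = -2, i.e. Y = 1.
system = [({'X': Fraction(1), 'Y': Fraction(1)}, Fraction(3)),
          ({'X': Fraction(1), 'Y': Fraction(-1)}, Fraction(1))]
print(eliminate(system, 'X'))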

The affine space defined by a single linear equation over s is called a hyperplane in Fs. Note then that any affine space in Fs is the combination of a finite number of hyperplanes in Fs.

This is a first example where the information elements φ in an algebra Φ are represented by expressions S(φ) of some formal language, in this case the language of linear equations over a field F. The association of elements φ with their representations S(φ) is not one-to-one, because different systems may define the same element φ. However, the basic operations of combination and variable elimination can be expressed in terms of the representations S(φ). As we shall see later (part IV), this is a very general situation. The next example is of the same kind.

Example 3.7 Convex Polytopes and Polyhedra. Instead of linear equations as in the previous example, we may look at linear inequalities. However, we fix here the field F to be the field of reals R and thus consider linear spaces Rs for finite subsets s ⊆ ω = {1, 2, . . .}. We consider systems of linear inequalities over s,

∑j∈s ai,jXj ≤ ai,0, i = 1, . . . , m.

The set of all s-vectors x satisfying such a system of linear inequalities,

φ = {x ∈ Rs : ∑j∈s ai,jxj ≤ ai,0, i = 1, . . . , m},

is called a convex polyhedron. Note that different systems of linear inequalities may define the same convex polyhedron.

As usual it may be convenient to use matrix notation. Let Am×s be the matrix with m rows and a column j for every j ∈ s. If Xs is the s-column vector of variables Xj, j ∈ s, and a0 the column vector with components ai,0, i = 1, . . . , m, then the system of linear inequalities above may be written as Am×sXs ≤ a0. If s ⊆ t, then A↑t m×s is the m × t matrix whose columns for j ∈ s correspond to those of Am×s, and whose columns for j ∈ t − s are 0-columns. The system A↑t m×s Xt ≤ a0 is then a system of linear inequalities over t, the extension of the system Am×s Xs ≤ a0. It defines a convex polyhedron over t, which clearly is nothing else than the cylindrical


extension of the polyhedron over s in Rt. This cylindrical extension of a convex polyhedron φ over s to t is denoted by φ↑t. Let S(φ) denote an arbitrary system of linear inequalities over s which determines φ. As before, S(φ)↑t denotes the extension of S(φ) to t.

We also admit the empty system of linear inequalities over s, i.e. the system consisting of no linear inequality. The corresponding convex polyhedron is then simply the whole space Rs. Also, a system of linear inequalities may be inconsistent, i.e. have no solution. The corresponding convex polyhedron is then the empty set (as a subset of Rs).

Let now Φs denote the set of all convex polyhedra over s, let D denote the lattice of all finite subsets s ⊆ ω, and

Φ = ⋃s∈D Φs.

We are now going to define the operations of combination and variable elimination among elements of Φ and claim that with these operations Φ forms an information algebra.

1. Combination: Assume φ and ψ are convex polyhedra over s and t respectively, with systems S(φ) and S(ψ) of linear inequalities determining the two polyhedra. Then we define the combined convex polyhedron φ ⊗ ψ by

φ⊗ ψ = φ↑s∪t ∩ ψ↑s∪t.

Note that this combined polyhedron is determined by the system

S(φ)↑s∪t ∪ S(ψ)↑s∪t.

Combination is simply the union of the respective systems of linear inequalities.

2. Variable Elimination: Let φ be a convex polyhedron over s determined by a system S(φ). If k ∈ s, let S(φ)+ denote the subsystem of linear inequalities with a positive coefficient ai,k > 0 of the variable Xk, whereas S(φ)− denotes the subsystem of linear inequalities with a negative coefficient ai,k < 0 of the variable Xk. If i is the index of an inequality in S(φ)+ and j the index of an inequality in S(φ)−, then these two inequalities may be rewritten as follows:

Xk ≤ ai,0/ai,k − ∑h∈s−{k} (ai,h/ai,k) · Xh,

Xk ≥ aj,0/aj,k − ∑h∈s−{k} (aj,h/aj,k) · Xh.


This leads then to an inequality in which the variable Xk is eliminated,

aj,0/aj,k − ∑h∈s−{k} (aj,h/aj,k) · Xh ≤ ai,0/ai,k − ∑h∈s−{k} (ai,h/ai,k) · Xh.

If we eliminate the variable Xk in this way from every pair of inequalities of S(φ)+ and S(φ)−, and retain the inequalities in which Xk does not occur, then we obtain a new system of linear inequalities which no longer contains the variable Xk. If either S(φ)+ or S(φ)− is empty, the inequalities containing Xk are simply dropped. This new system of linear inequalities corresponds to the projection φ−Xk of the original convex polyhedron φ to the set of variables s − {k}.

It can be readily verified that the system Φ of convex polyhedra satisfies the axioms of an information algebra. As in the case of the previous example 3.6, the elements φ of the algebra have a representation S(φ) in terms of a formal language, and the operations of the algebra (combination and variable elimination) are reflected in these representations.
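This pairwise elimination is the classical Fourier–Motzkin procedure. Below is a small Python sketch of one elimination step (our illustration; the names are ours), which pairs every upper bound on Xk with every lower bound and carries the inequalities not containing Xk over unchanged:

from fractions import Fraction

# An inequality sum_j a[j]*X_j <= a0 is a pair (a, a0), a a coefficient dict.

def eliminate(system, k):
    pos = [(a, a0) for a, a0 in system if a.get(k, 0) > 0]   # upper bounds
    neg = [(a, a0) for a, a0 in system if a.get(k, 0) < 0]   # lower bounds
    out = [(dict(a), a0) for a, a0 in system if a.get(k, 0) == 0]
    for ai, ai0 in pos:
        for aj, aj0 in neg:
            coeff = {h: ai.get(h, 0) / ai[k] - aj.get(h, 0) / aj[k]
                     for h in set(ai) | set(aj) if h != k}
            out.append((coeff, ai0 / ai[k] - aj0 / aj[k]))
    return out

# -X <= 0, X - Y <= 0 and -X <= -1: eliminating X yields -Y <= 0 and
# -Y <= -1, i.e. the projection Y >= 1 (the first inequality is redundant).
system = [({'X': Fraction(-1)}, Fraction(0)),
          ({'X': Fraction(1), 'Y': Fraction(-1)}, Fraction(0)),
          ({'X': Fraction(-1)}, Fraction(-1))]
print(eliminate(system, 'X'))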

The convex polyhedron associated with a single linear inequality over s is called a halfspace of Rs. The discussion above makes it clear that any convex polyhedron is the intersection, i.e. the combination, of a finite number of halfspaces. Halfspaces are thus a kind of generating elements of the algebra of convex polyhedra.

A bounded polyhedron is called a polytope. It is evident that combination of polytopes and variable elimination on a polytope yield again polytopes. Hence, polytopes form a subalgebra of the information algebra of convex polyhedra.

3.5 The Transport Operation

When we vacuously extend an information φ from its domain x to a larger domain y ≥ x by φ↑y = φ ⊗ ey, then we do not add any information, since (φ↑y)↓x = φ (see lemma 3.1 (10)). So φ and its vacuous extensions represent the same information. Pursuing this idea we define the following equivalence in an information algebra (Φ, D): For φ, ψ ∈ Φ, if d(φ) = x and d(ψ) = y, then

φ ≡ ψ (mod σ) if, and only if, φ↑x∨y = ψ↑x∨y. (3.4)

For further reference we define the following operation for φ ∈ Φ with d(φ) = x and any y ∈ D,

φ→y = (φ↑x∨y)↓y.

This is called the transport of an information φ from its domain x to another domain y. By (9) of Lemma 3.1 we also have

φ→y = (φ↓x∧y)↑y.


Note that, if y ≤ x, then transport is projection, and if x ≤ y, then transport is vacuous extension. Further, φ ≡ ψ (mod σ) holds if, and only if,

φ→y = ψ and ψ→x = φ.

Thus φ and ψ really represent the same information in this case, simply with reference to different domains. This also implies that

φ↓x∧y = ψ↓x∧y. (3.5)

The following property is fundamental for the theory to be developed.

Lemma 3.6 The relation defined in (3.4) is a congruence relative to combination, projection and transport in the information algebra (Φ, D).

Proof. We verify first that the relation is an equivalence. Reflexivity and symmetry are direct consequences of the definition. In order to prove transitivity, assume d(φ) = x, d(ψ) = y, d(η) = z and φ ≡ ψ (mod σ), ψ ≡ η (mod σ). This means that φ↑x∨y = ψ↑x∨y and ψ↑y∨z = η↑y∨z. By the transitivity of vacuous extension, see lemma 3.1 (8), we obtain φ↑x∨y∨z = ψ↑x∨y∨z = η↑x∨y∨z. Point (10) of lemma 3.1 then implies

φ↑x∨z = (φ↑x∨y∨z)↓x∨z = (η↑x∨y∨z)↓x∨z = η↑x∨z.

Thus we have φ ≡ η (mod σ).

Assume now φ1 ≡ ψ1 (mod σ) and φ2 ≡ ψ2 (mod σ), and let x1, x2, y1, y2 be the domains of φ1, φ2, ψ1, ψ2 respectively. Then the assumed equivalences, together with lemma 3.1 (5), allow the following derivation,

(φ1 ⊗ φ2)↑(x1∨x2)∨(y1∨y2) = (φ1 ⊗ φ2) ⊗ ey1∨y2
= (φ1 ⊗ ey1) ⊗ (φ2 ⊗ ey2)
= (ψ1 ⊗ ex1) ⊗ (ψ2 ⊗ ex2)
= (ψ1 ⊗ ψ2) ⊗ ex1∨x2
= (ψ1 ⊗ ψ2)↑(x1∨x2)∨(y1∨y2).

Thus we conclude that φ1 ⊗ φ2 ≡ ψ1 ⊗ ψ2 (mod σ).

Further we claim that φ ≡ ψ (mod σ) implies φ↓z = ψ↓z if z ≤ x ∧ y, where d(φ) = x and d(ψ) = y. Indeed, by the transitivity of projection and (3.5) above,

φ↓z = (φ↓x∧y)↓z = (ψ↓x∧y)↓z = ψ↓z.

So we have φ↓z ≡ ψ↓z (mod σ) whenever the two projections exist.

Note that for any z ∈ D we have φ→z = (φ ⊗ ez)↓z. By the congruence relative to combination and projection, we then have that φ ≡ ψ (mod σ) implies φ→z ≡ ψ→z (mod σ). In fact, according to the proof, we even have φ→z = ψ→z. So the congruence holds also relative to transport. □

The next lemma summarizes some elementary results concerning the transport operation.


Lemma 3.7 1. d(φ) = x implies φ→x = φ.

2. d(φ) = x and x, y ≤ z imply φ→y = (φ↑z)↓y.

3. y ≤ z implies φ→y = (φ→z)→y.

4. ∀φ ∈ Φ and ∀x, y ∈ D, (φ→x)→y = (φ→x∧y)→y.

5. d(φ) = x implies (φ⊗ ψ)→x = φ⊗ ψ→x.

Proof. (1) follows since φ→x = φ↓x = φ, as we saw above.

(2) Use first transitivity of vacuous extension (lemma 3.1 (8)) and of projection, together with the fact that x, y ≤ z implies y ≤ x ∨ y ≤ z, and then lemma 3.1 (10) to obtain

(φ↑z)↓y = (((φ↑x∨y)↑z)↓x∨y)↓y = (φ↑x∨y)↓y = φ→y.

(3) Assume d(φ) = x. Then use first the definition of transport, then transitivity of projection, followed by transitivity of vacuous extension (lemma 3.1 (8)) and of projection, and then lemma 3.1 (10),

(φ→z)→y = ((φ↑x∨z)↓z)↓y = (φ↑x∨z)↓y

= (((φ↑x∨y)↑x∨z)↓x∨y)↓y = (φ↑x∨y)↓y = φ→y.

(4) To prove this identity, apply first (3) just proved, then lemma 3.1 (9)

(φ→x∧y)→y = ((φ→x)→x∧y)→y = ((φ→x)↓x∧y)↑y = (φ→x)→y.

(5) This follows from the combination axiom,

(φ ⊗ ψ)→x = (φ ⊗ ψ)↓x = φ ⊗ ψ↓x∧y = φ ⊗ ex ⊗ ψ↓x∧y = φ ⊗ (ψ↓x∧y)↑x = φ ⊗ ψ→x.

□

Since the relation σ is a congruence relative to combination and transport, we may consider the congruence classes [φ]σ of the elements of Φ. The elements of each such class represent the same information. Therefore these classes can be considered as a representation of information without reference to a particular domain. Because σ is a congruence, we may define the operations of combination and focusing unambiguously by

[φ]σ ⊗ [ψ]σ = [φ ⊗ ψ]σ, [φ]σ⇒x = [φ→x]σ.

We may now consider the algebra of domain-free information associated with the information algebra (Φ, D). Denote the set of all congruence classes [φ]σ by Φ/σ. Note that Φ/σ is a commutative semigroup under combination. The class [ex] is its neutral element and the class [zx] its null element. Further, the operations of combination and focusing have interesting properties, as stated in the next theorem.


Theorem 3.4 1. ∀x, y ∈ D, ([φ]σ⇒x)⇒y = [φ]σ⇒x∧y,

2. ∀x ∈ D, ([φ]σ⇒x ⊗ [ψ]σ)⇒x = [φ]σ⇒x ⊗ [ψ]σ⇒x,

3. ∀x ∈ D, [ex]σ⇒x = [ex]σ and [zx]σ⇒x = [zx]σ,

4. ∀[φ]σ ∈ Φ/σ, ∃x ∈ D such that [φ]σ⇒x = [φ]σ.

Proof. (1) By lemma 3.7 (4), (φ→x)→y = (φ→x∧y)→y. But (φ→x∧y)→y = (φ→x∧y)↑y, hence (φ→x∧y)→y ≡ φ→x∧y (mod σ). Therefore ([φ]σ⇒x)⇒y = [(φ→x)→y]σ = [φ→x∧y]σ = [φ]σ⇒x∧y.

(2) Since d(φ→x) = x, we obtain, using lemma 3.7 (5),

([φ]σ⇒x ⊗ [ψ]σ)⇒x = [(φ→x ⊗ ψ)→x]σ = [φ→x ⊗ ψ→x]σ = [φ]σ⇒x ⊗ [ψ]σ⇒x.

(3) Follows from ex→x = ex and zx→x = zx.

(4) If d(φ) = x, then φ→x = φ, hence [φ]σ⇒x = [φ]σ. □

Let us illustrate the meaning of domain-free information by an example.

Example 3.8 Valuation and Interpretation in Propositional Logic. The language of logic can be used to describe information. As an example we consider here propositional logic. The vocabulary of the language of propositional logic consists of a countable number of propositional symbols, denoted p1, p2, . . ., the logical constants ⊥ and ⊤, the symbols ¬, ∧ and auxiliary symbols like parentheses. Propositional formulae or sentences are formed by the following inductive rules:

1. Each propositional variable is a formula.

2. The constants ⊥ and ⊤ are formulae.

3. If A is a formula, then (¬A) is a formula.

4. If A and B are formulae, then (A ∧ B) is a formula.

The information expressed by a formula of propositional logic is obtained through its semantics: A truth function is a mapping α which assigns to each propositional variable pi a truth value αi ∈ {f, t}. If α is a truth function, then define the extension of α (again denoted α) to all propositional formulae inductively as follows:

1. If pi is a propositional variable, then α(pi) = αi.

2. α(⊥) = f and α(⊤) = t.

3. If A is a propositional formula, then α(¬A) = f if α(A) = t, and α(¬A) = t if α(A) = f.

4. If A and B are formulae, then α(A ∧ B) = t if α(A) = α(B) = t, and α(A ∧ B) = f otherwise.

A truth function α is called a model of a propositional formula A if α(A) = t; we then write α |= A. The set of all models of a formula A will be denoted by r(A) = {α : α |= A}.

Let var(A) denote the set of variables of a formula A. The restriction of a truth function α to the set of variables var(A) is called an interpretation of A. Clearly, truth functions with the same interpretation of A yield the same value of A. So, if {f, t}ω is the set of all truth functions, then a propositional formula A determines the subset r(A) of its models.

Now consider the multivariate structure associated with the propositional variables pi, that is, Di = {f, t} and, for any finite subset J of variables, DJ = {f, t}J. If S is any subset of {f, t}ω and J a finite subset, let

S↓J = {x↓J : x ∈ S}.

Further, let ΦJ be the family of sets S↓J for all S ⊆ {f, t}ω and

Φ = ⋃J⊆ω, finite ΦJ.

Clearly Φ is an information algebra in the multivariate structure defined above. If S is a cylinder over the variables in J ∩ K, and φ = S↓J ∈ ΦJ, ψ = S↓K ∈ ΦK, then clearly

φ→K = ψ, ψ→J = φ.

We remark now that, if var(A) ⊆ J ∩ K for a propositional formula A, then φ = r(A)↓J and ψ = r(A)↓K are equivalent, φ ≡ ψ (mod σ), and indeed they represent the same information, expressed by the formula A. So a formula A is a domain-free expression of an information, which, however, by r(A)↓J may be focused any time onto the sets of variables of interest.

In this way propositional logic gives rise to an associated information algebra. If projection is neglected, the algebra corresponds to the Lindenbaum algebra of propositional logic. This is a Boolean algebra; we refer to Chapter 6 for a discussion of information algebras in relation to Boolean algebras.
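The construction of r(A) and its projections is easy to mirror computationally. The Python sketch below (ours, purely illustrative) enumerates truth assignments to compute r(A)↓J for a small formula and checks that two projections of r(A) to different supersets of var(A) carry the same information:

from itertools import product

def models(variables, pred):
    # r(A) restricted to the given variable set: all satisfying assignments,
    # each represented as a frozenset of (variable, truth value) pairs.
    vs = sorted(variables)
    return {frozenset(zip(vs, bits))
            for bits in product((False, True), repeat=len(vs))
            if pred(dict(zip(vs, bits)))}

def focus(phi, K):
    # Projection of a set of assignments to the variables in K.
    return {frozenset((v, b) for v, b in f if v in K) for f in phi}

# A = p1 ∧ (p2 ∨ ¬p3), considered over two supersets of var(A).
A = lambda a: a['p1'] and (a['p2'] or not a['p3'])
phi = models({'p1', 'p2', 'p3'}, A)              # r(A)↓J with J = var(A)
psi = models({'p1', 'p2', 'p3', 'p4'}, A)        # r(A)↓K with K = J ∪ {p4}
assert focus(psi, {'p1', 'p2', 'p3'}) == phi     # equivalent mod σ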

Instead of starting with labeled information, we may as well start with domain-free information directly and derive labeled information from it. This will be discussed in the following Section 3.6. It gives us an alternative way to look at information and its algebraic structure.


3.6 Domain-Free Algebra

Guided by the discussion in the previous Section 3.5, we introduce in this section an alternative algebra of information, which we call domain-free. We shall show how we can derive from it the old labeled variant. This then demonstrates that domain-free and labeled information algebras are two sides of the same coin, two different representations of the same thing, i.e. information as related to questions.

So, let Ψ be a set of elements, called domain-free pieces of information. Let further, as before, D be a lattice of domains. Suppose that two operations are defined,

1. Combination: ⊗ : Ψ × Ψ → Ψ; (φ, ψ) 7→ φ ⊗ ψ,

2. Focusing: ⇒ : Ψ × D → Ψ; (φ, x) 7→ φ⇒x.

As before, the elements of Ψ are thought to represent pieces of knowledge or information, this time independent of any domain or question, i.e. in a domain-free way. The operation of combination is again introduced in order to represent the aggregation of different pieces of information into a new piece of information, called the combined information. Focusing, on the other hand, is thought to represent the extraction of the information relative to a given domain x, i.e. removing the part not pertaining to the domain x. As we shall see in a moment, focusing is as closely related to variable elimination as projection is in the labeled version.

We require the system (Ψ, D) to satisfy the following system of axioms:

1. Semigroup: Ψ is associative and commutative under combination.

2. Support: ∀ψ ∈ Ψ, ∃x ∈ D such that ψ⇒x = ψ.

3. Neutrality: ∃e ∈ Ψ such that ∀ψ ∈ Ψ, ψ ⊗ e = ψ, and ∀x ∈ D, e⇒x = e.

4. Nullity: ∃z ∈ Ψ such that ∀ψ ∈ Ψ, ψ ⊗ z = z, and ∀x ∈ D, z⇒x = z.

5. Focusing: ∀ψ ∈ Ψ and ∀x, y ∈ D, (ψ⇒x)⇒y = ψ⇒x∧y.

6. Combination: ∀φ, ψ ∈ Ψ and ∀x ∈ D, (φ⇒x ⊗ ψ)⇒x = φ⇒x ⊗ ψ⇒x.

7. Idempotency: ∀ψ ∈ Ψ, and ∀x ∈ D, ψ ⊗ ψ⇒x = ψ.

These axioms require that Ψ is a commutative semigroup under combination, which contains both a neutral element e and a null element z. Both of these elements are unique. A domain x such that ψ⇒x = ψ is called a support of ψ. It means that the full content of the piece of information ψ pertains to the domain x. The support axiom requires that every element of Ψ has such a support. The focusing axiom means that extracting the information pertaining to a domain x and then to a domain y is the same as extracting the information pertaining to the infimum domain x ∧ y, i.e. the “common” part of the two questions represented by x and y. The combination axiom states that combining a piece of information pertaining to a domain x with any other information and then extracting the part pertaining to x from the aggregated information is the same as first extracting the part of the second information pertaining to x and then combining it with the first one. Idempotency finally says that aggregating a piece of information with a part of itself gives nothing new.

As with labeled information algebras, we may sometimes relax some axioms. The existence of the neutral element need not be postulated. We may always adjoin a neutral element e by setting φ ⊗ e = e ⊗ φ = φ for all elements φ ∈ Ψ and e⇒x = e for all x ∈ D. The support axiom could also be omitted in certain cases.

A system (Ψ, D) satisfying these axioms is called a domain-free information algebra. As we have seen in the previous Section 3.5, and in particular in Theorem 3.4, a domain-free algebra is associated with any labeled information algebra. We shall show below that the converse is true too. But first let us give an example of a domain-free information algebra.

Example 3.9 System of Subsets. A rather trivial example of a domain-free information algebra is given by the subsets of a universe U, together with a lattice D of partitions of U containing as top element the partition into singletons, which we identify with U. Combination is simply intersection. If P is a partition in D, and φ any subset of the universe U, then φ⇒P is defined as

φ⇒P = ⋃{B ∈ P : B ∩ φ ≠ ∅}.

Each element is supported by U. It is left as an exercise to verify the axioms. If the universe does not belong to the lattice D, then the subsets must be limited to those which are unions of blocks of some partition in D.
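A tiny Python illustration of this focusing operation (our sketch, not from the text): with combination as intersection, φ⇒P is the union of the blocks of the partition meeting φ, and idempotency is immediate.

def focus(phi, partition):
    # phi^{=>P}: the union of all blocks of the partition that meet phi.
    return frozenset().union(*(B for B in partition if B & phi))

P = [frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5})]   # partition of U
phi = frozenset({1, 4})
assert focus(phi, P) == frozenset({0, 1, 4, 5})
# Idempotency: phi ⊗ phi^{=>P} = phi, since combination is intersection.
assert phi & focus(phi, P) == phi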

In a domain-free information algebra, we call two elements φ, ψ ∈ Φ contradictory or incompatible if

φ⊗ ψ = z.

Here are a few results on incompatible elements:

Lemma 3.8 1. ∀x ∈ D and φ ∈ Φ, φ⇒x = z implies φ = z.

2. ∀x ∈ D and φ, ψ ∈ Φ we have φ⊗ψ⇒x = z if, and only if, φ⇒x⊗ψ = z.

3. ∀x, y ∈ D and φ, ψ ∈ Φ we have φ⇒x ⊗ ψ⇒y = z if, and only if,φ⇒y ⊗ ψ⇒x = z.


Proof. (1) By idempotency, φ = φ ⊗ φ⇒x = φ ⊗ z = z.

(2) Suppose φ ⊗ ψ⇒x = z. Then, by the combination and nullity axioms,

z = (φ ⊗ ψ⇒x)⇒x = φ⇒x ⊗ ψ⇒x = (φ⇒x ⊗ ψ)⇒x.

Therefore, by (1) above, φ⇒x ⊗ ψ = z. The converse follows by symmetry.

(3) Suppose φ⇒x ⊗ ψ⇒y = z. Note that (φ⇒x)⇒y = φ⇒x∧y = (φ⇒y)⇒x by the focusing axiom. Then, by the combination axiom and (2) above,

z = (φ⇒x ⊗ ψ⇒y)⇒y = (φ⇒x)⇒y ⊗ ψ⇒y = (φ⇒y)⇒x ⊗ ψ⇒y
= φ⇒y ⊗ (ψ⇒y)⇒x = φ⇒y ⊗ (ψ⇒x)⇒y = (φ⇒y ⊗ ψ⇒x)⇒y.

Therefore, by (1) above, we have φ⇒y ⊗ ψ⇒x = z. The converse follows by symmetry. □

Support is an important concept of information. The following lemma collects a few properties of support.

Lemma 3.9 1. ∀ψ ∈ Ψ and ∀x ∈ D, x is a support of ψ⇒x.

2. ∀ψ ∈ Ψ and ∀x, y ∈ D, if x is a support of ψ, then x is a support of ψ⇒y.

3. ∀ψ ∈ Ψ and ∀x, y ∈ D, if x and y are supports of ψ, then x ∧ y is a support of ψ.

4. ∀ψ ∈ Ψ and ∀x, y ∈ D, if x is a support of ψ, then x ∧ y is a support of ψ⇒y.

5. ∀ψ ∈ Ψ and ∀x, y ∈ D, if x is a support of ψ, then ψ⇒y = ψ⇒x∧y.

6. ∀ψ ∈ Ψ and ∀x, y ∈ D, if x is a support of ψ and x ≤ y, then y is a support of ψ.

7. ∀φ, ψ ∈ Ψ and ∀x ∈ D, if x is a support of φ and of ψ, then x is a support of φ ⊗ ψ.

8. ∀φ, ψ ∈ Ψ and ∀x, y ∈ D, if x is a support of φ and y is a support of ψ, then x ∨ y is a support of φ ⊗ ψ.

Proof. (1) The focusing axiom implies that (ψ⇒x)⇒x = ψ⇒x∧x = ψ⇒x.

(2) Again by the focusing axiom we derive (ψ⇒y)⇒x = ψ⇒x∧y = (ψ⇒x)⇒y = ψ⇒y, since ψ⇒x = ψ.

(3) By the focusing axiom again, ψ⇒x∧y = (ψ⇒x)⇒y = ψ⇒y = ψ.

(4) Once more by the focusing axiom, we conclude (ψ⇒y)⇒x∧y = ψ⇒x∧y, since y ∧ (x ∧ y) = x ∧ y; and ψ⇒x∧y = ψ⇒y by (5) below.

(5) Here we use the focusing axiom to deduce that ψ⇒y = (ψ⇒x)⇒y = ψ⇒x∧y.

(6) We have ψ = ψ⇒x = ψ⇒x∧y since x = x ∧ y. But then the focusing axiom gives us ψ⇒x∧y = (ψ⇒x)⇒y = ψ⇒y. This proves that ψ = ψ⇒y.

(7) Here we use the combination axiom to obtain (φ ⊗ ψ)⇒x = (φ⇒x ⊗ ψ)⇒x = φ⇒x ⊗ ψ⇒x = φ ⊗ ψ.

(8) Follows from (6) and (7). □

We now use these results to construct a labeled information algebra from a domain-free one: Let (Ψ, D) be a domain-free algebra. For all x ∈ D define

Φx = {(ψ, x) : ψ ∈ Ψ, ψ⇒x = ψ}.

In words: Φx contains all pairs (ψ, x) of elements ψ from Ψ which have x as a support. Let further

Φ = ⋃x∈D Φx.

We now define in Φ the three operations of labeling, combination and projection, where labeling is simply the extraction of the support from a pair (ψ, x), and combination and projection are defined from the operations of combination and focusing in the domain-free algebra (Ψ, D):

1. Labeling: ∀(ψ, x) ∈ Φ define

d(ψ, x) = x. (3.6)

2. Combination: ∀(φ, x), (ψ, y) ∈ Φ define

(φ, x)⊗ (ψ, y) = (φ⊗ ψ, x ∨ y). (3.7)

3. Projection: ∀(ψ, x) ∈ Φ and ∀y ∈ D such that y ≤ x, define

(ψ, x)↓y = (ψ⇒y, y). (3.8)

Note that combination and projection do indeed give results within Φ (use Lemma 3.9 to verify this). The next theorem states that (Φ, D) with the operations as defined above is a labeled information algebra.
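Schematically, the whole construction is a thin wrapper around the domain-free operations. The Python sketch below (ours, with hypothetical parameter names) takes combination, focusing and the join of the domain lattice as inputs and realizes labeling, combination and projection exactly as in (3.6), (3.7) and (3.8):

class Labeled:
    # Derive the labeled algebra from a domain-free one, given its
    # operations combine(phi, psi) and focus(phi, x) and the lattice join.
    def __init__(self, combine, focus, join):
        self.combine, self.focus, self.join = combine, focus, join

    def make(self, psi, x):
        # Admit the pair (psi, x) only if x is a support of psi.
        assert self.focus(psi, x) == psi, "x must be a support of psi"
        return (psi, x)

    def d(self, pair):                       # labeling, cf. (3.6)
        return pair[1]

    def comb(self, p, q):                    # combination, cf. (3.7)
        (phi, x), (psi, y) = p, q
        return (self.combine(phi, psi), self.join(x, y))

    def proj(self, pair, y):                 # projection, cf. (3.8),
        return (self.focus(pair[0], y), y)   # for y below the label of pair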

Theorem 3.5 (Φ, D) with labeling, combination and projection defined by (3.6), (3.7), and (3.8), respectively, is a labeled information algebra.

Proof. (1) The semigroup axiom follows from the associativity and commutativity of combination in Ψ and of join in the lattice D.

(2) The labeling axiom follows from the definitions of labeling, combination and projection above.

(3) Note that all x ∈ D are supports of the neutral element e ∈ Ψ. Therefore (e, x) ∈ Φ for all domains x. This is certainly a neutral element of combination in Φx. Further, we have by definition of projection (e, x)↓y = (e⇒y, y) = (e, y).


(4) Similarly, all x ∈ D are supports of the null element z ∈ Ψ, hence (z, x) ∈ Φ for all x. The elements (z, x) are null elements in Φx. Further, for x ≤ y, (z, x) ⊗ (e, y) = (z ⊗ e, x ∨ y) = (z, y).

(5) Using the focusing axiom, we obtain by the definition of projection (3.8), assuming that z is a support of φ and x ≤ y ≤ z,

((φ, z)↓y)↓x = (φ⇒y, y)↓x = ((φ⇒y)⇒x, x) = (φ⇒x∧y, x) = (φ⇒x, x),

since x ≤ y implies x ∧ y = x. This is the projection axiom for labeled algebras.

(6) Assume that x is a support of φ and y a support of ψ. Then the combination axiom of the domain-free algebra allows us to deduce

((φ, x) ⊗ (ψ, y))↓x = (φ ⊗ ψ, x ∨ y)↓x = ((φ ⊗ ψ)⇒x, x) = (φ ⊗ ψ⇒x, x).

But, since y is a support of ψ, we have ψ⇒x = ψ⇒x∧y (see lemma 3.9 (5)). Thus we obtain further

((φ, x) ⊗ (ψ, y))↓x = (φ ⊗ ψ⇒x∧y, x) = (φ, x) ⊗ (ψ⇒x∧y, x ∧ y) = (φ, x) ⊗ (ψ, y)↓x∧y.

This proves the combination axiom of labeled algebras.

(7) Let x be a support of φ and y ≤ x. Then, by the idempotency axiom for domain-free algebras,

(φ, x) ⊗ (φ, x)↓y = (φ, x) ⊗ (φ⇒y, y) = (φ ⊗ φ⇒y, x) = (φ, x).

So the idempotency axiom for labeled algebras holds too. □

If we start with a labeled information algebra (Φ, D), pass to the corresponding domain-free algebra (Φ/σ, D) and then from there to the corresponding labeled algebra (Φ∗, D), then we would expect the latter to be essentially the original algebra. Similarly, if we start with a domain-free algebra (Ψ, D), derive the associated labeled algebra (Ψ∗, D) and construct from it the domain-free algebra (Ψ∗/σ, D), we expect the latter to be essentially equal to the first one.

Theorem 3.6 With the notations introduced above, (Φ, D) is isomorphic to (Φ∗, D) and (Ψ, D) to (Ψ∗/σ, D).

Proof. Define a mapping h from (Φ, D) to (Φ∗, D) as follows: for φ ∈ Φ let

h(φ) = ([φ]σ, d(φ)) ∈ Φ∗.


The mapping h is one-to-one, since ([φ]σ, d(φ)) = ([ψ]σ, d(ψ)) implies φ ≡ ψ (mod σ) and d(φ) = d(ψ), which together imply φ = ψ. It is a mapping onto Φ∗, since for an element ([ψ]σ, x) ∈ Φ∗ there must be a φ ∈ Φ with [φ]σ = [ψ]σ (namely φ = ψ→x) and d(φ) = x. Finally, h is a homomorphism, since for φ, ψ ∈ Φ with d(φ) = x, d(ψ) = y, and z ∈ D with z ≤ x,

h(φ ⊗ ψ) = ([φ ⊗ ψ]σ, x ∨ y) = ([φ]σ, x) ⊗ ([ψ]σ, y) = h(φ) ⊗ h(ψ),
h(φ↓z) = ([φ↓z]σ, z) = ([φ]σ⇒z, z) = ([φ]σ, x)↓z = (h(φ))↓z,
h(ex) = ([ex]σ, x), h(zx) = ([zx]σ, x),

and the elements on the right hand side of the last two equations are the neutral and null elements on the domain x of the algebra (Φ∗, D). This proves the first part of the theorem.

The second part is proved similarly. □

Just as in labeled information algebras related to a multivariate structure, we may also define the operation of variable elimination in domain-free algebras. As before, denote by V the countable set of variables {X1, X2, . . .} and let D be the associated lattice of domains. Consider a domain-free information algebra (Φ, D). Let x be a support of φ ∈ Φ. For any variable X ∈ V we then define

φ−X = φ⇒x−{X}. (3.9)

We remark that this definition is independent of the support of φ used. In fact, if y is another support of φ, then according to the focusing axiom we see that

φ⇒x−{X} = (φ⇒y)⇒x−{X} = φ⇒(x∩y)−{X} = (φ⇒x)⇒y−{X} = φ⇒y−{X}.

For this new operation we prove the following basic set of results:

Lemma 3.10 Let (Φ, D) be a domain-free information algebra with D the lattice of domains associated with the set of variables V. Then the following holds:

1. Commutativity of Variable Elimination. ∀φ ∈ Φ and X,Y ∈ V ,

(φ−X)−Y = (φ−Y )−X .

2. Combination. ∀φ, ψ ∈ Φ and X ∈ V ,

(φ−X ⊗ ψ)−X = φ−X ⊗ ψ−X .

3. Idempotency. ∀φ ∈ Φ and X ∈ V ,

φ⊗ φ−X = φ.


4. Neutrality and Nullity. ∀X ∈ V ,

e−X = e, z−X = z.

Proof. (1) Let x be a support of φ. Then x − {X} is a support of φ⇒x−{X} (lemma 3.9 (1)). Thus, by the definition of variable elimination (3.9) and the focusing axiom of domain-free information algebras, we obtain

(φ−X)−Y = (φ⇒x−{X})⇒x−{X,Y} = φ⇒x−{X,Y} = (φ⇒x−{Y})⇒x−{X,Y} = (φ−Y)−X.

(2) Here we use the combination axiom for domain-free information algebras, together with the definition of variable elimination (3.9). Let x be a support of φ and y a support of ψ. Then x ∪ y is a support of φ, ψ and φ ⊗ ψ (lemma 3.9 (6) and (8)). Hence

(φ−X ⊗ ψ)−X = (φ⇒(x∪y)−{X} ⊗ ψ)⇒(x∪y)−{X} = φ⇒(x∪y)−{X} ⊗ ψ⇒(x∪y)−{X} = φ−X ⊗ ψ−X.

(3) follows directly from the idempotency axiom for domain-free algebras and the definition of variable elimination.

(4) follows since, by the neutrality and nullity axioms of domain-free algebras, all domains x are supports of both e and z. □

As usual, we identify in a domain-free information algebra (Φ, D) over a multivariate structure the lattice D with the corresponding lattice of subsets of the set V of variables. We then call a domain-free information algebra (Φ, D) locally finite dimensional if every element φ ∈ Φ has a support x of finite cardinality. Clearly, the domain-free algebra derived from a locally finite dimensional labeled information algebra is locally finite dimensional. If x and y are two finite support sets of φ, then x ∩ y is also a finite support set of φ (lemma 3.9 (3)). Hence, in a locally finite dimensional information algebra, any information element φ ∈ Φ has a unique finite least support ∆(φ). This set is called the dimension set of φ. In terms of variable elimination, we can define this set alternatively as

∆(φ) = {X ∈ V : φ−X ≠ φ}.

The cardinality of the dimension set is called the dimension of the information element. The neutral and the null element have dimension zero. Clearly, if V has finite cardinality, then (Φ, D) is locally finite dimensional. But even if V is not finite, there are locally finite dimensional information algebras.
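For the subset algebra over finitely many values, the dimension set can be computed directly from this characterization. In the Python sketch below (ours, not from the text), φ−X ≠ φ is tested by checking whether cylindrically re-extending φ−X over X recovers φ:

def eliminate(phi, X):
    # phi^{-X} in the subset algebra: drop X from every tuple.
    return {frozenset((v, a) for v, a in f if v != X) for f in phi}

def extend(phi, X, values):
    # Cylindric extension: re-introduce X with every possible value.
    return {f | {(X, a)} for f in phi for a in values}

def dimension_set(phi, variables, values):
    # Delta(phi): X is redundant exactly if re-extending phi^{-X} over X
    # restores phi; otherwise phi genuinely depends on X.
    return {X for X in variables
            if extend(eliminate(phi, X), X, values) != phi}

# phi fixes Y = 0 and leaves X completely free, so Delta(phi) = {'Y'}.
phi = {frozenset({('X', 0), ('Y', 0)}), frozenset({('X', 1), ('Y', 0)})}
assert dimension_set(phi, {'X', 'Y'}, {0, 1}) == {'Y'}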

Here are two simple properties of dimension sets:


Lemma 3.11 1. ∀φ ∈ Φ and ∀x ∈ D,

∆(φ⇒x) ⊆ ∆(φ).

2. ∀φ, ψ ∈ Φ,

∆(φ⊗ ψ) ⊆ ∆(φ) ∪∆(ψ).

Proof. (1) ∆(φ) is a support of φ. By lemma 3.9 (2) it is also a support of φ⇒x, thus ∆(φ⇒x) ⊆ ∆(φ).

(2) Since ∆(φ) and ∆(ψ) are supports of φ and ψ, by lemma 3.9 (8) ∆(φ) ∪ ∆(ψ) is a support of φ ⊗ ψ, thus ∆(φ ⊗ ψ) ⊆ ∆(φ) ∪ ∆(ψ). □

In a locally finite dimensional algebra, we may recover focusing from variable elimination, just as projection can be recovered in the corresponding case of labeled information algebras. First we define, for any finite set x = {X1, X2, . . . , Xm} of variables,

φ−x = (· · · ((φ−X1)−X2) · · ·)−Xm .

According to the commutativity of variable elimination (lemma 3.10 (1)), φ−x is well defined and independent of the sequence of the variable eliminations. Then define

φ⇒x = φ−(∆(φ)−x).

This definition is valid for finite sets x, but it can be extended to arbitrary sets y. Note that, if y is not finite, then φ⇒y = (φ⇒∆(φ))⇒y = φ⇒∆(φ)∩y.

As in the case of labeled information algebras, we may here too recover the axioms of a domain-free information algebra. So, in a multivariate setting, we may just as well work with variable elimination instead of focusing if the algebras are locally finite dimensional. But we may also consider algebras with variable elimination which are not locally finite dimensional, as the next example shows.

Example 3.10 Cylindric Algebras. In order to develop an algebraic theory of predicate logic, so-called cylindric algebras were introduced (Henkin et al., 1971). These algebras have as a base a Boolean algebra Φ together with an operation of variable elimination φ−X for all φ ∈ Φ and X an element of some set. For this operation, the properties of commutativity of variable elimination and of combination (with combination replaced by meet) and nullity of lemma 3.10 above are required. In addition, for all φ ∈ Φ and X,

φ ≤ φ−X

is added as an axiom. This last property implies the neutrality property of lemma 3.10 above: φ ≤ ψ means that φ ∨ ψ = ψ; hence, since the neutral element e of meet is the top element, e−X = e ∨ e−X = e.


Since φ ≤ ψ also means that φ ∧ ψ = φ, the last property also implies idempotency (with respect to meet).

Cylindric algebras have an additional operation, called diagonalization. If this operation is dropped, the algebra is called a diagonal-free algebra. Thus, a diagonal-free algebra is a domain-free information algebra with meet as combination, although a particular one, since Φ is a Boolean algebra. In the cylindric algebra associated with predicate logic, variable elimination corresponds to existential quantification; see the following example.

Example 3.11 Information Algebras Related to Predicate Logic. As we have seen in Example 3.8, propositional logic can be used to express information about the truth of certain propositions or combinations of them. This is reflected in the information algebra associated with propositional logic. Now, more generally, predicate logic can be used to express information about the values of certain variables. This is explained in this example. As in propositional logic, this leads to information algebras related to predicate logic.

As a language, predicate logic has a vocabulary consisting of a countable family of variables, denoted by X1, X2, . . ., constants ⊥, ⊤, a finite or countable family of non-logical constants denoted by P1, P2, . . ., and the symbols ∧, ∨, ¬ and ∃. To each non-logical constant Pi a definite, finite rank ρi (a non-negative integer) is associated. From this vocabulary expressions are formed, based on the following inductive rules:

1. PiXi1 . . . Xiρi is an expression for any non-logical constant Pi.

2. ⊥ and ⊤ are expressions.

3. If φ is an expression, then ¬φ and ∃Xi[φ] are expressions.

4. If φ and ψ are expressions, then so are φ ∧ ψ and φ ∨ ψ.

Expressions (1) and (2) are called atomic. The predicate language Λ is the set of expressions obtained from atomic expressions by applying the rules (3) and (4) above a finite number of times. It is convenient to introduce a number of further expressions as shorthands for some often used formulas,

φ → ψ = (¬φ) ∨ ψ,
φ ↔ ψ = (φ → ψ) ∧ (ψ → φ).

In an expression ∃Xi[φ] the expression φ is called the scope of the existential quantification of Xi, and Xi is called quantified in ∃Xi[φ]. A variable is bound in an expression φ if it appears only quantified; otherwise it is called free. For instance, the variable X1 is free in the expression P1X1 ∧ ∃X1[P2X1X2 ∨ P3X2], but bound in the expression P1X2 ∧ ∃X1[P2X1X2 ∨ P3X2]. An expression in which all occurring variables


are bound is called a sentence. Expressions of the predicate language Λ are also called formulas (including sentences).

Formulae of predicate logic are interpreted using relational structures. A relational structure is defined by a set U, the domain, and a family R1, R2, . . . of relations in U, where to each non-logical constant Pi corresponds exactly one relation Ri of the same rank ρi.¹ Such a relational structure is denoted by R = <U, R1, R2, . . .>. Further, a valuation is a mapping v : {1, 2, . . .} → U. By a valuation, the variable Xi is assigned the value v(i). A valuation v can be extended to a truth valuation vR of formulas of Λ relative to a relational structure by the following inductive rules:

1. vR(PiXi1 . . . Xiρi) = t if the values v(i1), . . . , v(iρi) satisfy the relation Ri, and vR(PiXi1 . . . Xiρi) = f otherwise.

2. vR(⊥) = f and vR(⊤) = t.

3. vR(¬φ) = f if vR(φ) = t, and vR(¬φ) = t if vR(φ) = f.

4. vR(∃Xi[φ]) = t if there exists a valuation u with u(j) = v(j) for j ≠ i such that uR(φ) = t, and vR(∃Xi[φ]) = f otherwise.

5. vR(φ ∧ ψ) = t if vR(φ) = t and vR(ψ) = t, and vR(φ ∧ ψ) = f otherwise.

6. vR(φ ∨ ψ) = t if vR(φ) = t or vR(ψ) = t, and vR(φ ∨ ψ) = f otherwise.

For a sentence σ, the truth value vR(σ) does not depend on the valuation v used, but only on the relational structure R. If vR(σ) = t for some relational structure R, we say that σ holds in R and that the latter is a model of σ. We then write R |= σ. If Σ is a set of sentences, then we say Σ holds in R if all σ ∈ Σ hold in R, i.e. R |= Σ if R |= σ for all σ ∈ Σ. The relational structure R is then also called a model of Σ. If φ is a formula containing the free variables Xi1 , . . . , Xin , then vR(φ) depends only on the values v(i1), . . . , v(in). It may be that vR(φ) = t for all valuations v. Then we say φ holds in R.

Let now Σ be a set of sentences and φ a formula of Λ. Then φ is called a logical consequence of Σ, or φ is implied by Σ, if φ holds in all models of Σ; we then write Σ |= φ. The set of all sentences

C(Σ) = {σ : σ is a sentence, Σ |= σ}

which are logical consequences of Σ is called a theory, or the theory axiomatized by Σ. A theory Θ is closed, that is, it contains all sentences it implies, C(Θ) = Θ. Given a relational structure R, the set

ΘR = {σ : σ is a sentence, R |= σ}

¹In many-sorted logic, a different domain may be associated with each variable.


is always a theory, the (first order) theory of R. More generally, any set of relational structures induces in this way a theory, the set of all sentences which hold in all structures of the set. The theory of the class of all structures is the set of tautologies. Any theory Θ is the theory of some (possibly empty) set of structures, e.g. the set of all its models.

Now we are in a position to introduce domain-free information algebras related to predicate logic. Let Σ be a set of sentences and φ and ψ formulas of Λ. Then

φ ≡Σ ψ if Σ |= φ ↔ ψ

is an equivalence in Λ. Let [φ]Σ, [ψ]Σ, etc. denote the corresponding equivalence classes and FΣ the family of these equivalence classes. Then the following operations are well-defined in FΣ:

1. [φ]Σ ∧ [ψ]Σ = [φ ∧ ψ]Σ.

2. [φ]Σ ∨ [ψ]Σ = [φ ∨ ψ]Σ.

3. ([φ]Σ)c = [¬φ]Σ.

4. ([φ]Σ)−Xi = [∃Xi[φ]]Σ.

It can easily be verified that FΣ is, relative to the first three operations, a Boolean algebra, and becomes, by adding the fourth operation, a cylindric algebra, hence an information algebra. Of course, if Θ is the theory of Σ, then FΣ = FΘ. These algebras are also called algebras of (predicate logic) formulas.

The most interesting case arises if we consider the algebra FΘR associated with the theory ΘR of a single relational structure R. Note that in this case

φ ≡ΘR ψ if vR(φ) = vR(ψ) for all valuations v.

We write in this case simply φ ≡R ψ, [φ]R for the corresponding equivalence classes and FR for the associated information algebra. Define the set

rR(φ) = {v : vR(φ) = t}

of valuations v under which φ is true. In other words, stating the formula φ, we indicate the valuations in rR(φ) to be the ones which may contain the “real world”. This is the information conveyed by φ. The set rR(φ) is a subset of Uω. It is a cylindrical set over the free variables Xi1 , . . . , Xin of φ; that is, if a valuation v belongs to rR(φ), then any other valuation u with u(i1) = v(i1), . . . , u(in) = v(in) belongs to rR(φ) too. The free variables form the base of the cylindrical set.


The cylindrical sets rR(φ) form a cylindric algebra with set-intersection, union and complementation as Boolean operations and cylindrification for variable elimination, i.e. for S ⊆ Uω,

S−Xi = {u ∈ Uω : u(j) = v(j) for j ≠ i, for some v ∈ S}.

It is evident that the following relations hold:

rR(φ ∧ ψ) = rR(φ) ∩ rR(ψ),
rR(φ ∨ ψ) = rR(φ) ∪ rR(ψ),
rR(¬φ) = rR(φ)c,
rR(∃Xi[φ]) = rR(φ)−Xi.

The algebra of formulas and the corresponding cylindrical set algebra are essentially the same (isomorphic). In particular, existential quantification translates into cylindrification of information. Since all elements of the set algebra have a finite base, the algebra is called locally finite. Any locally finite algebra is isomorphic to an algebra of formulas. However, there are cylindrical set algebras which are not locally finite, for example the algebra of all subsets of Uω. Such an algebra does not correspond to an algebra of formulas, at least not for a language like Λ, where any formula has finite length. We refer to (Langel, 2009) for more on information algebras and predicate logic.

Example 3.12 Quantifier Algebra. The previous example can be generalized by abstracting the operation of existential quantification. If Φ is a Boolean algebra or, more generally, a semilattice, then an existential quantifier is a mapping ∃ : Φ → Φ subject to the following conditions:

1. ∃⊥ = ⊥, if ⊥ is the least element in Φ.

2. φ ∧ ∃φ = φ,

3. ∃(φ ∧ ∃ψ) = ∃φ ∧ ∃ψ.

Note that any variable elimination defined by φ 7→ φ−X in a domain-free information algebra is an existential quantifier.

More generally, let D be a lattice of subsets of some set I. Assume that there is an existential quantifier ∃(J) for every subset J ∈ D on the Boolean algebra Φ, and that

1. ∃(∅)φ = φ,

2. If J, K ∈ D, then ∃(J ∪ K)φ = ∃(J)(∃(K)φ).


Then (Φ, D) is termed a quantifier algebra over D. If we take the meet operation of the Boolean algebra Φ for combination and define φ⇒I−J = ∃(J)φ, then (Φ, Dc) is a domain-free information algebra. Here Dc is the lattice of subsets I − J for J ∈ D.

This can also be seen more in the spirit of variable elimination. If, for all i ∈ I, ∃(i) is an existential quantifier, and ∃(i)∃(j) = ∃(j)∃(i) for i ≠ j, one can define ∃(J) = ∃(i1) · · · ∃(im) if J = {i1, . . . , im}, and ∃(∅)φ = φ. Then (Φ, D) is a quantifier algebra.

For instance, let Φ be the powerset of some cartesian product of sets Ui for i ∈ I, and D a lattice of subsets of I. The mapping ∃(J) is defined by

∃(J)A = {b ∈ ∏i∈I Ui : ∃a ∈ A such that bi = ai for all i ∉ J}.

It can be shown that this is an existential quantifier and that (Φ, D) forms a quantifier algebra. It is clear that the operation ∃(i) is similar to variable elimination. Note also that it is sufficient for Φ to be a semilattice, i.e. a partially ordered set where meet is defined, in order to define existential quantification and then a quantifier algebra (Φ, D).
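For finite value sets this quantifier can be written down directly. The Python sketch below (our illustration; the names are ours) implements ∃(J) on the powerset of a finite product and checks the two defining conditions on a toy example:

from itertools import product

def exists(A, J, U):
    # E(J)A: the coordinates in J become free, the rest keep their values.
    # Tuples are frozensets of (index, value) pairs over all indices of U.
    out = set()
    for a in A:
        fixed = frozenset((i, v) for i, v in a if i not in J)
        for combo in product(*[[(i, u) for u in U[i]] for i in J]):
            out.add(fixed | frozenset(combo))
    return out

U = {1: {0, 1}, 2: {0, 1}}
A = {frozenset({(1, 0), (2, 1)})}
assert exists(A, set(), U) == A                                   # E(∅)A = A
assert exists(exists(A, {1}, U), {2}, U) == exists(A, {1, 2}, U)  # E(J∪K) = E(J)E(K)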

Example 3.13 Constraints. The examples of linear equalities and linear inequalities are instances of constraint systems. This can be captured by a more general and abstract formulation: Let C be a class of first order formulas expressing constraints and suppose that Θ is a first order theory of these constraints. Two constraints are equivalent modulo the theory Θ, φ ≡Θ ψ, if Θ |= φ ↔ ψ. This is of course just as in predicate logic in general (Example 3.11 above). We now require further that C is closed under conjunction and existential quantification and contains the constants ⊥, ⊤ (as is the case for linear equations and inequalities, for instance):

1. ⊥, ⊤ ∈ C,

2. If φ, ψ ∈ C, then there exists an η ∈ C such that φ ∧ ψ ≡Θ η,

3. If φ ∈ C and Xi is a variable, then there exists an η ∈ C such that ∃Xi[φ] ≡Θ η.

As usual, the lattice D of domains consists of the finite subsets of variables in the corresponding multivariate structure. The elements of the algebra FC are the equivalence classes [φ]Θ. Within FC the following operations are defined:

1. Combination: [φ]Θ ⊗ [ψ]Θ = [φ ∧ ψ]Θ.

2. Variable Elimination: ([φ]Θ)−Xi = [∃Xi[φ]]Θ.

This is a domain-free information algebra, as can easily be verified. In contrast to the algebras of formulas FΘ discussed in Example 3.11, this algebra


is not Boolean in general. Examples are the algebras of linear equations and inequalities. Further examples include Herbrand constraints, Boolean constraints, and interval constraints.


Chapter 4

Order of Information

The structure of an information algebra permits the introduction of a partial order between its elements. This order reflects information content. With this order an important issue of information is addressed, which has many repercussions. After defining the order and discussing some of its elementary properties, ideals in information algebras are identified as describing consistent systems of information and as forming themselves an information algebra, into which the original algebra is embedded. Then algebras containing most informative elements, called atoms, are studied. They are shown to be closely related to relational algebras in a generalized way. Next the issue of finite information elements and the approximation of more general elements by finite ones is discussed. This leads to compact information algebras, both in a domain-free and a labeled form. A more general form of approximation, represented by the "way below" relation between information elements, is then introduced, and the related concept of continuous information algebras is discussed. These last parts of the present chapter testify that compact and continuous information algebras are very close to the subject of domain theory, another theory treating the concept of information.

4.1 Partial Order of Information

A commutative idempotent semigroup like Φ in an information algebra (Φ, D) is known to be a semilattice. That is, there is a partial order between the elements of Φ. This order expresses in fact a relation with respect to "information content" of the elements of Φ, i.e. a qualitative measure of information. Due to the additional structure of information algebras, several different orders can be defined. The study of these orders is the goal of this chapter.

To start with, consider a domain-free information algebra (Φ, D). We say that an information φ ∈ Φ is more informative than another information


ψ ∈ Φ, and write φ ≥ ψ, if the latter combined with the first one does not change the first one, i.e. if ψ adds nothing to φ,

φ⊗ ψ = φ. (4.1)

This defines a partial order in Φ:

1. Reflexivity: φ ≤ φ follows from idempotency, φ⊗ φ = φ.

2. Antisymmetry: φ ≤ ψ and ψ ≤ φ implies φ = ψ, since φ = φ⊗ψ = ψ.

3. Transitivity: φ ≤ ψ and ψ ≤ η implies φ ≤ η. This follows since φ ⊗ ψ = ψ and ψ ⊗ η = η imply φ ⊗ η = φ ⊗ ψ ⊗ η = ψ ⊗ η = η.

The following lemma contains a number of simple results concerning this ordering.

Lemma 4.1

1. e ≤ φ ≤ z for all φ ∈ Φ.

2. φ, ψ ≤ φ⊗ ψ.

3. φ ⊗ ψ = sup{φ, ψ}.

4. φ⇒x ≤ φ.

5. φ ≤ ψ implies φ⇒x ≤ ψ⇒x.

6. φ1 ≤ φ2 and ψ1 ≤ ψ2 imply φ1 ⊗ ψ1 ≤ φ2 ⊗ ψ2.

7. φ⇒x ⊗ ψ⇒x ≤ (φ⊗ ψ)⇒x.

8. x ≤ y implies φ⇒x ≤ φ⇒y.

Proof. (1) Follows since e ⊗ φ = φ and φ ⊗ z = z.

(2) φ ⊗ (φ ⊗ ψ) = (φ ⊗ φ) ⊗ ψ = φ ⊗ ψ shows that φ ≤ φ ⊗ ψ, and in the same way, with ψ in the place of φ, we obtain ψ ≤ φ ⊗ ψ.

(3) According to (2), φ ⊗ ψ is an upper bound of φ and ψ. If η is another upper bound of φ and ψ, that is, if φ ⊗ η = η and ψ ⊗ η = η, then (φ ⊗ ψ) ⊗ η = φ ⊗ (ψ ⊗ η) = φ ⊗ η = η, thus φ ⊗ ψ ≤ η and φ ⊗ ψ is in fact the least upper bound of φ and ψ.

(4) This follows from idempotency, φ⇒x ⊗ φ = φ.

(5) Let φ ≤ ψ. Then we have (using idempotency) φ⇒x ⊗ ψ⇒x = (φ⇒x ⊗ ψ)⇒x = (φ⇒x ⊗ φ ⊗ ψ)⇒x = (φ ⊗ ψ)⇒x = ψ⇒x.

(6) We have (φ1 ⊗ ψ1) ⊗ (φ2 ⊗ ψ2) = (φ1 ⊗ φ2) ⊗ (ψ1 ⊗ ψ2) = φ2 ⊗ ψ2.

(7) We have by (4) φ⇒x ≤ φ. Then, by (6), φ⇒x ⊗ ψ ≤ φ ⊗ ψ and thus φ⇒x ⊗ ψ⇒x = (φ⇒x ⊗ ψ)⇒x ≤ (φ ⊗ ψ)⇒x by (5).


(8) x ≤ y implies x = x ∧ y and therefore φ⇒x ⊗ φ⇒y = φ⇒x∧y ⊗ φ⇒y = (φ⇒y)⇒x ⊗ φ⇒y = φ⇒y because of idempotency. □

Property (3) of this lemma means that any finite set of elements from Φ has a least upper bound in Φ, that is, Φ is a semilattice. We sometimes also write φ ⊗ ψ = φ ∨ ψ to emphasize the fact that the combination of two pieces of information is their join.
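As a small illustration (not part of the original notes; the frame and the helper names are our own choices), consider the subset algebra over a finite frame, with combination realized as set intersection and the null element z as the empty set. The order φ ≤ ψ iff φ ⊗ ψ = ψ then says that a more informative element leaves fewer possibilities open:

    frame = frozenset(range(6))            # the possible "worlds"
    e, z = frame, frozenset()              # vacuous and null information

    def combine(phi, psi):                 # combination = intersection
        return phi & psi

    def leq(phi, psi):                     # φ ≤ ψ iff φ ⊗ ψ = ψ
        return combine(phi, psi) == psi

    phi, psi = frozenset({0, 1, 2}), frozenset({1, 2})
    assert leq(phi, psi)                   # the smaller set is more informative
    assert leq(e, phi) and leq(phi, z)     # e ≤ φ ≤ z, cf. Lemma 4.1 (1)
    assert combine(phi, psi) == psi        # φ ⊗ ψ = sup{φ, ψ} here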

Order may be introduced in a similar way into a labeled information algebra (Φ, D). As before we define φ ≥ ψ if, and only if, (4.1) holds, and as before it can easily be verified that this defines a partial order in Φ. This order is a more technical concept; it has less meaning as an expression of information content. For example we have φ ≤ φ↑y (see point (7) of the lemma below), although φ and φ↑y represent the same information, albeit relative to different domains.

In this case we have the following elementary results:

Lemma 4.2

1. φ ≤ ψ implies d(φ) ≤ d(ψ).

2. x ≤ y implies ex ≤ ey and zx ≤ zy.

3. x ≤ d(φ) implies ex ≤ φ and d(φ) ≤ x implies φ ≤ zx.

4. φ, ψ ≤ φ⊗ ψ.

5. φ ⊗ ψ = sup{φ, ψ}.

6. x ≤ d(φ) implies φ↓x ≤ φ.

7. d(φ) ≤ y implies φ ≤ φ↑y.

8. x ≤ y = d(φ) implies (φ↓x)↑y ≤ φ.

9. φ1 ≤ φ2 and ψ1 ≤ ψ2 imply φ1 ⊗ ψ1 ≤ φ2 ⊗ ψ2.

10. φ→x ⊗ ψ→x ≤ (φ⊗ ψ)→x.

11. φ ≤ ψ implies φ→x ≤ ψ→x.

12. If d(ψ) ≤ y, then φ ≤ ψ implies φ↑y ≤ ψ↑y.

13. x ≤ y implies φ→x ≤ φ→y.

14. If d(φ) ≤ x ≤ d(ψ), then φ ≤ ψ implies φ ≤ ψ↓x.

We leave the proof, which is similar to the one of Lemma 4.1, to the reader.

We may also compare, in a labeled information algebra (Φ, D), information content of elements φ of Φ relative to a fixed domain x ∈ D (i.e. a fixed


question): For x ∈ D we write φ ≤x ψ if φ→x ≤ ψ→x, where the order defined above in labeled information algebras is used. This is a preorder in Φ (i.e. antisymmetry need not hold). If we restrict this order to the domain x, then on Φx it becomes a partial order coinciding with the order defined above on Φ, restricted to Φx.

There is of course a strong relation between these three orders of information.

Theorem 4.1 Let (Φ, D) be a labeled information algebra and (Φ/σ, D) the associated domain-free algebra. Then φ ≤ ψ implies [φ]σ ≤ [ψ]σ, which in turn implies φ ≤x ψ for all x ∈ D. On the other hand, ∀x ∈ D, φ ≤x ψ implies ([φ]σ)⇒x ≤ ([ψ]σ)⇒x.

Proof. From φ ⊗ ψ = ψ we deduce that [φ]σ ⊗ [ψ]σ = [φ ⊗ ψ]σ = [ψ]σ, hence [φ]σ ≤ [ψ]σ. This last inequality implies that ([φ]σ)⇒x ≤ ([ψ]σ)⇒x (Lemma 4.1 (5)). Thus we conclude that [φ→x]σ ≤ [ψ→x]σ, hence [φ→x]σ ⊗ [ψ→x]σ = [φ→x ⊗ ψ→x]σ = [ψ→x]σ, hence φ→x ⊗ ψ→x = ψ→x, therefore φ→x ≤ ψ→x.

Conversely, φ ≤x ψ means that φ→x ≤ ψ→x, hence [φ→x]σ = ([φ]σ)⇒x ≤ [ψ→x]σ = ([ψ]σ)⇒x. □

According to Theorem 4.1, we could also define φ ≤x ψ equivalently by ([φ]σ)⇒x ≤ ([ψ]σ)⇒x through the domain-free version of the information algebra. Finally, we can also consider a domain-free algebra (Φ, D) and define φ ≤x ψ by φ⇒x ≤ ψ⇒x.

To conclude, we collect in the next lemma a few elementary properties of the preorder ≤x.

Lemma 4.3 ∀x, y ∈ D and ∀φ, ψ, φ1, ψ1, φ2, ψ2 ∈ Φ,

1. ey ≤x φ ≤x zy,

2. φ, ψ ≤x φ⊗ ψ,

3. φ→y ≤x φ,

4. [φ]σ ≤ [ψ]σ implies φ→y ≤x ψ→y,

5. [φ1]σ ≤ [ψ1]σ and [φ2]σ ≤ [ψ2]σ imply φ1 ⊗ φ2 ≤x ψ1 ⊗ ψ2,

6. φ→y ⊗ ψ→y ≤x (φ⊗ ψ)→y,

7. y ≤ z implies φ→y ≤x φ→z.

These properties can easily be proved using Lemma 4.2 and Theorem 4.1. We add an example to illustrate how the information order in a semiring-induced information algebra is induced by the semiring.


Example 4.1 Order Induced by a Semiring. In Example 3.4 we have introduced information algebras induced by idempotent semirings. If A is a distributive lattice, a valuation ψ is a mapping from the set ΦJ of J-tuples (see Example 3.2) into A. Then, by the definition of combination (see Example 3.4), we have φ ≤ ψ if for all x ∈ Dd(φ)∪d(ψ),

φ ⊗ ψ(x) = φ(x↓d(φ)) × ψ(x↓d(ψ)) = ψ(x). (4.2)

This is the case if, and only if, d(φ) ≤ d(ψ) and for all x ∈ Dd(ψ) it holds that φ(x↓d(φ)) ≤ ψ(x), in the partial order of the lattice A. In particular, if d(φ) = d(ψ), then φ ≤ ψ if, and only if, φ(x) ≤ ψ(x) for all x ∈ Dd(ψ).

4.2 Ideal Completion of Information Algebras

Information may be asserted, that is, a piece of information may be stated to hold, to be true. Suppose that a body of information represented by an element φ from a domain-free information algebra Φ is asserted to hold. Then, intuitively, all pieces of information ψ which are part of φ, that is, for which ψ ≤ φ holds, should hold too. Moreover, if two pieces of information φ and ψ are asserted, then their combination φ ⊗ ψ should also hold. So, an assertion of information should be a set of elements from Φ which satisfies these conditions. This leads to the following definition:

Definition 4.1 Ideals: A non-empty set I ⊆ Φ is called an ideal of the domain-free information algebra (Φ, D), if the following holds:

1. φ ∈ I and ψ ∈ Φ, ψ ≤ φ imply ψ ∈ I.

2. φ ∈ I and ψ ∈ I imply φ⊗ ψ ∈ I.

Note that this concept is a notion related to a semilattice rather than to an information algebra (Davey & Priestley, 1990a). However, it follows immediately that the neutral element e is contained in every ideal, and that, since φ⇒x ≤ φ, φ ∈ I implies also φ⇒x ∈ I for all domains x and every ideal I. This relates the notion of ideals more specifically to information algebras. Φ itself is an ideal. Any ideal I different from Φ is called a proper ideal. The set of all elements ψ ∈ Φ which are less informative than φ, including φ itself, also forms an ideal. It is called the principal ideal I(φ) of φ. An ideal, as stated above, represents a consistent and complete assertion of an information. The principal ideal I(φ) represents the simple information asserted by the element of information φ alone. The order relation ψ ≤ φ can, in this respect, be considered as a relation stating that ψ is implied or entailed by φ.


The intersection of an arbitrary number of ideals is clearly still an ideal. Thus we may define the ideal generated by a subset X of Φ by

I(X) = ∩{I : I ideal in Φ, X ⊆ I}. (4.3)

This represents the full body of information generated if the pieces of information φ in the set X are asserted. There is also an alternative representation of I(X).

Lemma 4.4 If X ⊆ Φ, then

I(X) = {φ ∈ Φ : φ ≤ φ1 ⊗ · · · ⊗ φm for some finite set of elements φ1, . . . , φm ∈ X}. (4.4)

Proof. If φ1, . . . , φm ∈ X, then φ1, . . . , φm ∈ I(X), hence φ1 ⊗ · · · ⊗ φm ∈ I(X), hence φ ≤ φ1 ⊗ · · · ⊗ φm implies φ ∈ I(X). Conversely, the set on the right-hand side in the lemma is clearly an ideal which contains X. Thus this set contains I(X), hence it equals it. □

As a corollary we obtain that, if the set X is finite, then

I(X) = I(∨X). (4.5)

The ideal generated from X is the principal ideal of the element ∨X of Φ.

It is well known in lattice theory that a system of sets which is closed under intersection, such as the ideals, forms a complete lattice (see for example (Davey & Priestley, 1990a)), where the supremum of two ideals I1 and I2 is defined as

I1 ∨ I2 = I(I1 ∪ I2) = ∩{I : I ideal in Φ, I1 ∪ I2 ⊆ I}.

It is then obvious that this is the natural definition of the aggregation or combination of two assertions of information. So we equip the family of ideals of an information algebra Φ with the operation of combination

I1 ⊗ I2 = I1 ∨ I2. (4.6)

There is an alternative, equivalent definition of this operation between ideals.

Lemma 4.5

I1 ⊗ I2 = {ψ ∈ Φ : ψ ≤ φ1 ⊗ φ2 for some φ1 ∈ I1, φ2 ∈ I2}. (4.7)

Proof. The right-hand side of equation (4.7) is clearly an ideal and it contains both I1 and I2. Hence it contains I1 ∨ I2. Assume then that ψ is an element of the right-hand side of equation (4.7), that is, ψ ≤ φ1 ⊗ φ2 with φ1 ∈ I1, φ2 ∈ I2. But, if I is an ideal containing both I1 and I2, we have


φ1 ∈ I and φ2 ∈ I, hence φ1 ⊗ φ2 ∈ I and therefore ψ ∈ I. This tells us that the right-hand side of equation (4.7) is contained in I1 ∨ I2. □
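As a small computational check (not part of the original notes; the four-element frame and the helper names are our own choices), the following sketch builds principal ideals in the subset algebra over a finite frame and verifies the characterization (4.7) on principal ideals, namely I(φ) ⊗ I(ψ) = I(φ ⊗ ψ):

    from itertools import chain, combinations, product

    U = list(product([0, 1], [0, 1]))                # a 4-element frame
    elems = [frozenset(s) for s in chain.from_iterable(
        combinations(U, r) for r in range(len(U) + 1))]   # all 16 subsets

    def combine(a, b): return a & b                  # combination = intersection
    def leq(a, b): return combine(a, b) == b         # a ≤ b iff a ⊗ b = b

    def principal_ideal(phi):                        # I(φ) = {ψ : ψ ≤ φ}
        return frozenset(psi for psi in elems if leq(psi, phi))

    def combine_ideals(I1, I2):                      # equation (4.7)
        return frozenset(eta for eta in elems
                         if any(leq(eta, combine(p1, p2))
                                for p1 in I1 for p2 in I2))

    phi, psi = frozenset({(0, 0), (0, 1)}), frozenset({(0, 0), (1, 0)})
    assert combine_ideals(principal_ideal(phi), principal_ideal(psi)) \
        == principal_ideal(combine(phi, psi))        # I(φ) ⊗ I(ψ) = I(φ ⊗ ψ)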

We may also equip the family of ideals of Φ with a focusing operation. If x ∈ D, define

I⇒x = {ψ ∈ Φ : ψ ≤ φ⇒x for some φ ∈ I}. (4.8)

It is easy to verify that this is indeed an ideal. Of course, we have φ⇒x ∈ I⇒x if φ ∈ I.

The family IΦ of ideals, equipped with the operations of combination and focusing as defined above, is nearly a domain-free information algebra. This statement is made more precise in the following theorem.

Theorem 4.2 The algebraic structure (IΦ, D), where the operations ⊗ and ⇒ are defined by (4.6) or (4.7) and (4.8), satisfies all axioms of a domain-free information algebra, except the support axiom. If the lattice D has a top element, then (IΦ, D) is an information algebra.

Proof. (1) The semilattice axioms (idempotent semigroup) are satisfied since IΦ is a lattice, as remarked above.

(2) There is not necessarily a support x for an ideal I. If however D has a top element ⊤, then this is a support for any ideal. In this case the support axiom is satisfied.

(3) The ideal I(e) = {e} is clearly the neutral element of combination and I(e)⇒x = I(e), since e⇒x = e for all domains x.

(4) The ideal Φ is the null element and Φ⇒x = Φ since z⇒x = z and φ ≤ z for all φ ∈ Φ.

(5) Here is the verification of the focusing axiom:

(I⇒x)⇒y = {ψ : ψ ≤ φ′⇒y, φ′ ∈ I⇒x}
= {ψ : ψ ≤ (φ⇒x)⇒y = φ⇒x∩y, φ ∈ I}
= I⇒x∩y.

(6) To verify the combination axiom, we use equation (4.7):

(I1⇒x ⊗ I2)⇒x = {ψ : ψ ≤ φ⇒x, φ ∈ I1⇒x ⊗ I2}
= {ψ : ψ ≤ (φ′1 ⊗ φ2)⇒x, φ′1 ∈ I1⇒x, φ2 ∈ I2}
= {ψ : ψ ≤ (φ1⇒x ⊗ φ2)⇒x = φ1⇒x ⊗ φ2⇒x, φ1 ∈ I1, φ2 ∈ I2}
= {ψ : ψ ≤ φ′1 ⊗ φ′2, φ′1 ∈ I1⇒x, φ′2 ∈ I2⇒x} = I1⇒x ⊗ I2⇒x.

(7) Idempotency can be verified as follows:

I ⊗ I⇒x = {ψ : ψ ≤ φ ⊗ φ′, φ ∈ I, φ′ ∈ I⇒x}
= {ψ : ψ ≤ φ ⊗ φ′′⇒x, φ, φ′′ ∈ I}.


Since φ ⊗ φ′′⇒x ≤ φ ⊗ φ′′ ∈ I, it follows that ψ ∈ I ⊗ I⇒x implies ψ ∈ I, hence I ⊗ I⇒x ⊆ I. The converse inclusion is evident. □

The structure (IΦ, D) with the operations ⊗ and ⇒ as defined above is called the ideal completion of Φ. This terminology is justified by the following theorem:

Theorem 4.3 The domain-free information algebra (Φ, D) can be embedded into the structure (IΦ, D).

Proof. We show that the mapping I : Φ → IΦ defined by φ ↦ I(φ) is an embedding. That is, we show that

1. I(φ⊗ ψ) = I(φ)⊗ I(ψ),

2. I(φ⇒x) = (I(φ))⇒x,

3. I(e) = {e},

4. I(z) = Φ,

5. I(φ) = I(ψ) implies φ = ψ.

(1) is a direct consequence of equation (4.7) applied to the principal ideals I(φ) and I(ψ).

(2) We have that ψ ∈ (I(φ))⇒x if, and only if, ψ ≤ φ⇒x, which holds if, and only if, ψ ∈ I(φ⇒x).

(3) holds since there is no element different from e for which φ ≤ e holds.

(4) holds since for all elements φ ∈ Φ we have φ ≤ z.

(5) ψ ∈ I(φ) means that ψ ≤ φ and φ ∈ I(ψ) means that φ ≤ ψ, thus φ = ψ. □

We shall reconsider the structure (IΦ, D) in Section 4.5.

If the support axiom does not hold, we cannot derive a labeled version from the domain-free algebra (IΦ, D). However, in many cases in the original algebra (Φ, D), D has a top element, or it can be adjoined. This is for example the case for algebras based on a multivariate structure, or if D is a partition lattice.

Ideal completion is essentially an order-theoretic concept. Therefore, it can also be defined for a labeled information algebra (Φ, D). Here, we consider ideals in Φx for any domain x. Thus, we change Definition 4.1 slightly:

Definition 4.2 Labeled Ideals: A non-empty set I ⊆ Φx is called an ideal of the domain x, if the following holds:

1. φ ∈ I and ψ ∈ Φx, ψ ≤ φ imply ψ ∈ I.


2. φ ∈ I and ψ ∈ I imply φ⊗ ψ ∈ I.

Again as before, an ideal is a collection of consistent and complete information, this time with respect to a specified domain or question. An ideal I in Φx is labeled with the domain d(I) = x. Let IΦx be the set of ideals in Φx. An ideal in Φx is called proper if it is different from Φx. For φ ∈ Φx, the set I(φ) = {ψ ∈ Φx : ψ ≤ φ} is an ideal, called a principal ideal. Define

IΦ = ∪x∈D IΦx.

Within IΦ we define the operations of combination between ideals and projection of ideals according to the models of (4.7) and (4.8):

1. Combination of Ideals: Let I1 ∈ IΦx and I2 ∈ IΦy. Then we define

I1 ⊗ I2 = {ψ ∈ Φx∨y : ψ ≤ φ1 ⊗ φ2 for some φ1 ∈ I1, φ2 ∈ I2}. (4.9)

2. Projection of ideals: Let I ∈ IΦy and x ≤ y. Then we define

I↓x = {ψ ∈ Φx : ψ ≤ φ↓x for some φ ∈ I}. (4.10)

It is easy to verify that I1 ⊗ I2 and I↓x are both indeed ideals. So the definitions above are well founded. Note also that φ↓x ∈ I↓x if φ ∈ I.

As is to be expected, (IΦ, D) with labeling, combination and projection as defined above is a labeled information algebra:

Theorem 4.4 The algebraic structure (IΦ, D) with labeling, combination and projection as defined above is a labeled information algebra.

Proof. (1) Labeling follows directly from the definition of combination and projection.

(2) Semigroup: Commutativity of combination follows directly from the definition. To verify associativity we invoke labeling and obtain, for I1 ∈ IΦx, I2 ∈ IΦy and I3 ∈ IΦz,

I1 ⊗ (I2 ⊗ I3)
= {ψ ∈ Φx∨y∨z : ψ ≤ φ1 ⊗ η for some φ1 ∈ I1, η ∈ I2 ⊗ I3}
= {ψ ∈ Φx∨y∨z : ψ ≤ φ1 ⊗ η for some φ1 ∈ I1, η ≤ φ2 ⊗ φ3, for some φ2 ∈ I2, φ3 ∈ I3}
= {ψ ∈ Φx∨y∨z : ψ ≤ φ1 ⊗ φ2 ⊗ φ3 for some φ1 ∈ I1, φ2 ∈ I2, φ3 ∈ I3}.

The same result can also be derived for (I1 ⊗ I2) ⊗ I3. This proves associativity of combination.


(3) Neutrality: The principal ideal I(ex) = {ex} is the neutral element of IΦx, since for any I ∈ IΦx,

I(ex) ⊗ I = {ψ ∈ Φx : ψ ≤ ex ⊗ φ = φ for some φ ∈ I} = I,

and, for y ≤ x,

(I(ex))↓y = {ψ ∈ Φy : ψ ≤ ex↓y = ey} = {ey} = I(ey).

(4) Nullity: The null element in IΦx is Φx. Certainly we have Φx ⊗ I = Φx for any I ∈ IΦx, since I ⊆ Φx. Further, if x ≤ y,

Φx ⊗ I(ey) = {ψ ∈ Φy : ψ ≤ φ ⊗ ey = φ↑y for some φ ∈ Φx} = Φy,

since for all ψ ∈ Φy it holds that ψ ≤ zx ⊗ ey = zy.

(5) Projection: Let x ≤ y ≤ z = d(I). Then,

(I↓y)↓x = {ψ ∈ Φx : ψ ≤ η↓x for some η ∈ I↓y}
= {ψ ∈ Φx : ψ ≤ η↓x for some η ≤ φ↓y, for some φ ∈ I}
= {ψ ∈ Φx : ψ ≤ φ↓x for some φ ∈ I} = I↓x.

(6) Combination: Let I1 ∈ IΦx and I2 ∈ IΦy. Then,

(I1 ⊗ I2)↓x = {ψ ∈ Φx : ψ ≤ φ↓x for some φ ∈ I1 ⊗ I2}
= {ψ ∈ Φx : ψ ≤ φ↓x for some φ ≤ φ1 ⊗ φ2, for some φ1 ∈ I1, φ2 ∈ I2}
= {ψ ∈ Φx : ψ ≤ (φ1 ⊗ φ2)↓x = φ1 ⊗ φ2↓x∧y for some φ1 ∈ I1, φ2 ∈ I2}
= {ψ ∈ Φx : ψ ≤ φ1 ⊗ η2 for some φ1 ∈ I1, η2 ∈ I2↓x∧y} = I1 ⊗ I2↓x∧y.

(7) Idempotency: Let I ∈ IΦy and x ≤ y. Then

I ⊗ I↓x = {ψ ∈ Φy : ψ ≤ φ ⊗ η for some φ ∈ I, η ∈ I↓x}
= {ψ ∈ Φy : ψ ≤ φ ⊗ η↓x for some φ, η ∈ I} ⊆ I,

since φ ⊗ η↓x ≤ φ ⊗ η ∈ I. On the other hand, certainly I ⊆ I ⊗ I↓x (we may select η = ey), hence I ⊗ I↓x = I. □

Just as with domain-free algebras, the labeled algebra (Φ, D) can be embedded into the labeled algebra (IΦ, D) by the mapping φ ↦ I(φ). We formulate the corresponding theorem; its proof is similar to the proof of Theorem 4.3 and is therefore omitted.


Theorem 4.5 The labeled information algebra (Φ, D) can be embedded into the labeled information algebra (IΦ, D).

Again, this justifies calling (IΦ, D) the ideal completion of the algebra (Φ, D).

We leave it to the reader to examine the relation between the ideal completions of associated labeled and domain-free information algebras.

4.3 Atomic Algebras and Relational Systems

Many information algebras contain, for any domain x, most informative elements different from the null element. In relational algebra (Example 3.3), for example, the one-tuple relations on a certain set of attributes are, in the sense of the information ordering, the most informative elements. In subset systems these are the one-element sets. Such an information is called atomic. This notion can be defined relative to any labeled information algebra:

Definition 4.3 Atoms: An element α ∈ Φ with d(α) = x is called an atom on x, if

1. α ≠ zx.

2. For all φ ∈ Φ with d(φ) = x, α ≤ φ implies either α = φ or φ = zx.

This says that no information in a domain, except the null information, can be more informative than an atom. Clearly, the one-tuple relations are exactly the atoms in the relational algebra RT (see Section 3.3). By abuse of language, we shall simply say that tuples are atoms in the relational algebras. On the other hand, an information algebra may well have no atoms.

Here are some elementary results about atoms:

Lemma 4.6 1. If α is an atom on x and φ ∈ Φ with d(φ) ≤ x, then α ⊗ φ = α or α ⊗ φ = zx.

2. If α is an atom on x, and y ≤ x, then α↓y is an atom on y.

3. If α is an atom on x and φ ∈ Φ with d(φ) = x, then either φ ≤ α or α ⊗ φ = zx.

4. If α, β are atoms on x, then either α = β or α ⊗ β = zx.

Proof. (1) Set η = α ⊗ φ = α ⊗ φ↑x. Multiplying both sides by φ↑x ⊗ η, we obtain α ⊗ (φ↑x ⊗ η) = φ↑x ⊗ η. This means that α ≤ φ↑x ⊗ η. Since α is an atom we have either

α = φ↑x ⊗ η or φ↑x ⊗ η = zx.


In the first case we obtain that α ⊗ φ = α, in the second case that α ⊗ φ = zx.

(2) Since α ≠ zx we have also that α↓y ≠ zy (Lemma 3.2 (2)). Assume that α↓y ≤ φ and d(φ) = y. We have

(α⊗ φ)↓y = α↓y ⊗ φ = φ.

From (1) of this lemma it follows that either

α⊗ φ = α or α⊗ φ = zx.

In the first case we obtain

α↓y = (α⊗ φ)↓y = φ

and in the second case

φ = (α ⊗ φ)↓y = zx↓y = zy

(see Lemma 3.2 (1)). This proves that α↓y is an atom on y.

(3) is an immediate consequence of (1) of this lemma.

(4) Since α is an atom it follows from (1) of this lemma that either α ⊗ β = α or α ⊗ β = zx. In the first case we have that β ≤ α, hence α = β, since β is also an atom. □

Denote by Atx(Φ) the set of all atoms on the domain x in the information algebra Φ and denote by At(Φ) the set of all atoms in Φ. Furthermore define, for φ ∈ Φ,

At(φ) = {α ∈ At(Φ) : d(α) = d(φ), φ ≤ α}. (4.11)

If α ∈ At(φ) we also say that α is an atom in φ or contained in φ. Clearly, in the relational algebra over a tuple system, a relation R over domain x contains all the tuples f ∈ R. Strictly speaking this means that At(R) = {{f} : f ∈ R}.
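As a small computational illustration (not part of the original notes; the three-element frame and the helper names are our own choices), the following sketch works in the subset algebra over one fixed domain, with combination as intersection and zx the empty set. It confirms that the atoms of Definition 4.3 are exactly the one-element subsets, and that At(φ) collects the singletons contained in φ:

    from itertools import chain, combinations

    frame = (0, 1, 2)
    elems = [frozenset(s) for s in chain.from_iterable(
        combinations(frame, r) for r in range(len(frame) + 1))]
    z = frozenset()                          # the null element z_x

    def leq(a, b):                           # a ≤ b iff a ⊗ b = b (⊗ = ∩)
        return (a & b) == b

    def is_atom(alpha):                      # Definition 4.3, on this one domain
        return alpha != z and all(
            phi == alpha or phi == z
            for phi in elems if leq(alpha, phi))

    atoms = [a for a in elems if is_atom(a)]
    assert atoms == [frozenset({0}), frozenset({1}), frozenset({2})]

    def At(phi):                             # equation (4.11), one domain
        return {a for a in atoms if leq(phi, a)}

    assert At(frozenset({0, 2})) == {frozenset({0}), frozenset({2})}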

In the example of the relational algebra RT, any element contains at least one atom, except the empty relations Zx. More than that, any relation R which is not empty is a lower bound of all the tuples it contains. More precisely, R ≤ {f} for all f ∈ R. But if a relation S on the same domain is another lower bound of At(R), then this means that S ≤ {f}, that is, f ∈ S, for all f ∈ R, hence R ⊆ S. So we have S ≤ R for any other lower bound S. Hence, R is the largest lower bound, the infimum of At(R): R = ∧At(R). Moreover, any set of tuples R has an infimum in RT, namely

∧{{f} : f ∈ R} = R.

Any element of the algebra RT is determined by the atoms it contains, and any set of atoms defines an element of RT. These are strong properties which show that the relational algebra RT really is essentially an algebra of sets.

This motivates the following definitions:


Definition 4.4 Atomic Algebras.

1. A labeled information algebra Φ is called atomic, if for all φ ∈ Φ which are not null elements, the set of atoms At(φ) is not empty.

2. It is called atomic composed, if it is atomic and if for all φ ∈ Φ

φ = ∧At(φ). (4.12)

3. It is called atomic closed, if it is atomic composed and if for every subset A ⊆ Atx(Φ) the infimum exists and belongs to Φ.

Example 4.2 Systems of Subsets. We have seen before several examples of information algebras whose elements are subsets of some frames. For instance, in Example 3.1 subsets of frames forming a system of compatible frames constitute an information algebra. Clearly, the one-element sets on any frame are atoms in that frame. The algebra is atomic closed. In fact, this algebra is an example of a generalized relational algebra (see also file systems, Section 3.3). The same is the case for the algebra of subsets in a multivariate structure (see Example 3.2). The information algebras associated with propositional logic (see Example 3.8) and predicate logic (see Example 3.11) are particular cases of subset algebras in multivariate structures. All of them are atomic closed and in fact instances of a relational algebra.

Example 4.3 Affine Spaces. Any one-vector set in a linear space Fs (see Example 3.6) is an affine space in that linear space and it is clearly an atom of the algebra of affine spaces. This algebra is not atomic closed, but atomic composed, since an affine space is the set of all vectors it contains. The vectors of Fs are s-tuples, and the information algebra of affine spaces is embedded into the relational algebra of these tuples.

An interesting property of this atomic composed algebra is that a finite number of atoms of an affine space φ suffice to determine φ. In fact, an affine space φ, defined by a system of m ≤ |s| linearly independent linear equations S(φ), is determined by n = |s| − m + 1 vectors v1, . . . , vn in the space. If αi = {vi} are the associated atomic affine spaces in Fs, then

φ = α1 ∧ · · · ∧ αn. (4.13)

Hence, affine spaces are finitely generated by the atoms they contain. Thus an affine space, which in general is an infinite set, can either be represented by a finite system of linear equations or, alternatively, by a finite set of atoms which span the space.

Example 4.4 Convex Polyhedra and Sets. Like affine spaces (previous Example 4.3), convex polyhedra, or more generally convex sets, are composed


of vectors in Rs, which, as one-vector sets, are atoms in the respective information algebras of convex polyhedra or convex sets. Both algebras are therefore atomic composed, but not atomic closed, since not every set of vectors is convex.

It is well known that polytopes are the convex hull of their extremal points. This means that a polytope is determined by a finite number of certain atoms it contains (its extremal points). Thus they are, like affine spaces (Example 4.3), finitely generated by some atoms. This does not hold for polyhedra in general.

Here are a few simple results on sets of atoms.

Lemma 4.7 Let (Φ, D) be an atomic, labeled information algebra. Then the following results hold.

1. ∀φ, ψ ∈ Φx, φ ≤ ψ implies At(φ) ⊇ At(ψ).

2. ∀φ, ψ ∈ Φ and ∀x ∈ D, φ ≤ ψ implies At(φ→x) ⊇ At(ψ→x).

3. ∀φ, ψ ∈ Φx, At(φ ⊗ ψ) = At(φ) ∩ At(ψ).

Proof. (1) Let α ∈ At(ψ), i.e. φ ≤ ψ ≤ α. Hence α ∈ At(φ).

(2) φ ≤ ψ implies φ→x ≤ ψ→x (Lemma 4.2 (11)). Then (2) follows from (1) above.

(3) Assume α ∈ At(φ ⊗ ψ), i.e. α ≥ φ ⊗ ψ ≥ φ, ψ. Then α ∈ At(φ) and α ∈ At(ψ), hence α ∈ At(φ) ∩ At(ψ). Conversely, assume α ∈ At(φ) ∩ At(ψ), i.e. α ≥ φ, ψ, hence α ≥ φ ⊗ ψ, therefore α ∈ At(φ ⊗ ψ). □

A relational information algebra RT on a tuple system T is atomic closed and the tuples (or the one-tuple relations) are its atoms. Not surprisingly, there are close relations between relational information algebras and atomic information algebras and between tuples and atoms. We begin by showing that if an information algebra is atomic, then its atoms form a tuple system.

Lemma 4.8 If the labeled information algebra (Φ, D) is atomic, then (At(Φ), D) is a tuple system with domain d and projection ↓.

Proof. We have to prove the five axioms of a tuple system (see Section 3.3). The first two axioms correspond simply to the marginalization and transitivity axioms of the information algebra and the third one is a direct consequence of the labeling axiom.

In order to prove axiom (4), assume d(α) = x, d(β) = y and α↓x∧y = β↓x∧y for two atoms α and β. Suppose first that α ⊗ β = zx∨y. Then,

α = (α ⊗ β)↓x = zx∨y↓x = zx.

But α ≠ zx and therefore α ⊗ β ≠ zx∨y.


Since Φ is atomic, the set At(α ⊗ β) is not empty. Let γ ∈ At(α ⊗ β). Then d(γ) = x ∨ y and α ⊗ β ≤ γ. It follows, using Lemma 4.2 (11), that

α = (α ⊗ β)↓x ≤ γ↓x.

But α is an atom, hence γ↓x = α or γ↓x = zx. The latter case is excluded, since γ ≠ zx∨y (Lemma 3.2 (2)). So we have γ↓x = α; γ↓y = β is obtained in the same way. So axiom (4) of a tuple system is satisfied.

In order to verify axiom (5), let α be an atom with d(α) = x ≤ y. Suppose that α ⊗ ey = α↑y = zy. But then we obtain that α = (α ⊗ ey)↓x = zy↓x = zx. But this is not possible since α is an atom. Therefore α↑y ≠ zy.

Because Φ is atomic, the set At(α↑y) is not empty. Let β ∈ At(α↑y). Then d(β) = y and α↑y ≤ β. It follows that

α = (α↑y)↓x ≤ β↓x.

But α is an atom. Therefore we have either β↓x = α or β↓x = zx. The latter case is excluded since β ≠ zy (Lemma 3.2 (2)). This proves axiom (5) of tuple systems. □

The tuple system (At(Φ), D, d, ↓) is a subsystem of the tuple system (Φ, D, d, ↓) (see Lemma 3.4). In other words, if Φ is atomic, the atoms in Φ already form a tuple system themselves. We call the relational algebra associated with this tuple system RAt(Φ). The sets At(φ) belong to RAt(Φ) for every element φ in Φ. So, At can be seen as a mapping from Φ into RAt(Φ).

Theorem 4.6 If (Φ, D) is an atomic information algebra, then At : Φ → RAt(Φ) is a homomorphism.

Proof. We have to show the following:

1. At(φ ⊗ ψ) = At(φ) ⋈ At(ψ).

2. At(φ↓x) = πx(At(φ)).

3. At(ex) = Ex.

(1) Let d(φ) = x and d(ψ) = y. Assume that α ∈ At(φ ⊗ ψ). This means that φ ⊗ ψ ≤ α, hence φ ≤ α and therefore, by the monotonicity of marginalization, φ = φ↓x ≤ α↓x. But α↓x is an atom on x (Lemma 4.6 (2)) and thus α↓x ∈ At(φ). In the same way we find that α↓y ∈ At(ψ). Since d(α) = x ∨ y this shows that α ∈ At(φ) ⋈ At(ψ).

Conversely, assume that α ∈ At(φ) ⋈ At(ψ). Then d(α) = x ∨ y, α↓x ∈ At(φ), and φ ≤ α↓x ≤ α and also ψ ≤ α↓y ≤ α. This implies φ ⊗ ψ ≤ α and hence α ∈ At(φ ⊗ ψ). This proves (1).


(2) Assume first that α ∈ At(φ↓x). This means that d(α) = x and φ↓x ≤ α. Let d(φ) = y. Then we have φ ≤ φ ⊗ α. Suppose that φ ⊗ α = zy. Then we obtain

α = φ↓x ⊗ α = (φ ⊗ α)↓x = zy↓x = zx.

But this is not possible, since α is an atom. Thus φ ⊗ α ≠ zy. Hence, because Φ is atomic, At(φ ⊗ α) ≠ ∅. Let β ∈ At(φ ⊗ α), so that d(β) = y and φ ≤ φ ⊗ α ≤ β. This shows that β ∈ At(φ). But we have also that

α = (φ⊗ α)↓x ≤ β↓x.

Since α and β↓x are both atoms, we must have α = β↓x. This shows that α ∈ πx(At(φ)).

Assume α ∈ πx(At(φ)), i.e. there is a β ∈ At(φ) such that α = β↓x. We have φ ≤ β, hence φ↓x ≤ β↓x and thus α = β↓x ∈ At(φ↓x). This proves (2).

(3) follows directly from the definition of At. □

This theorem has two important results as corollaries:

Corollary 4.1

1. If Φ is atomic composed, then At is an embedding.

2. If Φ is atomic closed and if, for X, Y ⊆ Atx(Φ), X ≠ Y implies ∧X ≠ ∧Y, then At is an isomorphism.

Proof. (1) Assume that At(φ) = At(ψ). Then φ = ∧At(φ) = ∧At(ψ) = ψ.

(2) Let X ⊆ Atx(Φ) for some domain x. That is, X is a relation in RAt(Φ). Since Φ is atomic closed, ∧X exists in Φ and ∧X = ∧At(∧X). But this implies that X = At(∧X), hence the mapping At is onto RAt(Φ). □

The second part of Corollary 4.1 above says that an atomic closed information algebra is, under some additional condition, essentially identical to a relational algebra over a tuple system. It is a system of subsets. As such it has much more structure than an information algebra. The first part of the corollary says that an atomic composed information algebra is embedded in such a subset system.

Atomicity is also closely related to partitions or compatible frames, as the following theorem shows:

Theorem 4.7 If (Φ, D) is an atomic information algebra, then, for x ≤ y, Aty(Φ) is a refinement of Atx(Φ) and the latter is a coarsening of the former.

Proof. Define τ : Atx(Φ) → P(Aty(Φ)) for α ∈ Atx(Φ) by

τ(α) = {β ∈ Aty(Φ) : β↓x = α}.


We claim that τ is a refinement (see Example 2.3 in Section 2.2).

First we note that, since Φ is atomic, there is a β ∈ At(α↑y). But then β↓x ≤ (α↑y)↓x = α, hence β ∈ τ(α). Thus, ∀α ∈ Atx(Φ), the set τ(α) is not empty.

Next let α′ ≠ α′′ ∈ Atx(Φ). Assume β ∈ τ(α′) ∩ τ(α′′), i.e. β↓x = α′ and β↓x = α′′. But this implies α′ = α′′, contrary to the assumption. So τ(α′) ∩ τ(α′′) = ∅.

Finally we have

∪α∈Atx(Φ) τ(α) ⊆ Aty(Φ).

Consider β ∈ Aty(Φ). Then β↓x ∈ Atx(Φ), hence β ∈ τ(β↓x). Thus we have

∪α∈Atx(Φ) τ(α) = Aty(Φ).

This concludes the proof that τ is a refinement of Atx(Φ). □

Let F = {Atx(Φ) : x ∈ D}. Then we may consider F as a family of frames, with the order Atx(Φ) ≤ Aty(Φ) if x ≤ y. F is in fact a lattice with Atx(Φ) ∨ Aty(Φ) = Atx∨y(Φ) and Atx(Φ) ∧ Aty(Φ) = Atx∧y(Φ), which is isomorphic to the lattice D. Therefore F is a compatible family of frames (see Example 2.3). Hence RAt(Φ) can be seen as an information algebra of subsets of frames from F, and (Φ, D) is homomorphic to this algebra. Clearly, if D has a top element ⊤, then all Atx(Φ) are essentially partitions of At⊤(Φ) and F is a partition lattice.

4.4 Finite Elements and Approximation

In this section we model another aspect of information which has not been considered so far in this text. Computers can treat only "finite" information. "Infinite" information can however often be approximated by "finite" elements. We shall try to model this aspect of finiteness in this section. It must be stressed that we will not capture every aspect of finiteness. For example, we shall not treat questions of computability and related issues, which are also linked to the notion of finiteness. This is postponed to Part II. The structures considered here bring the subject of information algebras close to domain theory. Many aspects we treat in this section are treated in domain theory, and in much more detail than here. Therefore we shall keep the presentation short. However, the one crucial feature not addressed in domain theory is focusing. Also, domain theory places more emphasis on order, approximation and convergence of information, and less on combination. So, although the subject is similar to domain theory, it is treated here with a different emphasis and goal.


We consider a domain-free information algebra (Φ, D). In the set Φ of pieces of information we single out a subset Φf of elements which are considered to be finite. We shall see in a moment in what respect these elements can be considered to be finite. In any case, combination of finite information should again yield finite information. So Φf is assumed to be closed under combination. The neutral element e is considered to be finite and thus belongs to Φf. Maybe one would expect that focusing of finite information also yields finite information. Although this is reasonable enough in many cases, we do not assume it for the purposes of this section.

Any information φ in Φ should be approximated with an arbitrary degree of precision by a system of finite pieces of information. Each approximating finite information should be coarser than φ. The latter must then be the supremum of the approximating elements. Conversely, the approximating system of finite elements must, for any element of it, or more generally for any finite subset of it, contain a finer finite information. This means that φ is approximated by more and more informative finite elements, and this captures the idea of convergence to the supremum. This concept corresponds to a directed set:

Definition 4.5 Directed Set. A nonempty subset X of Φ is called a directed set if it contains, with any finite subset {φ1, φ2, . . . , φm} ⊆ X, an element φ ∈ X finer than all the elements of the subset,

φ1 ⊗ φ2 ⊗ · · · ⊗ φm ≤ φ.

So, X is directed if it is non-empty and for any two pieces of information φ and ψ in X there exists an η ∈ X such that φ ≤ η and ψ ≤ η. A directed set X of Φ is said to converge in Φ if the supremum of X exists in Φ, that is, ∨X ∈ Φ.
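As a small sketch (not part of the original notes; the helper name and the subset-algebra instance are our own choices), directedness of a finite set can be tested by checking the pairwise condition, which suffices for finite subsets:

    def is_directed(X, leq):
        # X is directed iff it is nonempty and every pair phi, psi in X
        # has an upper bound eta within X (pairs suffice for finite X)
        return bool(X) and all(
            any(leq(phi, eta) and leq(psi, eta) for eta in X)
            for phi in X for psi in X)

    # In the subset algebra, a ≤ b iff a ∩ b = b (smaller sets are finer):
    leq = lambda a, b: (a & b) == b
    X = [frozenset({0, 1, 2}), frozenset({0, 1}),
         frozenset({1, 2}), frozenset({1})]
    assert is_directed(X, leq)            # {1} is an upper bound for every pair
    assert not is_directed(X[:3], leq)    # {0,1} and {1,2} have none in X[:3]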

In order to model approximation of an information φ in Φ by finite elements, we require first that all directed sets of finite pieces of information converge in Φ. This means that the respective suprema are elements of Φ which are approximated by finite elements. But we want more: any element of Φ must be approximated in this way. The finite elements must be dense in Φ, that is, any element of Φ is the supremum of a directed set of finite elements. It seems natural to take the directed set of all finite elements smaller than φ as the converging set,

Aφ = {ψ ∈ Φf : ψ ≤ φ}, (4.14)

and require that

φ = ∨Aφ. (4.15)

These two requirements do not yet say much about finiteness. One thing they say is that if φ is finite, then, according to equation (4.14), it belongs


itself to its converging set. This is surely an important property of finiteness. But again we want more. We may possibly approximate an element by a directed set X of finite elements which is smaller than the set of all finite elements smaller than ∨X. But then, if φ is a finite element with φ ≤ ∨X, even if φ does not belong to X, there must at least exist an element ψ in X finer than φ. This is a so-called compactness property.

So we impose the following conditions on Φ and Φf .

1. Convergence: If X ⊆ Φf is a directed set, then the supremum ∨X over X exists and belongs to Φ.

2. (Weak) Density: For φ ∈ Φ,

φ = ∨{ψ ∈ Φf : ψ ≤ φ}. (4.16)

3. Compactness: If X ⊆ Φf is a directed set, and φ ∈ Φf is such that φ ≤ ∨X, then there exists a ψ ∈ X such that φ ≤ ψ.

Whereas these requirements clearly must be satisfied, we sometimes need a stronger requirement regarding density, namely that an information φ supported by some domain x can be approximated by finite pieces of information supported by the same domain. This is expressed by the following strong version of density: For φ ∈ Φ and x ∈ D,

φ⇒x = ∨{ψ ∈ Φf : ψ ≤ φ, ψ = ψ⇒x}. (4.17)

Below we shall see examples of information algebras satisfying the requirement of density, but not strong density. This shows that (4.16) does not imply (4.17).

Definition 4.6 Compact Domain-Free Information Algebra. A system (Φ, Φf, D), where (Φ, D) is a domain-free information algebra and Φf is closed under combination and contains the empty information e, satisfying the three conditions of convergence, density and compactness formulated above, is called a compact information algebra. If moreover the requirement of strong density is satisfied, the algebra is called strongly compact.

Here are a few examples of compact information algebras:

Example 4.5 Finite Information Algebras: If the information algebra (Φ, D) is finite, i.e. if Φ has only finitely many elements, then it is (trivially) compact. In fact all elements are finite, Φf = Φ. Convergence and density, even strong density, clearly hold. If X is a (necessarily finite) directed set, X = {φ1, . . . , φm}, then φ1 ⊗ · · · ⊗ φm ∈ X, by the directedness of X. This implies compactness. Of course this is not a particularly interesting example of a compact algebra.


Example 4.6 Convex Sets: The convex polyhedra are the finite elements of the information algebra of convex sets in Rn. In this example again strong density holds: Clearly, the projections of convex polyhedra are still convex polyhedra, and convex polyhedra on a given domain x can be used to approximate convex sets on the same domain. In fact, convex polyhedra form a subalgebra of the algebra of convex sets. We refer to Section 4.6 for a discussion of such an algebra in its labeled version. The full proof is left to the reader.

Example 4.7 Cofinite Sets: Consider the domain-free information algebra of subsets of Rn. A cofinite set in Rn is the complement of a finite set. Cofinite sets are closed under intersection. Any subset of Rn is the intersection of all cofinite sets it is contained in. This indicates that the cofinite sets of Rn are the finite elements of the domain-free information algebra of subsets of Rn. But strong density does not hold, since projections of cofinite sets to subspaces of Rn are no longer cofinite. So, this algebra is not strongly compact.

We remark that the strong density equation (4.17) implies equation (4.16) because of compactness. So in a compact information algebra, any element is approximated by all finite elements it dominates. However, the converse, namely that equation (4.16) implies strong density, equation (4.17), is not true in general, as the example of cofinite sets above shows. Here is an even more explicit counter-example: Take for Φ the set {0, 1, 2, . . .} ∪ {ω}. Define combination as follows:

φ ⊗ ψ = max(φ, ψ), if φ ≠ ω and ψ ≠ ω, and φ ⊗ ψ = ω otherwise.

For the lattice take D = {0, 1}, where 0 < 1. The focusing operation is defined as follows:

φ⇒1 = φ,
φ⇒0 = 0, if φ ≠ ω, and φ⇒0 = ω otherwise.

As finite elements take the set {0, 1, 2, . . .}. Then equation (4.15) is satisfied, but (4.17) is not.
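As a quick check (not part of the original notes; encoding ω by math.inf is our own device), this little algebra can be simulated directly. Weak density holds at ω, because the naturals below it have supremum ω; strong density fails for x = 0, since the only finite element ψ with ψ = ψ⇒0 is 0, whereas ω⇒0 = ω:

    import math

    OMEGA = math.inf                     # stands in for ω

    def combine(phi, psi):               # φ ⊗ ψ = max(φ, ψ); ω is absorbing
        return max(phi, psi)

    def focus(phi, x):                   # φ⇒1 = φ; φ⇒0 = 0 unless φ = ω
        if x == 1:
            return phi
        return OMEGA if phi == OMEGA else 0

    assert focus(OMEGA, 0) == OMEGA
    # The finite ψ with ψ = ψ⇒0 (checked over an initial segment):
    supported = {psi for psi in range(1000) if focus(psi, 0) == psi}
    assert supported == {0}              # so their supremum is 0, not ω = ω⇒0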

However, in case Φf is also closed under focusing, that is, (Φf, D) is a subalgebra of (Φ, D), weak density does imply strong density. This will be stated and proved later in Lemma 4.9.

The first main result is stated in the next theorem. It is less a theorem of information algebra than one of lattice theory. We refer therefore to (Davey & Priestley, 1990a) for similar results and more details.

Theorem 4.8 Let (Φ, Φf, D) be a compact information algebra. Then:


1. Φ is a complete lattice.

2. An information φ ∈ Φ belongs to Φf if, and only if, for all directed subsets X ⊆ Φ, if φ ≤ ∨X, then there exists a ψ ∈ X such that φ ≤ ψ.

3. An information φ ∈ Φ belongs to Φf if, and only if, for all subsets X ⊆ Φ, if φ ≤ ∨X, then there exists a finite subset Y ⊆ X such that φ ≤ ∨Y.

Proof. (1) Let X be a subset of Φ. We show first that X has a greatest lower bound, an infimum. To obtain the greatest lower bound of X we define Y to be the set of all the finite lower bounds of X, i.e.

Y = {ψ ∈ Φf : ψ ≤ φ for all φ ∈ X}.

It is easy to see that Y is a directed subset of Φf and therefore, by the convergence axiom, the supremum of Y exists in Φ. We claim that ∨Y is the infimum of X. Obviously, ∨Y is a lower bound of X, since every element of X is an upper bound of Y. Let ψ be another lower bound of X; by the density axiom, ψ = ∨Aψ. But Aψ is contained in Y and thus ψ = ∨Aψ ≤ ∨Y. So ∨Y is the greatest lower bound of X.

To prove that each subset X of Φ has a supremum, we consider the set of its upper bounds. The infimum of this set exists, by what we have just proved. With standard methods of lattice theory one now proves that the infimum of the upper bounds of a set X is its supremum (see (Davey & Priestley, 1990a)). So each set X has a supremum. Thus Φ is a complete lattice.

(2) Assume that φ ∈ Φf, X ⊆ Φ is a directed set and φ ≤ ∨X. We define

Y = {ψ ∈ Φf : there exists a χ ∈ X such that ψ ≤ χ}.

Since X is directed, Y is directed too. We claim that ∨Y is an upper bound of X. Let χ be an element of X. Then Aχ is contained in Y and therefore χ = ∨Aχ ≤ ∨Y. So we obtain φ ≤ ∨X ≤ ∨Y and, by the compactness axiom, it follows that there exists a ψ ∈ Y such that φ ≤ ψ. By the definition of the set Y there exists a χ ∈ X such that φ ≤ ψ ≤ χ.

For the converse direction assume that for all directed sets X ⊆ Φ, if φ ≤ ∨X, then there exists a ψ ∈ X such that φ ≤ ψ. We take the directed set Aφ and, since φ = ∨Aφ, there exists a ψ ∈ Aφ such that φ ≤ ψ. Because, by the definition of Aφ, ψ ≤ φ and ψ ∈ Φf, we obtain that φ = ψ and thus φ belongs to Φf.

(3) The third assertion of the theorem follows from (2) by the following observation. Let X be an arbitrary, not necessarily directed, subset of Φ. Define

Z = {∨Y : Y ⊆ X, Y finite}.


Then we claim that Z is directed and ∨X = ∨Z. That Z is directed can be seen as follows: First, e belongs to Z, since e = ∨∅. If Y1 and Y2 are finite subsets of X, then Y1 ∪ Y2 is finite too, and ∨(Y1 ∪ Y2) is an upper bound of ∨Y1 and ∨Y2 in Z. It is obvious that ∨Z ≤ ∨X, since for each subset Y of X, ∨Y ≤ ∨X. But X is contained in Z, since φ = ∨{φ} ∈ Z for all φ ∈ X. Hence we obtain that ∨X ≤ ∨Z, thus ∨Z = ∨X.

Assume then that φ ∈ Φf and φ ≤ ∨X = ∨Z. By (2) there is a ψ ∈ Z such that φ ≤ ψ. But we have ψ = ∨Y for some finite subset Y of X. Conversely, assume that φ ≤ ∨X = ∨Z always implies that there is a finite subset Y of X such that φ ≤ ∨Y. Since ∨Y ∈ Z and Z is directed, it follows from (2) that φ ∈ Φf. □

This theorem says that the finite elements of a compact information algebra are characterized by the partial order in Φ alone. In lattice theory, elements of a complete lattice which satisfy condition (2) of the theorem are called finite and those which satisfy condition (3) are called compact. Thus, the elements of Φf are both "finite" and "compact" in this sense. Furthermore, lattices in which each element is the supremum of the compact elements it dominates are called algebraic. Hence, a compact information algebra Φ is an algebraic lattice. Further, Φ − {z} is an algebraic complete partial order such that φ ∨ ψ ∈ Φ − {z} if φ ⊗ ψ ≠ z. Such a structure is called a domain, or more precisely a Scott-Ershov domain. We refer to (Davey & Priestley, 1990a) for a discussion of these subjects. So, compact information algebras are closely related to domains and share many properties with them.

At this point we prove a first important theorem about the interchangeability of supremum and focusing in the case of strongly compact information algebras:

Theorem 4.9 Let (Φ, Φf, D) be a strongly compact domain-free information algebra. If X is a directed subset of Φ and x ∈ D, then

(∨X)⇒x = ∨φ∈X φ⇒x. (4.18)

Proof. First, note that φ ∈ X implies φ ≤ ∨X, hence φ⇒x ≤ (∨X)⇒x and

∨φ∈X φ⇒x ≤ (∨X)⇒x.

Conversely, by strong density,

(∨X)⇒x = ∨{ψ ∈ Φf : ψ = ψ⇒x ≤ ∨X}.

But, if ψ = ψ⇒x ∈ Φf such that ψ ≤ ∨X, then, since X is directed, there is a φ ∈ X such that ψ ≤ φ (see Theorem 4.8 (2)). For this element φ we


have ψ = ψ⇒x ≤ φ⇒x. Hence

(∨X)⇒x ≤ ∨φ∈X φ⇒x. □

This allows us to show that weak density implies strong density in an important case:

Lemma 4.9 If (Φf, D) is a subalgebra of (Φ, D), weak density implies strong density.

Proof. By Theorem 4.9, join and focusing may be exchanged in this case, because the set {ψ ∈ Φf : ψ ≤ φ} is directed. Therefore, from (4.16) it follows that

φ⇒x = (∨{ψ ∈ Φf : ψ ≤ φ})⇒x
= ∨{ψ⇒x : ψ ∈ Φf, ψ ≤ φ}
= ∨{ψ ∈ Φf : ψ ≤ φ, ψ = ψ⇒x}.

This proves the lemma. □

So, in case (Φf, D) is a subalgebra of (Φ, D), a compact information algebra (Φ, Φf, D) is automatically strongly compact.

4.5 Compact Ideal Completion

Compact information algebras can be obtained from any information algebra (Φ, D) by ideal completion (see Section 4.2). The elements of Φ are then the finite elements of the compact algebra.

Theorem 4.10 Let (Φ, D) be a domain-free information algebra and IΦ the ideals of Φ with the operations of combination and focusing defined by (4.6) and (4.8). If D has a top element, and I : Φ → IΦ is the embedding of Φ in IΦ (see Theorem 4.3), then (IΦ, I(Φ), D) is a strongly compact information algebra with finite elements I(Φ).

Proof. IΦ is an information algebra according to Theorem 4.2. It remains to show that the elements of I(Φ) are its finite elements, satisfying the convergence, density and compactness axioms.

To simplify notation we identify Φ with I(Φ) and write simply φ instead of I(φ). Note then that φ ≤ I if, and only if, φ ∈ I. More generally, we have I1 ≤ I2 if, and only if, I1 ⊆ I2.

Convergence: Suppose X ⊆ Φ to be a directed subset. We claim that I(X) (the ideal generated from X, see Section 4.2) equals ∨X in IΦ. That is,

I(X) = {φ ∈ Φ : φ ≤ φ1 ⊗ · · · ⊗ φm for some finite set of elements φ1, . . . , φm ∈ X}.


So, if φ ∈ X, then φ ∈ I(X), hence φ ≤ I(X). Thus I(X) is an upper bound of X.

Assume that I is another upper bound of X. Then X ⊆ I, hence I(X) ⊆ I, that is, I(X) ≤ I. Thus we have proved that ∨X = I(X) ∈ IΦ.

Density: We have

I⇒x = {ψ ∈ Φ : ψ ≤ φ⇒x for some φ ∈ I}.

We have to prove I⇒x = ∨X = I(X) for the set X = {φ ∈ Φ : φ = φ⇒x ∈ I}. Suppose ψ ∈ I(X), so that

ψ ≤ φ1 ⊗ · · · ⊗ φm = φ1⇒x ⊗ · · · ⊗ φm⇒x = (φ1 ⊗ · · · ⊗ φm)⇒x = φ⇒x, with φ = φ1 ⊗ · · · ⊗ φm ∈ I.

So we have ψ ≤ φ⇒x for some φ ∈ I. Hence ψ ∈ I⇒x and we have shown that I(X) ⊆ I⇒x.

Conversely, assume ψ ∈ I⇒x, that is, ψ ≤ φ⇒x, φ ∈ I. Then φ⇒x ∈ I, since φ⇒x ≤ φ. We conclude that ψ ∈ I(X), since φ⇒x = (φ⇒x)⇒x ∈ I, hence φ⇒x ∈ X. So we have proved that I⇒x = I(X).

Compactness: Let X ⊆ Φ be a directed subset, and φ ≤ ∨X. Define I = ∨X = I(X). Then φ ∈ I(X). But this means that

φ ≤ φ1 ⊗ · · · ⊗ φm, for φ1, . . . , φm ∈ X.

Since X is directed, there is some ψ ∈ X such that

φ1 ⊗ · · · ⊗ φm ≤ ψ.

Hence we have φ ≤ ψ ∈ X. □

We shall see in Part III that compact information algebras can be obtained in other ways.

Let now, conversely, (Φ, Φf, D) be a compact information algebra. Does it hold that Φ is the ideal completion of Φf? For this it is of course necessary that (Φf, D) is itself an information algebra, that is, a subalgebra of (Φ, D). Or, in other words, it is necessary that Φf is closed not only under combination, but also under focusing. This turns out also to be sufficient, and this is then the representation theorem for information algebras, a result which parallels and extends a basic theorem of domain theory.

Theorem 4.11 Representation Theorem for Domain-Free Information Algebras. Let (Φ, Φf, D) be a compact information algebra and (Φf, D) itself an information algebra. Then (Φ, D) is isomorphic to the ideal completion of (Φf, D).


Proof. Let I be an ideal in Φf. Then I is a directed set, and hence ∨I ∈ Φ by convergence in the compact algebra (Φ, Φf, D). Let φ = ∨I and ψ ≤ φ, ψ ∈ Φf. By compactness there is then an η ∈ I with ψ ≤ η, and therefore ψ ∈ I. This shows that I = Aφ. So, there is a one-to-one relation between ideals I in Φf and sets Aφ, and hence elements φ ∈ Φ. It is an isomorphism since

Aφ⊗ψ = Aφ ⊗ Aψ, Aφ⇒x = (Aφ)⇒x. □

Note that even in a compact information algebra, IΦ ≠ Φ in general, because there may exist ideals I different from Aφ such that nevertheless ∨I = φ holds.

We have seen in Section 4.2 that there is also a concept of ideal completion in labeled information algebras. But so far we have no compactness notion for the labeled information algebras. This gap will be filled in the next section. There it will become clear that for labeled ideal completion a similar compactness result holds.

4.6 Labeled Compact Information Algebra

In a compact information algebra, the approximation of elements by finite ones takes place globally, i.e. domain-free. In practice, however, finiteness and approximation are often considered per domain. In other words, a piece of information φ with domain d(φ) is approximated by finite elements of this domain. The strong density condition (Section 4.4) reflects this idea partially. In this section a corresponding theory based on labeled information algebras is proposed. We start with a labeled information algebra (Φ, D) and assume that for every domain x ∈ D there exists a set of finite elements Φf,x, which is closed under combination and whose elements approximate every element in Φx. More precisely, we require for all x ∈ D, the convergence, density and compactness property of a compact algebra as introduced in Section 4.4:

1. Convergence: If X ⊆ Φf,x is a directed set, then the supremum ∨X over X exists and belongs to Φx.

2. Density: For φ ∈ Φx,

φ = ∨{ψ ∈ Φf,x : ψ ≤ φ}.

3. Compactness: If X ⊆ Φf,x is a directed set, and φ ∈ Φf,x is such that φ ≤ ∨X, then there exists a ψ ∈ X such that φ ≤ ψ.

We extend now Definition 4.6 in the following way:

Page 86: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

86 CHAPTER 4. ORDER OF INFORMATION

Definition 4.7 Labeled Compact Information Algebra A system (Φ,Φf , D),where (Φ, D) is a labeled information algebra,

Φf =⋃x∈D

Φf,x

where the sets Φf,x ⊆ Φx are closed under combination, contain the neu-tral element ex, and satisfy the three conditions of convergence, density andcompactness formulated above for all x ∈ D, is called a labeled compactinformation algebra.

Here are two examples of labeled compact information algebras:

Example 4.8 Cofinite Sets: In Example 4.7 we have considered the com-pact domain-free information algebra of subsets of Rn with cofinite sets of Rn

as finite elements. Let now D be the lattice of finite subsets of ω = 1, 2, . . ..Consider, for any s ∈ D the s-tuples t : s → R. They form the domain Rs.And the subsets of Rs for all s∈ D form a labeled information algebra Φ.The cofinite sets in Rs are the finite elements forming Φf,s of the domainRs. Any subset of Rs is the intersection of all the cofinite sets of the domainit is contained in. Thus, (Φ,Φf , D) with Φf = ∪s∈DΦf,s is an example of acompact labeled information algebra.

Example 4.9 Convex Sets: The domain-free information algebra of convexsets in Rn has been shown to be compact with convex polyhedra as finiteelements in Example 4.6. This may be extended to a labeled algebra withconvex sets in domains Rs for any s ∈ D, where D is the lattice of finitesubsets of ω = 1, 2, . . . (see the previous example). The convex polyhedrain domain Rs are the finite elements of the domain. This is then a compactlabeled information algebra. Note that projections of convex polyhedraremain convex polyhedra, and, further, combinations of convex polyhedrayield again convex polyhedra. Thus, the finite elements form a subalgebra ofthe information algebra of convex sets. This is sufficient for the domain-freealgebra associated with the labeled one to be strongly compact, as we shallsee (Theorem 4.16). This in contrast to the cofinite sets of the previousExample 4.8, where cofinite sets do not form a subalgebra.

It is evident that Theorem 4.8 carries over with respect to every semi-lattice Φx.

Theorem 4.12 Labeled Compact Information Algebra. Let (Φ,Φf , D)be a labeled compact information algebra. Then, for all x ∈ D:

1. Φx is a complete lattice.

Page 87: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

4.6. LABELED COMPACT INFORMATION ALGEBRA 87

2. An information φ ∈ Φx belongs to Φf,x if, and only if, for all directedsubsets X ⊆ Φx, if φ ≤ ∨X, then there exists a ψ ∈ X such thatφ ≤ ψ.

3. An information φ ∈ Φx belongs to Φf,x if, and only if, for all subsetsX ⊆ Φx, if φ ≤ ∨X, then there exists a finite subsets Y ⊆ X such thatφ ≤ ∨Y .

The proof of Theorem 4.8 carries over to this theorem and is thereforenot repeated here. The whole algebra Φ is in general not a complete lattice.If, however, the lattice of D is complete, then so is Φ in a labeled compactinformation algebra:

Theorem 4.13 Let (Φ,Φf , D) be a labeled compact information algebra,where D is a complete lattice. Then Φ is a complete lattice.

Proof. Consider a subset X ⊆ Φ. Let y = ∨ψ∈Xd(ψ). Then we havefor every ψ ∈ X that ψ ≤ ψ↑y, hence ∨ψ∈Xψ↑y is an upper bound of X.On the other hand, if η is an upper bound of X, then ψ ≤ η for all ψ ∈ Ximplies y ≤ d(η). Therefore we conclude that ψ↑y ≤ η for all ψ ∈ X, hence∨ψ∈Xψ↑y ≤ η. Thus ∨ψ∈Xψ↑y is the least upper bound of X, i.e. ∨X existsin Φ and ∨X = ∨ψ∈Xψ↑y. But then the existence of the infimum for any setX ⊆ Φ follows by standard methods of lattice theory: Consider the set Yof all lower bound of X. There is at least one lower bound, namely ez withz = ∧ψ∈Xd(ψ). Now, ∨Y ∈ Φ exists by what we have just proved. Thissupremum ∨Y is a lower bound of X, and and if η is another lower boundof X, then η ∈ Y , hence η ≤ ∨Y . Therefore ∨Y is the infimum of X. Thus∧X = ∨Y exists for any subset X ⊆ Φ. Hence Φ is a complete lattice. ut

In the case of domain-free algebras, ideal completion leads to a compactalgebra. The same is true for ideal completion of a labeled informationalgebra. In particular, Theorem 4.10 carries over to labeled informationalgebras:

Theorem 4.14 Let (Φ, D) be a labeled information algebra and IΦx theideals of Φx, x ∈ D, with the operation of combination and projection definedby (4.9) and (4.10). Let

IΦ =⋃x∈D

IΦx

be the set of all ideals in Φ. Then (IΦ,Φ, D) is a labeled compact informationalgebra with finite elements Φ for x ∈ D.

The proof of this theorem is similar to the proof of Theorem 4.10 and isleft to the reader.

Page 88: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

88 CHAPTER 4. ORDER OF INFORMATION

We are now going to examine the relations between compact labeledand domain-free algebras. Under what conditions is the labeled version of acompact domain-free algebra itself compact, and, vice versa, is the domain-free version of a compact labeled algebra also compact? The next theoremshows that a strongly compact domain-free information algebra induces acompact labeled information algebra. The example of the compact algebraof cofinite sets (Example 4.7) on the other hand indicates that weak densityalone is not sufficient for this.

Theorem 4.15 If (Φ,Φf , D) is a domain-free strongly compact informationalgebra, then its associated labeled information algebra is compact too.

Proof. Let (Ψ, D) with Ψ = (φ, x) : φ ∈ Φ, φ = φ⇒x be the labeled in-formation algebra associated with the the information algebra (Φ, D). WithΨx we denote the subset of all elements (φ, x) of Ψ with domain x. We provefirst a lemma:

Lemma 4.10 For any set X ⊆ Ψx and∨

(ψ,x)∈X(ψ, x) ∈ Ψx∨(ψ,x)∈X

(ψ, x) = (∨

(ψ,x)∈X

ψ, x). (4.19)

Proof. To prove (4.19), define X ′ = ψ : (ψ, x) ∈ X, and φ = ∨X ′.But then, for ψ ∈ X, ψ ≤ φ, we have ψ = ψ⇒x ≤ φ⇒x ≤ φ. So φ⇒x is anupper bound of X, hence φ ≤ φ⇒x and therefore φ = φ⇒x. This shows thatx is a support of φ. Then it follows that (ψ, x) ≤ (φ, x) for all (ψ, x) ∈ X.Let (χ, x) be another upper bound of X. Then ψ ≤ χ for all ψ ∈ X ′, henceφ ≤ χ and therefore (φ, x) is the supremum of X. As we have seen, x is asupport of (φ, x), hence

∨(ψ,x)∈X(ψ, x) ∈ Ψx. This proves (4.19). ut

Now, to prove the theorem, we define

Ψf,x = (ψ, x) : ψ ∈ Φf , ψ = ψ⇒x.

This will be the set of finite elements on domain x. Clearly, Ψf,x is closedunder combination, since Φf is so, and the combination of two elementswith support x has support x too. Also (e, x) ∈ Ψf,x. We have to verify theconditions of convergence, density and compactness.

In order to show convergence let X ⊆ Ψf,x be a directed set. ThenX ′ = ψ : (ψ, x) ∈ X is directed to, and by convergence in the compactalgebra (Φ,Φf , D), ∨X ′ exists and belongs to Φ. Be Lemma 4.10

∨X = (∨η∈X′

η, x) ∈ Ψx.

This proves convergence

Page 89: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

4.6. LABELED COMPACT INFORMATION ALGEBRA 89

To verify density, we select an element (φ, x) ∈ Ψx. Then, again us-ing Lemma 4.10 and the strong density property in the compact algebra(Φ,Φf , D), we obtain

∨(ψ, x) : (ψ, x) ∈ Ψf,x, (ψ, x) ≤ (φ, x)= ∨(ψ, x) : ψ ∈ Φf , ψ = ψ⇒x ≤ φ= (∨ψ : ψ ∈ Φf , ψ = ψ⇒x ≤ φ, x)= (φ, x).

This proves density.For compactness consider a directed set X ⊆ Ψf,x and define X ′ ⊆ Φf

as above. Let (φ, x) ∈ Ψf,x with (φ, x) ≤ ∨X = (∨X ′, x), hence φ ≤ ∨X ′.By compactness in the algebra (Φ,Φf , D) there is a ψ ∈ X ′ with φ ≤ ψ.But then (φ, x) ≤ (ψ, x) ∈ X. This proves compactness.

We have thus shown that (Ψ,Ψf,x, D), the labeled information algebraassociated with (Φ,Φf , D) is indeed labeled compact. ut

As the example of labeled cofinite sets shows (Example 4.8), the domain-free version of a labeled compact information algebra is not necessarily itselfcompact. On the other hand, the example of the labeled algebra of convexsets (Example 4.9) indicates that the domain-free version may be compacttoo, if the finite elements (the convex polyhedra in the Example) form asubalgebra. This holds in general, as the following theorem proves.

Theorem 4.16 Let (Φ,Φf , D) be a labeled compact information algebra.Then, if (Φf , D) is a subalgebra of (Φ, D), and D has a top element >, theassociated domain-free information algebra (Φ/σ,D) is a strongly compactinformation algebra with Φf/σ as finite elements.

Proof. We remark first, that Φf/σ is closed under combination. Herewe use the assumption that Φf , as a subalgebra of Φ is closed under combi-nation. Hence, if [φ]σ, [φ]σ ∈ Φf/σ, that is φ, ψ ∈ Φf , therefore φ⊗ ψ ∈ Φf

and thus finally [φ]σ ⊗ [φ]σ = [φ ⊗ ψ]σ ∈ Φf/σ. It remains to show thatconvergence, density and compactness hold with respect to the elements ofΦf/σ).

Consider then a directed set X ⊆ Φf/σ. We need to show that ∨X existsin Φ/σ. To prove this, we use the following lemma, which is of independentinterest:

Lemma 4.11 Let (Φ, D) be a labeled information algebra. For all X ⊆ Φ,for which ∨X ∈ Φ exists, it holds that

[∨X]σ = ∨[X]σ,

where [X]σ = [η]σ : η ∈ X.

Page 90: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

90 CHAPTER 4. ORDER OF INFORMATION

Proof.Define φ = ∨X ∈ Φ, such that [φ]σ = [∨X]σ. If d(φ) = x, we claim that

∨X↑x = φ,

where X↑x = η↑x : η ∈ X|. In fact, η ≤ η↑x ≤ φ↑x = φ, hence φ = ∨X ≤∨X↑x ≤ φ↑x = φ. This proves the claim.

Now, η ∈ X implies η ≤ φ, hence [η]σ ≤ [φ]σ and therefore ∨[X]σ ≤ [φ]σ.Assume [χ]σ to be an upper bound of [X]σ. Then η↑x ≤ χ↑x, thus ∨X↑x ≤χ↑x. But then we have φ↑x = (∨X)↑x ≤ χ↑x. This proves that [φ]σ is thesupremum of [∨X]σ. ut

Note now that > is a support of every element in Φ/σ. In particular anyelement [φ]σ ∈ X has a representant φ with d(φ) = >. Let then

X ′ = φ ∈ Φf,> : [φ]σ ∈ X.

Since (Φ,Φf , D) is compact, ∨X ′ exists in Φ. By Lemma 4.11 we obtainthen ∨X = ∨[X ′]σ = [∨X ′]σ ∈ Φ/σ. This proves convergence.

Next consider an element [φ]σ ∈ Φ/σ and select a representative φ ∈ Φ.Then [φ]⇒xσ = [φ→x]σ. By the compactness of (Φ,Φf , D) we have

φ→x = ∨ψ ∈ Φf,x : ψ ≤ φ→x.

By Lemma 4.11 we obtain from this

[φ]⇒xσ = [∨ψ ∈ Φf,x : ψ ≤ φ→x]σ= [ψ]σ : ψ ∈ Φf,x : ψ ≤ φ→x. (4.20)

But we have ψ ∈ Φf,x : ψ ≤ φ→x if, and only if, [ψ]σ ∈ Φf/σ and [ψ]σ =[ψ]⇒xσ ≤ [φ]σ. This follows, since from [ψ]σ = [ψ]⇒xσ = [ψ→x]σ we may selecta representant ψ of [ψ]σ in Φf,x. Therefore, from (4.20) we obtain

[φ]⇒xσ = ∨[ψ]σ ∈ Φf,x : [ψ]σ = [ψ]⇒xσ ≤ [φ]σ.

This proves strong density.Finally consider a directed set X ⊆ Φf/σ and [φ]σ ∈ Φf/σ such that

[φ]σ ≤ ∨X. Select as above X ′ to be the set of representants of elements ofX with domain >. Then, if φ is a representant of [φ]σ with d(φ) = >, wehave that φ ≤ ∨X ′.

Now let [η]σ = ∨X, such that according to Lemma 4.11,

[η]σ = [∨ψ : d(ψ) = >, ψ ∈ X ′.

By the compactness of the algebra (Φ,Φf , D) there is a ψ ∈ X ′ such thatφ ≤ ψ. But then [φ]σ ≤ [ψ]σ and [ψ]σ ∈ X. This shows that compactnessholds also.

ut

Page 91: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

4.7. CONTINUOUS ALGEBRAS 91

The requirement that lattice D has a top element (or alternatively thatit is a complete lattice) seems to be necessary. Otherwise a directed setin Φf/σ has not necessarily a corresponding directed set of representativeswhose supremum exists in Φ.

Labeled compact algebras, although they do not necessarily induce as-sociated compact domain-free algebras, as we have seen in the example ofcofinite sets, have nevertheless interesting properties, as will be discussed inthe following Section 4.7.

4.7 Continuous Algebras

This section is based on a yet unpublished paper by Xuechong Guan 1 andYongmin Li 2 (Guan & Li, 2010). All main results in this section are fromthis paper.

The finiteness or compactness property of finite elements in a compactalgebra, as exhibited in Theorem 4.8 (2) and (3) can be be weakened, andreplaced by a more general approximation concept.

Definition 4.8 Way Below. Let Φ be a partially ordered set. For φ, ψ ∈ Φwe write ψ φ, and say ψ is way below φ if, for any directed set X ∈ Φ,from φ ≤ ∨X it follows that there is a η ∈ X such that ψ ≤ η.

Note that according to this definition, condition (2) of Theorem 4.8 canbe rephrased as φ ∈ Φf if, and only if, φ φ. Here are a few simpleconsequences of this definition

Lemma 4.12 The following hold for φ, ψ ∈ Φ:

1. ψ φ implies ψ ≤ φ.

2. ψ φ and φ ≤ η imply ψ η.

3. χ ≤ ψ and ψ φ imply χ φ.

Proof. (1) Since φ is a directed set with φ ≤ ∨φ, ψ φ impliesψ ≤ φ

(2) If X ⊆ Φ is a directed set and η ≤ ∨X, then φ ≤ ∨X, and hencefrom ψ φ it follows that there is an element χ ∈ X with ψ ≤ χ. Henceψ η.

1College of Mathematics and Information Science, Shaanxi Normal University, Xi’an710062 China and School of Mathematical Science, Xuzhou Normal University, Xuzhou,221116 China

2College of Mathematics and Information Science, and College of Computer Science,both Shaanxi Normal University, , Xi’an 710062 China

Page 92: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

92 CHAPTER 4. ORDER OF INFORMATION

(3) Again, if X ⊆ Φ is a directed set with φ ≤ ∨X, then ψ φ impliesthat there is a η ∈ X such that ψ ≤ η, hence χ ≤ η. This shows that χ φ.

utFrom now on, Φ will be part of an information algebra (Φ, D), either

labeled or domain-free. Then the following holds:

Lemma 4.13 Assume that (Φ, D) is a domain-free information algebra.Then the following holds:

1. ∀φ ∈ Φ, the set ψ : ψ φ is an ideal.

2. ψ φ if, and only if, for all X ⊆ Φ such that ∨X exists and φ ≤ ∨X,there exists a finite subset F ⊆ X such that ψ ≤ ∨F .

Proof. (1) Assume ψ φ and χ ≤ ψ. Then by (3) of Lemma 4.12 itfollows that χ φ. Further suppose ψ1 φ and ψ2 φ. Consider anydirected set X such that φ ≤ ∨X. Then there exist η1 ∈ X and η2 ∈ X suchthat ψ1 ≤ η1 and ψ2 ≤ η2. Since X is directed there exists also an elementη ∈ X such that η1, η2 ≤ η. Therefore we conclude that ψ1⊗ψ2 ≤ η1⊗η2 ≤ η,hence ψ1 ⊗ ψ2 φ. This proves that ψ : ψ φ is an ideal.

(2) Suppose ψ φ. Let X ⊆ Φ be a directed subset of Φ such that ∨Xexists in Φ and φ ≤ ∨X. Let Y be the set of all finite joins of elements ofY . Then we have X ⊆ Y . Also every element of Y is bounded above by∨X. Conversely, if η is an upper bound of Y , then it certainly is also anupper bound of X, hence ∨X ≤ η. So, ∨X is the least upper bound of Y ,or ∨X = ∨Y . Furthermore, Y is directed. So there is an element η ∈ Ysuch that ψ ≤ η. But η = ∨F for some finite subset F ⊆ X.

Conversely, let X be a directed subset such that ∨X exists in Φ andφ ≤ ∨X. Then there is a finite subset F ⊆ X such that ψ ≤ ∨F . Since Xis directed, there is an element η ∈ X such that ∨F ≤ η, hence ψ ≤ η. Soψ φ, completing the other side of the proof. ut

We define now what we mean by continuous information algebras.

Definition 4.9 Domain-Free Continuous Information Algebra. Let(Φ, D) be a domain-free information algebra. If there exists a set B ⊆ Φwhich is closed under combination, and such that it holds that

1. Convergence: If X ⊆ B is directed, then∨X exists in Φ.

2. (Weak) Density: For all φ ∈ Φ

φ = ∨ψ ∈ B : ψ φ, (4.21)

then (Φ, D) is called continuous.If instead of (2) it holds that, for all φ ∈ Φ and x ∈ D,

φ⇒ = ∨ψ ∈ B : ψ = ψ⇒x φ = φ, (4.22)

then (Φ, D) is called strongly continuous.

Page 93: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

4.7. CONTINUOUS ALGEBRAS 93

In the strong version of density, approximation of elements on Φ by elementsway-down is required to be possible on domains which support φ. This issimilar to the case of compact information algebra, where strong densityrequires that elements may be approximated by finite elements of the samedomain.

A similar definition can also be given for labeled information algebra.There it is convenient and natural to consider the way-below relation perdomain x ∈ D. Thus we consider for any x ∈ D the partially order set Φx

of elements with domain x and define in this set the way-below relationx.This leads then to the following definition:

Definition 4.10 Labeled Continuous Information Algebra. Let (Φ, D)be a labeled information algebra. If there exists a set Bx ⊆ Φx for all do-mains x ∈ D which is closed under combination, and such that it holds thatfor all x ∈ D

1. Convergence: If X ⊆ Bx is directed, then∨X exists in Φx.

2. Density: For all φ ∈ Φx

φ = ∨ψ ∈ Bx : ψ x φ, (4.23)

then (Φ, D) is called continuous.

Note that it is not excluded that the basis B of a continuous informationalgebra equals Φ. Here is an example which illustrates this.

Example 4.10 (Guan & Li, 2010). Let Φ = [0, 1] and D = 0, 1. Theoperations of combination and focusing are define as follows:

∀φ, ψ ∈ Φ, φ⊗ ψ = maxφ, ψ,∀φ ∈ Φ, φ⇒1 = φ,

φ⇒0 =φ if φ ∈ [0, 1

2 ],12 if φ ∈ (1

2 , 1].(4.24)

The reader may verify that this defines indeed a domain-free informationalgebra. We have φ ψ if, and only if, φ < ψ or φ = ψ = 0. As a basis wemay take here B = Φ. Then, we claim that

φ⇒x = ∨ψ ∈ Bx : ψ = ψ⇒x x φ. (4.25)

In fact, it is obvious that this identity holds for x = 1. To verify it for x = 0assume first φ ∈ (1

2 , 1]. In this case

∨ψ ∈ Bx : ψ = ψ⇒0 φ = ∨ψ ∈ [0,12

];ψ < φ =12

= φ⇒0.

Page 94: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

94 CHAPTER 4. ORDER OF INFORMATION

Otherwisen, if φ ∈ [0, 12 ], we have

∨ψ ∈ Bx : ψ = ψ⇒0 φ = ∨ψ ∈ [0,12

];ψ < φ = φ = φ⇒0.

Convergence is evident. So (Φ, D) is indeed a strongly continuous informa-tion algebra. It is not compact, because the only finite element, defined byφ φ, is 0.

So, as the example shows, a continuous information algebra, in general,is not compact. But, inversely, a compact information algebra is alwayscontinuous.

Theorem 4.17 A (strongly) continuous domain-free information algebra(Φ, D) is (strongly) compact if, and only if, the set φ ∈ Φ : φ φ is abasis for (φ,D).

Proof. Let first (Φ, D) be continuous with basis φ ∈ Φ : φ φ.With this set as finite elements, (Φ, D) becomes in fact compact: Conver-gence and (strong) density are inherited from the continuity of the algebra.Compactness follows from the definition of the way-below relation φ φ.

Conversely, the relation φ φ defines compactness of elements φ. So(Φ,Φf , D) is compact with Φf = φ ∈ Φ : φ φ, if additionally conver-gence and (strong) density hold. Note that φ ∈ Φf and ψ ≤ φ impliesψ φ (Lemma 4.12 (3)). But then φ ∈ Φ : φ φ is in fact a basis and(Φ, D) is (strongly) continuous: ut

So, any (strongly) compact domain-free information algebra is (strongly)continuous with the finite elements as basis. The converse holds only if,φ ∈ Φ : φ φ is a basis of the continuous algebra. This clarifies therelation between compactness and continuity.

In a compact information algebra, a further property of the way-downrelation can be proved:

Theorem 4.18 If (Φ,Φf , D) is a compact, domain-free information alge-bra, then χ φ implies the existence of an element ψ ∈ Φf such thatχ ≤ ψ ≤ φ.

Proof. Consider the set Aφ = ψ ∈ Φf : ψ ≤ φ which is directed andfor which φ ≤ ∨Aφ. Then χ φ implies that there is a ψ ∈ X such thatχ ≤ ψ, hence χ ≤ ψ ≤ φ. ut

A set of elements S having the property that χ φ implies the existenceof an element ψ ∈ S such that χ ≤ ψ ≤ φ is called separating. Thus, theset of finite elements Φf in a domain-free compact information algebra areseparating.

When we go over to labeled compact information algebras (Φ,Φf,x, D),then Theorem 4.17 applies to the algebras (Φx,Φf,x, D) for all x ∈ D, if the

Page 95: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

4.7. CONTINUOUS ALGEBRAS 95

relation x is defined relative to Φx. In particular, Φf,x is separating inΦx. Further the labeled information algebra (Φ, D) is continuous. So, thesituation is like with domain-free algebras.

Just as a compact information algebra is a complete lattice (Theorem4.8), continuous information algebras are also closely related to completelattices. This is the issue we examine next.

Theorem 4.19 Let (Φ, D) be a domain-free information algebra. The thefollowing is equivalent:

1. (Φ, D) is continuous.

2. Φ is a complete lattice and for all φ ∈ D,

φ = ∨ψ ∈ Φ : ψ φ. (4.26)

Or, alternatively the following is also equivalent:

1. (Φ, D) is strongly continuous.

2. Φ is a complete lattice and for all φ ∈ D and x ∈ D,

φ⇒x = ∨ψ ∈ Φ : ψ = ψ⇒x φ. (4.27)

Proof. We prove first the case of a continuous information algebra.(1) ⇒ (2): Suppose that B is a base for (Φ, D. Consider a set X ⊆ Φ.

We claim that∧X exists. Define Y = ψ ∈ B : ψ ≤ φ for all φ ∈ X.

Since B is closed under combination, Y is a directed set. Hence∨Y exists

and is a lower bound for X. Assume that ψ ∈ Φ is another lower boundof X and define Aψ = η ∈ B : η ψ. Then Aψ ⊆ Y , since η ψimplies η ≤ ψ (Lemma 4.12 (1)). Therefore, ψ =

∨Aψ ≤

∨Y and

∨Y is

the infimum of X,∨Y =

∧X. By a standard result of lattice theory, Φ

is then a complete lattice. Identity (4.26) is simply the density property ofthe continuous algebra (Φ, D).

(2) ⇒ (1): Take B = Φ as a base. Convergence follows then fromcompleteness of the lattice Φ and density is assured by (4.26).

The second part, concerning strongly continuous information algebras isproved in the same way. ut

At this place we are in a position to state and prove a similar theoremfor compact information algebras. The following theorem enforces Theorem4.8:

Theorem 4.20 Let (Φ, D) be a domain-free information algebra. The thefollowing is equivalent:

1. (Φ,Φf , D) is compact.

Page 96: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

96 CHAPTER 4. ORDER OF INFORMATION

2. Φ is a complete lattice and for all φ ∈ D,

φ = ∨ψ ∈ Φf : ψ φ, (4.28)

where Φf = φ ∈ φ : φ φ.

Or, alternatively the following is also equivalent:

1. (Φ,Φf , D) is strongly compact.

2. Φ is a complete lattice and for all φ ∈ D and x ∈ D,

φ⇒x = ∨ψ ∈ Φf : ψ = ψ⇒x φ, (4.29)

where Φf is defined as above.

Proof. (1)⇒ (2): Compact algebras are also continuous. Therefore thisimplication follows from Theorem 4.19.

(2) ⇒ (1): Convergende follows again from completeness of the latticeΦ. If the finite elements Φf are defined as φ ∈ Φ : φ φ, then densityor strong density follow from (4.28) or (4.29) respectively. Compactness isa consequence of the way-below relation φ φ. ut

The last two theorems state in other words, that a domain-free infor-mation algebra (Φ, D) is a continuous or compact information algebra if,and only if, Φ is a continuous or algebraic lattice! For labeled informationalgebras, analogous theorems apply for the domains Φx for all x ∈ D. Weleave it to the reader to write out the corresponding statements.

Next we examine the domain-free algebra (Φ/σ) associated with thelabeled compact information algebra (Φ,Φf,x, D). In general this algebra isnot compact, as the example of cofinite sets indicates. But it turns out thatit is at least continuous, as the following discussion shows.

As usual, [φ]σ denotes the equivalence class of elements of Φ representingthe same (domain-free) information, that is the elements of Φ/σ.

Using Lemma 4.11, the following lemma can be proved.

Lemma 4.14 If [ψ]σ [φ]σ in Φ/σ and d(ψ) = d(φ) = x, then ψ x φ.Conversely, if D has a top element >, then ψ > φ implies [ψ]σ [φ]σ.

Proof. Assume X ⊆ Φx directed and φ ≤ ∨X. Consider then [φ]σ ≤∨[X]σ with [X]σ defined as in Lemma 4.11. The set [X]σ is directed, there-fore [ψ]σ [φ]σ implies that there is a η ∈ X such that [ψ]σ ≤ [η]σ, henceψ ≤ η. This proves that ψ x φ.

For the second part, assume ψ > φ and consider a directed set X inΦ/σ. We may take as representants of the elements [η]σ of X elements inΦ>. Let then X ′ = η ∈ Φ> : [η]σ ∈ X. X ′ is then still directed. Now, if[φ]σ ≤

∨X and φ is again a representant of [φ]σ in Φ>, then also φ ≤

∨X ′.

Since ψ > φ, there is an element η ∈ X ′ such that ψ ≤ η. But then[η]σ ∈ X and [ψ]σ ≤ [η]σ. This shows that [ψ]σ [φ]σ. ut

This permits to prove the main theorem.

Page 97: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

4.7. CONTINUOUS ALGEBRAS 97

Theorem 4.21 If (Φ, D) is a continuous labeled information algebra, andlattice D has a top element >, then (Φ/σ,D) is strongly continuous.

Conversely, if the continuous domain-free information algebra (Φ, D) isstrongly continuous, then the associated labeled information algebra (Ψ, D)is continuous.

Proof. We assume first that (Φ, D) is a continuous labeled informationalgebra, and lattice D has a top element >. We are going to show that Φ/σis complete. Consider any subset X ⊆ Φ/σ. Since D has a top element >,we may for all elements [φ]σ of X take a representant φ ∈ Φ>. Then letX ′ = φ ∈ Φ> : [φ]σ ∈ X. Since(Φ, D) is a continuous, Φ> is a completelattice (by the labeled version of Theorem 4.19), hence

∨X ′ exists. But by

Lemma 4.11 we have [∨X ′]σ =

∨X, hence by standard results of lattice

theory Φ/σ is indeed a complete lattice.Next we show that strong density (4.11) holds in Φ/σ. Fix a domain

x ∈ D and consider [φ]⇒xσ . We may take a representant φ′ in Φx for thisequivalence class. Also in the set [ψ]σ ∈ Φ/σ : [ψ]σ = [ψ]⇒xσ [φ]σ wemay take representants ψ ∈ Φx. Further, we have that [ψ]σ = [ψ]⇒xσ [φ]if, and only if, [ψ]σ = [ψ]⇒xσ [φ]σ. Therefore we obtain form the continuityof Φx, and by Lemma 4.14

φ′ = ∨ψ ∈ Φx : ψ x φ′.

Applying Lemma 4.11, and again Lemma 4.14, this in turn gives us

[φ]⇒xσ = [∨ψ ∈ Φx : ψ x φ′.]σ

= ∨[ψ]σ ∈ Φ/σ : [ψ]σ = [ψ]⇒xσ [φ]⇒xσ .

This is the required strong continuity in the complete lattice Φ/σ. By The-orem 4.19 (Φ/σ,D) is then strongly continuous.

Assume next, that (Φ, D) is a domain-free labeled algebra. By definitionwe have Ψ = (φ, x) : φ ∈ Φ, φ = φ⇒x. Let B be a base in the s-continuousalgebra (Φ, D) and define Bx = (φ, x) : φ ∈ B,φ = φ⇒x. We claim thatBx is a base in Ψx. In fact, if (φ1, x) and (φ2, x) belong to Bx, then, sinceφ1 ⊗ φ2 ∈ B and φ1 ⊗ φ2 = (φ1 ⊗ φ2)⇒x, it follows that (φ1, x) ⊗ (φ2, x) =(φ1 ⊗ φ2, x) belongs to Bx too. Further, let X ⊆ Bx be directed. ThenX ′ = φ : (φ, x) ∈ X is directed too, and

∨X ′ exists in the base B. But

by Lemma 4.10,∨X = (

∨X ′, x), hence we conclude that

∨X ∈ Ψx. Again

by Lemma 4.10 and also Lemma 4.14,∨(ψ, x) ∈ Bx : (ψ, x)x (φ, x)

=∨(ψ, x) : ψ ∈ B,ψ = ψ⇒x, (ψ, x)x (φ, x)

≥∨(ψ, x) : ψ ∈ B,ψ = ψ⇒x φ = φ⇒x

= (∨ψ : ψ ∈ B,ψ = ψ⇒x φ = φ⇒x, x)

= (φ, x).

Page 98: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

98 CHAPTER 4. ORDER OF INFORMATION

Since (φ, x) ≥∨(ψ, x) ∈ Bx : (ψ, x)x (φ, x), we conclude that equality

must hold and this shoes that Ψx is a continuous lattice with base Bx.ut

The case of the domain-free algebra associated with a compact labeledinformation algebra allows also to determine a separative set in the domain-free algebra:

Theorem 4.22 If (Φ,Φf , D) is a compact labeled information algebra, andD has a top element, then [χ]σ [φ]σ implies the existence of an elementψ ∈ Φf,x for some x ∈ D such that [χ]σ ≤ [ψ]σ ≤ [φ]σ.

Proof. Let x be a support of [χ]σ and [φ]σ. Then we may assumed(χ) = d(φ) = x and by Lemma 4.14 it follows that χ x φ. From this,by Theorem 4.17 (3) we conclude that there is an element ψ ∈ Φf,x withχ ≤ ψ ≤ φ. But this implies [χ]σ ≤ [ψ]σ ≤ [φ]σ. ut

This shows that we may consider the elements [ψ]σ with a representantψ ∈ Φf,x for some x ∈ D in some sense as a generalization of finite elementsin a compact domain-free information algebra.

Example 4.11 Cofinite Sets. In this example the equivalence classes ofcofinite sets on some domain x form a separative set in the domain-freeversion of the set algebra. Let’s call such classes cofinite too. Note that inthis example χ ≤ ψ, d(χ) = d(ψ) and ψ cofinite implies that χ is cofinitetoo. Theorem 4.22 says then that [ψ]σ [φ]σ implies that [ψ]σ is cofinite,i.e. all elements way below another one are cofinite.

This completes the picture of continuous and compact information alge-bras and the relations between their labeled and domain-free versions. Thesituation is summarized in the following table

Labeled Domain-freealgebra algebra

compact ← Th. 4.15/4.16 → s-compact↑ ↑

Bx = Φf,x/Th.4.17 B = Φf/Th. 4.17↓ ↓

continuous ← Th. 4.21 → s-continuous

Page 99: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

Chapter 5

Mappings

5.1 Order-Preserving Mappings

Let now (Φ, DΦ) and (Ψ, DΨ) be two information algebras. To simplifynotation we denote combination and focusing in the two algebras Φ andΨ by the same symbols, and the same holds for the order, meet and joinsin the two lattices DΦ and DΨ. It will always be clear from the context,were the operations take place. We study order-preserving or monotonemappings between these two algebras. A mapping o : Φ → Ψ is calledorder-preserving, if

φ ≤ ψ implies o(φ) ≤ o(ψ). (5.1)

Such a mapping can be thought of as a processing which takes an elementof Φ as input and produces an element of Ψ as output. (5.1) says then thatless informative inputs produce less informative outputs.

As an immediate consequence we see that, since φ, ψ ≤ φ ⊗ ψ, we havealso o(φ)⊗o(ψ) ≤ o(φ⊗ψ). If equality holds here, we have a semi-group ho-momorphism. Inversely, a semi-group homomorphism is an order-preservingmapping.

A few simple examples of monotone mappings from (Φ, DΦ) into itselfare as follows:

1. Identity mapping: o(φ) = φ.

2. Constant mapping: o(φ) = ψ.

3. Focusing: o(φ) = φ⇒x

4. Conditioning: o(φ) = ψ ⊗ φ for a fixed ψ ∈ Φ.

In the first and third example we have that o(e) = e and if φ 6= z theno(φ) 6= z. Such mappings are called strict. These properties do not hold

99

Page 100: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

100 CHAPTER 5. MAPPINGS

in the second and fourth example, hence these mappings are not strict, butthey are semi-group homomorphisms.

Furthermore, compositions of monotone mappings are monotone. Thatis, if o1 is an order-preserving mapping from (Φ1, DΦ1) into (Φ2, DΦ2) and o2

is an order-preserving mappings from (Φ2, DΦ2) into (Φ3, DΦ3), then o1 o2

is a monotone mapping from (Φ1, DΦ1) into (Φ3, DΦ3).We denote the set of all monotone mappings from (Φ, DΦ) into (Ψ, DΨ)

by [Φ→ Ψ]. We are going to define in this set an operation of combination.Thus, if o1, o2 ∈ [Φ→ Ψ], then o1 ⊗ o2 is defined by

o1 ⊗ o2(φ) = o1(φ)⊗ o2(φ). (5.2)

If we look at o1 and o2 as two processing procedures, which apply to the sameinput, then they can be combined into one procedure, simply by combiningthe outputs. If φ ≤ ψ, then o1 ⊗ o2(φ) = o1(φ) ⊗ o2(φ) ≤ o1(ψ) ⊗ o2(ψ) =o1 ⊗ o2(ψ), hence o1 ⊗ o2 ∈ [Φ → Ψ]. Clearly, this operation is associative,commutative and the constant mapping defined by o(φ) = e for all φ ∈ Φ isthe neutral element of this operation. Combinations of strict mappings arestrict.

We are next going to define focusing of monotone mappings. Note thatthe cartesian product of two lattices DΦ ×DΨ becomes also a lattice, if wedefine (x, y) ≤ (x′, y′) if, and only if, x ≤ x′ and y ≤ y′. We then have(x, y) ∧ (x′, y′) = (x ∧ x′, y ∧ y′). So we define, with respect to this productlattice, o⇒(x,y) by

o⇒(x,y)(φ) = (o(φ⇒x))⇒y. (5.3)

We remark that φ ≤ ψ implies φ⇒x ≤ ψ⇒x. Hence we conclude thato⇒(x,y)(φ) = (o(φ⇒x))⇒y ≤ (o(ψ⇒x))⇒y = o⇒(x,y)(ψ). This shows thato⇒(x,y) ∈ [Φ→ Ψ]. Note that we write also o⇒y(φ⇒x) for (o(φ⇒x))⇒y.

It turns out that [Φ→ Ψ] together withDΦ×DΨ becomes an informationalgebra with combination and focusing defined as above.

Theorem 5.1 Suppose that DΦ and DΨ have both a top element. Then([Φ → Ψ], DΦ × DΨ), where the operations ⊗ and ⇒ are defined by (5.2)and (5.3), is a domain-free information algebra.

Proof. (1) The semigroup properties hold as noted above.(2) The pair (>,>) is the top-element of the lattice DΦ × DΨ. This

element is a support of every mapping o ∈ [Φ→ Ψ] since > is a support forevery element in Φ and Ψ. Thus the support axiom holds.

(3) The neutral element is the mapping e(φ) = e. Since e⇒(x,y)(φ) =(e(φ⇒x))⇒y = e⇒y = e, the neutrality axiom is satisfied.

(4) The null element is the mapping z(φ) = z and as before we obtainz⇒(x,y)(φ) = z. Thus the nullity axiom holds.

Page 101: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

5.2. CONTINUOUS MAPPINGS 101

(5) In order to verify focusing assume x, x′ ∈ DΦ and y, y′ ∈ DΨ. Then,using the transitivity in the two algebras (Φ, DΦ) and (Ψ, DΨ), we obtain

(o⇒(x,y))⇒(x′,y′)(φ) = (o⇒(x,y)(φ⇒x′))⇒y

′= ((o((φ⇒x

′)⇒x))⇒y)⇒y

= (o(φ⇒x∧x′))⇒y∧y

′= o⇒(x∧x′,y∧y′)(φ)

= o⇒(x,y)∧(x′,y′)(φ).

(6) The combination axiom is verified as follows:

(o⇒(x,y)1 ⊗ o2)⇒(x,y)(φ) = ((o⇒(x,y)

1 ⊗ o2)(φ⇒x))⇒y

= (o⇒(x,y)1 (φ⇒x)⊗ o2(φ⇒x))⇒y

= ((o1(φ⇒x))⇒y ⊗ o2(φ⇒x))⇒y

= (o1(φ⇒x))⇒y ⊗ (o2(φ⇒x))⇒y

= o⇒(x,y)1 (φ)⊗ o⇒(x,y)

2 (φ)

= o⇒(x,y)1 ⊗ o⇒(x,y)

2 (φ).

Here we have used the combination axiom in the algebra (Ψ, DΨ).(7) We have

o(φ) ≤ o(φ)⊗ o⇒(x,y)(φ)= o(φ)⊗ (o(φ⇒x))⇒y.

But φ⇒x ≤ φ implies that o(φ⇒x) ≤ o(φ). Thus we have by the idempotencyaxiom in Ψ

o(φ)⊗ o⇒(x,y) ≤ o(φ)⊗ (o(φ))⇒y = o(φ).

Hence we obtain that

o(φ) ≤ o(φ)⊗ o⇒(x,y)(φ) ≤ o(φ).

This proves the idempotency axiom. utNot surprisingly, mappings which describe some form of information pro-

cessing represent a form of information and indeed they form also an infor-mation algebra.

5.2 Continuous Mappings

We turn now to continuos and also compact, domain-free information alge-bras (Φ, DΦ) and (Ψ, DΨ). In this context we consider continuous mappings,that is mappings, which maintain limits.

Page 102: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

102 CHAPTER 5. MAPPINGS

Definition 5.1 Continuous Mappings. A mapping f : Φ → Ψ froma continuous domain-free information algebra (Φ, D) into another continu-ous domain-free information algebra (Ψ, E) is called continuous, if for anydirected set X ⊆ Φ, it holds that

f(∨X) =

∨φ∈X

f(φ). (5.4)

We note that a continuous mapping f is order preserving.

Lemma 5.1 A continuos mapping f from a continuous domain-free infor-mation algebra (Φ, D) into another continuous domain-free information al-gebra (Ψ, E) preserves order, that is, f ∈ [Φ→ Ψ].

Proof. Assume φ1 ≤ φ2 ∈ Φ and let B be the basis in Φ. Then, by thecontinuity of f ,

f(φ1) = f(∨ψ ∈ B,ψ φ1)= ∨f(ψ) : ψ ∈ B,ψ φ1≤ ∨f(ψ) : ψ ∈ B,ψ φ2= f(∨ψ ∈ B,ψ φ2)= f(φ2).

utThe following equivalent definitions of continuity are well known in lat-

tice theory.

Lemma 5.2 Let f : Φ → Ψ be an order preserving mapping from a con-tinuous, domain-free information algebra (Φ, D) with basis BΦ into anothercontinuous, domain-free information algebra (Ψ, E) with basis BΨ. Then,the following conditions are equivalent:

1. f is continuous.

2. f(φ) = ∨f(ψ) : ψ ∈ BΦ, ψ φ.

3. Af(φ) ⊆ ↓f(Aφ), where Af(φ) = ψ ∈ BΨ : ψ f(φ) and Aφ = ψ ∈BΦ : ψ φ

Proof. (1) ⇒ (2): The set ψ : ψ ∈ BΦ, ψ φ is directed, andφ = ∨ψ : ψ ∈ BΦ, ψ φ. By the continuity of f ,

f(φ) = ∨f(ψ) : ψ ∈ BΦ, ψ φ.

(2) ⇒ (3): Let ψ′ ∈ Af(φ), that is, ψ′ ∈ BΨ and ψ′ f(φ). Then wehave, by (2),

ψ′ ≤ f(φ) = ∨f(ψ) : ψ ∈ BΦ, ψ φ.

Page 103: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

5.2. CONTINUOUS MAPPINGS 103

The set f(ψ) : ψ ∈ BΦ, ψ φ is directed. By the definition of the way-below relation there is an element ψ ∈ BΦ, ψ φ, such that ψ′ ≤ f(ψ).But this means that ψ′ ∈ ↓f(Aφ).

(3) ⇒ (1): Let X ∈ Φ be a directed set and define φ =∨X. Then φ

belongs to Φ, since the continuity of (Φ, D) implies that Φ is a completelattice (Theorem 4.19). Let ψ′ ∈ Af(φ). Then, by (3), there is a ψ ∈ Aφsuch that ψ′ ≤ f(ψ). We have ψ ∈ BΦ and ψ φ, hence ψ ≤ φ =

∨X.

There is therefore a η ∈ X such that ψ ≤ η. Thus we finally conclude thatψ′ ≤ f(ψ) ≤ f(η) ≤

∨f(X), since f is order-preserving. Then

∨f(X) is

an upper bound of Af(φ), hence

f(φ) =∨Af(φ) ≤

∨f(X).

But, on the other hand, from ψ ∈ X, it follows that ψ ≤ φ, hence f(ψ) ≤f(φ), since f is order-preserving, and therefore∨

f(X) ≤ f(φ),

such that finally f(φ) = f(∨X) =

∨f(X) which means that f is continu-

ous. utTheorem 4.9 stated that in a strongly compact information algebra fo-

cussing is continuous. The next theorem enforces this statement for contin-uous algebras

Theorem 5.2 If (Φ, D) is a continuous, domain-free information algebra,then it is strongly continuous if, and only if, for any directed subset X ⊆ Φand for all x ∈ D

(∨X)⇒x =

∨φ∈X

φ⇒x. (5.5)

Proof. Assume first that (Φ, D) is strongly continuous. From φ ≥ φ⇒x

it follows that ∨X ≥

∨φ∈X

φ⇒x.

Therefore it holds also that

(∨X)⇒x ≥ (

∨φ∈X

φ⇒x)⇒x.

We claim that x is a support for the right hand side of the inequality. Infact, let η =

∨φ∈X φ

⇒x. Then η ≥ φ⇒x implies η⇒x ≥ (φ⇒x)⇒x = φ⇒x.From this we see that η⇒x ≥

∨φ∈X φ

⇒x = η, hence η⇒x = η. So, we have

(∨X)⇒x ≥

∨φ∈X

φ⇒x.

Page 104: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

104 CHAPTER 5. MAPPINGS

From strong density we have

(∨X)⇒x =

∨ψ ∈ B : ψ = ψ⇒x

∨X.

Since X is directed, ψ ∨X implies that there is a φ ∈ X such that ψ ≤ φ.

But then ψ = ψ⇒x ≤ φ⇒x. This in turn shows that

(∨X)⇒x ≤

∨φ∈X

φ⇒x,

hence (∨X)⇒x =

∨φ∈X φ

⇒x, and identity (5.5) holds.Conversely, assume now that identity (5.5) holds and that algebra (Φ, D)

is continuous. In this case, according to Theorem 4.19,

φ = ∨ψ ∈ Φ : ψ φ.

Then, by (5.5)

φ⇒x = ∨ψ⇒x : ψ ∈ Φ : ψ φ.

Since (ψ⇒x)⇒x = ψ⇒x and ψ⇒x ≤ ψ φ implies ψ⇒x φ we concludethat

φ⇒x = ∨ψ′ : ψ′ = ψ′⇒x φ.

Furthermore, Φ is a complete lattice since (Φ, D) is continuous. But thenby Theorem 4.19 is strongly continuous. ut

Let [Φ → Ψ]c denote the set of continuous mappings from (Φ, D) into(Ψ, E). Between continuous mappings the operations of combination arefocusing are defined as above for monotone mappings. We show next, thatthese operations again yield continuous mappings. This means then that([Φ → Ψ]c, D × E) is a subalgebra of the algebra ([Φ → Ψ], D × E) ofmonotone mappings.

Theorem 5.3 If (Φ, D) and (Ψ, E) are strongly continuous, then, if f, g ∈[Φ→ Ψ]c, x ∈ D, y ∈ E then also f⊗g ∈ [Φ→ Ψ]c, and f⇒(x,y) ∈ [Φ→ Ψ]c.

Proof. First, we consider combination. Let X ⊆ Φ be a directed subset.Then using continuity of f and g and associativity of join, we obtain

(f ⊗ g)(∨X) = f(

∨X)⊗ g(

∨X)

=

∨φ∈X

f(φ)

⊗∨φ∈X

g(φ)

=

∨φ∈X

(f(φ)⊗ g(φ))

=∨φ∈X

(f ⊗ g)(φ).

Page 105: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

5.2. CONTINUOUS MAPPINGS 105

This shows that the mapping f ⊗ g is continuous, f ⊗ g ∈ [Φ→ Ψ]c.Next we examine focusing. Let X as before be a directed subset of Φ.

Let x ∈ D and y ∈ E. If X is directed, then so is f(φ)⇒x : φ ∈ X.Therefore, again using continuity of f , and also Theorem 5.2

f⇒(x,y)(∨X) =

(f(∨X)⇒(x)

)⇒(y)

=

∨φ∈X

f(φ⇒x)

⇒(y)

=∨φ∈X

(f(φ⇒x))⇒(y)

=∨φ∈X

f⇒(x,y)(φ).

Thus, the mapping f⇒(x,y) is continuous, f⇒(x,y) ∈ [Φ→ Ψ]c.ut

Since, by this last theorem, [Φ → Ψ]c is closed under the operationsof combination and focusing, if (Φ, D) and (Ψ, E) are strongly continuous,and the neutral mapping is continuous, [Φ → Ψ]c must be an informationalgebra, namely a subalgebra of the information algebra [Φ → Ψ] of order-preserving mappings. We are now going to show that it is a continuousinformation algebra.

Theorem 5.4 If (Φ, D) and (Ψ, E) are strongly continuous, domain-freeinformation algebras, then ([Φ → Ψ]c, D × E) is a continuous domain-freeinformation algebra.

Proof. We show that [Φ → Ψ]c is a continuous lattice in the orderinduced by the combination operation. Then, since ([Φ → Ψ]c, D × E) is adomain-free information algebra, as noted above, the theorem follows as aconsequence of Theorem 4.19.

More precisely, according to (Gierz, 2003) 1 [φ → ψ]c is a continuouslattice under point-wise order, i.e. f ≤p g if f(φ) ≤ g(φ) for all φ ∈ Φ. Butthis is also the order induced by combination. In fact, f ≤comb g if f⊗g = gor f(φ) ⊗ g(φ) = g(φ) for all φ ∈ Φ. But this means exactly f(φ) ≤ g(φ)for all φ ∈ Φ. So the two orders coincide and we may drop the index andsimply write f ≤ g.

Thus [Φ→ Ψ]c is a complete lattice and it holds that for all f ∈ [φ→ ψ]cthat

f = ∨g ∈ [φ→ ψ]c : g f.1see Exercise I.-2.22

Page 106: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

106 CHAPTER 5. MAPPINGS

Therefore, from Theorem 4.19 it follows that ([Φ→ Ψ]c, D×E) is a contin-uous, domain-free information algebra. ut

This result leaves two questions open: Is ([Φ → Ψ]c, D × E) stronglycontinuous or not and how can the way-below relation be characterized in([Φ → Ψ]c in terms of the relations in (Φ, D) and (Ψ, E). In the case ofstrongly compact information algebras these questions are answered in thenext section.

5.3 Continuous Mappings Between Compact Al-gebras

In this section we assume that (Φ,Φf , D) and (Ψ,Ψf , E) are two stronglycompact, domain-free information algebras. According to Theorem 4.17both algebras are then strongly continuous. Therefore, ([Φ → Ψ]c, D × E)is a continuous information algebra, according to Theorem 5.4. In fact, inthis section we show that it is a strongly compact information algebra.

The first question is, what are the finite elements of this algebra? LetY be a finite subset of finite elements from Φf . A mapping f : Y → Ψf iscalled a simple function. Let S be the set of all simple functions. For s ∈ Sand φ ∈ Φ we define Y (s) to be the domain of s and

s(φ) = ∨s(ψ) : ψ ∈ Y (s), ψ ≤ φ.

We have s(φ) = e, if the set on the right hand side of this definition aboveis empty.

Lemma 5.3 Let s ∈ S. Then

1. s : Φ→ Ψf .

2. s is continuous.

Proof. (1) Let X = s(ψ) : ψ ∈ Y (s), ψ ≤ φ. Then X ⊆ Ψf is a finiteset. Hence s(φ) = ∨X ∈ Ψf .

(2) Assume φ1 ≤ φ2. Then we have that

s(ψ) : ψ ∈ Y (s), ψ ≤ φ1 ⊆ s(ψ) : ψ ∈ Y (s), ψ ≤ φ2.

Hence s(φ1) ≤ s(φ2) which shows that s is an order-preserving mapping.Let now X ⊆ Φ be a directed set. Then s(φ) ≤ s(∨X) if φ ∈ X,

hence ∨φ∈X s(φ) ≤ s(∨X). We claim that the inverse inequality s(∨X) ≤∨φ∈X s(φ) holds also.

Let ψ ∈ Y (s), ψ ≤ ∨X. Note that ψ ∈ Φf . Since X is directed, there isa φ ∈ X such that ψ ≤ φ. Thus we have that s(ψ) ≤ s(ψ) ≤ s(φ), so

s(∨X) = ∨s(ψ) : ψ ∈ Y (s), ψ ≤ ∨X ≤∨φ∈X

s(φ).

Page 107: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

5.3. CONTINUOUS MAPPINGS BETWEEN COMPACT ALGEBRAS107

This shows that ∨φ∈X s(φ) = s(∨X), which, by definition means that s iscontinuous. ut

Let S = s : s ∈ S. We have shown that S ⊆ [Φ→ Ψ]c. We claim thatS are the finite elements of [Φ→ Ψ]c.

We show first that S is closed under combination. Let s1 and s2 be twosimple mappings and Y1 = Y (s1), Y2 = Y (s2). Define s : Y1 ∪ Y2 → Ψf

by ψ 7→ s1(ψ) ∨ s2(ψ). Clearly, s is a simple mapping. We claim thats = s1 ⊗ s2. In fact,

s(φ) = ∨s1(ψ) ∨ s2(ψ) : ψ ∈ Y1 ∪ Y2, ψ ≤ φ= ∨(∨s1(ψ1) : ψ1 ∈ Y1, ψ1 ≤ ψ)∨ (∨s2(ψ2) : ψ2 ∈ Y2, ψ2 ≤ ψ) : ψ ∈ Y1 ∪ Y2, ψ ≤ φ

= (∨s1(ψ1) : ψ1 ∈ Y1, ψ1 ≤ ψ ∈ Y1 ∪ Y2, ψ ≤ φ)∨ (∨s2(ψ2) : ψ2 ∈ Y2, ψ2 ≤ ψ ∈ Y1 ∪ Y2, ψ ≤ φ)

= (∨s1(ψ1) : ψ1 ∈ Y1, ψ1 ≤ φ) ∨ (∨s2(ψ2) : ψ2 ∈ Y2, ψ2 ≤ φ)= s1(φ) ∨ s2(φ). (5.6)

The vacuous mapping e(φ) = e belongs to S since e = f where Y (f) =e, f(e) = e.

Next we verify the axioms of convergence, density and compactness forS relative to [Φ→ Ψ]c.

Convergence follows from the fact that for all X ⊆ [Φ → Ψ]c we have∨X ∈ [Φ → Ψ]c. In fact, let f = ∨X, that is, f(φ) = ∨g∈Xg(φ). If Y ⊆ Φis directed, then, by the continuity of g and the associative law for thesupremum in the lattice Ψ,

f(∨Y ) =∨g∈X

g(∨Y ) =∨g∈X

∨φ∈Y

g(φ)

=

∨φ∈Y

∨g∈X

g(φ)

=∨φ∈Y

f(φ).

So we conclude that f ∈ [Φ→ Ψ]c.Density: We have to show that

f⇒(x,y) = ∨s : s ∈ S, s = s⇒(x,y) ≤ f.

By definition and the continuity of f ,

f⇒(x,y)(φ) = (f(φ⇒x))⇒y

= (f(∨α : α ∈ Φf , α = α⇒x ≤ φ)⇒y

= (∨f(α) : α ∈ Φf , α = α⇒x ≤ φ)⇒y .

Page 108: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

108 CHAPTER 5. MAPPINGS

This set is directed, hence, by Theorem 4.9,

f⇒(x,y)(φ) = ∨(f(α))⇒y : α ∈ Φf , α = α⇒x ≤ φ.

We are going to show that for α ∈ Φf , α = α⇒x ≤ φ we have

(f(α))⇒y = ∨s(α) : s ∈ S, s = s⇒(x,y) ≤ f. (5.7)

This implies then

f⇒(x,y)(φ) = ∨∨s(α) : s ∈ S, s = s⇒(x,y) ≤ f : α ∈ Φf , α⇒x ≤ φ

= ∨∨s(α) : α ∈ Φf , α⇒x ≤ φ : s ∈ S, s = s⇒(x,y) ≤ f

≤ ∨s(φ) : s ∈ S, s = s⇒(x,y) ≤ f.

This shows that

f⇒(x,y) ≤ ∨s : s ∈ S, s = s⇒(x,y) ≤ f.

Since the inverse inequality clearly holds, we are done.To prove (5.7) note that by the density in Ψ we have

(f(α))⇒y = ∨β ∈ Ψf : β = β⇒y ≤ f(α).

Consider β ∈ Ψf such that β = β⇒y ≤ f(α). Let s : α → Ψf defined bys(α) = β. Then s is a simple mapping in S. And,

s(φ) =β, if α ≤ φ,e, otherwise.

So we have s(φ) ≤ f(φ) for all φ, that is, s ≤ f .It remains to show that s⇒(x,y) = s. Consider γ ∈ Φ. We have to show

that (s(γ⇒x))⇒y = s(γ). If α = α⇒x, then α = α⇒x ≤ γ holds if, and onlyif, α ≤ γ⇒x. We have two cases:

case 1: α ≤ γ. Then

(s(γ⇒x))⇒y = (s(α))⇒y = β⇒y = β = s(γ),

since α ≤ γ⇒x implies s(γ⇒x) = s(α).case 2: α 6≤ γ. In this case

(s(γ⇒x))⇒y = e = s(γ).

So we have indeed s⇒(x,y) = s. This proves that for α = α⇒x ≤ φ,

(f(α))⇒y ≤ ∨s(α) : s ∈ S, s = s⇒(x,y) ≤ f.

The inverse inequality is obvious. This proves (5.7).Compactness: Let X ⊆ [Φ → Ψ]c be a directed set and s ≤ ∨X. If

ψ ∈ Y (s), then s(ψ) ∈ Ψf and s(ψ) = s(ψ) ≤ (∨X)(ψ) = ∨f∈Xf(ψ). Theset of f(ψ) for f ∈ X is directed in Φ. Therefore, there is a f ∈ X such thats(ψ) ≤ f(ψ). Since Y (s) is finite and X directed, there is a g ∈ X such thats(ψ) ≤ g(ψ) for all ψ ∈ Y . Hence, s ≤ g ∈ X.

Thus, we have proved the following theorem:

Page 109: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

5.3. CONTINUOUS MAPPINGS BETWEEN COMPACT ALGEBRAS109

Theorem 5.5 If (Φ,Φf , D) and (Ψ,Ψf , E) are two strongly compact, domain-free information algebras, then ([Φ → Ψ]c, S,D × E) is a strongly compactinformation algebra with S as finite elements.

Page 110: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

110 CHAPTER 5. MAPPINGS

Page 111: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

Chapter 6

Boolean Information Algebra

6.1 Domain-Free Boolean Information Algebra

We have seen that in a domain-free information algebra (Φ, D) the set Φ is asemilattice. And we have already seen, that it can even be a Boolean algebra(see e.g. example 3.10). This case is of some interest and will therefore bestudied in this Chapter.

Definition 6.1 Boolean Information Algebras: A domain-free infor-mation algebra (Φ, D) is called Boolean, if

1. Φ is a distributive lattice. That is, besides the supremum (combina-tion) of two elements φ and ψ, the infimum φ ∧ ψ exists too, and thedistributive laws hold:

φ ∧ (ψ ∨ η) = (φ ∧ ψ) ∨ (φ ∧ η),φ ∨ (ψ ∧ η) = (φ ∨ ψ) ∧ (φ ∨ η).

2. For all φ ∈ Φ exists an element φc, called the complement of φ suchthat

φ ∧ φc = e, φ ∨ φc = z.

This means essentially, that Φ is a Boolean lattice, see (Davey & Priestley,1990a).

The following results are well known for Boolean lattices (Davey &Priestley, 1990a):

1. φ ≤ ψ if, and only if, φ ∧ ψ = φ.

2. Every φ has a unique complement.

3. (φc)c = φ.

111

Page 112: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

112 CHAPTER 6. BOOLEAN INFORMATION ALGEBRA

4. ec = z and zc = e.

5. φ ≤ ψ if, and only if, φc ≥ ψc.

6. φ ∨ ψc = z if, and only if, φ ≥ ψ.

7. De Morgan laws:

(φ ∧ ψ)c = φc ∨ ψc,(φ ∨ ψ)c = φc ∧ ψc.

Here follows first two results on the interplay of meet and focusing inBoolean information algebras:

Lemma 6.1

1. ∀φ ∈ Φ and ∀x ∈ D, φ ∧ φ⇒x = φ⇒x.

2. If ∧i∈Iφi exists, then so does ∧i∈Iφ⇒xi and

∧i∈Iφ⇒xi = (∧i∈Iφi)⇒x.

In particular, ∀φ, ψ ∈ Φ and x ∈ D,

(φ ∧ ψ)⇒x = φ⇒x ∧ ψ⇒x.

Proof. (1) Since φ ≥ φ⇒x, φ⇒x is a lower bound of φ and φ⇒x. Butφ⇒x ≥ φ ∧ φ⇒x hence φ ∧ φ⇒x = φ⇒x.

(2) Let ψ = ∧i∈Iφi. Then ψ⇒x⊗(ψ⇒x)c = z. Lemma 3.8 (2) implies thenthat ψ⊗(ψ⇒x)c ⇒x = z. From φi ≥ ψ it follows then that φi⊗(ψ⇒x)c ⇒x = zfor all i ∈ I, hence (Lemma 3.8 (2)) φ⇒xi ⊗ (ψ⇒x) = z. This means thatφ⇒xi ≥ ψ⇒x. So, ψ⇒x is a lower bound of the φ⇒xi .

Let t be another lower bound of φ⇒xi . Then φ⇒xi ⊗tc = z, hence (Lemma3.8 (2)) φi ⊗ tc ⇒x = z. This implies also ψ ⊗ tc ⇒x = z, thus (Lemma 3.8(2)) ψ⇒x ⊗ tc = z. But this implies that ψ⇒x ≥ t. Therefore ψ⇒x is thegreatest lower bound of the φ⇒xi for all i ∈ I. ut

Next we collect some results concerning complementation. We define

φ− ψ = φ⊗ ψc,φ∆ψ = (φ− ψ) ∧ (ψ − φ).

The first operation is called difference, the second one symmetric difference.Note however that usually these two operation are defined dually in Booleanalgebras, with join and meet exchanged, which corresponds also to a dualorder, ≤ changed into ≥ (for a duality theory see below).

Then we have the following results concerning complementation andthese new operations:

Page 113: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

6.1. DOMAIN-FREE BOOLEAN INFORMATION ALGEBRA 113

Lemma 6.2 Let (Φ, D) be a Boolean information algebra. Then, ∀φ, ψ ∈ Φand x, y ∈ D:

1. (φ⇒x)c ⇒x = (φ⇒x)c,

2. φ = φ⇒x if, and only if, φc = φc ⇒x,

3. φ⇒x ∧ φc ⇒x = e.

4. (φ− ψ⇒x)⇒x = φ⇒x − ψ⇒x,

5. (φ− ψ)⇒x ≤ φ⇒x − ψ⇒x,

6. (φ∆ψ)⇒x ≤ φ⇒x∆ψ⇒x,

7. (φ⇒x)c ⇒y ⊗ (φc ⇒y)c ⇒x = z.

Proof. (1) From φ⇒x ⊗ (φ⇒x)c = z we deduce (φ⇒x)⇒x ⊗ (φ⇒x)c =z, hence, by lemma 3.8 (2) φ⇒x ⊗ (φ⇒x)c ⇒x = z. This implies that(φ⇒x)c ⇒x ≥ (φ⇒x)c. The inverse inequality holds always, so we have equal-ity.

(2) Assume φ = φ⇒x. Then (1) just proved implies that φc = (φ⇒x)c =(φ⇒x)c ⇒x = φc ⇒x. The other direction follows by symmetry.

(3) Since φ⇒x ≤ φ and φc ⇒x ≤ φc we have φ⇒x ∧ φc ⇒x ≤ φ ∧ φc = e.Because e is the bottom element we must have equality.

(4) Here we use again (1) above, together with the combination axiom,and obtain

(φ− ψ⇒x)⇒x = (φ⊗ (ψ⇒x)c)⇒x = (φ⊗ (ψ⇒x)c ⇒x)⇒x

= φ⇒x ⊗ (ψ⇒x)c ⇒x = φ⇒x ⊗ (ψ⇒x)c

= φ⇒x − ψ⇒x.

(5) Note that

φ = φ⊗ e = φ⊗ (ψ ∧ ψc) = (φ⊗ ψ) ∧ (φ⊗ ψc).

We have then by lemma 6.1 (2)

φ⇒x = ((φ⊗ ψ) ∧ (φ⊗ ψc))⇒x = (φ⊗ ψ)⇒x ∧ (φ⊗ ψc)⇒x.

This implies φ⇒x ≥ ψ⇒x ∧ (φ− ψ)⇒x. From this we deduce that

φ⇒x ⊗ (psi⇒x)c = φ⇒x − ψ⇒x ≥ z ∧ (φ− ψ)⇒x = (φ− ψ)⇒x.

(6) Using (4) just proved, together with lemma 6.1 (2) we obtain

φ⇒x∆ψ⇒x = (φ⇒x − ψ⇒x) ∧ (ψ⇒x − φ⇒x)≥ (φ− ψ)⇒x ∧ (ψ − φ)⇒x = ((φ− ψ) ∧ (ψ − φ))⇒x

= (φ∆ψ)⇒x.

Page 114: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

114 CHAPTER 6. BOOLEAN INFORMATION ALGEBRA

(7) Using (3) above with De Morgan’s law we obtain (φ⇒x)c⊗(φc ⇒y)c =z. From (1) above it follows then (φ⇒x)c ⇒x⊗(φc ⇒y)c ⇒y = z. Now, lemma3.8 (3) gives (φ⇒x)c ⇒y ⊗ (φc ⇒y)c ⇒x = z ut

The classical example of a Boolean information algebra is the algebra ofsubsets of a cartesian product. We give several examples, which are variantsof this basic system.

Example 6.1 Multivariate Subset System. The subset system introducedin Example 3.9 is a Boolean information algebra. Complements of subsets ofan universe are the usual set-theoretic complements relative to the universe.The infimum of two sets is their union. This is contrary to the usual viewof lattices of subsets systems: There corresponding to the order of set inclu-sion, meet corresponds to intersection and join to union. We however useinformation order, which is dula to set inclusion, i.e. φ ≤ ψ means φ ⊇ ψ interms of set inclusion. This explains the difference.

Example 6.2 Propositional Logic. Classical logic systems as propositionallogic, Example 3.8, and predicate logic, Example 3.11 and Example 6.3below. In Example 3.8 we mentioned that a formula of propositional logicis a domain-free expression of information. In fact, it is somehow morenatural, to derive directly a domain-free algebra associated with propositionlogic: We refer to Example 3.8 for the definition of propositional languageand the interpretation of formulae by truth functions, which are elementsof Dω with D = t, f. The set of models of a propositional formula A isdenoted by r(A) which is a cylindric subset of Dω with a finite dimensionalbase. This means that there is a finite set J of variables such that α ∈ r(A)implies β|α↓J ∈ r(A) for any β ∈ dω, if β|α↓J denotes the truth functionβ, where β(j) is replace by α(i) for j ∈ J . Clearly the cylindric sets withfinite bases form a Boolean information algebra, with respect to the latticeof finite subsets of ω as domains. This is the domain-free algebra associatedwith the propositional logic.

Clearly the following relations hold between language and algebra:

r(A ∧B) = r(A) ∩ r(B),r(A ∨B) = r(A) ∪ r(B),r(¬A) = (r(A))c.

This reflects the Boolean structure of the doamin-free informaiton algebraassociated with propositional logic. Focusing, or rather variable elimination,is expressed by existential quantification (see also Example 6.4 below)

r(∃XA) = r(A)−X .

The partial order in the information algebra of models correspond to logicalconsequence: We have A |= B, if r(A) ⊆ r(B), or in term on information

Page 115: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

6.1. DOMAIN-FREE BOOLEAN INFORMATION ALGEBRA 115

order r(A) ≥ r(B). Of course, different formula A and B may have the sameset of models, express the same information. This is the case if r(A) = r(B),or, in the usual logical notation A |= B and B |= A.

Example 6.3 Many-Sorted Structures. In Example 3.11 it has been shownhow predicate logic is related to a domain-free information algebras, in factan algebra of formulae and one of structures. This is very similar to propo-sitional logic, Example 6.2 above. For applications many sorted predicatelogic is more natural, although this is not really an extension of usual predi-cate logic. Nevertheless workingin the many-sorted framework is somewhatsimpler than reducing it to usual predicate logic.

Example 6.4 Quantifier Algebra. In Section 3.6 we introduced the operation of variable elimination for the case of a domain-free information algebra (Φ, D) related to a finite multivariate structure, i.e. where D is the lattice of domains associated with a set of variables V. Let now Φ be a Boolean algebra, such that (Φ, D) is a domain-free Boolean information algebra relative to a multivariate structure. Then, for any variable X ∈ V we define the mapping ∃X : Φ → Φ by

∃Xφ = φ−X .

This mapping has the following properties:

Lemma 6.3 The following properties hold ∀X ∈ V for the mapping ∃X:

1. ∃Xz = z,

2. ∀φ ∈ Φ, ∃Xφ ≤ φ,

3. ∀φ, ψ ∈ Φ, ∃X(φ ∨ ∃Xψ) = ∃Xφ ∨ ∃Xψ.

Proof. (1) and (3) correspond to properties (4) and (2) of Lemma 3.10. Property (2) above follows from property (3) of Lemma 3.10. □

A mapping from a Boolean algebra Φ into itself satisfying properties (1) to (3) of the lemma above is called an existential quantifier. (In the literature, in general the dual order is considered, so that (1) reads ∃Xe = e, (2) becomes φ ≤ ∃Xφ and (3) becomes ∃X(φ ∧ ∃Xψ) = ∃Xφ ∧ ∃Xψ; but this is no essential difference.) If x ⊆ V is any finite set of variables, then the same properties hold also for the mapping ∃x defined by ∃xφ = φ−x:

Lemma 6.4 The following properties hold ∀x ⊆ V, x finite, for the mapping ∃x:

1. ∃xz = z,


2. ∀φ ∈ Φ, ∃xφ ≤ φ,

3. ∀φ, ψ ∈ Φ, ∃x(φ ∨ ∃xψ) = ∃xφ ∨ ∃xψ.

Proof. These properties follow immediately from Lemma 6.3 above, together with property (1) of Lemma 3.10. □

Thus, ∃x is an existential quantifier too.

A Boolean algebra Φ, together with a set V such that, ∀x ⊆ V, ∃x is an existential quantifier, such that

1. ∃∅φ = φ,

2. ∃(x1 ∪ x2)φ = ∃x1(∃x2φ),

is called a quantifier algebra over V.

Thus, a Boolean information algebra (Φ, D) relative to a multivariate structure D based on a finite set of variables V is a quantifier algebra over V.

Inversely, every quantifier algebra Φ over a set V is equivalent to a Boolean information algebra. In fact, for J ⊆ V define

φ⇒J = ∃(V \ J)φ.

We are going to verify the axioms of a domain-free information algebra. (1) The semigroup axiom holds clearly, since it holds for the join in the Boolean algebra Φ. (2) The set V is a support of every φ ∈ Φ, since φ⇒V = ∃(V \ V)φ = ∃∅φ = φ.

(3) The bottom element e of a Boolean algebra is its neutral element with respect to join. Further, e⇒J = ∃(V \ J)e ≤ e since ∃(V \ J) is an existential quantifier. This shows that e⇒J = e. (4) The top element z is a null element relative to the operation of join. Further z⇒J = ∃(V \ J)z = z by the properties of an existential quantifier.

(5) We have by property 2 above

(φ⇒J)⇒K = ∃(V \ K)(∃(V \ J)φ) = ∃((V \ K) ∪ (V \ J))φ
        = ∃(V \ (J ∩ K))φ = φ⇒J∩K.

For (6) we see that by property 3 of an existential quantifier

(φ⇒J ⊗ ψ)⇒J = ∃(V \ J)(∃(V \ J)φ ⊗ ψ)
            = ∃(V \ J)φ ⊗ ∃(V \ J)ψ = φ⇒J ⊗ ψ⇒J.

(7) finally follows from the property φ⇒J = ∃(V \ J)φ ≤ φ of the existential quantifier. We remark that the equivalence between quantifier algebras and information algebras is not confined to Boolean information algebras, but holds in general for a semilattice Φ. Quantifier algebras are introduced


almost exclusively with respect to Boolean algebras. The reason is of course that then the dual forall-quantifier can be defined: ∀Xφ = (∃Xφc)c. Furthermore, quantifier algebras are reducts of algebras with more structure, such as Halmos algebras or cylindric algebras (Halmos, 1962; Henkin et al., 1971), which are introduced in order to algebraize first order logic, which again explains their underlying Boolean structure. First order logic will be examined with respect to information and information algebra in the examples below.

Example 6.5 A Special Algebra. A particular algebra, which is closely related to predicate logic (and first order logic more generally), is a generalization of Example 6.1 to infinite dimension. Consider a set U and infinite sequences v = v1, v2, . . . of values in U. We denote by Uω the set of all these sequences and by M = P(Uω) the power set of Uω. Now, M is clearly a Boolean algebra. It is also a quantifier algebra, hence a Boolean information algebra, with variable elimination (or existential quantification) defined for any i = 1, 2, . . . as

A−Xi = {v : ∃v′ ∈ A such that vj = v′j ∀j ≠ i}.

This example can be generalized to more general ordinals than ω and extended to cylindric algebras or Halmos algebras (Henkin et al., 1971) (see Example 3.10).

In Boolean lattices there is a duality theory, which can be carried over to Boolean information algebras. If (Φ, D) is a Boolean information algebra, then define the following dual operations of combination ⊗d and focusing ⇒d:

φ ⊗d ψ = (φc ⊗ ψc)c,
φ⇒dx = ((φc)⇒x)c.

We note that, by de Morgan’s law,

φ⊗d ψ = φ ∧ ψ.

This defines a new Boolean information algebra, which is called the dual of the original one.

Theorem 6.1 (Φ, D) with the operations ⊗d and ⇒d is a Boolean information algebra. The mapping φ ↦ φc is an isomorphism between dual Boolean information algebras.

Proof. We note that the mapping φ ↦ φc is one-to-one, since every φ has a unique complement φc. We have the following relations:

Page 118: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

118 CHAPTER 6. BOOLEAN INFORMATION ALGEBRA

1. φc ↦ (φc)c = φ,

2. φ ∨ ψ ↦ (φ ∨ ψ)c = φc ∧ ψc,

3. φ ∧ ψ ↦ (φ ∧ ψ)c = φc ∨ ψc,

4. φ⇒x ↦ (φ⇒x)c = (φc)⇒dx.

This shows that the mapping is an isomorphism relative to the operations of complementation, supremum, infimum and focusing. This proves that (Φ, D) is a Boolean information algebra with respect to the operations ⊗d and ⇒d. □

In the case of subset systems, dual combination corresponds to union. Dual focusing consists of projecting the complement and taking the complement of the result. This illustrates that dual focusing is not very attractive. More on Boolean information algebras, and more specifically on the related cylindric algebras, can be found in (Henkin et al., 1971).
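A small Python sketch (ours; a hypothetical two-variable subset system) makes the duality tangible: combination is intersection, so dual combination, defined as (φc ⊗ ψc)c, is just union:

    from itertools import product

    FRAME = set(product((0, 1), repeat=2))

    def complement(phi):
        return FRAME - phi

    def combine(phi, psi):        # ⊗: combination, here intersection
        return phi & psi

    def combine_dual(phi, psi):   # ⊗d = (φc ⊗ ψc)c
        return complement(combine(complement(phi), complement(psi)))

    phi = {(0, 0), (0, 1)}
    psi = {(0, 1), (1, 1)}
    assert combine_dual(phi, psi) == phi | psi   # dual combination is union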

6.2 Labeled Boolean Information Algebra

It is also possible to define a labeled version of a Boolean information algebra. Whereas domain-free Boolean information algebras are well represented in the literature as reducts of Halmos, polyadic or cylindric algebras, labeled algebras seem not to have found much attention so far. We shall see below that relational algebra in database theory may be considered as a labeled Boolean information algebra. But again this is not the point of view adopted in relational algebra. We shall first define the labeled version of a Boolean information algebra and derive a few elementary results associated with it. We then show that its derived domain-free version is in fact, as is to be expected, a Boolean information algebra. In turn, the labeled algebra derived from the domain-free version turns out to be a labeled Boolean information algebra.

A labeled Boolean information algebra is an information algebra, where the restriction of the algebra to each domain is a Boolean algebra and a certain distributive law is valid also between different domains. The following definition specifies this more precisely.

Definition 6.2 Labeled Boolean Information Algebra. A labeled information algebra (Φ, D) is called a labeled Boolean information algebra, if

1. ∀x ∈ D, the algebra Φx is Boolean.

2. ∀x, y ∈ D and φ, ψ ∈ Φx it holds that

(φ ∧ ψ) ⊗ ey = (φ ⊗ ey) ∧ (ψ ⊗ ey).


Note that we denote the meet in Φx as usual with the operator symbol ∧. We remark that the meet between elements with different domains is not defined so far! As in an ordinary information algebra, combination is the same as supremum or join and will be denoted, depending on the case, by the combination symbol ⊗ or the join symbol ∨. It is defined between arbitrary elements.

We prove first a useful lemma connecting complementation with vacuous extension.

Lemma 6.5 Let (Φ, D) be a labeled Boolean information algebra. Then, ∀x, y ∈ D, x ≤ y and φ ∈ Φx,

(φc)↑y = (φ↑y)c.

Proof. From φ ∨ φc = φ⊗ φc = zx we conclude that

(φ⊗ φc)↑y = φ↑y ⊗ (φc)↑y = zy,

hence φ↑y ∨ (φc)↑y = zy. On the other hand, φ ∧ φc = ex implies, using property (2) in the definition above,

ey = (φ ∧ φc) ⊗ ey = (φ ⊗ ey) ∧ (φc ⊗ ey) = φ↑y ∧ (φc)↑y.

This shows then that (φc)↑y = (φ↑y)c. □

We may now extend the meet operation to arbitrary elements of Φ by the following definition:

φ ∧ ψ = (φc ∨ ψc)c. (6.1)

The meet operation is certainly an extension of the local meet operation within the algebras Φx. The following lemma collects some properties of the meet operation.

Lemma 6.6 Let (Φ, D) be a labeled Boolean information algebra, with meet defined by (6.1). Then the following properties hold:

1. If d(φ) = x and d(ψ) = y, then φ ∧ ψ = φ↑x∨y ∧ ψ↑x∨y.

2. d(φ ∧ ψ) = d(φ) ∨ d(ψ).

3. ∀φ, ψ ∈ Φ and z ∈ D we have (φ ∧ ψ)→z = φ→z ∧ ψ→z.

4. ∀φ, ψ, η ∈ Φ we have (φ ∧ ψ) ∨ η = (φ ∨ η) ∧ (ψ ∨ η) and (φ ∨ ψ) ∧ η = (φ ∧ η) ∨ (ψ ∧ η) (distributive laws).

5. ∀φ, ψ ∈ Φ we have φ ∧ ψ = (φc ∨ ψc)c and φ ∨ ψ = (φc ∧ ψc)c (De Morgan laws).


6. ∀φ ∈ Φ with d(φ) = x, and ∀y ∈ D, we have φ ∧ zy = φ↑x∨y and φ ∧ ey = ex∨y.

Proof. (1) We use Lemma 6.5 and the definition of the meet operation (6.1), as well as (3) of Lemma 3.1,

φ ∧ ψ = (φc ∨ ψc)c = ((φc)↑x∨y ∨ (ψc)↑x∨y)c
      = ((φ↑x∨y)c ∨ (ψ↑x∨y)c)c = φ↑x∨y ∧ ψ↑x∨y.

(2) follows from d(φ ∧ ψ) = d((φc ∨ ψc)c) = d(φc ∨ ψc) = d(φc) ∨ d(ψc) = d(φ) ∨ d(ψ).

(3) Let d(φ) = x and d(ψ) = y. Then, by definition of the transport operation and (1) proved above,

(φ ∧ ψ)→z = (φ↑x∨y ∧ ψ↑x∨y)→z
          = ((φ↑x∨y ∧ ψ↑x∨y)↑x∨y∨z)↓z
          = (φ↑x∨y∨z ∧ ψ↑x∨y∨z)↓z.

The last identity follows from property (2) of a labeled Boolean information algebra. Let η = φ↑x∨y∨z ∧ ψ↑x∨y∨z. Since this is a meet within the Boolean algebra Φx∨y∨z, η is the infimum of φ↑x∨y∨z and ψ↑x∨y∨z. In particular, η ≤ φ↑x∨y∨z and η ≤ ψ↑x∨y∨z. This implies that η↓z ≤ φ→z and η↓z ≤ ψ→z, hence η↓z ≤ φ→z ∧ ψ→z.

Let ν be another lower bound of φ→z and ψ→z. Then we conclude that ν↑x∨y∨z ≤ (φ→z)↑x∨y∨z ≤ φ↑x∨y∨z and, in the same way, ν↑x∨y∨z ≤ ψ↑x∨y∨z. This implies that ν↑x∨y∨z ≤ η. But then (ν↑x∨y∨z)↓z = ν ≤ η↓z. This shows that η↓z is the infimum of φ→z and ψ→z. Since these are both elements of the same Boolean algebra Φz, we conclude that η↓z = φ→z ∧ ψ→z. As η↓z = (φ ∧ ψ)→z, point (3) of the lemma is proved.

(4) Let d(φ) = x, d(ψ) = y and d(η) = z. Then, using (1) proved above and the distributive law within the Boolean algebra Φx∨y∨z, we obtain (writing ⊗ instead of ∨),

(φ ∧ ψ) ⊗ η = (φ↑x∨y ∧ ψ↑x∨y) ⊗ η
            = (φ↑x∨y ∧ ψ↑x∨y)↑x∨y∨z ⊗ η↑x∨y∨z
            = (φ↑x∨y∨z ∧ ψ↑x∨y∨z) ⊗ η↑x∨y∨z
            = (φ↑x∨y∨z ⊗ η↑x∨y∨z) ∧ (ψ↑x∨y∨z ⊗ η↑x∨y∨z)
            = (φ ⊗ η)↑x∨y∨z ∧ (ψ ⊗ η)↑x∨y∨z
            = (φ ⊗ η) ∧ (ψ ⊗ η).

The dual distributive law is proved similarly.

(5) Let d(φ) = x and d(ψ) = y. The first De Morgan law is the definition of the meet operation. The second law follows from De Morgan's law within


the Boolean algebra Φx∨y, Lemma 6.5 and (1) of this lemma proved above:

φ ⊗ ψ = φ↑x∨y ⊗ ψ↑x∨y = ((φ↑x∨y)c ∧ (ψ↑x∨y)c)c
      = ((φc)↑x∨y ∧ (ψc)↑x∨y)c = (φc ∧ ψc)c.

(6) By (1) of this lemma φ ∧ zy = φ↑x∨y ∧ zx∨y = φ↑x∨y, since zx∨y is the top element in the Boolean algebra Φx∨y. The second equation is obtained similarly, noting that ex∨y is the least element in Φx∨y. □

Remark that φ ∧ ψ is not the infimum of φ and ψ, if the two elements do not belong to the same domain. In fact, we have, assuming d(φ) = x and d(ψ) = y,

φ ⊗ (φ ∧ ψ) = φ↑x∨y ⊗ (φ↑x∨y ∧ ψ↑x∨y)
            = φ↑x∨y ∧ (φ↑x∨y ⊗ ψ↑x∨y)
            = φ↑x∨y,

since the distributive law holds and the meet within Φx∨y is the infimum. But in general, φ↑x∨y is not equal to φ. It holds only that φ ≤ φ↑x∨y. Hence φ is not even an upper bound of φ ∧ ψ. This means that Φ is not a Boolean algebra! Another deviation from a Boolean algebra is that, within Φ, complement is not defined relative to absolute zero and one elements, but only relative to the elements ex and zx of the respective domain. In fact, if D does not have a bottom and a top element, there exist no zero and one elements in Φ.

Example 6.6 Relational Algebra. Relational algebra is a model of an information algebra, as we have seen (see Example 3.3 in Section 3.2). This is so if we consider join and projection. But relational algebra contains more operations, like union, intersection, difference, divide, product and restriction. Here we show that this defines essentially a labeled Boolean information algebra.

Note first that intersection is defined only between relations with the same schema (or domain in our language). But then it corresponds to join. The Cartesian product is defined between relations with disjoint schemas or domains. Again then, it corresponds to a join.

Really new operations are union and complementation. Like intersection, union is defined only between relations with the same domain. And it is the classical set-theoretic union

R ∪ S = {f : f ∈ R or f ∈ S},

and d(R ∪ S) = d(R) = d(S). For a relation R with domain x, complementation is the usual set-theoretic complementation with respect to the neutral relation in domain x,

Rc = {f ∈ Ex : f ∉ R}.


Therefore, we see that d(Rc) = d(R). Complementation is usually not considered as an operation of relational algebra. But it serves to define other relational operations, such as difference, which is also defined only between relations on identical domains,

R \ S = R ⋈ Sc = {f : f ∈ R, f ∉ S}.

Again, d(R \ S) = d(R) = d(S). The operation divide is defined for relations R, S and T with d(R) ∩ d(S) = ∅ and d(T) = d(R) ∪ d(S). Let d(R) = x, d(S) = y, x ∩ y = ∅ and d(T) = x ∪ y. Then R divided by S relative to T is defined as

R ÷ S : T = {f ∈ R : ∀g ∈ S, (f, g) ∈ T} = R \ πx((R ⋈ S) \ T).

Union can also be defined using complementation and join,

R ∪ S = (Rc ⋈ Sc)c. (6.2)

Clearly, for any schema or domain, the relations form a Boolean algebra. It remains to verify property (2) of the definition of a labeled Boolean information algebra above. But this means only that the union of the cylindric extensions of two relations on the same domain equals the cylindric extension of the union of the two relations, which clearly holds.

Now, union can be generalized by (6.2) to an operation between arbitrary relations, not necessarily on the same domain. Then we have d(R ∪ S) = d(R) ∪ d(S) by the labeling operations for join and complementation. In this way relational algebra is a reduct of a labeled Boolean information algebra.

We have however so far neglected the relational operation of restriction or selection. For a set of attributes x, σx is a relation between the values of the attributes in x, i.e. σx ⊆ Ex. If the domain of a relation R covers x, then the selection σx applied to R returns all tuples of R which satisfy the relation. This is simply the join

σx ⋈ R.

So selection can be reduced to join too. This completes the relational algebra. Its basic operations are projection, join and (generalized) union.
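The following Python sketch (ours; a hypothetical binary attribute, tuples represented as sets of attribute-value pairs) illustrates how generalized union and difference reduce to join and complementation as described above:

    from itertools import product

    DOMAINS = {"A": (0, 1), "B": (0, 1)}      # assumed attribute frames

    def E(attrs):
        """Neutral relation Ex: all tuples over the attribute set attrs."""
        attrs = tuple(attrs)
        return {frozenset(zip(attrs, vals))
                for vals in product(*(DOMAINS[a] for a in attrs))}

    def join(R, S):
        """Natural join: merge tuples that agree on common attributes."""
        return {r | s for r in R for s in S
                if all(dict(r)[a] == dict(s)[a]
                       for a in dict(r).keys() & dict(s).keys())}

    def complement(R, attrs):
        return E(attrs) - R

    R = {frozenset({("A", 0)})}               # the relation "A = 0"
    S = {frozenset({("A", 1)})}               # the relation "A = 1"
    # generalized union via (6.2): R ∪ S = (Rc ⋈ Sc)c
    assert complement(join(complement(R, "A"), complement(S, "A")), "A") == R | S
    # difference via join and complement: R \ S = R ⋈ Sc
    assert join(R, complement(S, "A")) == R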

Relational algebra is essential to study various important problems of databases: data retrieval or query processing, database updates, defining integrity constraints, deriving database views or snapshots, defining stability requirements and defining security constraints. Relational expressions serve as a high-level representation of the user's intent (Date, 1975a). Most of these problems are general problems of information processing. Thus the abstraction of information algebra serves to unify and generalize analysis and solutions to these problems.


In practical computations it is important that relations remain finite (i.e. contain a finite number of tuples). If the domains of attributes are infinite, then complementation does not maintain finiteness, but all other operations do. This is even true for selection, although the relation σx itself may be infinite. That is fundamentally the reason why derived operations are introduced which are based on complementation, but maintain finiteness of relations.

Example 6.7 File Systems. In Section 3.3 we have introduced tuple systems T over a lattice D, and an abstract relational information algebra (RT, D) over the tuple system T (see Theorem 3.1). A relation R over a domain x ∈ D is a subset of tuples with domain x, and join within x reduces to intersection. Relations over any domain x ∈ D form a Boolean algebra with meet equal to union. Certainly we have also, for two relations R and S over domain x and any y ∈ D,

(R ∪ S) ⋈ Ey = (R ⋈ Ey) ∪ (S ⋈ Ey).

So, the second condition in Definition 6.2 is satisfied too and the algebra (RT, D) is a labeled Boolean information algebra.

As a consequence, if (Φ, D) is an atomic labeled information algebra, then (RAt(Φ), D) is a labeled Boolean information algebra (see Section 4.3). By Corollary 4.1, if the algebra (Φ, D) is atomic composed, then it is embedded (as an information algebra) into a Boolean information algebra, or even isomorphic to one (if it is atomic closed).

From a labeled Boolean information algebra we may derive by the usual procedure the corresponding domain-free algebra. It is to be expected that it will be a Boolean information algebra. The domain-free algebra is the quotient (Φ/σ, D) defined by the congruence φ ≡ ψ (mod σ), if for d(φ) = x and d(ψ) = y, we have φ→y = ψ and ψ→x = φ (see Section 3.5). In the case of a labeled Boolean information algebra, this equivalence is also a congruence relative to meet. In fact, assume φ1 ≡ ψ1 (mod σ) and φ2 ≡ ψ2 (mod σ) and let x1, x2 and y1, y2 be the domains of φ1, φ2 and ψ1, ψ2 respectively. Then

(φ1 ∧ φ2)↑(x1∨x2)∨(y1∨y2) = φ1↑(x1∨x2)∨(y1∨y2) ∧ φ2↑(x1∨x2)∨(y1∨y2)
                          = φ1↑x1∨y1 ∧ φ2↑x2∨y2
                          = ψ1↑x1∨y1 ∧ ψ2↑x2∨y2
                          = ψ1↑(x1∨x2)∨(y1∨y2) ∧ ψ2↑(x1∨x2)∨(y1∨y2)
                          = (ψ1 ∧ ψ2)↑(x1∨x2)∨(y1∨y2).

This shows that φ1 ∧ φ2 ≡ ψ1 ∧ ψ2 (mod σ). We may therefore define [φ]σ ∧ [ψ]σ = [φ ∧ ψ]σ.


In the same way, φ ≡ ψ (mod σ) implies φc ≡ ψc (mod σ). In fact, let d(φ) = x, d(ψ) = y. Then d(φc) = x, d(ψc) = y and (φc)↑x∨y = (φ↑x∨y)c = (ψ↑x∨y)c = (ψc)↑x∨y. Thus we can define ([φ]σ)c = [φc]σ.

It remains to show that these operations in Φ/σ define in fact a Boolean algebra. We clearly have commutativity and associativity of meet (for join this is inherited from the information algebra). The distributive laws hold too, as may easily be demonstrated from the distributive laws of the labeled Boolean information algebra, for instance

([φ]σ ∧ [ψ]σ) ⊗ [η]σ = [(φ ∧ ψ) ⊗ η]σ
                     = [(φ ⊗ η) ∧ (ψ ⊗ η)]σ
                     = ([φ]σ ⊗ [η]σ) ∧ ([ψ]σ ⊗ [η]σ).

The elements [ex]σ and [zx]σ are the zero and one elements of the algebra. In fact, [φ]σ ⊗ [ex]σ = [φ]σ (as already in the information algebra) and [φ]σ ∧ [zx]σ = [φ ∧ zx]σ = [φ]σ (see Lemma 6.6 (6)). Finally, we have

[φ]σ ∨ ([φ]σ)c = [φ ∨ φc]σ = [zx]σ.

and

[φ]σ ∧ ([φ]σ)c = [φ ∧ φc]σ = [ex]σ.

So, ([φ]σ)c is the complement of [φ]σ. These are all the properties needed for a Boolean algebra. Hence (Φ/σ, D) is indeed a domain-free Boolean information algebra.

We may finally derive a labeled version from a domain-free Boolean information algebra (Φ, D). Indeed, let x be a support of φ and y a support of ψ. Then (φ ∧ ψ)⇒x∨y = φ⇒x∨y ∧ ψ⇒x∨y = φ ∧ ψ. And we have seen that x is also a support of φc (Lemma 6.2 (2)). Hence in Φ∗ = {(φ, x) : φ ∈ Φ, φ⇒x = φ} we may define

(φ, x) ∨ (ψ, y) = (φ ∨ ψ, x ∨ y),
(φ, x) ∧ (ψ, y) = (φ ∧ ψ, x ∨ y),
(φ, x)c = (φc, x).

Then Φ∗ is clearly a labeled Boolean information algebra.

In fact, if we start with a labeled Boolean information algebra (Φ, D), then ((Φ/σ)∗, D) is isomorphic to the labeled Boolean information algebra (Φ, D) (see Theorem 3.6). The other way round the same holds: If (Φ, D) is a domain-free Boolean information algebra, then (Φ∗/σ, D) is an isomorphic domain-free Boolean information algebra.


6.3 Representation by Set Algebras

The case of finite Boolean information algebras is both important from a practical point of view and particularly simple. In fact, it is well known that a finite Boolean algebra is isomorphic to the subset algebra of a finite set. In this section we examine what this means for a finite Boolean information algebra. We shall show that these algebras are essentially identical to subset algebras in a system of finite compatible frames (see Example 3.1). So, in some sense, there is not much room for the variety of finite Boolean information algebras.

A labeled information algebra (Φ, D) is called finite if, for all x ∈ D, the set Φx of elements with domain x is finite. If the information algebra (Φ, D) is in addition Boolean, then Φx is a finite Boolean algebra. We consider therefore first a number of well-known results regarding finite Boolean algebras (Davey & Priestley, 1990b). The first result is that such a Boolean information algebra is atomic composed (see Section 4.3).

Theorem 6.2 If (Φ, D) is a finite, labeled Boolean information algebra, then ∀φ ∈ Φ,

φ = ∧At(φ).

Proof. Let x = d(φ). Clearly, there is an atom α ∈ Atx(Φ) such that φ ≤ α, since Φx is finite. Certainly φ is then a lower bound for the set At(φ). Let ψ be any lower bound of At(φ). We claim that ψ ≤ φ. If this were not true, then φ ⊗ ψc < zx. Choose an atom β ∈ At(φ ⊗ ψc). Since φ ≤ φ ⊗ ψc ≤ β we have also β ∈ At(φ). So ψ ≤ β, but ψc ≤ β too. Hence ψ ⊗ ψc = zx ≤ β, which is a contradiction. So ψ ≤ φ and φ is the greatest lower bound of At(φ). □

Since Φx is a Boolean algebra, it follows that for any subset of atoms S ⊆ Atx(Φ) it holds that ∧S ∈ Φx. So any finite Boolean information algebra (Φ, D) is atomic closed. From Theorem 4.7 in Section 4.3, we conclude then that a finite Boolean information algebra is isomorphic to the relational algebra RAt(Φ) with the lattice F = {Atx(Φ) : x ∈ D} as domains. This is also equivalent to a subset algebra in a family of compatible frames, see Example 3.1 in Section 3.1. This is a representation theorem for finite Boolean information algebras, which is an extension of the corresponding representation theorem for finite Boolean algebras as the power set algebras of their atoms.

Next we try to extend this result to a general, not necessarily finite, Boolean algebra. The way to do this is using the ideal completion of the Boolean information algebra (see Section 4.2). So, let (Φ, D) be a labeled Boolean information algebra. Ideal completion has been defined in Section 4.2 both for domain-free and labeled information algebras. Here we consider a labeled Boolean information algebra (Φ, D) and consider its ideal


completion (IΦ, D) (see Section 4.2). Well-known results from the theory of Boolean algebras permit us to conclude that the information algebra (IΦ, D) is atomic. The atoms of IΦ are maximal ideals, a concept specified in the following definition:

Definition 6.3 Maximal Ideal. A proper ideal I ∈ IΦx is called maximal in Φx, if J ∈ IΦx and I ⊆ J implies J = I or J = Φx.

Maximal ideals are proper ideals which are not properly contained in another proper ideal. Now, I ⊆ J is equivalent to I ≤ J in terms of the partial order in the information algebra IΦ. This shows that maximal ideals correspond to atoms in the algebra (IΦ, D). In Boolean algebras maximal ideals have nice properties. Here is a first result (Davey & Priestley, 1990b):

Theorem 6.3 If (Φ, D) is a labeled Boolean information algebra, then the following are equivalent:

1. I is a maximal ideal in Φx.

2. I is a proper ideal, and ∀φ, ψ ∈ Φx, φ ∧ ψ ∈ I implies φ ∈ I or ψ ∈ I.

3. ∀φ ∈ Φx, it is the case that φ ∈ I if, and only if, φc ∉ I.

Proof. This is a theorem of the theory of Boolean algebras (Davey & Priestley, 1990b). We prove first (1) ⇒ (2): Let I be a maximal ideal in Φx, and φ ∧ ψ ∈ I for some φ, ψ ∈ Φx. Assume φ ∉ I. Define Iφ = {η : η ≤ φ ∨ χ for some χ ∈ I}. This is an ideal, φ ∈ Iφ and it contains I. Since I is maximal and I ≠ Iφ, it must be that Iφ = Φx. In particular, we have zx ∈ Iφ. So, zx = φ ∨ χ for some χ ∈ I. Then

I ∋ (φ ∧ ψ) ∨ χ = (φ ∨ χ) ∧ (ψ ∨ χ) = ψ ∨ χ.

Since ψ ≤ ψ ∨ χ we conclude that ψ ∈ I as required.

To prove (2) ⇒ (3) assume I a proper ideal, and note that, for φ ∈ Φx, φ ∧ φc = ex ∈ I. By (2) then either φ ∈ I or φc ∈ I. If both belong to I, then I ∋ φ ∨ φc = zx and I is not proper, contrary to the assumption. So (3) holds.

Finally we prove (3) ⇒ (1): Let J be an ideal properly containing I. Select a φ ∈ J \ I. Then φc ∈ I ⊆ J, so zx = φ ∨ φc ∈ J, hence J = Φx. Therefore, I is maximal. □

Further, we are now in a position to show that a labeled Boolean information algebra (Φ, D) is embedded in a set algebra:

Theorem 6.4 If the labeled information algebra (Φ, D) is Boolean, then it is embedded in the relational algebra (RAt(Φ), D).


Proof. By Theorem 4.3 the mapping φ ↦ I(φ) is an information algebra homomorphism between (Φ, D) and its ideal completion (IΦ, D). Further, according to (1) of Theorem 6.3 above, the information algebra (IΦ, D) is atomic. Therefore, by Theorem 4.6, the mapping I ↦ At(I) is a homomorphism between (IΦ, D) and (RAt(Φ), D). Therefore the mapping defined by φ ↦ At(I(φ)) is an information algebra homomorphism too. Now, by (2) of Theorem 6.3, if φ ≠ ψ, then At(I(φ)) ≠ At(I(ψ)). The mapping is thus one-to-one, hence an embedding. □

We remark that the information-algebra homomorphism considered in the proof above is in fact a Boolean algebra homomorphism for every Boolean algebra Φx. This last theorem shows that a labeled Boolean algebra is essentially equivalent to a certain abstract relational algebra.

For domain-free Boolean information algebras, similar results can be obtained. Also Stone's representation theorem could be extended along these lines to Boolean information algebras. These issues will not be developed here.



Chapter 7

Independence of Information

7.1 Conditional Independence of Domains

The concept of independence relative to information is an important structural issue in the algebraic theory of information. Roughly speaking, independence is thought to capture certain irrelevance relations. For example, an information concerning a certain question may be entirely irrelevant with regard to another question. Or, given a certain information relative to a certain question, new information relative to a second question may be irrelevant relative to a third question, i.e. may not change the information concerning this third question. Such considerations prove not only important from a structural point of view, but are also fundamental for formulating requirements for a measure of information. Furthermore, independence also has important consequences for computational purposes, i.e. for focusing information on specified questions.

In this chapter we define different notions of independence and study their properties. We start in this first section by defining conditional independence of domains or questions. This is a general and basic independence relation. In the following Section 7.2 we use this concept to define a notion of conditional independence of information which is related to a certain factorization of a piece or a class of information. This restricts the structure of the information to be expected in certain circumstances. The issue of factorization will be deepened in Section 7.3 in the context of relational algebra, or more generally, atomic algebras. It will be shown that factorization is closely related to the concept of normal forms and some forms of dependency defined on the level of tuples, as for example discussed in relational database theory.

Let then D be a lattice of domains. Then we define conditional independence of two domains x, y ∈ D given a third domain z ∈ D as follows:

Definition 7.1 Conditional Independence of Domains. Domains x, y ∈ D are called conditionally independent, given a domain z ∈ D, if

(x ∨ z) ∧ (y ∨ z) = z.

We then write x⊥y|z.

Clearly, x⊥y|z implies x ∧ y ≤ z. Note that, if the lattice D is distributive, then (x ∨ z) ∧ (y ∨ z) = (x ∧ y) ∨ z. Hence, in this case, x⊥y|z if, and only if, x ∧ y ≤ z. If the lattice D is at least modular, then the modular law implies (x ∨ z) ∧ (y ∨ z) = (x ∧ (y ∨ z)) ∨ z. Thus x⊥y|z holds in modular lattices if, and only if, x ∧ (y ∨ z) ≤ z.

Let now D be the domain lattice of an information algebra (Φ, D). The conditional independence relation between domains translates into certain irrelevance statements. Here is a first result of this kind.

Theorem 7.1 If x⊥y|z, then, ∀φ, ψ ∈ Φ, d(φ) = x and d(ψ) = y imply

φ→y = (φ→z)→y, ψ→x = (ψ→z)→x. (7.1)

Proof. Using Lemma 3.7 (3) and (4), the definition of transport, and (x ∨ z) ∧ (y ∨ z) = z, we see that

φ→y = (φ→y∨z)↓y = ((φ↓(x∨z)∧(y∨z))↑y∨z)↓y
    = ((φ↓z)↑y∨z)↓y = (φ→z)→y.

The second part of (7.1) is proved similarly. □

This theorem says that for an information relative to a domain x, only the part concerning z is relevant for the domain y, if x and y are conditionally independent, given z.

Conditional independence is also related to factorization of information. A result illustrating this is given in the next theorem.

Theorem 7.2 Let φ, ψ ∈ Φ with d(φ) = x ∨ z and d(ψ) = y ∨ z. Then x⊥y|z implies

(φ⊗ ψ)↓z = φ↓z ⊗ ψ↓z.

Proof. Assume x⊥y|z, that is, (x ∨ z) ∧ (y ∨ z) = z. Then, by the combination axiom,

(φ⊗ ψ)↓z = ((φ⊗ ψ)↓x∨z)↓z = (φ⊗ ψ↓z)↓z = φ↓z ⊗ ψ↓z.

□

Again, only the part concerning z is relevant when two pieces of information relating to conditionally independent domains given z are projected to z. Such results are of utmost importance for computation, as will be discussed in Part III. Further results of this type will be given below and in the following sections.
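As a concrete illustration of Theorem 7.2 (ours; a hypothetical toy setting with information modelled as relations, combination as natural join and projection as restriction of tuples), take x = {X}, y = {Y} and z = {Z}; since x and y are disjoint, x⊥y|z holds:

    def join(R, S):
        """Natural join of relations given as sets of frozensets of pairs."""
        return {r | s for r in R for s in S
                if all(dict(r)[a] == dict(s)[a]
                       for a in dict(r).keys() & dict(s).keys())}

    def project(R, attrs):
        return {frozenset((a, v) for a, v in t if a in attrs) for t in R}

    phi = {frozenset({("X", 0), ("Z", 0)}), frozenset({("X", 1), ("Z", 1)})}
    psi = {frozenset({("Y", 0), ("Z", 1)}), frozenset({("Y", 1), ("Z", 1)})}

    lhs = project(join(phi, psi), {"Z"})                  # (φ ⊗ ψ)↓z
    rhs = join(project(phi, {"Z"}), project(psi, {"Z"}))  # φ↓z ⊗ ψ↓z
    assert lhs == rhs

Projecting the factors first and combining afterwards avoids ever building the larger combination, which is the point of local computation.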

The following theorem collects a few basic properties of the ternary relation x⊥y|z:


Theorem 7.3 The conditional independence relation x⊥y|z in a lattice D satisfies the following properties:

1. (P1) x⊥y|x,

2. (P2) x⊥y|z implies y⊥x|z,

3. (P3) x⊥y|z and w ≤ y imply x⊥w|z.

If the lattice D is modular, then the following further properties hold:

1. (P4) x⊥y|z and w ≤ y imply x⊥y|(z ∨ w),

2. (P5) x⊥y|z and x⊥w|(y ∨ z) imply x⊥(y ∨ w)|z.

3. (P6) If z ≤ y and w ≤ y, then x⊥y|z and x⊥y|w imply x⊥y|(z ∧ w).

Proof. (P1) We have (x ∨ x) ∧ (y ∨ x) = x, hence x⊥y|x.

(P2) Symmetry is a direct consequence of the definition.

(P3) Since w ≤ y we obtain z ≤ (x ∨ z) ∧ (w ∨ z) ≤ (x ∨ z) ∧ (y ∨ z) = z, hence (x ∨ z) ∧ (w ∨ z) = z or x⊥w|z.

(P4) Now we assume D to be modular. As seen above, x⊥y|z holds if, and only if, x ∧ (y ∨ z) ≤ z. But then, since w ≤ y, we have that x ∧ (y ∨ z ∨ w) = x ∧ (y ∨ z) ≤ z ≤ z ∨ w. This shows that x⊥y|(z ∨ w).

(P6) Assume x⊥y|z and x⊥y|w and z, w ≤ y. Then x ∧ (y ∨ z) = x ∧ y ≤ z and x ∧ (y ∨ w) = x ∧ y ≤ w. Thus, x ∧ (y ∨ (w ∧ z)) = x ∧ y ≤ w ∧ z and (P6) follows. □

A ternary relation x⊥y|z on a lattice is called a separoid, if properties (P1) to (P5) are satisfied, and a strong separoid, if furthermore (P6) holds (Dawid, 2001). Many different forms of independence occurring in domains like probability theory, statistics, logic, relational databases and elsewhere induce relations which are separoids. In particular, when the lattice of domains in an information algebra is distributive, as, for instance, in multivariate systems, then our conditional independence relation between domains forms a strong separoid. This is because a distributive lattice is also modular. If the lattice is even Boolean, then separoids are equivalent to semi-graphoids (Dawid, 2001). Conversely, (Dawid, 2001) shows that if the relation x⊥y|z in a lattice is a separoid, then the lattice is modular. Now, the lattice part(S) of the partitions of a set S is not modular. Thus, in general, we have less structure than a separoid in our conditional independence relation between domains. Interestingly, this is sufficient for our purposes, for example for local computation schemes, see Part III. The benefit of the further separoid properties is, in this context, merely to simplify concepts and proofs sometimes.

We generalize now the notion of conditional independence from pairs of domains to arbitrary sets of domains:


Definition 7.2 Conditional Independence of Families of Domains. Let {xi : i ∈ I} be a family of domains from a domain lattice D, where I is a finite set of cardinality at least 2. If z ∈ D, we call the family {xi : i ∈ I} conditionally independent given z, if for all disjoint, non-empty subsets J, K ⊆ I, J ∩ K = ∅, we have that

∨j∈J xj ⊥ ∨k∈K xk | z. (7.2)

We then write

⊥{xi : i ∈ I}|z.

We recall that by Definition 7.1 condition (7.2) means that

(∨j∈J xj ∨ z) ∧ (∨k∈K xk ∨ z) = z. (7.3)

In fact, it is sufficient that condition (7.2) holds for disjoint sets J and K which cover I, J ∪ K = I. This is so because of property (P3) of Theorem 7.3. Further, clearly, Definition 7.2 generalizes Definition 7.1, since in the case of binary sets I the former reduces to the latter definition. By convention, for all x, z ∈ D, we have ⊥{x}|z, since there are no nonempty pairs of disjoint subsets of a one-element set. Similarly, we admit also that ⊥∅|z holds for all z ∈ D.

If the lattice D is distributive, then

(∨j∈J xj ∨ z) ∧ (∨k∈K xk ∨ z) = ∨j∈J,k∈K (xj ∧ xk) ∨ z = z.

Therefore, xj ∧ xk ≤ z for all j, k ∈ I, j ≠ k, i.e. xj⊥xk|z for all pairs, holds if, and only if, ⊥{xi : i ∈ I}|z. In distributive lattices, a family of domains is conditionally independent exactly if all pairs of elements of the family are so.

We collect a few properties of conditional independence of families of domains in the next theorem.

Theorem 7.4 Assume ⊥{x1, . . . , xn}|z. Then the following properties hold:

1. If σ is a permutation of {1, . . . , n}, then ⊥{xσ(1), . . . , xσ(n)}|z.

2. If J ⊆ {1, . . . , n} and |J| ≥ 2, then ⊥{xj : j ∈ J}|z.

3. If y1 ≤ x1, then ⊥{y1, x2, . . . , xn}|z.

4. ⊥{x1 ∨ x2, x3, . . . , xn}|z.

5. ⊥{(x1 ∨ z), . . . , (xn ∨ z)}|z.


Proof. All these properties follow immediately from (7.3). □

We may also generalize Theorem 7.2.

Theorem 7.5 If ⊥{x1, . . . , xn}|z and

φ = ψ1 ⊗ · · · ⊗ ψn, d(ψi) = xi ∨ z for i = 1, . . . , n,

then

φ↓z = ψ1↓z ⊗ · · · ⊗ ψn↓z.

Proof. Define for J ⊆ {1, . . . , n},

φJ = ⊗j∈J ψj.

Then we have φ = ψ1 ⊗ φI\{1}, where I = {1, . . . , n}, and, by Theorem 7.4 (4),

x1⊥(x2 ∨ · · · ∨ xn)|z.

Thus by Theorem 7.2 we have

φ↓z = ψ1↓z ⊗ φI\{1}↓z.

The theorem follows then by recursion on the second factor in this formula. □

We illustrate these notions with two of our basic examples of lattices of domains.

Example 7.1 Multivariate Systems. In this example we may identify the lattice D of domains with a lattice of subsets of variables (see Example 2.1). This lattice is distributive. Thus, if I, J, K are subsets of variables, we have I⊥J|K if, and only if, I ∩ J ⊆ K. The multivariate system is the case most often considered when discussing conditional independence, see for example (Studeny, 1993).
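In code this criterion is a one-liner (ours; domains as hypothetical finite sets of variable names):

    def cond_indep(I, J, K):
        """I⊥J|K in the distributive, multivariate case: I ∩ J ⊆ K."""
        return (I & J) <= K

    assert cond_indep({"X", "Z"}, {"Y", "Z"}, {"Z"})      # {X,Z} ⊥ {Y,Z} | {Z}
    assert not cond_indep({"X", "Z"}, {"Y", "Z"}, set())  # but not given ∅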

Example 7.2 Partitions. Partition lattices are the most general lattices. Hence it is interesting to see how conditional independence presents itself in partition structures (see Example 2.2). The following theorem shows how conditional independence of partitions can be expressed in terms of the blocks of partitions.

Theorem 7.6 Let P, P1 and P2 be partitions of a universe S. Then P1⊥P2|P if, and only if, for all P ∈ P, P1 ∈ P1 and P2 ∈ P2, P1 ∩ P ≠ ∅ and P2 ∩ P ≠ ∅ imply P1 ∩ P2 ∩ P ≠ ∅.


Proof. We have to prove that (P1 ∨ P) ∧ (P2 ∨ P) = P. It is sufficient to show that (P1 ∨ P) ∧ (P2 ∨ P) ≤ P, since the other inequality always holds. This means that any block of the left hand side of the inequality is contained in a block of the right hand side. This in turn holds if, whenever two elements a, b are in a block of (P1 ∨ P) ∧ (P2 ∨ P), they are also in a block of P.

We use Theorem 2.1. So, if a ∈ P1 ∩ P and b ∈ P2 ∩ P for some blocks P ∈ P, P1 ∈ P1 and P2 ∈ P2, then a and b are in the same block of the meet (P1 ∨ P) ∧ (P2 ∨ P) if, and only if, there is an element c ∈ P1 ∩ P2 ∩ P. So, if both P1 ∩ P and P2 ∩ P are not empty, then P1 ∩ P2 ∩ P must not be empty either. Hence, this condition is necessary and sufficient for (P1 ∨ P) ∧ (P2 ∨ P) ≤ P. This proves the theorem. □
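A small sketch (ours; hypothetical partitions of a four-element universe, given as lists of blocks) of this block-level criterion:

    from itertools import product

    def cond_indep_partitions(P1, P2, P):
        """Block-level test of Theorem 7.6 for P1 ⊥ P2 | P."""
        return all(not (set(b1) & set(b)) or not (set(b2) & set(b))
                   or bool(set(b1) & set(b2) & set(b))
                   for b1, b2, b in product(P1, P2, P))

    S  = [(i, j) for i in (0, 1) for j in (0, 1)]
    P  = [S]                                    # the vacuous question: one block
    P1 = [[(0, 0), (0, 1)], [(1, 0), (1, 1)]]   # blocks fixing the 1st coordinate
    P2 = [[(0, 0), (1, 0)], [(0, 1), (1, 1)]]   # blocks fixing the 2nd coordinate
    assert cond_indep_partitions(P1, P2, P)     # every row meets every column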

Note that blocks of a partition correspond to atoms of the domain. Theorem 7.6 is an example of a result which relates conditional independence to atoms. More results of this kind are presented in Section 7.3. The conditions of Theorem 7.6 have an interpretation in terms of irrelevance: Given the information that the unknown element of S belongs to the block P of the partition P, the further information that the element belongs to a certain block P1 of partition P1 restricts in no way the possible blocks of partition P2 and vice versa. In other words, given an information on partition P, an information on partition P1 gives no information on P2 and vice versa, if P1⊥P2|P.

We remark that Theorem 7.6 and the ideas related to it can be extended to families of compatible frames (see Example 2.3).

Suppose now that the lattice D has a bottom element ⊥ and that Φ⊥ = {e⊥, z⊥}. Important examples like multivariate systems or the lattice part(S) satisfy these conditions. If then x⊥y|⊥, i.e. if

x ∧ y = ⊥,

we call x and y independent and write x⊥y. We have then for all φ ∈ Φ that φ ≠ zd(φ) implies φ↓⊥ = e⊥. Thus, if x⊥y, by Theorem 7.1 we have, if d(φ) = x and d(ψ) = y,

φ→y = ey, ψ→x = ex,

and, if η = φ⊗ ψ,

η↓x = φ, η↓y = ψ,

Hence

η = η↓x ⊗ η↓y.

Thus, if x⊥y, then an information on x contains no information on y and vice versa.


This can be extended to families of domains. We say that x1, . . . , xn are independent, and write ⊥{x1, . . . , xn}, if ⊥{x1, . . . , xn}|⊥. It follows from (7.3), if z = ⊥, that this implies

xj ∧ xk = ⊥ (7.4)

for all 1 ≤ j < k ≤ n, or that xj⊥xk. If the lattice D is distributive, then ⊥{x1, . . . , xn} is equivalent to the pairwise independence condition (7.4).

Theorem 7.4 carries immediately over to independence:

Corollary 7.1 Assume ⊥{x1, . . . , xn}. Then the following properties hold:

1. If σ is a permutation of {1, . . . , n}, then ⊥{xσ(1), . . . , xσ(n)}.

2. If J ⊆ {1, . . . , n} and |J| ≥ 2, then ⊥{xj : j ∈ J}.

3. ⊥{x1 ∨ x2, x3, . . . , xn}.

4. ⊥{(x1 ∨ z), . . . , (xn ∨ z)}.

Let’s again consider our basic examples of lattices of domains.

Example 7.3 Multivariate Systems. In multivariate systems, we have I⊥J if, and only if, I ∩ J = ∅ (compare Example 7.1).

Example 7.4 Partitions. For partition structures it follows from Theorem 7.6 (see Example 7.2) that P1⊥P2 if, and only if, for all blocks P1 ∈ P1 and P2 ∈ P2 it holds that P1 ∩ P2 ≠ ∅. If P1⊥P2, then an information on one partition gives no information on the other one.

We turn now to the study of certain families of conditional independence relations, which are of particular importance for computational purposes (see Part III). Consider a tree T = (V, E) with set of nodes V and set of edges E ⊆ V², where V² is the family of two-element subsets of V. Let λ : V → D be a labeling of the nodes of T with domains from the lattice D. The domain λ(v) is called the label of node v. The pair (T, λ) is called a labeled tree. By ne(v) we denote the neighbors of node v, i.e. ne(v) = {w ∈ V : {v, w} ∈ E}. When we eliminate a node v from the tree (together with the edges incident to v), then a family of subtrees {Tv,w = (Vv,w, Ev,w) : w ∈ ne(v)} of T is created. Further, for any subset U ⊆ V of nodes define

λ(U) = ∨v∈U λ(v).

This leads then to the following definition:

Definition 7.3 Markov Tree. A labeled tree (T, λ) with T = (V, E) is called a Markov tree, if ∀v ∈ V,

⊥{λ(Vv,w) : w ∈ ne(v)}|λ(v). (7.5)


The term Markov tree has been introduced in (Shafer et al., 1987) with respect to partition lattices and for the purpose of efficient information processing. We refer also to (Kohlas & Monney, 1995) for a discussion of this concept in the context of families of compatible frames. We prove two fundamental theorems, whose proofs are adapted from (Kohlas & Monney, 1995).

The first theorem tells us that there is conditional independence between the domain of node v and the domains covered by a neighboring subtree Tv,w, given the domain of the associated neighbor w.

Theorem 7.7 If (T, λ) is a Markov tree, then, ∀w ∈ ne(v),

λ(Vv,w)⊥λ(v)|λ(w). (7.6)

Proof. For w ∈ ne(v), the Markov property (7.5) reads

⊥{λ(Vw,u) : u ∈ ne(w)}|λ(w).

Note further that

λ(Vv,w) = ∨u∈ne(w)\{v} λ(Vw,u) ∨ λ(w).

Thus, using Theorem 7.4 (4) we derive

∨u∈ne(w)\{v} λ(Vw,u) ⊥ λ(Vw,v) | λ(w).

Further, by Theorem 7.4 (5) we obtain that

λ(Vv,w)⊥λ(Vw,v)|λ(w).

Finally, since λ(v) ≤ λ(Vw,v), we conclude (7.6) from Theorem 7.4 (3), which proves the theorem. □

The fact, expressed in the next theorem, that any subtree of a Markov tree remains a Markov tree, is important for purposes of induction and recursion, as will be seen later.

Theorem 7.8 If (T, λ) is a Markov tree, then every subtree is also a Markov tree.

Proof. Let T′ = (V′, E′) be a subtree of T and λ′ the restriction of λ to V′. We prove that (T′, λ′) is a Markov tree, if (T, λ) is a Markov tree. Consider a node v ∈ V′ and let ne′(v) be the set of its neighbors in T′. Also consider the subtrees T′v,w for w ∈ ne′(v) relative to the tree T′ and let V′v,w be the node sets of these subtrees. Note that ne′(v) ⊆ ne(v) and V′v,w ⊆ Vv,w,


such that λ′(V′v,w) ≤ λ(Vv,w) for all w ∈ ne′(v). Therefore, by (2) and (3) of Theorem 7.4 we conclude that for all v ∈ V′,

⊥{λ′(V′v,w) : w ∈ ne′(v)}|λ′(v).

This proves that (T′, λ′) is a Markov tree. □

This Markov tree structure of conditional independence becomes somewhat simpler in the case of multivariate structures.

Example 7.5 Multivariate Structures: Join Trees. As usual in multivariate structures we may work with the lattice of subsets of variables. The labeling function of a tree then maps nodes into sets. The label λ(v) of a node is thus a set of variables associated with this node, and λ(Vv,w) for a neighbor w ∈ ne(v) is the set of all variables contained in the label of one of the nodes of Vv,w. Then, a tree T = (V, E) with a labeling function λ is a Markov tree if, for all nodes v ∈ V and distinct neighbors w, u of v,

λ(Vv,w) ∩ λ(Vv,u) ⊆ λ(v). (7.7)

In the case of multivariate structures Markov trees are usually called join trees, sometimes also junction trees. They can also be characterized in a different way:

Theorem 7.9 A pair (T, λ) of a tree T = (V, E) with a labeling function in a lattice of subsets is a join tree if, and only if, ∀v, w ∈ V, X ∈ λ(v) and X ∈ λ(w) imply X ∈ λ(u) for all nodes u on the path from v to w (this is called the running intersection property).

Proof. Assume (T, λ) to be a join tree. Consider any node u on the path from v to w and let Tu,r and Tu,s be the subtrees containing the nodes v and w respectively. If X ∈ λ(v) and X ∈ λ(w), then X ∈ λ(Vu,r) and X ∈ λ(Vu,s). By (7.7) it follows that X ∈ λ(u). Thus the running intersection property is satisfied.

Conversely, assume that the running intersection property is satisfied and fix a node v together with two neighboring nodes w and u. If X ∈ λ(Vv,w) and X ∈ λ(Vv,u), then, by the running intersection property, X ∈ λ(v). Thus (7.7) holds and (T, λ) is a join tree. □
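The running intersection property is easy to test by brute force; here is a Python sketch (ours; a hypothetical labeled tree with four nodes and variable sets as labels):

    from itertools import combinations

    def path(tree, v, w, seen=None):
        """The unique path from v to w in a tree given as dict node -> neighbors."""
        if v == w:
            return [v]
        seen = seen or {v}
        for u in tree[v]:
            if u not in seen:
                p = path(tree, u, w, seen | {u})
                if p:
                    return [v] + p
        return None

    def is_join_tree(tree, label):
        return all(set(label[v]) & set(label[w]) <= set(label[u])
                   for v, w in combinations(tree, 2)
                   for u in path(tree, v, w))

    tree  = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}
    label = {0: {"A", "B"}, 1: {"B", "C"}, 2: {"C", "D"}, 3: {"C"}}
    assert is_join_tree(tree, label)   # each variable occurs on a connected subtree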

Join trees are conditional independence structures which are used in many fields, including probabilistic networks (Lauritzen & Spiegelhalter, 1988) and relational databases (Maier, 1983).

7.2 Factorization of Information

In this section we will relate conditional independence structures to certain factorizations of information. This is motivated by fields like probability


theory and relational databases, where conditional independence is defined in terms of decompositions of information into certain factors. Also, conditional independence and corresponding factorizations of information are, in the case of multivariate systems, related to certain graphical structures. The latter were a strong motivation, in domains like probabilistic expert systems, to develop the theory of graphical representation of independence structures.

The main point is that conditional independence of information is an information about information. It is meta-information. It tells us that the information to be considered satisfies certain independence constraints, which are given by the semantics of the problem at hand. In probability networks, for instance, one often starts qualitatively with specifying that certain variables are conditionally independent, given some other variables. This then implies that the only numerical probability distributions to be considered are those satisfying these constraints, which can be expressed by a certain factorization of the distribution (see for example (Shafer, 1996; Cowell et al., 1999)). A similar situation arises in relational databases. This will be discussed in the following Section 7.3. We are going to capture this idea by considering conditionally independent domains and then constraining information relative to these domains to respect this conditional independence.

Let thus (Φ, D) be an information algebra, and x⊥y|z in D. Then we define

Φx⊥y|z = {φ ∈ Φ : φ→x∨y∨z = ψ1 ⊗ ψ2, d(ψ1) = x ∨ z, d(ψ2) = y ∨ z}.

This delimits all pieces of information which respect the conditional independence of the domains, in a sense which generalizes the familiar notion of conditional independence in probability theory (Cowell et al., 1999). The generalization is twofold: first, we do not limit ourselves to lattices of subsets of variables as usual in probability theory, but consider general lattices D of domains; second, we do not limit ourselves to a particular formalism like probability theory, but try to work out the general, generic properties for information algebras. The latter point however means that idempotency is added, a property not present in probability theory. For a general theory without this additional property, i.e. a theory of conditional independence in valuation algebras, see (Kohlas, 2003).

Theorem 7.10 If x⊥y|z, then ∀φ ∈ Φx⊥y|z,

φ→x∨z = ψ1 ⊗ ψ2↓z,
φ→y∨z = ψ1↓z ⊗ ψ2,
φ→z = ψ1↓z ⊗ ψ2↓z. (7.8)


Proof. Here we use the combination axiom and (x ∨ z) ∧ (y ∨ z) = z to obtain

φ→x∨z = (φ→x∨y∨z)↓x∨z = (ψ1 ⊗ ψ2)↓x∨z
      = ψ1 ⊗ ψ2↓(x∨z)∧(y∨z) = ψ1 ⊗ ψ2↓z. (7.9)

The second identity of the theorem is obtained in the same way. Then, using these results and once more the combination axiom,

φ→z = (φ→x∨z)↓z = (ψ1 ⊗ ψ2↓z)↓z
    = ψ1↓z ⊗ ψ2↓z.

□

If x⊥y|z, then, for the combination of an information relative to x with another one relative to y, only the parts relative to z are relevant. This result is particularly important for computational purposes. If an information is given as a combination over conditionally independent domains, then in order to focus it to the conditioning domain, it is not necessary to first compute the combination. It is sufficient to focus both factors and then combine them. This will be exploited in Part III for so-called local computation schemes.

We have the following fundamental theorem about factorizations:

Theorem 7.11 If x⊥y|z, then ∀φ ∈ Φx⊥y|z

φ→x∨y∨z = φ→x∨z ⊗ φ→y∨z. (7.10)

Proof. By idempotency we have

φ→x∨y∨z = ψ1 ⊗ ψ2↓z ⊗ ψ1↓z ⊗ ψ2.

Then (7.10) follows from Theorem 7.10. □

In relational database theory similar properties are called lossless decompositions (Maier, 1983; Date, 1975b).

The results above can be generalized to families of conditionally independent domains. Assume ⊥{x1, . . . , xn}|z and define

dent domains. Assume ⊥x1, . . . , xn|z and define

Φ⊥x1,...,xn|z

= φ ∈ Φ : φ→x∨...∨xn∨∨z = ψ1 ⊗ · · ·ψn, d(ψi) = xi ∨ z, i = 1, . . . , n.

Then the following irrelevance result holds:

Theorem 7.12 If ⊥{x1, . . . , xn}|z, then ∀φ ∈ Φ⊥{x1,...,xn}|z,

φ→z = ψ1↓z ⊗ · · · ⊗ ψn↓z. (7.11)


Proof. The conditional independence relation ⊥{x1, . . . , xn}|z implies also the relations x1⊥(x2 ∨ . . . ∨ xn)|z and ⊥{x2, . . . , xn}|z. Let

ψ^n_1 = ψ2 ⊗ · · · ⊗ ψn. (7.12)

Note that d(ψ^n_1) = x2 ∨ . . . ∨ xn ∨ z. Then, by Theorem 7.10 we obtain

φ↓z = ψ1↓z ⊗ (ψ^n_1)↓z.

The identity (7.11) follows from this by recursion. □

The result on lossless decomposition, Theorem 7.11, generalizes also:

Theorem 7.13 If ⊥{x1, . . . , xn}|z, then ∀φ ∈ Φ⊥{x1,...,xn}|z,

φ→x1∨···∨xn∨z = φ→x1∨z ⊗ · · · ⊗ φ→xn∨z. (7.13)

Proof. Let ψ^n_1 be defined as in (7.12). Then, since ⊥{x1, . . . , xn}|z implies x1⊥(x2 ∨ . . . ∨ xn)|z, hence (x1 ∨ z) ∧ (x2 ∨ . . . ∨ xn ∨ z) = z, it follows from the combination axiom that

φ→x1∨z = ψ1 ⊗ (ψ^n_1)↓z.

But since ⊥{x1, . . . , xn}|z implies also ⊥{x2, . . . , xn}|z, we conclude by Theorem 7.12 that

(ψ^n_1)↓z = ψ2↓z ⊗ · · · ⊗ ψn↓z.

We obtain similar results for i = 2, . . . , n,

φ→xi∨z = ψi ⊗ (ψ^n_i)↓z,

where ψ^n_i is the combination of all ψj, j = 1, . . . , n, except ψi. Therefore, using idempotency, we obtain

φ→x1∨···∨xn∨z = (ψ1 ⊗ (ψ^n_1)↓z) ⊗ · · · ⊗ (ψn ⊗ (ψ^n_n)↓z) = φ→x1∨z ⊗ · · · ⊗ φ→xn∨z.

This concludes the proof. □

We may of course apply these results to Markov trees. For instance, if

M = (T, λ) is a Markov tree we define

ΦM = {φ ∈ Φ : φ→∨v∈V λ(v) = ⊗v∈V ψv, d(ψv) = λ(v)}.

Then the following irrelevance theorem holds:


Theorem 7.14 Let M = (T, λ) be a Markov tree, with T = (V, E). If φ ∈ ΦM, i.e.

φ→∨v∈V λ(v) = ⊗v∈V ψv, with d(ψv) = λ(v),

then, ∀v ∈ V,

φ→λ(v) = ψv ⊗ ⊗w∈ne(v) φv,w→λ(v) = ψv ⊗ ⊗w∈ne(v) (φv,w↓λ(w))→λ(v), (7.14)

where

φv,w = ⊗u∈Vv,w ψu.

Proof. In a Markov tree (T, λ) we have by definition ⊥{λ(Vv,w) : w ∈ ne(v)}|λ(v). Note then that, by idempotency,

φ→∨v∈V λ(v) = ψv ⊗ ⊗w∈ne(v) φv,w = ⊗w∈ne(v) (ψv ⊗ φv,w).

This shows that φ ∈ Φ⊥{λ(Vv,w):w∈ne(v)}|λ(v). It follows therefore from Theorem 7.12 that

φ→λ(v) = ⊗w∈ne(v) (ψv ⊗ φv,w)↓λ(v).

Applying the combination axiom and once more idempotency, we obtain from this

φ→λ(v) = ⊗w∈ne(v) (ψv ⊗ φv,w↓λ(v)∧λ(Vv,w)) = ψv ⊗ ⊗w∈ne(v) φv,w→λ(v).

This proves the first part of (7.14). Further, by Theorem 7.7 we have also λ(Vv,w)⊥λ(v)|λ(w). This implies then (see Theorem 7.1)

φv,w→λ(v) = (φv,w→λ(w))→λ(v).

This proves the second part of (7.14). □

This irrelevance theorem proves to be the basis of local computation, see Part III.

7.3 Atomic Algebras and Normal Forms



Chapter 8

Measure of Information

8.1 Information Measure

In the literature the term “information theory” is almost exclusively used as a theory for the measurement of information content. This is clearly the case for Shannon's theory of information (Shannon, 1948), where the concept of entropy is used to measure information content. It is also the case for algorithmic information theory (Chaitin, 2001; Kolmogorov, 1965; Kolmogorov, 1968; Li & Vitanyi, 1993), where the length of a shortest program for computing an object is considered as a measure of the information content of the object. In statistics still other measures are proposed (Kullback, 1959). A newer theory going beyond these classical measures is proposed in (Kahre, 2002), which allows for many different measures as long as they satisfy the law of diminishing information.

Essentially we shall base our considerations on Shannon's concept of information, where information is measured by the amount it reduces uncertainty. Shannon's theory is a statistical one, where uncertainty is measured by entropy. At this point however, no probabilistic elements are present in our theory of information algebra. Therefore, the measure of uncertainty we shall use is Hartley's measure rather than entropy.
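As a reminder (the notation h is ours, not fixed by these notes), Hartley's measure assigns to a finite set A of possible answers the value

h(A) = log2 |A|,

so that an information reducing the set of possibilities from A to B ⊆ A may be measured by the reduction of uncertainty h(A) − h(B).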

We expect our quantitative measure of information to respect the partial order of information introduced in Chapter 4.

8.2 Dual Information in Boolean Information Algebras


Chapter 9

Topology of Information Algebras

9.1 Observable Properties and Topology

It is noted in (Smyth, 1992) that topology ties up with the idea of observable properties. In more concrete terms, in (Smyth, 1992) a device which outputs binary symbols is considered. An observer inspects the output sequence as it proceeds. He wants to detect certain properties of the output sequence, but his decision whether a property is present or not must depend on the finite sequence output so far. Obviously there are properties of the output sequence which can never be decided on the basis of finite sequences only: for example, the property that the symbol “0” occurs only finitely many times. On the other hand, a property like that there are at least a hundred occurrences of the symbol “0” can be detected in a finite sequence by the observer. Note however that the absence of this property cannot be detected in finite sequences. We do not require that both the presence and the absence of a property be observable.

Technically, the condition of finite observability of a property A in a binary output sequence may be stated as follows: We consider infinite sequences σ ∈ {0, 1}^∞ of binary symbols. An observable property A is then a subset of sequences in {0, 1}^∞ such that if σ ∈ A, then there is some finite initial segment σ^{↓n} of n symbols such that any infinite extension of σ^{↓n} belongs to A. This says that if property A holds for some output sequence, then some finite initial segment of the sequence is sufficient to detect the property A.

From this consideration two important closure properties of observable properties are derived in (Smyth, 1992):

1. Finite Conjunction: Suppose that A_1, . . . , A_n are observable properties, and A_1 ∩ . . . ∩ A_n holds for a sequence σ. Then σ ∈ A_1 ∩ . . . ∩ A_n is clearly secured by a finite initial segment of σ, since each property A_1, . . . , A_n is. Hence, if A_1, . . . , A_n are observable properties, then A_1 ∩ . . . ∩ A_n is observable too.

2. Arbitrary Disjunction: Suppose A is a collection of observable properties. If σ ∈ ∪A, then σ ∈ A for some A ∈ A. Since A is observable, σ ∈ A is detected by some finite initial segment. But then σ ∈ ∪A is secured too. Thus ∪A is observable too.

But these are exactly the properties of the open sets of a topology. In other words, observable properties as defined above form the open sets of a topology on the set {0, 1}^∞ of infinite binary sequences.
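To make the finite-observability condition concrete, here is a small Python sketch (our own illustration; all names are ours, not from the text). An observable property is represented by a monotone predicate on finite prefixes, and a detector scans growing prefixes of the stream:

    from itertools import islice

    def at_least_k_zeros(k):
        """Observable property: the stream contains at least k occurrences
        of the symbol 0. The predicate on finite prefixes is monotone:
        once true, it stays true for every extension, which is exactly
        the finite-observability condition."""
        return lambda prefix: prefix.count(0) >= k

    def detect(stream, prop, max_steps=10**6):
        """Scan growing finite prefixes; report the detection step or None."""
        prefix = []
        for i, sym in enumerate(islice(stream, max_steps), start=1):
            prefix.append(sym)
            if prop(prefix):
                return i  # property secured by a finite initial segment
        return None      # undecided within max_steps, not refuted

    def alternating():
        while True:
            yield 0
            yield 1

    print(detect(alternating(), at_least_k_zeros(100)))  # -> 199

Note that the detector can only ever confirm the property; its silence within max_steps decides nothing, which is exactly the asymmetry between presence and absence of a property discussed above.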

Somehow, (finitely) observable properties are reminiscent of (finite) information. In fact, topology enriches domain theory considerably, as is well known. Domain theory is about information, although it concentrates on approximation issues. Information algebras are about information in more respects, adding information aggregation and information extraction, but are otherwise closely related to domains. So it will be worthwhile to examine topology in relation to information algebras, and this is what will be done in this chapter.

To start with, we put the example above into the more formal perspective offered by considering strings as elements of an information algebra.

Example 9.1 An Algebra of Strings.

9.2 Alexandrov Topology of Information Algebras

9.3 Scott Topology of Compact Information Algebras

9.4 Topology of Labeled Algebras

9.5 Representation Theory of Information Algebras


Chapter 10

Algebras of Subalgebras

10.1 Subalgebras of Domain-free Algebras

10.2 Subalgebras of Labeled Algebras

10.3 Compact Algebra of Subalgebras

10.4 Subalgebras of Compact Algebras


Part II

Local Computation


Chapter 11

Local Computation on Markov Trees

11.1 Problem and Complexity Considerations

Information usually comes in pieces. It must be combined and then focused onto the domains of interest. This situation can be formulated within an abstract information algebra (?). Let therefore (Φ, D) be an information algebra with information elements φ ∈ Φ and a lattice of domains D. Suppose n pieces of information φ_i, i = 1, . . . , n, are given. The combined, total information is then

φ = φ1 ⊗ · · · ⊗ φn.

Suppose s ∈ D is a domain of interest, representing a question of interest. If we want to extract the information relative to domain s, then we focus φ onto s, i.e. we compute

φ→s.

If we view this problem from a computational point of view, then the straightforward approach is to compute sequentially the partial combinations

φ1 ⊗ · · · ⊗ φi

for i = 2, . . . , n and then, at the end, the focus φ^{→s}. These partial combinations yield information on increasing domains by the labeling axiom:

d(φ1) ≤ d(φ1) ∨ d(φ2) ≤ · · · ≤ d(φ1) ∨ d(φ2) ∨ · · · ∨ d(φi) ≤ · · · .

At the end we arrive at the domain d(φ_1) ∨ d(φ_2) ∨ · · · ∨ d(φ_n). Now we may assume that both the storage required as well as the time for elementary operations like combination and projection grow with increasing domains


(see examples below). The growth may even be such that storage or computation on big domains becomes infeasible. This indicates that the naive, straightforward solution of the projection problem posed above becomes inefficient or even infeasible. It is also conceivable that in practical problems the pieces of information φ_i are stored on distributed processors. The direct method just described may then also generate costly or infeasible communication between processors.

To clarify the situation let us consider some instances of information algebras which have been introduced in (?):

Example 11.1 Subsets and Indicator Functions

Example 11.2 Propositional Logic

Example 11.3 Relational Databases

Example 11.4 Partitions

Example 11.5 Linear Systems and Linear Manifolds

Example 11.6 Convex Polyhedra

In a simplifying view we may, at least in some cases, assume that the lattice D decomposes into two disjoint subsets D_f and D_i, where for x ∈ D_f and y ∈ D_i we never have y ≤ x. Intuitively, D_f contains the smaller domains on which storing information and computing is feasible (whence the index f), whereas D_i contains the larger domains where storing and computing become infeasible (whence the index i). In a multivariate system we may, for example, assume that for domains with up to n variables storage and computation are still feasible, whereas for domains with more than n variables they become infeasible.

We need then computations whose intermediate results never leave the feasible part D_f of the lattice. We shall see that, if the domains d(ψ_i) as well as the query domain s of a projection problem as defined above all belong to D_f, then in many cases we can arrange the computations so as to satisfy these feasibility requirements. This is true even if the domain of the total information,

d(φ_1) ∨ d(φ_2) ∨ · · · ∨ d(φ_n),


is not feasible, i.e. belongs to D_i. This is achieved by so-called local computation techniques. They make use of conditional independence relations and the corresponding factorization theorem (see Theorem 17 in (?)). These conditional independence relations are in turn associated with certain tree structures, called Markov trees or, in more particular cases, also join trees. In the present chapter we define the concept of Markov trees and develop the local computation techniques based on this concept. In the following Chapter 12 the particular case of multivariate systems is considered. There are specific methods and results for this case. Next, local computation in Boolean information algebras is studied. In this case the basic projection problem may be generalized. Also, certain information algebras like atomic ones can be embedded into Boolean information algebras, and special representations (normal forms) of information are then available. Local computation can then be executed using these normal forms.

11.2 Markov Trees

In this section we introduce the concept of a Markov tree. This is a graphical structure, representing an organization of domains, which proves fundamental for local computation techniques. In the multivariate setting it corresponds to the concepts of join trees, junction trees or hypertrees, which are well known in different fields like probabilistic networks, relational databases and constraint solving. This particular case will be treated in the following Chapter 12.

Let T = (V,E) be a tree with set of nodes V and set of edges E ⊆ V², where V² is the family of all two-element subsets of V. We recall a certain number of notions related to trees: A path between two nodes v, u ∈ V is a sequence of distinct nodes v_1, v_2, . . . , v_m from V such that v_1 = v, v_m = u and {v_i, v_{i+1}} ∈ E. In a tree there is always exactly one path between any pair of nodes. A node u is a neighbor of another node v ∈ V if {u, v} ∈ E. In a tree, every node has at least one neighbor. The set of all neighbors of a node v is denoted by ne(v). If a node v has only one neighbor, it is called a leaf. A tree always has leaves. A subtree of a tree T = (V,E) is a tree T′ = (V′, E′) with V′ ⊆ V and E′ ⊆ E. If a node v ∈ V is eliminated from a tree (together with all edges it is contained in), then the tree decomposes into one or several (sub-)trees, one for every neighbor u ∈ ne(v). We denote these subtrees by T_{vu} = (V_{vu}, E_{vu}). Paths in T between nodes in different subtrees T_{vu} pass through node v.

Let now D be any lattice (of domains). If T = (V,E) is a tree, we introduce a mapping λ : V → D, which assigns a domain λ(v) to every node v ∈ V. This is called a labeling function. If U is a subset of nodes, U ⊆ V, then we


define

λ(U) = ∨_{u∈U} λ(u).

Now we are in a position to define the concept of a Markov tree relative tosuch a lattice as follows:

Definition 11.1 Markov Tree. Let T = (V,E) be a tree and λ : V → D a labeling function, which assigns a domain from D to every node in V. Then T_M = (V,E, λ) is called a Markov tree if

⊥{λ(V_{vu}) : u ∈ ne(v)} | λ(v),  ∀v ∈ V.   (11.1)

We recall from (?) the definition of conditional independence. So, if ne(v) contains two or more nodes, and U_1 and U_2 are disjoint subsets of ne(v), then

λ(∪_{u∈U_1} V_{vu}) ⊥ λ(∪_{u∈U_2} V_{vu}) | λ(v).

In particular, if w_1 ∈ V_{vu_1} and w_2 ∈ V_{vu_2} are nodes in two different subtrees, then

λ(w1)⊥λ(w2)|λ(v). (11.2)

We remark also that in the case of a single neighbor (i.e. for leaves), condition (11.1) is trivially satisfied. We remind also that ⊥∅ | z always holds.

If D is the lattice of subsets of a reference set S, then a Markov tree with reference to this lattice is also called a join tree. This is the case in multivariate systems, as discussed later in Chapter 12. In this case (11.2) becomes

λ(w1) ∩ λ(w2) ⊆ λ(v). (11.3)

And this condition is now equivalent to (11.1). Therefore, a labeled tree is a join tree if, for any pair of nodes w_1 and w_2, condition (11.3) holds for every node v on the path between w_1 and w_2. Condition (11.3) is also called the running intersection property. We shall see later that join trees are in many respects easier to handle than Markov trees. But they are also less general.
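As an illustration of the running intersection property, here is a small Python sketch (ours; the representation and all names are assumptions of the sketch, not from the text) that checks condition (11.3) for a labeled tree whose domains are sets of variables:

    from itertools import combinations

    def is_join_tree(neighbors, label):
        """Check the running intersection property (11.3): for every pair of
        nodes w1, w2, each node v on the path between them must satisfy
        label[w1] & label[w2] <= label[v].
        neighbors: dict node -> set of adjacent nodes (a tree);
        label: dict node -> set of variables."""
        def path(a, b):
            # depth-first search for the unique path a..b in the tree
            stack = [(a, [a])]
            while stack:
                node, p = stack.pop()
                if node == b:
                    return p
                for nxt in neighbors[node]:
                    if nxt not in p:
                        stack.append((nxt, p + [nxt]))
        for w1, w2 in combinations(neighbors, 2):
            common = label[w1] & label[w2]
            if not all(common <= label[v] for v in path(w1, w2)):
                return False
        return True

    # Chain 1 - 2 - 3 with labels {x,y}, {y,z}, {z,u}:
    neighbors = {1: {2}, 2: {1, 3}, 3: {2}}
    label = {1: {"x", "y"}, 2: {"y", "z"}, 3: {"z", "u"}}
    print(is_join_tree(neighbors, label))  # True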

We formulate and prove two basic theorems related to Markov trees.

Theorem 11.1 Any subtree of a Markov tree is also a Markov tree.

Proof. Let T_M = (V,E, λ) be a Markov tree and T′_M = (V′, E′, λ′) a subtree of T_M such that V′ ⊆ V, E′ ⊆ E and λ′ is the restriction of λ to V′. Consider a node v ∈ V′ and define ne′(v) and V′_{vu} relative to the subtree T′_M. Then we have ne′(v) ⊆ ne(v) and V′_{vu} ⊆ V_{vu} for all u ∈ ne′(v). This


implies also λ′(V′_{vu}) ≤ λ(V_{vu}) for all u ∈ ne′(v). But then the independence relation (11.1) implies (Theorem 16 (2) in (?)) that

⊥{λ(V_{vu}) : u ∈ ne′(v)} | λ(v),  ∀v ∈ V′.

This in turn implies

⊥{λ′(V′_{vu}) : u ∈ ne′(v)} | λ′(v),  ∀v ∈ V′,

since λ′(v) = λ(v) and λ′(V′_{vu}) = λ(V′_{vu}) ≤ λ(V_{vu}). This shows that T′_M is a Markov tree. □

Another theorem derives for Markov trees a new independence relation, which is important for local computation (see the next Section 11.3).

Theorem 11.2 Let T_M = (V,E, λ) be a Markov tree. Then ∀v ∈ V and ∀u ∈ ne(v) it holds that

λ(V_{vu}) ⊥ λ(v) | λ(u).

Proof. By the Markov property we have

⊥{λ(V_{uw}) : w ∈ ne(u)} | λ(u)

and

λ(V_{vu}) = ∨_{w∈ne(u)\{v}} λ(V_{uw}) ∨ λ(u).   (11.4)

Using Theorem 16 (4) in Chapter 5 of (?) we deduce

∨_{w∈ne(u)\{v}} λ(V_{uw}) ⊥ λ(V_{uv}) | λ(u).

By Theorem 16 (5) in Chapter 5 of (?) we can then conclude, using (11.4), that

λ(V_{vu}) ⊥ λ(V_{uv}) | λ(u).

Finally, since λ(v) ≤ λ(V_{uv}), we infer with the aid of Theorem 16 (3) in Chapter 5 of (?) that

λ(V_{vu}) ⊥ λ(v) | λ(u),

and this concludes the proof. □

Given a Markov tree T_M = (V,E, λ), we consider now information φ ∈ Φ which factors according to the nodes of the Markov tree. Thus we define

Φ_M = {φ ∈ Φ : φ^{→λ(V)} = ⊗_{v∈V} ψ_v, d(ψ_v) = λ(v)}.


We shall examine in the next Section 11.3 the problem of computing φ^{→λ(v)} for one or all nodes v ∈ V and for φ ∈ Φ_M. Note that for all nodes v ∈ V,

φ^{→λ(v)} = (φ^{→λ(V)})^{↓λ(v)} = (⊗_{v∈V} ψ_v)^{↓λ(v)}.

So, it is actually the problem of projecting a certain factorization according to a Markov tree onto one or all nodes of the tree. We call this a projection problem. This problem has a so-called local computation solution, which organizes the computation in such a way that no information on domains larger than the domains λ(v) ever has to be stored, and no operations, i.e. combinations and projections, on domains larger than λ(v) ever have to be executed; see Section 11.3. So, if all λ(v) belong to the feasible domains D_f, the problem is efficiently solvable, although the domain of φ itself, or λ(V), may be in D_i.

For φ ∈ ΦM the following basic factorization theorem holds:

Theorem 11.3 Markov Tree Factorization Theorem. Let T_M = (V,E, λ) be a Markov tree. Then, ∀φ ∈ Φ_M and ∀v ∈ V,

φ^{→λ(v)} = ψ_v ⊗ ⊗_{u∈ne(v)} (φ_{vu}^{↓λ(u)})^{→λ(v)},   (11.5)

where

φ_{vu} = ⊗_{w∈V_{vu}} ψ_w.

Proof. Note first that, because of idempotency, we have

φ = ψ_v ⊗ ⊗_{u∈ne(v)} (⊗_{w∈V_{vu}} ψ_w) = ⊗_{u∈ne(v)} (ψ_v ⊗ ⊗_{w∈V_{vu}} ψ_w).

This shows that φ ∈ Φ_{⊥{λ(V_{vu}) : u∈ne(v)} | λ(v)}, since d(ψ_v ⊗ ⊗_{w∈V_{vu}} ψ_w) = λ(v) ∨ λ(V_{vu}). Therefore, Theorem 17 in (?) implies that

φ^{→λ(v)} = ⊗_{u∈ne(v)} (ψ_v ⊗ ⊗_{w∈V_{vu}} ψ_w)^{↓λ(v)}
          = ⊗_{u∈ne(v)} (ψ_v ⊗ (⊗_{w∈V_{vu}} ψ_w)^{↓λ(v)∧λ(V_{vu})})
          = ψ_v ⊗ ⊗_{u∈ne(v)} φ_{vu}^{↓λ(v)∧λ(V_{vu})}
          = ψ_v ⊗ ⊗_{u∈ne(v)} φ_{vu}^{→λ(v)}.

Next, we remark that by Theorem 11.2, λ(V_{vu}) ⊥ λ(v) | λ(u). Therefore Theorem 14 in (?) shows that φ_{vu}^{→λ(v)} = (φ_{vu}^{→λ(u)})^{→λ(v)}. So we obtain finally

φ^{→λ(v)} = ψ_v ⊗ ⊗_{u∈ne(v)} (φ_{vu}^{↓λ(u)})^{→λ(v)},   (11.6)

and this concludes the proof. □

This theorem provides the basis for local computation procedures. In fact, according to (11.6), once φ_{vu}^{↓λ(u)} has been computed for all neighbors u of node v, the remaining combinations and projections take place within the domain λ(v). But, since the subtrees T_{vu} = (V_{vu}, E_{vu}, λ) are still Markov trees, for the computation of any of the φ_{vu}^{↓λ(u)} we have a similar factorization theorem, such that, by recursion, all combinations and projections can finally be confined to the domains λ(w) of the nodes w of the Markov tree. This computational scheme will be worked out in detail in Section 11.3.

The membership of φ in Φ_M for some Markov tree T_M may appear to be very restrictive. It is, however, much less so than it appears at first sight. In applications we may assume that a total information φ is given in pieces, say φ_1, . . . , φ_n with domains d(φ_i) = x_i, such that

φ = φ1 ⊗ · · · ⊗ φn.

On the other hand we may be interested in focusing the total information φ onto a certain number of domains s_1, . . . , s_m. We may assume without loss of generality that

s_i ≤ z = ∨_{j=1}^n x_j,  for i = 1, . . . , m.

Otherwise we could replace s_i by s′_i = s_i ∧ z and focus φ onto s′_i. The information φ^{→s_i} then equals simply the vacuous extension (φ^{↓s′_i})^{↑s_i}.

A Markov tree T_M is called a covering Markov tree for the domains x_j, j = 1, . . . , n, and s_i, i = 1, . . . , m, if for all j ∈ J = {1, . . . , n} there is a v ∈ V such that x_j ≤ λ(v), and for all i ∈ I = {1, . . . , m} there is a v such that s_i ≤ λ(v). Note that there always exists a trivial covering Markov tree, namely the tree consisting of a single node v with λ(v) equal to the join of all x_j and all s_i. There exists then an assignment mapping

a : J ∪ I → V


such that x_j ≤ λ(a(j)) for all j ∈ J and s_i ≤ λ(a(i)) for all i ∈ I. Now we define for all v ∈ V

ψ_v = e_{λ(v)} ⊗ ⊗_{j∈J : a(j)=v} φ_j.

Note then that

d(ψ_v) = λ(v),  ∀v ∈ V,

and furthermore

φ^{→λ(V)} = ⊗_{v∈V} ψ_v.

Thus, we have φ ∈ Φ_M.

This means that we can apply local computation techniques with respect to the covering Markov tree T_M. The projection of φ onto a domain s_i is then simply given by the projection of φ^{→λ(a(i))} onto s_i. This is again a local operation. However, with the trivial covering Markov tree nothing is gained. Everything depends on finding a good covering Markov tree, with node domains at least all in D_f and, more generally, as small as possible. Techniques for finding such covering Markov trees will be examined in Chapter 12 for the particular case of multivariate systems.

11.3 Local Computation

We have seen in the previous section that the Markov Tree Factorization Theorem 11.3 provides a recursive procedure for computing φ^{→λ(v)} for some node v of a Markov tree T_M, if φ ∈ Φ_M. In this scheme every combination or projection operation takes place on a domain λ(w) of the Markov tree. In this section we are going to work out this procedure in detail.

In a tree T = (V,E) with m = |V| nodes it is always possible to number the nodes from i = 1 to m such that a selected node v obtains number m and from any node u the numbers increase on the path to v. We identify from now on the nodes with their numbers i and write accordingly λ(i) for the domain associated with node i. The set of neighbor nodes of node i with numbers j < i is denoted by pa(i) (pa for parents). If i is a leaf, then pa(i) = ∅. If i < m, then node i has a unique neighbor with number j > i. It is denoted by ch(i) (ch for child). The root node m has no child. We denote further, as before, by T_i = (V_i, E_i) the subtree of T rooted at node i. Define

φ_i = ⊗_{k∈V_i} ψ_k.


Note that the tree T_i is a Markov tree. Hence, the Markov Tree Factorization Theorem 11.3 tells us that for φ ∈ Φ_M

φ_i^{↓λ(i)} = ψ_i ⊗ ⊗_{j∈pa(i)} (φ_j^{↓λ(j)})^{→λ(i)}.   (11.7)

We may now describe the computation represented by this result in the following way: We associate with every node j a memory and consider a stepwise algorithm for i = 1, . . . , m − 1. The memory content at node j before step i is denoted by ψ_j^{(i)}. At the beginning, for i = 1, we set ψ_j^{(1)} = ψ_j. At step i, node i sends the message

µ_{i→ch(i)} = ψ_i^{(i)↓λ(i)∧λ(ch(i))}

to its child ch(i). The receiving node ch(i) updates its memory content to

ψ_{ch(i)}^{(i+1)} = ψ_{ch(i)}^{(i)} ⊗ µ_{i→ch(i)}.

For all other nodes j ≠ ch(i) we put

ψ_j^{(i+1)} = ψ_j^{(i)}.

We claim that at the end, after step m− 1, node m contains φ→λ(m).

Theorem 11.4 Let φ ∈ Φ_M and define ψ_j^{(i)} as above. Then

ψ_m^{(m)} = φ^{→λ(m)}.   (11.8)

Proof. We remark that ψ_i^{(i)} = φ_i^{↓λ(i)}. Indeed, this holds clearly for i = 1. Assume it holds for all j < i. Then, by (11.7) and the stepwise procedure above, it holds for i too, and the proposition is proved by induction. Then (11.8) follows for i = m, since φ_m = φ^{→λ(V)} and (φ^{→λ(V)})^{↓λ(m)} = φ^{→λ(m)}. □

This gives us a stepwise algorithm to compute φ^{→λ(m)}. At all steps in this procedure we combine or project elements on node domains λ(i). At no step do we need to go beyond these domains. Therefore, this procedure is called local computation. If, for all nodes i of the Markov tree, we have λ(i) ∈ D_f, local computation provides a feasible procedure. In particular, we call this local algorithm to compute φ^{→λ(m)} the collect algorithm, since it collects all information in order to compute its projection onto λ(m).
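For concreteness, here is a minimal Python sketch of the collect algorithm (our own illustration, not part of the text) in the relational instance: an information element is a pair (domain, relation), combination is natural join, and projection restricts tuples to a subdomain. The representation and all names are assumptions of this sketch:

    # A piece of information: (domain, relation); domain is a frozenset of
    # variable names, relation a set of tuples, each a frozenset of (var, value).

    def combine(phi, psi):
        """Combination = natural join in the relational instance."""
        (d1, r1), (d2, r2) = phi, psi
        common = d1 & d2
        joined = set()
        for t1 in r1:
            a1 = dict(t1)
            for t2 in r2:
                a2 = dict(t2)
                if all(a1[v] == a2[v] for v in common):
                    joined.add(frozenset({**a1, **a2}.items()))
        return (d1 | d2, joined)

    def project(phi, dom):
        """Projection = restriction of every tuple to the subdomain dom."""
        d, r = phi
        return (dom, {frozenset((v, c) for v, c in t if v in dom) for t in r})

    def collect(parent, label, psi, order):
        """Collect algorithm: order lists the nodes so that every node comes
        before its child ch(i) = parent[i]; order[-1] is the root m. At step
        i, node i sends its content projected to the meet of λ(i) and λ(ch(i))."""
        store = dict(psi)
        for i in order[:-1]:
            j = parent[i]
            msg = project(store[i], label[i] & label[j])
            store[j] = combine(store[j], msg)
        return store  # store[order[-1]] holds φ projected to λ(m)

    # Tiny join tree 1 - 2 (root) with λ(1) = {x, y}, λ(2) = {y, z}:
    t = lambda **kw: frozenset(kw.items())
    label = {1: frozenset({"x", "y"}), 2: frozenset({"y", "z"})}
    psi = {1: (label[1], {t(x=0, y=1), t(x=1, y=2)}),
           2: (label[2], {t(y=1, z=0), t(y=2, z=1)})}
    print(collect({1: 2}, label, psi, order=[1, 2])[2])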

We may extend this algorithm to compute φ^{→λ(i)} for all nodes i of the Markov tree. In fact, if we want to compute φ^{→λ(i)} for some node i ≠ m, we may redirect the edges of the Markov tree towards node i and repeat the collect algorithm with these new orientations of the edges. Clearly, only the messages on the edges from node m to node i have to be newly computed.


Only these edges have changed direction, so the messages along all other edges do not change. We may therefore assign the two messages µ_{i→j} and µ_{j→i} to an edge {i, j} of the Markov tree. The messages themselves are, according to the collect algorithm, defined as

µ_{i→j} = (ψ_i ⊗ ⊗_{k∈ne(i), k≠j} µ_{k→i})^{↓λ(i)∧λ(j)}.

As a corollary of the correctness of the collect algorithm, Theorem 11.4, it follows that

φ^{↓λ(i)} = ψ_i ⊗ ⊗_{j∈ne(i)} µ_{j→i}.

We claim now that

µ_{i→j} ≤ φ^{↓λ(i)}.   (11.9)

This holds since from the collect algorithm we have µ_{i→j} = ψ_i^{(i)↓λ(i)∧λ(j)} ≤ φ^{↓λ(i)∧λ(j)} ≤ φ^{↓λ(i)}. But then we obtain (using the combination axiom) that

φ^{↓λ(i)∧λ(j)} = (ψ_i ⊗ ⊗_{k∈ne(i)} µ_{k→i})^{↓λ(i)∧λ(j)}
             = (ψ_i ⊗ ⊗_{k∈ne(i), k≠j} µ_{k→i})^{↓λ(i)∧λ(j)} ⊗ µ_{j→i}
             = µ_{i→j} ⊗ µ_{j→i}.

Assume then that in node i we have stored φ^{↓λ(i)}. If we send the message φ^{↓λ(i)∧λ(j)} to node j and combine this message there with the information stored from the collect algorithm, then we obtain, according to (11.9),

ψ_j^{(j)} ⊗ φ^{↓λ(i)∧λ(j)} = ψ_j^{(j)} ⊗ µ_{i→j} ⊗ µ_{j→i}
                        = (ψ_j ⊗ ⊗_{k∈ne(j)} µ_{k→j}) ⊗ µ_{j→i}
                        = φ^{↓λ(j)} ⊗ µ_{j→i}
                        = φ^{↓λ(j)}.

This result shows that, once the collect algorithm towards some root node m has terminated, we may just send messages φ^{↓λ(i)∧λ(j)} from node i to all its


parents j ∈ pa(i) and combine them there with the stored information ψ_j^{(j)}. This will give us the projections φ^{↓λ(j)} at all nodes of the Markov tree. This second local algorithm is called the distribute algorithm. Hence we can compute the projections to all nodes of the Markov tree by passing through each node (except the root node m) just twice, sending messages to all its neighbors in due order.
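Continuing the sketch above (same assumed representation; it reuses the hypothetical helpers combine and project), the collect and distribute phases together compute the projection φ^{↓λ(i)} for every node:

    def collect_distribute(parent, label, psi, order):
        """Collect toward the root, then distribute back toward the leaves.
        Returns (result, store): result[i] = φ projected to λ(i) for every
        node i, and store[i] = the collect-phase content ψ_i^{(i)}, which is
        kept because updating (Section 11.4) reuses it."""
        store = dict(psi)
        for i in order[:-1]:                       # collect phase
            j = parent[i]
            store[j] = combine(store[j], project(store[i], label[i] & label[j]))
        root = order[-1]
        result = {root: store[root]}               # root already holds φ^{↓λ(m)}
        for i in reversed(order[:-1]):             # distribute phase
            j = parent[i]                          # ch(i), already finished
            msg = project(result[j], label[i] & label[j])
            result[i] = combine(store[i], msg)     # ψ_i^{(i)} ⊗ φ^{↓λ(i)∧λ(j)}
        return result, store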

The following lemma shows that at each step during the collect, as well as during the distribute algorithm, the information φ factors into the storages at the nodes of the Markov tree.

Lemma 11.1 Let M be a Markov tree and φ ∈ Φ_M. Then, ∀i = 1, . . . , m,

φ = ⊗_{j=1}^m ψ_j^{(i)}   (11.10)

  = ⊗_{j=1}^i ψ_j^{(j)} ⊗ ⊗_{j=i+1}^m φ^{↓λ(j)}.   (11.11)

Proof. We prove the first part by induction over the steps of the collect algorithm. At the beginning of the collect phase we have ψ_j^{(1)} = ψ_j, and (11.10) holds since φ ∈ Φ_M. Assume then that it holds for some i < m,

φ = ⊗_{j=1}^m ψ_j^{(i)} = ⊗_{j=1, j≠ch(i)}^m ψ_j^{(i+1)} ⊗ ψ_{ch(i)}^{(i)}
  = ⊗_{j=1, j≠ch(i)}^m ψ_j^{(i+1)} ⊗ ψ_{ch(i)}^{(i)} ⊗ µ_{i→ch(i)},

since µ_{i→ch(i)} = ψ_i^{(i)↓λ(i)∧λ(ch(i))} ≤ φ^{↓λ(i)} ≤ φ. But then, as ψ_{ch(i)}^{(i+1)} = ψ_{ch(i)}^{(i)} ⊗ µ_{i→ch(i)}, this shows that

φ = ⊗_{j=1}^m ψ_j^{(i+1)}.

In particular, at the end of the collect algorithm, we have

φ = ⊗_{j=1}^{m−1} ψ_j^{(j)} ⊗ φ^{↓λ(m)},

because ψ_j^{(i)} = ψ_j^{(j)} for i ≥ j, i.e. the storage of node j does not change after step j of the collect algorithm, and ψ_m^{(m)} = φ^{↓λ(m)} by Theorem 11.4.

Therefore, (11.11) holds at the beginning of the distribute algorithm, for i = m − 1. We prove (11.11) by induction over the steps of the distribute algorithm. Assume (11.11) holds for some i. Then

φ = ⊗_{j=1}^i ψ_j^{(j)} ⊗ ⊗_{j=i+1}^m φ^{↓λ(j)}.

Select k such that k = ch(i), hence i < k. Then the storage of node k contains φ^{↓λ(k)}; node k sends the message φ^{↓λ(k)∧λ(i)} to node i, whose storage ψ_i^{(i)} changes to φ^{↓λ(i)} according to the distribute algorithm. Furthermore we note that ψ_i^{(i)} ≤ φ^{↓λ(i)} ≤ φ. Therefore,

φ = ⊗_{j=1}^{i−1} ψ_j^{(j)} ⊗ ψ_i^{(i)} ⊗ φ^{↓λ(i)} ⊗ ⊗_{j=i+1}^m φ^{↓λ(j)}
  = ⊗_{j=1}^{i−1} ψ_j^{(j)} ⊗ ⊗_{j=i}^m φ^{↓λ(j)}.

This proves (11.11). □

As a corollary of (11.11) we note that (at the end of the distribute algorithm) the information φ ∈ Φ_M factors into its projections to the nodes of the Markov tree.

Corollary 11.1 Let M be a Markov tree and φ ∈ Φ_M. Then

φ = ⊗_{j=1}^m φ^{↓λ(j)}.   (11.12)

This is an important result. We recall in particular that in relational database theory normalization often requires that the database decomposes according to (11.12) (see Part I, Chapter 5). Clearly, applying the collect or distribute algorithm to a factorization (11.12) does not change the storage contents of the Markov tree, i.e. it has no effect. As we shall see in the next section, this factorization allows for simple updating algorithms. Query processing in relational databases can be reduced to updating. This is one of the facets of this theorem.

11.4 Updating and Retracting

Once the projections of an information φ ∈ Φ_M with respect to a Markov tree have been computed, a new piece of information may be added, and the projections computed again. In this section we show that, if the new information is covered by a node of the Markov tree, then we do not need to recompute the projections from scratch. In fact, the collect phase need not be redone. Only a new distribute computation has to be added.


According to Corollary 11.1, at the end of a collect/distribute computation for a φ ∈ Φ_M, the information φ factors into its projections to the nodes of the Markov tree M. Assume that an additional piece of information ψ must be added, where the domain of ψ is covered by a node of the Markov tree M. We take the corresponding covering node as the root of a directed tree. Without loss of generality we may assume that node m covers ψ, i.e. d(ψ) ≤ λ(m). Then we have the new information

φ′ = φ ⊗ ψ = φ ⊗ (ψ ⊗ e_{λ(m)}).

Since d(ψ ⊗ e_{λ(m)}) = λ(m), we obtain

φ′^{↓λ(m)} = (φ ⊗ (ψ ⊗ e_{λ(m)}))^{↓λ(m)} = φ^{↓λ(m)} ⊗ (ψ ⊗ e_{λ(m)}) = φ^{↓λ(m)} ⊗ ψ.

So, the updating of the projection in the root node m is done simply by combining the new information with the old projection to the node domain λ(m).

We note that the messages µ_{i→ch(i)} in the direction of the root node do not change with this updating operation at the root node. This already shows that the collect phase does not have to be recomputed. We show now that we only need to recompute the distribute phase, starting with φ′^{↓λ(m)} in the root node. Indeed, we have by Corollary 11.1 that

in the root node. Indeed, we have by Corollary 11.1 that

φ′ = φ⊗ ψ =m−1⊗i=1

φ↓λ(i) ⊗ φ′↓λ(m).

Clearly φ′ ∈ Φ_M. So we may apply the collect/distribute algorithm in order to compute the projections of φ′ with respect to the nodes of the Markov tree M. However, as noted above, nothing changes in the collect phase. So we may start directly with the distribute algorithm, with the new content of the root node m.
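In the notation of the sketches above (combine, project and the outputs of collect_distribute are the hypothetical helpers introduced there), updating by a new piece covered by the root then only repeats the distribute pass:

    def update_root(parent, label, order, result, store, new_info):
        """Combine new_info, with d(new_info) covered by λ(root), into the
        root projection and redo only the distribute phase; the collect
        storages store[i] = ψ_i^{(i)} are unchanged and reused."""
        root = order[-1]
        result = dict(result)
        result[root] = combine(result[root], new_info)   # φ'^{↓λ(m)}
        for i in reversed(order[:-1]):
            j = parent[i]
            msg = project(result[j], label[i] & label[j])
            result[i] = combine(store[i], msg)
        return result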


Chapter 12

Multivariate Systems

12.1 Local Computation on Covering Join Trees

12.2 Fusion Algorithm

12.3 Updating


Chapter 13

Boolean Information Algebra


Chapter 14

Finite Information and Approximation


Chapter 15

Effective Algebras


Chapter 16


Chapter 17


Part III

Information and Language


Chapter 18

Introduction


Chapter 19

Contexts

19.1 Sentences and Models

19.2 Lattice of Contexts

19.3 Information Algebra of Contexts


Chapter 20

Propositional Information


Chapter 21

Expressing Information in Predicate Logic


Chapter 22

Linear Systems


Chapter 23

Generating Atoms


Chapter 24


Part IV

Uncertain Information


Chapter 25

Functional Models and Hints

25.1 Inference with Functional Models

In this section we examine, by way of an introductory example, a particular way in which uncertain information can arise. This should serve as a motivation and illustration for the general, abstract model presented in the subsequent sections. The example is drawn from statistics. Based on a functional model, which describes how data is generated in some experiment or process, an assumption-based inference approach is presented. This represents the logical part of the inference. On top of this inference, a probability analysis allows us to measure the likelihood of the possible deductions. The inference can be subsumed in an object called a hint, which represents the information drawn from the experiment.

Functional models describe the process by which data x is generated from a parameter θ and some random element ω. The set of possible values, i.e. the domain, of the data x is denoted by X, whereas the domain of the parameter θ is denoted by Θ and the domain of the random element ω is denoted by Ω. Unless otherwise stated, we assume that the sets X, Θ and Ω are finite. This is in order to simplify the mathematics, but it is not a necessary assumption. The data generation process is specified by a function

f : Θ× Ω→ X (25.1)

that relates the data with the perturbation and the parameter. In other words, if θ ∈ Θ is the correct value of the parameter and the random element ω ∈ Ω occurs, then the data x is uniquely determined by the function f according to the equation

x = f(θ, ω). (25.2)

We assume that the function f is known and we also assume that the probability distribution of the random element ω is known. This probability distribution is denoted by p(ω) and the corresponding probability measure on Ω is denoted by P. Note that the probability measure P does not depend


on θ. The function f together with the probability measure P constitute a functional model for a statistical experiment E.

In a functional model, if we assume a parameter θ, then, from the probabilities p(ω), we can compute the probabilities of the data x, namely

p_θ(x) = Σ_{ω : x=f(θ,ω)} p(ω).

This shows that a functional model induces a parametric family of probability distributions on the sample space X, an object which is usually assumed a priori in modelling statistical experiments. These probability measures are the statistical specifications associated with the functional model. We emphasize that different functional models may induce the same statistical specifications p_θ(x). This means that functional models contain more information than the family of distributions p_θ(x).
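For finite X, Θ and Ω this computation is immediate; here is a minimal Python sketch (our own illustration), using the sensor model of Example 25.1 below with an assumed failure probability of 0.01:

    from collections import defaultdict

    def statistical_specification(f, p_omega, thetas):
        """Compute p_theta(x) = sum of p(omega) over {omega : x = f(theta, omega)}
        for a finite functional model (f, p_omega)."""
        spec = {}
        for theta in thetas:
            dist = defaultdict(float)
            for omega, p in p_omega.items():
                dist[f(theta, omega)] += p
            spec[theta] = dict(dist)
        return spec

    # Sensor of Example 25.1: alarm iff the event occurs and the sensor is intact.
    f = lambda theta, omega: "a" if (theta == "e" and omega == "ok") else "na"
    p_omega = {"ok": 0.99, "fail": 0.01}   # assumed failure probability p = 0.01
    print(statistical_specification(f, p_omega, ["e", "ne"]))
    # {'e': {'a': 0.99, 'na': 0.01}, 'ne': {'na': 1.0}}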

Let’s illustrate the idea of functional models by two examples.

Example 25.1 Sensor. Consider a sensor which supervises a dangerous event e. If this event occurs, it sends an alarm signal a. However, there is the possibility that the sensor fails, a possibility denoted by ω. To complete the notation, let us denote the non-occurrence of event e by ¬e, the non-occurrence of the alarm signal by ¬a and, finally, the case of an intact sensor by ¬ω. Now, we may assume that if event e occurs and the sensor is intact, then the alarm sounds. On the other hand, if event e did not occur and the sensor is intact, then no alarm occurs. Finally, if the sensor fails, no alarm is possible. If we consider the set X = {a, ¬a} the observation space, the set Θ = {e, ¬e} the parameter space and Ω = {ω, ¬ω} the disturbance space, then these assumptions can be captured in the following functional model:

f(e, ¬ω) = a,  f(¬e, ¬ω) = ¬a,  f(e, ω) = f(¬e, ω) = ¬a.

If we assume a probability p for the failure of the sensor, p(ω) = p, then the functional model is complete.

Example 25.2 Poisson Process. Suppose that traffic is observed at a given point over time. An average number of λ·∆t cars pass during a time interval ∆t, where λ is unknown. This corresponds to a Poisson process where the time interval between two subsequent cars is exponentially distributed with parameter λ. In order to get information about the unknown parameter λ, we may fix a time interval of some given length, say ∆t, and then observe the number of cars passing in this time interval. Here we have already two ingredients of a functional model describing this process, namely the parameter and the observation. The random elements ω are responsible for the random time intervals between passing cars. In fact, if ω is a random variable distributed according to the exponential distribution with unit


parameter, given by the density function

e^{−t},   (25.3)

then µ = ω/λ is a random variable with exponential distribution with parameter λ, which means a mean time of 1/λ between two subsequent cars. So, the count of cars passing during the time interval ∆t is given by the largest value i such that

Σ_{k=1}^i µ_k ≤ ∆t.   (25.4)

Here the µ_k are stochastically independent random variables, each with exponential distribution with parameter λ. Thus, we may define the following functional model for the count x of passing cars in a Poisson process:

x = i,  if (1/λ) Σ_{k=1}^i ω_k ≤ ∆t < (1/λ) Σ_{k=1}^{i+1} ω_k,  for i = 0, 1, . . . .   (25.5)

Here the ω_k are assumed to be stochastically independent and exponentially distributed with parameter 1. This model explains how the counts in a time interval of length ∆t are generated. In fact, in this model, the random element ω is given by an infinite sequence of independent random variables ω_1, ω_2, . . ., all with the same exponential distribution.

25.2 Assumption-Based Inference

Consider an experiment E described by a functional model x = f(θ, ω) with given probabilities p(ω) for the random elements. Suppose that the outcome of the experiment is observed to be x. Given this data x and the experiment E, what can be inferred about the value of the unknown parameter θ? To answer this question, we use assumption-based reasoning. The basic idea of assumption-based reasoning is to assume that a random element ω generated the data and then to determine the consequences for the unknown parameter which can be drawn from this assumption. The consequences about θ are then evaluated according to the probabilities p(ω) of the assumptions ω in Ω.

Some random elements ω in Ω may become impossible after x has been observed. In fact, if, for an ω ∈ Ω, there is no θ ∈ Θ such that x = f(θ, ω), then this ω is clearly impossible: it cannot have generated the actual observation x. Therefore, the observation x defines an event in Ω, namely the event

v_x = {ω ∈ Ω : there is a θ ∈ Θ such that x = f(θ, ω)}.   (25.6)


Since it is known that v_x has occurred, we condition the probability measure P on the event v_x, which leads to the revised probabilities

p′(ω) = p(ω) / P(v_x)

for all ω ∈ v_x. These probabilities define a probability measure P′ on v_x given by the equation

P′(A) = Σ_{ω∈A} p′(ω)

for all subsets A ⊆ v_x. It is still unknown which random element ω in v_x generated the observation x, but p′(ω) is the probability that this element is ω. Nevertheless, let us assume for the time being that ω ∈ v_x caused the observation x. Then, according to the function f relating the parameter and the random element with the data, the possible values of the parameter θ can be logically restricted to the set

T_x(ω) = {θ ∈ Θ : x = f(θ, ω)}.   (25.7)

Note that in general T_x(ω) may contain several elements, but it could also be a one-element subset of Θ. It is also possible that T_x(ω) = Θ, in which case the observation x does not carry any information about θ, under the assumption that ω caused the observation x. Therefore, in general, even if the chance element generating the observation x were known, it would still not be possible to identify the value of the parameter unambiguously. This analysis shows that an observation x in a functional model generates the structure

H_x = (v_x, P′, T_x, Θ),   (25.8)

which we call a hint.
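For a finite functional model, the whole construction of the hint H_x can be sketched in a few lines of Python (our own illustration; all names are ours):

    def hint_from_observation(f, p_omega, thetas, x):
        """Assumption-based inference sketch: given a finite functional model
        (f, p_omega) and an observation x, build the hint (v_x, P', T_x)."""
        # v_x: assumptions compatible with the observation, cf. (25.6)
        v_x = [w for w in p_omega
               if any(f(th, w) == x for th in thetas)]
        total = sum(p_omega[w] for w in v_x)
        p_prime = {w: p_omega[w] / total for w in v_x}   # conditioning on v_x
        # T_x(omega): parameters consistent with x under omega, cf. (25.7)
        T_x = {w: {th for th in thetas if f(th, w) == x} for w in v_x}
        return v_x, p_prime, T_x

    # Sensor example, observation 'na' (no alarm), failure probability 0.01:
    f = lambda theta, omega: "a" if (theta == "e" and omega == "ok") else "na"
    v, p1, T = hint_from_observation(f, {"ok": 0.99, "fail": 0.01},
                                     ["e", "ne"], "na")
    print(v, p1, T)
    # ['ok', 'fail'] {'ok': 0.99, 'fail': 0.01}
    # T: 'ok' -> {'ne'}, 'fail' -> {'e', 'ne'} (set order may vary)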

This approach to inference can be illustrated by the two examples introduced at the end of Section 25.1.

Example 25.3 Sensor. Assume that for the sensor described in Example 25.1 an alarm a sounds. Then we may use the functional model specified there to judge the likelihood that event e occurred. First it is clear that the sensor must be intact, since otherwise there would be no alarm. Hence we have v_a = {¬ω}. But then T_a(¬ω) = {e}, hence event e has occurred for sure. So this is a very reliable inference.

If, on the other hand, no alarm sounds, how sure can we be that no event e is present? We remark that the observation ¬a is compatible both with ω (sensor failed) as well as with ¬ω (sensor intact). Hence we have v_{¬a} = Ω = {ω, ¬ω}. If we assume that the sensor is intact, then we conclude that there is no event e, i.e. T_{¬a}(¬ω) = {¬e}. If we assume, on the other hand, that the sensor failed, then both possibilities e and ¬e


remain possible, T_{¬a}(ω) = {e, ¬e}. So in this case we arrive at no unique and clear conclusion. We remark however that it may be safe to assume that the sensor is usually functioning correctly, such that we have some guarantee that event e did not occur. This informal consideration will be made more precise below.

Example 25.4 Inference for a Poisson Process. We refer to the functional model (25.5) defined in Example 25.2 for the counting of passing cars. Although in this case neither the parameter space nor the space of disturbances ω is finite, the inference can nevertheless be carried out in the way described in this section. If we observe a given value x, then, according to (25.5), we must have

Σ_{k=1}^x ω_k ≤ ∆t · λ < Σ_{k=1}^{x+1} ω_k.

First, we note that the observation x excludes no sequence ω_1, ω_2, . . ., hence no conditioning of the underlying probabilities is needed. If we assume a sequence ω = ω_1, ω_2, . . ., we conclude thus that the unknown parameter must be in the interval

T_x(ω) = [ (1/∆t) Σ_{k=1}^x ω_k , (1/∆t) Σ_{k=1}^{x+1} ω_k ).

Thus the elements of the hint H_x associated with the count of x cars passing during the interval ∆t are fixed: v_x is still the whole set of all sequences ω = ω_1, ω_2, . . ., none being excluded by the observation x. Thus the original probability distribution does not change. The multivalued mapping T_x is defined by the intervals obtained above.

25.3 Hints

The hint defined in (25.8) is an instance of a more general notion of a hint. In general, if Θ denotes the set of possible answers to a question of interest, then a hint on Θ is a quadruple of the form

H = (Ω, P, Γ, Θ),

where Ω is a set of assumptions, P is a probability measure on Ω reflecting the probabilities of the different assumptions, and Γ is a mapping from the assumptions into the power set of Θ,

Γ : Ω → 2^Θ.

For a fixed assumption ω ∈ Ω, the subset Γ(ω) is the smallest subset of Θ that is known for sure to contain the correct answer to the question of


interest. In other words, if the assumption ω is correct, then the answer is certainly within Γ(ω). The theory of hints is presented in detail in the reference (Kohlas & Monney, 1995). Intuitively, a hint represents a piece of information regarding the unknown value of the parameter in Θ. This information can then be used to evaluate the validity of certain hypotheses regarding Θ. A hypothesis is a subset H of Θ, and it is correct if, and only if, it contains the answer to the question of interest.

The most important concept for the evaluation of a hypothesis H is the degree of support of H induced by the hint H, which is defined by

sp(H) = P({ω ∈ Ω : Γ(ω) ⊆ H}).   (25.9)

If the assumption ω is the true one and Γ(ω) ⊆ H, then H is necessarily true, because the correct answer is in Γ(ω) by definition of the mapping Γ. In other words, under the assumption ω the hypothesis H can be derived or proved. The degree of support of H is the probability of the assumptions that are capable of proving H. Such assumptions are called arguments for the validity of H, and sp(H) measures the strength of these arguments. The arguments represent the logical evaluation of H, whereas sp(H) represents its quantitative evaluation.

Similarly, the degree of plausibility of H, which is defined as

pl(H) = P({ω ∈ Ω : Γ(ω) ∩ H ≠ ∅}),   (25.10)

measures the level of compatibility between the hypothesis H and the assumptions. Obviously, degrees of support and plausibility for hypotheses define functions

sp : 2^Θ → [0, 1],  pl : 2^Θ → [0, 1],

which are called the support and plausibility functions associated with the hint. It can easily be shown that

pl(H) = 1 − sp(H^c),

and that sp is a belief function as defined by Shafer (Shafer, 1976).
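As a small computational companion (our own sketch; the dictionary representation of a finite hint is an assumption of the sketch), the two functions read:

    def sp(hint, H):
        """Degree of support (25.9): probability of assumptions proving H."""
        omega_p, gamma = hint
        return sum(p for w, p in omega_p.items() if gamma[w] <= H)

    def pl(hint, H):
        """Degree of plausibility (25.10): probability of assumptions
        compatible with H."""
        omega_p, gamma = hint
        return sum(p for w, p in omega_p.items() if gamma[w] & H)

    # Sensor hint after observing no alarm, failure probability p = 0.01:
    hint = ({"ok": 0.99, "fail": 0.01},
            {"ok": {"ne"}, "fail": {"e", "ne"}})
    print(sp(hint, {"ne"}), pl(hint, {"e"}))   # 0.99 0.01, i.e. 1 - p and p

Let us illustrate these concepts by our two examples.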

Example 25.5 Sensor. We continue the analysis of the sensor example started in Example 25.3. We noted that if an alarm sounds, then event e occurred for sure. This translates into sp({e}) = 1, since T_a(¬ω) = {e} and p′(¬ω) = 1.

More interesting is the case of no alarm. In this case sp({¬e}) = 1 − p, since only the assumption that the sensor is intact supports the hypothesis that event e did not occur. There is no assumption supporting the hypothesis that event e occurred. However, this hypothesis is not entirely excluded; in


fact it has some plausibility, namely pl({e}) = 1 − sp({¬e}) = p. In practice the reliability of a sensor will be high, which means that the probability of failure p is small. So there is much support for the hypothesis that the event e did not occur, and there is only a small degree of plausibility that it did occur.

Example 25.6 Inference for a Poisson Process. Here we continue the analysis of the hint developed in Example 25.4 for the Poisson process. The considerations there allow us to determine the degree of support for intervals (we take ∆t = 1 to simplify the notation) as

sp_x(u ≤ λ ≤ v) = P(u ≤ Σ_{k=1}^x ω_k < Σ_{k=1}^{x+1} ω_k ≤ v).

It is convenient to introduce the distribution function

Q_x(u, v) = P(Σ_{k=1}^x ω_k ≤ u ≤ v ≤ Σ_{k=1}^{x+1} ω_k) = (1/x!) u^x e^{−v}  for 0 ≤ u ≤ v.

This distribution function has a density function

q_x(u, v) = ∂²Q_x(u, v)/∂u∂v = Q_{x−1}(u, v)  for x ≥ 1.

For x = 0 we have Q_0(u, v) = e^{−v}, with density q_0(u, v) = e^{−v}. Further we define

F_x(u) = P(Σ_{k=1}^x ω_k ≤ u),

G_x(v) = P(v ≤ Σ_{k=1}^x ω_k).

Both these functions have densities too, namely

f_x(u) = ∂Q_x(u, v)/∂u |_{u=v} = q_x(u, u),

g_x(v) = ∂Q_x(u, v)/∂v |_{u=v} = q_{x+1}(u, u).

This permits us to compute sp_x(u ≤ λ ≤ v), since

sp_x(u ≤ λ ≤ v) = 1 − (F_x(u) + G_{x+1}(v) − Q_x(u, v)).


Similarly, we compute the degree of plausibility of the unknown parameter as

pl_x(u ≤ λ ≤ v) = 1 − (P(v < Σ_{k=1}^x ω_k) + P(Σ_{k=1}^{x+1} ω_k < u))
              = 1 − ((1 − F_x(v)) + (1 − G_{x+1}(u)))
              = F_x(v) + G_{x+1}(u) − 1.

In particular, we may determine the plausibility of singletons,

pl_x(u ≤ λ ≤ u) = P(Σ_{k=1}^x ω_k ≤ u ≤ Σ_{k=1}^{x+1} ω_k) = (1/x!) u^x e^{−u}.

This could be used to determine the most plausible estimate of the unknown parameter λ, i.e. for maximum likelihood estimation.

25.4 Combination and Focussing of Hints

When there are several hints relative to the same domain Θ, the question is how we should combine these hints in order to obtain a single hint reflecting the aggregated information contained in them. For the sake of simplicity, we only consider the case where two hints are combined; the generalization to more than two hints is straightforward. So let

H_1 = (Ω_1, P_1, Γ_1, Θ),  H_2 = (Ω_2, P_2, Γ_2, Θ)   (25.11)

be two hints referring to Θ. If the assumptions of the two hints are independent, then the prior probability of a pair of assumptions (ω_1, ω_2) in Ω_1 × Ω_2 is P_1(ω_1) · P_2(ω_2). If the intersection Γ_1(ω_1) ∩ Γ_2(ω_2) is empty, then this pair is called contradictory because, by construction of the mappings Γ_1 and Γ_2, it is impossible that both ω_1 and ω_2 are simultaneously correct. Therefore, if

C = {(ω_1, ω_2) ∈ Ω_1 × Ω_2 : Γ_1(ω_1) ∩ Γ_2(ω_2) = ∅}

denotes the set of contradictory pairs of assumptions, then the correct pair must be in the set

Ω = (Ω_1 × Ω_2) \ C.   (25.12)

Since the correct pair of assumptions is in Ω, we must condition the product probability measure P_1 × P_2 on the set Ω. Specifically, if we define

K = Σ{P_1(ω_1) P_2(ω_2) : (ω_1, ω_2) ∈ C},


then the conditioning operation results in a new probability space (Ω, P), where

P(ω_1, ω_2) = P_1(ω_1) P_2(ω_2) / (1 − K)   (25.13)

for all (ω_1, ω_2) ∈ Ω. Here we assume that K ≠ 1. Furthermore, if (ω_1, ω_2) is the correct pair of assumptions, then the correct value in Θ must necessarily be in the set

Γ(ω_1, ω_2) = Γ_1(ω_1) ∩ Γ_2(ω_2).   (25.14)

This is the smallest subset of Θ that is known for sure to contain the correct answer if both ω_1 and ω_2 are assumed to be correct. Therefore, we define the combination of the two hints H_1 and H_2 as the hint

H_1 ⊕ H_2 = (Ω, P, Γ, Θ),

where Ω is defined in (25.12), P is defined in (25.13) and Γ is defined in (25.14). Clearly, this method of combining hints corresponds to Dempster’s rule for combining belief functions (Shafer, 1976). If K = 1, then the combination H_1 ⊕ H_2 is not defined.

There is another notion regarding hints that we need to introduce, namely the operation of marginalization of a hint. Consider a hint on a domain that is the Cartesian product of two domains, namely a hint of the form

H = (Ω, P, Γ, Θ_1 × Θ_2).

In order to evaluate hypotheses regarding the domain Θ_1, we define the projection of H to Θ_1 as the hint

H^{↓Θ_1} = (Ω, P, Γ_1, Θ_1),

where

Γ_1(ω) = Γ^{↓Θ_1}(ω) = {θ_1 ∈ Θ_1 : ∃ θ_2 ∈ Θ_2 such that (θ_1, θ_2) ∈ Γ(ω)}.

The set Γ_1(ω) is the projection of Γ(ω) to the domain Θ_1. The definition of the mapping Γ_1 is justified by the fact that the projection Γ_1(ω) is the smallest subset of Θ_1 that is known for sure to contain the correct value in Θ_1 when we know for sure that the correct joint value is in Γ(ω). The marginalized hint H^{↓Θ_1} can then be used as usual to evaluate hypotheses H that are subsets of Θ_1.

Next we present some notions related to partitions; they are used to generalize the operations of combination and projection introduced above, and they will be useful in the development of the algebraic theory of hints. Consider a general domain Θ representing the possible answers to a question of interest. A partition P of Θ is a family of non-empty subsets of Θ which are mutually disjoint and cover Θ, namely

∪_{A∈P} A = Θ,  A ∩ A′ = ∅ for all A, A′ ∈ P with A ≠ A′.


Now consider a hint H = (Ω, P, Γ, Θ)

on Θ and a partition P of Θ. This partition represents a coarser granularity for the possible answers to the question of interest and we want to express the information contained in the hint with respect to this coarser granularity. By construction, under the assumption ω ∈ Ω, it is known for sure that the correct answer to the question is within the subset Γ(ω). Then every element A of the partition P that intersects with Γ(ω) might contain the answer. Therefore, the set

Γ→P(ω) = ∪{A ∈ P : A ∩ Γ(ω) ≠ ∅}

is the smallest union of elements of P that necessarily contains the correct answer to the question. This shows that the hint H is represented by the hint

H→P = (Ω, P,Γ→P ,Θ)

when the coarser domain P is used instead of the original domain Θ. This operation of coarsening the information contained in the hint H is called focussing. It generalizes the projection of hints.
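As an illustration, again a sketch under the hypothetical representation above, with a partition given as a list of disjoint frozensets covering Θ:

    # Focussing of a hint to a partition P of Theta: Gamma->P(omega) is
    # the union of all blocks of P meeting Gamma(omega).
    def focus(h, partition):
        return [(w, p, frozenset().union(*(A for A in partition if A & g)))
                for (w, p, g) in h]

    P = [frozenset({1, 2}), frozenset({3, 4})]
    h = [("a", 1.0, frozenset({2, 3}))]
    print(focus(h, P))  # the focal set becomes {1, 2, 3, 4}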

We say that a hint H is carried by a partition P if

H→P = H.

Since (H→P)→P = H→P,

it follows that the coarsened hint H→P is carried by the partition P.

It is possible to define a partial order on the set D consisting of all partitions of Θ. Indeed, if P and P′ are two partitions in D, then we say that P ≤ P′ if, for all A ∈ P, there is an A′ ∈ P′ such that A ⊆ A′. In this case we say that P is finer than P′, or, alternatively, that P′ is coarser than P. This terminology is justified by the fact that, for every A′ ∈ P′, we have

A′ = ∪{A ∈ P : A ⊆ A′}.

The partition consisting of all one-element subsets of Θ is the bottom element of this partial order. We identify Θ with this bottom partition, which allows us to consider that Θ is in D. It can be shown that the pair (D, ≤) is actually a lattice with meet and join given by

P ∧ P′ = {A ∩ A′ : A ∈ P, A′ ∈ P′, A ∩ A′ ≠ ∅}

P ∨ P′ = ∧{Q ∈ D : Q ≥ P, Q ≥ P′}.
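The meet is easy to compute explicitly; the following sketch (our illustration) forms the common refinement of two partitions of a finite Θ. The join, by contrast, is only given indirectly, as the meet of all common coarsenings.

    # Meet of two partitions of a finite frame: the non-empty pairwise
    # intersections of their blocks, cf. the formula for the meet above.
    def meet(P1, P2):
        return [A & B for A in P1 for B in P2 if A & B]

    P1 = [frozenset({0, 1, 2}), frozenset({3, 4, 5})]
    P2 = [frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5})]
    print(meet(P1, P2))  # blocks {0,1}, {2}, {3}, {4,5}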

Now consider three partitions P1, P2 and P in D. Then we say that P1 and P2 are conditionally independent given P if, for all P1 ∈ P1, P2 ∈ P2 and P ∈ P, the condition Pi ∩ P ≠ ∅ for i = 1, 2 implies P1 ∩ P2 ∩ P ≠ ∅. In this case we write

(P1, P2) a P.

Let us now consider a hint H = (Ω, P, Γ, Θ) and define the partition W of Ω that is obtained by grouping the elements ω in Ω that map to the same set Γ(ω). If we define Ω′ = W and, for W ∈ W, Γ′(W) = Γ(ω) where ω is any element of W, and if we define

P′(W) = ∑_{ω∈W} P(ω),

then the hint Hc = (Ω′, P′, Γ′, Θ)

is called the canonical version of H. Clearly, both H and its canonical version Hc have the same support and plausibility functions. A hint satisfying Γ(ω) ≠ Γ(ω′) whenever ω ≠ ω′ is its own canonical version, and such a hint is called canonical.
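In the list representation used in the sketches above, the canonical version is obtained by grouping equal focal sets (again a hypothetical illustration):

    from collections import defaultdict

    # Canonical version of a hint: assumptions with identical focal sets
    # are merged and their probabilities added; the focal set itself can
    # serve as the new assumption, since the blocks of W correspond
    # one-to-one to the distinct focal sets.
    def canonical(h):
        acc = defaultdict(float)
        for (w, p, g) in h:
            acc[g] += p
        return [(g, p, g) for (g, p) in acc.items()]

    h = [("a", 0.2, frozenset({1})), ("b", 0.3, frozenset({1})), ("c", 0.5, frozenset({2}))]
    print(canonical(h))  # two assumptions, with probabilities 0.5 and 0.5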

Let V denote the set of all canonical hints on Θ. In order to make V closed under the operations of focussing and combination, we slightly modify these operations by taking the canonical version of the resulting hint at the end. In other words, we define

H1 ⊕H2 = (H1 ⊕H2)c

H→P = (H→P)c.

Furthermore, given two hints H1 = (Ω1, P1, Γ1, Θ) and H2 = (Ω2, P2, Γ2, Θ), it can happen that

{(ω1, ω2) ∈ Ω1 × Ω2 : Γ1(ω1) ∩ Γ2(ω2) ≠ ∅} = ∅.

In this case the two hints cannot be combined because they are completely contradictory. However, we nevertheless formally define the combined hint as the zero hint Z, which is not further specified. In addition, for every hint H and every partition P, we define

H ⊕ Z = Z, Z→P = Z

and we consider that the set V also contains the zero hint Z.

25.5 Algebra of Hints

The set V endowed with the operation of combination has some interesting algebraic properties, some of which are related to the focussing operation. These properties are presented in the following theorem.

Theorem 1. The set V with the operations of combination and focussing has the following properties


1. Semigroup. The operation of combination is associative and commutative. There is a neutral hint E such that H ⊕ E = H for every H ∈ V. There is a null element Z such that H ⊕ Z = Z for every H ∈ V.

2. Transitivity. If H is carried by a partition P1 and (P1,P2) a P, then

H→P2 = (H→P)→P2 .

3. Combination. If H1 is carried by a partition P1 and H2 is carried by a partition P2 and (P1, P2) a P, then

(H1 ⊕ H2)→P = H1→P ⊕ H2→P.

4. Neutrality. For all P ∈ D we have

E→P = E .

5. Nullity. For all P ∈ D we have

H→P = Z ⇐⇒ H = Z.

6. Focussing. For all H ∈ V we have

H→Θ = H.

7. Idempotency. For all H ∈ V, if P1 ≤ P2, then

H→P1 ⊕H→P2 = H→P1 .

Proof. Assertions (1) to (3) are proved in (Kohlas & Monney, 1995). Assertions (4) to (7) are immediate consequences of the definition of focussing. □

Let us now consider the particular case where Θ is a Cartesian product, namely

Θ = Θ1 × . . .×Θm (25.15)

Then we define the set M = {1, . . . , m}

and, for every subset J ⊆ M, we define

ΘJ = ∏_{j∈J} Θj.

Note that we clearly have Θ = ΘM and each ΘJ corresponds to a partition of Θ, namely the partition

PJ = {θJ × ΘM−J : θJ ∈ ΘJ}.


It is shown in (Kohlas & Monney, 1995) that PJ ≤ PK if and only if K ⊆ J and

(PI ,PJ) a PK ⇐⇒ I ∩ J ⊆ K.

If a partition PJ is simply denoted by J, then we obtain the following corollary of Theorem 1.

Corollary 1. The set V consisting of all canonical hints on the domain Θ given in (25.15) has the following properties

1. Transitivity. Let H be a hint in V that is carried by I. If I ∩ J ⊆ K, then

H→J = (H→K)→J .

2. Combination. Let H1 be a hint in V that is carried by I and let H2 be a hint in V that is carried by J. If I ∩ J ⊆ K, then

(H1 ⊕ H2)→K = H1→K ⊕ H2→K.

Note that the operation of projection defined in Section 25.4 is a special case of the focussing operation. The following theorem presents algebraic properties of hints with respect to the partitions PJ, which are again simply denoted by J.

Theorem 2. The set V consisting of all canonical hints on the domain Θ given in (25.15) has the following properties

1. Semigroup. The operation of combination is associative and commutative. There is a neutral hint E such that H ⊕ E = H for every H ∈ V. There is a null element Z such that H ⊕ Z = Z for every H ∈ V.

2. Transitivity. For all J,K ⊆M and all H ∈ V, we have

(H→J)→K = H→J∩K .

3. Combination. For all K ⊆M and all H1,H2 ∈ V, we have

(H1→K ⊕ H2)→K = H1→K ⊕ H2→K.

4. Neutrality. For all J ⊆M , we have

E→J = E .

5. Nullity. For all J ⊆M , we have

H→J = Z ⇐⇒ H = Z.


6. Focussing. For all H ∈ V, we have

H→M = H.

7. Idempotency. For all H ∈ V, if J ⊆ K, then

H→J ⊕H→K = H→K .

Proof. Assertion (1) is the same as in Theorem 1 and assertions (4) to (7) are simple reformulations of assertions (4) to (7) in Theorem 1.

In order to prove assertion (2), we first show that, for J ⊆ K, we always have

H→J = (H→K)→J . (25.16)

Any hint H is carried by M according to property (6) of the present theorem and we have M ∩ J ⊆ K. Therefore, (25.16) follows from property (1) of Corollary 1. Further, note that (H→J)→J = H→J by property (1) of Corollary 1, which means that H→J is carried by J. Then, by assertion (1) of Corollary 1, we have

(H→J)→K = ((H→J)→J∩K)→K .

Using (25.16) and assertion (1) of Corollary 1, we finally obtain

(H→J)→K = (H→J∩K)→K = H→J∩K .

In order to prove assertion (3) we use again the fact that H1→K is carried by K and J ∩ K ⊆ K. Therefore, assuming that H2 is carried by J, assertion (2) of Corollary 1 implies that

(H1→K ⊕ H2)→K = (H1→K)→K ⊕ H2→K = H1→K ⊕ H2→K. □

This algebraic structure with two operations, namely combination and focussing, that satisfies the properties of Theorem 2 is very similar to an information algebra (see Part I). We are going to define and study such structures of uncertain information, as exemplified by hints, more generally in relation to information algebras in this Part III.


Chapter 26

Simple Random Variables

26.1 Domain-Free Variables

We consider a domain-free information algebra (Φ, D) (see Part I). Uncertain information can be represented by random variables taking values in such an information algebra. For example, probabilistic assumption-based reasoning as discussed in the previous Chapter 25 leads to random sets in the information algebra of subsets of a set of parameters. Therefore, we develop in this section the elements of a theory of such random variables.

Let (Ω, A, P) be a probability space. It is well known that a mapping from such a space to any algebraic structure inherits this structure (Kappos, 1969). This will be exploited here.

Let B = {B1, . . . , Bn} be any finite partition of Ω such that all Bi belong to A. A mapping ∆ : Ω → Φ such that

∆(ω) = φi ∀ω ∈ Bi, i = 1, . . . , n,

is called a simple random variable in (Φ, D). Among simple random variables in (Φ, D), we can easily define the operations of combination and focussing:

1. Combination: Let ∆1 and ∆2 be simple random variables in (Φ, D). Then let ∆1 ⊗ ∆2 be defined by

(∆1 ⊗∆2)(ω) = ∆1(ω)⊗∆2(ω).

2. Focussing: Let ∆ be a simple random variable in (Φ, D) and x ∈ D, then let ∆⇒x be defined by

(∆⇒x)(ω) = (∆(ω))⇒x.

Clearly, both ∆1 ⊗ ∆2 and ∆⇒x are simple random variables in (Φ, D). Let Rs denote the set of simple random variables in (Φ, D). With these operations of combination and focussing, (Rs, D) is an information algebra.


The axioms are simply inherited from (Φ, D) and easily verified. The neutral element of this algebra is the random variable E(ω) = e for all ω ∈ Ω, the vacuous random variable. Further, since for any φ ∈ Φ, the mapping Dφ(ω) = φ for all ω ∈ Ω is a simple random variable, the algebra (Φ, D) is embedded in the algebra (Rs, D).
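For illustration, here is a sketch (ours, with all names hypothetical) of simple random variables as finite maps, instantiated with the information algebra of subsets of a product frame, where combination is intersection and focussing to a coordinate saturates the projection:

    from itertools import product

    FRAME = frozenset(product([0, 1], [0, 1]))  # a small product frame

    def otimes(a, b):          # combination in the subset algebra
        return a & b

    def arrow(a, x):           # focussing to coordinate x
        proj = {t[x] for t in a}
        return frozenset(t for t in FRAME if t[x] in proj)

    # A simple random variable is a map omega -> element of the algebra;
    # combination and focussing are defined pointwise.
    def combine_rv(d1, d2):
        return {w: otimes(d1[w], d2[w]) for w in d1}

    def focus_rv(d, x):
        return {w: arrow(d[w], x) for w in d}

    d1 = {"w1": frozenset({(0, 0), (0, 1)}), "w2": FRAME}
    d2 = {"w1": frozenset({(0, 0), (1, 0)}), "w2": frozenset({(1, 1)})}
    print(combine_rv(d1, d2))
    print(focus_rv(d1, 0))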

Since the simple random variables form themselves an information algebra, containing the original one, it follows that simple random variables are pieces of information in this sense. Note that there is a partial order in the information algebra of simple random variables, as in any information algebra. As we shall argue later (Section ??) this order reflects the information content of simple random variables. In fact it holds that ∆1 ≤ ∆2 in Rs if, and only if, for all ω ∈ Ω it holds that ∆1(ω) ≤ ∆2(ω).

There are two important special classes of simple random variables: If for a random variable ∆ defined relative to a partition B = {B1, . . . , Bn} it holds that φi ≠ φj for i ≠ j, the variable is called canonical. It is a simple matter to transform any random variable ∆ into an associated canonical one: Take the union of all elements Bi ∈ B with the same values φj, which yields a new partition B′ of Ω. Define ∆′(ω) = ∆(ω). Then ∆′ is the canonical version of ∆ and we write ∆′ = ∆→. We may consider the set of canonical random variables, Rs,c, and define between elements of this set combination and focussing as follows:

∆1 ⊗c ∆2 = (∆1 ⊗ ∆2)→, ∆⇒cx = (∆⇒x)→.

Then (Rs,c, D) is still an information algebra under these modified operations. We remark also that (∆1 ⊗ ∆2)→ = (∆1→ ⊗ ∆2→)→ holds.

Secondly, if ∆(ω) ≠ z for all ω ∈ Ω, then ∆ is called normalized. We can associate a normalized random variable ∆↓ with any simple random variable ∆ provided ∆(ω) ≠ z with a positive probability. In fact, let Ω↓ = {ω ∈ Ω : ∆(ω) ≠ z}. This is a measurable set with probability P(Ω↓) = 1 − P{ω ∈ Ω : ∆(ω) = z} > 0. We consider then the new probability space (Ω, A, P′), where P′ is the probability measure on A defined by

P′(A) = P(A ∩ Ω↓) / P(Ω↓), (26.1)

if A ∩ Ω↓ ≠ ∅ and P′(A) = 0 otherwise. On this new probability space define ∆↓(ω) = ∆(ω). Clearly, it holds that (∆→)↓ = (∆↓)→.

The idea behind normalization becomes clear when we consider combination of random variables: Each of two (normalized) random variables ∆1 and ∆2 represents some (uncertain) information with the following interpretation: One of the ω ∈ Ω must be the correct, but unknown, assumption of the information. However, if ω happens to be the correct assumption, then under the first random variable information ∆1(ω) can be asserted, and under the second variable information ∆2(ω). Thus, together, still under the assumption ω, information ∆1(ω) ⊗ ∆2(ω) can be asserted. However, it is possible that ∆1(ω) ⊗ ∆2(ω) = z, even if both ∆1 and ∆2 are normalized. But the null information z represents a contradiction; thus, in view of the information given by the variables ∆1 and ∆2, the assumption ω cannot hold, it can (and must) be excluded. This amounts to normalizing the random variable ∆1 ⊗ ∆2, by excluding all ω ∈ Ω for which the combination results in a contradiction, and then conditioning (i.e. normalizing) the probability on the non-contradictory assumptions.

Two partitions B1 and B2 of Ω are called independent, if B1,i ∩ B2,j ≠ ∅ for all blocks B1,i ∈ B1 and B2,j ∈ B2. If furthermore P(B1,i ∩ B2,j) = P(B1,i) · P(B2,j) for all those pairs of blocks, then the two partitions B1 and B2 are called stochastically independent. In addition, if ∆1 and ∆2 are two simple random variables defined on these two partitions respectively, then these random variables are called stochastically independent too. Note that if ∆1 and ∆2 are stochastically independent, then their canonical versions ∆1→ and ∆2→ are stochastically independent too.

26.2 Labeled Random Variables

To the domain-free information algebra (Rs, D) we can, in the usual way, associate an equivalent labeled version (see Part I). Note that for a simple random variable ∆ any possible value φi has some support xi. Then x = ∨i xi is a support for all possible values of ∆. Hence we see that ∆⇒x = ∆. Thus any simple random variable has a support.

We define R∗s to be the set of all pairs (∆, x), where ∆ ∈ Rs is a simple random variable and x a support of ∆. Then (R∗s, D) is a labeled information algebra where x is the label d(∆, x) of (∆, x), combination is defined by (∆1, x) ⊗ (∆2, y) = (∆1 ⊗ ∆2, x ∨ y) and projection for y ≤ x by (∆, x)↓y = (∆⇒y, y).

If (Φ, D) is a domain-free information algebra, and (Φ∗, D) the associated labeled algebra, i.e. the algebra of pairs (φ, x) with x a support of φ, then the labeled simple random variables (∆, x) can also be considered as mappings ∆∗ : Ω → Φ∗x, where Φ∗x is the set of all (φ, x) ∈ Φ∗. The mapping is defined by ∆∗(ω) = (∆(ω), x). This is well defined, since x being a support of ∆ means that x is a support of ∆(ω) for all ω ∈ Ω.

This observation indicates that we may define labeled random variables also directly: Let (Φ, D) now be a labeled information algebra and Φx the members of Φ with label d(φ) = x. A simple random variable with label x is then a mapping ∆ : Ω → Φx such that there is a finite decomposition B = {B1, . . . , Bn} of Ω such that all Bi belong to A and

∆(ω) = φi ∈ Φx ∀ω ∈ Bi, i = 1, . . . , n.


The label of such a random variable is d(∆) = x. As before, it is easily verified that such labeled random variables form themselves a labeled information algebra. In fact, define the operations of combination and projection in the natural way as follows:

1. Combination: Let ∆1 and ∆2 be simple random variables in (Φ, D). Then let ∆1 ⊗ ∆2 be defined by

(∆1 ⊗∆2)(ω) = ∆1(ω)⊗∆2(ω).

2. Projection: Let ∆ be a simple random variable in (Φ, D) and x ∈ D such that x ≤ d(∆), then let ∆↓x be defined by

(∆↓x)(ω) = (∆(ω))↓x.

In the same way as the domain-free version of the labeled algebra (Φ, D) is obtained, the domain-free version of the algebra of simple random variables can be obtained. We leave the details to the reader, see also Section 27.1. The important point is only that, concerning random variables, we may switch between the two equivalent domain-free and labeled versions as conveniently as with simple information algebras. Some issues are more conveniently discussed in the domain-free version, for others the labeled version is better adapted.

26.3 Degrees of Support and Plausibility

This section is devoted to the study of the probability distribution of simple random variables. The starting point is the following question: Given a random variable ∆ on an information algebra and an element φ ∈ Φ, under what assumptions can the information φ be asserted? And how likely is it that these assumptions hold?

If ω ∈ Ω is an assumption such that ∆(ω) ≥ φ, then ∆(ω) implies φ. In this case we may say that ω is an assumption supporting φ, given ∆. Therefore we define for every φ ∈ Φ the set

qs∆(φ) = {ω ∈ Ω : φ ≤ ∆(ω)}

of assumptions supporting φ. However, if ∆(ω) = z, then ω is supporting every φ ∈ Φ, since φ ≤ z. The null element z represents the contradiction, which implies everything. In a consistent theory, contradictions must be excluded. Thus, assuming that our information is consistent, we conclude that assumptions such that ∆(ω) = z are not really possible assumptions and must be excluded. We refer once more to Chapter 25 for an illustration and justification of this consideration. Let

qs∆(z) = {ω ∈ Ω : ∆(ω) = z}.


We assume that qs∆(z) is not equal to Ω; otherwise ∆ is called the null variable representing fully contradictory "information". In other words, we assume that proper information is never fully contradictory. If we eliminate the contradictory assumptions from qs∆(φ), we obtain the support set

s∆(φ) = {ω ∈ Ω : φ ≤ ∆(ω) ≠ z} = qs∆(φ) − qs∆(z)

of φ, which is the set of assumptions properly supporting φ, and the mapping s∆ : Φ → 2^Ω is called the allocation of support induced by ∆. The set qs∆(φ) is called the quasi-support set to underline that it contains contradictory assumptions. This set has little interest from a semantic point of view, but it is important for technical and especially for computational reasons. These concepts capture the essence of probabilistic assumption-based reasoning as exemplified in the context of functional models and hints in the previous Chapter 25.

Here are the basic properties of allocations of support:

Theorem 26.1 If ∆ is a simple random variable on an information algebra (Φ, D), then the following holds for the associated allocation of support s∆:

1. qs∆(e) = Ω, s∆(z) = ∅.

2. If ∆ is normalized, then qs∆(z) = ∅.

3. For any pair φ, ψ ∈ Φ,

qs∆(φ ⊗ ψ) = qs∆(φ) ∩ qs∆(ψ),
s∆(φ ⊗ ψ) = s∆(φ) ∩ s∆(ψ).

Proof. (1) and (2) follow immediately from the definition of the allocation of support. (3) follows since φ ⊗ ψ ≤ ∆(ω) if, and only if, φ ≤ ∆(ω) and ψ ≤ ∆(ω). □

Knowing assumptions supporting a hypothesis φ is already interesting and important. It is the part logic can provide. On top of this, it is important to know how likely it is that a supporting assumption holds. This is the part probability adds. If we know or may assume that the information is consistent, then we should condition the original probability measure P on Ω on the complement qs∆(z)c of qs∆(z). This leads then to the probability space (qs∆(z)c, A ∩ qs∆(z)c, P′), where P′(A) = P(A)/P(qs∆(z)c).

sp∆(φ) = P ′(s∆(φ)).

The value sp∆(φ) is called the degree of support of φ associated with the random variable ∆. The function sp∆ : Φ → [0, 1] is called the support function of ∆. It corresponds to the distribution function of ordinary random variables.


It is for technical reasons convenient to define also the degree of quasi-support

qsp∆(φ) = P(qs∆(φ)).

Then, the degree of support can also be expressed in terms of degrees of quasi-support:

sp∆(φ) = (qsp∆(φ) − qsp∆(z)) / (1 − qsp∆(z)).

This is the form which is usually used in applications (Kohlas et al., 1999).

In another consideration, we can also ask for assumptions ω ∈ Ω under which ∆ shows φ to be possible, i.e. not excluded, although not necessarily supported. If ∆(ω) is such that combined with φ it leads to a contradiction, i.e. if ∆(ω) ⊗ φ = z, then under ω the information φ is excluded by a consistency consideration as above. So we define the set

p∆(φ) = {ω ∈ Ω : ∆(ω) ⊗ φ ≠ z}.

This is the set of assumptions under which φ is not excluded. Therefore we call it the possibility set of φ. We can define the degree of possibility, also sometimes called the degree of plausibility (e.g. in (Shafer, 1976)), by

pl∆(φ) = P ′(p∆(φ)).

If ω ∈ p∆(φ)c, then, under this assumption, φ is impossible, i.e. excluded. So the set p∆(φ)c contains arguments against φ and

do∆(φ) = P′(p∆(φ)c) = 1 − pl∆(φ)

can be called the degree of doubt in φ. Note that s∆(φ) ⊆ p∆(φ) since φ ≤ ∆(ω) ≠ z implies φ ⊗ ∆(ω) = ∆(ω) ≠ z. Hence, we see that for all φ ∈ Φ we have that sp∆(φ) ≤ pl∆(φ).
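In the subset algebra these degrees can be computed directly; the following sketch (ours, with hypothetical names) conditions on the consistent assumptions as described above. Note that in that instance φ ≤ ∆(ω) means ∆(ω) ⊆ φ as sets, and the null element z is the empty set.

    # Degrees of support and plausibility of a hypothesis phi for a simple
    # random variable rv (dict omega -> frozenset) with probabilities prob.
    def degrees(rv, prob, phi):
        consistent = [w for w, g in rv.items() if g]           # Delta(omega) != z
        norm = sum(prob[w] for w in consistent)                # P of the consistent part
        sp = sum(prob[w] for w in consistent if rv[w] <= phi) / norm
        pl = sum(prob[w] for w in consistent if rv[w] & phi) / norm
        return sp, pl

    rv = {"a": frozenset({1}), "b": frozenset({1, 2}), "c": frozenset()}
    prob = {"a": 0.5, "b": 0.3, "c": 0.2}
    print(degrees(rv, prob, frozenset({1})))  # sp = 0.625, pl = 1.0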

26.4 Basic Probability Assignments

Consider the canonical version ∆→ of a random variable ∆ with a corresponding partition B = {B1, . . . , Bn} of Ω and ∆(ω) = φi for ω ∈ Bi. Then we define the probabilities

m∆(φi) = P (Bi), for i = 1, . . . , n.

The m∆(φi) are called basic probability assignments (bpa). This term comes from the Dempster-Shafer theory of evidence (DS theory) (Shafer, 1976), which corresponds to our theory of random variables when the information algebra of subsets of a finite set is considered. It follows then immediately that

qsp∆(φ) = ∑_{i : φ ≤ φi} m∆(φi).

So, it is sufficient to know the bpa of a random variable to determine its support function. In many applications only the bpa is used, without caring about the underlying random variables. However, implicitly it is then generally assumed that the underlying variables are stochastically independent, although this is often not rigorously and explicitly stated.

We may also consider normalized bpas, corresponding to the normalized variable ∆↓,

m∆↓(φi) = m∆(φi) / (1 − m∆(z))

for φi ≠ z. In terms of this normalized bpa, the degrees of support become

sp∆(φ) = ∑_{i : φ ≤ φi} m∆↓(φi).

Similarly, the degree of plausibility may also be obtained from the normalized bpa,

pl∆(φ) = ∑_{ψ : φ ⊗ ψ ≠ z} m∆↓(ψ).

The bpa, like the degrees of quasi-support, is not semantically important, but a convenient concept, especially for computational purposes.

Assume then that the two random variables ∆1 and ∆2 are stochastically independent. Then ∆1→ and ∆2→ are also stochastically independent. The following theorem shows how the bpa of the combined random variable ∆1 ⊗ ∆2 can be computed from the bpas of the individual variables.

Theorem 26.2 Let m∆1(φ1,i) for i = 1, . . . , m and m∆2(φ2,j) for j = 1, . . . , n be the bpas of the two stochastically independent random variables ∆1 and ∆2. Then the bpa of ∆1 ⊗ ∆2 is given by

m∆1⊗∆2(φ) = ∑_{i,j : φ1,i ⊗ φ2,j = φ} m∆1(φ1,i) · m∆2(φ2,j). (26.2)

Proof. The canonical version of the combined random variables (∆1 ⊗ ∆2)→ = (∆1→ ⊗ ∆2→)→ has as possible values the combinations φ1,i ⊗ φ2,j. Each such combination has assigned the probability m∆1(φ1,i) · m∆2(φ2,j) of the underlying intersection B1,i ∩ B2,j of the blocks of the two orthogonal partitions associated with ∆1→ and ∆2→. So, when the canonical version of ∆1→ ⊗ ∆2→ is taken, these probabilities sum up over all pairs i, j such that φ1,i ⊗ φ2,j take the same value φ. □

Of course, in (26.2), only a finite number of φ have a non-empty sum on the right. The method to combine the bpas proposed in Theorem 26.2 is called the (non-normalized) Dempster rule, since it has first been proposed in the context of multivalued mappings in (Dempster, 1967). More precisely, Dempster proposed a normalized version of this rule, where the combined variable is normalized, (∆1 ⊗ ∆2)↓. Then the rule is

m(∆1⊗∆2)↓(φ) = (∑_{i,j : φ1,i ⊗ φ2,j = φ} m∆1(φ1,i) · m∆2(φ2,j)) / (1 − ∑_{i,j : φ1,i ⊗ φ2,j = z} m∆1(φ1,i) · m∆2(φ2,j)). (26.3)

Here, the combinations which result in a contradiction are excluded and the probability is renormalized.
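Both variants of the rule are easily implemented. The sketch below (ours, for the subset instance where ⊗ is intersection and z = ∅) computes (26.2) and, optionally, the normalization of (26.3):

    from collections import defaultdict

    def dempster(m1, m2, normalized=True):
        # m1, m2: bpas as dicts frozenset -> mass, assumed to come from
        # stochastically independent random variables, cf. Theorem 26.2.
        m = defaultdict(float)
        for g1, p1 in m1.items():
            for g2, p2 in m2.items():
                m[g1 & g2] += p1 * p2                        # rule (26.2)
        if normalized:
            K = m.pop(frozenset(), 0.0)                      # mass of the contradiction z
            if K >= 1.0:
                raise ValueError("fully contradictory bpas")
            m = {g: p / (1.0 - K) for g, p in m.items()}     # rule (26.3)
        return dict(m)

    m1 = {frozenset({1, 2}): 0.6, frozenset({2, 3}): 0.4}
    m2 = {frozenset({2}): 0.5, frozenset({3}): 0.5}
    print(dempster(m1, m2))  # {2}: 5/7, {3}: 2/7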

A particular case is the combination of a random variable with a degenerate variable, i.e. an element of Φ. Let ∆ be a random variable on Φ with bpa m∆(φi) for i = 1, . . . , m, and φ ∈ Φ. To φ corresponds a degenerate random variable with bpa m(φ) = 1. Then we get by (26.2)

m∆⊗φ(ψ) = ∑_{i : φi ⊗ φ = ψ} m∆(φi). (26.4)

This Dempster’s (non-normalized) rule of conditioning. In the normalizedcase the rule becomes

m(∆⊗φ)↓(ψ) = (∑_{i : φi ⊗ φ = ψ} m∆(φi)) / (1 − ∑_{i : φi ⊗ φ = z} m∆(φi)). (26.5)

We can also compute the bpa of a focussed random variable from the bpa of the original variable. This is shown in the next theorem.

Theorem 26.3 Let m∆(φi) for i = 1, . . . , m be the bpa of a random variable ∆. Then the bpa of ∆⇒x is given by

m∆⇒x(φ) = ∑_{i : φi⇒x = φ} m∆(φi). (26.6)

Proof. This follows, since the partition of (∆⇒x)→ contains the unions of the blocks Bi of the partition of ∆→ for which φi⇒x = φ. □

Again, in (26.6), only a finite number of φ have a non-empty sum on the right. So, we see that the operations with simple random variables are reflected by their associated bpas. In fact later, in Chapter 31, we shall consider an algebra of bpas, similar to an information algebra, based on these operations.
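Focussing of a bpa is just as direct; a sketch in the product-frame subset instance used earlier (all names ours):

    from collections import defaultdict
    from itertools import product

    FRAME = frozenset(product([0, 1], [0, 1]))

    def arrow(a, x):  # focussing in the subset algebra of the product frame
        proj = {t[x] for t in a}
        return frozenset(t for t in FRAME if t[x] in proj)

    def focus_bpa(m, x):
        # Masses of focal elements with the same focussed value are added
        # up, cf. (26.6).
        out = defaultdict(float)
        for g, p in m.items():
            out[arrow(g, x)] += p
        return dict(out)

    m = {frozenset({(0, 0)}): 0.7, frozenset({(0, 1), (1, 1)}): 0.3}
    print(focus_bpa(m, 0))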


Chapter 27

Random Information

27.1 Random Mappings

When we want to go beyond simple random variables, we may consider any mapping h : Ω → Φ from a probability space (Ω, A, P) into a domain-free information algebra (Φ, D). Let's call such mappings random mappings in (Φ, D). As before, in the case of simple random variables, we may define between random mappings in (Φ, D) the operations of combination and focussing:

1. Combination: Let Γ1 and Γ2 be two random mappings in (Φ, D), then Γ1 ⊗ Γ2 is the random mapping in (Φ, D) defined by

(Γ1 ⊗ Γ2)(ω) = Γ1(ω)⊗ Γ2(ω). (27.1)

2. Focussing: Let Γ be a random mapping in (Φ, D) and x ∈ D, then Γ⇒x is the random mapping in (Φ, D) defined by

Γ⇒x(ω) = (Γ(ω))⇒x. (27.2)

For a fixed probability space (Ω, A, P), let RΦ denote the set of all random mappings in (Φ, D). With the two operations defined above, (RΦ, D) inherits all axioms of a domain-free information algebra from (Φ, D), with the exception of the support axiom. A domain x ∈ D is a support for a random mapping Γ in (Φ, D) if Γ = Γ⇒x, i.e. Γ(ω) = Γ⇒x(ω) for all ω ∈ Ω (or at least for almost all ω ∈ Ω, i.e. for all ω except a set of probability zero). Now, every upper bound of the supports of the values Γ(ω) would be a support of Γ. But, except if D has a top element or is a complete lattice, such an upper bound does not necessarily exist. Therefore, we need some additional condition, such as D having a top element, to guarantee that (RΦ, D) is an information algebra. Or we may restrict ourselves to random mappings having a support. This set is closed under combination and focussing and forms therefore an information algebra.


We may also define a labeled version of random mappings in a labeled information algebra (Φ, D) as mappings Γ : Ω → Φx for some x ∈ D. The label of such a random mapping Γ is d(Γ) = x. Between such labeled random mappings, defined on a fixed probability space (Ω, A, P), combination and projection can be defined as above:

1. Combination: Let Γ1 and Γ2 be two random mappings in (Φ, D), then Γ1 ⊗ Γ2 is the random mapping in (Φ, D) defined by

(Γ1 ⊗ Γ2)(ω) = Γ1(ω)⊗ Γ2(ω). (27.3)

2. Projection: Let Γ be a random mapping in (Φ, D) and x ∈ D, x ≤ d(Γ), then Γ↓x is the random mapping in (Φ, D) defined by

Γ↓x(ω) = (Γ(ω))↓x. (27.4)

The system (RΦ, D) of labeled random mappings in the labeled information algebra (Φ, D) inherits all axioms from the latter and is thus itself a labeled information algebra. We may derive from the labeled information algebra of random mappings in the usual way the associated domain-free algebra. Its elements are equivalence classes [Γ] of random mappings. If Γ1, Γ2 ∈ [Γ], then, if x = d(Γ1) and y = d(Γ2), we have Γ1→y = Γ2 and Γ2→x = Γ1. But this means that for all ω ∈ Ω we have Γ1→y(ω) = Γ2(ω) and Γ2→x(ω) = Γ1(ω), hence [Γ1(ω)] = [Γ2(ω)]. Thus, we may consider [Γ] as a mapping from Ω into Φ/σ defined by [Γ](ω) = [Γ(ω)]. Therefore, the derived domain-free information algebra is an algebra of random mappings in (Φ/σ, D). This algebra will be denoted by (RΦ/σ, D).

Conversely, from an information algebra of random mappings in a domain-free information algebra (Ψ, D), where all mappings have a support, we may obtain the corresponding labeled version. The elements of the labeled algebra are pairs (Γ, x), where Γ is a random mapping in Ψ and x a support of Γ. To such a pair we may associate a mapping Γx : Ω → Φx defined by Γx(ω) = (Γ(ω), x), where (Φ, D) is the labeled information algebra derived from (Ψ, D). Evidently combination in the labeled information algebra can be translated into a similar operation between mappings: In fact, let (Γ1, x) and (Γ2, y) be two labeled random mappings which combine into (Γ1 ⊗ Γ2, x ∨ y). Then the associated mapping Γx∨y : Ω → Φx∨y is defined by

Γx∨y(ω) = ((Γ1 ⊗ Γ2)(ω), x ∨ y) = (Γ1(ω), x) ⊗ (Γ2(ω), y) = Γ1,x(ω) ⊗ Γ2,y(ω). (27.5)

Similarly, the focussed random mapping Γx = (Γ⇒x, x), where Γ has support y and x ≤ y, generates the mapping

Γx(ω) = (Γ⇒x(ω), x) = (Γ(ω), y)↓x. (27.6)


This amounts to saying that the labeled algebra derived from an information algebra of random mappings in the domain-free algebra (Ψ, D) is in fact equivalent (isomorphic) to an algebra of labeled random mappings in the labeled algebra (Φ, D) derived from (Ψ, D). So, finally, as with ordinary information algebras, we may with algebras of random mappings too switch freely between labeled and domain-free versions, assuming the support axiom in the latter case.

A random mapping Γ : Ω → Φ can be seen as uncertain information in the following way: The information represented by Γ can be interpreted in different ways. If an element ω ∈ Ω is assumed to hold, then information Γ(ω) ∈ Φ is asserted. The uncertainty arises from the fact that it is not known with certainty which element ω ∈ Ω holds. This view of a random mapping corresponds to the concept of a hint as explained in Chapter 25. Now, if information Γ(ω) is asserted, then all implied pieces of information ψ ≤ Γ(ω) are asserted too. This leads to an important generalization of a random mapping. In Part I of these Lectures it has been argued that ideals I in Φ represent consistent assertions of information. Thus, it is natural to extend the concept of a random mapping from maps into Φ more generally to maps into the ideal completion IΦ of Φ. Thus, we call, more generally, a random mapping a mapping Γ : Ω → IΦ. Since (IΦ, D) is an information algebra, the set of random mappings into IΦ forms an information algebra (RIΦ, D) too.

Clearly (RΦ, D) is a subalgebra of (RIΦ, D), or rather, it is embedded into (RIΦ, D) by the mapping Γ ∈ RΦ ↦ I(Γ) ∈ RIΦ, where I(Γ) is the principal ideal of all random mappings Γ′ ∈ RΦ such that Γ′ ≤ Γ. For a random mapping Γ ∈ (RIΦ, D) we have also that for all ω ∈ Ω, in the ideal completion IΦ of Φ,

Γ(ω) = ∨{φ : φ ∈ Φ, φ ≤ Γ(ω)}.

Now, let ∆ ∈ RΦ such that ∆ ≤ Γ, i.e. ∆(ω) ≤ Γ(ω) for all ω ∈ Ω. Then

Γ = ∨{∆ : ∆ ∈ RΦ, ∆ ≤ Γ}.

This shows that (RIΦ, D) is the ideal completion of the algebra (RΦ, D).

27.2 Support and Possibility

As in the case of simple random variables (Section 26.3) we may define the allocation of support sΓ of a random mapping by

sΓ(φ) = {ω ∈ Ω : φ ≤ Γ(ω)}. (27.7)

We no longer distinguish between the semantic categories of support and quasi-support as in Section 26.3 and speak simply of support, even though (27.7) is strictly speaking a quasi-support.


This support as defined in (27.7) has the same properties as the support of simple random variables; in particular, Theorem 26.1 still holds (see also Section 29.1). Again, as before with simple random variables, we may define the degree of support of a piece of information φ induced by a random mapping Γ by

spΓ(φ) = P (sΓ(φ)). (27.8)

Now however, this probability is only defined if sΓ(φ) ∈ A. For this to hold we have no guarantee whatsoever. The only element which we know for sure to be measurable is sΓ(e) = Ω. A simple way out of this problem would be to limit random mappings to mappings Γ for which sΓ(φ) ∈ A for all φ ∈ Φ or even for all elements of the ideal completion IΦ. There is however the question whether such random mappings do exist for all information algebras. Even if they do, there is no reason why we should restrict ourselves exactly to those mappings. Therefore we prefer other, more rational approaches to overcome the difficulty of an only partial definition of degrees of support. Here we propose a first solution. Later we present some alternatives.

In the paper (Shafer, 1979) the use of probability algebras instead of probability spaces was proposed as a natural framework for studying belief functions. Since degrees of support are like belief functions (see Chapter 29) we adapt this idea here. First, we introduce the probability algebra associated with a probability space. Let J be the σ-ideal of P-null sets in the σ-algebra A of the probability space. Two sets A′, A′′ ∈ A are equivalent modulo J if A′ − A′′ ∈ J and A′′ − A′ ∈ J. This means that the two sets have the same probability measure P(A′) = P(A′′). This equivalence is a congruence in the Boolean algebra A. Hence the quotient algebra B = A/J is a Boolean σ-algebra too. If [A] denotes the equivalence class of A, then, for any countable family of sets Ai, i ∈ I,

[A]c = [Ac],
∨_{i∈I} [Ai] = [∪_{i∈I} Ai],
∧_{i∈I} [Ai] = [∩_{i∈I} Ai]. (27.9)

So A ↦ [A] defines a Boolean homomorphism from A onto B, called projection. We denote [Ω] by ⊤ and [∅] by ⊥. These are of course the top and bottom elements of B. Now, as is well known, B has some further important properties (see (Halmos, 1963)): it satisfies the countable chain condition, which means that any family of disjoint elements of B is countable. Further, any Boolean algebra B satisfying the countable chain condition is complete.


That is, any subset E ⊆ B has a supremum ∨E and an infimum ∧E in B. Furthermore, the countable chain condition also implies that there is always a countable subset D of E with the same supremum and infimum, i.e. ∨D = ∨E and ∧D = ∧E. We refer to (Halmos, 1963) for these results. Finally, by µ([A]) = P(A) a normalized, positive measure µ is defined on B. Positive means here that µ(b) = 0 implies b = ⊥. A pair (B, µ) of a Boolean σ-algebra B, satisfying the countable chain condition, and a normalized, positive measure µ on it, is called a probability algebra (see also (Kappos, 1969)).

We now use this construction of a probability algebra out of the probability space to extend the definition of the degrees of support spΓ beyond elements φ for which sΓ(φ) is measurable. Even if sΓ(φ) is not measurable, any A ∈ A such that A ⊆ sΓ(φ) represents an argument which supports φ. To exploit this remark, define for every set H ∈ P(Ω)

ρ0(H) = ∨{[A] : A ⊆ H, A ∈ A}. (27.10)

This mapping has interesting properties as the following theorem shows.

Theorem 27.1 The application ρ0 : P(Ω) → A/J as defined in (27.10) has the following properties:

ρ0(Ω) = ⊤, ρ0(∅) = ⊥,
ρ0(∩_{i∈I} Hi) = ∧_{i∈I} ρ0(Hi), (27.11)

if Hi, i ∈ I is a countable family of subsets of Ω.

Proof. Clearly, ρ0(Ω) = [Ω] = ⊤ ∈ A/J. Similarly, ρ0(∅) = [∅] = ⊥ ∈ A/J.

In order to prove the remaining identity, let Hi, i ∈ I, be a countable family of subsets of Ω. For every index i, there is a countable family of sets H′j ∈ A such that H′j ⊆ Hi and ∨[H′j] = [∪H′j] = ρ0(Hi), since B satisfies the countable chain condition. Take Ai = ∪H′j. Then Ai ⊆ Hi, Ai ∈ A and P(Ai) = µ(ρ0(Hi)). Define A = ∩i∈I Ai ∈ A. It follows that A ⊆ ∩i∈I Hi and, because the projection is a σ-homomorphism, we obtain [A] = ∧i∈I [Ai] = ∧i∈I ρ0(Hi).

We are now going to show that [A] = ρ0(∩i∈I Hi), which then proves the theorem. For this it is sufficient to show that P(A) = µ(ρ0(∩i∈I Hi)). This is so, since P(A) = µ([A]) and A ⊆ ∩Hi, hence [A] ≤ ρ0(∩Hi). Therefore, if µ([A]) = µ(ρ0(∩Hi)) we must indeed have [A] = ρ0(∩Hi), since µ is positive.

Now, clearly P(A) ≤ µ(ρ0(∩Hi)). As above, we conclude that there is an A′ ∈ A, A′ ⊆ ∩Hi, such that P(A′) = µ(ρ0(∩Hi)). A′ ∪ (A − A′) ⊆ ∩Hi implies that P(A′ ∪ (A − A′)) = P(A′), hence P(A − A′) = 0. Define A′i = Ai ∪ (A′ − A) ⊆ Hi. Then Ai − A′i = ∅ and therefore,

µ(ρ0(Hi)) = P(Ai) ≤ P(A′i) = P(Ai) + P(A′i − Ai) ≤ µ(ρ0(Hi)). (27.12)

This implies that P(A′i − Ai) = 0, therefore we have [Ai] = [A′i]. From this we obtain that ∩A′i = ∩(Ai ∪ (A′ − A)) = (A′ − A) ∪ (∩Ai) = (A′ − A) ∪ A = A ∪ A′ = A′ ∪ (A − A′). But ∩A′i and ∩Ai are equivalent, since [∩A′i] = ∧[A′i] = ∧[Ai] = [∩Ai]. This implies finally that P(A) = P(∩Ai) = P(∩A′i) = P(A′) + P(A − A′) = P(A′) = µ(ρ0(∩Hi)). This is what we wanted to prove. □

Take now B = A/J and consider the probability algebra (B, µ). Then we can compose the allocation of support s from Φ into the power set P(Ω) with the mapping ρ0 from P(Ω) into B to a mapping ρ = ρ0 ∘ s : Φ → B. Now we see that

ρ(e) = ρ0(s(e)) = ρ0(Ω) = ⊤,
ρ(φ ⊗ ψ) = ρ0(s(φ ⊗ ψ)) = ρ0(s(φ) ∩ s(ψ)) = ρ0(s(φ)) ∧ ρ0(s(ψ)) = ρ(φ) ∧ ρ(ψ). (27.13)

A mapping ρ satisfying these properties is called a normalized allocation of probability on the information algebra Φ. In fact, it allocates an element of the probability algebra B to any element of the algebra Φ. In this way, a random mapping always leads to an allocation of probability, once a probability measure on the assumptions is introduced.

In particular, we may now define, with ρΓ = ρ0 ∘ sΓ,

spΓ(φ) = µ(ρΓ(φ)). (27.14)

This extends the support function (27.8) to all elements φ of Φ.

This extension has an interesting interpretation. In order to see this, we add an important property of probability algebras: clearly µ(∧bi) ≤ inf_i µ(bi) and µ(∨bi) ≥ sup_i µ(bi) holds for any family of elements bi. But there are important cases where equality holds. A subset D of B is called downward (upward) directed if for every pair b′, b′′ ∈ D there is an element b ∈ D such that b ≤ b′ ∧ b′′ (b ≥ b′ ∨ b′′).

Lemma 27.1 If D is a downward (upward) directed subset of B, then

µ(∧_{i∈D} bi) = inf_{i∈D} µ(bi), (µ(∨_{i∈D} bi) = sup_{i∈D} µ(bi)) (27.15)

Proof. There is a countable subfamily of the elements bi which has the same meet. If ci, i = 1, 2, . . . are these elements, then define c′1 = c1 and select elements c′i in the downward directed set such that c′2 ≤ c′1 ∧ c2, c′3 ≤ c′2 ∧ c3, . . .. Then c′1 ≥ c′2 ≥ c′3 ≥ . . . and this sequence still has the same infimum. However, by the continuity of probability we have now

µ(∧bi) = µ(∧c′i) = lim_{i→∞} µ(c′i) ≥ inf_i µ(bi). (27.16)

But this implies µ(∧bi) = inf_i µ(bi). The case of upward directed sets is treated in the same way. □

Note now that {[H′] : H′ ⊆ H, H′ ∈ A} is an upward directed family. Therefore, according to Lemma 27.1 we have

spΓ(φ) = µ(ρΓ(φ)) = µ(∨{[A] : A ∈ A, A ⊆ sΓ(φ)})
= sup{µ([A]) : A ∈ A, A ⊆ sΓ(φ)}
= sup{P(A) : A ∈ A, A ⊆ sΓ(φ)}
= P_∗(sΓ(φ)),

where P_∗ is the inner probability associated with P. This shows that the degree of support of a piece of information φ is the inner probability of the allocated support sΓ(φ). Note that definitions (27.14) and (27.8) coincide if sΓ(φ) ∈ A. Support functions and inner probability measures are thus closely related. The result above is very appealing: any measurable set A which is contained in sΓ(φ) supports φ. So we expect P(A) ≤ spΓ(φ). In the absence of further information, it is reasonable to take spΓ(φ) to be the least upper bound of the probabilities of the sets A supporting φ.

A similar consideration can be made with respect to the possibility sets associated with elements of Φ with respect to a random mapping Γ. As in Section 26.3 we define the possibility set of φ as

pΓ(φ) = {ω ∈ Ω : Γ(ω) ⊗ φ ≠ z}.

This set contains all assumptions ω which do not lead to a contradiction with φ under the mapping Γ. Thus, the probability of this set, if it is defined, measures the degree of possibility or the degree of plausibility of φ,

plΓ(φ) = P (pΓ(φ)). (27.17)

As in the case of the degree of support, there is no guarantee that pΓ(φ) is measurable. But we can solve this problem in a way similar to the case of the degree of support. Under any measurable event A ⊇ pΓ(φ) the element φ remains possible. Therefore we define for every set H ∈ P(Ω)

ξ0(H) = ∧{[A] : A ⊇ H, A ∈ A}. (27.18)

Note that A ⊇ H if, and only if, Ac ⊆ Hc. This implies that ξ0(H) = (ρ0(Hc))c. From this in turn we conclude that the following corollary to Theorem 27.1 holds:


Corollary 27.1 The application ξ0 : P(Ω) → A/J as defined in (27.18) has the following properties:

ξ0(Ω) = ⊤, ξ0(∅) = ⊥,
ξ0(∪_{i∈I} Hi) = ∨_{i∈I} ξ0(Hi), (27.19)

if Hi, i ∈ I is a countable family of subsets of Ω.

As before we can now compose pΓ with ξ0 to obtain a mapping ξΓ = ξ0 ∘ pΓ : Φ → B = A/J. We may then define for any φ ∈ Φ a degree of plausibility by

plΓ(φ) = µ(ξΓ(φ)). (27.20)

Using Lemma 27.1 we obtain also

plΓ(φ) = inf{P(A) : A ∈ A, A ⊇ pΓ(φ)} = P^∗(pΓ(φ)).

Here P^∗ is the outer probability of the set pΓ(φ). Thus, if pΓ(φ) is measurable, then P^∗(pΓ(φ)) = P(pΓ(φ)), which shows that (27.20) defines in fact an extension of the plausibility defined by (27.17). We remark that the plausibility takes its full flavour only in Boolean information algebras (see Chapter 29).
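On a finite probability space both quantities can be computed by brute force; the following sketch (ours, with the σ-algebra listed explicitly) illustrates the inner probability P_∗ and the outer probability P^∗:

    # Inner and outer probability of an arbitrary subset H of Omega:
    # the supremum of P over measurable subsets of H, and the infimum
    # over measurable supersets of H.
    def inner_outer(H, algebra, P):
        inner = max(P[A] for A in algebra if A <= H)
        outer = min(P[A] for A in algebra if A >= H)
        return inner, outer

    Omega = frozenset({1, 2, 3, 4})
    algebra = [frozenset(), frozenset({1, 2}), frozenset({3, 4}), Omega]
    P = {frozenset(): 0.0, frozenset({1, 2}): 0.5, frozenset({3, 4}): 0.5, Omega: 1.0}
    print(inner_outer(frozenset({1, 2, 3}), algebra, P))  # (0.5, 1.0)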

27.3 Random Variables

Here we propose an alternative approach to define random variables in an information algebra. We start with the information algebra (Rs, D) of simple random variables with values in a domain-free information algebra (Φ, D) and defined on a sample space (Ω, A, P), and consider the ideal completion of this algebra, rather than of the algebra (RΦ, D) considered in Section 27.1. Let R denote the ideal completion of Rs. Then (R, Rs, D) is a compact information algebra (see Part I) with simple random variables as finite elements. We call the elements of R generalized random variables. A generalized random variable is thus an ideal of simple random variables. For any Γ ∈ R we may, within the algebra (R, Rs, D), write Γ = ∨{∆ ∈ Rs : ∆ ≤ Γ}.

To any Γ ∈ R we may associate a random mapping Γ : Ω → IΦ from the underlying sample space into the ideal completion of (Φ, D) by defining

Γ(ω) = ∨{∆(ω) : ∆ ∈ Γ}.

This random mapping is defined by a sort of pointwise limit within IΦ. We denote the random mapping Γ deliberately with the same symbol as the generalized random variable Γ. The reason is that the two concepts can essentially be identified, as the following lemmata show:


Lemma 27.2 1. ∀ Γ1,Γ2 ∈ R,

(Γ1 ⊗ Γ2)(ω) = Γ1(ω)⊗ Γ2(ω).

2. ∀ Γ ∈ R and ∀x ∈ D,

(Γ⇒x)(ω) = (Γ(ω))⇒x.

Proof. (1) By definition we have

(Γ1 ⊗ Γ2)(ω) = ∨{∆(ω) : ∆ ∈ Γ1 ⊗ Γ2} = {φ ∈ Φ : ∃ ∆ ∈ Γ1 ⊗ Γ2 such that φ ≤ ∆(ω)}.

Assume then φ ∈ (Γ1 ⊗ Γ2)(ω), hence φ ≤ ∆(ω) for some ∆ ∈ Γ1 ⊗ Γ2. Then, by the definition of combination of ideals, ∆ ≤ ∆1 ⊗ ∆2 for some ∆1 ∈ Γ1 and some ∆2 ∈ Γ2. This implies φ ≤ ∆(ω) ≤ (∆1 ⊗ ∆2)(ω) = ∆1(ω) ⊗ ∆2(ω). Thus φ ∈ Γ1(ω) ⊗ Γ2(ω).

Conversely, again by the definition of combination of ideals

Γ1(ω) ⊗ Γ2(ω) = {φ ∈ Φ : φ ≤ ψ1 ⊗ ψ2, ψ1 ∈ Γ1(ω), ψ2 ∈ Γ2(ω)}.

Consider a φ ∈ Γ1(ω) ⊗ Γ2(ω) and assume φ ≤ ψ1 ⊗ ψ2 for some ψ1 ∈ Γ1(ω) and ψ2 ∈ Γ2(ω). Since

Γ1(ω) = ∨{∆(ω) : ∆ ∈ Γ1},

it follows from compactness that there is a ∆1 ∈ Γ1 such that ψ1 ≤ ∆1(ω), and similarly it follows that there is a ∆2 ∈ Γ2 such that ψ2 ≤ ∆2(ω). Hence φ ≤ ∆1(ω) ⊗ ∆2(ω) = (∆1 ⊗ ∆2)(ω) and ∆1 ⊗ ∆2 ∈ Γ1 ⊗ Γ2, so that finally φ ∈ (Γ1 ⊗ Γ2)(ω).

(2) Assume φ ∈ (Γ⇒x)(ω). Note that

(Γ⇒x)(ω) = ∨{∆(ω) : ∆ ∈ Γ⇒x},

and

Γ⇒x = ∨{∆ : ∆ ≤ ∆′⇒x for some ∆′ ∈ Γ}.

Therefore, there is a ∆ ∈ Γ⇒x such that φ ≤ ∆(ω). But then there is further a ∆′ ∈ Γ such that ∆ ≤ ∆′⇒x, hence φ ≤ (∆′⇒x)(ω) = (∆′(ω))⇒x. Now, from ∆′ ∈ Γ we conclude that ∆′(ω) ∈ Γ(ω). This implies then finally φ ∈ (Γ(ω))⇒x.

Conversely, let φ ∈ (Γ(ω))⇒x, i.e. φ ≤ ψ⇒x for some ψ ∈ Γ(ω), i.e. ψ ≤ ∨{∆(ω) : ∆ ∈ Γ}. By compactness it follows that ψ ≤ ∆(ω) for some ∆ ∈ Γ. Hence we conclude that φ ≤ (∆(ω))⇒x = (∆⇒x)(ω). This shows that φ ∈ (Γ⇒x)(ω). □

According to this lemma we have a homomorphism between the algebra of generalized random variables and random mappings. In fact it is an embedding, since Γ1(ω) = Γ2(ω) for all ω ∈ Ω implies Γ1 = Γ2, as follows from the next lemma, which strengthens Lemma 27.2.


Lemma 27.3 If X ⊆ R is a directed set, then

(∨Γ∈XΓ)(ω) = ∨Γ∈XΓ(ω).

Proof. If Γ ∈ X, then Γ ≤ ∨Γ∈XΓ, hence Γ(ω) ≤ (∨Γ∈XΓ)(ω) and therefore

∨Γ∈XΓ(ω) ≤ (∨Γ∈XΓ)(ω).

Conversely, assume φ ∈ (∨Γ∈XΓ)(ω). Since

(∨Γ∈XΓ)(ω) = ∨{∆(ω) : ∆ ≤ ∨Γ∈XΓ},

we have φ ≤ ∆(ω) for some simple random variable ∆ ≤ ∨Γ∈XΓ. Now, since X is a directed set, by compactness (more precisely, Theorem 15 (2) in Part I), there is a Γ ∈ X such that ∆ ≤ Γ, that is, ∆(ω) ≤ Γ(ω). It follows then that φ ≤ ∨Γ∈XΓ(ω), which in turn implies

(∨Γ∈XΓ)(ω) ≤ ∨Γ∈XΓ(ω).

This concludes the proof of the lemma. □

As before, in the previous Section 27.1, there is no guarantee that the support sΓ(φ) is measurable for every φ ∈ Φ. Of course we can extend the support function to all of Φ by the inner probability. Here however we propose alternatively to limit the class of random variables to the reasonable class of mappings which are limits in some sense of countable sequences of simple random variables. This can be accomplished in the framework of the algebra of generalized variables.

In order to do this we need to introduce a new concept. Let (Φ, D) be a domain-free information algebra and (IΦ, I(Φ), D) its ideal completion. A subset S ⊆ IΦ is called σ-closed if it is closed under countable combinations or joins, i.e. if I1, I2, . . . is a countable subset of S, then ∨∞i=1 Ii belongs to S too. The intersection of any family of σ-closed sets is also σ-closed. Further, the set IΦ itself is σ-closed. Therefore, for any subset X ⊆ IΦ we may define the σ-closure σ(X) as the intersection of all σ-closed sets containing X.

We are particularly interested in σ(Φ), the σ-closure of Φ in IΦ. Note that here, as in the sequel, we identify Φ with I(Φ) for simplicity of notation. Also we shall write φ rather than I(φ), even if we operate within IΦ. The σ-closure of Φ can be characterized as follows:

Theorem 27.2 If (Φ, D) is an information algebra, then

σ(Φ) = {φ ∈ IΦ : φ = ∨∞i=1 ψi, ψi ∈ Φ, ∀i = 1, 2, . . .}. (27.21)


Proof. Clearly the set on the right of equation (27.21) contains Φ and is contained in σ(Φ). We claim that this set is itself σ-closed. In fact, consider a countable set φj of elements of this set, such that

φj = ∨∞i=1ψj,i

with ψj,i ∈ Φ. Define the set I = {(j, i) : j = 1, 2, . . . ; i = 1, 2, . . .} and the sets Ij = {(j, i) : i = 1, 2, . . .} for j = 1, 2, . . ., and Ki = {(h, j) : 1 ≤ h, j ≤ i} for i = 1, 2, . . .. Then we have

I = ∪∞j=1Ij = ∪∞i=1Ki.

By the laws of associativity in the complete lattice IΦ we obtain then

∨∞j=1 φj = ∨∞j=1 (∨(j,i)∈Ij ψj,i) = ∨(j,i)∈I ψj,i = ∨∞i=1 (∨(h,j)∈Ki ψh,j).

But ∨(h,j)∈Ki ψh,j ∈ Φ for i = 1, 2, . . .. Hence ∨∞j=1 φj belongs itself to the set on the right hand side of (27.21). This means that this set is indeed σ-closed. Since the set contains Φ, it contains also σ(Φ), hence it equals σ(Φ). □

Consider now a monotone sequence φ1 ≤ φ2 ≤ . . . of elements of Φ. In the following theorem we show that for such sequences join commutes with focussing:

Theorem 27.3 For a monotone sequence φ1 ≤ φ2 ≤ . . . of elements of Φ, and for any x ∈ D,

(∨∞i=1 φi)⇒x = ∨∞i=1 φi⇒x. (27.22)

Proof. By the density property of a compact information algebra (see Part I), we have that

(∨∞i=1 φi)⇒x = ∨{ψ⇒x : ψ ∈ Φ, ψ ≤ ∨∞i=1 φi}.

Certainly, the left hand side of equation (27.22) is an upper bound of the right hand side. Consider a ψ ∈ Φ such that ψ ≤ ∨∞i=1 φi. Note that the monotone sequence φ1, φ2, . . . is a directed set in Φ. Therefore, by the compactness property of a compact information algebra (see Part I), there is a φi in the sequence such that ψ ≤ φi. This implies that ψ⇒x ≤ φi⇒x. From this we conclude that

∨{ψ⇒x : ψ ∈ Φ, ψ ≤ ∨∞i=1 φi} ≤ ∨∞i=1 φi⇒x.

This proves the equality in (27.22). □

Theorem 27.3 shows in particular that σ(Φ) is closed under focussing. In fact, if φi is any sequence of elements of Φ, then we may define ψi = ∨ik=1 φk ∈ Φ, such that ψk, k = 1, 2, . . ., is a monotone sequence and φ = ∨∞i=1 φi = ∨∞i=1 ψi. So, for φ ∈ σ(Φ) and any x ∈ D, by Theorem 27.3,

φ⇒x = ∨∞i=1 ψi⇒x, (27.23)

with ψi⇒x ∈ Φ, and hence φ⇒x ∈ σ(Φ) by Theorem 27.2. As a σ-closed set, σ(Φ) is closed under combination. Therefore (σ(Φ), D) is itself an information algebra. Since it is closed under combination (i.e. join) of countable sets and satisfies condition (27.22) we call it the σ-information algebra induced by (Φ, D). We refer however to Example ?? below to clarify this concept.

We now take the σ-closure of Rs in the compact algebra (R, Rs, D) of generalized random variables. The elements of σ(Rs) are called proper random variables or simply random variables. According to Theorem 27.2 random variables are defined as

Γ = ∨∞i=1 ∆i, with ∆i ∈ Rs, ∀i = 1, 2, . . . .

Then (σ(Rs), D) is a σ-information algebra, containing Rs, i.e. the simple random variables. To random variables we can, just as with generalized random variables, associate random mappings, defined by

Γ(ω) = ∨∞i=1 ∆i(ω), with ∆i ∈ Rs, ∀i = 1, 2, . . . .

Random variables are thus, within the ideal completion IΦ of Φ, pointwise limits of sequences of simple random variables. Note that Γ(ω) ∈ σ(Φ) by Theorem 27.2. According to Lemma 27.2 combination and focussing can also be defined pointwise within the algebra (IΦ, D), or more precisely, within the σ-information algebra (σ(Φ), D).

For σ-random variables Lemma 27.2 can be extended in the following way:

Lemma 27.4 If Γi ∈ σ(Rs), for i = 1, 2, . . ., then

(∨∞i=1Γi)(ω) = ∨∞i=1Γi(ω). (27.24)

Proof. Define Σi = ∨ij=1 Γj. Then Σi ∈ σ(Rs) is a monotone sequence of random variables and ∨∞i=1 Σi = ∨∞i=1 Γi. The same holds for Σi(ω) ∈ σ(Φ) and ∨∞i=1 Σi(ω) = ∨∞i=1 Γi(ω). Since a monotone sequence is a directed set, applying Lemma 27.3, we obtain

(∨∞i=1Γi)(ω) = (∨∞i=1Σi)(ω) = ∨∞i=1Σi(ω) = ∨∞i=1Γi(ω).

Thus (27.24) holds indeed. □

In the next section we examine a variant of the above theory of random variables in algebras which are themselves already closed under countable combination.


27.4 Random Variables in σ-Algebras

In the previous Section 27.3 we have seen that the ideal completion of an information algebra contains a σ-algebra. There are some information algebras, like for example Borel algebras, which are themselves already closed under countable combination. It should be noted however that (Φ, D) is embedded into the ideal completion (IΦ, D) only by a homomorphism φ ↦ I(φ), respecting finite combination only. Thus, if φ1, φ2, . . . is a countable set of elements of Φ and φ = ∨∞i=1 φi, then I(φ), the principal ideal of φ in Φ, is not equal to ∨∞i=1 I(φi). This is so because the principal ideal I(φ) is a σ-ideal, i.e. an ideal closed under countable combination of its elements. The ideal completion on the other hand contains ordinary ideals, closed only under finite combinations in general. Therefore, by the definition of combination of ideals in IΦ,

I(φ1, φ2, . . .) = ∩{I : I ideal, {φ1, φ2, . . .} ⊆ I}.

Certainly {φ1, φ2, . . .} ⊆ I(φ), but there may be ordinary ideals I such that {φ1, φ2, . . .} ⊂ I ⊂ I(φ), hence we can only conclude that I(φ1, φ2, . . .) ⊆ I(φ). So, even if Φ is closed under countable combinations, it is not true that σ(Φ) = Φ. Nevertheless, a theory of random variables may be defined in such algebras along lines similar to Section 27.3 above.

A domain-free information algebra (Φ, D) is called a σ-information algebra, if

1. Countable Combination: Φ is closed under countable combination.

2. Continuity of Focussing: For every monotone sequence φ1 ≤ φ2 ≤ . . . ∈ Φ, and for any x ∈ D, it holds that

(⊗∞i=1 φi)⇒x = ⊗∞i=1 φi⇒x.

In this section, information algebras (Φ, D) are always assumed to be σ-algebras. Certainly, the countable combination also equals the countable supremum,

⊗∞i=1 φi = ∨∞i=1 φi.

Simple random variables ∆ on a probability space (Ω, A, P) with values in a domain-free information algebra (Φ, D) are defined as in Section 26.1 and they still form an information algebra (Rs, D). Since (Φ, D) is a σ-algebra, we may define a random mapping Γ : Ω → Φ by

Γ(ω) =∞∨i=1

∆i(ω).


We call such random mappings Γ random variables in the σ-algebra (Φ, D). Let now R_σ be the family of random variables in the σ-algebra (Φ, D). We are going to define the operations of combination and focusing in R_σ as usual by pointwise combination and focusing:

1. Combination: (Γ_1 ⊗ Γ_2)(ω) = Γ_1(ω) ⊗ Γ_2(ω),

2. Focusing: (Γ^{⇒x})(ω) = (Γ(ω))^{⇒x}.
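As a toy illustration of these two pointwise rules, consider the subset information algebra of a small two-variable frame, where combination is intersection and focusing to a set x of variables projects and then extends cylindrically. The following sketch is our own (all names and the model are invented for illustration, not part of the text):

```python
from itertools import product

VARS = ("u", "v")                            # two binary variables
THETA = frozenset(product((0, 1), (0, 1)))   # frame: all (u, v) configurations

def combine(phi, psi):
    """Combination in the subset algebra: intersection."""
    return phi & psi

def focus(phi, x):
    """Focus phi to the variables in x: project, then extend cylindrically."""
    idx = [VARS.index(v) for v in x]
    proj = {tuple(t[i] for i in idx) for t in phi}
    return frozenset(t for t in THETA if tuple(t[i] for i in idx) in proj)

# Two random variables on a three-point sample space, given pointwise.
OMEGA = (0, 1, 2)
gamma1 = {0: THETA, 1: frozenset({(0, 0), (0, 1)}), 2: frozenset({(1, 1)})}
gamma2 = {0: frozenset({(0, 0), (1, 0)}), 1: THETA, 2: frozenset({(1, 1)})}

# Pointwise combination and focusing, exactly as in rules 1 and 2 above.
combined = {w: combine(gamma1[w], gamma2[w]) for w in OMEGA}
focused = {w: focus(combined[w], ("u",)) for w in OMEGA}
print(combined)
print(focused)
```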

We have to verify that the resulting random mappings still belong to R_σ. So, let

Γ_1 = ∨_{i=1}^∞ ∆_{1,i}, Γ_2 = ∨_{i=1}^∞ ∆_{2,i}.

Then we obtain, using associativity of the supremum,

(Γ_1 ⊗ Γ_2)(ω) = Γ_1(ω) ⊗ Γ_2(ω) = (∨_{i=1}^∞ ∆_{1,i}(ω)) ∨ (∨_{i=1}^∞ ∆_{2,i}(ω))
= ∨_{i=1}^∞ (∆_{1,i}(ω) ∨ ∆_{2,i}(ω)) = ∨_{i=1}^∞ (∆_{1,i} ∨ ∆_{2,i})(ω).

Since ∆_{1,i} ∨ ∆_{2,i} ∈ R_s, this proves that Γ_1 ⊗ Γ_2 ∈ R_σ. Similarly, let

Γ(ω) = ∨_{i=1}^∞ ∆_i(ω).

Define then

Γ_n(ω) = ∨_{i=1}^n ∆_i(ω),

such that Γ_n(ω) is an increasing sequence in Φ. Then, by the continuity of focusing,

(Γ^{⇒x})(ω) = (Γ(ω))^{⇒x} = (∨_{i=1}^∞ Γ_i(ω))^{⇒x} = ∨_{i=1}^∞ (Γ_i(ω))^{⇒x} = ∨_{i=1}^∞ (Γ_i^{⇒x})(ω).

Again, ∆_i ∈ R_s implies Γ_i ∈ R_s for all i = 1, 2, …. Thus this shows that Γ^{⇒x} ∈ R_σ.

We expect (R_σ, D) to form an information algebra, even a σ-algebra. This is indeed essentially true.


Theorem 27.4 The system (R_σ, D) of random variables, with combination and focusing defined as above, satisfies all axioms of a domain-free σ-information algebra, except the support axiom.

Proof. We show that R_σ is σ-closed: Let

Γ_j(ω) = ∨_{i=1}^∞ ∆_{j,i}(ω), for j = 1, 2, …,

and define the random mapping Γ by

Γ(ω) = ∨_{j=1}^∞ Γ_j(ω) = ∨_{j=1}^∞ (∨_{i=1}^∞ ∆_{j,i}(ω)).

Consider the sets I_j = {(j, i) : i = 1, 2, …} and

I = {(j, i) : j, i = 1, 2, …} = ⋃_{j=1}^∞ I_j.

Then we have

Γ(ω) = ∨_{(j,i)∈I} ∆_{j,i}(ω).

Let further K_i = {(h, j) : h, j ≤ i} and define

Γ_i(ω) = ∨_{h=1}^i ∨_{j=1}^i ∆_{h,j}(ω).

Note that the Γ_i are simple random variables. Then we have

I = ⋃_{i=1}^∞ K_i

and therefore, by the general law of associativity of the supremum,

Γ(ω) = ∨_{i=1}^∞ (∨_{(h,j)∈K_i} ∆_{h,j}(ω)) = ∨_{i=1}^∞ Γ_i(ω).

Thus the random mapping Γ is indeed a random variable and R_σ is closed under countable combination.

The axioms of the information algebra (R_σ, D) are, as usual, all inherited from the underlying information algebra (Φ, D), except the support axiom.


It remains to verify the continuity of focusing. Note that in the information algebra (R_σ, D) we have Γ_1 ≤ Γ_2 if, and only if, Γ_1(ω) ≤ Γ_2(ω) for all ω ∈ Ω. So, let Γ_1 ≤ Γ_2 ≤ … be a monotone sequence of random variables in R_σ and x ∈ D. Then continuity of focusing in (R_σ, D) follows from this property in (Φ, D) as follows:

((∨_{i=1}^∞ Γ_i)^{⇒x})(ω) = (∨_{i=1}^∞ Γ_i(ω))^{⇒x} = ∨_{i=1}^∞ (Γ_i(ω))^{⇒x} = (∨_{i=1}^∞ Γ_i^{⇒x})(ω).

This ends the proof. □

The support axiom is not guaranteed, because the sequence of simple random variables defining a random variable may have supports without an upper bound in D. Certainly, (R_s, D) is a subalgebra of (R_σ, D). Within the algebra (R_σ, D), each element of R_σ is the supremum of the simple random variables which it dominates.

Lemma 27.5 Let Γ ∈ R_σ be defined by

Γ(ω) = ∨_{i=1}^∞ ∆_i(ω).

Then, in the information algebra (R_σ, D),

Γ = ∨_{i=1}^∞ ∆_i = ∨{∆ : ∆ ∈ R_s, ∆ ≤ Γ}. (27.25)

Proof. The first equality in (27.25) follows directly from the definition of Γ. Trivially, Γ is an upper bound of the set {∆ : ∆ ≤ Γ}. If Γ′ is another upper bound of this set, then it is also an upper bound of the ∆_i, hence Γ ≤ Γ′. Therefore, Γ is the least upper bound of the set {∆ : ∆ ≤ Γ}. □

Example 27.1 Random Variables in Compact Algebras. The theory of random variables developed above may be presented in a similar way in the framework of compact information algebras. Here is a sketch of the approach: Let (Φ, Φ_f, D) be a compact information algebra. Define simple random variables ∆ with finite elements from Φ_f as values. They form an information algebra (R_s, D) with combination and focusing defined pointwise. Order between simple random variables is also defined pointwise, i.e. ∆_1 ≤ ∆_2 holds if, and only if, ∆_1(ω) ≤ ∆_2(ω) for all sample points ω of the underlying probability space.

If X is a directed set of simple random variables, then the random mapping Γ = ∨X : Ω → Φ is also defined pointwise:

Γ(ω) = ∨_{∆∈X} ∆(ω).


In particular, if ∆_1, ∆_2, … is a countable family of simple random variables, define

Γ_i = ∨_{j=1}^i ∆_j.

Then the mappings Γ = ∨_{i=1}^∞ ∆_i define a σ-information algebra of random mappings.

This method can, in an obvious way, also be applied to labeled compact information algebras.


Chapter 28

Allocation of Probability

28.1 Algebra of Allocation of Probabilities

In Section 27.1 we have introduced allocations of probability (a.o.p.) as a means to extend the support function of a random mapping beyond the measurable elements φ, i.e. the elements for which s_Γ(φ) ∈ A. These allocations of probability play an important role in the theory of random information. Therefore, we start here with a study of this concept, at first independently of its relation to random mappings and random variables.

Random variables or hints and probabilistic argumentation systems provide means to model explicitly the mechanisms which generate uncertain information. Alternatively, allocations of probability may serve to directly assign beliefs to pieces of information. This is more in the spirit of a subjective description of belief, advocated especially by G. Shafer (Shafer, 1973; Shafer, 1976). In this view, allocations of probability are taken as the primitive elements, rather than random variables or hints. This is the point of view developed in this section (see also (Kohlas, 1995; Kohlas, 1997)).

Consider a domain-free information algebra (Φ, D) and a probability algebra (B, µ). Allocations of probability are mappings ρ : Φ → B such that

(A1) ρ(e) = ⊤,

(A2) ρ(φ ⊗ ψ) = ρ(φ) ∧ ρ(ψ).

If furthermore ρ(z) = ⊥ holds, then the allocation is called normalized. Note that if φ ≤ ψ, that is, φ ⊗ ψ = ψ, then ρ(ψ) = ρ(φ ⊗ ψ) = ρ(φ) ∧ ρ(ψ), hence ρ(ψ) ≤ ρ(φ). A particular a.o.p. is defined by ν(φ) = ⊥ unless φ = e, in which case ν(e) = ⊤. This is called the vacuous allocation. It is associated with the vacuous information given by the random variable Γ(ω) = e for all ω ∈ Ω.
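The following small sketch (an illustration of ours, with made-up names and numbers) realizes an allocation of probability on the subset algebra of a two-element frame, taking B to be the field of subsets of a finite sample space Ω, and checks (A1) and (A2) by brute force:

```python
from itertools import combinations

THETA = frozenset({"a", "b"})       # frame; e = THETA, z = empty set
E = THETA

def subsets(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

PHI = subsets(THETA)                # all pieces of information
OMEGA = frozenset({1, 2, 3})        # sample space; B = field of subsets of OMEGA
TOP = OMEGA

# An allocation induced by a multivalued mapping GAMMA (invented values).
GAMMA = {1: THETA, 2: frozenset({"a"}), 3: frozenset({"a"})}

def rho(phi):
    """Belief allocated to phi: the assumptions under which phi is implied."""
    return frozenset(w for w in OMEGA if GAMMA[w] <= phi)

# (A1): full belief in the vacuous information e.
assert rho(E) == TOP
# (A2): in the subset model, combination is intersection on both sides.
assert all(rho(p & q) == rho(p) & rho(q) for p in PHI for q in PHI)
```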

We may think of an allocation of probability as the description of a body of belief, relative to pieces of information in an information algebra (Φ, D), obtained from a source of information. Two (or more) distinct sources of


information will lead to the definition of two (or more) corresponding allocations of probability. Thus, in a general setting, let A_Φ be the set of all allocations of probability on Φ in (B, µ). Select two allocations ρ_i, i = 1, 2, from A_Φ. How can they be combined in order to synthesize the two bodies of information they represent into a single body?

The basic idea is as follows: consider a piece of information φ in Φ. If now φ_1 and φ_2 are two other pieces of information in Φ such that φ ≤ φ_1 ⊗ φ_2, then any belief allocated to φ_1 and to φ_2 by the two allocations ρ_1 and ρ_2 respectively is a belief allocated to φ by the two allocations simultaneously. That is, the total belief ρ(φ) to be allocated to φ by the two allocations ρ_1 and ρ_2 together must be at least the combined, common support allocated to φ_1 and φ_2 by each of the two allocations respectively,

ρ(φ) ≥ ρ_1(φ_1) ∧ ρ_2(φ_2). (28.1)

In the absence of other information, it seems then reasonable to define the combined belief in φ, as obtained from the two sources of information, as the least upper bound of all these implied beliefs,

ρ(φ) = ∨{ρ_1(φ_1) ∧ ρ_2(φ_2) : φ ≤ φ_1 ⊗ φ_2}. (28.2)

This defines indeed a new allocation of probability.
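Over a finite toy algebra, the combination (28.2) can be computed by brute force. The sketch below is our own illustration (names invented), using the subset-model conventions from the previous sketch: φ ≤ φ_1 ⊗ φ_2 reads φ_1 ∩ φ_2 ⊆ φ, and ∨, ∧ in B are union and intersection. As a preview of Theorem 28.2, it checks that combining an allocation with itself changes nothing:

```python
from itertools import combinations

THETA = frozenset({"a", "b"})

def subsets(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

PHI = subsets(THETA)
OMEGA = frozenset({1, 2, 3})
GAMMA = {1: THETA, 2: frozenset({"a"}), 3: frozenset({"a"})}

def rho(phi):
    """Allocation induced by GAMMA, as in the previous sketch."""
    return frozenset(w for w in OMEGA if GAMMA[w] <= phi)

def combine_aop(rho1, rho2):
    """Combination (28.2): join rho1(phi1) ^ rho2(phi2) over all pairs with
    phi <= phi1 (x) phi2; in the subset model, phi1 & phi2 contained in phi."""
    def combined(phi):
        pieces = [rho1(p1) & rho2(p2)
                  for p1 in PHI for p2 in PHI if (p1 & p2) <= phi]
        return frozenset().union(*pieces) if pieces else frozenset()
    return combined

# Idempotency (established in general by Theorem 28.2 below).
assert all(combine_aop(rho, rho)(p) == rho(p) for p in PHI)
```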

Theorem 28.1 ρ as defined by (28.2) is an allocation of probability.

Proof. First, we have

ρ(e) = ∨{ρ_1(φ_1) ∧ ρ_2(φ_2) : e ≤ φ_1 ⊗ φ_2} = ρ_1(e) ∧ ρ_2(e) = ⊤.

So (A1) is satisfied. Next, let ψ_1, ψ_2 ∈ Φ. By definition we have

ρ(ψ_1 ⊗ ψ_2) = ∨{ρ_1(φ_1) ∧ ρ_2(φ_2) : ψ_1 ⊗ ψ_2 ≤ φ_1 ⊗ φ_2}.

Now, ψ_1 ≤ ψ_1 ⊗ ψ_2 implies that

∨{ρ_1(φ_1) ∧ ρ_2(φ_2) : ψ_1 ⊗ ψ_2 ≤ φ_1 ⊗ φ_2} ≤ ∨{ρ_1(φ_1) ∧ ρ_2(φ_2) : ψ_1 ≤ φ_1 ⊗ φ_2}

and similarly for ψ_2. Thus we have ρ(ψ_1 ⊗ ψ_2) ≤ ρ(ψ_1), ρ(ψ_2), that is, ρ(ψ_1 ⊗ ψ_2) ≤ ρ(ψ_1) ∧ ρ(ψ_2).

On the other hand,

{(φ_1, φ_2) : ψ_1 ⊗ ψ_2 ≤ φ_1 ⊗ φ_2}
⊇ {(φ_1, φ_2) : φ_1 = φ′_1 ⊗ φ″_1, φ_2 = φ′_2 ⊗ φ″_2, ψ_1 ≤ φ′_1 ⊗ φ′_2, ψ_2 ≤ φ″_1 ⊗ φ″_2}.


By the distributive law for complete Boolean algebras we obtain then

ρ(ψ_1 ⊗ ψ_2)
≥ ∨{ρ_1(φ′_1 ⊗ φ″_1) ∧ ρ_2(φ′_2 ⊗ φ″_2) : ψ_1 ≤ φ′_1 ⊗ φ′_2, ψ_2 ≤ φ″_1 ⊗ φ″_2}
= ∨{ρ_1(φ′_1) ∧ ρ_1(φ″_1) ∧ ρ_2(φ′_2) ∧ ρ_2(φ″_2) : ψ_1 ≤ φ′_1 ⊗ φ′_2, ψ_2 ≤ φ″_1 ⊗ φ″_2}
= (∨{ρ_1(φ′_1) ∧ ρ_2(φ′_2) : ψ_1 ≤ φ′_1 ⊗ φ′_2}) ∧ (∨{ρ_1(φ″_1) ∧ ρ_2(φ″_2) : ψ_2 ≤ φ″_1 ⊗ φ″_2})
= ρ(ψ_1) ∧ ρ(ψ_2). (28.3)

This implies finally that ρ(ψ_1 ⊗ ψ_2) = ρ(ψ_1) ∧ ρ(ψ_2). Thus (A2) holds too and ρ is indeed an allocation of probability. □

In this way, a binary combination operation is defined in the set A_Φ of allocations of probability. We denote this operation by ⊗. Thus, ρ as defined by (28.2) is written as ρ = ρ_1 ⊗ ρ_2. The following theorem gives us the elementary properties of this operation.

Theorem 28.2 The combination operation, as defined by (28.2), is commutative, associative and idempotent, and the vacuous allocation is the neutral element of this operation.

Proof. The commutativity of (28.2) is evident. For the associativity note that for a φ ∈ Φ we have, due to the associativity and distributivity of complete Boolean algebras,

((ρ_1 ⊗ ρ_2) ⊗ ρ_3)(φ)
= ∨{(ρ_1 ⊗ ρ_2)(φ_{12}) ∧ ρ_3(φ_3) : φ ≤ φ_{12} ⊗ φ_3}
= ∨{∨{ρ_1(φ_1) ∧ ρ_2(φ_2) : φ_{12} ≤ φ_1 ⊗ φ_2} ∧ ρ_3(φ_3) : φ ≤ φ_{12} ⊗ φ_3}
= ∨{ρ_1(φ_1) ∧ ρ_2(φ_2) ∧ ρ_3(φ_3) : φ ≤ φ_1 ⊗ φ_2 ⊗ φ_3}.

For (ρ_1 ⊗ (ρ_2 ⊗ ρ_3))(φ) we obtain exactly the same result in the same way. This proves associativity.

To show idempotency consider

(ρ ⊗ ρ)(φ) = ∨{ρ(φ_1) ∧ ρ(φ_2) : φ ≤ φ_1 ⊗ φ_2} = ∨{ρ(φ_1 ⊗ φ_2) : φ ≤ φ_1 ⊗ φ_2} = ρ(φ),

since the last supremum is attained for φ_1 = φ_2 = φ. Finally let ν denote the vacuous allocation. Then, for any allocation ρ and any φ ∈ Φ we have, noting that ν(ψ) = ⊥ unless ψ = e, in which case ν(e) = ⊤,

(ρ ⊗ ν)(φ) = ∨{ρ(φ_1) ∧ ν(φ_2) : φ ≤ φ_1 ⊗ φ_2} = ρ(φ).

This shows that ν is the neutral element for combination. □

This theorem shows that A_Φ is a commutative and idempotent semigroup, that is, a semilattice. Indeed, a partial order between allocations can


be introduced as usual by defining ρ_1 ≤ ρ_2 if ρ_1 ⊗ ρ_2 = ρ_2. This means that for all φ ∈ Φ,

(ρ_1 ⊗ ρ_2)(φ) = ∨{ρ_1(φ_1) ∧ ρ_2(φ_2) : φ ≤ φ_1 ⊗ φ_2} = ρ_2(φ).

We have therefore always ρ_1(φ_1) ∧ ρ_2(φ_2) ≤ ρ_2(φ) if φ ≤ φ_1 ⊗ φ_2. Take now φ_1 = φ and φ_2 = e, such that φ ≤ φ ⊗ e = φ, to obtain ρ_1(φ) ∧ ρ_2(e) = ρ_1(φ) ≤ ρ_2(φ). Thus we have ρ_1 ≤ ρ_2 if, and only if, ρ_1(φ) ≤ ρ_2(φ) for all φ ∈ Φ.

As an example consider allocations ρ of the form

ρ_φ(ψ) = ⊤ if ψ ≤ φ, and ρ_φ(ψ) = ⊥ otherwise, (28.4)

for some φ ∈ Φ. Such allocations are called deterministic. They express the fact that full belief is allocated to the information φ. Now, for φ_1, φ_2 ∈ Φ we have

(ρ_{φ_1} ⊗ ρ_{φ_2})(ψ) = ∨{ρ_{φ_1}(ψ_1) ∧ ρ_{φ_2}(ψ_2) : ψ ≤ ψ_1 ⊗ ψ_2}
= ⊤ if ψ ≤ φ_1 ⊗ φ_2, and ⊥ otherwise
= ρ_{φ_1⊗φ_2}(ψ). (28.5)

So, the combination of allocations produces the expected result. Note also that ρ_e = ν. This shows that the mapping φ ↦ ρ_φ is a ⊗-homomorphism.
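Deterministic allocations and the homomorphism property (28.5) can be checked on the same toy model. The sketch below is illustrative only and reuses THETA, OMEGA, PHI and combine_aop from the sketch following (28.2); recall that in the subset model ψ ≤ φ reads φ ⊆ ψ:

```python
def deterministic(phi0):
    """Deterministic allocation (28.4): full belief in phi0."""
    return lambda psi: OMEGA if phi0 <= psi else frozenset()

rho_a = deterministic(frozenset({"a"}))
rho_b = deterministic(frozenset({"b"}))
nu = deterministic(THETA)          # rho_e: the vacuous allocation

# (28.5): combining deterministic allocations is deterministic
# on the combined information.
lhs = combine_aop(rho_a, rho_b)
rhs = deterministic(frozenset({"a"}) & frozenset({"b"}))
assert all(lhs(p) == rhs(p) for p in PHI)

# Neutrality (Theorem 28.2): rho_a (x) nu = rho_a.
assert all(combine_aop(rho_a, nu)(p) == rho_a(p) for p in PHI)
```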

Next we turn to the operation of focusing of an allocation of probability. Let ρ be an allocation of probability on an information algebra (Φ, D) and x a domain in D. Just as it is possible to focus a piece of information φ from Φ to the domain x, it should also be possible to focus the belief represented by the allocation ρ to the information supported by the domain x. This means to extract from ρ the information related to x. Thus, for a φ ∈ Φ, consider the beliefs allocated to pieces of information ψ which are supported by x and which entail φ, i.e. φ ≤ ψ = ψ^{⇒x}. The part of the belief allocated to φ and relating to the domain x, ρ^{⇒x}(φ), must then be at least ρ(ψ),

ρ^{⇒x}(φ) ≥ ρ(ψ) for any ψ = ψ^{⇒x} ≥ φ. (28.6)

In the absence of other information, it seems again reasonable to define ρ^{⇒x}(φ) to be the least upper bound of all these implied supports,

ρ^{⇒x}(φ) = ∨{ρ(ψ) : ψ = ψ^{⇒x} ≥ φ}. (28.7)

This defines indeed an allocation of probability.

Theorem 28.3 ρ^{⇒x} as defined by (28.7) is an allocation of probability.


Proof. We have by definition

ρ^{⇒x}(e) = ∨{ρ(ψ) : ψ = ψ^{⇒x} ≥ e} = ρ(e) = ⊤,

since e = e^{⇒x}. Thus (A1) is verified. Again by definition,

ρ^{⇒x}(φ_1 ⊗ φ_2) = ∨{ρ(ψ) : ψ = ψ^{⇒x} ≥ φ_1 ⊗ φ_2}.

From φ_1, φ_2 ≤ φ_1 ⊗ φ_2 it follows that ρ^{⇒x}(φ_1 ⊗ φ_2) ≤ ρ^{⇒x}(φ_1), ρ^{⇒x}(φ_2), and thus ρ^{⇒x}(φ_1 ⊗ φ_2) ≤ ρ^{⇒x}(φ_1) ∧ ρ^{⇒x}(φ_2).

On the other hand,

{ψ : ψ = ψ^{⇒x} ≥ φ_1 ⊗ φ_2} ⊇ {ψ = ψ_1 ⊗ ψ_2 : ψ_1 = ψ_1^{⇒x} ≥ φ_1, ψ_2 = ψ_2^{⇒x} ≥ φ_2}.

From this we obtain, using the distributive law for complete Boolean algebras,

ρ^{⇒x}(φ_1 ⊗ φ_2)
≥ ∨{ρ(ψ_1 ⊗ ψ_2) : ψ_1 = ψ_1^{⇒x} ≥ φ_1, ψ_2 = ψ_2^{⇒x} ≥ φ_2}
= ∨{ρ(ψ_1) ∧ ρ(ψ_2) : ψ_1 = ψ_1^{⇒x} ≥ φ_1, ψ_2 = ψ_2^{⇒x} ≥ φ_2}
= (∨{ρ(ψ_1) : ψ_1 = ψ_1^{⇒x} ≥ φ_1}) ∧ (∨{ρ(ψ_2) : ψ_2 = ψ_2^{⇒x} ≥ φ_2})
= ρ^{⇒x}(φ_1) ∧ ρ^{⇒x}(φ_2).

This proves property (A2) for an allocation of probability. □

The following remarks are easily verified:

ρ^{⇒x}(φ) ≤ ρ(φ) for all φ ∈ Φ;
φ = φ^{⇒x} implies ρ^{⇒x}(φ) = ρ(φ). (28.8)
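The sketch below (illustrative and self-contained, with invented names) computes the focusing (28.7) on a small two-variable subset algebra and verifies both remarks of (28.8) by brute force:

```python
from itertools import combinations, product

VARS = ("u", "v")
THETA = frozenset(product((0, 1), (0, 1)))

def subsets(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

PHI = subsets(THETA)   # 16 pieces of information

def focus(phi, x):
    """phi^{=>x}: project to the variables in x, then extend cylindrically."""
    idx = [VARS.index(v) for v in x]
    proj = {tuple(t[i] for i in idx) for t in phi}
    return frozenset(t for t in THETA if tuple(t[i] for i in idx) in proj)

OMEGA = frozenset({1, 2})
GAMMA = {1: frozenset({(0, 0)}), 2: frozenset({(1, 0), (1, 1)})}

def rho(phi):
    """Allocation induced by GAMMA: assumptions under which phi is implied."""
    return frozenset(w for w in OMEGA if GAMMA[w] <= phi)

def focus_aop(rho, x):
    """Focusing (28.7): join rho(psi) over all psi supported by x
    that entail phi (in the subset model, psi contained in phi)."""
    def rho_x(phi):
        pieces = [rho(psi) for psi in PHI
                  if focus(psi, x) == psi and psi <= phi]
        return frozenset().union(*pieces) if pieces else frozenset()
    return rho_x

rho_u = focus_aop(rho, ("u",))
assert all(rho_u(phi) <= rho(phi) for phi in PHI)      # first remark of (28.8)
assert all(rho_u(phi) == rho(phi) for phi in PHI
           if focus(phi, ("u",)) == phi)               # second remark of (28.8)
```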

Thus we see that ρ^{⇒x} ≤ ρ. Furthermore, if ρ^{⇒x} = ρ, then the allocation of probability ρ is said to be supported by the domain x. The following theorem shows that allocations of probability, furnished with the operations of combination and focusing as defined so far, satisfy the axioms of domain-free information algebras.

Theorem 28.4 1. Transitivity: For ρ ∈ A_Φ and x, y ∈ D,

(ρ^{⇒x})^{⇒y} = ρ^{⇒x∧y}. (28.9)

2. Combination: For ρ_1, ρ_2 ∈ A_Φ and x ∈ D,

(ρ_1^{⇒x} ⊗ ρ_2)^{⇒x} = ρ_1^{⇒x} ⊗ ρ_2^{⇒x}. (28.10)


3. Idempotency: For ρ ∈ A_Φ and x ∈ D,

ρ ⊗ ρ^{⇒x} = ρ. (28.11)

Proof. (1) By definition and the associative law for complete Boolean algebras,

(ρ^{⇒x})^{⇒y}(φ) = ∨{ρ^{⇒x}(ψ) : ψ = ψ^{⇒y} ≥ φ}
= ∨{∨{ρ(ψ′) : ψ′ = ψ′^{⇒x} ≥ ψ} : ψ = ψ^{⇒y} ≥ φ}
= ∨{ρ(ψ′) : ψ′ = ψ′^{⇒x} ≥ ψ = ψ^{⇒y} ≥ φ}.

Also by definition, we have

ρ^{⇒x∧y}(φ) = ∨{ρ(ψ) : ψ = ψ^{⇒x∧y} ≥ φ} = ∨{ρ(ψ) : ψ = (ψ^{⇒x})^{⇒y} ≥ φ}.

If ψ is supported by x ∧ y, then it is supported both by x and by y. Take now ψ′ = ψ. Then it follows that ψ′ = ψ′^{⇒x} ≥ ψ = ψ^{⇒y} ≥ φ. This implies that

(ρ^{⇒x})^{⇒y}(φ) ≥ ρ^{⇒x∧y}(φ).

On the other hand, ψ′ = ψ′^{⇒x} ≥ ψ = ψ^{⇒y} ≥ φ implies that ψ′^{⇒y} = (ψ′^{⇒x})^{⇒y} ≥ ψ^{⇒y} ≥ φ. Furthermore, ψ′^{⇒y} ≤ ψ′ implies ρ(ψ′) ≤ ρ(ψ′^{⇒y}). Thus we obtain

(ρ^{⇒x})^{⇒y}(φ) ≤ ∨{ρ(ψ′^{⇒y}) : ψ′^{⇒y} = (ψ′^{⇒y})^{⇒x} ≥ φ}.

Now put in this formula ψ = ψ′^{⇒y}, such that ψ^{⇒y} = ψ. Then we obtain

(ρ^{⇒x})^{⇒y}(φ) ≤ ∨{ρ(ψ) : ψ = ψ^{⇒y} = (ψ^{⇒y})^{⇒x} ≥ φ}
= ∨{ρ(ψ) : ψ = ψ^{⇒y} = ψ^{⇒x∧y} ≥ φ}
= ρ^{⇒x∧y}(φ).

This proves transitivity of focusing of allocations of probability.

(2) Again, by definition and the distributive and associative laws of complete Boolean algebras, for any φ ∈ Φ,

(ρ_1^{⇒x} ⊗ ρ_2^{⇒x})(φ) = ∨{ρ_1^{⇒x}(φ_1) ∧ ρ_2^{⇒x}(φ_2) : φ ≤ φ_1 ⊗ φ_2}
= ∨{(∨{ρ_1(ψ_1) : ψ_1 = ψ_1^{⇒x} ≥ φ_1}) ∧ (∨{ρ_2(ψ_2) : ψ_2 = ψ_2^{⇒x} ≥ φ_2}) : φ ≤ φ_1 ⊗ φ_2}
= ∨{ρ_1(ψ_1) ∧ ρ_2(ψ_2) : ψ_1 = ψ_1^{⇒x} ≥ φ_1, ψ_2 = ψ_2^{⇒x} ≥ φ_2, φ ≤ φ_1 ⊗ φ_2}
= ∨{ρ_1(ψ_1) ∧ ρ_2(ψ_2) : ψ_1 = ψ_1^{⇒x}, ψ_2 = ψ_2^{⇒x}, φ ≤ ψ_1 ⊗ ψ_2}.

Also by definition,

(ρ_1^{⇒x} ⊗ ρ_2)(φ) = ∨{ρ_1^{⇒x}(φ_1) ∧ ρ_2(φ_2) : φ ≤ φ_1 ⊗ φ_2},


and therefore, again by definition and the associative and distributive laws of complete Boolean algebras,

(ρ_1^{⇒x} ⊗ ρ_2)^{⇒x}(φ)
= ∨{∨{ρ_1^{⇒x}(φ_1) ∧ ρ_2(φ_2) : ψ ≤ φ_1 ⊗ φ_2} : φ ≤ ψ = ψ^{⇒x}}
= ∨{ρ_1^{⇒x}(φ_1) ∧ ρ_2(φ_2) : φ ≤ ψ = ψ^{⇒x} ≤ φ_1 ⊗ φ_2}
= ∨{(∨{ρ_1(ψ_1) : φ_1 ≤ ψ_1 = ψ_1^{⇒x}}) ∧ ρ_2(φ_2) : φ ≤ ψ = ψ^{⇒x} ≤ φ_1 ⊗ φ_2}
= ∨{ρ_1(ψ_1) ∧ ρ_2(φ_2) : φ_1 ≤ ψ_1 = ψ_1^{⇒x}, φ ≤ ψ = ψ^{⇒x} ≤ φ_1 ⊗ φ_2}
= ∨{ρ_1(ψ_1) ∧ ρ_2(ψ_2) : ψ_1 = ψ_1^{⇒x}, φ ≤ ψ = ψ^{⇒x} ≤ ψ_1 ⊗ ψ_2}.

Now, whenever ψ_1 = ψ_1^{⇒x}, ψ_2 = ψ_2^{⇒x} and φ ≤ ψ_1 ⊗ ψ_2, take ψ = ψ_1 ⊗ ψ_2. Then ψ^{⇒x} = ψ and φ ≤ ψ = ψ^{⇒x} ≤ ψ_1 ⊗ ψ_2, ψ_1 = ψ_1^{⇒x}. This shows that

(ρ_1^{⇒x} ⊗ ρ_2^{⇒x})(φ) ≤ (ρ_1^{⇒x} ⊗ ρ_2)^{⇒x}(φ).

On the other hand, whenever ψ_1 = ψ_1^{⇒x} and φ ≤ ψ = ψ^{⇒x} ≤ ψ_1 ⊗ ψ_2, then φ ≤ ψ = ψ^{⇒x} ≤ (ψ_1 ⊗ ψ_2)^{⇒x} = ψ_1 ⊗ ψ_2^{⇒x} and ψ_2 ≥ ψ_2^{⇒x}. Further, ψ_2 ≥ ψ_2^{⇒x} implies that ρ_2(ψ_2) ≤ ρ_2(ψ_2^{⇒x}). Therefore (renaming ψ_2^{⇒x} by ψ_2), we obtain

(ρ_1^{⇒x} ⊗ ρ_2)^{⇒x}(φ)
≤ ∨{ρ_1(ψ_1) ∧ ρ_2(ψ_2) : ψ_1 = ψ_1^{⇒x}, ψ_2 = ψ_2^{⇒x}, φ ≤ ψ = ψ^{⇒x} ≤ ψ_1 ⊗ ψ_2}
= ∨{ρ_1(ψ_1) ∧ ρ_2(ψ_2) : ψ_1 = ψ_1^{⇒x}, ψ_2 = ψ_2^{⇒x}, φ ≤ ψ_1 ⊗ ψ_2},

since we may always take ψ = ψ_1 ⊗ ψ_2, such that ψ^{⇒x} = (ψ_1 ⊗ ψ_2)^{⇒x} = ψ_1 ⊗ ψ_2. This shows that

(ρ_1^{⇒x} ⊗ ρ_2^{⇒x})(φ) ≥ (ρ_1^{⇒x} ⊗ ρ_2)^{⇒x}(φ).

And this proves (2).

(3) We have ρ^{⇒x} ≤ ρ, hence ρ ⊗ ρ^{⇒x} = ρ by the idempotency of combination. This proves the idempotency. □

As an example consider, for a φ ∈ Φ, the deterministic allocation,

ρ_φ^{⇒x}(ψ) = ∨{ρ_φ(ψ′) : ψ′ = ψ′^{⇒x} ≥ ψ}.

This equals ⊤ if there is a ψ′ = ψ′^{⇒x} ≥ ψ such that ψ′ ≤ φ, and ⊥ otherwise. But we have ψ′ = ψ′^{⇒x} ≤ φ if, and only if, ψ′ = ψ′^{⇒x} ≤ φ^{⇒x}. This shows that ρ_φ^{⇒x}(ψ) = ρ_{φ^{⇒x}}(ψ), i.e. ρ_φ^{⇒x} = ρ_{φ^{⇒x}}.

The last theorem, together with Theorem 28.2, shows that (A_Φ, D) is an information algebra, except that the existence of a support is not necessarily guaranteed. The existence of a support is guaranteed if D has a top element. So the algebraic structure of (Φ, D) carries over to A_Φ. In fact, this algebra can be seen as an extension of Φ, since the elements of Φ can be regarded as deterministic allocations. The mapping φ ↦ ρ_φ is an embedding of (Φ, D) in (A_Φ, D). The allocation ζ(φ) = ⊥ for all φ is a null element in (A_Φ, D), since ρ ⊗ ζ = ζ for all allocations ρ. Also clearly ζ^{⇒x} = ζ. So the algebra satisfies the nullity axiom.


28.2 Random Variables and Allocations of Probability

In Section 27.2 we have seen that we may extend the degree of support induced by a random mapping, by means of the mapping ρ, to all of Φ or even its ideal completion I_Φ. Now, random variables can be identified with random mappings, as we have seen in Section 27.3. Thus, this result applies to random variables as well. For the latter we can, however, obtain sharper results on this extension.

We start with simple random variables. For any simple random variable ∆ ∈ R_s, defined on a probability space (Ω, A, P), we have seen that all elements of Φ, and even of I_Φ, have measurable allocations of support s_∆(φ) ∈ A, and their degree of support is well defined. If we pass from the probability space (Ω, A, P) to its associated probability algebra (B, µ) (see Section 27.1), then we can define the allocation of probability (a.o.p.) associated with the random variable ∆,

ρ_∆(φ) = [s_∆(φ)]

for all elements φ ∈ Φ and even for all elements in I_Φ. Thus, we obtain for the degree of support induced by the random variable ∆,

sp_∆(φ) = P(s_∆(φ)) = µ(ρ_∆(φ)).

Again this holds on all of Φ and even its ideal completion I_Φ. The mapping ρ_∆ : Φ → B satisfies the properties (A1) and (A2) of an allocation of probability of the previous Section 28.1.

A simple random variable ∆ is defined by a partition B_1, …, B_n and a mapping defined by ∆(ω) = φ_i for all ω ∈ B_i, i = 1, …, n. We write ∆(ω) = ∆(B_i) if ω ∈ B_i. To the partition B_1, …, B_n of Ω corresponds a partition [B_1], …, [B_n] of the probability algebra B. The simple random variable ∆ can also be defined by a mapping ∆([B_i]) = φ_i from the partition of B into Φ. Its allocation of probability can then also be determined as

ρ_∆(φ) = ∨{[B_i] : φ ≤ ∆([B_i])}.

We note that ρ_∆ = ρ_{∆^→}. As far as allocation of probability (and support) is concerned, we may as well restrict ourselves to considering canonical simple random variables and their information algebra (R_{s,c}, D) (see Section 26.1). So, in the sequel of this section all simple random variables are assumed to be canonical.

We now consider the mapping ρ : ∆ ↦ ρ_∆ which maps simple random variables to a.o.p.s. We show that this mapping is a homomorphism:


Theorem 28.5 Let ∆_1, ∆_2, ∆ ∈ R_{s,c} be simple random variables, defined on partitions in a probability algebra (B, µ), with values in a domain-free information algebra (Φ, D). Then, for all φ ∈ Φ and x ∈ D,

ρ_{∆_1⊗∆_2}(φ) = (ρ_{∆_1} ⊗ ρ_{∆_2})(φ), (28.12)
ρ_{∆^{⇒x}}(φ) = (ρ_∆^{⇒x})(φ). (28.13)

Proof. (1) Assume that ∆_1 is defined on the partition B_{1,1}, …, B_{1,n} and ∆_2 on the partition B_{2,1}, …, B_{2,m} of B. From the definition of an allocation of probability, and the distributive and associative laws for complete Boolean algebras, we obtain

∨{ρ_{∆_1}(φ_1) ∧ ρ_{∆_2}(φ_2) : φ ≤ φ_1 ⊗ φ_2}
= ∨{(∨{B_{1,i} : φ_1 ≤ ∆_1(B_{1,i})}) ∧ (∨{B_{2,j} : φ_2 ≤ ∆_2(B_{2,j})}) : φ ≤ φ_1 ⊗ φ_2}
= ∨{∨{B_{1,i} ∧ B_{2,j} : B_{1,i} ∧ B_{2,j} ≠ ⊥, φ_1 ≤ ∆_1(B_{1,i}), φ_2 ≤ ∆_2(B_{2,j})} : φ ≤ φ_1 ⊗ φ_2}
= ∨{B_{1,i} ∧ B_{2,j} : B_{1,i} ∧ B_{2,j} ≠ ⊥, φ_1 ≤ ∆_1(B_{1,i}), φ_2 ≤ ∆_2(B_{2,j}), φ ≤ φ_1 ⊗ φ_2}.

But there are φ_1 ≤ ∆_1(B_{1,i}) and φ_2 ≤ ∆_2(B_{2,j}) with φ ≤ φ_1 ⊗ φ_2 if, and only if, φ ≤ ∆_1(B_{1,i}) ⊗ ∆_2(B_{2,j}). So we conclude that

∨{ρ_{∆_1}(φ_1) ∧ ρ_{∆_2}(φ_2) : φ ≤ φ_1 ⊗ φ_2}
= ∨{B_{1,i} ∧ B_{2,j} : B_{1,i} ∧ B_{2,j} ≠ ⊥, φ ≤ ∆_1(B_{1,i}) ⊗ ∆_2(B_{2,j})}
= ∨{B_{1,i} ∧ B_{2,j} : B_{1,i} ∧ B_{2,j} ≠ ⊥, φ ≤ (∆_1 ⊗ ∆_2)(B_{1,i} ∧ B_{2,j})} (28.14)
= ρ_{∆_1⊗∆_2}(φ).

(2) Assume that ∆ is defined on the partition B_1, …, B_n of B. Then ∆^{⇒x} is also defined on B_1, …, B_n. The associative law of complete Boolean algebras gives us then

∨{ρ_∆(ψ) : φ ≤ ψ = ψ^{⇒x}} = ∨{∨{B_i : ψ ≤ ∆(B_i)} : φ ≤ ψ = ψ^{⇒x}}
= ∨{B_i : φ ≤ ψ = ψ^{⇒x} ≤ ∆(B_i)}.

But φ ≤ ψ = ψ^{⇒x} ≤ ∆(B_i) holds for some ψ if, and only if, φ ≤ (∆(B_i))^{⇒x} = ∆^{⇒x}(B_i). Hence we see that

∨{ρ_∆(ψ) : φ ≤ ψ = ψ^{⇒x}} = ∨{B_i : φ ≤ ∆^{⇒x}(B_i)} = ρ_{∆^{⇒x}}(φ). □

As far as allocations of probability induced by simple random variables are concerned, this theorem shows that the combination and focusing of allocations reflects correctly the corresponding operations on the underlying random variables. Let Υ_s be the image of R_{s,c} under the mapping ρ. That is, Υ_s is the set of all allocations of probability which are


induced by simple random variables in (B, µ). The mapping ∆ ↦ ρ_∆ is a homomorphism, since

ρ_{∆_1⊗∆_2} = ρ_{∆_1} ⊗ ρ_{∆_2},
ρ_{∆^{⇒x}} = ρ_∆^{⇒x}.

On the right hand side we understand the operations of combination and focusing as those defined in the algebra of a.o.p.s. Also the vacuous random variable maps to the vacuous allocation. Thus we conclude that Υ_s is a subalgebra of the information algebra A_Φ. We remark that, if we restrict the mapping ρ to canonical random variables, then the mapping becomes an isomorphism.

In any case the mapping is order-preserving. This implies that ideals Γ ∈ R, that is, generalized random variables, are mapped into ideals {ρ_∆ : ∆ ∈ Γ}. Thus, we can extend the mapping from R_s to Υ_s to a mapping from the ideal completion R of R_s to the ideal completion Υ of Υ_s,

ρ_Γ = ∨{ρ_∆ : ∆ ≤ Γ}. (28.15)

Note that, for an a.o.p., if ∆ ≤ Γ, we should expect that ρ_∆(φ) ≤ ρ_Γ(φ) on the domain of these a.o.p.s. Therefore, it makes sense to associate with ρ_Γ the mapping

ρ_Γ(φ) = ∨{ρ_∆(φ) : ∆ ≤ Γ} (28.16)

from I_Φ into the Boolean algebra B. Does this mapping define an a.o.p.? Yes, it does indeed.

Theorem 28.6 The mapping ρ_Γ defined in (28.16) is an allocation of probability, defined on I_Φ.

Proof. (1) We have

ρ_Γ(e) = ∨{ρ_∆(e) : ∆ ≤ Γ}.

Since ρ_∆ is an a.o.p., it holds that ρ_∆(e) = ⊤. Thus it follows that ρ_Γ(e) = ⊤.

(2) Let φ_1, φ_2 ∈ I_Φ. We use the definition of ρ_Γ(φ) and the distributive law of Boolean algebras:

ρ_Γ(φ_1) ∧ ρ_Γ(φ_2)
= (∨{ρ_{∆′}(φ_1) : ∆′ ≤ Γ}) ∧ (∨{ρ_{∆″}(φ_2) : ∆″ ≤ Γ})
= ∨{ρ_{∆′}(φ_1) ∧ ρ_{∆″}(φ_2) : ∆′, ∆″ ≤ Γ}.

But ∆′, ∆″ ≤ Γ implies that ∆′, ∆″ ≤ ∆′ ⊗ ∆″ = ∆ ≤ Γ, and ∆ is still a simple random variable. Therefore we conclude that

ρ_{∆′}(φ_1) ≤ ρ_∆(φ_1), ρ_{∆″}(φ_2) ≤ ρ_∆(φ_2),


such that

ρ_{∆′}(φ_1) ∧ ρ_{∆″}(φ_2) ≤ ρ_∆(φ_1) ∧ ρ_∆(φ_2) = ρ_∆(φ_1 ⊗ φ_2).

This implies that

ρ_Γ(φ_1) ∧ ρ_Γ(φ_2) ≤ ∨{ρ_∆(φ_1 ⊗ φ_2) : ∆ ≤ Γ} = ρ_Γ(φ_1 ⊗ φ_2).

But ρ_∆(φ_1) ≥ ρ_∆(φ_1 ⊗ φ_2). Hence

ρ_Γ(φ_1) = ∨{ρ_∆(φ_1) : ∆ ≤ Γ} ≥ ∨{ρ_∆(φ_1 ⊗ φ_2) : ∆ ≤ Γ} = ρ_Γ(φ_1 ⊗ φ_2).

Similarly, we obtain ρ_Γ(φ_2) ≥ ρ_Γ(φ_1 ⊗ φ_2), hence ρ_Γ(φ_1 ⊗ φ_2) ≤ ρ_Γ(φ_1) ∧ ρ_Γ(φ_2), and therefore ρ_Γ(φ_1 ⊗ φ_2) = ρ_Γ(φ_1) ∧ ρ_Γ(φ_2). □

Note that the image of R under the mapping Γ ↦ ρ_Γ corresponds to the ideal completion I(Υ_s) of a.o.p.s associated with simple random variables. Further, we remark that, according to (28.16), this mapping is continuous. There is more: the mapping Γ ↦ ρ_Γ is a homomorphism of R into the information algebra Υ of a.o.p.s, as the following theorem shows. This result also justifies the association of the mapping defined in (28.16) with the a.o.p. of the generalized random variable Γ as defined in (28.15).

Theorem 28.7 Let Γ, Γ_1, Γ_2 ∈ R and x ∈ D. Then

ρ_{Γ_1⊗Γ_2} = ρ_{Γ_1} ⊗ ρ_{Γ_2},
ρ_{Γ^{⇒x}} = ρ_Γ^{⇒x},

where ρ_Γ is defined by (28.16), and the operations of combination and focusing on the right hand sides are those of the information algebra Υ of a.o.p.s.

Proof. We have to show that

ρ_{Γ_1⊗Γ_2}(φ) = (ρ_{Γ_1} ⊗ ρ_{Γ_2})(φ),
ρ_{Γ^{⇒x}}(φ) = (ρ_Γ^{⇒x})(φ),

for all φ ∈ I_Φ.

(1) We noted above that the mapping Γ ↦ ρ_Γ is continuous. Therefore, by well-known results about continuous mappings (see Part I, Lemma 25 (2) in Section 5.2),

ρ_{Γ_1⊗Γ_2} = ρ_{∨{∆_1⊗∆_2 : ∆_1∈Γ_1, ∆_2∈Γ_2}} = ∨{ρ_{∆_1⊗∆_2} : ∆_1 ∈ Γ_1, ∆_2 ∈ Γ_2}.


On the other hand, for every φ ∈ I_Φ, we obtain, using the associative and distributive laws of Boolean algebras,

(ρ_{Γ_1} ⊗ ρ_{Γ_2})(φ)
= ∨{ρ_{Γ_1}(φ_1) ∧ ρ_{Γ_2}(φ_2) : φ ≤ φ_1 ⊗ φ_2}
= ∨{(∨{ρ_{∆_1}(φ_1) : ∆_1 ∈ Γ_1}) ∧ (∨{ρ_{∆_2}(φ_2) : ∆_2 ∈ Γ_2}) : φ ≤ φ_1 ⊗ φ_2}
= ∨{ρ_{∆_1}(φ_1) ∧ ρ_{∆_2}(φ_2) : ∆_1 ∈ Γ_1, ∆_2 ∈ Γ_2, φ ≤ φ_1 ⊗ φ_2}
= ∨{∨{ρ_{∆_1}(φ_1) ∧ ρ_{∆_2}(φ_2) : φ ≤ φ_1 ⊗ φ_2} : ∆_1 ∈ Γ_1, ∆_2 ∈ Γ_2}
= ∨{(ρ_{∆_1} ⊗ ρ_{∆_2})(φ) : ∆_1 ∈ Γ_1, ∆_2 ∈ Γ_2}
= ∨{ρ_{∆_1⊗∆_2}(φ) : ∆_1 ∈ Γ_1, ∆_2 ∈ Γ_2}.

The last equality follows from Theorem 28.5 (28.12). This proves that ρ_{Γ_1⊗Γ_2} = ρ_{Γ_1} ⊗ ρ_{Γ_2}.

(2) Again by continuity, we obtain from the definition of focusing of generalized random variables

ρ_{Γ^{⇒x}} = ρ_{∨{∆^{⇒x} : ∆∈Γ}} = ∨{ρ_{∆^{⇒x}} : ∆ ∈ Γ}.

But we have also, by Theorem 28.5 (28.13),

ρ_Γ^{⇒x}(φ) = ∨{ρ_Γ(ψ) : φ ≤ ψ = ψ^{⇒x}}
= ∨{∨{ρ_∆(ψ) : ∆ ∈ Γ} : φ ≤ ψ = ψ^{⇒x}}
= ∨{∨{ρ_∆(ψ) : φ ≤ ψ = ψ^{⇒x}} : ∆ ∈ Γ}
= ∨{(ρ_∆^{⇒x})(φ) : ∆ ∈ Γ}
= ∨{ρ_{∆^{⇒x}}(φ) : ∆ ∈ Γ}. (28.17)

This proves that ρ_{Γ^{⇒x}} = ρ_Γ^{⇒x}. □

Now we claim that the a.o.p. ρ_Γ as defined by (28.16) coincides, for generalized random variables, with the a.o.p. ρ_0 ∘ s_Γ defined in Section 27.1 for random mappings.

Theorem 28.8 For all φ ∈ Φ and for all generalized random variables Γ ∈ R, it holds that

ρ_Γ(φ) = ρ_0(s_Γ(φ)).

Further, for all φ such that s_Γ(φ) is measurable,

ρ_Γ(φ) = [s_Γ(φ)].


Proof. Fix an element φ ∈ Φ and consider a measurable subset A ⊆ s_Γ(φ). We define a simple random variable

∆(ω) = φ if ω ∈ A, and ∆(ω) = e otherwise.

Then certainly ∆(ω) ≤ Γ(ω) for all ω ∈ Ω, hence ∆ ≤ Γ. Furthermore we have ρ_∆(φ) = [A]. This implies that

∨{ρ_∆(φ) : ∆ ≤ Γ} ≥ ∨{[A] : A ⊆ s_Γ(φ), A ∈ A} = ρ_0(s_Γ(φ)).

Conversely, for all ∆ ≤ Γ it holds that s_∆(φ) ⊆ s_Γ(φ) and that s_∆(φ) ∈ A. Therefore, we conclude that

∨{ρ_∆(φ) : ∆ ≤ Γ} ≤ ∨{[A] : A ⊆ s_Γ(φ), A ∈ A} = ρ_0(s_Γ(φ)).

This proves that ρ_Γ(φ) = ρ_0(s_Γ(φ)). If, finally, s_Γ(φ) ∈ A, then ρ_Γ(φ) = [s_Γ(φ)]. This proves the second part of the theorem. □

So, Theorem 28.8 shows that ρ_Γ is an extension of the a.o.p. defined on the measurable elements of Φ. The identity of the two extensions ρ_Γ and ρ_0 ∘ s_Γ on Φ is a further justification for accepting ρ_Γ as the a.o.p. associated with the generalized random variable Γ. In Section 29.3 we further discuss the problem of extending a.o.p.s.

28.3 Labeled Allocations of Probability

To any domain-free information algebra (Φ, D) a corresponding labeled algebra can be associated by considering labeled pieces of information (φ, x), where x is a support of φ (see Part I, Section 3.6). This can also be done with respect to the information algebra (A_Φ, D) of allocations. Any allocation ρ can be labeled by a support x of it, provided it has a support. Alternatively, the restriction ρ_x of ρ to the domain x can be considered. It will be shown in this section that this latter approach leads in a natural way to an algebra isomorphic to the labeled information algebra associated with (A_Φ, D).

To start with, we consider a labeled information algebra (Φ, D). As usual, Φ_x denotes the set of valuations with domain x. Consider now an allocation ρ : Φ_x → B such that

(A1) ρ(e_x) = ⊤,

(A2) for φ, ψ ∈ Φ_x, ρ(φ ⊗ ψ) = ρ(φ) ∧ ρ(ψ).

Here (B, µ) is as usual a probability algebra. Such a mapping will be called a labeled allocation with domain x. We define its domain x as its label d(ρ). We note that, as before, for φ, ψ ∈ Φ_x, φ ≤ ψ implies ρ(φ) ≥ ρ(ψ).


As with the former domain-free allocations, we define some operations with labeled allocations. First, the labeled allocation ν_x defined by

ν_x(φ) = ⊥ for all φ ∈ Φ_x, φ ≠ e_x,
ν_x(e_x) = ⊤ (28.18)

is called the vacuous allocation on domain x. Next, if ρ_x and ρ_y are two labeled allocations on domains x and y respectively, then we define their combination ρ in analogy to (28.2): for φ ∈ Φ_{x∨y},

ρ(φ) = ∨{ρ_x(φ_x) ∧ ρ_y(φ_y) : d(φ_x) = x, d(φ_y) = y, φ ≤ φ_x ⊗ φ_y}. (28.19)

Note that here, as well as in the sequel, the partial order of labeled information within a domain is used. We denote this mapping also as a product, ρ = ρ_x ⊗ ρ_y, and write ρ(φ) = (ρ_x ⊗ ρ_y)(φ).

Furthermore, if ρ is a labeled allocation with domain x and y is a domain with y ≤ x, then a mapping ρ^{↓y} is defined by

ρ^{↓y}(φ) = ρ(φ^{↑x}) for all φ ∈ Φ_y. (28.20)

Here φ^{↑x} = φ ⊗ e_x is the vacuous extension of φ to domain x. The element ρ^{↓y} is called the projection of ρ to y.

The following theorem confirms that ρ_x ⊗ ρ_y and ρ^{↓y} are indeed again labeled allocations.

Theorem 28.9 ρ_x ⊗ ρ_y is a labeled allocation with domain x ∨ y, and ρ^{↓y} is a labeled allocation with domain y.

Proof. The first statement is proved exactly as Theorem 28.1, noting that d(φ_x ⊗ φ_y) = x ∨ y.

As for the second statement, note first that ρ^{↓y}(e_y) = ρ(e_y^{↑x}) = ρ(e_x) = ⊤. This proves property (A1) of a labeled allocation. Furthermore, for φ, ψ ∈ Φ_y, we obtain

ρ^{↓y}(φ ⊗ ψ) = ρ((φ ⊗ ψ)^{↑x}) = ρ(φ^{↑x} ⊗ ψ^{↑x}) = ρ(φ^{↑x}) ∧ ρ(ψ^{↑x}) = ρ^{↓y}(φ) ∧ ρ^{↓y}(ψ).

This proves property (A2) for labeled allocations. □

As an inverse to the operation of projection, which restricts a labeled allocation to a smaller domain, we may also define the operation of extension of a labeled allocation to a finer domain. Formally, if ρ is a labeled allocation with domain x and y is a domain such that x ≤ y, then we define

ρ^{↑y} = ρ ⊗ ν_y.

This gives then, according to (28.19), for a φ ∈ Φ_y,

ρ^{↑y}(φ) = ∨{ρ(ψ) : ψ ∈ Φ_x, φ ≤ ψ^{↑y}}.


This makes sense: the support allocated to a φ in Φ_y by a labeled allocation with domain x ≤ y is the supremum of all supports allocated to pieces of information ψ in domain x which imply φ. By Theorem 28.9 above, ρ^{↑y} is a labeled allocation with domain y; it is called the vacuous extension of ρ to y.
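A small sketch (illustrative; all names are ours) of the projection (28.20) in a two-variable labeled subset model, where the vacuous extension φ^{↑x} is the cylindrical extension:

```python
from itertools import product

FRAMES = {"u": (0, 1), "v": (0, 1)}

def configs(vars_):
    """All configurations of the given variables, as dicts."""
    return [dict(zip(vars_, t)) for t in product(*(FRAMES[v] for v in vars_))]

def extend(phi, vars_from, vars_to):
    """Vacuous extension phi^{up x}: all configurations of vars_to whose
    restriction to vars_from lies in phi (tuples ordered like vars_*)."""
    return frozenset(
        tuple(c[v] for v in vars_to)
        for c in configs(vars_to)
        if tuple(c[v] for v in vars_from) in phi
    )

X, Y = ("u", "v"), ("u",)
OMEGA = (1, 2)
GAMMA = {1: frozenset({(0, 0)}), 2: frozenset({(0, 1), (1, 1)})}

def rho(phi):
    """Labeled allocation with domain X, induced by GAMMA."""
    return frozenset(w for w in OMEGA if GAMMA[w] <= phi)

def rho_down(phi_y):
    """Projection (28.20): rho^{down y}(phi) = rho(phi^{up x})."""
    return rho(extend(phi_y, Y, X))

print(rho_down(frozenset({(0,)})))   # assumptions supporting "u = 0": {1}
```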

The next theorems show that the labeled allocations themselves form a labeled information algebra.

Theorem 28.10 Let (Φ, D) be a labeled information algebra. Then the following hold:

1. Semigroup: Combination of labeled allocations of probability is associative and commutative.

2. Labeling: For two labeled allocations ρ_1, ρ_2 on (Φ, D) we have d(ρ_1 ⊗ ρ_2) = d(ρ_1) ∨ d(ρ_2), and for all ρ ∈ A_Φ and all x ∈ D with x ≤ d(ρ) we have d(ρ^{↓x}) = x.

3. Neutrality: For all x ∈ D there exists an element ν_x such that ρ ⊗ ν_x = ρ if d(ρ) = x, and for all y ∈ D with y ≤ x we have ν_x^{↓y} = ν_y.

4. Nullity: For all x ∈ D there exists an element χ_x such that ρ ⊗ χ_x = χ_x if d(ρ) = x, and for all y ∈ D with y ≥ x we have χ_x ⊗ ν_y = χ_y.

5. Projection: For a labeled allocation ρ and domains x, y ∈ D such that x ≤ y ≤ d(ρ) we have (ρ^{↓y})^{↓x} = ρ^{↓x}.

6. Combination: For two labeled allocations ρ_x, ρ_y with d(ρ_x) = x, d(ρ_y) = y we have (ρ_x ⊗ ρ_y)^{↓x} = ρ_x ⊗ ρ_y^{↓x∧y}.

7. Idempotency: For a labeled allocation ρ and x ∈ D with x ≤ d(ρ) we have ρ ⊗ ρ^{↓x} = ρ.

Proof. (1) Commutativity of combination is evident. Associativity is proved as in Theorem 28.2.

(2) follows from the definitions of combination and projection.

(3) If ν_x is the vacuous allocation on domain x and ρ a labeled allocation with domain x, then for any φ ∈ Φ_x,

(ρ ⊗ ν_x)(φ) = ∨{ρ(ψ) : d(ψ) = x, φ ≤ ψ} = ρ(φ),

since ν_x(ψ) = ⊥ unless ψ = e_x, in which case ν_x(e_x) = ⊤. Further, for φ ∈ Φ_y, we have ν_x^{↓y}(φ) = ν_x(φ^{↑x}). Since φ^{↑x} = e_x if, and only if, φ = e_y, this shows that ν_x^{↓y} = ν_y.

(4) Define χ_x(φ) = ⊤ for all φ ∈ Φ_x. Then (ρ ⊗ χ_x)(φ) = ⊤ for all φ ∈ Φ_x. This proves that ρ ⊗ χ_x = χ_x. Similarly, we have (χ_x ⊗ ν_y)(φ) = ⊤ for all φ ∈ Φ_y, hence χ_x ⊗ ν_y = χ_y.


(5) For a φ ∈ Φ_x we have, by definition, if d(ρ) = z,

(ρ^{↓y})^{↓x}(φ) = ρ^{↓y}(φ^{↑y}) = ρ((φ^{↑y})^{↑z}) = ρ(φ^{↑z}) = ρ^{↓x}(φ).

(6) For a φ ∈ Φ_x we have

(ρ_x ⊗ ρ_y)^{↓x}(φ) = (ρ_x ⊗ ρ_y)(φ^{↑x∨y})
= ∨{ρ_x(φ_x) ∧ ρ_y(φ_y) : d(φ_x) = x, d(φ_y) = y, φ^{↑x∨y} ≤ φ_x ⊗ φ_y}.

On the other hand,

(ρ_x ⊗ ρ_y^{↓x∧y})(φ)
= ∨{ρ_x(φ_x) ∧ ρ_y^{↓x∧y}(φ_{x∧y}) : d(φ_x) = x, d(φ_{x∧y}) = x ∧ y, φ ≤ φ_x ⊗ φ_{x∧y}}
= ∨{ρ_x(φ_x) ∧ ρ_y(φ_{x∧y}^{↑y}) : d(φ_x) = x, d(φ_{x∧y}) = x ∧ y, φ ≤ φ_x ⊗ φ_{x∧y}}.

Now, for φ ∈ Φ_x, φ^{↑x∨y} ≤ φ_x ⊗ φ_y implies φ ≤ (φ_x ⊗ φ_y)^{↓x} = φ_x ⊗ φ_y^{↓x∧y}, and, for φ_y ∈ Φ_y, (φ_y^{↓x∧y})^{↑y} ≤ φ_y, hence ρ_y((φ_y^{↓x∧y})^{↑y}) ≥ ρ_y(φ_y). It follows that (ρ_x ⊗ ρ_y)^{↓x}(φ) ≤ (ρ_x ⊗ ρ_y^{↓x∧y})(φ). But φ ≤ φ_x ⊗ φ_{x∧y} implies that φ^{↑x∨y} = φ ⊗ e_{x∨y} ≤ φ_x ⊗ φ_{x∧y} ⊗ e_{x∨y} = φ_x ⊗ φ_{x∧y}^{↑y}. This shows then that (ρ_x ⊗ ρ_y)^{↓x}(φ) ≥ (ρ_x ⊗ ρ_y^{↓x∧y})(φ), and thus finally (ρ_x ⊗ ρ_y)^{↓x}(φ) = (ρ_x ⊗ ρ_y^{↓x∧y})(φ).

(7) Let d(ρ) = y and φ ∈ Φ_y. Then we have

(ρ ⊗ ρ^{↓x})(φ) = ∨{ρ(φ_y) ∧ ρ^{↓x}(φ_x) : d(φ_y) = y, d(φ_x) = x, φ ≤ φ_y ⊗ φ_x}
= ∨{ρ(φ_y) ∧ ρ(φ_x^{↑y}) : d(φ_y) = y, d(φ_x) = x, φ ≤ φ_y ⊗ φ_x}
= ∨{ρ(φ_y ⊗ φ_x^{↑y}) : d(φ_y) = y, d(φ_x) = x, φ ≤ φ_y ⊗ φ_x}
= ∨{ρ(φ_y ⊗ φ_x^{↑y}) : d(φ_y) = y, d(φ_x) = x, φ ≤ φ_y ⊗ φ_x^{↑y}} ≤ ρ(φ).

But we have also φ = φ ⊗ φ^{↓x} = φ ⊗ (φ^{↓x})^{↑y}, and therefore ρ(φ) ≤ (ρ ⊗ ρ^{↓x})(φ), which proves that ρ(φ) = (ρ ⊗ ρ^{↓x})(φ). □

It has been noted in Part I that to any domain-free information algebra (Φ, D) a labeled information algebra (Φ*, D) with

Φ* = {(φ, x) : φ ∈ Φ, x ∈ D a support of φ}

can be associated, such that d(φ′) = x if φ′ = (φ, x), and combination and projection are defined by

(φ_1, x) ⊗ (φ_2, y) = (φ_1 ⊗ φ_2, x ∨ y),
(φ, x)^{↓y} = (φ^{⇒y}, y) for y ≤ x. (28.21)

The same construction applies in particular to the information algebra (A_Φ, D) of allocations. Let thus

A*_Φ = {(ρ, x) : ρ ∈ A_Φ, x ∈ D a support of ρ}.


Then combination and projection can be defined as in (28.21), and (A*_Φ, D) becomes in this way a labeled information algebra. Only allocations ρ which have a support are in A*_Φ.

Note however that the elements (ρ, x) are not labeled allocations in the sense defined before. But a labeled allocation ρ_x, defined by ρ_x(φ′) = ρ(φ) if φ′ = (φ, x) ∈ Φ*_x, can be associated with it. This defines in fact an isomorphism between the two labeled information algebras. That is what the following theorem says.

Theorem 28.11 1. If (ρ^{(1)}, x) and (ρ^{(2)}, y) belong to (A*_Φ, D), then

ρ^{(1)}_x ⊗ ρ^{(2)}_y = (ρ^{(1)} ⊗ ρ^{(2)})_{x∨y}. (28.22)

2. If ν is the vacuous allocation in (A_Φ, D), then ν_x is the vacuous allocation on Φ_x for all x ∈ D.

3. If (ρ, x) belongs to (A*_Φ, D) and y ≤ x, y ∈ D, then

ρ_x^{↓y} = (ρ^{⇒y})_y. (28.23)

Proof. (1) Let φ′ = (φ, x ∨ y) belong to Φ*_{x∨y} and consider

(ρ^{(1)}_x ⊗ ρ^{(2)}_y)(φ′)
= ∨{ρ^{(1)}_x(φ′_x) ∧ ρ^{(2)}_y(φ′_y) : φ′_x = (φ_x, x), φ′_y = (φ_y, y), φ′ ≤ φ′_x ⊗ φ′_y}
= ∨{ρ^{(1)}(φ_x) ∧ ρ^{(2)}(φ_y) : φ_x = φ_x^{⇒x}, φ_y = φ_y^{⇒y}, φ ≤ φ_x ⊗ φ_y}.

By assumption, x is a support of ρ^{(1)}, that is,

ρ^{(1)}(φ_x) = (ρ^{(1)})^{⇒x}(φ_x) = ∨{ρ^{(1)}(ψ_x) : φ_x ≤ ψ_x = ψ_x^{⇒x}},

and similarly for ρ^{(2)}, which has y as a support. This implies then

(ρ^{(1)} ⊗ ρ^{(2)})_{x∨y}(φ′) = (ρ^{(1)} ⊗ ρ^{(2)})(φ)
= ∨{ρ^{(1)}(φ_x) ∧ ρ^{(2)}(φ_y) : φ ≤ φ_x ⊗ φ_y}
= ∨{(∨{ρ^{(1)}(ψ_x) : φ_x ≤ ψ_x = ψ_x^{⇒x}}) ∧ (∨{ρ^{(2)}(ψ_y) : φ_y ≤ ψ_y = ψ_y^{⇒y}}) : φ ≤ φ_x ⊗ φ_y}
= ∨{ρ^{(1)}(ψ_x) ∧ ρ^{(2)}(ψ_y) : φ ≤ ψ_x ⊗ ψ_y, ψ_x = ψ_x^{⇒x}, ψ_y = ψ_y^{⇒y}}
= (ρ^{(1)}_x ⊗ ρ^{(2)}_y)(φ′).

(2) We have, for φ′ = (φ, x), that ν_x(φ′) = ν(φ) = ⊥ if φ ≠ e. And we have ν_x((e, x)) = ν(e) = ⊤.


(3) Let ρ = ρ^{⇒x}, y ≤ x and φ′ = (φ, y). We have to prove that ρ_x^{↓y}(φ′) = (ρ^{⇒y})_y(φ′). First,

(ρ^{⇒y})_y(φ′) = ρ^{⇒y}(φ) = ρ(φ),

since φ = φ^{⇒y}. On the other hand, for ψ′ = (ψ, x), we have ρ_x(ψ′) = ρ(ψ). We obtain therefore

ρ_x^{↓y}(φ′) = ρ_x(φ′^{↑x}) = ρ(φ) = ρ^{⇒y}(φ), (28.24)

since y ≤ x implies that φ = φ^{⇒x}. Thus we have in fact (ρ^{⇒y})_y = ρ_x^{↓y}. □

This theorem shows that we have two essentially equivalent ways to construct labeled information algebras of allocations from algebras of allocations.


Chapter 29

Support Functions

29.1 Characterization

As we have noted before, we may consider a random mapping Γ as information available to us: Γ(ω) is a certain "information" which can be asserted, provided ω is the sample element chosen by a chance process, or the "correct" but unknown assumption in a set of possible assumptions Ω. Here, the information Γ(ω) is to be interpreted, in the sense of random mappings, as an ideal in a domain-free information algebra (Φ, D), as explained in Section 27.1. We have then in Section 27.2 (27.7) defined the allocation of support s_Γ(φ) of a random mapping to an element φ ∈ Φ as the set of elements ω ∈ Ω which imply φ, i.e. such that φ belongs to the ideal Γ(ω): φ ∈ Γ(ω) or φ ≤ Γ(ω). Then any ω ∈ s_Γ(φ) is an assumption, i.e. an argument, which permits to infer the piece of information φ in the light of the random mapping Γ. So the larger the set s_Γ(φ), the more arguments are present to support φ. Or, more to the point: the more probable, that is, the more likely it is that the correct assumption ω belongs to s_Γ(φ), the stronger the hypothesis of information φ is supported. We refer back to Chapter 25 for a discussion of this point of view. The allocation of support can be seen as a mapping s_Γ : Φ → P(Ω) from the information algebra, or from its ideal completion, s_Γ : I_Φ → P(Ω), into the power set of the sample space. This mapping has some interesting properties, as stated in the next theorem.

Before we formulate the theorem, let us outline some special cases of information algebras and also of random mappings. The algebra (Φ, D) may be a σ-algebra and the ideals Γ(ω) may be σ-ideals, i.e. closed under countable combination, for all ω ∈ Ω. Then Γ is called a σ-random mapping. Further, Φ may be a complete semi-lattice. If, in this case, Γ(ω) is a principal ideal for each ω ∈ Ω, then Γ is called a complete random mapping. For a complete random mapping we have that ∨Γ(ω) ∈ Φ. Note that the ideal completion I_Φ of a complete algebra Φ is still a proper superset of Φ; not all ideals in Φ are principal ideals.


We do not exclude that Γ(ω) = z for some ω. This represents improper information, which can be interpreted as contradictory information. Under semantic aspects such improper information could and should be excluded. We refer to Chapter 25 for a discussion in the context of hints. If Γ(ω) ≠ z for all ω, the random mapping is called normalized.

Theorem 29.1 Let Γ ∈ R be a generalized random variable with values in an information algebra (Φ, D). Then:

1. s_Γ(e) = Ω,

2. s_Γ(z) = ∅, if Γ is normalized,

3. If ∨_{i∈I} φ_i exists in Φ, then

s_Γ(∨_{i∈I} φ_i) ⊆ ⋂_{i∈I} s_Γ(φ_i),

4. If I is finite, or if Φ is a σ-algebra, I is countable and Γ is a σ-random mapping, or if Φ is a complete semi-lattice and Γ is a complete random mapping, then

s_Γ(∨_{i∈I} φ_i) = ⋂_{i∈I} s_Γ(φ_i). (29.1)

Proof. (1) The first identity follows since e is the least element in Φ (or I_Φ).

(2) follows from the definition of a normalized variable.

(3) Suppose ∨_{i∈I} φ_i exists in Φ; then φ_i ≤ ∨_{i∈I} φ_i ∈ Γ(ω) implies φ_i ∈ Γ(ω), because the latter is an ideal in Φ. It follows therefore that ω ∈ s_Γ(∨_{i∈I} φ_i) implies ω ∈ s_Γ(φ_i) for all i ∈ I, hence ω ∈ ⋂_{i∈I} s_Γ(φ_i).

(4) Consider ω ∈ s_Γ(φ_i) for all i ∈ I. This means that φ_i ∈ Γ(ω). If I is finite, then ∨_{i∈I} φ_i ∈ Γ(ω) too, since Γ(ω) is an ideal; hence ω ∈ s_Γ(∨_{i∈I} φ_i). Together with (3) just proved, this shows that (29.1) holds. Similarly, if Φ is a σ-algebra and I is countable, then ∨_{i∈I} φ_i exists, and if Γ is a σ-random mapping, then ∨_{i∈I} φ_i ∈ Γ(ω) too. Finally, if Φ is complete and Γ a complete random mapping, then again ∨_{i∈I} φ_i exists and ∨_{i∈I} φ_i ∈ Γ(ω). Thus (29.1) holds in these cases too. □

The degree of support of a piece of information φ ∈ Φ can thus be measured by the probability P(s_Γ(φ)) of the support allocated to φ, provided this probability is defined. An element φ ∈ Φ is called measurable relative to Γ (or Γ-measurable, or also simply measurable if the context is clear) if s_Γ(φ) ∈ A. Let us formally define the degree of support

sp_Γ(φ) = P(s_Γ(φ)),


for all measurable φ ∈ Φ. Let E_Γ be the subset of Γ-measurable elements of Φ. As the following lemma shows, the set of Γ-measurable elements is closed under combination, or even under countable combination if Φ is a σ-algebra and Γ a σ-random mapping.

Lemma 29.1 For any random mapping Γ, e ∈ E_Γ, and z ∈ E_Γ if Γ is normalized; and if φ, ψ ∈ E_Γ, then φ ⊗ ψ ∈ E_Γ. If Φ is a σ-algebra and Γ a σ-random mapping, then φ_1, φ_2, … ∈ E_Γ implies ∨_{i=1}^∞ φ_i ∈ E_Γ.

Proof. By Theorem 29.1 (4),

s_Γ(φ ⊗ ψ) = s_Γ(φ ∨ ψ) = s_Γ(φ) ∩ s_Γ(ψ).

Thus s_Γ(φ), s_Γ(ψ) ∈ A implies s_Γ(φ ⊗ ψ) ∈ A. Similarly, if Φ is a σ-algebra and Γ a σ-random mapping, then by Theorem 29.1 (4),

s_Γ(∨_{i=1}^∞ φ_i) = ⋂_{i=1}^∞ s_Γ(φ_i).

Hence s_Γ(φ_i) ∈ A for all i implies s_Γ(∨_{i=1}^∞ φ_i) ∈ A. □

As a corollary, it follows from Theorem 29.1 that E_Γ is σ-closed for proper random variables.

The degree of support can be seen as a mapping sp_Γ : E_Γ → [0, 1]. This mapping has the following fundamental properties:

Theorem 29.2 Let Γ ∈ R. Then the mapping sp_Γ has the following properties:

1. The neutral element e belongs to E_Γ and sp_Γ(e) = 1. If Γ is normalized, z belongs to E_Γ and sp_Γ(z) = 0.

2. Assume φ_1, …, φ_n ≥ φ, n ≥ 1, and φ_1, …, φ_n, φ ∈ E_Γ. Then

sp_Γ(φ) ≥ Σ_{∅≠I⊆{1,…,n}} (−1)^{|I|+1} sp_Γ(⊗_{i∈I} φ_i). (29.2)

3. If Φ is a σ-algebra and Γ a σ-random mapping, then, if φ_1 ≤ φ_2 ≤ … ∈ E_Γ is a monotone sequence,

sp_Γ(∨_{i=1}^∞ φ_i) = lim_{i→∞} sp_Γ(φ_i). (29.3)

Proof. (1) follows directly from Theorem 29.1 (1) and (2).

(2) On the right hand side of (29.2) we have, by the inclusion-exclusion formula of probability theory,

Σ_{∅≠I⊆{1,…,n}} (−1)^{|I|+1} P(⋂_{i∈I} s_Γ(φ_i)) = P(⋃_{i=1}^n s_Γ(φ_i)).


But φ ≤ φ_1, …, φ_n implies s_Γ(φ) ⊇ s_Γ(φ_i), hence

s_Γ(φ) ⊇ ⋃_{i=1}^n s_Γ(φ_i).

This proves (29.2).

(3) By Theorem 29.1 (4) we have

s_Γ(∨_{i=1}^∞ φ_i) = ⋂_{i=1}^∞ s_Γ(φ_i).

Further we have that s_Γ(φ_1) ⊇ s_Γ(φ_2) ⊇ ⋯. Then, by the continuity of probability, we have

P(s_Γ(∨_{i=1}^∞ φ_i)) = P(⋂_{i=1}^∞ s_Γ(φ_i)) = lim_{i→∞} P(s_Γ(φ_i)).

This proves (29.3). □

According to Lemma 29.1 the measurable elements E_Γ form a sub-semi-lattice of Φ. A function on such a semi-lattice satisfying (2) of Theorem 29.2 above is called monotone of infinite order. If it satisfies in addition property (3), then it is called continuous. We call a function sp defined on a semi-lattice E satisfying properties (1) and (2) of Theorem 29.2 a support function on E. If, in addition, sp(z) = 0, then the support function is called normalized. Support functions were studied by (Shafer, 1976) under the name of belief functions, on subset families closed under intersection. The presentation here is a generalization of the theory of hints (Kohlas & Monney, 1995), as introduced in Chapter 25. Hints were motivated by Dempster's multivalued mappings (Dempster, 1967), but with the interpretation given here as uncertain information, allowing different, uncertain interpretations. As the discussion above and the following show, information algebras seem to provide the natural framework to develop the theory of hints.

We are next going to examine the support functions of random variables. It turns out, and this is the main justification for considering the elements of σ(R_s) as proper random variables, that all elements of Φ are measurable.

Theorem 29.3 Let Γ ∈ σ(R_s) be a random variable with values in the compact information algebra (I_Φ, Φ, D). Then Φ ⊆ E_Γ, where E_Γ is the set of Γ-measurable elements of I_Φ.

Proof. Let φ ∈ Φ and Γ = ∨_{i=1}^∞ Γ_i, where the Γ_i are simple random variables. Note that Γ = ∨_{i=1}^∞ ∆_i, where ∆_i = ∨_{j=1}^i Γ_j. All ∆_i are simple random variables and for all ω the sequence ∆_i(ω) is monotone. Then

s_Γ(φ) = {ω ∈ Ω : φ ≤ ∨_{i=1}^∞ ∆_i(ω)}.

Fix an ω ∈ s_Γ(φ). The subset {∆_i(ω) : i = 1, 2, …} of Φ is directed. Hence, by compactness of the information algebra (I_Φ, Φ, D), there is a ∆_j(ω) such that φ ≤ ∆_j(ω), hence ω ∈ s_{∆_j}(φ). This shows that

s_Γ(φ) ⊆ ⋃_{i=1}^∞ s_{∆_i}(φ).


The converse inclusion is evident. Hence s_Γ(φ) = ⋃_{i=1}^∞ s_{∆_i}(φ). But s_{∆_i}(φ) ∈ A for all i, since the ∆_i are simple random variables. Therefore, we conclude that s_Γ(φ) ∈ A. □

In fact, we may, as a corollary, extend this result to the σ-closure of Φ.

Corollary 29.1 Let Γ ∈ σ(R_s) be a random variable with values in the compact information algebra (I_Φ, Φ, D). Then σ(Φ) ⊆ E_Γ, where E_Γ is the set of Γ-measurable elements of I_Φ.

Proof. Assume φ ∈ σ(Φ). Then, by Theorem 27.2, φ = ∨_{i=1}^∞ ψ_i with ψ_i ∈ Φ. It follows that

s_Γ(φ) = {ω : ∨_{i=1}^∞ ψ_i ≤ Γ(ω)} = ⋂_{i=1}^∞ {ω : ψ_i ≤ Γ(ω)} = ⋂_{i=1}^∞ s_Γ(ψ_i) ∈ A,

since, by Theorem 29.3, s_Γ(ψ_i) ∈ A. □

So in these cases the support function sp_Γ is defined on the whole of Φ, and even on σ(Φ). In the general case, extensions of the support function from E_Γ to Φ are discussed in Section 29.3.

29.2 Generating Support Functions

In the previous section we have seen that random mappings induce support functions on some sub-semi-lattices of an information algebra. The question arises whether any support function sp on some semi-lattice E is generated by some random mapping. The answer is affirmative (Shafer, 1979; Kohlas & Monney, 1995). It follows from the Krein-Milman theorem. The Krein-Milman theorem states that a compact convex subset S of a locally convex space is the closed convex hull of its extreme points.

We are first going to reformulate the Krein-Milman theorem in a form adapted to our needs, i.e. in the form of an integral representation theorem. Let V be a locally convex space, S a nonempty compact subset of V, and P a probability measure on S, i.e. a nonnegative measure on the Borel sets of V with P(S) = 1. Then a point v ∈ V is said to be represented by P if, for every continuous linear function f : V → R,

f(v) = ∫_S f(x) dP(x).

Sometimes it is also said that in this case v is the barycenter or the resultant of P. We cite the following useful lemma:

Lemma 29.2 Suppose that C is a compact subset of a locally convex space V. A point v ∈ V is in the closed convex hull H of C if, and only if, there exists a probability measure P on C which represents v.


For the proof, which depends on the Riesz theorem, we refer to the literature. If P is a probability measure on a compact Hausdorff space V and B is a Borel subset of V, then P is said to be supported by B if P(V \ B) = 0.

Now, with the help of this lemma we reformulate the Krein-Milman theorem as follows:

Theorem 29.4 Every point of a compact convex subset S of a locally convex space V is the barycenter of a probability measure on S which is supported by the closure of the extreme points of S.

Proof. Suppose v ∈ S, and let T be the closure of the extreme points of S; then, by the Krein-Milman theorem, v is in the closed convex hull H of T. By Lemma 29.2 above, then, v is the barycenter of a probability measure P on T. If we extend this measure P in the obvious way to S, we get the desired result. □

We remark that Theorem 29.4 conversely implies the Krein-Milman theorem too. But we only need the one direction. Thus, this version of the Krein-Milman theorem is used to prove the following theorem, which states that any support function in an information algebra is generated by some random mapping.

Theorem 29.5 Let (Φ, D) be a domain-free information algebra and E ⊆ Φ a sub-semi-lattice of Φ. If s is a support function on E satisfying (1) and (2) of Theorem 29.2, then there exists a random mapping Γ in Φ such that s_Γ = s. If Φ is a σ-algebra, E ⊂ Φ is closed under countable combination, and s is a support function on E which satisfies also (3) of Theorem 29.2, then there exists a σ-random mapping Γ in Φ such that s_Γ = s.

Proof. We start by defining the vector space V of functionals f : E → R, where addition and scalar multiplication are defined by (f + g)(φ) = f(φ) + g(φ) and (λf)(φ) = λf(φ) respectively. Define p_φ(f) = |f(φ)| for any φ ∈ E. Clearly p_φ satisfies the following conditions:

1. pφ is positive semidefinite, i.e. pφ(f) ≥ 0 for any f ∈ V ,

2. pφ is positive homogeneous, i.e. pφ(λf) = |λ|pφ(f),

3. pφ satisfies the triangle inequality, i.e. pφ(f + g) ≤ pφ(f) + pφ(g).

This means that {p_φ : φ ∈ E} is a family of seminorms. This implies that V is a locally convex topological vector space, where convergence of sequences f_i is defined pointwise: lim_{i→∞} f_i = f means that lim_{i→∞} f_i(φ) = f(φ) in R for every φ ∈ E. The set S of support functions s on E satisfying (1) and (2) of Theorem 29.2 is convex in V. Further, S is closed, since the limit of a sequence of support functions is a support function. Finally, the sets


S[φ] = {f(φ) : f ∈ S} are bounded, hence their closures S[φ]⁻ are compact for all φ ∈ E. Since

S ⊆ Π{S[φ]⁻ : φ ∈ E}

and the product of the compact sets S[φ]⁻ is compact in V by Tychonoff's theorem, it follows that S is compact.

Therefore, by Theorem 29.4, any support function s ∈ S is the barycenter of a probability measure P_s which is supported by the closure of the extreme points of S. What are the extreme points of S? By a deep theorem of Choquet (Choquet, 1953–1954) concerning functions monotone of order ∞ on ordered semi-groups, the extreme points of S are exactly the exponentials on E, i.e. functions e : E → R such that e(φ ⊗ ψ) = e(φ) · e(ψ) for all φ, ψ ∈ E. In the case of a semi-lattice E it follows from φ = φ ⊗ φ that e(φ) = e(φ) · e(φ), hence e(φ) = 0 or e(φ) = 1. Let then I = {φ ∈ E : e(φ) = 1}. Then ψ ≤ φ ∈ I implies 1 = e(φ) = e(ψ) · e(φ), hence e(ψ) = 1, or ψ ∈ I too. Further, let φ, ψ ∈ I; then e(φ ⊗ ψ) = e(φ) · e(ψ) = 1, therefore φ ⊗ ψ ∈ I. In other words, I is an ideal in E. Conversely, assume I ⊆ E to be an ideal in E and define e(φ) = 1 for φ ∈ I, and e(φ) = 0 otherwise. Assume φ, ψ ∈ I, such that φ ⊗ ψ ∈ I too. Then e(φ ⊗ ψ) = 1 = e(φ) · e(ψ). Otherwise, if φ ∉ I, then φ ⊗ ψ ∉ I too. Hence, again, e(φ ⊗ ψ) = 0 = e(φ) · e(ψ). So the function e is an exponential, hence an extreme point of S. This shows that the set of extreme points of S is in one-to-one correspondence with the ideals of E. In other words, the indicator functions i_I of the ideals in E are the extreme points of S. Further, the pointwise limit e ∈ S of a sequence of {0, 1}-support functions is still an extreme point. Hence the set of extreme points of S is closed.

Consider then a support function s ∈ S and the probability measure P_s supported by the set X of extreme points of S, which represents s by the Krein-Milman theorem. Then s ↦ s(φ) is a linear, continuous function V → R. Thus, it follows that

s(φ) = ∫_X e(φ) dP_s(e) = P_s{e ∈ X : e(φ) = 1} = P_s{i_I : I ideal of E, φ ∈ I}

for all φ ∈ E. Let A be the algebra of Borel sets in X. Further, note that an ideal I in E can be extended to an ideal in Φ, namely I′ = {φ ∈ Φ : ∃ψ ∈ I, φ ≤ ψ}. For any extreme point e of S let I_e be the ideal in Φ associated to the ideal in E corresponding to e. Define then Γ : X → I_Φ by Γ(e) = I_e. Then Γ is a random mapping in Φ, defined on the probability space (X, A, P_s), and its support function s_Γ equals s. This is the desired result corresponding to the first part of the theorem.

The second part follows in a similar way: the set S of support functions on E satisfying (1) to (3) of Theorem 29.2 is still a convex subset of V. Its extreme points are still exponentials on E. They are in one-to-one


correspondence with the σ-ideals I of E, which can be extended to σ-ideals of Φ. The elements s of S are represented by a probability measure P_s on the set X of extreme points. So Γ(e) = I_e, where I_e is the σ-ideal of Φ corresponding to the extreme point e of S, defines a σ-random mapping such that sp_Γ = s. This proves the second part of the theorem. □

For continuous support functions there is an alternative theory by Norberg for characterizing support functions, based solely on lattice theory (Norberg, 1989). We shall apply this theory first to domain-free compact information algebras (Φ, Φ_f, D). We have seen in Part I of these Lecture Notes (Section 4.7) that Φ is a continuous lattice. Further, the way-below relation φ ≪ ψ holds for φ, ψ ∈ Φ if, and only if, φ ∈ Φ_f and φ ≤ ψ. Norberg (Norberg, 1989) introduces a σ-field of subsets of Φ on which a probability measure can be defined such that the identity mapping on Φ becomes a random mapping generating a given continuous support function on Φ.

As a preparation let’s introduce Scott topology into Φ. A subset U of Φis called Scott-open, if

1. U is upward closed, i.e. if φ ∈ U and φ ≤ ψ, then ψ ∈ U .

2. ∀φ ∈ U there exists a ψ ∈ Φf ∩ U such that ψ ≤ φ.

The second condition says equivalently that for all φ ∈ U , there is a ψ ∈ Usuch that ψ φ. The family of Scott-open τ sets are the open sets of atopology, since, as can be easily verified,

1. Φ, ∅ ∈ τ ,

2. if U, V ∈ τ , then U ∩ V ∈ τ ,

3. if Ui ∈ τ for i in some set I, then⋃i∈I Ui ∈ τ .

The topology τ is called the Scott topology in the compact informationalgebra (Φ,Φf , D). The sets

↑ ψ = φ ∈ Φ : ψ ≤ φ

for ψ ∈ Φf form a base for the topology τ . This means in this particularcase that

1. If U ∈ τ , then U =⋃↑ ψ : ψ ∈ Φf ∩ U,

2. Φ =⋃↑ ψ : ψ ∈ Φf,

3. ↑ φ ∩ ↑ ψ =↑ (φ⊗ ψ).

Finally, a set F ⊆ Φ is a filter, if it is upward closed and if, for any φ, ψ ∈ Fthere is a η ∈ F such that η ≤ φ, ψ. Thus, a Scott-open filter is a filterwhich is Scott-open. This is in our case of a compact information algebraequivalent to

Page 261: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

29.2. GENERATING SUPPORT FUNCTIONS 261

1. F is upwards closed,

2. ∀φ, ψ ∈ F there exists a η ∈ Φf ∩ F such that η ≤ φ, ψ.

Denote by F the family of Scott-open filters in Φ and let σ(F) be the minimalσ-field in Φ generated by F .

From now on we assume that Φf is countable. This means that τ has acountable base. A topology with a countable base is called second countable.Since ↑ ψ ∈ F , hence belongs to σ(F) and any U ∈ τ is the countable unionof sets ↑ ψ for ψ ∈ Φf , it follows that U ∈ σ(F). Thus, σ(F) is the Borelalgebra with respect to the Scott topology.

Now, the following theorem is an immediate consequence of Theorem 3.3in (Norberg, 1989) (see alternatively Theorem 3.15 in (Molchanov, 2005)).

Theorem 29.6 Let (Φ,Φf , D) be a domain-free compact information alge-bra with a countable set Φf of finite elements. Let further s : Φ→ [0, 1] be acontinuous support function on Φ. Then, there exists a probability measurePs on the algebra σ(F), such that for all φ ∈ Φ,

s(φ) = Ps(↑ φ).

Consider the probability space (Φ, σ(F), P ) with P the probability mea-sure of Theorem 29.6. The identity mapping id(φ) = φ is a random vari-able in the information algebra Φ. Its support functions is defined byspid(φ) = Ps(↑ φ) = s(φ). Therefore, according to Theorem 29.6 any sup-port function on a compact information algebra with countably many finiteelements is generated by a random variable in this algebra. This is a firstexistence theorem for random variables.

Note that for any φ ∈ Φ we have

↑ φ =⋂

ψ∈Φf ,ψ≤φ↑ ψ.

Since Φf is countable, so is the intersection above, and we may enumeratethe elements ψ ≤ φ, ψ ∈ Φf by ψ1, ψ2, . . .. Define then ηi = ∨ij=1ψj suchthat η1 ≤ η2 ≤ · · · and φ = ∨∞i=1ηi. Also ↑ η1 ⊇↑ η2 ⊇ · · · and

↑ φ =∞⋂i=1

↑ ηi.

It follows that

s(φ) = Ps(↑ φ) = limi→∞

Ps(↑ ηi) = limi→∞

s(ηi) (29.4)

So, the support function on Φ is in fact determined by its values for thefinite elements in Φf .

Page 262: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

262 CHAPTER 29. SUPPORT FUNCTIONS

The result above (Theorem 29.6) can be slightly generalized. Let (Φ, D)be a domain-free information algebra and E ⊆ Φ a sub-semi-lattice of Φ.Then we may consider the family IE of ideals in E . The theory of idealcompletion of an information algebra applies essentially to E as to Φ: First,IE is a complete lattice. Second the elements of E (or more precisely, theprincipal ideals I(φ) for φ ∈ E) are the finite elements of the lattice IE . So,if E is countable, and s a continuous support function on E , which can beextended by (29.4) to IE , then there is, just as in Theorem 29.6, a probabilitymeasure Ps on E such that

s(φ) = Ps(↑ φ)

Note that E and IE are no more necessarily information algebras (i.e. closedunder focusing).

29.3 Extension of Support Functions

The natural way in the context of these notes to look at extension both of sp-fcts and a.o.p.s beyond their domain of definition is to use the informationin the generating random mapping which is in these notes considered asthe primary element describing uncertain information. This is essentiallydone in the previous Section 28.2, cumulating in Theorem 28.8 showing theequivalence of the two natural approaches introduced before for randomvariables.

However, when considering a.o.p.s as primary objects, it makes sense tostudy extensions of them without assuming the generating random mappingas given a priori. Then we may follow Shafer’s approach (Shafer, 1979). Al-ternatively we may use the method proposed in (Kohlas & Monney, 1995),i.e. studying the family of random mappings generating a given sp-fct. ora.o.p., delimiting canonical random mappings among them and looking atthe extension they create as random mappings. That’s what we are dis-cussing in this section. Note that in (Kohlas & Monney, 1995) the problemis treated in the context of random sets. It turns out that the main resultscan be generalized without difficulty to the more general case of randommappings into information algebras.

As a preparation we note the close connection between support functionsand allocations of probability (a.o.p.). In fact, as the following lemma shows,from an a.o.p. we may easily obtain the associated support function.

Lemma 29.3 Let ρ : E → B be an a.o.p. from a sub-semi-lattice E ⊆ Φinto a probability algebra (B, µ). Then µρ : E → [0, 1] is a support functionon E. If furthermore E is closed under countable combinations, and ρ is aσ-a.o.p., i.e. for any countable family ψi ∈ E, i = 1, 2, . . ., it holds that

ρ(⊗∞i=1ψi) = ∧∞i=1ρ(ψi),

Page 263: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

29.3. EXTENSION OF SUPPORT FUNCTIONS 263

then µ ρ is a continuous support function on E.

Proof. We have to verify properties (1) and (2) of Theorem 29.2 for theapplication µρ. Since ρ(e) = >, it follows that µ(ρ(e)) = 1. Thus condition(1) is satisfied. Further, let φ1, . . . , φn ∈ E , n ≥ 1 and φ1, . . . , φn ≥ φ ∈ E .Then we have ρ(φi) ≤ ρ(φ) for i = 1, . . . , n, hence

n∨i=1

ρ(φi) ≤ ρ(φ).

Hence, by the inclusion-exclusion formula of probability theory we obtain

µ(ρ(φ)) ≥∑

∅6=I⊆1,...,n

(−1)|I|−1µ(∧i∈Iρ(φi)).

Since ρ is an a.o.p., we have that ∧i∈Iρ(φi) = ρ(⊗i∈Iφi), and it followstherefore that

µ(ρ(φ)) ≥∑

∅6=I⊆1,...,n

(−1)|I|−1µ(ρ(⊗i∈Iφi)).

But this is property (2) of Theorem 29.2.If ρ is a σ-a.o.p. and ψ1 ≤ ψ2 ≤ . . . ∈ E , then by the continuity of

probabilty it follows that

µ(ρ(⊗∞i=1ψi)) = µ(∧∞i=1ρ(ψi)) = limi→∞

µ(ρ(ψi)),

since ρ(ψ1) ≥ ρ(ψ2) ≥ . . .. utLet (Φ, D) be a domain-free information algebra and E ⊆ Φ a sub-

semi-lattice contained in Φ. Further let (B, ν) be a probability algebra andρ : E → B an a.o.p. defined on E . The mapping sp = ν ρ defines thena support function on E . On the other hand, assume Γ a random mappingΓ : Ω → Φ on a probability space (Ω,A, P ) whose support function on Ecoincides with sp. Let (B, ν) be the probability algebra associated with theprobability space (Ω,A, P ), i.e. B = A/J where J is the σ-ideal of P -nullsets in A and ν([A]) = P (A) (see Section 27.1). Then by ρ(φ) = [sΓ(φ)] forφ ∈ E an a.o.p. defined on E is defined. So, if a support function sp on Eis given, we may consider any generating random mapping for sp and thenobtain in the way just described the associated a.o.p.

Now, for a support function sp on E by Theorem 29.5 there exists arandom mapping which generates sp on E . But this random mapping is notunique. For instance, assume E ⊆ E ′, where E ′ ⊆ Φ is another semi-latticein Φ and sp′ a support function on E ′ whose restriction to E equals sp, thena random mapping which generates sp′ on E ′ generates also sp on E . Or,looked the other way round, any random mapping Γ, which generates the

Page 264: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

264 CHAPTER 29. SUPPORT FUNCTIONS

support function sp on E , i.e., for which spΓ = sp on E , has an extensionspΓ(φ) = P∗(sΓ(φ)) to the whole of Φ (see Section 27.1). So, there are manyextension of a support function sp on E to Φ. The goal is to compare theseextension somehow and to single out particularly interesting ones.

In order to do that, we first single out particular random mappingswhich generate a support function sp on E , so-called canonical mappings.According to the proof Theorem 29.5 the random mapping generating agiven support function sp on a sub-semi-lattice E ⊆ Φ is defined on a proba-bility space (X,A, Psp), where X is the set of binary support functions withvalues in 0, 1 on E and A the Borel algebra in X. We noted already inSection 29.2, that the elements of X are in a one-to-one correspondence withideals in E . Therefore, if IE denotes the set of ideals in E , we may as wellconsider the probability space (IE ,A, Psp), where A, by abuse of notation,now denotes the image of the former Borel algebra A in X and Psp thecorresponding probability measure. The random mapping generating sp isthen defined as ν(I) = φ ∈ Φ : ∃ψ ∈ I, φ ≤ ψ. That is, the ideal I ∈ IE isextended to an ideal ν(I) ∈ IΦ.

Any ideal J ∈ IΦ becomes by restriction to E an ideal in J/E ∈ IE . So,we have a mapping π : IΦ → IE defined by π(J) = J/E . This mappingis called the projection of IΦ to IJ/E . It is onto: In fact, if I ∈ IE , thenν(I) ∈ IΦ and π(ν(I)) = I. Let

π−1(I) = J ∈ IΦ;π(J) = I.

Then the family of sets π−1(I) for all I ∈ IE forms a partition of IΦ. Further,if K ∈ π−1(I), then we claim that ν(I) ⊆ K. In fact, consider φ ∈ ν(I).Then there is a ψ ∈ I such that φ ≤ ψ. But we must also have ψ ∈ K, sinceK/E = I, hence φ ∈ K. Thus, ν(I) is the minimal ideal in π−1(I), minimalin the sense of the order in the information algebra IΦ.

In view of these considerations, by Theorem 29.5, the random mappingν, defined on a probability space (IE ,A, P ′sp) generates the support functionsp on E , that is

P ′spI ∈ IE : φ ∈ ν(I) = sp(φ)

for all φ ∈ E . Of course this implies that E is contained in the set Eν ofν-measurable elements of Φ, i.e. E ⊆ Eν . As before we denote by sν theallocation of support associated with the random mapping ν,

sν(φ) = I ∈ IE : φ ∈ ν(I).

Then we have sν(φ) ∈ A for all φ ∈ E . Let then Asp denote the σ-algebragenerated by the sets sν(φ) for φ ∈ E . Thus, Asp ⊆ A is the smallest sub-σ-algebra in A, with respect to which the allocations of support sν(φ) aremeasurable for all φ ∈ E . Let finally Psp be the restriction of the probability

Page 265: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

29.3. EXTENSION OF SUPPORT FUNCTIONS 265

measure P ′sp to Asp. The random mapping ν defined on the new probabilityspace (IE ,Asp, Psp) still generates the support function sp on E .

It is convenient, for the comparison of random mappings, to transportthe probability space to the set IΦ. As we have seen π−1(I) defines for I ∈ IEa partition of IΦ. The family of subsets π−1(A) of IΦ for A ∈ Asp form aσ-algebra of subsets of IΦ, and by Psp(A) a probability measure is definedon this σ-algebra of subsets of IΦ. By abuse of notation we denote thisnew probability space by (IΦ,Asp, Psp). The random mapping Γsp = ν πdefined on this new probability space still generates the support functionsp on E . We call this random mapping Γsp canonical and emphasize thedependence of the mapping on the support function sp and its domain E .In fact, the σ-algebra Asp is uniquely determined by the domain E of sp. Atthis point however, it is not clear whether the probability measure also isuniquely determined by sp. Of course it is uniquely determined on the setsπ−1(sν(φ)) for φ ∈ E . This is sufficient for the time being. Later it will turnout that it is in fact uniquely determined on the whole of Asp.

As in Section 27.1, we may define the probability algebra (Asp/Jsp, µsp)associated with the probability space (IΦ,Asp, Psp), the application ρ0 :IΦ → Asp/Jsp and then finally

spΓsp(φ) = µsp(ρ0(sΓsp(φ))) = Psp∗(spΓsp(φ)). (29.5)

The mapping spΓsp extends sp from E to Φ. We shall show that spΓsp isin fact a support function, and indeed the smallest support function, whichextends sp to Φ.

Consider two sub-semi-lattices E and E ′ of Φ, such that E ⊆ E ′ andfurther support functions sp and sp′ with domains E and E ′ respectively,such that sp′ is an extension of sp from E to E ′, i.e. sp′(φ) = sp(φ) forall φ ∈ E . We want to compare the canonical random mappings associatedwith the two support functions.

Lemma 29.4 Let (IΦ,Asp, Psp) and (IΦ,Asp′ , Psp′) associated with the canon-ical random mappings Γsp and Γsp of the support functions sp and sp′. Then

1. Γsp(I) ≤ Γsp′(I) for all I ∈ IΦ.

2. Asp ⊆ Asp′.

Proof. From E ⊆ E ′ it follows that νsp(I) ⊆ νsp′(I). This implies (1).Also, if φ ∈ E , then φ ∈ E ′, hence sΓsp(φ) ∈ Asp′ , which implies (2). ut

In some sense the canonical random mapping associated with the exten-sion sp′ is a refinement of the mapping corresponding to sp: It focuses toricher information Γsp(I) ≤ Γsp′(I). Further the probability measure Psp′ isan extension of Psp, hence the probability information of the mapping Γsp′ isalso richer than the one of Γsp. This issue will be further discussed in a later

Page 266: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

266 CHAPTER 29. SUPPORT FUNCTIONS

chapter, when the information content of random mappings is examined.Lemma 29.4 permits now to compare the extensions spΓsp and spΓsp′ of thesupport functions sp and sp′. In fact, the refined information allocates alarger degree of support to any hypothesis.

Lemma 29.5 Let spΓsp and spΓsp′ be the extensions defined in (29.5). Then,for all φ ∈ Φ,

spΓsp(φ) ≤ spΓsp′ (φ).

Proof. By Lemma 29.4 (1) we have Γsp(I) ⊆ Γsp′(I) for all ideals I ∈ IΦ.This implies for the allocations of support that

sΓsp(φ) = I ∈ Iφ : φ ∈ Γsp(I) ⊆ I ∈ Iφ : φ ∈ Γsp′(I) = sΓsp′ (φ).

Further, also by Lemma 29.4 (2) we have that the inner probability withrespect to Psp equals at most the inner probability with respect to Psp′ ,i.e. Psp∗(A) ≤ Psp′∗(A) for all subsets A of IΦ. These two considerationstogether imply for all φ ∈ Φ, that

spΓsp(φ) = Psp∗(sΓsp(φ)) ≤ Psp′∗(sΓsp′ (φ)) = spΓsp′ (φ).

This proves the lemma. utNext, we introduce an alternative extension of a support function on E

to the whole of Φ, which is shown to be indeed a support function on Φ.This extension has been introduced in (Shafer, 1979) and has there beencalled the canonical extension, because it is the minimal extension of sp toΦ.

Theorem 29.7 Let sp be a support function on a sub-semi-lattice E of Φ.Then

sp(φ) = sup ∑

∅6=I⊆1,...,n

(−1)|I|−1sp(⊗i∈Iφi)

φi ∈ E , φi ≥ φ, i = 1, . . . , n;n = 1, 2, . . .. (29.6)

is a support function on Φ, and sp(φ) = sp(φ) for all φ ∈ E. Further, sp(φ)is the minimal extension of sp to Φ, i.e. if sp′ is another extension of sp toΦ, then

sp(φ) ≤ sp′(φ) (29.7)

for all φ ∈ Φ.

Proof. Let ρ be the a.o.p. associated to the support function sp via thecanonical random mapping Γsp corresponding to sp and µ the probability

Page 267: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

29.3. EXTENSION OF SUPPORT FUNCTIONS 267

measure in the probability algebra associated with the probability space ofthe canonical mapping Γsp. This means that sp(φ) = µ(ρ(φ)) for all φ ∈ E .

We define for φ ∈ Φ

ρ(φ) = ∨ρ(ψ) : ψ ∈ E , φ ≤ ψ.

We claim that ρ is an a.o.p. on Φ, which extends ρ on E . Certainly, if φ ∈ E ,then ρ(φ) = ρ(φ). So, in particular, ρ(e) = >. Further, ρ(φ1) ≥ ρ(φ1 ⊗ φ2),ρ(φ2) ≥ ρ(φ1 ⊗ φ2), hence ρ(φ1) ∧ ρ(φ2) ≥ ρ(φ1 ⊗ φ2). On the other hand,let ψi ∈ E , φi ≤ ψi, for i = 1, 2. Then

ρ(ψ1) ∧ ρ(ψ2) = ρ(ψ1 ⊗ ψ2),

and ψ1 ⊗ ψ2 ∈ E , as well as φ1 ⊗ φ2 ≤ ψ1 ⊗ ψ2. From this we conclude that

ρ(ψ1) ∧ ρ(ψ2) ≤ ρ(φ1 ⊗ φ2),

Now ρ(φi) ≤ ρ(ψi) for i = 1, 2. Therefore, we obtain ρ(φ1) ∧ ρ(φ2) ≤ρ(ψ1) ∧ ρ(ψ2) ≤ ρ(φ1 ⊗ φ2), so, finally, ρ(φ1) ∧ ρ(φ2) = ρ(φ1 ⊗ φ2). Thisproves that ρ is an a.o.p. on Φ.

Next we show that sp = µ ρ. This proves then by Lemma 29.3 thatsp is indeed a support function. Now, if we replace the terms sp(⊗i∈Iφi) onthe right hand side of 29.6 by their identical values in terms of the a.o.p. ρ,we obtain, by the inclusion-exclusion formula of probability,

sp(φ) = sup ∑

∅6=I⊆1,...,n

(−1)|I|−1µ(∧i∈Iρ(φi)) :

φi ∈ E , φi ≥ φ, i = 1, . . . , n;n = 1, 2, . . .= supµ(∨ni=1ρ(φi)) : φi ∈ E , φi ≥ φ, i = 1, . . . , n;n = 1, 2, . . ..

(29.8)

The family of elements ∨ni=1ρ(φi) of the probability algebra B, for φi ∈E , φi ≥ φ forms an upward directed subset in B, because (∨ni=1ρ(φi)) ∨(∨mi=n+1ρ(φi)) ≤ ∨mi=1ρ(φi). Therefore, by Lemma 27.1 in Section 27.2, thelast term in (29.8) above equals

µ(∨∨ni=1ρ(φi) : φi ∈ E , φi ≥ φ, i = 1, . . . , n;n = 1, 2, . . .)= µ(∨ρ(ψ) : ψ ∈ E , φ ≤ ψ) = µ(ρ(φ)).

Here the associative law for joins in complete lattices is used. So, sp = µ ρas claimed above. Since ρ is an extension of ρ, it follows that sp is anextension of sp to Φ.

Finally, let sp′ be any support function on Φ, which on E coincideswith sp. Consider any φ ∈ Φ and collection φ1, . . . , φn ∈ E such thatφ ≤ φ1, . . . , φn. Then, by Theorem 29.2

sp′(φ) ≥∑

∅6=I⊆1,...,n

(−1)|I|+1sp(⊗i∈Iφi).

Page 268: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

268 CHAPTER 29. SUPPORT FUNCTIONS

This implies

sp′(φ) ≥ sup∑

∅6=I⊆1,...,n

(−1)|I|+1sp(⊗i∈Iφi) :

φi ∈ E , φi ≥ φ, i = 1, . . . , n;n = 1, 2, . . .= sp(φ).

This proves the inequality (29.7), which means that sp is the minimal sup-port function, which extends sp to Φ. ut

The main result is now that the canonical extension introduced aboveis identical to the extension spΓsp of sp. This proves at the same time thatspΓsp is indeed a support function, and it is the minimal extension of spto Φ. It proves by the way also that the probability Psp underlying thecanonical random mapping associated with sp must be unique, since thesupport function sp depends only on sp and its domain E .

Theorem 29.8 Let sp be a support function on a sub-semi-lattice E of Φ.Then

spΓsp(φ) = sp(φ)

for all φ ∈ Φ.

Proof. Since sp is a support function on Φ, there is an associated canon-ical random mapping Γsp defined on a probability space (IΦ,Asp, Psp). Notethat we have spΓsp = sp. According to Lemma 29.5 we have then

spΓsp(φ) ≤ sp(φ).

Note further that, if φ ≤ ψi for i = 1, . . . , n, then sΓsp(ψi) ⊆ sΓsp(φ),therefore

∪ni=1sΓsp(ψi) ⊆ sΓsp(φ)

Therefore, if the canonical random mapping generating sp is defined on theprobability space (IΦ,Asp, Psp), then, for all φ ∈ Φ we have

spΓsp(φ) = Psp∗(sΓsp(φ))≥ supPsp(∪ni=1sΓsp(ψi)) : φ ≤ ψi, ψi ∈ E , i = 1, . . . , n, n = 1, 2, . . .

= sup∑

∅6=I⊆1,...,n

(−1)|I|−1Psp(∩i∈IsΓsp(ψi)) :

φ ≤ ψi, ψi ∈ E , i = 1, . . . , n, n = 1, 2, . . .= sup

∑∅6=I⊆1,...,n

(−1)|I|−1sp(⊗i∈Iψi) :

φ ≤ ψi, ψi ∈ E , i = 1, . . . , n, n = 1, 2, . . .= sp(φ).

Page 269: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

29.3. EXTENSION OF SUPPORT FUNCTIONS 269

So finally it follows that spΓsp(φ) = sp(φ) for all φ ∈ Φ utThis result is a further justification of the extension of a support function

of a random mapping by the inner probability of the underlying probabilityspace.

The results obtained so far may be extended to the case of continuoussupport functions, although only under the additional assumption that thedomain E of the support function is a lattice. In this case, there is first asimpler representation of the canonical extension of a support function.

Theorem 29.9 Let sp be a support function defined on E ⊆ Φ. If E is alattice then for all φ ∈ Φ

sp(φ) = supsp(ψ) : ψ ∈ E , φ ≤ ψ. (29.9)

Proof. We have by (29.6) for ψ ∈ E such that ψ ≥ φ that sp(φ) ≥ sp(ψ),hence

sp(φ) ≤ supsp(ψ) : ψ ∈ E , φ ≤ ψ.

In order to prove the converse inequality let ρ be the a.o.p. associ-ated with the support function sp via the canonical random mapping Γspcorresponding to sp and µ the probability measure in the probability al-gebra associated with the canonical random mapping Γsp. Then we havesp(φ) = µ(ρ(φ)) for all φ ∈ E . In addition let sΓsp denote the allocationof support associated with the random mapping Γsp. Further we note that(using the assumption that E is a lattice), for ψ1, . . . , ψn ∈ E ,

ρ(∧ni=1ψi) = [sΓsp(∧ni=1ψi)] ≥ [(∪ni=1(sΓsp(ψi)]= ∨ni=1[(sΓsp(ψi] = ∨ni=1ρ(ψi).

Now, using this inequality in (29.8), we obtain that

sp(φ) = supµ(∨ni=1ρ(ψi)) : ψi ∈ E , ψi ≥ φ, i = 1, . . . , n;n = 1, . . .≥ supµ(ρ(∪ni=1ψi)) : ψi ∈ E , ψi ≥ φ, i = 1, . . . , n;n = 1, . . .≥ supµ(ρ(ψ)) : ψ ∈ E , ψ ≥ φ= supsp(ψ) : ψ ∈ E , ψ ≥ φ.

This proves the theorem. utNow, in the case of continuous support functions, the results on the

extension of a support function to Φ can be extended as follows:

Theorem 29.10 Assume the information algebra Φ to be a σ-algebra, andsp a continuous support function defined on a distributive lattice E ⊆ Φ,which is closed under countable combinations. Then

Page 270: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

270 CHAPTER 29. SUPPORT FUNCTIONS

1. The canonical extension sp is continuous, i.e.

sp(⊗∞i=1φi) = limi→∞

sp(φi),

for every monotone sequence φ1 ≤ φ2 ≤ . . . ∈ Φ.

2. For all φ ∈ Φ,

sp(φ) = sup limi→∞

sp(φi) : φ1 ≤ φ2 ≤ . . . ∈ E , φ ≤ ⊗∞i=1φi. (29.10)

Proof. As in the proof of the preceding theorem let ρ be the a.o.p. as-sociated with the support function sp via the canonical random mappingΓsp corresponding to sp and µ the probability measure in the probabil-ity algebra associated with the canonical random mapping Γsp, such thatsp(φ) = µ(ρ(φ)) for all φ ∈ E . In addition let sΓsp again denote the allo-cation of support associated with the random mapping Γsp. Note that forψi ∈ E for i = 1, . . ., by Theorem 29.1

ρ(∨∞i=1ψi) = [sΓsp(∨∞i=1ψi)] = [∩∞i=1sΓsp(ψi)]= ∧∞i=[sΓsp(ψi)] = ∧∞i=ρ(ψi). (29.11)

We define first for any φ ∈ Φ

ρ(φ) = ∨ρ(ψ) : ψ ∈ E , φ ≤ ψ.

Note then that

ρ(φ) = ∨ρ(∨i∈Iψi) : I countable set ψi ∈ E , φ ≤ ∨i∈Iψi= ∨ ∧i∈I ρ(ψi) : I countable set ψi ∈ E , φ ≤ ⊗i∈Iψi.

With ψi, i ∈ I we may associate the monotone sequence ηi = ∨ij=1ψj suchthat ∨i∈Iψi = ∨∞i=1ηi. This shows finally that

ρ(φ) = ∨ ∧∞i=1 ρ(ψi) : ψ1 ≤ ψ2 ≤ . . . ∈ E , φ ≤ ∨∞i=1ψi

holds too.Then the proof proceeds as follows: We show that ρ is a σ-a.o.p. and

that µ ρ equals the right hand side of (29.10). Applying Lemma 29.3we conclude then that the right hand side of (29.10) defines a continuoussupport function on Φ. Finally we are going to show that sp equals the righthand side of (29.10). This proves then the two parts of the theorem.

We are now going to show that

ρ(∨∞i=1φi) = ∧∞i=1ρ(φi). (29.12)

Page 271: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

29.3. EXTENSION OF SUPPORT FUNCTIONS 271

Define the sets

D = ρ(ψ) : ψ ∈ E ,∨∞i=1φi ≤ ψ,Di = ρ(ψ) : ψ ∈ E , φi ≤ ψ.

Then it holds that ∨D = ρ(∨∞i=1φi) and ∨Di = ρ(φi) such that we have toprove ∨D = ∧∞i=1(∨Di). Since φi ≤ ∨∞i=1φi ≤ ψ it follows that D ⊆ Di forall i, hence ∨D ≤ ∨Di for all i = 1, . . ., therefore

∨D ≤ ∧∞i=1(∨Di) = M.

It remains to show that ∨D ≥ M . We remark that the set Di is upwardsdirected: If ρ(ψ1), ρ(ψ2) ∈ Di, then ψ1, ψ2 ∈ E and φi ≤ ψ1, ψ2, thereforeψ1 ∧ ψ2 ∈ E (since E is assumed to be a lattice) and φi ≤ ψ1 ∧ ψ2. Thisimplies that ρ(ψ1), ρ(ψ2) ≤ ρ(ψ1 ∧ ψ2) ∈ Di. It follows then from Lemma27.1 that

µ(∨Di) = supµ(ρ(ψ)) : ψ ∈ E , φi ≤ ψ.

Therefore, for any ε > 0 there exists an element ρ(ψi,ε) ∈ Di such that

ε

2 · i≥ µ(∨Di)− µ(ρ(ψi,ε)) = µ(∨Di − ρ(ψi,ε)).

Since M ≤ ∨Di we obtain

µ(M − ρ(ψi,ε)) = µ(M ∧ (ρ(ψi,ε))c) ≤ µ(∨Di ∧ (ρ(ψi,ε))c)

= µ(∨Di − ρ(ψi,ε))) ≤ε

2 · i.

Now, φi ≤ ψi,ε implies ∨∞i=1φi ≤ ∨∞i=1ψi,ε and ∨∞i=1ψi,ε ∈ E because E issupposed to be closed under countable combinations. This implies by (29.11)that

Mε = ∧∞i=1ρ(ψi,ε) = ρ(∨∞i=1ψi,ε) ∈ D.

Further,

M −Mε = M ∧M cε = M ∧ (∧∞i=1ρ(ψi,ε))

c = M ∧ (∨∞i=1ρ(ψi,ε)c)= ∨∞i=1 (M − ρ(ψi,ε)c) = ∨∞i=1(M − ρ(ψi,ε)).

From this we obtain

µ(M −Mε) = µ(∨∞i=1(M − ρ(ψi,ε))) ≤∞∑i=1

ε

2 · i= ε.

Then, finally, Mε ≤ ∨D or M cε ≥ (∨D)c and therefore

M − ∨D = M ∧ (∨D)c ≤M ∧M cε = M −M c

ε .

Page 272: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

272 CHAPTER 29. SUPPORT FUNCTIONS

Thus we conclude that µ(M −∨D) ≤ µ(M −M cε ) ≤ ε. Since ε is arbitrarily

small, it follows µ(M − ∨D) = 0, hence from ∨D ≤ M it follows that∨D = M . This proves (29.12).

Let now spcm denote the right hand side of (29.10). We are next goingto show that spcm = µ ρ. In fact,

spcm(φ) = sup limi→∞

sp(φi) : φ1 ≤ φ2 ≤ . . . ∈ E , φ ≤ ⊗∞i=1φi

= supsp(⊗∞i=1φi) : φ1 ≤ φ2 ≤ . . . ∈ E , φ ≤ ⊗∞i=1φi= supµ(ρ(⊗∞i=1φi)) : φ1 ≤ φ2 ≤ . . . ∈ E , φ ≤ ⊗∞i=1φi

Let’s show now that the set ρ(⊗∞i=1φi)) : φ1 ≤ φ2 ≤ . . . ∈ E , φ ≤ ⊗∞i=1φi isupward directed. Let ρ(⊗∞i=1φi) and ρ(⊗∞i=1φ

′i) be two elements of this set.

Since E is assumed to be lattice, and ρ(φ)∨ ρ(ψ) ≤ ρ(φ∧ψ), it follows that

ρ(⊗∞i=1φi) ≤ ρ(⊗∞i=1φi) ∨ ρ(⊗∞i=1φ′i)

≤ ρ((⊗∞i=1φi) ∧ (⊗∞i=1φ′i))

= ρ(⊗∞i=1(φi ∧ φ′i))

and similarily ρ(⊗∞i=1φ′i) ≤ ρ(⊗∞i=1(φi ∧ φ′i)). Here we use the assumption

that the lattice E is distributive. So we conclude that

ρ(⊗∞i=1φi) ∨ ρ(⊗∞i=1φ′i) ≤ ρ(⊗∞i=1(φi ∧ φ′i)).

The element ρ(⊗∞i=1(φi ∧ φ′i)) belongs to the set, since φi ∧ φ′i ∈ E , furtherφ1 ∧ φ′1 ≤ φ2 ∧ φ′2 ≤ . . . and φ ≤ ⊗∞i=1(φi ∧ φ′i). Then by Lemma 27.1 that

spcm(φ) = supµ(ρ(⊗∞i=1φi)) : φ1 ≤ φ2 ≤ . . . ∈ E , φ ≤ ⊗∞i=1φi= µ(∨ρ(⊗∞i=1φi)) : φ1 ≤ φ2 ≤ . . . ∈ E , φ ≤ ⊗∞i=1φi)= µ(ρ(φ)).

Therefore spcm is indeed a support function, which extends sp from E to allof Φ. This implies that sp(φ) ≤ spcm(φ for all φ ∈ Φ, since sp is the minimalextension of sp (Theorem 29.7).

Finally we sow that sp = spcm. As sp is a continuous support func-tion on the lattice E , it follows that the allocation of support sΓsp satisfiessΓsp(⊗∞i=1φi) = ∩∞i=1sΓsp(φi). Let A be the σ algebra associated with thecanonical ran dom mapping induce by sp and P the associated probabilitymeasure. Then we have

sp(φ) = supP (A) : A ∈ A, A ⊆ sΓsp(φ)≥ supP (sΓsp(ψ)) : ψ ∈ E , φ ≤ ψ≥ supP (sΓsp(⊗∞i=1ψi)) : ψ1 ≤ ψ2 ≤ . . . ∈ E , φ ≤ ⊗∞i=1ψi= supP (∩∞i=1sΓsp(ψi)) : ψ1 ≤ ψ2 ≤ . . . ∈ E , φ ≤ ⊗∞i=1ψi= supP ( lim

i→∞sΓsp(ψi)) : ψ1 ≤ ψ2 ≤ . . . ∈ E , φ ≤ ⊗∞i=1ψi

= supP ( limi→∞

sp(ψi)) : ψ1 ≤ ψ2 ≤ . . . ∈ E , φ ≤ ⊗∞i=1ψi

= spcm(φ).

Page 273: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

29.3. EXTENSION OF SUPPORT FUNCTIONS 273

So, sp = spcm, which is continuous and satifies (refeq:ContExt). this com-pletes the proof. ut

Of course, it still holds that spΓsp = sp. So the property stated inTheorem 29.9 apply to the extension spΓsp too.

If the domain E of a support function is part of a compact informationalgebra (Φ,Φf , D), then, as we know, Φ is a complete lattice. If, in addition,E is a complete lattice too, then even more stringent statements can bemade about the canonical extension of sp to Φ. In particular, we claimthat, under some circumstances, the finite elements determine in a simpleway the canonical extension of the support function to Φ. This holds, if anyelement of E is in some sense arbitrarily near a coarser finite and measurableelement in E ∩ Φf . In this case the canonical extension sp to all of Φ is infact an extension of the restriction of sp to the finite elements.

Theorem 29.11 Let (Φ,Φf , D) be a compact information algebra, and spa support function with domain E ⊆ Φ, where E is a complete lattice, andfor all φ ∈ E and ε > 0 there is a ψ ∈ E ∩ Φf such that ψ ≤ φ andsp(η)− sp(φ) < ε for all η ∈ E such that ψ ≤ η ≤ φ. Then

sp(∨φ∈Xφ) = infφ∈X

sp(φ) (29.13)

for any directed subset X of Φ. In particular, for all φ ∈ Φ, it holds that

sp(φ) = infψ∈Φf ,ψ≤φ

sp(ψ).

Proof. Let ρ be the a.o.p. associated with the canoncical random map-ping corresponding to the support function sp defined on E and let (B, µ) bethe probability algebra derived from the canonical random mapping. Thenit holds that sp = µ ρ. We define now

ρ(φ) = ∧ρ(ψ) : ψ ∈ Φf , ψ ≤ φ,

and sp = µ ρ.Now, the set of finite elements ψ ≤ φ is an directed set, hence the set

ρ(ψ) for finite elements ψ ≤ φ is downwards directed in B. Hence, by Lemma27.1 we obtain

µ(ρ(φ)) = µ(∧ψ∈Φf ,ψ≤φρ(ψ)) = infψ∈Φf ,ψ≤φ

µ(ρ(ψ)) = infψ∈Φf ,ψ≤φ

sp(ψ).

The proof goes as follows: We show that ρ|E = ρ that ρ is an a.o.p., andthat ρ(∨X) = ∨φ∈X ρ(φ) for any subset X ⊆ Φ. This proves then that sp isa support function, which is an extension of sp and fulfills (29.13). Then weshow that sp = sp. This proves (29.13). The second part of the theorem isan immediate consequence of (29.13).

Page 274: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

274 CHAPTER 29. SUPPORT FUNCTIONS

Suppose then φ ∈ E . For any ψ ∈ Φf such that ψ ≤ φ we have alsoρ(ψ) ≥ ρ(φ) = ρ(φ), hence ρ(φ) ≥ ρ(φ). In order to prove that ρ(φ) =ρ(φ), we fix an ε > 0 and choose a ψε ∈ E ∩ Φf such that ψε ≤ φ andsp(η)− sp(φ) < ε/2 for all η ∈ E such that ψε ≤ η ≤ φ. Now, since φ ∈ E ,

ρ(φ) = ∧∨ρ(η) : η ∈ E , ψ ≤ η : ψ ∈ Φf , ψ ≤ φ= ∧∨ρ(η) : η ∈ E , ψ ≤ η ≤ φ : ψ ∈ Φf , ψ ≤ φ≤ ∨ρ(η) : η ∈ E , ψε ≤ η ≤ φ.

Denote this last element of B by mε. The family ρ(η) : η ∈ E , ψε ≤ η ≤ φis an upward directed set in B: Indeed, if ρ(η1) and ρ(η2) are both in theset, then so is ρ(η1 ∧ η2), which is larger than both ρ(η1) and ρ(η2). Notethat here we use the assumption that E is a lattice. This implies that

µ(mε) = supρ(η) : η ∈ E , ψε ≤ η ≤ φ.

Hence there is a ηε ∈ E such that ψε ≤ ηε ≤ φ and

µ(mε)− µ(ρ(ηε)) = µ(mε − ρ(ηε)) <ε

2.

Since ρ(φ) ≤ ρ(φ) ≤ mε it follows that

µ(ρ(φ)− ρ(φ)) ≤ µ(mε)− µ(ρ(φ)) = |µ(mε)− sp(φ)|

= |µ(mε)− sp(ηε) + sp(ηε)− sp(φ)| < ε

2+ε

2= ε.

Since ε > 0 is arbitrary, it follows that µ(ρ(φ) − ρ(φ)) = 0 which, togetherwith ρ(φ) ≥ ρ(φ), implies that ρ(φ) = ρ(φ), hence ρ|E = ρ.

Since e ∈ E and Φf it follows that ρ(e) = ρ(e) = >. Further, letφ1, φ2 ∈ Φ. Then

ρ(φ1) = ∧ρ(ψ) : ψ ∈ Φf , ψ ≤ φ1≥ ∧ρ(ψ) : ψ ∈ Φf , ψ ≤ φ1 ⊗ φ2= ρ(φ1 ⊗ φ2).

Similarily, it holds also that ρ(φ2) ≥ ρ(φ1 ⊗ φ2), hence ρ(φ1) ∧ ρ(φ2) ≥ρ(φ1 ⊗ φ2). On the other hand,

ρ(φ1 ⊗ φ2) = ∧ρ(ψ) : ψ ∈ Φf , ψ ≤ φ1 ⊗ φ2≤ ∧ρ(ψ1 ⊗ ψ2) : ψ1, ψ2 ∈ Φf , ψ1 ≤ φ1, ψ2 ≤ φ2= ∧ρ(ψ1) ∧ ρ(ψ2) : ψ1, ψ2 ∈ Φf , ψ1 ≤ φ1, ψ2 ≤ φ2= (∧ρ(ψ1) : ψ1 ∈ Φf , ψ1 ≤ φ1) ∧ (∧ρ(ψ2) : ψ2 ∈ Φf , ψ2 ≤ φ2)= ρ(φ1) ∧ ρ(φ2).

So, we conclude that ρ(φ1⊗ φ2) = ρ(φ1)∧ ρ(φ2) and this shows that ρ is ana.o.p.

Page 275: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

29.3. EXTENSION OF SUPPORT FUNCTIONS 275

Let X be an arbitrary subset of Φ. We claim that ρ(∨X) = ∧φ∈X ρ(φ).Using Lemma 27.1 it follows then that sp(∨X) = infφ∈X sp(φ), if X isdirected. Now, by compactness, if ψ ∈ Φf and ψ ≤ ∨X, if, and only if,there is a finite subset Y in X such that ψ ≤ ∨Y . Therefore,

ρ(∨X) = ∧ρ(ψ) : ψ ∈ Φf , ψ ≤ ∨X= ∧ρ(ψ) : ψ ∈ Φf , ψ ≤ ∨Y, for some finite Y ⊆ X= ∧ (∧ρ(ψ) : ψ ∈ Φf , ψ ≤ ∨Y ) : Y ⊆ X,Y finite= ∧ρ(∨Y ) : Y ⊆ X,Y finite= ∧ (∧ρ(φ) : φ ∈ Y ) : Y ⊆ X,Y finite= ∧ρ(φ) : φ ∈ X.

So, sp is indeed a support function on Φ satisfying (??). We remark thatso far we did not yet use the assumption that E is a complete lattice. Thus,this result holds without this assumption.

Finally we are going to prove that sp = sp if E is a complete lattice.Note in particular, that in this case sp satisfies (29.13) on E , since sp is anextension of sp. Assume that sp′ is an support function on Φ, which satisfies(29.13) and whose restriction to E equals sp. Then fromTheorem 29.7 weknow that sp ≤ sp′. Further, since sp′ is assumed to satisfy (29.13), weobtain

sp′(φ) = infψ∈Φf ,ψ≤φ

sp′(ψ) ≥ infψ∈Φf ,ψ≤φ

sp(ψ) = sp(φ).

This shows that sp is the minimal extension of sp from E to Φ, which satisfies(29.13). We show next that sp satisfies (29.13) if E is a complete lattice.Then sp ≥ sp. But the converse inequality holds as we have seen above,whence sp = sp as claimed. This concludes then the proof.

If E is a complete lattice, then a mapping θ : Φ → E can be defined asfollows:

θ(φ) = ∧ψ : ψ ∈ E , φ ≤ ψ.

Then it follows that sp = sp θ, since the set ρ(ψ) : ψ ∈ E , φ ≤ ψ isupwards directed (Lemma 27.1),

sp(θ(φ)) = sp(∧ψ : ψ ∈ E , φ ≤ ψ) = µ(ρ(∧ψ : ψ ∈ E , φ ≤ ψ))= µ(∨ρ(ψ) : ψ ∈ E , φ ≤ ψ) = supµ(ρ(ψ)) : ψ ∈ E , φ ≤ ψ= supsp(ψ) : ψ ∈ E , φ ≤ ψ = sp(φ),

by Theorem 29.9. Further, for an arbitrary subset X ⊆ Φ, certainly θ(φ) ≤θ(∨X) if φ ∈ X and therefore ∨φ∈Xθ(φ) ≤ θ(∨X). Let η ∈ E be any upperbound of θ(φ) for all φ ∈ X. Then φ ≤ θ(φ) = ∧ψ : ψ ∈ E , φ ≤ ψ ≤ ηfor any φ ∈ X. Therefore we conclude that ∨X ≤ η, hence θ(∨X) ≤ η,

Page 276: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

276 CHAPTER 29. SUPPORT FUNCTIONS

which shows that θ(∨X) is the lowest upper bound of the θ(φ) for φ ∈ Φ,i.e. ∨φ∈Xθ(φ) = θ(∨X). Let now X ⊆ Φ be a directed set. Then the setθ(φ) for φ ∈ X is directed too. This implies that

sp(∨X) = sp(θ(∨X)) = sp(∨φ∈Xθ(φ)) = infφ∈X

sp(θ(φ)) = infφ∈X

sp(φ).

So sp fulfills indeed (29.13) and this concludes the proof. utA support function sp which can be extended to a support function,

which satisfies (29.13), is called condensable (Shafer, 1979). Thus, Theorem29.11 gives a sufficient condition for a support function to be condensable.If Φf ⊆ E , and the conditions of Theorem 29.11 are satisfied, then it followsthat the canonical extension of sp is fully determined by the values of sp onthe finite elements,

sp(φ) = infψ∈Φf ,ψ≤φ

sp(ψ).

In this case sp depends only on the allocation of support to the finite ele-ments Φf of the algebra.

Page 277: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

Chapter 30

Random Sets

30.1 Set-Valued Mappings: Finite Case

Subset algebras are important instances of information algebras. Therefore,we are going to study in this chapter random mappings into subset algebras.In particular, we examine how the general theory of Chapters 26 to 29specializes to this important case and in which way the theory may bedeveloped further, taking advantage of the additional structure of subsetalgebras. In particular in this first section we discuss the elementary caseof the subset algebra of a finite set. In the next Section 30.2 general subsetalgebras will be considered. There is a well-known theory of random sets (?)which is originally not related to out theory based on information algebras.In Section ?? we discuss how this theory is related to ours. Finally, inSections ?? and ?? two important special cases will be considered.

Consider a finite set U , called the universe. The associated power setP(U) forms a Boolean algebra. We may consider it as an information algebrawith intersection as combination and a single domain D = U. This corre-sponds to the view that the elements of U constitute the possible answers toa fixed question; and its subsets, i.e. the elements of the power algebra, arepossible pieces of information relative to this question. The partial order ofinformation A ≤ B means then B ⊆ A. Uncertain information relative toU is then given by a random mapping into U . Without loss of generalitywe may consider a finite sample space Ω = ω1, . . . , ωn together with aprobability distribution pi = P (ωi) for i = 1, . . . , n. Let then Γ : Ω→ P(U)denote such a random mapping into U . In terms of the general theory thisrepresents a simple random variable in the algebra (P(U), U).

If Γ(ωi) 6= ∅ for all ωi ∈ Ω, then we call the random variable normalized.If Γ(ωi) 6= Γ(ωj), whenever i 6= j, then the variable is called canonical (seealso Section26.1). To any random variable Γ we may associate a correspond-ing normalized variable Γ↓, where Γ↓(ωi) = Γ(ωi) for all ωi ∈ Ω such that

277

Page 278: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

278 CHAPTER 30. RANDOM SETS

Γ(ωi) 6= ∅. Further define Ω↓ = ωi ∈ Ω : Γ(ωi) 6= ∅ and

p↓i =pi

P (Ω↓).

This can be interpreted as follows: Sample points or assumptions ω, whichmap to an empty set are not really possible assumptions. Therefore theycan be eliminated and the probability conditioned on the event or informa-tion that only assumptions which refer to a proper information Γ(ωi 6= ∅are possible. Similarly to any random variable Γ we can associate a corre-sponding canonical variable: Consider the equivalence relation ωi ≡ ωj ifΓ(ωi)) = Γ(ωj). Let then Ω→ be the set of equivalence classes and define

p→([ωi]) =∑

j:ωj∈[ωi]

pj ,

and finally Γ→([ωi]) = Γ(ωi). Note that (Γ↓)→ = (Γ→)↓.As we have shown in Section 26.4 we may assign a basic probability

assignment (bpa) to a simple random variable Γ by

mΓ(A) = P (Γ(ω) = A) =∑

Γ(ωi)=A

pi,

for any A ⊆ U . Subsets A of U for which mΓ(A) > 0 are also called focalsets of Γ. The bpa allows to compute the degrees of support associated withthe random variable Γ,

spΓ(A) = P (Γ(ω) ⊆ A) =∑B⊆A

mΓ(B).

In Section 26.3 we introduced the notion of the possibility of a piece ofinformation. In the case of subsets, the possibility of a subset A relative toa random variable Γ is defined as

pΓ(A) = ωi ∈ Ω : Γ(ωi) ∩A 6= ∅ωi ∈ Ω : Γ(ωi) ⊆ Acc. (30.1)

From this it follows for the degrees of possibility

plΓ(A) = 1− spΓ(Ac).

As noted in Section 26.4 the bpa serves also to compute the degrees ofpossibility

plΓ(A) =∑

B∩A 6=∅

mΓ(B).

The case of finite subsets discussed here formally corresponds to the clas-sical Dempster-Shafer theory of evidence as presented in (Shafer, 1976).

Page 279: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

30.1. SET-VALUED MAPPINGS: FINITE CASE 279

There the degrees of possibility are also called upper probabilities (a termgoing back to (Dempster, 1967) and also degree of plausibility. In Dempster-Shafer theory of evidence yet another concept, the commonality numbers areintroduced

qΓ(A) =∑A⊆B

mΓ(B).

It has been shown in (Shafer, 1976) that any one of the functions mΓ(A),spΓ(A), plΓ(A) or qΓ(A) suffices to determine the other ones. It has alreadybeen shown how the bpa determines the support, possibility and common-ality functions. The following theorem specifies the other relations

Theorem 30.1 Let Γ be a random variable with subsets of an universe Uas values. Then the following equations hold for all subsets A of the universeU ,

mΓ(A) =∑B⊂A

(−1)|A\B|spΓ(B),

spΓ(A) =∑B⊂Ac

(−1)|B|qΓ(B),

qΓ(A) =∑B⊂A

(−1)|B|spΓ(Bc),

plΓ(A) =∑

B⊂A,B 6=∅

(−1)|B|+1qΓ(B),

qΓ(A) =∑B⊂A

(−1)|B|+1plΓ(B),

The proofs of these relations are of purely combinatorial nature and willnot be reproduced here. We refer to (Shafer, 1976; Kohlas & Monney, 1995)for more details and the proofs.

Since the sample space Ω is finite its power set algebra ⊗ together withthe probability measure P form a probability algebra. The allocation ofsupport sΓ : U → ⊗ is then at the same time an allocation of probability(a.o.p.). In the same way the mapping pΓ : U → ⊗ defined by pΓ(H) =scΓ(Hc) (see (30.1) abvove) is an allowment of probability. This means itdesignates the amount of probability mass under which hypothesis H ⊆ Uremains possible. An a.o.p. is characterized by properties (A1) and (A2) ofSection28.1. For an allowment of probability as exemplified by pΓ we derivetwo similar conditions:

pΓ(∅) = scΓ(U) = Ωc = ∅,pΓ(H1 ∪H2) = scΓ((H1 ∪H2)c) = scΓ(Hc

1 ∩Hc2)

= (sΓ(Hc1) ∩ sΓ(Hc

2))c = scΓ(Hc1) ∪ scΓ(Hc

2)= pΓ(H1) ∪ pΓ(H2).

Page 280: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

280 CHAPTER 30. RANDOM SETS

Further, if Γ is normalized, then pΓ(U) = Ω.The support function spΓ satisfies of course the characterizing properties

of Theorem 29.2 in Section 29.1. In the present case of a subset algebra thistranslates as follows:

1. spΓ(U) = 1, and, if Γ is normalized, spΓ(∅) = 0.

2. Assume H1, . . . ,Hn ⊆ H ⊆ U , n ≥ 1. Then

spΓ(H) ≥∑

∅6=I⊆1,...,n

(−1)|I|+1spΓ(∩i∈IHi). (30.2)

Property (3) of Theorem 29.2 makes no sense in a finite algebra. Simi-lar properties can also be derived for the plausibility function plΓ, as thefollowing theorem shows.

Theorem 30.2 Let Γ be a random mapping into the power algebra of afinite set U . Then the mapping plΓ has the following properties

1. plΓ(∅) = 0, and, if Γ is normalized, spΓ(U) = 1.

2. Assume U ⊆ H1, . . . ,Hn ⊇ H, n ≥ 1. Then

plΓ(H) ≥∑

∅6=I⊆1,...,n

(−1)|I|+1plΓ(∪i∈IHi). (30.3)

Proof. Both (1) and (2) follow from plΓ(H) = 1 − spΓ(Hc). In fact,plΓ(∅) = 1− sp(Ω) = 0. Or, from (30.2), since Hi ⊇ H implies Hc

i ⊆ H,

plΓ(H) = 1− spΓ(Hc) ≥ 1−∑

∅6=I⊆1,...,n

(−1)|I|+1spΓ(∩i∈IHci )

= 1−∑

∅6=I⊆1,...,n

(−1)|I|+1spΓ((∪i∈IHi)c)

= 1−∑

∅6=I⊆1,...,n

(−1)|I|+1(1− plΓ(∪i∈IHi)).

The inequality (30.3) follows finally from

∑∅6=I⊆1,...,n

(−1)|I|+1 = −

∑I⊆1,...,n

(−1)|I| − 1

= 1. (30.4)

utA function satisfying (2) of the theorem above is called alternating of

infinite order.An important particular case arises, when all focal sets of a random

mapping Γ : Ω → P(U) are singleton sets, Γ(ωi) = ui. Such a random

Page 281: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

30.2. SET-VALUED MAPPINGS: GENERAL CASE 281

mapping is called precise, and it corresponds in fact to a classical randomvariable γ : Ω→ U , defined by γ(ωi) = ui. Clearly in this case for all subsetsH of U we obtain sΓ(H) = pΓ(H), hence spΓ(H) = plΓ(H). Furthermore,properties (30.3) and (30.3) become

spΓ(∪ni=1Hi) =∑

∅6=I⊆1,...,n

(−1)|I|+1spΓ(∩i∈IHi),

and

plΓ(∩ni=1Hi) =∑

∅6=I⊆1,...,n

(−1)|I|+1plΓ(∩i∈IHi).

But this is equivalent to additivity of both functions,

spΓ(∪ni=1Hi) =n∑i=1

spΓ(Hi),

if the sets Hi are mutually disjoint. This means that spΓ = plΓ become aprobability measure on the the power set of U .

30.2 Set-Valued Mappings: General Case

The results of the previous section can in many respect easily generalized torandom mappings into general sets. The main new issue is measurability. Asin the previous section consider a set U , the universe, only that it is no moreassumed to be finite. Again the power set P(U) is a Boolean algebra,whichcan be considered as an information algebra with a single domain. So theresults of Chapter 27 apply. We are going to review the results of thischapter with respect to the algebra

P(U)

as well as those of Chapters 28.1 29 briefly in this section.A random mapping is now a mapping Γ : Ω → P(U) of a probability

space (Ω,A.P ) into the power set of the universe U . If the subsets of Uare considered as possible pieces of information in the spirit of informationalgebras, then, information may be more generally be represented by filtersin the Boolean lattice P(U), i.e. families of subsets F such that

1. H ∈ F and H ⊆ H ′ implies H ′ ∈ F .

2. H1, H2 ∈ F implies H1 ∩H2 ∈ F .

Since information order in the Booelan information algebra P(U) is theinverse order to inclusion, filters of Boolean algebra correspond to ideals in

Page 282: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

282 CHAPTER 30. RANDOM SETS

the information algebra. So, as in Section 27.1 we may generalize randommappings to fitler-valued maps. So, in general, for all ω ∈ Ω we considerΓ(ω) to be a filter of subsets of U . As usual, random mappings with subsetsas values can be considered as filter-valued mappings whose values are theprincipal filters corresponding to the set values.

A random mapping Γ induces then an allocation of support

sΓ(H) = ω ∈ Ω : H ∈ Γ(ω).

Note that H ∈ Γ(ω) is equivalent to H ⊆ Γ(ω), if Γ is set-valued. As inSection 27.2 we may define, if Γ is set-valued.

pΓ(H) = ω ∈ Ω : H ∩ Γ(ω) 6= ∅,

to be the et of assumptions ω under which H, if not necessarily true, is atleast possible. We call the mapping pΓ : P(U) → P(Ω) the allowment ofpossibility associated with the random mapping Γ. Note that

pΓ(H) = ω ∈ Ω : Hc ∈ Γ(ω)c = sΓ(Hc).

So all ω allow H if they do not imply its complement (negation) Hc. Thisdefinition applies more generally also to filter-valued random mappings Γ.So in a Boolean algebra there is a strong duality between support and plau-sibility, which in this form does not exist in general information algebras.

We denote as in Chapter 29 the family subsets H of U for which theallocationof support sΓ(H) ∈ (A) is measurable by EΓ. The duality betweensupport and possibility implies that EcΓ = H : Hc ∈ EΓ contains all thesubsets of U for which the allowment of possibility are measurable. Lemma29.1 of Section 29.1 can be extended to the present case as follows:

Lemma 30.1 For any random mapping Γ, U ∈ E and ∅ ∈ E if Γ is nor-malized, and if H1, H2 ∈ E, then H1 ∩ H2 ∈ E. If Γ is set-valued, thenH1, H2, . . . ∈ E implies ∩∞i=1Hi ∈ E. And dually: ∅ ∈ Ec and U ∈ Ec if Γ isnormalized, and if H1, H2 ∈ Ec, then H1 ∪H2 ∈ Ec. If Γ is set-valued, thenH1, H2, . . . ∈ Ec implies ∪∞i=1Hi ∈ Ec.

This is an immediate consequence of Lemma 29.1 and the duality betweensupport and possibility. So E is closed under finite intersections, it is asemilattice under intersections, whereas Ec is closed under finite unions.This are minimal properties. Often E enjoys stonger properties, as alreadythe lemma above illustrates. We shall come back to this issue below.

On E we define the support function spΓ(H) = P (sΓ(H)) as usual andon Ec the plausibility function plΓ(H) = P (pΓ(H)). As in the case of a finiteuniverse, for H ∈ Ec we have still

plΓ(H) = 1− spΓ(Hc),

Page 283: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

30.3. RANDOM MAPPINGS INTO SET ALGEBRAS 283

which is a further element of duality between support and possibility. Fur-ther, the support function is still monotone of infinite order and the plau-sibility function alternating of infinite order, that is (30.2) and (30.3) holdfor elements in E and Ec respectively.

If Γ is a set valued random mapping, then, as shown in Lemma 30.1,E is closed under countable intersections, and Ec under countable unions.In these cases, both the support function and the plausibility function arecontinuous, as the next lemma shows

Lemma 30.2 Let Γ be a random mapping such that E is closed under count-able intersections and Ec under countable unions. Then, for H1 ⊇ H2 ⊇· · · ∈ E,

spΓ(∩∞i=1Hi) = limi→∞

spΓ(Hi).

Further, for H1 ⊆ H2 ⊆ · · · ∈ Ec,

plΓ(∪∞i=1Hi) = limi→∞

plΓ(Hi).

The proof of the first part goes as the proof of Theorem 29.2 and thesecond part follows from duality.

In random set theory (Molchanov, 2005; Nguyen, 1978) usually restric-tive assumptions on the random mappings are made. For instance U isassumed to be a topological space, all Γ(ω) are closed sets and EΓ is sup-posed to contain all Borel sets (Molchanov, 2005). This leads then to arich theory exploiting the interplay between measurability and topology. Inanother direction, Probabilistic argumentation systems combine logic andprobability in order to derive probabilities of provability of hypotheses us-ing assumption-based reasoning. This has been developed in a context ofpropositional logic in (?). In another context this approach has been ap-plied to statistical inference (?). It has been shown in Chapter 25 howthis is related to random mappings, namely how random mappings can bederived from the assumption-based inference. So, random mappings, in par-ticular set-valued random mappings, provide for the abstract background ofprobabilistic argumentation systems.

30.3 Random Mappings into Set Algebras

In Chapter 27 we have seen that random mappings into an informationalgebras form themeselves an information algebra. In this section we con-sider the algebra of random mapping into a set-information algebra. Thisis an instance of a Boolean information algebra. We use it to illustrate thespecificities of uncertain information in Boolean structures. In particular wediscuss the transport of uncertain information from one domain to anotherone.

Page 284: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

284 CHAPTER 30. RANDOM SETS

As a base for the present discussion, we consider a compatible family offrames (see Example 5 in Section 2.2 of Part I). Remind that a frame issimply a set x. Different frames x and y may be linked by a refining, thatis a mapping τ : x→ P(y), such that

1. ∀θ ∈ x τ(x) 6= ∅,

2. θ′ 6= θ′′ implies τ(θ′) ∩ τ(θ′′) = ∅,

3.⋃θ∈x τ(θ) = y.

A compatible family of frames is a family F of sets, with partial order x ≤ yif there is a refining τ : x → P(y), and which is a lattice under this order.It has been remarked in Example 5 of Section 3.1 of Part I that the subsetsof frames in a compatible family of frames form an information algebra.

In fact, let Φx denote the power set of frame x, and

Φ =⋃x∈F

Φx.

If x ≤ y denote the refining of x to y be τx→y. If φ is an element of Φx, thatis a subset of frame x, then, as usual, define

τx→y(φ) =⋃θ∈x

τx→y(θ).

Labeling is defined by d(φ) = x if φ ∈ Φx. Combination is defined for φ ∈ Φx

and ψ ∈ Φy by

φ⊗ ψ = τx→x∨y(φ) ∩ τy→x∨y(ψ).

Projection finally is defined for φ ∈ Φy and x ≤ y as

φ↓x(φ) = θ ∈ x : τx→y(θ) ∩ φ 6= ∅.

Let’s call this information algebra (Φ,F).Consider now random mappings Γ : Ω → Φx from a probability space

(Ω,A, P ) into Φx for some frame x ∈ F . Such a mapping Γ is labeled withd(Γ) = x. These random mappings inherit as usual the structure of theunderlying information algebra, a labeled algebra tis time. So combinationof random mappings Γ′ and Γ′′ with d(Γ′) = x and d(Γ′′) = y results in themapping

(Γ′ ⊗ Γ′′)(ω) = Γ′(ω)⊗ Γ′′(ω).

with domain x ∨ y. Similarly, if Γ is a random mapping with domain y andx ≤ y, then the projection of Γ to x is a random mapping

Γ↓x(ω) = (Γ(ω))↓x

Page 285: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

30.3. RANDOM MAPPINGS INTO SET ALGEBRAS 285

with domain x. The set of these labeled random mappings R clearly formstogether with the lattice of frames F a labeled information algebra (R,F).

Now, as usual, a random mapping Γ ∈ R is considered as some uncertaininformation about the elements of Φ. Assume d(Γ) = x and consider anelement φ ∈ Φy. What is the support allocated by Γ to φ? An elementω ∈ Ω supports φ, given the mapping Γ, if (Γ(ω))→y implies φ, that is,(Γ(ω))→y ⊆ φ. Remember that ψ→y = (ψ↑x∨y)↓y. Now, ψ→y ⊆ φ if, andonly if, ψ↑x∨y ⊆ φ↑x∨y, which in turn holds if, and only if ψ ⊆ φ→x. So, itfollows also that (Γ(ω))→y ⊆ φ if, and only if Γ(ω) ⊆ φ→x. Therefore, weobtain for the allocation of support induced by Γ,

sΓ(φ) = ω ∈ Ω : Γ(ω) ⊆ φ→x = ω ∈ Ω : (Γ(ω))→x ⊆ φ,

if d(Γ) = x. Similarly, an element ω ∈ Ω allows for a φ ∈ Φy, if it does notsupport φc. Hence the allowment of possibility associated with Γ is

pΓ(φ) = scΓ(φc) = ω ∈ Ω : (Γ(ω))→x ⊆ φcc

= ω ∈ Ω : (Γ(ω))→x ∩ φ 6= ∅.

Just as we may pass from the labeled information algebra (Φ,F) to theassociated domain-free information algebra (Φ/σ,F) by the congruence σdefined as φ ≡σ ψ if d(φ) = x, d(ψ) = y and φ↑x∨y = ψ↑x∨y (see Chapter 3,Part I), we may do the same with the labeled algebra (R,F). Remind alsothat φ ≡σ ψ if, and only if φ = ψ→x and φ→y = ψ. So, we define Γ′ ≡σ Γ′′ ifd(Γ′) = x, d(Γ′′) = y and Γ′↑x∨y = Γ′′↑x∨y, or also Γ′ = Γ′′→x and Γ′→y = Γ′′.This leads then to the domain-free information algebra (R/σ,F).

Then, to the equivalence class [Γ]σ of random mappings we may assigna random mapping [Γ]σ : Ω→ Φ/σ, defined by

[Γ]σ(ω) = [Γ(ω)]σ.

This is well defined since Γ′ ≡σ Γ′′ implies Γ′(ω) ≡σ Γ′′(ω) for all ω ∈ Ω.Now, it becomes evident that for a given random mapping Γ two elementsφ and ψ have the same support if φ ≡σ ψ and further, if Γ′ ≡σ Γ′′, thenan element φ has the same support induced by Γ′ as by Γ′′. So, we obtainfinally for any Γ ∈ R and φ ∈ Φ, that

sΓ(φ) = s[Γ]σ([φ]σ), pΓ(φ) = p[Γ]σ([φ]σ)

In particular, degrees of support and degrees of plausibility are the same forequivalent elements φ and ψ and are further identical for equivalent randommappings. So, the theories of random mappings in its labeled and domain-free versions are equivalent and we may pass at convenience from one to theother.

Page 286: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

286 CHAPTER 30. RANDOM SETS

Page 287: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

Chapter 31

Systems of IndependentSources

287

Page 288: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

288 CHAPTER 31. SYSTEMS OF INDEPENDENT SOURCES

Page 289: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

Chapter 32

Probabilistic ArgumentationSystems

32.1 Propositional Argumentation

32.2 Predicate Logic and Uncertainty

32.3 Stochastic Linear Systems

289

Page 290: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

290 CHAPTER 32. PROBABILISTIC ARGUMENTATION SYSTEMS

Page 291: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

Chapter 33

Random Mappings asInformation

33.1 Information Order

33.2 Information Measure

33.3 Gaussian Information

291

Page 292: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof

292 CHAPTER 33. RANDOM MAPPINGS AS INFORMATION

Page 293: Lecture Notes The Algebraic Theory of Information - …diuf.unifr.ch/drupal/tns/sites/diuf.unifr.ch.drupal.tns/files/file/...Lecture Notes The Algebraic Theory of Information Prof
