analogical proportions

arXiv:2006.02854v5 [cs.LO] 17 Apr 2021


ANALOGICAL PROPORTIONS

CHRISTIAN ANTIĆ

Abstract. Analogy-making is at the core of human intelligence and creativity with applications to such diverse tasks as commonsense reasoning, learning, language acquisition, and story telling. This paper contributes to the foundations of artificial general intelligence by introducing from first principles an abstract algebraic framework of analogical proportions of the form ‘a is to b what c is to d’ in the general setting of universal algebra. This enables us to compare mathematical objects possibly across different domains in a uniform way which is crucial for AI-systems. The main idea is to define solutions to analogical equations in terms of maximal sets of algebraic justifications, which amounts to deriving abstract terms of concrete elements from a ‘known’ source domain which can then be instantiated in an ‘unknown’ target domain to obtain analogous elements. It turns out that our notion of analogical proportions has appealing mathematical properties. For example, we show that analogical proportions preserve functional dependencies across different domains, which is desirable. We extensively compare our framework with two prominent and recently introduced frameworks of analogical proportions from the literature in the concrete domains of sets, numbers, and words, and we show that in each case we either disagree with the notion from the literature justified by some plausible counter-examples or we can show that our model yields strictly more reasonable solutions. This provides evidence for its applicability. In a broader sense, this paper is a first step towards a theory of analogical reasoning and learning systems with potential applications to fundamental AI-problems like commonsense reasoning and computational learning and creativity.

1. Introduction

Analogy-making is at the core of human intelligence and creativity with applications to such diverse tasks as commonsense reasoning, learning, language acquisition, and story telling (see, e.g., Hofstadter (2001), Hofstadter and Sander (2013), Gust, Krumnack, Kühnberger, and Schwering (2008), Boden (1998), Sowa and Majumdar (2003), Winston (1980), and Wos (1993)). This paper contributes to the foundations of artificial general intelligence by introducing from first principles an abstract algebraic framework of analogical proportions of the form ‘a is to b what c is to d’ in the general setting of universal algebra. This enables us to compare mathematical objects possibly across different domains in a uniform way which is crucial for AI-systems. The main idea is simple and is illustrated in the following example.


Example 1. Imagine two domains, one consisting of the positive integers $1, 2, \ldots$ and the other made up of words $ab, ba, \ldots$ et cetera. The analogical equation

$$2 : 4 :: ab : z \tag{1}$$

is asking for some word $z$ (here $z$ is a variable) which is to $ab$ what 4 is to 2. What can be said about the relationship between 2 and 4? One simple observation is that 4 is the square of 2. Now, by analogy, what is the ‘square’ of $ab$? If we interpret ‘multiplication’ of words as concatenation (a natural choice), then $(ab)^2$ is the word $abab$, which is a plausible solution to (1). We can state this more formally as follows. Let $s(z) := z$ and $t(z) := z^2$ be two terms. We have

$$2 = s(2), \quad 4 = t(2), \quad\text{and}\quad ab = s(ab). \tag{2}$$

By continuing the pattern in (2), what could $z$ in (1) be equal to? In (2), we see that transforming 2 into 4 means transforming $s(2)$ into $t(2)$. Now what does it mean to transform $ab$ ‘in the same way’ or ‘analogously’? The obvious answer is to transform $s(ab)$ into the solution $t(ab) = abab$ computed before. As a formal solution to (1), this yields the analogical proportion between numbers and words given by

$$2 : 4 :: ab : abab.$$

As simple as this line of reasoning may seem, it cannot be formalized by current models of analogical proportions, which restrict themselves to proportions between objects of a single domain (cf. Stroppa and Yvon (2006) and Miclet, Bayoudh, and Delhay (2008)). We will return to this specific analogical proportion in a more formal manner in Example 14.
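The reasoning of Example 1 can be mimicked in a few lines of Python. This is only an informal sketch: the function names `square_number`, `square_word`, and `solve_by_transfer` are hypothetical, and a single hand-picked rule $z \to z^2$ stands in for the general framework developed later in the paper.

```python
# Illustrative sketch: the rewrite rule z -> z * z read in two different domains.

def square_number(n: int) -> int:
    """t(z) = z * z, with '*' read as multiplication of integers."""
    return n * n

def square_word(w: str) -> str:
    """t(z) = z * z, with '*' read as concatenation of words."""
    return w + w

def solve_by_transfer(a, b, c, t_source, t_target):
    """Return t_target(c) if t_source maps a to b, otherwise None."""
    return t_target(c) if t_source(a) == b else None

result = solve_by_transfer(2, 4, "ab", square_number, square_word)
print(result)  # abab
```

The helper returns `None` when the chosen rule does not relate `a` to `b` in the source domain, which is the informal counterpart of the rule not being a justification.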

The rest of the paper is devoted to formalizing and studying reasoning patterns as in the example above within the abstract algebraic setting of universal algebra. We extensively compare our framework with two prominent and recently introduced frameworks of analogical proportions from the literature, namely Stroppa and Yvon (2006)’s and Miclet et al. (2008)’s, within the concrete domains of sets, numbers, and words, and in each case we either disagree with the notion from the literature justified by some plausible counter-examples or we can show that our model yields strictly more reasonable solutions, which provides evidence for its applicability.

The aim of this paper is to introduce our model of analogical proportions, which to the best of our knowledge is original, in its full generality. The core idea is formulated in Definition 7 and despite its simplicity it has interesting consequences with mathematically appealing proofs, which we plan to explore further in the future. Since ‘plausible analogical proportion’ is an informal concept, we cannot hope to formally prove the soundness and completeness of our framework. The best we can do is to prove that desirable proportions are derivable within our framework (e.g. Theorem 2) and that ‘obviously implausible’ proportions cannot be derived (e.g. Theorem 3 and Example 18).


The rest of the paper is structured as follows. The next section is introductory and recalls some basic concepts of universal algebra. Section 3, the main section of the paper, introduces analogical equations and proportions based on maximal sets of algebraic justifications. Section 4 studies some elementary properties of analogical proportions. Specifically, we show that analogical proportions preserve functional dependencies (Theorem 2). Moreover, we discuss Lepage (2003)’s axioms and argue why we agree with symmetry, determinism, and (strong) reflexivity, while we disagree with his exchange of the means and strong determinism axioms (Theorem 3). Sections 5, 6, and 7 compare our framework with Stroppa and Yvon (2006)’s and Miclet et al. (2008)’s models in the concrete domains of sets, numbers, and words, respectively, and in each case we either disagree with the notion from the literature justified by some plausible counter-examples (Example 18) or we can show that our model yields strictly more reasonable solutions (Theorems 5, 11, 14), which provides evidence for its applicability. Section 8 briefly discusses some further related work, most notably Dastani, Indurkhya, and Scha (2003)’s framework of word proportions, and in Example 42 we discuss a simple proportion which requires heavy machinery within Dastani et al. (2003)’s model. Section 9 concludes the paper with a brief discussion of future work.

2. Preliminaries

Given a positive integer $n$, we define $[1, n] := \{1, \ldots, n\}$. Given any sequence of objects $o = o_1 \ldots o_n$, $n \geq 0$, we denote the length $n$ of $o$ by $|o|$. We denote the powerset of a set $U$ by $P(U)$. The natural numbers are denoted by $\mathbb{N} := \{0, 1, 2, \ldots\}$, the integers are denoted by $\mathbb{Z}$, and the rational numbers are denoted by $\mathbb{Q}$. Moreover, the booleans are denoted by¹ $\mathrm{BOOL} := \{0, 1\}$ with conjunction $0 \wedge 0 := 1 \wedge 0 := 0 \wedge 1 := 0$ and $1 \wedge 1 := 1$, and disjunction $0 \vee 0 := 0$ and $1 \vee 0 := 0 \vee 1 := 1 \vee 1 := 1$. Given a finite alphabet $\Sigma$, we denote the set of all finite words over $\Sigma$ containing the empty word $\varepsilon$ by $\Sigma^*$ and we define $\Sigma^+ := \Sigma^* - \{\varepsilon\}$.

2.1. Universal Algebra. We recall some basic notions and notations of universal algebra (see e.g. Burris and Sankappanavar (2000)).

2.1.1. Syntax. A language of algebras $L$ consists of a set $\mathrm{Fs}_L$ of function symbols, a set $\mathrm{Cs}_L$ of constant symbols, a rank function $rk : \mathrm{Fs}_L \to \mathbb{N}$, and a denumerable set $V = \{z_1, z_2, \ldots\}$ of variables. The sets $\mathrm{Fs}_L$, $\mathrm{Cs}_L$, and $V$ are pairwise disjoint. Moreover, we always assume that $L$ contains the equality relation symbol $=$ interpreted as the equality relation in every algebra. An $L$-expression is any finite string of symbols from $L$. An $L$-atomic term is either a variable or a constant symbol and we denote the set of all $L$-atomic terms by $\mathrm{aTm}_L$. The set $\mathrm{Tm}_L$ of $L$-terms is the smallest set of $L$-expressions such that (i) every $L$-atomic term is an $L$-term; and (ii) for any $L$-function symbol $f$ and any $L$-terms $t_1, \ldots, t_{rk(f)}$, $f(t_1, \ldots, t_{rk(f)})$ is an $L$-term. We denote the set of variables occurring in a term $t$ by $V(t)$.

¹ We do not denote the booleans by $B$, as this symbol is reserved in Section 3 and beyond to denote a generic target domain.

2.1.2. Semantics. An $L$-algebra $\mathfrak{A}$ consists of (i) a non-empty set $A$, the universe of $\mathfrak{A}$; (ii) for each $f \in \mathrm{Fs}_L$, a function $f^{\mathfrak{A}} : A^{rk(f)} \to A$, the functions of $\mathfrak{A}$; and (iii) for each $c \in \mathrm{Cs}_L$, an element $c^{\mathfrak{A}} \in A$, the distinguished elements of $\mathfrak{A}$.

Notation 2. Given a subset $A'$ of the universe of $\mathfrak{A}$, the language $L(A')$ is the language $L$ augmented by a constant symbol $a$ for each element $a \in A'$.

Notation 3. With a slight abuse of notation, we will not distinguish between an $L$-algebra $\mathfrak{A}$ and its universe $A$ in case the operations are understood from the context. This means we will write $a \in \mathfrak{A}$ instead of $a \in A$ et cetera.

For any $L$-algebra $A$, an $A$-assignment is a function $\nu : V \to A$. For any assignment $\nu$, let $\nu_{z \mapsto a}$ denote the assignment $\nu'$ such that $\nu'(z) := a$, and for all other variables $z'$, $\nu'(z') := \nu(z')$. For any $L$-structure $A$ and any $A$-assignment $\nu$, (i) for every variable $z \in V$, $z^A[\nu] := \nu(z)$; (ii) for every $c \in \mathrm{Cs}_L$, $c^A[\nu] := c^A$; (iii) for every $f \in \mathrm{Fs}_L$ and $t_1, \ldots, t_{rk(f)} \in \mathrm{Tm}_L$,

$$f(t_1, \ldots, t_{rk(f)})^A[\nu] := f^A\bigl(t_1^A[\nu], \ldots, t_{rk(f)}^A[\nu]\bigr).$$

Notice that every term $t$ induces a function $t^A : A^{|V(t)|} \to A$ given by

$$t^A(a_1, \ldots, a_{|V(t)|}) := t^A[\nu_{(a_1, \ldots, a_{|V(t)|})}],$$

where $\nu_{(a_1, \ldots, a_{|V(t)|})}(z_i) := a_i$, for all $i \in [1, |V(t)|]$. Given an $L$-algebra $A$, an $A$-term is an $L$-term which may contain distinguished elements of $A$ as constant symbols with the obvious interpretation. We denote the set of all $A$-terms with variables among $\mathbf{z} = z_1, \ldots, z_n$, $n \geq 0$, by $A[\mathbf{z}]$.

Notation 4. By convention, every term in $A[\mathbf{z}]$ must contain all variables in $\mathbf{z}$.

For instance, $2z + 1$ is a term in $(\mathbb{N}, +, 1)[z]$, whereas $2z^2 + 1$ is not, as $z^2$ requires multiplication.² We call a term $t$ constant in $A$ iff $t^A$ is a constant function, and we call $t$ injective in $A$ iff $t^A$ is an injective function. For instance, the term $t(z) = 0z \in (\mathbb{N}, \cdot, 0)[z]$ is constant in $(\mathbb{N}, \cdot, 0)$ despite containing the variable $z$. Terms can be interpreted as ‘generalized elements’ containing variables as placeholders for concrete elements, and they will play a central role in our algebraic formulation of analogical proportions given below.

² Of course, $2z$ and $z^2$ are abbreviations of $z + z$ and $z \cdot z$, respectively.
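The inductive definition of term evaluation under an assignment can be sketched in Python. The encoding below is an assumption made for illustration only: variables are strings, distinguished elements are wrapped as `("const", value)`, and applications are tuples headed by a function symbol; none of these names come from the paper.

```python
from typing import Callable, Dict, Union

# A term is a variable name (str), a constant ("const", value),
# or an application (function_symbol, subterm_1, ..., subterm_k).
Term = Union[str, tuple]

def evaluate(term: Term, functions: Dict[str, Callable], nu: Dict[str, object]):
    """Compute the value of `term` under the assignment `nu`."""
    if isinstance(term, str):            # clause (i): a variable is looked up in nu
        return nu[term]
    if term[0] == "const":               # clause (ii): a distinguished element
        return term[1]
    symbol, *args = term                 # clause (iii): a function application
    return functions[symbol](*(evaluate(arg, functions, nu) for arg in args))

def variables(term: Term) -> set:
    """The set V(t) of variables occurring in `term`."""
    if isinstance(term, str):
        return {term}
    if term[0] == "const":
        return set()
    return set().union(*(variables(arg) for arg in term[1:]))

# 2z + 1 written out as (z + z) + 1 over the natural numbers with addition.
nat_plus = {"+": lambda x, y: x + y}
two_z_plus_one = ("+", ("+", "z", "z"), ("const", 1))
print(evaluate(two_z_plus_one, nat_plus, {"z": 3}))  # 7

# 0 * z is constant although it contains the variable z.
nat_times = {"*": lambda x, y: x * y}
zero_times_z = ("*", ("const", 0), "z")
print({evaluate(zero_times_z, nat_times, {"z": n}) for n in range(10)})  # {0}
```

Checking constancy on `range(10)` only samples finitely many natural numbers; it illustrates the notion and does not prove it.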


3. Analogical Proportions

In the rest of the paper, we may assume some ‘known’ source domain $A$ and some ‘unknown’ target domain $B$, both $L$-algebras of the same language $L$. We may think of the source domain $A$ as our background knowledge (a repertoire of elements we are familiar with), whereas $B$ stands for an unfamiliar domain which we want to explore via analogical transfer from $A$. For this we will consider analogical equations, which are expressions of the form ‘$a$ is to $b$ what $c$ is to $z$’ (in symbols, $a : b :: c : z$), where $a$ and $b$ are elements of $A$, $c$ is an element of $B$, and $z$ is a variable. Solutions to analogical equations will be elements of $B$ which are to $c$ in $B$ what $a$ is to $b$ in $A$ in a mathematically precise way (Definition 7). Specifically, we want to functionally relate elements of an algebra via term rewrite rules as follows. Recall from Example 1 that transforming 2 into 4 in the algebra $(\mathbb{N}, \cdot)$ of non-negative integers with multiplication means transforming $s(2)$ into $t(2)$,³ where $s(z) := z$ and $t(z) := z^2$ are terms. We can state this transformation more pictorially as the term rewrite rule $s \to t$. Now transforming the word $ab$ ‘in the same way’ means to transform $s(ab)$ into $t(ab)$, which again is an instance of $s \to t$. Let us make this notation official.

Notation 5. We will always write $s(\mathbf{z}) \to t(\mathbf{z})$ or $s \to t$ instead of $(s, t)$, for any pair of $L$-terms $s$ and $t$ containing the same variables among $\mathbf{z}$.

The above explanation motivates the following definition.

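Definition 6. Define the set of justifications of two elements $a, b \in A$ in $A$ by⁴

$$\mathrm{Jus}_A(a, b) := \left\{ s \to t \in A[\mathbf{z}]^2 \;\middle|\; a = s^A(\mathbf{e}) \text{ and } b = t^A(\mathbf{e}), \text{ for some } \mathbf{e} \in A^{|\mathbf{z}|} \right\}.$$

For instance, in the example above, $\mathrm{Jus}_{(\mathbb{N}, \cdot)}(2, 4)$ and $\mathrm{Jus}_{(\{a,b\}^*, \cdot)}(ab, abab)$ both contain the justification $z \to z^2$, for $e_1 := 2 \in \mathbb{N}$ and $e_2 := ab \in \{a, b\}^*$. We are now ready to introduce the main notion of the paper.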

Definition 7. An analogical equation in $(A, B)$ is an expression of the form ‘$a$ is to $b$ what $c$ is to $z$’, in symbols,

$$a : b :: c : z, \tag{3}$$

where $a$ and $b$ are source elements from $A$, $c$ is a target element from $B$, and $z$ is a variable. Given a target element $d \in B$, define the set of justifications of $a : b :: c : d$ in $(A, B)$ by

$$\mathrm{Jus}_{(A,B)}(a : b :: c : d) := \mathrm{Jus}_A(a, b) \cap \mathrm{Jus}_B(c, d).$$

We say that $d \in B$ is a solution to (3) in $(A, B)$ iff $\mathrm{Jus}_{(A,B)}(a : b :: c : d)$ is a subset maximal set of justifications with respect to $b$ and $d$, that is, iff for any elements $b' \in A$ and $d' \in B$,

$$\mathrm{Jus}_{(A,B)}(a : b :: c : d) \subseteq \mathrm{Jus}_{(A,B)}(a : b' :: c : d')$$

implies

$$\mathrm{Jus}_{(A,B)}(a : b' :: c : d') \subseteq \mathrm{Jus}_{(A,B)}(a : b :: c : d).$$

In this case, we say that $a, b, c, d$ are in analogical proportion in $(A, B)$, written as

$$(A, B) \models a : b :: c : d.$$

³ To be more precise, we transform $s^{(\mathbb{N}, \cdot)}(2)$ into $t^{(\mathbb{N}, \cdot)}(2)$.

⁴ It is important to emphasize that both $s$ and $t$ contain all variables $\mathbf{z}$ by Notation 4.

Notation 8. We will always write A instead of (A,A) et cetera.

Roughly, an element $d$ in the target domain is a solution to an analogical equation of the form $a : b :: c : z$ iff there is no other target element $d'$ whose relation to $c$ is more similar to the relation between $a$ and $b$ in the source domain expressed in terms of maximal sets of algebraic justifications. Analogical equations formalize the idea that analogy-making is the task of transforming different objects from the source to the target domain in ‘the same way’;⁵ or as Pólya (1954) puts it:

Two systems are analogous if they agree in clearly definable relations of their respective parts.

In our formulation, the ‘parts’ are the elements $a, b, c, d$ and the ‘definable relations’ are represented by term rewrite rules relating $a, b$ and $c, d$ in ‘the same way’ via maximal sets of justifications.

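Notation 9. Notice that any justification $s(\mathbf{z}) \to t(\mathbf{z})$ of $a : b :: c : d$ in $(A, B)$ must satisfy

$$a = s^A(\mathbf{e}_1) \text{ and } b = t^A(\mathbf{e}_1) \text{ and } c = s^B(\mathbf{e}_2) \text{ and } d = t^B(\mathbf{e}_2), \tag{4}$$

for some $\mathbf{e}_1 \in A^{|\mathbf{z}|}$ and $\mathbf{e}_2 \in B^{|\mathbf{z}|}$. We sometimes write $s \xrightarrow{\mathbf{e}_1 \to \mathbf{e}_2} t$ to make the witnesses $\mathbf{e}_1, \mathbf{e}_2$ and their transition explicit. This situation can be depicted as follows:

[Diagram: the proportion $a : b :: c : d$, with $a$ and $c$ obtained from $s(\mathbf{z})$ and $b$ and $d$ obtained from $t(\mathbf{z})$ via the substitutions $\mathbf{z}/\mathbf{e}_1$ (for $a$ and $b$) and $\mathbf{z}/\mathbf{e}_2$ (for $c$ and $d$).]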
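Condition (4) can be checked mechanically when witnesses are searched in finite slices of the two universes. The following sketch is restricted to rules with a single variable; the function name `witnesses` and the finite ranges are assumptions for illustration and are not part of the paper's framework.

```python
from itertools import product

def witnesses(s_A, t_A, s_B, t_B, a, b, c, d, universe_A, universe_B):
    """All pairs (e1, e2) with a = s_A(e1), b = t_A(e1), c = s_B(e2), d = t_B(e2)."""
    left = [e for e in universe_A if s_A(e) == a and t_A(e) == b]
    right = [e for e in universe_B if s_B(e) == c and t_B(e) == d]
    return list(product(left, right))

# Finite slices of the natural numbers and of the words over {a, b}.
numbers = range(0, 10)
words = ["".join(w) for n in range(3) for w in product("ab", repeat=n)]

identity = lambda z: z
# The rule z -> z * z: multiplication on numbers, concatenation on words.
found = witnesses(identity, lambda z: z * z, identity, lambda z: z + z,
                  2, 4, "ab", "abab", numbers, words)
print(found)  # [(2, 'ab')]
```

An empty result only means that no witness exists inside the searched slices; for infinite universes this is evidence, not a proof, that the rule fails to justify the proportion.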

⁵ This is why ‘copycat’ is the name of a prominent model of analogy-making (Hofstadter & Mitchell, 1995). See Correa, Prade, and Richard (2012).

Example 10. Consider the analogical equation

$$2 : 4 :: 3 : z.$$

We can transform 2 into 4 in at least three different ways justified by $z \to 2 + z$, $z \to 2z$, and $z \to z^2$. Here it is important to clarify the algebras involved. The first two justifications require addition, whereas the last justification requires multiplication. Moreover, the first justification additionally presupposes that 2 is a distinguished element; this is not the case for the last two justifications as $2z$ and $z^2$ are abbreviations for $z + z$ and $z \cdot z$, respectively, not involving 2. Analogously, transforming 3 ‘in the same way’ as 2 can therefore mean at least three things: $3 \to 2 + 3 = 5$, $3 \to 3 + 3 = 6$, and $3 \to 3^2 = 9$. More precisely, $z \to 2 + z$ is a justification of $2 : b :: 3 : d$ in $(\mathbb{N}, +, 2)$ iff $b = 4$ and $d = 5$, which shows that $\mathrm{Jus}_{(\mathbb{N},+,2)}(2 : 4 :: 3 : 5)$ is a subset maximal set of justifications with respect to the second and last argument. This formally proves

$$(\mathbb{N}, +, 2) \models 2 : 4 :: 3 : 5.$$

The other two cases being analogous, we can further derive

$$(\mathbb{N}, +) \models 2 : 4 :: 3 : 6 \quad\text{and}\quad (\mathbb{N}, \cdot) \models 2 : 4 :: 3 : 9.$$
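The maximality test of Definition 7 can be illustrated on this example with a bounded search. The sketch below is deliberately crude and rests on loud assumptions: it uses a hand-picked pool of three rules that mixes addition and multiplication (whereas the example treats three separate algebras), and it searches witnesses and candidates only in a finite slice of the natural numbers. It therefore shows how the subset comparison works, not what the solutions are in any particular algebra.

```python
from itertools import product

# Hypothetical finite pool of unary rewrite rules s -> t, each given as a pair
# of Python functions; the dictionary keys are only labels.
RULES = {
    "z -> 2 + z": (lambda z: z, lambda z: 2 + z),
    "z -> z + z": (lambda z: z, lambda z: z + z),
    "z -> z * z": (lambda z: z, lambda z: z * z),
}
UNIVERSE = range(0, 21)   # finite slice of N used for witnesses and candidates

def jus(a, b):
    """Labels of the pool rules s -> t with a = s(e) and b = t(e) for some e."""
    return {name for name, (s, t) in RULES.items()
            if any(s(e) == a and t(e) == b for e in UNIVERSE)}

def jus_proportion(a, b, c, d):
    """Justifications of a : b :: c : d, i.e. Jus(a, b) intersected with Jus(c, d)."""
    return jus(a, b) & jus(c, d)

def is_solution(a, b, c, d):
    """Maximality test of Definition 7, relative to RULES and UNIVERSE only."""
    current = jus_proportion(a, b, c, d)
    # d is rejected as soon as some (b2, d2) has a strictly larger justification set.
    return all(not (current < jus_proportion(a, b2, c, d2))
               for b2, d2 in product(UNIVERSE, repeat=2))

solutions = [d for d in UNIVERSE if is_solution(2, 4, 3, d)]
print(solutions)  # [5, 6, 9]
```

Every other candidate has an empty justification set in this pool, which is strictly contained in, for instance, the set for $2 : 4 :: 3 : 5$, so it fails the test.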

4. Properties of Analogical Proportions

This section studies some basic mathematical properties of analogical equations and proportions.

4.1. Characteristic Justifications. Computing all justifications of an analogical proportion is difficult in general; fortunately, this can be avoided in many cases.

Definition 11. We call a set $J$ of justifications a characteristic set of justifications of $a : b :: c : d$ in $(A, B)$ iff $J$ is a sufficient and necessary set of justifications of $a : b :: c : d$ in $(A, B)$, that is, iff

$$J \subseteq \mathrm{Jus}_{(A,B)}(a : b :: c : d) \;\Leftrightarrow\; (A, B) \models a : b :: c : d. \tag{5}$$

In case $J = \{s \to t\}$ is a singleton set satisfying (5), we call $s \to t$ a characteristic justification of $a : b :: c : d$ in $(A, B)$. Moreover, we say that $J$ is a trivial set of justifications in $(A, B)$ iff every justification in $J$ justifies every proportion $a : b :: c : d$ in $(A, B)$, that is, iff

$$J \subseteq \mathrm{Jus}_{(A,B)}(a : b :: c : d) \quad\text{for all } a, b \in A \text{ and } c, d \in B.$$

In this case, we call every justification in $J$ a trivial justification in $(A, B)$. We say that $a : b :: c : d$ is a trivial proportion in $(A, B)$ iff $(A, B) \models a : b :: c : d$ and $\mathrm{Jus}_{(A,B)}(a : b :: c : d)$ consists only of trivial justifications.⁶

⁶ See Examples 16 and 22.

Remark 12. Notice that the empty set is always a trivial set of justifications. In some cases, given an analogical equation $a : b :: c : z$, the set $\mathrm{Jus}_{(A,B)}(a : b :: c : d)$ of justifications of $a : b :: c : d$ in $(A, B)$ is empty, for any $b \in A$ and $d \in B$, in which case we trivially have $(A, B) \models a : b :: c : d$. This is, for example, the case in any structure $(A)$, consisting only of a universe $A$ without any functions on $A$: given distinct elements $a, b, c, d \in A$, we always have $\mathrm{Jus}_{(A)}(a : b :: c : d) = \emptyset$ and hence $a : b :: c : d$ is a trivial proportion in $(A)$.

The following lemma is a useful characterization of characteristic justifications in terms of injectivity.

Lemma 1. For any justification $s(\mathbf{z}) \to t(\mathbf{z})$ of $a : b :: c : d$ in $(A, B)$, if there are unique $\mathbf{e}_1 \in A^{|\mathbf{z}|}$ and $\mathbf{e}_2 \in B^{|\mathbf{z}|}$ such that

$$a = s^A(\mathbf{e}_1) \quad\text{and}\quad c = s^B(\mathbf{e}_2),$$

then $s \to t$ is a characteristic justification of $a : b :: c : d$ in $(A, B)$.

Proof. Since $s(\mathbf{z}) \to t(\mathbf{z})$ is a justification of $a : b :: c : d$ in $(A, B)$ by assumption, there are sequences of elements $\mathbf{e}_1 \in A^{|\mathbf{z}|}$ and $\mathbf{e}_2 \in B^{|\mathbf{z}|}$ satisfying (4), where $\mathbf{e}_1$ and $\mathbf{e}_2$ are uniquely determined by assumption. Consequently, given any elements $b' \in A$ and $d' \in B$, $s \to t$ is a justification of $a : b' :: c : d'$ in $(A, B)$ iff $b' = t^A(\mathbf{e}_1) = b$ and $d' = t^B(\mathbf{e}_2) = d$, which shows that $s \to t$ is indeed a characteristic justification. □

4.2. Functional Dependencies. The following reasoning pattern, which roughly says that functional dependencies are preserved across (different) domains, will often be used in the rest of the paper.

Theorem 2. For any $L$-term $t(z)$,⁷ we have

$$(A, B) \models a : t^A(a) :: c : t^B(c), \quad\text{for all } a \in A \text{ and } c \in B.$$

Proof. The justification $z \to t(z)$ is a characteristic justification of $a : t^A(a) :: c : t^B(c)$ in $(A, B)$ by Lemma 1 as $z$ is injective in $A$ and $B$. □

Remark 13. It is important to emphasize that in Theorem 2, the $L$-term $t(z)$ must contain the variable $z$ (Notation 4). Otherwise $t(z) := b$ would characteristically justify the analogical proportion $a : b :: c : b$, for any distinguished element $b$, which is implausible (but see Remark 23).

Example 14. We want to formally solve the analogical equation (1) of Example 1 given by

$$2 : 4 :: ab : z.$$

For this, we first need to specify the algebras involved. Let $L$ be the language consisting of a single binary function symbol $\cdot$, and let $(\mathbb{N}, \cdot^{\mathbb{N}})$ and $(\Sigma^*, \cdot^{\Sigma^*})$, where $\Sigma := \{a, b\}$, be $L$-algebras. This means we interpret $\cdot$ as multiplication of numbers in $\mathbb{N}$ and as concatenation of words in $\Sigma^*$. As a direct consequence of Theorem 2 with $t(z) := z \cdot z$, we can formally derive the solution $abab$ to (1):

$$((\mathbb{N}, \cdot), (\Sigma^*, \cdot)) \models 2 : 4 :: ab : abab.$$

⁷ Recall from Notation 4 that $t(z)$ must contain the variable $z$. So, for instance, $t(z)$ cannot be a constant symbol (see Remark 13).

4.3. Lepage’s Axioms. Lepage (2003) proposes the following axioms (cf. Miclet et al. (2008, p. 797)) as a guideline for formal models of analogical proportions within a single domain,⁸ adapted here to our framework formulated above:

$$A \models a : b :: c : d \;\Leftrightarrow\; A \models c : d :: a : b \quad\text{(symmetry)}, \tag{6}$$

$$A \models a : b :: c : d \;\Leftrightarrow\; A \models a : c :: b : d \quad\text{(exchange of the means)}, \tag{7}$$

$$A \models a : a :: c : d \;\Rightarrow\; d = c \quad\text{(strong determinism)}, \tag{8}$$

$$A \models a : b :: a : d \;\Rightarrow\; d = b \quad\text{(strong reflexivity)}. \tag{9}$$

We add to the above list the axioms

$$A \models a : a :: c : c \quad\text{(determinism)}, \tag{10}$$

and

$$A \models a : b :: a : b \quad\text{(reflexivity)}. \tag{11}$$

Symmetry, reflexivity, strong reflexivity, and determinism are plausible and we prove below that they are satisfied within our framework. On the other hand, we disagree with Lepage’s exchange of the means and strong determinism axioms, justified as follows.

Theorem 3. Definition 7 implies (6), (9), (10), (11), and it implies neither (7) nor (8).

Proof. Symmetry is an immediate consequence of

$$\mathrm{Jus}_{(A,B)}(a : b :: c : d) = \mathrm{Jus}_{(A,B)}(c : d :: a : b)$$

and the fact that for $a, b, c, d$ to be in analogical proportion in $(A, B)$ the set of justifications of $a : b :: c : d$ in $(A, B)$ needs to be subset maximal with respect to $b$ and $d$.

Next, we prove strong reflexivity. For this, first notice that we have

$$a \to b \in \mathrm{Jus}_A(a : b :: a : b) \quad\text{and}\quad a \to b \notin \mathrm{Jus}_A(a : b :: a : d), \quad\text{for all } d \neq b.$$

Moreover, we have

$$\mathrm{Jus}_A(a : b :: a : d) = \mathrm{Jus}_A(a, b) \cap \mathrm{Jus}_A(a, d) \subseteq \mathrm{Jus}_A(a, b) = \mathrm{Jus}_A(a : b :: a : b).$$

Hence, we have

$$\mathrm{Jus}_A(a : b :: a : d) \subsetneq \mathrm{Jus}_A(a : b :: a : b), \quad\text{for all } d \neq b,$$

which shows that in case $A \models a : b :: a : d$, we must have $d = b$.

Determinism is an immediate consequence of Theorem 2 with $t(z) := z$, while reflexivity is characteristically justified by $a \to b$.

Next, we disprove exchange of the means. On the one hand, Theorem 2 implies via $z \to z \vee 1$:

$$(\mathrm{BOOL}, \vee, 1) \models 1 : 1 :: 0 : 1. \tag{12}$$

⁸ Lepage (2003) formulates his axioms to hold in a single domain without any reference to an underlying structure $A$.

On the other hand, we prove

$$(\mathrm{BOOL}, \vee, 1) \not\models 1 : 0 :: 1 : 1 \tag{13}$$

as follows. Before we continue with the formal proof, first observe that disjunction is a monotone operation preserving the order $0 < 1$, and that the first proportion (12) preserves monotonicity ($1 \leq 1$ and $0 \leq 1$), whereas the second does not ($1 \not\leq 0$ whereas $1 \leq 1$). This is the intuition behind the following argument, which does not explicitly mention the ordering. We proceed by showing that every justification $s \to t$ of $1 : 0 :: 1 : 1$ justifies $1 : 0 :: 1 : 0$ in $(\mathrm{BOOL}, \vee, 1)$ as follows. Here $s$ and $t$ have the generic forms

$$s = s_1 \vee \ldots \vee s_m \quad\text{and}\quad t = t_1 \vee \ldots \vee t_n, \quad m, n \geq 1, \tag{14}$$

where each atomic term $s_i, t_i$ is either a variable or the boolean value 1 (we can safely exclude the neutral element 0 with respect to disjunction). By definition, we thus have, with $s := s^{(\mathrm{BOOL}, \vee, 1)}$ and $t := t^{(\mathrm{BOOL}, \vee, 1)}$,

$$1 = s(\mathbf{e}_1) \text{ and } 0 = t(\mathbf{e}_1) \text{ and } 1 = s(\mathbf{e}_2) \text{ and } 1 = t(\mathbf{e}_2),$$

for some sequences of boolean values $\mathbf{e}_1, \mathbf{e}_2$. Notice that $0 = t(\mathbf{e}_1)$ implies that 1 does not occur in $t$, which means that we can simplify $t$ as

$$t = z_1 \vee \ldots \vee z_n, \tag{15}$$

consisting only of variables. Recall from Notation 4 that $s$ and $t$ must contain the same variables, which means that $s$ is a term $s(\mathbf{z})$ containing the variables $\mathbf{z} := (z_1, \ldots, z_n)$. For $0 = t(\mathbf{e}_1)$ to be true, we must have $\mathbf{e}_1 = \mathbf{0} := (0, \ldots, 0)$ by (15). Now, for $s(\mathbf{e}_1) = s(\mathbf{0}) = 1$ to be true, $s$ must contain the boolean value 1, which means that we can rewrite $s$ as

$$s = s' \vee 1,$$

for some disjunction of atomic terms $s'$ (see (14)). Hence, we have

$$1 = s(\mathbf{0}).$$

The following figure summarizes the above situation and illustrates that $s \to t$ is indeed a justification of $1 : 0 :: 1 : 0$ in $(\mathrm{BOOL}, \vee, 1)$:

[Diagram: the proportion $1 : 0 :: 1 : 0$, with both occurrences of 1 obtained from $s(\mathbf{z}) = (s' \vee 1)(\mathbf{z})$ and both occurrences of 0 obtained from $t(\mathbf{z}) = z_1 \vee \ldots \vee z_n$ via the substitution $\mathbf{z}/\mathbf{0}$.]

Now, the fact that $1 \to 0$ is a justification of $1 : 0 :: 1 : 0$, but not of $1 : 0 :: 1 : 1$, finally proves

$$\mathrm{Jus}_{(\mathrm{BOOL}, \vee, 1)}(1 : 0 :: 1 : 1) \subsetneq \mathrm{Jus}_{(\mathrm{BOOL}, \vee, 1)}(1 : 0 :: 1 : 0),$$

which implies (13). The relations in (12) and (13) violate Lepage’s exchange of the means axiom.

Lastly, we disprove strong determinism. Consider the analogical equation in $(\mathbb{Z}, \cdot)$ given by

$$1 : 1 :: -1 : z. \tag{16}$$

One obvious solution to (16), consistent with Lepage’s strong determinism axiom, is $z = -1$, characteristically justified via $z \to z$ by Theorem 2. However, there is another solution to (16), justified again by Theorem 2 via $z \to z^2$, namely

$$(\mathbb{Z}, \cdot) \models 1 : 1 :: -1 : 1.$$

This analogical proportion, which is intuitively plausible given the justification $z \to z^2$, violates Lepage’s axiom of strong determinism. □
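The two solutions to (16) arise from applying the pattern of Theorem 2 with two different terms. The short sketch below only computes the resulting quadruples; the helper name `transfer` is hypothetical, and the code does not re-verify the maximality condition of Definition 7.

```python
def transfer(t, a, c):
    """Theorem 2 pattern: return the quadruple (a, t(a), c, t(c))."""
    return (a, t(a), c, t(c))

via_identity = transfer(lambda z: z, 1, -1)      # rule z -> z
via_square = transfer(lambda z: z * z, 1, -1)    # rule z -> z * z

print(via_identity)  # (1, 1, -1, -1)
print(via_square)    # (1, 1, -1, 1)
```

Both quadruples share the first three components $1 : 1 :: -1$ and differ in the last one, which is exactly what strong determinism forbids.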

5. Set Proportions

We now study analogical proportions between sets called set proportions.

Notation 15. In the rest of this section, let $L := \{\cap, \cup, .^c\}$ be the language of sets, interpreted in the usual way, let $U$ and $V$ be universes, and let

$$\mathfrak{A} := (P(U), \cap, \cup, .^c, P(U) \cap P(V)) \quad\text{and}\quad \mathfrak{B} := (P(V), \cap, \cup, .^c, P(U) \cap P(V))$$

be $L(P(U) \cap P(V))$-algebras containing the distinguished sets in $P(U) \cap P(V)$ as constants (Notation 2). Notice that in case $\mathfrak{A} = \mathfrak{B}$, every set in $\mathfrak{A}$ is a distinguished set. The empty set is always a distinguished set.

The following proposition summarizes some elementary properties of set proportions.

Proposition 4. The following set proportions hold in $(\mathfrak{A}, \mathfrak{B})$, for all $A \in P(U)$, $C \in P(V)$, and $B, E \in P(U) \cap P(V)$:

$$A : A^c :: C : C^c \tag{17}$$

$$A : A \cup E :: C : C \cup E \quad\text{and}\quad A : A \cap E :: C : C \cap E \tag{18}$$

$$A : A \cup C :: C : A \cup C \quad\text{and}\quad A : A \cap C :: C : A \cap C \quad\text{if } A, C \in P(U) \cap P(V) \tag{19}$$

$$A : U :: C : U \ \text{ if } U = V, \quad\text{and}\quad A : \emptyset :: C : \emptyset. \tag{20}$$

Moreover, in case $B \subseteq A$ and $B \subseteq C$, we further have the set proportion

$$A : B :: C : B.$$

Proof. All proportions are immediate consequences of Theorem 2. For example, the fourth line of proportions follows from Theorem 2 with $t_1(Z) := Z \cup U$ and $t_2(Z) := Z \cap \emptyset$, respectively, and the last line follows with $t_3(Z) := Z \cap B$. □
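The terms used in this proof can be evaluated on concrete sets. The following sketch assumes a small shared universe $U = V = \{a, b, c\}$ and a hypothetical choice of sets $A$, $C$, $E$; it merely computes $t(A)$ and $t(C)$ for each term and does not check maximality of justifications.

```python
# Hypothetical small universe with U = V, chosen only for illustration.
U = frozenset({"a", "b", "c"})
EMPTY = frozenset()

A = frozenset({"a"})
C = frozenset({"b", "c"})
E = frozenset({"a", "b"})   # a distinguished set, usable as a constant in terms

# Unary terms t(Z); each yields the quadruple A : t(A) :: C : t(C).
TERMS = {
    "complement": lambda Z: U - Z,            # line (17)
    "union_E": lambda Z: Z | E,               # line (18), left
    "intersect_E": lambda Z: Z & E,           # line (18), right
    "union_U": lambda Z: Z | U,               # line (20), left
    "intersect_empty": lambda Z: Z & EMPTY,   # line (20), right
}

def proportion(t, source, target):
    """Return (source, t(source), target, t(target))."""
    return (source, t(source), target, t(target))

PROPORTIONS = {name: proportion(t, A, C) for name, t in TERMS.items()}
for name, (a, b, c, d) in PROPORTIONS.items():
    print(name, sorted(a), sorted(b), sorted(c), sorted(d))
```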


Example 16. The terms

$$tr_1(X, Y) := (X \cap Y) \cup (X - Y) \quad\text{and}\quad tr_2(X, Y) := (X \cap Y) \cup (Y - X)$$

justify any set proportion $A : B :: C : D$, which shows that $tr_1 \xrightarrow{(A,B) \to (C,D)} tr_2$ is a trivial justification. This example shows that trivial justifications may contain useful information about the underlying structures: in this case, it encodes the trivial observation that any two sets $A$ and $B$ are symmetrically related via $A = (A \cap B) \cup (A - B)$ and $B = (A \cap B) \cup (B - A)$.
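The two identities behind this trivial justification can be confirmed exhaustively on a small universe. The three-element universe below is an arbitrary assumption; the identities themselves hold for all sets.

```python
from itertools import combinations

def powerset(universe):
    """All subsets of `universe` as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def tr1(X, Y):
    return (X & Y) | (X - Y)

def tr2(X, Y):
    return (X & Y) | (Y - X)

subsets = powerset({"a", "b", "c"})
# tr1(X, Y) recovers X and tr2(X, Y) recovers Y for every pair of subsets.
always_recovers = all(tr1(X, Y) == X and tr2(X, Y) == Y
                      for X in subsets for Y in subsets)
print(always_recovers)  # True
```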

5.1. Stroppa and Yvon. The following definition is due to Stroppa and Yvon (2006, Proposition 4).

Definition 17. For any sets $A, B, C, D \in \mathfrak{A}$, define⁹

$$\mathfrak{A} \models_{SY} A : B :: C : D \quad:\Leftrightarrow\quad A = A_1 \cup A_2,\ B = A_1 \cup D_2,\ C = D_1 \cup A_2,\ D = D_1 \cup D_2.$$

For example, with $A_1 := \{a_1\}$, $A_2 := \{a_2\}$, $D_1 := \{d_1\}$, and $D_2 := \{d_2\}$, we obtain the set proportion

$$\{a_1, a_2\} : \{a_1, d_2\} :: \{d_1, a_2\} : \{d_1, d_2\}. \tag{21}$$

So, roughly, we obtain the set $\{a_1, d_2\}$ from $\{a_1, a_2\}$ by replacing $a_2$ by $d_2$, which coincides with the transformation from $\{d_1, a_2\}$ into $\{d_1, d_2\}$.

Although Definition 17 works in some cases, in general we disagree with Stroppa and Yvon (2006)’s notion of set proportions, justified by the following counter-example.

Example 18. Given $A_1 := A_2 := \{a\}$ and $D_1 := D_2 := \emptyset$, Definition 17 yields

$$(P(\{a\}), \cap, \cup, .^c, \emptyset, \{a\}) \models_{SY} \{a\} : \{a\} :: \{a\} : \emptyset,$$

which is implausible. In fact, strong reflexivity (Theorem 3) implies that $\{a\}$ is the only solution to $\{a\} : \{a\} :: \{a\} : Z$ in $(P(\{a\}), \cap, \cup, .^c, \emptyset, \{a\})$ according to our Definition 7.
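Definition 17 can be tested by brute force on small finite sets, reading the definition as asking for the existence of suitable sets $A_1, A_2, D_1, D_2$. The function name `sy_proportion` is hypothetical. Since $A = A_1 \cup A_2$ and $D = D_1 \cup D_2$ force the components to be subsets of $A$ and $D$, searching over those subsets is exhaustive.

```python
from itertools import combinations

def powerset(s):
    """All subsets of `s` as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def sy_proportion(A, B, C, D):
    """Search for A1, A2, D1, D2 with A = A1|A2, B = A1|D2, C = D1|A2, D = D1|D2."""
    A, B, C, D = map(frozenset, (A, B, C, D))
    splits_A = [(x, y) for x in powerset(A) for y in powerset(A) if x | y == A]
    splits_D = [(x, y) for x in powerset(D) for y in powerset(D) if x | y == D]
    return any(B == A1 | D2 and C == D1 | A2
               for A1, A2 in splits_A for D1, D2 in splits_D)

# Proportion (21) and the counter-example of Example 18.
print(sy_proportion({"a1", "a2"}, {"a1", "d2"}, {"d1", "a2"}, {"d1", "d2"}))  # True
print(sy_proportion({"a"}, {"a"}, {"a"}, set()))                              # True
```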

5.2. Miclet et al. There is at least one more definition of set proportions in the literature due to Miclet et al. (2008, Definition 2.3).¹⁰

⁹ We adapt Stroppa and Yvon (2006)’s definition to our schema by making the underlying structure $\mathfrak{A}$ explicit. Recall from Notation 8 that $\mathfrak{A}$ is an abbreviation for $(\mathfrak{A}, \mathfrak{A})$, which according to Notation 15 means that every set in $\mathfrak{A}$ is a distinguished set; this means we can use every set in $\mathfrak{A}$ to form terms.

¹⁰ To be more precise, Miclet et al. (2008)’s definition is stated informally as

Four sets A, B, C and D are in analogical proportion A : B :: C : D iff A can be transformed into B, and C into D, by adding and subtracting the same elements to A and C.

This definition is ambiguous. One interpretation is the one we choose in Definition 19; another interpretation would be equivalent to Definition 17.

Definition 19. Given a finite universe U and sets A,B,C,D ∈ P(U) = A,

A |=MBD A : B :: C : D :⇔ B = (A− E) ∪ F

D = (C − E) ∪ F,

for some finite sets E and F .

Remark 20. Notice that both Stroppa and Yvon (2006) and Miclet et al.(2008) define set proportions only for sets over the same universe which isa serious restriction to its practical applicability. Even more problematic,Miclet et al. (2008) define set proportions only for finite sets.

For example,

{a1, d2} = ({a1, a2} − {a2}) ∪ {d2} and {d1, d2} = ({d1, a2} − {a2}) ∪ {d2}

shows that (21) holds with respect to Definition 19 as well.

We have the following implication.

Theorem 5. For any finite sets A,B,C,D ∈ A, we have

A |=MBD A : B :: C : D ⇒ A |= A : B :: C : D.

Proof. An immediate consequence of Theorem 2 with t(Z) := (Z − E) ∪ F.¹¹ □

The following example shows that the converse of Theorem 5 fails in general.

Example 21. Consider the analogical equation

{a} : {b} :: ∅ : Z. (22)

As a consequence of (17), {a, b} is a solution of (22). However, there are no finite sets E and F in P({a, b}) satisfying {b} = ({a} − E) ∪ F and {a, b} = (∅ − E) ∪ F: the second identity forces F = {a, b}, whereas the first forces F ⊆ {b}. Hence {a, b} is not a solution of (22) according to Definition 19.
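The claim in Example 21 can also be checked by exhaustive search. The sketch below enumerates all pairs (E, F) of subsets of a small universe and tests the two identities of Definition 19, B = (A − E) ∪ F and D = (C − E) ∪ F; the helper names `powerset` and `mbd_witnesses` are hypothetical.

```python
from itertools import chain, combinations


def powerset(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(s) for s in subsets]


def mbd_witnesses(a, b, c, d, universe):
    """All pairs (E, F) with B = (A - E) | F and D = (C - E) | F."""
    a, b, c, d = (frozenset(x) for x in (a, b, c, d))
    subsets = powerset(universe)
    return [(e, f) for e in subsets for f in subsets
            if b == (a - e) | f and d == (c - e) | f]


# Example 21: {a} : {b} :: ∅ : {a, b} has no witnesses in P({a, b}).
print(mbd_witnesses({"a"}, {"b"}, set(), {"a", "b"}, {"a", "b"}) == [])  # True

# Proportion (21) does have a witness, e.g. E = {a2} and F = {d2}.
found = mbd_witnesses({"a1", "a2"}, {"a1", "d2"}, {"d1", "a2"}, {"d1", "d2"},
                      {"a1", "a2", "d1", "d2"})
print((frozenset({"a2"}), frozenset({"d2"})) in found)  # True
```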

Theorem 5 together with Example 21 shows that our notion of set proportion yields strictly more plausible solutions than Miclet et al. (2008)’s notion.

6. Numerical Proportions

This section studies analogical proportions between numbers, called numerical proportions. Let us first summarize some elementary properties.

Proposition 6. For any integers a, b, c, d ∈ Z, we have

(Z,+,−) |= a : −a :: c : −c and (Q, ·, .⁻¹) |= a : 1/a :: c : 1/c

¹¹ Here it is important to emphasize that we assume every set in A to be a distinguished set by Notation 15 (recall from Notation 8 that A is here an abbreviation of (A,A)).


and, given some distinguished integers k and ℓ,

(Z,+, k, ℓ) |= a : ka + ℓ :: c : kc + ℓ and (Z, ·, k, ℓ) |= a : a^k · ℓ :: c : c^k · ℓ.

Specifically, we have

(Z,+,−) |= a : 0 :: c : 0 and (Q, ·, .⁻¹) |= a : 1 :: c : 1.

Proof. An immediate consequence of Theorem 2 with t(z) defined as follows: the first line is justified via t(z) := −z and t(z) := 1/z; the second line is justified via t(z) := kz + ℓ and t(z) := z^k · ℓ; and the third line is justified via t(z) := z − z and t(z) := z/z. □
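To make the pattern behind Proposition 6 concrete, the following sketch tabulates the tuple (a, t(a), c, t(c)) for the terms used in the proof. It only evaluates the terms; it does not check the proportion against Definition 7. The helper name `functional_proportion` and the sample values for a, c, k, and ℓ are illustrative assumptions.

```python
from fractions import Fraction


def functional_proportion(t, a, c):
    """Return (a, t(a), c, t(c)), the shape a : t(a) :: c : t(c)."""
    return a, t(a), c, t(c)


k, l = 3, 5  # sample distinguished integers (illustrative values)

negation = functional_proportion(lambda z: -z, 2, 7)
affine = functional_proportion(lambda z: k * z + l, 2, 7)
power = functional_proportion(lambda z: z ** k * l, 2, 7)
inverse = functional_proportion(lambda z: 1 / Fraction(z), 2, 7)

print(negation)  # (2, -2, 7, -7)
print(affine)    # (2, 11, 7, 26)
print(power)     # (2, 40, 7, 1715)
```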

Example 22. The terms

tr1(x, y) := x + y − y and tr2(x, y) := x + y − x

justify any numerical proportion a : b :: c : d in (Z,+,−), which shows that tr1 --[(a, b) → (c, d)]--> tr2 is a trivial justification encoding the trivial observation that any two integers a and b are symmetrically related via b = a + b − a and a = b + a − b.
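The triviality of this justification is easy to see by evaluating the two terms: tr1 always returns its first argument and tr2 its second, so the rule tr1 → tr2 relates every pair of integers. A minimal check over a small range (the range is an arbitrary choice):

```python
def tr1(x, y):
    return x + y - y


def tr2(x, y):
    return x + y - x


pairs = [(a, b) for a in range(-3, 4) for b in range(-3, 4)]
trivial = all(tr1(a, b) == a and tr2(a, b) == b for a, b in pairs)
print(trivial)  # True
```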

Remark 23. An interesting consequence of our definition of analogical proportion is that in case b is a distinguished integer, we have

(Z,+,−, b) |= a : b :: c : b, for all a, c ∈ Z, (23)

characteristically justified by Theorem 2 via

z → z − z + b. (24)

This can be intuitively interpreted as follows: every distinguished element has a ‘name’ in our language, which means that it is in a sense a ‘known’ element. As the framework is designed to compute ‘novel’ or ‘unknown’ elements in the target domain via analogy-making, (23) means that ‘known’ target elements can always be computed given an invertible operation (in this case addition). It is important to emphasize that in case addition is not invertible (e.g. in N), (23) can no longer be justified via (24), which contains subtraction. Here we should stress once more that, for example, the rewrite rule z → b is not a valid justification of a : b :: c : b in any pair of algebras (A,B) by Notation 4, which explains why the operation needs to be invertible.

The following result formally proves a well-known numerical proportion known as ‘arithmetic’ or ‘difference proportion’.¹²

Theorem 7. For any integers a, b, c, d ∈ Z,

b− a = d− c ⇒ (Z,+,−,Z) |= a : b :: c : d (difference proportion).

Proof. A direct consequence of Theorem 2 with t(z) := z + b − a.¹³ □

¹² See https://encyclopediaofmath.org/wiki/Arithmetic_proportion.

¹³ Notice that in the algebra (Z,+,−,Z) every integer is a distinguished element, which shows that the constants a and b in z + b − a are syntactically correct.


The following counter-example shows that the converse of Theorem 7 fails in general.

Example 24. Theorem 2 implies

(Z,+,−,Z) |= a : 2a :: c : 2c, for all integers a and c.

On the other hand, we have 2a− a = 2c− c iff a = c.
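The gap between Theorem 7 and Example 24 can be seen on a single equation, 1 : 2 :: 3 : z. The sketch below computes the candidate given by the difference term t(z) = z + b − a and the one given by the doubling term t(z) = z + z; the function names are hypothetical.

```python
def difference_solution(a, b, c):
    """Candidate d for a : b :: c : d via t(z) = z + b - a (Theorem 7)."""
    return c + (b - a)


def doubling_solution(c):
    """Candidate d for a : 2a :: c : d via t(z) = z + z (Example 24)."""
    return c + c


a, c = 1, 3
print(difference_solution(a, a + a, c))  # 4
print(doubling_solution(c))              # 6
```

Both 4 and 6 are solutions in (Z,+,−,Z) by Theorem 7 and Example 24, respectively, but only 4 satisfies b − a = d − c, since 2 − 1 ≠ 6 − 3.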

In analogy to Theorem 7, we have the following result.

Theorem 8. For any integers a, b, c, d ∈ Z,

b/a = d/c ⇒ (Q, ·, .⁻¹,Q) |= a : b :: c : d.

Proof. A direct consequence of Theorem 2 with t(z) := z · (b/a). □

The following counter-example shows that the converse of Theorem 8 fails in general.

Example 25. Theorem 2 implies

(Q, ·, .⁻¹,Q) |= a : a^2 :: c : c^2, for all integers a and c.

On the other hand, we have a^2/a = c^2/c iff a = c.
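The multiplicative case can be illustrated in the same way with exact rational arithmetic. The sketch compares the candidate obtained from the ratio term t(z) = z · (b/a) with the one obtained from the squaring term t(z) = z · z on the equation 2 : 4 :: 3 : z; the function names are hypothetical, and a is assumed to be non-zero.

```python
from fractions import Fraction


def ratio_solution(a, b, c):
    """Candidate d for a : b :: c : d via t(z) = z * (b / a)."""
    if a == 0:
        raise ValueError("a must be non-zero")
    return Fraction(b, a) * c


def squaring_solution(c):
    """Candidate d for a : a^2 :: c : d via t(z) = z * z (Example 25)."""
    return c * c


print(ratio_solution(2, 4, 3))  # 6
print(squaring_solution(3))     # 9
```

Here 4/2 = 6/3 holds for the first candidate, whereas 4/2 ≠ 9/3 for the second.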

Stroppa and Yvon. The following notion of numerical proportion is an instance of the more general definition due to Stroppa and Yvon (2006, Proposition 2) given for abelian semigroups.

Definition 26. For any integers a, b, c, d ∈ Z, define

(Z,+,−,Z) |=SY a : b :: c : d :⇔ a = a1 + a2, b = a1 + d2, c = d1 + a2, d = d1 + d2,

for some integers a1, a2, d1, d2 ∈ Z.

For example, with a := 1 + 1, b := 1 + 2, c := 2 + 1, and d := 2 + 2, we obtain the numerical proportion

2 : 3 :: 3 : 4.

We have the following implication.

Theorem 9. For any integers a, b, c, d ∈ Z, we have

(Z,+,Z) |=SY a : b :: c : d ⇒ (Z,+,Z) |= a : b :: c : d.

Proof. Let a, b, c, d be decomposed as in Definition 26. Define the terms

s(z) := z + a2 and t(z) := z + d2.

Then s --[a1 → d1]--> t is a justification of a : b :: c : d in (Z,+,Z). Since s is injective in (Z,+,Z), s → t is characteristic by Lemma 1. □

The following example shows that the converse of Theorem 9 fails in general.


Example 27. Consider the analogical equation in (Z,+,Z) given by

0 : 0 :: 1 : z.

Theorem 2 implies the solution z = 2 characteristically justified by z → 2z. This solution is plausible as we can transform 0 into itself by computing 0 = 2 · 0; analogously, computing 2 · 1 yields the solution z = 2. This solution cannot be obtained from Definition 26 by the following argument. Suppose, towards a contradiction, that 0, 1, 2 can be decomposed according to Definition 26 into

0 = a1 + a2, 0 = a1 + d2, 1 = d1 + a2, and 2 = d1 + d2.

From the first two identities we deduce a2 = d2, which further implies 1 = d1 + a2 = d1 + d2 = 2, a contradiction.
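The decomposition of Definition 26 can be explored with a small bounded search. The sketch below looks for integers a1, a2, d1, d2 in a fixed window that satisfy the four identities; the window size `bound` and the function name are assumptions made for illustration, and the search is no substitute for the argument above, which covers all integers.

```python
from itertools import product


def sy_decompositions(a, b, c, d, bound=5):
    """All (a1, a2, d1, d2) in [-bound, bound]^4 satisfying Definition 26."""
    window = range(-bound, bound + 1)
    return [(a1, a2, d1, d2)
            for a1, a2, d1, d2 in product(window, repeat=4)
            if (a, b, c, d) == (a1 + a2, a1 + d2, d1 + a2, d1 + d2)]


# 2 : 3 :: 3 : 4 is witnessed by a1 = a2 = 1 and d1 = d2 = 2.
print((1, 1, 2, 2) in sy_decompositions(2, 3, 3, 4))  # True

# 0 : 0 :: 1 : 2 from Example 27 has no decomposition in the window.
print(sy_decompositions(0, 0, 1, 2) == [])  # True
```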

Remark 28. It is important to highlight that Stroppa and Yvon (2006)’s model is defined only for semigroups containing a single operation and it is not at all clear how to extend the framework to include multiple operations.

7. Word Proportions

Words are ubiquitous in computer science and linguistics and in this section we study analogical proportions between words, called word proportions.

Notation 29. In the rest of this section, Σ denotes a finite non-empty alphabet of symbols, · denotes concatenation of words, and a,b, c,d denote non-empty words over Σ.

Interestingly enough, it turns out that the word domain has the key property that in case the empty word is disallowed, every justification is a characteristic justification by the following lemma.

Lemma 10. Every (Σ∗, ·)-term s containing at least one variable is injective in (Σ+, ·). Consequently, every justification s → t of a : b :: c : d in (Σ+, ·) is characteristic.

Proof. Every (Σ∗, ·)-term s(z) has the form

s(z) = a1 z1 a2 . . . a_|z| z_|z| a_(|z|+1),

for some words a1, . . . , a_(|z|+1) ∈ Σ∗ and sequences of variables z1, . . . , z_|z|. Since s contains at least one variable by assumption, s cannot be a constant word. Every replacement of variables in s by non-empty words yields a different word, which means that s is injective in (Σ+, ·). Now apply Lemma 1 to prove the second statement (in case s = a we must have t = b (Notation 4), for some non-empty words a,b ∈ Σ+, and a → b is obviously characteristic). □

Remark 30. It is important to emphasize that Lemma 10 fails in case we include the empty word. For instance, the term s(z1, z2) := z1z2 is not injective in ({a}∗, ·), witnessed by the simple computation a = s^({a}∗,·)(a, ε) = s^({a}∗,·)(ε, a).


7.1. Stroppa and Yvon. We want to compare our notion of word proportion with the following notion due to Stroppa and Yvon (2006, Definition 3).

Definition 31. Given words a,b, c,d ∈ Σ+, define

(Σ+, ·,Σ+) |=SY a : b :: c : d

iff there are decompositions

a = a1 . . . an and b = b1 . . . bn and c = c1 . . . cn and d = d1 . . . dn,

where ai, bi, ci, di ∈ Σ, 1 ≤ i ≤ n, n ≥ 1, such that

[ai = bi and ci = di] or [ai = ci and bi = di] holds for all 1 ≤ i ≤ n.

For instance, the word proportions aa : aa :: bb : bb and abc : abd :: bbc : bbd are instances of Definition 31.
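Definition 31 is a position-wise condition on four words of equal length, so it translates directly into a short predicate. A minimal sketch (the function name `sy_word_proportion` is hypothetical), applied to the two instances just mentioned:

```python
def sy_word_proportion(a, b, c, d):
    """Check Definition 31 letter by letter on four equally long words."""
    if not (len(a) == len(b) == len(c) == len(d) >= 1):
        return False
    return all((ai == bi and ci == di) or (ai == ci and bi == di)
               for ai, bi, ci, di in zip(a, b, c, d))


print(sy_word_proportion("aa", "aa", "bb", "bb"))      # True
print(sy_word_proportion("abc", "abd", "bbc", "bbd"))  # True
```

Because the predicate requires all four words to have the same length, it rejects any proportion in which a letter is inserted or deleted.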

We have the following implication.

Theorem 11. For any words a,b, c,d ∈ Σ+, we have

(Σ+, ·,Σ+) |=SY a : b :: c : d ⇒ (Σ+, ·,Σ+) |= a : b :: c : d.

Proof. Let a,b, c,d be decomposed as in Definition 31. If a = b we must have c = d; hence, as a consequence of determinism (10), we then have (Σ∗, ·) |= a : a :: c : c. Otherwise, there is at least one index i in [1, n] such that ai = ci ≠ bi and bi = di. In this case, let I := {i1, . . . , ik}, 1 ≤ k ≤ n, be all the indices in [1, n] such that

a_i1 = c_i1 ≠ b_i1, . . . , a_ik = c_ik ≠ b_ik and b_i1 = d_i1, . . . , b_ik = d_ik.

Now define the (Σ+, ·)-terms s := s1 . . . sn and t := t1 . . . tn, for all i ∈ [1, n], as follows:

si := ai if ai = ci ≠ bi and bi = di, and si := zi otherwise;

ti := bi if ai = ci ≠ bi and bi = di, and ti := zi otherwise.

By construction, with z := z_i1, . . . , z_ik, e1 := a_i1, . . . , a_ik, and e2 := c_i1, . . . , c_ik, the rewrite rule s(z) → t(z) yields a → b under the substitution z/e1 and c → d under the substitution z/e2, so that we have

a : b :: c : d.


This shows that s → t is a justification of a : b :: c : d in (Σ+, ·), and Lemma 10 implies that s → t is characteristic. □

Example 32. Consider the analogical word equation

abc : adc :: cba : z.

Definition 31 yields the solution z = cda. This solution is characteristically justified within our framework via Lemma 10 by the rewrite rule z1bz3 → z1dz3, which yields abc → adc under the substitution (z1, z3)/(a, c) and cba → cda under (z1, z3)/(c, a), so that

abc : adc :: cba : cda.

Remark 33. Notice that Stroppa and Yvon (2006) define word proportions only for words over the same alphabet, which is a serious restriction on its practical applicability. We therefore cannot expect the converse of a generalized version of Theorem 11 to be true with respect to two different word domains, and the following counter-example shows that it may fail even in the case of a single domain as above.

Example 34. Consider the analogical word equation

a : ab :: c : z.

Theorem 2 implies the solution z = cb characteristically justified by z → zb. This solution is plausible as we transform a into ab by appending the letter b to a; analogously, appending the letter b to c yields the solution cb. This solution cannot be obtained from Definition 31 since the lengths of the words a, ab, c, and cb differ.
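The rewrite rules used in Examples 32 and 34 can be replayed mechanically by instantiating word terms. In the sketch below a term is a tuple of tokens, where tokens listed in the substitution are variables and all other tokens are constant letters; this encoding and the name `instantiate` are assumptions made for illustration.

```python
def instantiate(term, substitution):
    """Replace variable tokens by words and concatenate the result."""
    return "".join(substitution.get(token, token) for token in term)


def apply_rule(s, t, e1, e2):
    """Return (s(e1), t(e1), s(e2), t(e2)) for a rewrite rule s -> t."""
    return tuple(instantiate(u, e) for e in (e1, e2) for u in (s, t))


# Example 32: z1 b z3 -> z1 d z3.
example_32 = apply_rule(("z1", "b", "z3"), ("z1", "d", "z3"),
                        {"z1": "a", "z3": "c"}, {"z1": "c", "z3": "a"})
print(example_32)  # ('abc', 'adc', 'cba', 'cda')

# Example 34: z -> z b.
example_34 = apply_rule(("z",), ("z", "b"), {"z": "a"}, {"z": "c"})
print(example_34)  # ('a', 'ab', 'c', 'cb')
```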

7.2. Miclet et al. We now want to compare our notion of word proportions with the one of Miclet et al. (2008). This requires some auxiliary definitions (cf. Miclet et al. (2008, Definitions 2.6–2.8)).

Definition 35. An MBD-axiom is either a letter proportion of the form a : b :: c : d, where a, b, c, d are letters from Σ ∪ {∼}, or an instance of determinism (10) or reflexivity (11) in (Σ+, ·).

Definition 36. We say that a word a ∈ Σ+ is semantically equivalent to a word a′ ∈ (Σ ∪ {∼})+ iff a can be obtained from a′ by omitting the symbol ∼ in a′. We write a ≈ a′ in this case. We extend semantical equivalence from single words to word proportions component-wise, that is, a : b :: c : d ≈ a′ : b′ :: c′ : d′ iff a ≈ a′, b ≈ b′, c ≈ c′, and d ≈ d′.

Semantical equivalence identifies words which differ only in the occurrences of the symbol ∼. For example, we have ab∼a∼a ≈ abaa.


Definition 37. An alignment between four words a,b, c,d ∈ Σ+ is a word over the alphabet (Σ ∪ {∼})^4 − {(∼,∼,∼,∼)} whose projection on the first, second, third, and fourth component is semantically equivalent to a,b, c, and d, respectively.

Informally, an alignment represents a one-to-one letter correspondence between words, in which some letters ∼ may be inserted. For instance, an alignment between ab, abc, acd, a is given by (a∼b, abc, acd, a∼∼).
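Definitions 36 and 37 can be sketched with plain string operations. In the code below the symbol ∼ is written as the ASCII character `~`, an alignment is represented by its four projections, and the names `reduct` and `is_alignment` are hypothetical.

```python
GAP = "~"  # ASCII stand-in for the symbol ∼


def reduct(word):
    """Omit every gap symbol, as in Definition 36."""
    return word.replace(GAP, "")


def is_alignment(aligned, words):
    """Check Definition 37 for four aligned projections."""
    same_length = len({len(w) for w in aligned}) == 1
    no_empty_column = all(set(col) != {GAP} for col in zip(*aligned))
    projections_match = all(reduct(x) == w for x, w in zip(aligned, words))
    return same_length and no_empty_column and projections_match


print(reduct("ab~a~a"))  # abaa
print(is_alignment(("a~b", "abc", "acd", "a~~"),
                   ("ab", "abc", "acd", "a")))  # True
```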

The following definition of word proportions is due to Miclet et al. (2008, Definition 2.9).

Definition 38. Let A be a set of MBD-axioms containing all instances of determinism (10). For any words a,b, c,d ∈ Σ+, define

(Σ+, ·,Σ+) |=MBD a : b :: c : d

iff there exist four words a′,b′, c′,d′ ∈ (Σ ∪ {∼})+ of same length n, n ≥ 1, such that

(1) a′i : b′i :: c′i : d′i ∈ A, for all 1 ≤ i ≤ n,

(2) a′ ≈ a, b′ ≈ b, c′ ≈ c, d′ ≈ d.

Remark 39. Notice that Miclet et al. (2008) define word proportions only between words over the same alphabet, which is a serious restriction on its practical applicability.

Remark 40. It is important to emphasize that defining word proportions with respect to an arbitrary set A of MBD-axioms as in Definition 38 is naive: for example, in case A contains an MBD-axiom a : b :: c : d, for all letters a, b, c, d ∈ Σ ∪ {∼}, we can ‘justify’ any word proportion, which is implausible.

For example, Σ := {a, b, α, β,A,B} with given MBD-axioms

a : b :: A : B and a : α :: b : β and A : α :: B : β (25)

and the alignment (a∼BA, αbBA, b∼a∼, βba∼) between the four sequences aBA, αbBA, ba, and βba ‘justify’ the word proportion

aBA : αbBA :: ba : βba. (26)
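The derivation of (26) in the sense of Definition 38 can be replayed column by column. The sketch below writes ∼ as the ASCII character `~` and assumes that columns of the form x : x :: u : u and x : y :: x : y are always admitted as axioms, mirroring the instances of determinism and reflexivity mentioned in Definition 35; the function names are hypothetical.

```python
GAP = "~"  # ASCII stand-in for the symbol ∼

# Letter axioms (25), written as 4-tuples (a, b, c, d) for a : b :: c : d.
LETTER_AXIOMS = {
    ("a", "b", "A", "B"),
    ("a", "α", "b", "β"),
    ("A", "α", "B", "β"),
}


def column_ok(x, y, u, v, axioms):
    # Assumption: x : x :: u : u and x : y :: x : y are always admitted.
    return (x == y and u == v) or (x == u and y == v) or (x, y, u, v) in axioms


def mbd_word_proportion(aligned, axioms):
    """Return the gap-free proportion if every column is an axiom, else None."""
    a, b, c, d = aligned
    if not (len(a) == len(b) == len(c) == len(d) >= 1):
        return None
    if not all(column_ok(x, y, u, v, axioms) for x, y, u, v in zip(a, b, c, d)):
        return None
    return tuple(w.replace(GAP, "") for w in aligned)


aligned = ("a~BA", "αbBA", "b~a~", "βba~")
print(mbd_word_proportion(aligned, LETTER_AXIOMS))
# ('aBA', 'αbBA', 'ba', 'βba')
```

Adding every possible 4-tuple of letters to `LETTER_AXIOMS` would make the check succeed for any alignment, which is the concern raised in Remark 40.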

First, notice that Miclet et al. (2008) take as given, in the derivation of (26), analogical proportions of the form (25) between letters of the alphabet as ‘axioms’ (called MBD-axioms here), which have no direct correspondence within our framework. More precisely, the MBD-axioms in (25) have no justifications according to Definition 7 with respect to concatenation. However, we can extend the source and target domains by unary substitutions modeling the given MBD-axioms as follows. Define a substitution to be any mapping σ : Σ → Σ, homomorphically extended to non-empty words in Σ∗ letter-wise. In the example above, we define σ1 := {a ↦ b, A ↦ B}, σ2 := {a ↦ α, b ↦ β}, and σ3 := {A ↦ α, B ↦ β}, and σi(e) := e for every other e ∈ Σ, i = 1, 2, 3. The MBD-axioms in (25) can now be modeled within our framework as an instance of Theorem 2 given the unary substitution operations¹⁴ σ1, σ2, σ3 by

a : σ1(a) :: A : σ1(A) and a : σ2(a) :: b : σ2(b) and A : σ3(A) :: B : σ3(B). (27)

We can now justify the word proportion in (26) with the following lemma.

Lemma 12. For any words a,b, c,d,a′,b′, c′,d′ ∈ Σ+ and unary functions σ1, . . . , σn : (Σ+, ·) → (Σ+, ·), n ≥ 0,

(Σ+, ·, σ1, . . . , σn) |= a : b :: c : d and (Σ+, ·, σ1, . . . , σn) |= a′ : b′ :: c′ : d′

imply

(Σ+, ·, σ1, . . . , σn) |= aa′ : bb′ :: cc′ : dd′.

Proof. If s(z) --[e1 → e2]--> t(z) and s′(z′) --[e′1 → e′2]--> t′(z′) are justifications of a : b :: c : d and a′ : b′ :: c′ : d′ in (Σ+, ·, σ1, . . . , σn), respectively, then

s(z)s′(z′) --[e1e′1 → e2e′2]--> t(z)t′(z′),

where eie′i means the juxtaposition of ei and e′i, is a justification of

aa′ : bb′ :: cc′ : dd′

in (Σ+, ·, σ1, . . . , σn), and Lemma 13 implies that ss′ → tt′ is characteristic. Concretely, the rule s(z)s′(z′) → t(z)t′(z′) yields aa′ → bb′ under the substitution zz′/e1e′1 and cc′ → dd′ under the substitution zz′/e2e′2. □

We can finally justify the word proportion in (26) by an iterated application of Lemma 12 to the proportions in (27) together with axiomatic letter proportions, as instances of determinism (10), of the form

∼ : b :: ∼ : b and A : A :: ∼ : ∼ and B : B :: a : a. (28)

More precisely, we have by Lemma 12,

a : α :: b : β and ∼ : b :: ∼ : b ⇒ a∼ : αb :: b∼ : βb.

¹⁴ Here we do not distinguish between the new function symbols σi and their interpretation functions σi^(Σ+,·,σ1,σ2,σ3), i = 1, 2, 3.


Two more applications of Lemma 12 to (27) and (28) yield

a∼BA : αbBA :: b∼a∼ : βba∼, (29)

which is an aligned variant of (26). Lastly, remove ∼ from (29) to obtain (26).
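The bookkeeping behind this derivation amounts to component-wise concatenation of proportions followed by the removal of ∼. The sketch below reproduces (29) and (26) from one proportion in (27) and the three proportions in (28); it writes ∼ as the ASCII character `~`, only concatenates the pieces, and does not itself verify that each piece is a valid proportion. The helper names are hypothetical.

```python
GAP = "~"  # ASCII stand-in for the symbol ∼


def substitution(mapping):
    """Letter-wise extension of a mapping on letters to words."""
    return lambda word: "".join(mapping.get(ch, ch) for ch in word)


def concat(*proportions):
    """Component-wise concatenation of proportions, as in Lemma 12."""
    return tuple("".join(parts) for parts in zip(*proportions))


sigma2 = substitution({"a": "α", "b": "β"})

pieces = [
    ("a", sigma2("a"), "b", sigma2("b")),  # a : σ2(a) :: b : σ2(b), from (27)
    (GAP, "b", GAP, "b"),                  # from (28)
    ("B", "B", "a", "a"),                  # from (28)
    ("A", "A", GAP, GAP),                  # from (28)
]

aligned = concat(*pieces)
reduced = tuple(w.replace(GAP, "") for w in aligned)
print(aligned)  # ('a~BA', 'αbBA', 'b~a~', 'βba~')
print(reduced)  # ('aBA', 'αbBA', 'ba', 'βba')
```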

We now want to formally compare Miclet et al. (2008)’s with our notion of word proportions.

As mentioned before, MBD-axioms have no direct correspondence within our framework. This is not a shortcoming of our framework. The reason is that any word proportion can be ‘justified’ given an appropriate set of MBD-axioms containing all necessary letter proportions (Remark 40), which is implausible. In our framework, to model MBD-axioms we therefore have to expand the domain with unary substitutions as in (27), defined as follows.

Definition 41. Given an MBD-axiom A of the form a : b :: c : d, for some letters a, b, c, d ∈ Σ, we define σA by

σA(a) := b and σA(c) := d and σA(e) := e, e ∈ Σ− {a, c}.
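Definition 41 can be sketched as a function that turns a letter axiom into a substitution, extended letter-wise to words as described for (27). The name `sigma_from_axiom` is hypothetical, and the explicit error for the case a = c with b ≠ d is an added safeguard, not part of the definition.

```python
def sigma_from_axiom(axiom):
    """Build σ_A from an axiom a : b :: c : d given as a 4-tuple of letters."""
    a, b, c, d = axiom
    if a == c and b != d:
        # Added safeguard: such an axiom would not determine a function.
        raise ValueError("axiom does not determine a function on letters")
    mapping = {a: b, c: d}
    return lambda word: "".join(mapping.get(ch, ch) for ch in word)


sigma = sigma_from_axiom(("a", "α", "b", "β"))
print(sigma("a"), sigma("b"), sigma("c"))  # α β c
```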

Moreover, we need to generalize Lemma 10 to include substitutions.

Lemma 13. Let σ1, . . . , σn : (Σ+, ·) → (Σ+, ·), n ≥ 0, be injective unary functions. Then every (Σ∗, ·, σ1, . . . , σn)-term s, containing at least one variable, is injective in (Σ+, ·, σ1, . . . , σn). Consequently, every justification s → t of a : b :: c : d in (Σ+, ·, σ1, . . . , σn) is characteristic.

Proof. We prove by term induction on the shape of s that s := s^(Σ+,·,σ1,...,σn) is injective. (i) The induction base in which s is a variable holds trivially (since s contains a variable by assumption, s cannot be a constant word). (ii) In case s = σi(u), for some (Σ+, ·, σ1, . . . , σn)-term u, the injectivity of s follows from the assumed injectivity of σi and the induction hypothesis that u^(Σ∗,·,σ1,...,σn) is injective. Finally, (iii) in case s = s1s2, for some (Σ∗, ·, σ1, . . . , σn)-terms s1 and s2, the injectivity of s follows from the injectivity of concatenation in Σ+, which does not contain the empty word (cf. Remark 30), and the assumed injectivity of s1^(Σ∗,·,σ1,...,σn) and s2^(Σ∗,·,σ1,...,σn). □

We can finally prove the following implication.

Theorem 14. Let A be a set of MBD-axioms including all instances of (10) and (11). For any words a,b, c,d ∈ Σ+,

(Σ+, ·,Σ+) |=MBD a : b :: c : d

implies

(Σ+, ·,Σ+, {σA | A ∈ A}) |= a′ : b′ :: c′ : d′,

where a′,b′, c′,d′ ∈ (Σ ∪ {∼})+ such that a ≈ a′, b ≈ b′, c ≈ c′, and d ≈ d′.


Proof. By definition, we have (Σ+, ·) |=MBD a : b :: c : d iff there are MBD-axioms A1, . . . , An ∈ A, n ≥ 1, such that a : b :: c : d = ρ(A1 · . . . · An), where A1 · . . . · An means component-wise concatenation of word proportions and ρ : (Σ ∪ {∼})+ → Σ+ is the reduct of a word containing ∼ to the same word without ∼. Let A ∈ A be an MBD-axiom. If A is an instance of determinism (10), then Theorem 3 implies

(Σ+, ·,Σ+, {σA | A ∈ A}) |= A. (30)

Otherwise, if A is a letter proportion

A = a : b :: c : d = a : σA(a) :: c : σA(c),

for some letters a, b, c, d ∈ Σ, then Theorem 2 implies

(Σ+, ·,Σ+, {σA | A ∈ A}) |= a : σA(a) :: c : σA(c). (31)

Now iteratively apply Lemma 12 n times to (30) and (31) to obtain

(Σ+, ·,Σ+, {σA | A ∈ A}) |= a′ : b′ :: c′ : d′,

where a′,b′, c′,d′ ∈ (Σ ∪ {∼})+ such that a ≈ a′, b ≈ b′, c ≈ c′, and d ≈ d′. □

8. Related Work

Formal models of analogical proportions started to appear only very recently. In this paper we extensively compared our algebraic framework with two prominent models from the literature, namely Stroppa and Yvon (2006)’s and Miclet et al. (2008)’s algebraic models, in the concrete domains of sets, numbers, and words. We showed that in each case we either disagree with the notion from the literature, justified by some plausible counter-examples, or our model yields strictly more reasonable solutions. This provides evidence for its applicability. We expect similar results in other domains where the models of Stroppa and Yvon (2006) and Miclet et al. (2008) are applicable.

A conceptually related approach to solving analogical word equations is given by Dastani et al. (2003). At this point, it is not entirely clear how our simple framework formulated in this paper relates to the rather complicated model of Dastani et al. (2003) built on top of concepts such as ‘gestalts’ of sequential patterns, structural information theory (SIT), algebraic coding systems for SIT, information load, representation systems, local homomorphism, constraints, et cetera. We challenge the reader to find instances where the model of Dastani et al. (2003) is more expressive in the word domain than our model, which would (partially) justify their heavy machinery. To give a glimpse of what we mean, consider the following simple example (cf. Navarrete and Dartnell (2017, p.4)).


Example 42. Let Σ := {a, b, c, d} be an alphabet, linearly ordered in A := (Σ+, ·A) and B := (Σ+, ·B) via

a <A b <A c <A d and d <B c <B b <B a, (32)

extended to words lexicographically.¹⁵ Consider the analogical equation in (A,B) given by

abc : abcd :: dcb : z.

This equation is asking for a word which is to dcb in B what abcd is to abc in A. Observe that we obtain abcd from abc by concatenating the ‘successor’ of c at the end of abc. We therefore add a unary function symbol succ to our language, interpreted as the successor functions in A′ := (Σ+, ·A, succ^A′) and B′ := (Σ+, ·B, succ^B′); that is, we define

succ^A′(a) := b, succ^A′(b) := c, succ^A′(c) := d, succ^A′(d) := da

and

succ^B′(d) := c, succ^B′(c) := b, succ^B′(b) := a, succ^B′(a) := ad,

extended to words lexicographically. Since succ is injective in A′ and B′, the solution z = dcba is characteristically justified via Lemma 13 by the rewrite rule z1z2z3 → z1z2z3 · succ(z3), which yields abc → abcd under the substitution z1z2z3/abc in A′ and dcb → dcba under z1z2z3/dcb in B′, so that

abc : abcd :: dcb : dcba.

Dastani et al. (2003) obtain the same solution in a different way by using the algebras generated by the letters in Σ and operators (named in their terminology ‘gestalts’) such as ‘iteration’, ‘successor’, ‘symmetry’, ‘alternation’, ‘representation systems’, et cetera, and, finally, by computing the solution dcba via ‘local homomorphisms’.
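The justification in Example 42 can be replayed with two successor tables, one per domain. The sketch below defines the successor functions on single letters only, exactly as listed in the example, and applies the target term z1z2z3 · succ(z3) in each domain; the helper names `make_succ` and `target` are hypothetical.

```python
def make_succ(order, wrap):
    """Successor on letters along `order`; the last letter maps to `wrap`."""
    step = dict(zip(order, order[1:]))
    step[order[-1]] = wrap
    return lambda letter: step[letter]


succ_A = make_succ("abcd", "da")  # a -> b -> c -> d, and d -> da
succ_B = make_succ("dcba", "ad")  # d -> c -> b -> a, and a -> ad


def target(z1, z2, z3, succ):
    """Evaluate the term z1 z2 z3 · succ(z3)."""
    return z1 + z2 + z3 + succ(z3)


print(target("a", "b", "c", succ_A))  # abcd
print(target("d", "c", "b", succ_B))  # dcba
```

The same term thus produces abcd in the source domain and dcba in the target domain, because the two domains interpret succ differently.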

9. Conclusion

This paper contributed to the foundations of artificial general intelligence by introducing from first principles an abstract algebraic framework of analogical proportions in the general setting of universal algebra. This enabled us to compare mathematical objects possibly across different domains in a uniform way, which is crucial for AI-systems. We showed that analogical proportions are compatible with functional dependencies (Theorem 2), which is desirable. We further discussed Lepage (2003)’s axioms and argued why we agree with symmetry, determinism, and (strong) reflexivity, while we disagree with his exchange of the means and strong determinism axioms (Theorem 3). We then extensively compared our framework with two prominent and recently introduced frameworks of analogical proportions from the literature, namely Stroppa and Yvon (2006)’s and Miclet et al. (2008)’s, within the concrete domains of sets, numbers, and words, and in each case we either disagreed with the notion from the literature, justified by some plausible counter-examples, or we showed that our model yields strictly more reasonable solutions, which provides evidence for its applicability. In a broader sense, this paper is a first step towards a theory of analogical reasoning and learning systems with potential applications to fundamental AI-problems like commonsense reasoning and computational learning and creativity.

¹⁵ See https://en.wikipedia.org/wiki/Lexicographic_order.

Future Work. This theoretical paper introduces and studies some basic properties of analogical proportions within the general setting of universal algebra and within the specific domains of sets, numbers, and words. In the future, we wish to expand this study to other domains relevant for computer science and artificial intelligence such as, for instance, trees, graphs, automata, neural networks, logic programs, et cetera.

From a practical point of view, the main task for future research is to develop algorithms for the computation of some or all solutions to analogical equations as defined in this paper. This problem is highly non-trivial in the general case. A reasonable starting point is therefore to first study concrete mathematical domains such as the ones studied in this paper from the computational perspective. Another approach is to study analogical proportions in finite and automatic models (cf. Ebbinghaus and Flum (1999), Libkin (2012)), which are more relevant to computer science and artificial intelligence research than the infinite models studied in classical universal algebra. Here interesting connections between, e.g., word proportions and logics on words studied in algebraic formal language and automata theory will hopefully become available, which may then lead to concrete algorithms for solving analogical equations over words, trees, and related data structures.

Another key line of research is to apply our model to various AI-related problems such as, e.g., commonsense reasoning, formalizing metaphors, and learning by analogy. For this, it will be useful to apply our model to logic programming (cf. Apt (1990)) as follows. First, introduce appropriate algebraic operations and relations on the space of all programs. Next, consider analogical proportions P : Q :: R : S between logic programs. (Antic (2020) did exactly this but with an outdated and unpublished version of the framework proposed in this paper.) We are convinced that promising results will follow in that direction.

From a mathematical point of view, relating analogical proportions to other concepts of universal algebra and related subjects is an interesting line of research. Specifically, studying analogical proportions in abstract mathematical structures like, for example, various kinds of lattices, semigroups and groups, rings, et cetera, is particularly interesting in the case of proportions between objects from different domains. At this point, due to the author’s lack of expertise, it is not clear how analogical proportions fit into the overall landscape of universal algebra, and relating analogical proportions to other concepts of algebra is therefore an important line of future research.

Acknowledgments

This work has been supported by the Austrian Science Fund (FWF) project P31063-N35.

References

Antic, C. (2020). Logic-based analogical reasoning and learning. https://arxiv.org/pdf/1809.09938.pdf.

Apt, K. R. (1990). Logic programming. In van Leeuwen, J. (Ed.), Handbook of Theoretical Computer Science, Vol. B, pp. 493–574. Elsevier, Amsterdam.

Boden, M. A. (1998). Creativity and artificial intelligence. Artificial Intelligence, 103 (1-2), 347–356.

Burris, S., & Sankappanavar, H. (2000). A Course in Universal Algebra (The Millennium Edition). http://www.math.hawaii.edu/~ralph/Classes/619/univ-algebra.pdf.

Correa, W., Prade, H., & Richard, G. (2012). When intelligence is just a matter of copying. In Raedt, L. D., Bessiere, C., Dubois, D., Doherty, P., Frasconi, P., Heintz, F., & Lucas, P. (Eds.), ECAI 2012, Vol. 242 of Frontiers in Artificial Intelligence and Applications, pp. 276–281.

Dastani, M., Indurkhya, B., & Scha, R. (2003). Analogical projection in pattern perception. Journal of Experimental & Theoretical Artificial Intelligence, 15 (4), 489–511.

Ebbinghaus, H.-D., & Flum, J. (1999). Finite Model Theory (2nd edition). Springer Monographs in Mathematics. Springer-Verlag, Berlin/Heidelberg.

Gust, H., Krumnack, U., Kühnberger, K.-U., & Schwering, A. (2008). Analogical reasoning: a core of cognition. Künstliche Intelligenz, 22 (1), 8–12.

Hofstadter, D. (2001). Analogy as the core of cognition. In Gentner, D., Holyoak, K. J., & Kokinov, B. K. (Eds.), The Analogical Mind: Perspectives from Cognitive Science, pp. 499–538. MIT Press/Bradford Book, Cambridge MA.

Hofstadter, D., & Mitchell, M. (1995). The copycat project: a model of mental fluidity and analogy-making. In Fluid Concepts and Creative Analogies. Computer Models of the Fundamental Mechanisms of Thought, chap. 5, pp. 205–267. Basic Books, New York.


Hofstadter, D., & Sander, E. (2013). Surfaces and Essences. Analogy as the Fuel and Fire of Thinking. Basic Books, New York.

Lepage, Y. (2003). De L’Analogie. Rendant Compte de la Commutation en Linguistique. Habilitation à diriger les recherches, Université Joseph Fourier, Grenoble.

Libkin, L. (2012). Elements of Finite Model Theory. Springer-Verlag, Berlin/Heidelberg.

Miclet, L., Bayoudh, S., & Delhay, A. (2008). Analogical dissimilarity: definition, algorithms and two experiments in machine learning. Journal of Artificial Intelligence Research, 32, 793–824.

Navarrete, J. A., & Dartnell, P. (2017). Towards a category theory approach to analogy: analyzing re-representation and acquisition of numerical knowledge. Computational Biology, 13 (8), 1–38.

Polya, G. (1954). Induction and Analogy in Mathematics, Vol. 1 of Mathematics and Plausible Reasoning. Princeton University Press, Princeton, New Jersey.

Sowa, J. F., & Majumdar, A. K. (2003). Analogical reasoning. In Ganter, B., Moor, A., & Lex, W. (Eds.), ICCS 2003, LNAI 2746, pp. 16–36. Springer-Verlag, Berlin/Heidelberg.

Stroppa, N., & Yvon, F. (2006). Formal models of analogical proportions. Technical Report D008, Telecom ParisTech - Ecole Nationale Superieure de Telecommunications, Telecom Paris.

Winston, P. H. (1980). Learning and reasoning by analogy. Communications of the ACM, 23 (12), 689–703.

Wos, L. (1993). The problem of reasoning by analogy. Journal of Automated Reasoning, 10 (3), 421–422.