TRANSCRIPT

Formal Learning Theory: An Introduction
Roberto Bonato
LaBRI
Université de Bordeaux I
Università degli Studi di Verona
Formal Learning Theory: An Introduction – p.1/35
Contents
Basic notions of Learnability Theory
Gold's model of grammatical inference: identification in the limit
Some representative results of (un)learnability
Learning Categorial Grammars: the RG algorithm
The poverty of the stimulus paradox
How comes it that human beings, whose contacts with the world are brief and personal and limited, are nevertheless able to know as much as they do know?
Sir Bertrand Russell, quoted by Chomsky, 1975
Its theoretical linguistics version
The learning paradox: how do children learn the syntax of their mother tongue (and rather quickly!) given that:
Natural language syntax is very complicated
Not so many examples are provided
Negative examples are of no use
Chomsky's solution
Universal Grammar, defined by Principles
(Binary) parameters that specify any given human language (example: SVO vs. SOV languages)
Categorial analogy: universal rules; only the types in the lexicon are language-specific... MORE on this later
What is Learnability Theory?
Learnability refers to a set of mathematical models of how a human language can be acquired
Learnability is a constraint on Universal Grammar:
the class of human languages must be learnable
Why should we care?
Mathematical precision is a good thing!
Learnability Theory can suggest different theories of Universal Grammar:
If one can show that some theory of UG can result in an unlearnable array of possible languages, that theory must be changed.
We can use learnability to constrain the observed set of languages, not just UG.
The acquisition framework
Innateness
Positive evidence
Learning = setting the parameters of a Universal Grammar
Innateness
A grammar is a finite specification of a language.
Innateness holds that the learner can only acquire certain kinds of grammars and not others.
Some language types would therefore be impossible.
Positive evidence
In general, children do not learn from correction
R. Brown and C. Hanlon, Derivational Complexity and the Order of Acquisition of Child Speech, 1970
In effect, the input to the learner includes only grammatical sentences:
Steven Pinker, The Language Instinct. Harper, 2000
The learning "algorithm"
The learner has a set of possible grammars to choose from.
The learner is presented with some finite set of sentences.
What grammar does the learner choose?
[Figure: the Chomsky hierarchy, RL ⊂ CFL ⊂ CSL ⊂ REL, with "Human Languages?" marking where natural languages might sit]
Let us play a game...
I think of a certain set of numbers, e.g. {x : x ≥ 10 and x is even}, and you have to guess it
I'll provide you with an infinite number of clues of the form "the number x belongs to the set", one at a time
After each clue, you make a guess
I will never tell you whether you're right or not
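The game above can be played mechanically by "identification by enumeration": fix an ordered list of candidate sets and always guess the first one consistent with every clue seen so far. A minimal sketch follows; the hypothesis class and its specific-before-general ordering are assumptions made for illustration, not part of the original game.

```python
# Identification by enumeration for the number-guessing game.
# Hypotheses are ordered from more specific to more general, so the
# learner's guess only changes when a clue rules the current one out.
HYPOTHESES = [
    ("multiples of 4", lambda x: x % 4 == 0),
    ("even numbers >= 10", lambda x: x % 2 == 0 and x >= 10),
    ("even numbers", lambda x: x % 2 == 0),
    ("all naturals", lambda x: True),
]

def guess(clues):
    """Return the first hypothesis consistent with every clue so far."""
    for name, pred in HYPOTHESES:
        if all(pred(c) for c in clues):
            return name
    return None

# Clues drawn from the target set {x : x >= 10 and x is even}:
clues = []
for clue in [12, 10, 16, 14, 22]:
    clues.append(clue)
    print(guess(clues))
```

After the clue 10 rules out "multiples of 4", the guess stabilizes on "even numbers >= 10" and never changes again: the learner has converged, even though no one ever told it so.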
Some questions
What should count as winning this game?
What happens if I am allowed to select the set of all positive integers?
Who are the players?
SCIENTIFIC INDUCTION: Nature vs. Scientists
FIRST LANGUAGE ACQUISITION: Adults vs. Child
Learning in Gold's Framework
the learner is provided with an infinite stream of examples: s1, s2, ..., si, ...;
at each step i the learner makes a guess Gi compatible with the examples seen thus far;
the process is infinite:
s1, s2, s3, ..., sn, ...
G1, G2, G3, ..., Gn, ...
learning is successful when there is a certain point (even if we don't know which!) after which the guess made by the learner doesn't change and is correct
Grammatical Inference
[Figure: Ω = the class of grammars, S = the set of samples. A target grammar G ∈ Ω determines a language L(G) ⊆ S. The learner observes sentences l1, l2, ..., li ∈ L(G) and conjectures grammars G1 = φ(l1), G2 = φ(l1, l2), ..., Gi = φ(l1, ..., li)]
More Formally...
Let 〈Ω, S, L〉 be a grammar system
let φ : finite sequences of sentences of S → Ω be a learning function
let 〈si〉i∈N = 〈s0, s1, s2, ...〉 be an infinite sequence of sentences from S
let Gi = φ(〈s0, ..., si〉)
φ converges to G on 〈si〉i∈N if there exists an n ∈ N such that for all i ≥ n, Gi = φ(〈s0, ..., si〉) is defined and L(Gi) = L(G)
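On finite data, convergence can only be observed as apparent stabilization of the guess sequence Gi = φ(〈s0, ..., si〉). A minimal sketch, assuming a toy learner for the class of languages {0, ..., k} that simply guesses the largest number seen so far (the learner and helper names are illustrative, not part of the formal definition):

```python
# Compute the guess sequence G_i = phi(<s_0, ..., s_i>) on a finite prefix
# of the text, and locate the last point where the guess changed.

def guess_sequence(phi, sentences):
    return [phi(sentences[:i + 1]) for i in range(len(sentences))]

def apparent_convergence_point(guesses):
    """Index n after which the observed guesses no longer change."""
    n = 0
    for i in range(1, len(guesses)):
        if guesses[i] != guesses[i - 1]:
            n = i
    return n

phi = lambda seen: max(seen)       # toy learner: conjecture {0, ..., max seen}
stream = [2, 0, 5, 1, 3, 4, 5]     # a prefix of an enumeration of {0, ..., 5}
gs = guess_sequence(phi, stream)
print(gs, apparent_convergence_point(gs))
```

Note the hedge built into the name: a finite prefix can only witness that the guesses have not changed *so far*; the definition quantifies over the whole infinite sequence.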
Towards Learnability
Convergence is about a function and a grammar
Learnability is about a (learning) function and classes of grammars
Learnability in the limit
Let 〈Ω, S, L〉 be given, and G ⊆ Ω. A learning function φ learns G if:
for every language L ∈ L(G)
for every infinite sequence 〈si〉i∈N which enumerates the elements of L (i.e. {si | i ∈ N} = L)
there exists some G ∈ G such that
L(G) = L
and φ converges to G on 〈si〉i∈N
Initial Pessimism
Gold, 1967: A class G of grammars is not learnable if L(G) contains all finite languages and at least one infinite language.
just like regular languages!
and context-free grammars!
and many others...
More Generally: Limit Points
A class L of languages has a limit point if there exists an infinite sequence 〈Ln〉n∈N of languages in L such that
L0 ⊂ L1 ⊂ ... ⊂ Ln ⊂ ...
AND there exists another language L in L such that L = ⋃n∈N Ln
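The definition can be made concrete with the classic chain Ln = {0, ..., n} together with their union, the set of all naturals (any class containing this chain and its union has a limit point). This sketch truncates the naturals to a finite range, an assumption forced by finiteness; the intuition is that every finite prefix of a text for the union is also a prefix of a text for some large enough Ln, so the learner can never safely settle on the union.

```python
# The chain L_0 ⊂ L_1 ⊂ ... from the limit-point definition, with
# L_n = {0, ..., n}; their union (checked up to a bound) is the naturals.

def L(n):
    return set(range(n + 1))

# Strict inclusion along the chain:
assert all(L(n) < L(n + 1) for n in range(10))

# The union of the chain exhausts the naturals (up to the bound 99):
union = set().union(*(L(n) for n in range(100)))
print(union == set(range(100)))   # True
```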
A Renewed Interest
Gold, 1967 (pessimist!): neither regular nor context-free grammars are identifiable in the limit from positive examples.
Angluin, 1980: "pattern" languages are learnable
Shinohara, 1990: more non-trivial classes learnable; k-rigid context-sensitive grammars are learnable!
Kanazawa, 1998: rigid and k-valued classical categorial grammars are learnable, both from structures and from strings (but that's NP-hard)
Pattern Languages
Σ = {a, b, c, ...} is any finite alphabet
Var = {x1, x2, x3, ...} is a set of variables
Σ ∩ Var = ∅
a pattern p over Σ is an element of (Σ ∪ Var)+
L(p) = {w | w is obtained from p by replacing variables with non-empty constant strings}
example: L(axbx) = {awbw | w ∈ Σ+}
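Membership in a pattern language can be decided by a small backtracking matcher: each occurrence of a variable must bind a non-empty string, and repeated occurrences must bind the same string. The representation below is an assumption made for convenience (a pattern is a list of tokens, and any token starting with 'x' is a variable, so the alphabet here must not contain 'x'); `in_pattern_language` is a hypothetical helper name.

```python
# Backtracking membership test for pattern languages: try every
# non-empty binding for each fresh variable, keeping bindings consistent.

def in_pattern_language(pattern, w, binding=None):
    binding = binding or {}
    if not pattern:
        return w == ""
    head, rest = pattern[0], pattern[1:]
    if not head.startswith("x"):          # constant symbol: must match literally
        return w.startswith(head) and in_pattern_language(rest, w[len(head):], binding)
    if head in binding:                   # variable already bound: reuse its value
        v = binding[head]
        return w.startswith(v) and in_pattern_language(rest, w[len(v):], binding)
    for k in range(1, len(w) + 1):        # fresh variable: try every non-empty prefix
        if in_pattern_language(rest, w[k:], {**binding, head: w[:k]}):
            return True
    return False

# L(axbx) = {awbw | w non-empty}:
print(in_pattern_language(["a", "x1", "b", "x1"], "acbc"))   # True
print(in_pattern_language(["a", "x1", "b", "x1"], "acbd"))   # False
```

Exponential backtracking is no accident here: membership for pattern languages is NP-complete in general.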
Finite Elasticity
A class L of languages is said to have infinite elasticity if there exists an infinite sequence 〈sn〉n∈N of sentences and an infinite sequence 〈Ln〉n∈N of languages in L such that for all n ∈ N,
sn ∉ Ln
{s0, ..., sn} ⊆ Ln+1
A class L of languages is said to have finite elasticity if it doesn't have infinite elasticity
A Theorem by Angluin (1979)
Any class G with finite elasticity is inferable from positive data
The class of pattern languages has finite elasticity, so...
The class of pattern languages is inferable from positive data
Summing up...
L(G) has a limit point ⇒ G is unlearnable ⇒ L(G) has infinite elasticity
L(G) has finite elasticity ⇒ G is learnable
Classical Categorial Grammars
CCGs = typed words + composition rules
G:
loves ↦ (np\s)/np, np\(s/np)
John ↦ np
Mary ↦ np
runs ↦ np\s
A, A\B ⇒ B [\E]
B/A, A ⇒ B [/E]
Example derivations: John:np runs:np\s ⇒ s by [\E]; John:np likes:(np\s)/np Mary:np ⇒ John:np (likes Mary):np\s by [/E] ⇒ s by [\E]
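The two elimination rules can be sketched directly. The encoding below is an assumption made for illustration: atomic types are strings, and a functor type is a (left, slash, right) triple, so np\s is ('np', '\\', 's'); `backward` and `forward` are hypothetical helper names.

```python
# The two composition rules of classical categorial grammars:
#   backward elimination:  A, A\B  =>  B
#   forward elimination:   B/A, A  =>  B
# Types: atom strings, or (left, slash, right) triples.

def backward(a, f):
    """A, A\\B => B: apply a backward functor f to the argument a on its left."""
    if isinstance(f, tuple) and f[1] == '\\' and f[0] == a:
        return f[2]
    return None

def forward(f, a):
    """B/A, A => B: apply a forward functor f to the argument a on its right."""
    if isinstance(f, tuple) and f[1] == '/' and f[2] == a:
        return f[0]
    return None

NP = 'np'
IV = (NP, '\\', 's')     # np\s, an intransitive verb
TV = (IV, '/', NP)       # (np\s)/np, a transitive verb

# John runs:  np, np\s => s
print(backward(NP, IV))
# John likes Mary:  first (np\s)/np + np => np\s, then np + np\s => s
print(backward(NP, forward(TV, NP)))
```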
The RG Algorithm (Buszkowski 1989)
Input: a finite set of functor-argument structures
Output: a rigid categorial grammar that generates them
D = { \E(/E(a, man), swims) , \E(/E(a, fish), \E(swims, fast)) }   ("a man swims", "a fish swims fast")
RG runs
Assign a type to each node of the structures:
Assign s to each distinct root node
Assign distinct variables to argument nodes
Compute types for the functor nodes
[Worked run: in \E(/E(a, man), swims), the root gets s, man gets x2 and (a man) gets x1, so a gets x1/x2 and swims gets x1\s; in \E(/E(a, fish), \E(swims, fast)), the root gets s, fish gets x4, (a fish) gets x3 and swims gets x5, so a gets x3/x4, (swims fast) gets x3\s and fast gets x5\(x3\s)]
RG: Collecting Types
GF(D):
a ↦ x1/x2, x3/x4
fast ↦ x5\(x3\s)
fish ↦ x4
man ↦ x2
swims ↦ x1\s, x5
Formal Learning Theory:an Introduction – p.30/35
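Grouping the (word, type) pairs read off the typed trees gives GF(D); a minimal sketch (types written as plain strings transcribed from the slide, not code from the talk):

```python
# (word, type) pairs from the two typed derivation trees above.
pairs = [
    ('a', 'x1/x2'), ('man', 'x2'), ('swims', 'x1\\s'),
    ('a', 'x3/x4'), ('fish', 'x4'), ('swims', 'x5'),
    ('fast', 'x5\\(x3\\s)'),
]

GF = {}                               # GF(D): word -> set of types
for word, t in pairs:
    GF.setdefault(word, set()).add(t)
```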
RG: Unifying Types
a ↦ x1/x2, x3/x4 ⇒ x3 ↦ x1, x4 ↦ x2
swims ↦ x1\s, x5 ⇒ x5 ↦ x1\s
σ = {x3 ↦ x1, x4 ↦ x2, x5 ↦ x1\s}
RG(D) = σ[GF(D)]:
a ↦ x1/x2
fast ↦ (x1\s)\(x1\s)
fish ↦ x2
man ↦ x2
swims ↦ x1\s
Formal Learning Theory:an Introduction – p.31/35
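The unification step can be sketched as follows (an editor's illustration, not the talk's code): each word's types are unified pairwise, the resulting substitution σ is applied to GF(D), and the rigid grammar RG(D) falls out. Types are 'xi', 's', or ('/', A, B) / ('\\', A, B); the occurs check is omitted for brevity.

```python
def subst(sigma, t):
    """Apply substitution sigma (a dict) to type t, chasing bindings."""
    if isinstance(t, str):
        return subst(sigma, sigma[t]) if t in sigma else t
    op, a, b = t
    return (op, subst(sigma, a), subst(sigma, b))

def is_var(t):
    return isinstance(t, str) and t.startswith('x')

def unify(t1, t2, sigma):
    """Extend sigma so that t1 and t2 become equal (occurs check omitted)."""
    t1, t2 = subst(sigma, t1), subst(sigma, t2)
    if t1 == t2:
        return sigma
    if is_var(t1):
        sigma[t1] = t2
        return sigma
    if is_var(t2):
        sigma[t2] = t1
        return sigma
    if isinstance(t1, tuple) and isinstance(t2, tuple) and t1[0] == t2[0]:
        return unify(t1[2], t2[2], unify(t1[1], t2[1], sigma))
    raise ValueError("types do not unify: no rigid grammar for this sample")

# GF(D) from the previous slide:
GF = {
    'a':     [('/', 'x1', 'x2'), ('/', 'x3', 'x4')],
    'fast':  [('\\', 'x5', ('\\', 'x3', 's'))],
    'fish':  ['x4'],
    'man':   ['x2'],
    'swims': [('\\', 'x1', 's'), 'x5'],
}

sigma = {}
for types in GF.values():
    for t in types[1:]:
        unify(t, types[0], sigma)   # later types unified against the first

RG = {w: subst(sigma, ts[0]) for w, ts in GF.items()}
# sigma is {x3 -> x1, x4 -> x2, x5 -> x1\s}; RG matches the slide's RG(D).
```

When some word's types fail to unify, no rigid grammar covers the sample, which is why the class learned is that of rigid grammars only.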
Properties of RG
φRG(〈T0, . . . , Tn〉) ⊒ RG(T0, . . . , Tn)
φRG learns Grigid from structures
φRG is incremental
φRG can be implemented to run in linear time
Formal Learning Theory:an Introduction – p.32/35
Isn’t there a contradiction?
Regular languages are not learnable from positive data
Context-free languages are not learnable from positive data
Rigid classical categorial languages are learnable, but they are “transversal” to Chomsky’s hierarchy
[Venn diagram: RCCL cutting across RL and CFL]
Formal Learning Theory:an Introduction – p.33/35
Other results
Extensions to k-valued classical categorial grammars converge (Kanazawa 1998)
k-valued categorial grammars are even learnable from strings (Kanazawa 1998), but that’s NP-hard (Costa-Florencio 2002)
Rigid Lambek grammars are learnable from structures (Bonato 2000)
Rigid Lambek grammars are not learnable from strings (Le Nir and Foret 2002)
Some classes of regular tree languages are learnable (Marion and Besombes 2001)
Formal Learning Theory:an Introduction – p.34/35
Open Issues
Learning WHAT? (CFG, Categorial, Minimalist, . . . )
Learning FROM WHAT? (sentences, structures, skeletons, . . . )
Learning HOW? (Gold, PAC, . . . )
Formal Learning Theory:an Introduction – p.35/35