

Computational Learning: Statistical & Soft-Computing Approaches: with focus on Language Structure using Fuzzy Similarity

Narendra S Chaudhari

School of Computer Engineering, Nanyang Technological University, Singapore

Emails: [email protected]; [email protected]; [email protected]

[Figure: parse tree for the sentence “the cat runs”: sentence → noun_phrase predicate; noun_phrase → article noun; predicate → verb; article → the; noun → cat; verb → runs]

Son of (Ethel) Sara Turing (conceived 1911 : Chatrapur, India; born 23 June 1912, London)

Prof. Nikhil Pal, Editor-in-Chief, IEEE Trans. Fuzzy Systems & Professor, Indian Statistical Institute (ISI), Calcutta, India


Language Structure using Fuzzy Similarity: Language Structure – Some Approaches

• Model Construction
– Grammar Learning
– Grammar Models of Practical Interest
– Regular Language Learning

• Context Free Grammar Learning
– Alignment Based Learning (ABL) [Zaanen, 2000]
– Alignment Profile
– Profile Similarity
– Indistinguishable Grammar Symbols
– Profile Based Alignment Learning (PBAL)

• Concluding Remarks

• Research Interests


“Human beings appear to be able to learn new concepts without needing to be programmed explicitly in any conventional sense.”

[Leslie G. Valiant, 1984], Harvard University, Aiken Computation Laboratory, Cambridge, MA, USA

Grammar Learning: BIG PICTURE

Formal Language Learning (Grammar Learning): automating the learning of formal languages (Chomsky hierarchy).


Grammar Learning: Model Construction

• Grammar Learning (also called Grammar Inference, GI) is the task of identifying a grammatical model from samples generated by the language.

[Figure: the sample strings 11, 111, 1111, … are given; the Grammar Model (S → …) that generates them is the unknown to be identified]


Grammar Learning: Approach – Induction and Inductive Inference

Induction: reasoning from a part to a whole, from particulars to generals, or from individual to the universal.

Inductive inference: process of hypothesizing a general rule from examples.

– Example:

100, 111100, 11000, 1110, 1100,…

– Guess:

“any number of 1’s followed by any number of 0’s”.
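The guess above can be checked mechanically against the examples. The sketch below is illustrative only: it encodes the guess as the regular expression `1*0*`, and adds a second, stricter hypothesis (`1+0+`, an assumption not taken from the slides) to show that several hypotheses can be consistent with the same positive examples.

```python
import re

# Positive examples from the slide.
samples = ["100", "111100", "11000", "1110", "1100"]

# The slide's guess: any number of 1's followed by any number of 0's.
any_ones_then_zeros = re.compile(r"1*0*")
# A stricter alternative hypothesis (illustrative assumption).
some_ones_then_zeros = re.compile(r"1+0+")

def consistent(hypothesis, positives):
    """Return True if the hypothesis accepts every positive example."""
    return all(hypothesis.fullmatch(s) is not None for s in positives)

print(consistent(any_ones_then_zeros, samples))
print(consistent(some_ones_then_zeros, samples))
# A string outside the guessed language is rejected.
print(any_ones_then_zeros.fullmatch("0101") is not None)
```

Both hypotheses accept all five examples, which is why positive examples alone do not single out one rule.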


E. M. Gold [1967] formulated the concept of “Identification in the Limit”, which views inductive inference as an infinite process.

He proves the “negative” result:

No “super-finite” language class (a class containing all finite languages and at least one infinite language) can be learnt from “positive” examples only.

E. M. Gold [1967] Language identification in the limit, Information and Control, Vol. 10, pp. 447-474, 1967.

Grammar Learning: E.M. Gold’s “negative” result


E. M. Gold [1978] formulated the concept of “Language learning in the Limit” (again viewing inductive inference as an infinite process).

He proves the “positive” result:

“Regular” languages are “learnable in the limit”.

“Practicality problems” (finding good algorithms) still remain.

E. M. Gold [1978] Complexity of automaton identification from given data. Information and Control, Vol. 37, pp. 302-320, 1978.

Grammar Learning: E.M. Gold’s “positive” result


Grammar Models of practical interest

• Regular Languages
• Context-free Languages
• Stochastic Extensions, e.g.
– S → AB (80%)
– S → B (20%)
• Sub classes
• Extended Models

[Figure: diagram with labels a, b, b, c]
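To illustrate the stochastic extension, here is a minimal sketch of how a rule for S could be chosen with the probabilities shown on the slide (80% for S → AB, 20% for S → B). The data layout and the function name `sample_rhs` are assumptions made for this example, and the expansions of A and B are left out because the slide does not give them.

```python
import random

# Stochastic rules for S as on the slide; A and B are left unexpanded.
RULES = {
    "S": [(("A", "B"), 0.8), (("B",), 0.2)],
}

def sample_rhs(lhs, rng):
    """Pick one right-hand side for `lhs` according to the rule probabilities."""
    alternatives, probabilities = zip(*RULES[lhs])
    return rng.choices(alternatives, weights=probabilities, k=1)[0]

rng = random.Random(0)  # fixed seed so repeated runs give the same draws
draws = [sample_rhs("S", rng) for _ in range(10000)]
share_ab = draws.count(("A", "B")) / len(draws)
print(f"share of S -> AB: {share_ab:.3f}")
```

With many draws, the observed share of S → AB should settle close to 0.8.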


Regular Language (Finite Automaton) Learning

• Learning from Representative Samples and Membership Queries [Angluin 1981]
• Learning by Membership and Equivalence Queries [Angluin 1987b]
• Learning Reversible Languages [Angluin 1982]

[Angluin 1981] D. Angluin, "A Note on the Number of Queries Needed to Identify Regular Languages", Information and Control, 51, pp. 76-87, 1981.
[Angluin 1987b] D. Angluin, "Learning regular sets from queries and counterexamples", Information and Computation, 75, pp. 87-106, 1987.
[Angluin 1982] D. Angluin, "Inference of Reversible Languages", Journal of the ACM, 29, pp. 741-765, 1982.


• We designed efficient algorithms for incremental learning of DFA.
– A version space based framework (using the lattice of DFAs) is reported in:
– Suresh Jain, Narendra S. Chaudhari, “An Incremental Algorithm for learning DFA from characteristic sample,” International Journal of Computational Intelligence Research (IJCIR), Vol. 3, No. 4, pp. 297-312, Dec 2007 (Online: http://www.ripublication.com/ijcirv3/ijcirv3n4_3.pdf).

Regular Language Learning: Some of Our Contributions


Language Structure using Fuzzy Similarity

• Model Construction
– Grammar Learning
– Grammar Models of Practical Interest
– Regular Language Learning

• Context Free Grammar Learning
– Alignment Based Learning (ABL) [Zaanen, 2000]
– Alignment Profile
  • Alignment Profile as a Fuzzy Set
– Profile Similarity
  • Profile Similarity formulated in terms of Fuzzy Operations
– Indistinguishable Grammar Symbols
  • Profile Similarity and Indistinguishable Grammar Symbols
– Profile Based Alignment Learning (PBAL)

• Concluding Remarks

• Research Interests



Alignment Based Learning (ABL)

Introduced by Zaanen [Zaanen 2000, 2002a] for context-free languages:
• Based on alignment information
• Unsupervised
• Does not require language details

[Zaanen 2000] Menno van Zaanen, "ABL: Alignment-Based Learning", Proceedings of the 18th International Conference on Computational Linguistics (COLING), Saarbrücken, Germany, pp. 961-967, 31 Jul-4 Aug 2000.

Induction of Linguistic Knowledge (ILK) Research Group, Dept. of Communication and Information Sciences, Faculty of Humanities, Tilburg University, P.O. Box 90153, NL-5000 LE Tilburg, The Netherlands


ABL Step 1

• Get alignments between each pair of sentences from the samples.

Oscar sees the large, green apple.

Cookie monster sees the red apple.

[Oscar] sees the [large, green] apple.

[Cookie monster] sees the [red] apple.


ABL Step 2

• Extract context-free grammar

A → Oscar
A → Cookie monster
B → large, green
B → red
S → A sees the B apple.

Oscar sees the large, green apple.

Cookie monster sees the red apple.
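The two ABL steps can be sketched for this sentence pair as follows. This is not van Zaanen's implementation: the alignment here uses Python's `difflib.SequenceMatcher` as a stand-in for ABL's alignment procedure, whitespace tokenization is assumed, and the helper names `align` and `extract_rules` are made up for the example.

```python
from difflib import SequenceMatcher

def align(sentence_a, sentence_b):
    """Return the pairs of unequal parts (slots) of two aligned sentences."""
    a, b = sentence_a.split(), sentence_b.split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    slots = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # the unequal parts are the hypothesized constituents
            slots.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return slots

def extract_rules(slots):
    """Assign one nonterminal (A, B, ...) per slot and list its expansions."""
    rules = []
    for k, (left, right) in enumerate(slots):
        nonterminal = chr(ord("A") + k)
        rules.append((nonterminal, left))
        rules.append((nonterminal, right))
    return rules

slots = align("Oscar sees the large, green apple.",
              "Cookie monster sees the red apple.")
for nonterminal, expansion in extract_rules(slots):
    print(f"{nonterminal} -> {expansion}")
```

The shared words ("sees the", "apple.") form the sentence frame, and each unequal slot becomes one nonterminal, which matches the grammar listed on this slide.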


Problems with ABL

• Unable to identify the following alignment:
– Book [a trip to Goa beach].
– Show [me Big Bird’s house].

• Generates the following alignment, which leads to incorrect grammatical rules:
– Gopal eats [biscuits].
– Gopal eats [well].

“Book” and “Show” are both verbs, but ABL cannot learn this concept of a verb.

“biscuits” and “well” should not be “clubbed” into a single concept (“biscuits” is a noun, “well” is an adverb), but ABL derives both from the same non-terminal symbol because the remaining part, “Gopal eats”, is aligned.


OUR EXTENSION: Profile Based Alignment Learning [1, 2]

• Objective:
– Refine the learned model by reducing the invalid rules that are identified
– Improve the precision of context-free grammatical rules extracted from samples

[1] X. Wang and N. S. Chaudhari, “Alignment Based Similarity Measure for Grammar Learning”, 2006 IEEE International Conference on Fuzzy Systems, pp. 1902-1909, July 16-21, 2006.
[2] X. Wang and N. S. Chaudhari, “Profile Based Alignment Learning System for Language Inference”, in Proceedings of the 14th International Conference on Intelligent and Adaptive Systems and Software Engineering, pp. 94-99, Toronto, Canada, July 20-22, 2005.

“biscuits” and “well” should not be “clubbed” into a single concept, but ABL derives both from the same non-terminal symbol because the remaining part, “Gopal eats”, is aligned.

“Book” and “Show” belong to the same category (verbs), but ABL cannot learn this concept of “same category”.



Alignment Profile

• Suppose
– “apples” is aligned with “pears” in one context
  • I brought some [apples] this morning
  • I brought some [pears] this morning
– “apples” is aligned with “well” in another context
  • Gopal eats [pears]
  • Gopal eats [apples]
  • Gopal eats [well]

• Then the alignment profile for “apples” is:

apples  pears  well
2/5     2/5    1/5

• The alignment profile for “pears” is:

apples  pears  well
2/5     2/5    1/5

• The alignment profile for “well” is:

apples  pears  well
1/5     1/5    1/5

The profiles of “apples” and “pears” match (√); the profile of “well” does not match them.
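The profile values above can be reproduced with a simple counting scheme. The sketch below is an assumption that happens to match the numbers on this slide, not necessarily the authors' exact definition (see the cited papers for that): each aligned slot is a list of the words seen in it, and the grade of a symbol v in the profile of a word w is the number of times v occurs in slots that also contain w, divided by the total number of slot fillers (5 here).

```python
from collections import Counter
from fractions import Fraction

# One list of slot fillers per aligned context (from the five sentences above).
slots = [
    ["apples", "pears"],          # I brought some [...] this morning
    ["pears", "apples", "well"],  # Gopal eats [...]
]

def alignment_profile(word, slots):
    """Return {symbol: grade} for `word` under the assumed counting scheme."""
    total = sum(len(slot) for slot in slots)  # 5 slot fillers in this example
    counts = Counter()
    for slot in slots:
        if word in slot:
            counts.update(slot)  # every filler sharing a slot with `word`
    return {symbol: Fraction(c, total) for symbol, c in counts.items()}

for word in ("apples", "pears", "well"):
    print(word, alignment_profile(word, slots))
```

Under this scheme "apples" and "pears" receive identical profiles, while "well" receives a different one because it only shares the second context.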


Alignment Profile: as a Fuzzy Set

• The alignment profile for “apples” is:

apples  pears  well
2/5     2/5    1/5

We define an alignment profile as a fuzzy set AP on the set of symbols N ∪ Σ, denoted by AP = { <v, μAP(v)> | v ∈ N ∪ Σ },

where μAP : N ∪ Σ → [0, 1] is the membership function, and μAP(v) is the membership grade of v in the fuzzy set AP.



Profile Similarity

• Our basic idea: words with higher “Profile similarity” tend to be generated from the same nonterminal in context-free languages

[Figure: scale in % ranging from Dissimilarity to Similarity]



Profile Similarity: formulated in terms of Fuzzy operations

Our approach to proving that words with higher “Profile similarity” tend to be generated from the same nonterminal in context-free languages:
– Formulate “Alignment Profile” as a Fuzzy Set over grammar symbols
– Formulate Profile Similarity in terms of Fuzzy Set operation(s)

The profile similarity of alignment profiles AP1 and AP2, denoted Ps(AP1, AP2), is formulated as:

$$Ps(AP_1, AP_2) = \frac{|AP_1 \cap AP_2|}{|AP_1 \cup AP_2|}$$

[Figure: scale in % ranging from Dissimilarity to Similarity]
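A minimal sketch of the profile similarity, the cardinality of the fuzzy intersection of AP1 and AP2 divided by the cardinality of their fuzzy union, is given below. The slide does not say which fuzzy operators are used, so this example assumes the standard ones: minimum for intersection, maximum for union, and the sum of membership grades as the cardinality. The profile values are the ones from the “apples / pears / well” example.

```python
from fractions import Fraction

def profile_similarity(ap1, ap2):
    """Fuzzy Jaccard similarity: |AP1 ∩ AP2| / |AP1 ∪ AP2| (min / max / sum assumed)."""
    symbols = set(ap1) | set(ap2)
    intersection = sum(min(ap1.get(v, 0), ap2.get(v, 0)) for v in symbols)
    union = sum(max(ap1.get(v, 0), ap2.get(v, 0)) for v in symbols)
    return intersection / union if union else 0

ap_apples = {"apples": Fraction(2, 5), "pears": Fraction(2, 5), "well": Fraction(1, 5)}
ap_pears = {"apples": Fraction(2, 5), "pears": Fraction(2, 5), "well": Fraction(1, 5)}
ap_well = {"apples": Fraction(1, 5), "pears": Fraction(1, 5), "well": Fraction(1, 5)}

print(profile_similarity(ap_apples, ap_pears))  # identical profiles
print(profile_similarity(ap_apples, ap_well))   # differing profiles
```

Identical profiles give a similarity of 1, and any difference in membership grades pulls the value below 1, which is the behaviour the two theorems later in the talk rely on.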



Indistinguishable Grammar Symbols: our definition †

Definition (indistinguishable grammar symbols): In a CFG G = (N, Σ, P, S), let A ∈ N and B ∈ N. If for all p ∈ P and for all “surrounding contexts” α, β with RHS(p) = αAβ, there exists a q ∈ P with RHS(q) = αBβ and LHS(p) = LHS(q), then A and B are called indistinguishable symbols. Otherwise, A and B are distinguishable.

p: X → αAβ
q: X → αBβ

† Motivation: we introduce this definition to formalize the construction that “merges” grammar symbols (e.g. “apples”, “pears”) with high profile similarity.
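The definition can be turned into a direct check over a set of productions. The sketch below is an illustration under stated assumptions: productions are represented as `(lhs, rhs_tuple)` pairs, the toy grammar is invented for the example, and the definition is read as symmetric (the condition is checked in both directions), which the slide does not state explicitly.

```python
def substitutable(productions, a, b):
    """True if every production with `a` in its RHS has a twin with `b` in that position."""
    rules = set(productions)
    for lhs, rhs in productions:
        for i, symbol in enumerate(rhs):
            if symbol == a:
                swapped = rhs[:i] + (b,) + rhs[i + 1:]
                if (lhs, swapped) not in rules:
                    return False
    return True

def indistinguishable(productions, a, b):
    """Symmetric reading (assumption): the condition must hold in both directions."""
    return substitutable(productions, a, b) and substitutable(productions, b, a)

# Hypothetical toy grammar: A -> apples, B -> pears, C -> well.
productions = [
    ("S", ("I", "brought", "some", "A", "this", "morning")),
    ("S", ("I", "brought", "some", "B", "this", "morning")),
    ("S", ("Gopal", "eats", "A")),
    ("S", ("Gopal", "eats", "B")),
    ("S", ("Gopal", "eats", "C")),
    ("A", ("apples",)),
    ("B", ("pears",)),
    ("C", ("well",)),
]

print(indistinguishable(productions, "A", "B"))
print(indistinguishable(productions, "A", "C"))
print(substitutable(productions, "C", "A"))
```

In this toy grammar A and B pass the check, while A and C do not, because C never appears in the "I brought some ... this morning" context. The one-directional check from C to A still succeeds, which is why the example requires both directions.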


Indistinguishable Grammar Symbols & Profile Similarity

Theorem 1. In a SCFG G(N, Σ, P, S), let A ∈ N, w1 ∈ N ∪ Σ, w2 ∈ N ∪ Σ, and {A → w1 (pr1), A → w2 (pr2)} ⊆ P with pr1 > 0, pr2 > 0. Assume that A can be generated from S. Then, given enough generated sentences, Ps(AP(w1), AP(w2)) = 1.

Theorem 2. In a SCFG G(N, Σ, P, S), let A ∈ N, w1 ∈ N ∪ Σ, w2 ∈ N ∪ Σ, and {A → w1 (pr1), B → w2 (pr2)} ⊆ P with pr1 > 0, pr2 > 0. Assume that A and B are distinguishable. Then the profile similarity is Ps(AP(w1), AP(w2)) = 1 - σ, where σ is a positive real number less than 1.

Note: To represent the “frequency of occurrence” of a rule, we use the “stochastic” extension of CFG, i.e. SCFG.


Example

Sample text 1: London came out on top when the announcement was made by IOC President Jacques Rogge in Singapore.

Sample text 2: This was the fourth bid from Britain. London will become the first city to have hosted the Olympics three times.

Sample text 3: This was the fourth bid from Britain. London will become the first city to have hosted the Olympics three times.

The production rules and their counts are:

p1: Sn1 → [London] [came] [out] [on] [top] [when] [the] [announcement] [was] [made] [by] [IOC] [President] [Jacques] [Rogge] [in] [Singapore], (C(p1) = 1)
p2: S → Sn1, (C(p2) = 1)
p3: Sn2 → [This] [was] [the] [fourth] [bid] [from] [Britain], (C(p3) = 2)
p4: Sn3 → [London] [will] [become] [the] [first] [city] [to] [have] [hosted] [the] [Olympics] [three] [times], (C(p4) = 2)
p5: S → Sn2 Sn3, (C(p5) = 2)
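The rule counts in this example can be reproduced by counting how often each sentence and each text structure occurs. The sketch below is illustrative: splitting sentences on full stops and numbering sentence nonterminals (Sn1, Sn2, ...) in order of first appearance are assumptions made for the example.

```python
from collections import Counter

texts = [
    "London came out on top when the announcement was made by IOC President Jacques Rogge in Singapore.",
    "This was the fourth bid from Britain. London will become the first city to have hosted the Olympics three times.",
    "This was the fourth bid from Britain. London will become the first city to have hosted the Olympics three times.",
]

def split_sentences(text):
    """Naive sentence splitter: full stops end sentences (assumption)."""
    return [s.strip() for s in text.split(".") if s.strip()]

sentence_ids = {}            # sentence -> "Sn1", "Sn2", ...
sentence_counts = Counter()  # counts of the rules Sn_k -> [word] [word] ...
text_counts = Counter()      # counts of the rules S -> Sn_i Sn_j ...

for text in texts:
    ids = []
    for sentence in split_sentences(text):
        sid = sentence_ids.setdefault(sentence, f"Sn{len(sentence_ids) + 1}")
        sentence_counts[sid] += 1
        ids.append(sid)
    text_counts[tuple(ids)] += 1

for sid, count in sentence_counts.items():
    print(f"{sid}: C = {count}")
for rhs, count in text_counts.items():
    print(f"S -> {' '.join(rhs)}: C = {count}")
```

The top-level rule that is counted twice expands to the two sentences shared by sample texts 2 and 3. Dividing each count by the total count of rules with the same left-hand side would give the rule probabilities of the SCFG.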



Profile Based Alignment Learning: PBAL Framework

Makes use of the following concepts:
– Slot Alignment Score
– Sentence Similarity
– Sentence Similarity Threshold
– Dynamic Sentence Similarity Threshold

STEPS in Our PBAL Framework:
• Find pairwise alignment for each sentence pair
• Calculate and accumulate alignment counts (slot alignment, sentence similarity)
• Calculate alignment profile and profile similarities
• Redo the alignment until there is no further change
• Extract grammar by extracting a non-terminal for each slot


Experiments, and Example Rules Generated

For experiments and testing, we used:
• CHILDES database: a collection of child-related speech
• A sample from the English-American corpora
• No editing / cleanup was done; the sample was used directly for alignment
• Number of sentences: more than 10,000

One sample rule is:

don't put that in the {1155}[dough ] (+)
oh lets put that in the {1155}[middle ]

where {1155} is the ID of a nonterminal of the CFG, with [dough] and [middle] identified as generated from {1155}.

In the numeric rule format, the number in brackets is the number of times the rule is used:
\769 = \437 \510 \544 \757 .(1)

CHILDES Reference: B. MacWhinney, The CHILDES Project: Tools for analyzing talk, third ed., Lawrence Erlbaum Associates, Mahwah, NJ, 2000


Rules Generated: Example

\769 = \437 \510 \544 \757 .(1)
\769 = \470 \515 \508 \758 \534 \466 \759 \470 \515 .(1)
\769 = \578 \760 \439 \399 \761 .(1)
\769 = \478 \499 \617 \484 \510 \490 \762 \469 \490 \475 \527 \647 \465 .(1)
\769 = \385 \473 \763 \764 \445 \765 .(1)
\769 = \394 \768 \402 \533 \766 \765 \647 .(1)
\769 = \552 \442 \766 \765 \647 .(1)
\769 = \552 \606 \406 .(1)
\1073 = \594 .(1)
\1072 = \635 .(1)
\1073 = \679 .(1)
\1072 = \680 .(1)
\1078 = \419 .(1)
\1078 = \431 .(1)
\1079 = \450 \542 .(1)
\1080 = \675 .(1)
\1080 = \580 \507 \678 .(1)
\1081 = \450 .(1)
\1079 = \733 \734 .(1)
\1081 = \465 .(1)
\1090 = \386 \387 \388 \389 .(1)
\1092 = \497 .(1)
\1091 = \487 \498 .(1)
\1093 = \390 \516 .(1)
\1094 = \526 .(1)
\1094 = \527 \528 .(1)
\1090 = \536 \537 \537 \538 \521 .(1)
\1092 = \582 .(1)
\1091 = \599 \439 .(1)
\1095 = \435 \460 \545 \414 \530 .(1)
\1095 = \416 \622 \516 .(1)
\1093 = \400 \647 \646 .(1)
\1097 = \515 .(1)
\1096 = \503 .(1)
\1098 = \460 .(1)
\1097 = \563 \690 .(1)
\1096 = \691 .(1)
\1098 = \414 .(1)
\1099 = \385 \466 \437 .(1)
\1099 = \484 \401 .(1)
\1120 = \460 .(1)
\1120 = \1098 .(1)
\1124 = \400 \564 .(1)
\1123 = \565 .(1)
\1124 = \390 .(2)
\1123 = \1095 .(2)
\1129 = \668 \544 \440 \669 .(1)
\1129 = \536 \766 \765 \767 .(1)
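Rules in this numeric format can be read back into a structured form with a small parser. The sketch below is a hypothetical helper, not part of the PBAL system: it assumes every rule has the shape `\LHS = \ID \ID ... .(count)` as in the listing, and it accepts rules written one per line or run together.

```python
import re

# \LHS = \ID \ID ... .(count)
RULE = re.compile(r"\\(\d+) = ((?:\\\d+ )+)\.\((\d+)\)")

def parse_rules(text):
    """Return a list of (lhs_id, [rhs_ids], usage_count) tuples."""
    rules = []
    for lhs, rhs, count in RULE.findall(text):
        rhs_ids = [int(n) for n in re.findall(r"\d+", rhs)]
        rules.append((int(lhs), rhs_ids, int(count)))
    return rules

listing = r"\769 = \437 \510 \544 \757 .(1)\1124 = \390 .(2)\1123 = \1095 .(2)"
for lhs, rhs, count in parse_rules(listing):
    print(lhs, "->", rhs, "used", count, "time(s)")
```

Grouping the parsed tuples by their left-hand side ID then shows which right-hand sides were identified as alternatives of the same nonterminal, for example the two expansions of \1124.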


Precision and Number of Rules Generated by Different PBAL Formulations

Formulation           Precision   # of Rules Generated
ABL SST               85.82%      14
PBAL NSA              91.36%      22
PBAL SA               91.47%      34
R_Score based PBAL    94.97%      60
Dynamic SST PBAL      96.93%      99



Language Structure Learning: Concluding Remarks

• Grammar Learning: computationally hard
– More tractable with “additional information” given in terms of:
  • ‘negative’ examples
  • ‘structural’ information

• Learning of Regular Grammars:
– Many approaches have been proposed by researchers
– We have developed one approach based on “Version Spaces”, useful for incremental learning

• Learning of Context Free Grammars:
– Researchers have investigated approaches based on soft-computing models
– We extended Zaanen’s Alignment Based Learning (ABL) to “Profile Based Alignment Learning” (PBAL)

• Applications:
– One area: Games


Questions ? …

• More information:
– Web: www.ntu.edu.sg/home/asnarendra
• Contact
– Email: [email protected]
– Personal Emails: [email protected], [email protected]


Research Interests

• Algorithms
– Graph Isomorphism
– Parsing of Context Free Grammars (CFGs)

• Soft Computing
– Specialized Recurrent Neural Networks (RNNs) for Protein Secondary Structure (PSS) Prediction
  • Long Short-Term Memory (LSTM) Networks
  • Segmented Memory Recurrent Neural Networks (SMRNNs)
  • Bidirectional SMRNNs (BSMRNNs)
– Binary Neural Networks (BNNs)
  • Construction Method(s)
– Grammar Learning
  • For Regular Grammars: Use of Version Spaces
  • For CFGs: Use of Profile Based Alignment Learning

• Simulations and Games
– BSP for Urban Terrain Modeling
– Game AI: non-conventional


Non-conventional Game AI: Computational Learning, and Computer Science Models: Outline

• Non-conventional Game AI
– Games with Computational Models
  • Computational Learning
  • Soft-computing
  • Neural Networks and Binary Neural Networks

• Concluding Remarks


Computational Models: Formal Language

• Language Structures
– Verbal Explanation (Study of Forms of Language Structures)

• Formal Languages
– Chomsky’s classification:
  • Regular (Type 3) Languages
  • Context Free (Type 2) Languages
  • Context Sensitive (Type 1) Languages
  • Phrase Structured (Type 0) Languages


Automaton Model               Formal Languages they represent
Table (represents mapping)    … (phoneme – viseme)
Finite Automaton (FA)         Regular languages
Push-down automata            Context-free languages
Linear Bounded Automata       Context-sensitive languages
Turing machines               Phrase-structured languages


Our Work: Computational Learning - I

• Learning of Regular Grammars
– Through “Reversible Automata”
– Construction through
  • Positive Examples, and
  • Negative Examples

• “Optimal Construction” remains NP-Complete
– However, “good” algorithms are possible for construction

• Games with such learning models

Web: http://www.ntu.edu.sg/home/asnarendra


Our Work: Computational Learning - II

• Learning of Context Free Grammars
– Through “Structured Examples”
– Construction through
  • Positive “structured” Examples, and
  • Negative “structured” Examples

• Use of Soft-Computing Approaches for learning of Structures

• PBAL Based Approach for CFG Learning

• Games with such learning models


Soft Computing

• Current state of soft computing: a collection of the following techniques
– Fuzzy Logic
– Neural Networks
– Evolutionary Computing Techniques inspired by behavioral studies, like
  • Genetic Algorithms
  • Ant colony optimization
  • Small world theory
  • Theory of memes


Neural Networks

• Models cortical structures of the brain.

• Neurons: interconnected processing elements that work together to produce output.

• Co-operation: the output relies on the cooperation of the individual neurons within the network.

• Parallel Processing: processing of information is parallel (in contrast with the sequential nature of the Turing model).


Neural Networks

• Difficulties:
– Initialization of the network's starting structure,
– training parameters, and
– weight updates

• Other factors:
– Number of hidden layers
– Number of hidden nodes
– Initial weights

Some (famous) solutions:
– Self Organizing Maps
– Binary Neural Networks


Binary Neural Networks (BNNs)

• Hard-limiter nonlinearity

• Our Contribution: constructive methods for determining
– Number of hidden layers
– Number of hidden nodes

• Based on geometric (hypersphere) concepts, we have developed BNN construction methods.

[Figure: hard-limiter neuron with weights w1 = +1 and w2 = +2, a Sum unit with Threshold = +1, and output ±1]
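The neuron in the figure can be sketched as a weighted sum followed by a hard limiter. The weights (+1, +2), the threshold (+1) and the ±1 output come from the slide; the bipolar (±1) inputs and the choice of "greater than or equal" at the threshold are assumptions made for this example.

```python
def hard_limiter_neuron(inputs, weights=(1, 2), threshold=1):
    """Return +1 if the weighted sum reaches the threshold, otherwise -1."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum >= threshold else -1

# Evaluate the neuron on all four bipolar input pairs.
for x1 in (-1, 1):
    for x2 in (-1, 1):
        print((x1, x2), hard_limiter_neuron((x1, x2)))
```

With these values the output follows the second input, since its weight dominates the sum. Changing the weights or the threshold moves the separating hyperplane, which is what a constructive BNN method has to determine for each hidden node.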


Future Game AI: Computational Learning, and Computer Science Models: Outline

• Non-conventional Game AI
– Games with Computational Models
  • Computational Learning
  • Soft-computing
  • Neural Networks and Binary Neural Networks

• Concluding Remarks


Non-conventional Game AI: Concluding Remarks

• Two Computational Models
– Chomsky’s Grammars: Formal Languages
– Automata Models: FA, PDA, TMs

• Existing Computational Learning Techniques allow automatic construction of
– Regular Languages, and
– Part of Context Free Languages
– Algorithms for such constructions remain compute intensive, especially for Context Free Languages


Non-conventional Game AI: Concluding Remarks: Continued …

• Neural Network Construction: Trial & Error, and Time Consuming
– “Constructive methods”
  • Many “specialized” variants available
  • GA based Approaches: NEAT (2004), RT-NEAT (2006), etc.
  • Binary Neural Networks
    – Construction methods based on Geometric Space Expansion

• Future Game AI
– would integrate these technologies


Questions ? …

Email: [email protected]; Personal Emails: [email protected], [email protected]

Web: www.ntu.edu.sg/home/asnarendra