Zulu: an active finite state machine learning competition

Colin de la Higuera. ICGI, Valencia, September 2010.

Page 1: Zulu: an active finite state machine learning competition

ICGI, Valencia, September 2010

Zulu: an active finite state machine learning competition

Valencia, September 2010

Colin de la Higuera

Page 2: Zulu: an active finite state machine learning competition

Cdlh 2010

General goal

http://labh-curien.univ-st-etienne.fr/zulu

To support research in DFA learning

To promote active learning as an alternative to statistical learning

To attempt to use learning for under-resourced languages

Page 3: Zulu: an active finite state machine learning competition

State of the art (1)

1. Learning automata is a difficult but great topic, with not enough positive results (… do come this afternoon…)

2. The question of learning DFA has received attention for 30 years

3. The typical protocol consists of learning from a batch of data: you need a lot of data if you want to learn…

Page 4: Zulu: an active finite state machine learning competition

State of the art (2)

1. Alternative introduced by Angluin: the learner can make queries to an oracle

2. Typical queries are membership queries, equivalence queries, subset queries, or correction queries

3. Algorithm L* can learn DFA with a polynomial amount of resources

Page 5: Zulu: an active finite state machine learning competition

State of the art (3)

Many reasons for wanting to learn DFA from queries

Useful in a number of fields

Start with DFA…

Under-resourced languages

Page 6: Zulu: an active finite state machine learning competition

The task

The participant is told that (s)he is to learn a DFA and is allowed to ask k membership queries.

(S)he is given the alphabet, k, and an upper bound on the number of states.

The participant interactively uses the online oracle and, after making k queries, is given 1800 strings that (s)he has to parse and classify. The score is the percentage of correct labels.

Page 7: Zulu: an active finite state machine learning competition

The baseline

Angluin’s L* algorithm learns perfectly but uses both membership queries (MQs) and equivalence queries (EQs)

A version in which EQs are “simulated” by random sampling is provided

Page 8: Zulu: an active finite state machine learning competition

A membership query

Learner: does aababababbbab belong to the language?

Oracle: no
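A minimal sketch of how such an oracle could answer membership queries under a budget of k. The slides do not give the target DFA, so the automaton below (strings over {a, b} with an even number of b's) is a hypothetical stand-in, and the names MembershipOracle and BudgetExceeded are illustrative, not part of the Zulu server.

```python
class BudgetExceeded(Exception):
    """Raised when the learner asks more than k membership queries."""


class MembershipOracle:
    """Answers membership queries for a complete DFA, at most k times."""

    def __init__(self, transitions, start, accepting, k):
        self.transitions = transitions  # maps (state, symbol) -> state
        self.start = start
        self.accepting = accepting      # set of accepting states
        self.k = k                      # query budget
        self.used = 0                   # queries asked so far

    def query(self, word):
        if self.used >= self.k:
            raise BudgetExceeded(f"all {self.k} queries have been used")
        self.used += 1
        state = self.start
        for symbol in word:
            state = self.transitions[(state, symbol)]
        return state in self.accepting


# Hypothetical target (an assumption): an even number of b's.
EVEN_B = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 1, (1, "b"): 0}
oracle = MembershipOracle(EVEN_B, start=0, accepting={0}, k=3)
print(oracle.query("aababababbbab"))  # False: the word contains seven b's
```

Counting the queries inside the oracle keeps the budget check in one place, so the learner cannot exceed k by accident.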

Page 9: Zulu: an active finite state machine learning competition

An equivalence query

Learner: Is (aa*(b+ab)*bb+aa)* the correct answer?

Oracle: No, because aabababba does belong to the language
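When the oracle holds the target as a DFA, an exact equivalence query can be answered by a breadth-first search over pairs of states, which returns a shortest string on which hypothesis and target disagree. The sketch below assumes both automata are complete DFAs written as (transitions, start, accepting) triples; the function name and the two example automata are hypothetical.

```python
from collections import deque


def find_counterexample(hypothesis, target, alphabet):
    """Return a shortest string classified differently, or None if equivalent."""
    (t1, s1, f1), (t2, s2, f2) = hypothesis, target
    start = (s1, s2)
    queue = deque([(start, "")])
    seen = {start}
    while queue:
        (p, q), word = queue.popleft()
        if (p in f1) != (q in f2):
            return word  # exactly one of the two automata accepts this word
        for symbol in alphabet:
            nxt = (t1[(p, symbol)], t2[(q, symbol)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + symbol))
    return None  # no reachable pair of states disagrees


# Hypothetical target (an assumption): an even number of b's.
target = ({(0, "a"): 0, (0, "b"): 1, (1, "a"): 1, (1, "b"): 0}, 0, {0})
# Hypothetical hypothesis: a one-state DFA that accepts every string.
accept_all = ({(0, "a"): 0, (0, "b"): 0}, 0, {0})

print(find_counterexample(accept_all, target, "ab"))  # b
```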

Page 10: Zulu: an active finite state machine learning competition

Simulating an equivalence query

Random strings are sampled: aabba, bbabba, aaaababab, bbabababaaaa,…

Learner’s hypothesis: aabba ∈ L

Learner: does aabba belong to L?

Oracle: yes (if we agree many times, I can’t be far off)

or

Oracle: no (aabba can be used as a counterexample)
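The sampling idea above can be sketched as follows. This is not the baseline shipped with the competition. The sampling distribution (length uniform in 0..max_len, symbols uniform) and all names are assumptions made for illustration. Each sampled string costs one membership query.

```python
import random


def simulated_equivalence_query(hypothesis, membership_query, alphabet,
                                n_samples=100, max_len=12, rng=None):
    """Return a sampled string on which hypothesis and oracle disagree, or None."""
    rng = rng or random.Random()
    for _ in range(n_samples):
        length = rng.randint(0, max_len)
        word = "".join(rng.choice(alphabet) for _ in range(length))
        if hypothesis(word) != membership_query(word):
            return word  # usable as a counterexample
    return None  # agreed on every sample: the hypothesis is kept for now


def target(word):
    # Hypothetical target language (an assumption): an even number of b's.
    return word.count("b") % 2 == 0


def hypothesis(word):
    # Deliberately poor hypothesis: every string belongs to L.
    return True


ce = simulated_equivalence_query(hypothesis, target, "ab", rng=random.Random(0))
print("counterexample:", ce)
```

A return value of None only means that no disagreement was sampled; it does not prove that the hypothesis is correct.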

Page 11: Zulu: an active finite state machine learning competition

The theory

DFA are learnable with MQ and EQ

DFA are not learnable from a polynomial number of MQ

You can’t really simulate the EQ through sampling because you don’t know what the distribution is

Page 12: Zulu: an active finite state machine learning competition

The oracle (1)

is given an upper bound n on the number of states and the size of the alphabet

generates a (minimal) DFA with at most n states

runs the baseline on this DFA and halts as soon as it is 70% correct. This gives the number of queries (k) for that task.

gives the player an identifier.

Page 13: Zulu: an active finite state machine learning competition

The oracle (2)

interacts with the learner and answers k queries

generates 1800 strings and gives them to the learner

receives the 1800 labels and computes the score
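The three steps above can be sketched as one session object. It is an illustration under assumptions, not the actual Zulu server: the class name ZuluSession, its parameters, and the way test strings are drawn (length uniform in 0..max_len, symbols uniform) are all hypothetical, and the target is passed in as a plain function.

```python
import random


class ZuluSession:
    """Oracle side of one task: k queries, then a test set, then a score."""

    def __init__(self, accepts, alphabet, k, n_test=1800, max_len=20, seed=0):
        self._accepts = accepts  # function: word -> bool (the hidden target)
        self.remaining = k       # membership queries still allowed
        rng = random.Random(seed)
        self._test = [
            "".join(rng.choice(alphabet) for _ in range(rng.randint(0, max_len)))
            for _ in range(n_test)
        ]

    def membership_query(self, word):
        if self.remaining <= 0:
            raise RuntimeError("query budget exhausted")
        self.remaining -= 1
        return self._accepts(word)

    def test_strings(self):
        return list(self._test)

    def submit(self, labels):
        """Return the percentage of test strings labelled correctly."""
        if len(labels) != len(self._test):
            raise ValueError("one label per test string is required")
        correct = sum(lab == self._accepts(w)
                      for lab, w in zip(labels, self._test))
        return 100.0 * correct / len(self._test)


def even_b(word):
    # Hypothetical target language (an assumption): an even number of b's.
    return word.count("b") % 2 == 0


session = ZuluSession(even_b, "ab", k=2)
session.membership_query("ab")
session.membership_query("bb")
# A learner that has identified the target exactly labels everything correctly.
labels = [even_b(w) for w in session.test_strings()]
print(session.submit(labels))  # 100.0
```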

Page 14: Zulu: an active finite state machine learning competition

Scientific committee

Dana Angluin, Yale University, USA
Leo Becerra Bonache, Univ. de Tarragona, Spain
François Coste, IRISA, Rennes, France
Alex Clark, Royal Holloway Univ. of London, UK
Ricard Gavaldá, UPC Barcelona, Spain
Colin de la Higuera, U. Saint-Etienne/Nantes, France
Jean-Christophe Janodet, U. de Saint-Etienne, France
Aurélien Lemay, Université de Lille 3, France
Laurent Miclet, ENSSAT Lannion and IRISA, France
Tim Oates, University of Maryland, USA
Anssi Yli-Jyrä, Helsinki, Finland
Menno van Zaanen, Tilburg University, The Netherlands

Page 15: Zulu: an active finite state machine learning competition

Organisation committee

Myrtille Ponge
David Combe
Jean-Christophe Janodet
Colin de la Higuera

Page 16: Zulu: an active finite state machine learning competition

Some open issues

How should the DFA be generated? What is a random DFA? Generate random NFA instead? Should they not be “typical DFA”?

What distribution for the test set? If the distribution is known, this helps!

How do we have a fair competition?

Page 17: Zulu: an active finite state machine learning competition

Main dates

23rd of July 2009: official launch

Until May 2010: advertising and training phase

June 2010: competition phase

7th July 2010: results published

September 2010: Workshop / Special session

Page 18: Zulu: an active finite state machine learning competition

Zulu competition

http://labh-curien.univ-st-etienne.fr/zulu

23 competing algorithms, 11 players

End of the competition a week ago.

Tasks: Learn a DFA, be as precise as possible, with n queries

Page 19: Zulu: an active finite state machine learning competition

Results

Task  Queries  Alphabet  States  Best %
   1      304         3       8  100.00
   2      199         3      16  100.00
   3     1197         3      81   96.50
   4     1384         3     100   93.22
   5     1971         3     151   85.89
   6     3625         3     176  100.00
   7      429         5      15  100.00
   8      375         5      18  100.00
   9     2524         5      84   96.44
  10     3021         5      90  100.00
  11     5428         5     153   99.94
  12     4616         5     123  100.00
  13      725        15      10  100.00
  14     1365        15      17  100.00
  15     5266        15      60  100.00
  16     7570        15      71  100.00
  17    17034        15     147  100.00
  18    16914        15     143   87.94
  19     1970         5      93   81.67
  20     1329         5      61   70.00
  21      571         5      40   69.22
  22      735         5      57   65.11
  23      483         5      73   86.61
  24      632         5      78  100.00

Page 20: Zulu: an active finite state machine learning competition

Winners

Falk Howar
Balle
Eisenstat