euler: a logic‐based toolkit for aligning & reconciling multiple taxonomic perspectives
DESCRIPTION
CIRSS (Center for Informatics Research in Science and Scholarship) Seminar talk given on Sept. 19, 2014 at GSLIS, UIUC. http://cirssweb.lis.illinois.edu/Events/eventDetails.php?id=214
TRANSCRIPT
![Page 1: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/1.jpg)
Euler: A Logic-Based Toolkit for Aligning and Reconciling Multiple Taxonomic Perspectives
Mingmin Chen (1), Shizhuo Yu (1), Parisa Kianmajd (1), Nico Franz (2), Shawn Bowers (3), Bertram Ludäscher (4)
(1) Dept. of Computer Science, University of California, Davis
(2) School of Life Sciences, Arizona State University
(3) Dept. of Computer Science, Gonzaga University
(4) GSLIS & NCSA, University of Illinois at Urbana-Champaign
![Page 2: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/2.jpg)
Outline
• Meet Nico, Curator of Insects
• TAP: The Taxonomy Alignment Problem
• Euler/X – Logic Inside! (X in FOL, RCC, ASP)
• Related Projects
B. Ludäscher Euler: Reasoning about Taxonomies CIRSS Seminar 9/19/2014 2
![Page 3: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/3.jpg)
Meet Prof. Nico Franz: Curator of Insects @ ASU
![Page 4: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/4.jpg)
What Nico et al. do for a living …
![Page 5: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/5.jpg)
Use Case: Perelleschus sec. 2001 & 2006 [1]
Perelleschus salpinflexus sec. Franz & Cardona-Duque (2013) DOI:10.1080/14772000.2013.806371
[1] Input articulations: Franz & Cardona-Duque. 2013. Description of two new species and phylogenetic reassessment of Perelleschus Wibmer & O'Brien, 1986 (Coleoptera: Curculionidae), with a complete taxonomic concept history of Perelleschus sec. Franz & Cardona-Duque, 2013. Systematics and Biodiversity 11: 209–236. Merge analyses: Franz et al. 2014. Reasoning over taxonomic change: exploring alignments for the Perelleschus use case. PLoS ONE. (in press)
![Page 6: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/6.jpg)
Goal: Align two phylogenies with differential taxon sampling
T1: Perelleschus sec. 2001
• Phylogenetic revision
• 8 ingroup species concepts
• 2 outgroup concepts
• 18 concepts total
T2: Perelleschus sec. 2006
• Exemplar analysis
• 2 ingroup species concepts
• 1 outgroup concept
• 7 concepts total
Source: Nico Franz. Explaining taxonomy's legacy to computers – how and why? The Meaning of Names: Naming Diversity in the 21st Century, Museum of Natural History, U of Colorado, 9/30/2014.
![Page 7: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/7.jpg)
What Nico does for a living (cont’d): The Indoors Part
• Go fun places, find new bugs, study them …
  – “Bugs-R-Us” (see taxonbytes.org)
• Now: Compare, align and revise taxonomies, based on careful observation, “character” data, expertise …
• Formally:
  – Input: T1 + T2 (taxonomies) + A (expert articulations)
  – Output: revised, “merged” taxonomy (-ies) T3
![Page 8: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/8.jpg)
Taxonomy Alignment Problem (TAP)
• Given:
  – Taxonomies T1, T2
    • incl. constraints (coverage, disjointness)
  – Set of articulations (an alignment) A
• Find:
  – Combined (“merged”) taxonomy T3 (= T1 + T2 + A)
    • Is it a taxonomy? Or a DAG?
  – Optional:
    • Final alignment (should be minimal)
[Figure: T1 and T2, linked by the alignment A, are combined into T3.]
![Page 9: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/9.jpg)
Real Example: Turn this …
[Figure: input alignment graph. Concepts of T1 (1.11–1.27 and 1.12L) are linked to concepts of T2 (2.35–2.54 and 2.36L) by articulations labeled ==, !, "< OR ==", or "> OR !". Nodes: taxonomy 1: 18, taxonomy 2: 21. Edges: isa_1: 17, isa_2: 20, articulations: 20.]
![Page 10: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/10.jpg)
… into this! (Perelleschus Alignment Result)
• T3 := T1 and T2 are “merged”
  – Blue dashed: overlaps → resolve via “zoom-in view”
[Figure: merged taxonomy T3. Congruent concepts are combined into single nodes (e.g., 1.17/2.41, 1.22/2.46, 1.12L/2.36L). Legend: Nodes: Taxonomy 1: 5, Taxonomy 2: 8, merged taxa: 13. Edges: overlaps: 10, input: 24, inferred: 5.]
![Page 11: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/11.jpg)
So how does it work?
• If you have 3 concepts A, B, and C.
• Assume you know something about
  – A ⇔R1 B (e.g., R1: A is a subset of B)
  – B ⇔R2 C (e.g., R2: B is disjoint from C)
• Now what can you say about this:
  – A ⇔R3 C
• Yes ??
• … it follows that R3: A is disjoint from C!
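The inference on this slide can be checked mechanically on small examples. The following is a minimal sketch, assuming taxa are modeled as non-empty finite sets; the helper names `nonempty_subsets` and `composition_holds` are made up for illustration and are not part of Euler/X. It searches every triple of regions over a four-element universe for a counterexample to "A subset of B, B disjoint from C, hence A disjoint from C".

```python
from itertools import combinations, product


def nonempty_subsets(universe):
    """All non-empty subsets of a small universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def composition_holds(universe):
    """True if no triple (A, B, C) violates: A <= B and B ! C implies A ! C."""
    regions = nonempty_subsets(universe)
    for a, b, c in product(regions, repeat=3):
        if a <= b and not (b & c):   # premises R1 and R2 hold
            if a & c:                # conclusion R3 would be violated
                return False
    return True


result = composition_holds({1, 2, 3, 4})
print(result)
```

An exhaustive search over a finite universe is only a sanity check, not a proof; the talk's reasoners derive such consequences symbolically.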
![Page 12: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/12.jpg)
Articulation Language (RCC-5)
• How does the expert express the known (or assumed) relationship between taxa A and B?
• How can A and B be related?
• Use basic set constraints (B5):
  – A = B (equals, EQ) (==)
  – A < B (proper part of, PP) (<)
  – A > B (inverse proper part of, IPP) (>)
  – A o B (partially overlaps, PO) (><)
  – A ! B (disjoint “region”, DR) (!)
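To make the five base relations concrete, here is a small sketch that classifies a pair of taxa, assuming each taxon is represented by a non-empty set of member instances. The function name `rcc5` and the set-based encoding are illustrative assumptions; the returned symbols follow the notation on the slide.

```python
def rcc5(a, b):
    """Return the RCC-5 base relation between two non-empty sets."""
    a, b = frozenset(a), frozenset(b)
    if not a or not b:
        raise ValueError("taxa are modeled as non-empty sets")
    if a == b:
        return "=="   # EQ: equals
    if a < b:
        return "<"    # PP: proper part of
    if a > b:
        return ">"    # IPP: inverse proper part of
    if a & b:
        return "><"   # PO: partially overlaps
    return "!"        # DR: disjoint


examples = [
    rcc5({1, 2}, {1, 2}),
    rcc5({1}, {1, 2}),
    rcc5({1, 2}, {1}),
    rcc5({1, 2}, {2, 3}),
    rcc5({1}, {2}),
]
print(examples)
```

Because the five tests are exhaustive and mutually exclusive for non-empty sets, exactly one relation is returned for any pair.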
![Page 13: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/13.jpg)
Taxonomies and Articulations in Euler
A taxonomy T is a triple (N, ≼, ϕ) with names (taxa) N, a partial order (is-a) ≼, and taxonomic constraints ϕ.
• Sibling Disjointness: sibling taxa do not overlap
• (Parent) Coverage: The union of the children “covers” the parent → no “missing” children
An articulation is a relation (set-constraint) between taxa A and B. One, and only one, of the following base relations B5 must hold:
(i) congruence
(ii) proper part
(iii) inverse proper part
(iv) partial overlap
(v) disjointness
There are 32 (= 2^5) possible disjunctions for representing partial information.
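The two taxonomic constraints can be illustrated with a short check over explicit sets. This is a sketch under assumptions: the `children` mapping stands in for the is-a order, `extent` assigns each taxon a hypothetical set of member instances, and `check_taxonomy` is an invented helper, not an Euler/X function.

```python
from itertools import combinations


def check_taxonomy(children, extent):
    """Return the violated constraints for a taxonomy.

    children: parent name -> list of child names (the is-a order)
    extent:   taxon name  -> set of hypothetical member instances
    """
    violations = []
    for parent, kids in children.items():
        # Sibling disjointness: no two children share an instance.
        for x, y in combinations(kids, 2):
            if extent[x] & extent[y]:
                violations.append(("sibling disjointness", x, y))
        # Coverage: the union of the children equals the parent.
        covered = set().union(*(extent[k] for k in kids))
        if covered != extent[parent]:
            violations.append(("coverage", parent))
    return violations


children = {"a": ["b", "c"]}
ok = check_taxonomy(children, {"a": {1, 2, 3}, "b": {1}, "c": {2, 3}})
bad = check_taxonomy(children, {"a": {1, 2, 3}, "b": {1}, "c": {1, 2}})
print(ok)
print(bad)
```

In the second call the children overlap on instance 1 and leave instance 3 uncovered, so both constraints are reported.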
![Page 14: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/14.jpg)
R32 lattice of 32 (= 2^5) disjunctions over B5
Legend:
= EQ(x,y) Equals
< PP(x,y) Proper Part of
> iPP(x,y) Inverse Proper Part
o PO(x,y) Partially Overlaps
! DR(x,y) Disjoint from
[Figure: lattice of all disjunctions over B5, ordered by inclusion.]
Level 5 (tautology): = < > o ! (TRUE)
Level 4: the 5 disjunctions of four base relations
Level 3: the 10 disjunctions of three base relations
Level 2: the 10 disjunctions of two base relations
Level 1 (BASE-5 relations): =, <, >, o, !
Level 0 (contradiction): ∅ (FALSE)
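The level sizes of the R32 lattice follow directly from counting subsets of the five base relations. A minimal sketch, with `BASE5` and `r32_levels` as illustrative names:

```python
from itertools import combinations

BASE5 = ["=", "<", ">", "o", "!"]


def r32_levels():
    """Group all disjunctions over the five base relations by their size."""
    return {k: [frozenset(c) for c in combinations(BASE5, k)]
            for k in range(len(BASE5) + 1)}


levels = r32_levels()
sizes = [len(levels[k]) for k in range(6)]
total = sum(sizes)
print(sizes)
print(total)
```

Level k holds the disjunctions of exactly k base relations, so level 0 is the empty disjunction (FALSE) and level 5 is the disjunction of all five (TRUE).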
![Page 15: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/15.jpg)
Some History
• … Aristotle …
• … Euler …
• …
• … Greg Whitbread …
• [BPB93] J. H. Beach, S. Pramanik, and J. H. Beaman. Hierarchic taxonomic databases. Advances in Computer Methods for Systematic Biology: Artificial Intelligence, Databases, Computer Vision, 1993.
• [Ber95] Walter G. Berendsohn. The concept of “potential taxa” in databases. Taxon, 44:207–212, 1995.
• [Ber03] Walter G. Berendsohn. MoReTax – Handling Factual Information Linked to Taxonomic Concepts in Biology. No. 39 in Schriftenreihe für Vegetationskunde. Bundesamt für Naturschutz, 2003.
• [GG03] M. Geoffroy and A. Güntsch. Assembling and navigating the potential taxon graph. In [Ber03], pages 71–82, 2003.
• [TL07] Thau, D., & Ludäscher, B. (2007). Reasoning about taxonomies in first-order logic. Ecological Informatics, 2(3), 195-209.
• [FP09] Franz, N. M., & Peet, R. K. (2009). Perspectives: towards a language for mapping relationships among taxonomic concepts. Systematics and Biodiversity, 7(1), 5-20.
• …
![Page 16: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/16.jpg)
What’s in a name? Euler Diagrams
• Project named after Euler Diagrams:
IF A is-a B AND C and B are disjoint
------------------------------------
THEN: A and C are disjoint!
![Page 17: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/17.jpg)
Euler Diagrams as Trees (or Graphs)
A containment hierarchy (taxonomy)
An equivalent graph (w/ transitive edges)
same information
![Page 18: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/18.jpg)
Represent Phylogenies as Trees …
T1: Perelleschus sec. 2001
• Phylogenetic revision
• 8 ingroup species concepts
• 2 outgroup concepts
• 18 concepts total
[Figure: T1 (concepts 1.11–1.27 and 1.12L) and T2 (concepts 2.35–2.54 and 2.36L), each drawn as a tree.]
![Page 19: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/19.jpg)
… for all taxonomies of interest …
[Figure: the trees for T1 (concepts 1.11–1.27 and 1.12L) and T2 (concepts 2.35–2.54 and 2.36L).]
![Page 20: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/20.jpg)
… ready, rotate by 90°, set …
[Figure: the trees for T1 (concepts 1.11–1.27 and 1.12L) and T2 (concepts 2.35–2.54 and 2.36L), rotated by 90°.]
![Page 21: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/21.jpg)
Go! An expert input alignment! Just add some Euler Reasoning …
[Figure: input alignment graph. Concepts of T1 (1.11–1.27 and 1.12L) are linked to concepts of T2 (2.35–2.54 and 2.36L) by articulations labeled ==, !, "< OR ==", or "> OR !". Nodes: taxonomy 1: 18, taxonomy 2: 21. Edges: isa_1: 17, isa_2: 20, articulations: 20.]
![Page 22: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/22.jpg)
Euler/X toolkit in a single screenshot (desktop version, IX-2014)
![Page 23: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/23.jpg)
… et voilà! The merged T3 (=T1 & T2 & A)
The Euler reasoner(s) infer:
- Grey: “perfect match” (congruences)
- Green, Yellow: “keepers” from T1, T2
- Red edges: deduced subset/“sub-class” relations
- Blue edges: deduced overlaps
[Figure: merged taxonomy T3 with congruent concepts combined into single nodes (e.g., 1.17/2.41, 1.12L/2.36L).]
![Page 24: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/24.jpg)
But wait: PW1 …
[Figure: merged taxonomy for possible world PW1.]
![Page 25: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/25.jpg)
… PW2
[Figure: merged taxonomy for possible world PW2.]
![Page 26: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/26.jpg)
… PW3
[Figure: merged taxonomy for possible world PW3.]
![Page 27: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/27.jpg)
… PW4
[Figure: merged taxonomy for possible world PW4.]
![Page 28: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/28.jpg)
… PW5
[Figure: merged taxonomy for possible world PW5.]
![Page 29: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/29.jpg)
Hmmm… depending on input alignment: PW1
[Figure: merged taxonomy for possible world PW1.]
![Page 30: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/30.jpg)
… and PW2 are the only solutions!
[Figure: merged taxonomy for possible world PW2.]
What happened?
![Page 31: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/31.jpg)
TAP: Possible Outcomes
[Figure: Input Alignment between T1 (1.a with children 1.b, 1.c) and T2 (2.d with children 2.e, 2.f); articulations: one =, three <.]
![Page 32: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/32.jpg)
TAP: Possible Outcomes
[Figure: Input Alignment between T1 (1.a with children 1.b, 1.c) and T2 (2.d with children 2.e, 2.f); articulations: one =, three <.]
{A1, A2, A3, A4}
{A1, A2, A3} {A1, A2, A4} {A1, A3, A4} {A2, A3, A4}
{A1, A2} {A1, A3} {A2, A3} {A1, A4} {A2, A4} {A3, A4}
{A1} {A2} {A3} {A4}
{ }
Inconsistent! → Diagnosis (Reiter) = Black-Box Provenance
![Page 33: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/33.jpg)
TAP: Possible Outcomes
[Figure: Input Alignment between T1 (1.a with children 1.b, 1.c) and T2 (2.d with children 2.e, 2.f); articulations: one =, three <.]
{A1, A2, A3, A4}
{A1, A2, A3} {A1, A2, A4} {A1, A3, A4} {A2, A3, A4}
{A1, A2} {A1, A3} {A2, A3} {A1, A4} {A2, A4} {A3, A4}
{A1} {A2} {A3} {A4}
{ }
Inconsistent! → Diagnosis (Reiter) = Black-Box Provenance
Ambiguous! → Multiple Possible Worlds
[Figure: a possible world with merged nodes 1.a/2.d and 1.b/2.e, plus 1.c and 2.f.]
![Page 34: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/34.jpg)
TAP: Possible Outcomes
[Figure: Input Alignment between T1 (1.a with children 1.b, 1.c) and T2 (2.d with children 2.e, 2.f); articulations: one =, three <.]
{A1, A2, A3, A4}
{A1, A2, A3} {A1, A2, A4} {A1, A3, A4} {A2, A3, A4}
{A1, A2} {A1, A3} {A2, A3} {A1, A4} {A2, A4} {A3, A4}
{A1} {A2} {A3} {A4}
{ }
Inconsistent! → Diagnosis (Reiter) = Black-Box Provenance
Ambiguous! → Multiple Possible Worlds
[Figure: two possible worlds. One has merged nodes 1.a/2.d and 1.b/2.e, plus 1.c and 2.f; the other has merged nodes 1.a/2.d and 1.c/2.f, plus 1.b and 2.e.]
![Page 35: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/35.jpg)
TAP: Possible Outcomes
[Figure: Input Alignment between T1 (1.a with children 1.b, 1.c) and T2 (2.d with children 2.e, 2.f); articulations: one =, three <.]
{A1, A2, A3, A4}
{A1, A2, A3} {A1, A2, A4} {A1, A3, A4} {A2, A3, A4}
{A1, A2} {A1, A3} {A2, A3} {A1, A4} {A2, A4} {A3, A4}
{A1} {A2} {A3} {A4}
{ }
Inconsistent! → Diagnosis (Reiter) = Black-Box Provenance
Ambiguous! → Multiple Possible Worlds
[Figure: two possible worlds. One has merged nodes 1.a/2.d and 1.b/2.e, plus 1.c and 2.f; the other has merged nodes 1.a/2.d and 1.c/2.f, plus 1.b and 2.e.]
[Figure: a further merged result over 1.a, 1.b, 1.c, 2.d, 2.e, 2.f.]
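The "ambiguous" outcome can be reproduced on a toy instance by brute force. The sketch below is an illustration under stated assumptions, not the Euler/X method (which uses ASP and RCC reasoners): taxa are non-empty subsets of a three-atom universe, T1 is 1.a with children 1.b and 1.c, T2 is 2.d with children 2.e and 2.f, and the assumed input articulations are 1.a == 2.d and 1.b "< OR ==" 2.e. These articulations are chosen for illustration and are not the four from the slide. Each distinct combination of RCC-5 relations between T1 and T2 concepts counts as one possible world.

```python
from itertools import combinations, product


def rcc5(x, y):
    """RCC-5 base relation between two non-empty frozensets."""
    if x == y:
        return "=="
    if x < y:
        return "<"
    if x > y:
        return ">"
    return "><" if x & y else "!"


def possible_worlds(n_atoms=3):
    """Enumerate distinct relation patterns consistent with the toy input."""
    regions = [frozenset(s) for r in range(1, n_atoms + 1)
               for s in combinations(range(n_atoms), r)]
    worlds = set()
    for b, c, e, f in product(regions, repeat=4):
        if b & c or e & f:                  # sibling disjointness
            continue
        a, d = b | c, e | f                 # coverage: parent = union of children
        if a != d:                          # assumed articulation: 1.a == 2.d
            continue
        if rcc5(b, e) not in {"<", "=="}:   # assumed articulation: 1.b (< OR ==) 2.e
            continue
        t1 = {"1.a": a, "1.b": b, "1.c": c}
        t2 = {"2.d": d, "2.e": e, "2.f": f}
        worlds.add(tuple((x, rcc5(t1[x], t2[y]), y) for x in t1 for y in t2))
    return worlds


worlds = possible_worlds()
c_vs_f = {rel for w in worlds for (x, rel, y) in w if (x, y) == ("1.c", "2.f")}
print(len(worlds))
print(sorted(c_vs_f))
```

The disjunctive articulation leaves two possible worlds: one where 1.c and 2.f are congruent, and one where 1.c properly includes 2.f. Replacing the disjunction with a single base relation would select one of them.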
![Page 36: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/36.jpg)
Euler/X Toolkit and Workflow
• FO reasoning about taxonomies (MFOL)
• Earlier: CleanTax
  – Prover9/Mace4
• Now: Euler
  – ASP Reasoners (DLV, Clingo)
  – Specialized reasoners (PyRCC)
  – …
  – X = ASP, RCC, …
![Page 37: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/37.jpg)
Reducing Ambiguity
Possible Worlds (PWs) View
Aggregate View (AV)
Cluster View (CV)
Explore!
![Page 38: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/38.jpg)
Common Outcome: Inconsistency!
[Figure: Input Alignment between T1 (1.a with children 1.b, 1.c) and T2 (2.d with children 2.e, 2.f); articulations: one =, three <.]
{A1, A2, A3, A4}
{A1, A2, A3} {A1, A2, A4} {A1, A3, A4} {A2, A3, A4}
{A1, A2} {A1, A3} {A2, A3} {A1, A4} {A2, A4} {A3, A4}
{A1} {A2} {A3} {A4}
{ }
Inconsistent! → Diagnosis (Reiter) = Black-Box Provenance
• Need to debug the input articulations → (black-box) diagnosis!
• Focus:
  – How do we efficiently compute the diagnostic lattice?
• Also:
  – How to visualize..
![Page 39: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/39.jpg)
A Hybrid Diagnosis Approach Combining Black-Box and White-Box Reasoning
Mingmin Chen (1), Shizhuo Yu (1), Nico Franz (2), Shawn Bowers (3), Bertram Ludäscher (4)
(1) Department of Computer Science, University of California, Davis
(2) School of Life Sciences, Arizona State University
(3) Department of Computer Science, Gonzaga University
(4) GSLIS & NCSA, University of Illinois at Urbana-Champaign
![Page 40: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/40.jpg)
Example Instance (from synthetic benchmark suite)
• Here: N = 10 taxa in T1, T2
• Euler/X finds: inconsistent!
• → diagnostic lattice of 2^10 = 1024 nodes
→ Find minimal inconsistent subset (MIS)
→ maximal consistent subset (MCS) ..
→ show to user!
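For small inputs, the MIS and MCS can be read off the full diagnostic lattice by brute force. The sketch below assumes a black-box consistency oracle that is monotone (every subset of a consistent set is consistent). The `diagnose` helper and the toy oracle, in which A1, A2, and A3 are jointly inconsistent, are illustrative assumptions; the talk's hitting-set and hybrid algorithms avoid enumerating all 2^N subsets.

```python
from itertools import combinations


def diagnose(articulations, is_consistent):
    """Return (MIS, MCS) by testing every subset with a black-box oracle."""
    arts = list(articulations)
    subsets = [frozenset(c) for r in range(len(arts) + 1)
               for c in combinations(arts, r)]
    status = {s: is_consistent(s) for s in subsets}
    # Minimal inconsistent: inconsistent, but dropping any one element fixes it.
    mis = [s for s in subsets
           if not status[s] and all(status[s - {x}] for x in s)]
    # Maximal consistent: consistent, but adding any other element breaks it.
    mcs = [s for s in subsets
           if status[s] and all(not status[s | {x}] for x in arts if x not in s)]
    return mis, mcs


def toy_oracle(subset):
    """Hypothetical oracle: A1, A2 and A3 cannot all hold together."""
    return not {"A1", "A2", "A3"} <= subset


mis, mcs = diagnose(["A1", "A2", "A3", "A4"], toy_oracle)
print(mis)
print(mcs)
```

With four articulations the lattice has 16 nodes, yet one MIS and three MCS summarize it completely for this oracle.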
![Page 41: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/41.jpg)
Visualizing Diagnoses
N = 10 articulations → 2^10 = 1024 node diagnostic lattice
![Page 42: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/42.jpg)
Better Idea: Just show MIS, MCS
N = 4 articulations → 2^4 = 16 node diagnostic lattice, but 3 MCS and 2 MIS are enough!
![Page 43: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/43.jpg)
Visualizing Diagnoses
1024 node lattice
.. but 4 MCS and 1 MIC tell it all!
![Page 44: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/44.jpg)
Visualizing Diagnoses
Example from RuleML’14 paper: N=12 → 4096 nodes
.. but 7 MCS and 5 MIC tell it all!
![Page 45: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/45.jpg)
Black-Box Inconsistency Analysis (Diagnostic Lattice)
What happens if you can’t have all (here: 4) articulations together?
• Then:
  – Repair: find & revise minimal inconsistent subsets (Min-Incons)
  – Expand: find maximal consistent subsets (Max-Cons) & revise outs
![Page 46: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/46.jpg)
Inconsistency Analysis (Diagnostic Lattice)
• Black-box Analysis (Hitting Set algo.) yields a Diagnosis (lattice)
  – for n=4 articulations, there are 168 possible diagnoses
  – depending on expected “red/green areas” → explore space differently
• |articulations| = n → |possible diagnoses| = |monotonic Boolean functions| = Dedekind Number (n): 2, 3, 6, 20, 168, 7581, 7828354, ...
• The Min-Incons (MIS) and Max-Cons (MCS) sets determine all others
→ Repair MIS and/or Expand MCS
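The first few Dedekind numbers quoted on this slide can be reproduced by counting monotone Boolean functions directly. This is a brute-force sketch (the function name `dedekind` is an illustrative choice) and is only practical up to n = 4, since it tests all 2^(2^n) candidate functions.

```python
from itertools import product


def dedekind(n):
    """Count monotone Boolean functions on n variables by brute force."""
    points = list(product([0, 1], repeat=n))
    # Index pairs (i, j) with points[i] <= points[j] componentwise.
    leq = [(i, j)
           for i, p in enumerate(points)
           for j, q in enumerate(points)
           if all(a <= b for a, b in zip(p, q))]
    count = 0
    for values in product([0, 1], repeat=len(points)):
        # Monotone: a larger input never yields a smaller output.
        if all(values[i] <= values[j] for i, j in leq):
            count += 1
    return count


dedekind_numbers = [dedekind(n) for n in range(5)]
print(dedekind_numbers)
```

Each monotone function corresponds to one way of coloring the diagnostic lattice so that every subset of a consistent set is consistent, which is why the count matches the number of possible diagnoses.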
![Page 47: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/47.jpg)
Improving Diagnosis
• Reiter’s “black-box” (model-based) diagnosis helps debug the articulations
• Limited scalability (inherent complexity)
• But every bit helps:
  – Hitting Set Algorithm (“logarithmic extraction”)
• Our idea:
  – Exploit “white-box” reasoning information → RULES to the rescue
![Page 48: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/48.jpg)
Key Idea: exploit white-box info
• We use Answer Set Programming (ASP) to solve Taxonomy Alignment Problem (TAP)
• Inconsistency = “False” is derived in the head: False :- <denial of integrity constraint>
• Apply provenance trick from databases :-)
  – What articulations contribute to a derivation of “False”?
  – Eliminate those that don’t! → an example of reusing inferences across separate black-box tests!
![Page 49: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/49.jpg)
The Provenance “Trick”
![Page 50: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/50.jpg)
Hybrid Provenance
Black-box Provenance
A3: c < f
[Figure: Input Alignment between T1 (1.a with children 1.b, 1.c) and T2 (2.d with children 2.e, 2.f); articulations: one =, three <.]
![Page 51: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/51.jpg)
Hybrid Provenance
Black-box Provenance
A3: c < f
White-box Provenance
A1: a = d
A2: b < e
r3: a = b ∪ c
r4: b ∩ c = ∅
r7: d = e ∪ f
r8: e ∩ f = ∅
Derived: a = e ∪ f
A1 + A2 + … ⇒ f < c
[Figure: Input Alignment between T1 (1.a with children 1.b, 1.c) and T2 (2.d with children 2.e, 2.f); articulations: one =, three <.]
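The derivation on this slide can be sanity-checked with a bounded model search. The sketch below is an assumption-laden illustration rather than the ASP encoding used in Euler/X: taxa are non-empty subsets of a three-atom universe, and the helper `models` enumerates all assignments that satisfy r3, r4, r7, r8, A1, and A2. It then checks that f < c holds in every such model and that adding A3 (c < f) leaves no model at all.

```python
from itertools import combinations, product


def models(extra=lambda b, c, e, f: True, n_atoms=3):
    """Enumerate set assignments satisfying the constraints and A1, A2."""
    regions = [frozenset(s) for r in range(1, n_atoms + 1)
               for s in combinations(range(n_atoms), r)]
    found = []
    for b, c, e, f in product(regions, repeat=4):
        a, d = b | c, e | f      # r3, r7: parents are covered by children
        if b & c or e & f:       # r4, r8: siblings are disjoint
            continue
        if a != d:               # A1: a = d
            continue
        if not b < e:            # A2: b < e (proper subset)
            continue
        if extra(b, c, e, f):
            found.append((a, b, c, d, e, f))
    return found


base = models()
entailed = all(f < c for (_, _, c, _, _, f) in base)
with_a3 = models(extra=lambda b, c, e, f: c < f)   # add A3: c < f
print(len(base), entailed, len(with_a3))
```

Within this bound, A1 and A2 together with the taxonomy constraints force f < c, so A3 cannot also hold. A search over three atoms is evidence, not a proof; the white-box approach in the talk obtains the same conclusion from the rule derivation itself.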
![Page 52: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/52.jpg)
The Hybrid Approach
![Page 53: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/53.jpg)
Hybrid Approach
What articulations contribute to some inconsistency?
Good old black-box (HST)
![Page 54: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/54.jpg)
Benchmark Results
• White-box < Hybrid < Black-box (runtimes)
• Note: white-box does not give you a diagnosis
• Potassco < DLV
![Page 55: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/55.jpg)
Benchmark DLV
• White-box < Hybrid < Black-box (runtimes)
• Potassco < DLV
![Page 56: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/56.jpg)
Benchmark Clingo
• White-box < Hybrid < Black-box (runtimes)
• Potassco < DLV
![Page 57: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/57.jpg)
Summary: Hybrid Diagnosis
• ASP rules can be used to efficiently solve real-world taxonomy reasoning problems
• Reiter’s diagnosis useful to debug inconsistent alignments
• Adding a “white-box” provenance approach speeds up state-of-the-art HST algorithm by eliminating independent articulations
• Future work:
  – Further improvements, including parallelism:
    • Trade-off with sharing inferences across parallel instances
![Page 58: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/58.jpg)
Related Projects
![Page 59: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/59.jpg)
The Data Life Cycle
![Page 60: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/60.jpg)
Data Quality & Curation Workflows
• Collections & occurrence data is all over the map
  – … literally (off the map!)
• Issues:
  – Lat/Long transposition, coordinate & projection issues
  – Data entry/creation, “fuzzy” data, naming issues, bit rot, data conversions and transformations, schema mappings, … (you name it)
• Filtered-Push Collaboration
![Page 61: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/61.jpg)
![Page 62: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/62.jpg)
Filtered-Push: Kurator (Data Curation Workflows)
Tianhong Song
Lei Dou (former member)
Sven Köhler
![Page 63: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/63.jpg)
From Tool Users to Tool Makers
Screen capture… back to the original definition
![Page 64: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/64.jpg)
Theory meets Practice
![Page 65: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/65.jpg)
Under the hood: Logic (ASP)
![Page 66: Euler: A Logic‐Based Toolkit for Aligning & Reconciling Multiple Taxonomic Perspectives](https://reader033.vdocuments.site/reader033/viewer/2022042614/556da1bad8b42a875d8b4641/html5/thumbnails/66.jpg)
Summary & Invitation
• Building open source tools for
  – Euler: Reasoning about taxonomies (& data integration)
  – Kurator: Data Curation workflows
    • … and other scientific workflows
• Topic not covered:
  – (Game) Theory of Provenance (DAIS talk @CS, 10/7/2014)
• Looking for:
  – new collaborators, students, ..
• Let’s meet!
  – [email protected]