A Semantic Search Component for BExIS 2

Friederike Klan, Alsayed Algergawy, Erik Fäßler, Udo Hahn, Birgitta König-Ries

BExIS DevConf 2017


Keyword-Based Search


supports search queries of the type

“Find me all datasets that mention keyword1, keyword2 and … .”


does not work if the search terms are not explicitly mentioned in the (meta-)data


Example

Search: data referring to alkaline milieu

samp_ID  loc  date        time   pH   Fe2+  NO2-   temp_1  temp_2  r_mm
23       S1   10.12.2011  15:00  7,1  0,01  0,006  9,2     8,9     2,4
18       S2   14/10/2011  17:30  7,2  0,03  0,004  9,4     9,1     10,3
4        S3   16/10/2011  11:55  7,3  0,02  0,003  10,1    8,9     1,8
7        S1   19/10/2011  17:25  7,3  0,04  0,004  9,6     9,2     7,1
12       S2   21/10/2011  14:50  7,1  0,08  0,005  9,4     9,4     2,3
35       S3   23/10/2011  11:40  7,1  0,15  0,003  9,6     9,3     8,1
5        S4   25/10/2011  15:20  7,1  0,04  0,005  10,3    9,9     21,2
8        S2   28/10/2011  16:45  7,1  0,02  0,006  10,1    9,8     10,3
9        S3   30/10/2011  15:30  7,1  0     0,007  9,3     8,7     23,5
17       S4   11.01.2011  13:30  7,2  0,05  0,03   9,3     8,8     1,8


alkaline is a Thing which has a pH value > 7
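The gap between the two kinds of search can be sketched in a few lines of Python. This is only an illustration, not the component's implementation: the helper names (`parse_decimal`, `is_alkaline`, `keyword_hits`, `semantic_hits`) are hypothetical, and the rows are the first three rows of the example table, reduced to three columns.

```python
from typing import Dict, List


def parse_decimal(text: str) -> float:
    """Parse a number written with a decimal comma, as in the example table."""
    return float(text.replace(",", "."))


def is_alkaline(ph: float) -> bool:
    """Rule from the slide: alkaline is a Thing which has a pH value > 7."""
    return ph > 7.0


# First three rows of the example table (samp_ID, loc and pH only).
rows: List[Dict[str, str]] = [
    {"samp_ID": "23", "loc": "S1", "pH": "7,1"},
    {"samp_ID": "18", "loc": "S2", "pH": "7,2"},
    {"samp_ID": "4", "loc": "S3", "pH": "7,3"},
]


def keyword_hits(data: List[Dict[str, str]], keyword: str) -> List[Dict[str, str]]:
    """Plain keyword search: match the term against column names and cell values."""
    needle = keyword.lower()
    return [
        row for row in data
        if any(needle in key.lower() or needle in value.lower()
               for key, value in row.items())
    ]


def semantic_hits(data: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Search that applies the 'alkaline' rule to the pH column."""
    return [row for row in data if is_alkaline(parse_decimal(row["pH"]))]


print(len(keyword_hits(rows, "alkaline")))   # the term never occurs literally
print(len(semantic_hits(rows)))              # rows whose pH satisfies the rule
```

The keyword search finds nothing because "alkaline" is never written in the table, while the rule-based search matches every row with pH above 7.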


Keyword-Based Search


works if the search terms are explicitly mentioned in the (meta-)data

is fast, even if primary data are indexed


Challenge

provide a mechanism for knowledgeable data search, which

supports other kinds of relevant queries (Has species XY been observed in soil?)

is also applicable to unstructured data


Semantic Search

Search: keywords

[Diagram labels: Translate; Data; Knowledge Base; reasoner; alkaline is ...; S1, S2; located in; Bechstedter Grund]

samp_ID  loc  date        time   pH   Fe2+  NO2-   temp_1  temp_2  r_mm
23       H43  12/10/2011  15:00  7,1  0,01  0,006  9,2     8,9     2,4
18       H41  14/10/2011  17:30  7,2  0,03  0,004  9,4     9,1     10,3
4        H51  16/10/2011  11:55  7,3  0,02  0,003  10,1    8,9     1,8
7        H43  19/10/2011  17:25  7,3  0,04  0,004  9,6     9,2     7,1
12       H41  21/10/2011  14:50  7,1  0,08  0,005  9,4     9,4     2,3
35       H51  23/10/2011  11:40  7,1  0,15  0,003  9,6     9,3     8,1
5        H51  25/10/2011  15:20  7,1  0,04  0,005  10,3    9,9     21,2
8        H51  28/10/2011  16:45  7,1  0,02  0,006  10,1    9,8     10,3
9        H43  30/10/2011  15:30  7,1  0     0,007  9,3     8,7     23,5
17       H41  01/11/2011  13:30  7,2  0,05  0,03   9,3     8,8     1,8


Knowledge Base & Semantic Annotation

oboe-temporal

ChEBI-light-module

oboe-chemistry

ENVO

OBI

NCIT-module


Knowledge Base & Semantic Annotation

SF_ml  Tree-No  Species  TreeCircum_m  BHD_m  TreeHeight_m
212    56       Birke    1,56          0,48   26,34
35     28       Eiche    1,19          0,35   24,86
62     34       Eiche    1,55          0,52   28,46
43     96       Buche    1,43          0,43   31,72
334    12       Buche    1,57          0,49   31,87

[Highlighted cells in the last row: 334, 1,57 and 31,87. Species names are German: Birke = birch, Eiche = oak, Buche = beech.]


Virtual ABox

Mappings

CHARACTERISTIC-TYPE
    :crct_{crct_id} a <{crct}> .
    SELECT DISTINCT crct, crct_id FROM annotation

MEASUREMENT_TYPE-and-VALUE
    :msmt_{value} a oboe-core:Measurement ; oboe-core:hasValue {value}^^xsd:string .
    SELECT DISTINCT value FROM measurement_values

[Diagram labels: Annotation; Mappings; Quest Reasoner; SPARQL Endpoint. Highlighted example values: 1,57, 31,87 and 334.]
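What a mapping of this kind denotes can be sketched as follows: each row returned by the SQL query fills the placeholders of the triple template. This is a minimal illustration, not how Ontop works internally; a virtual ABox is not materialized, and the sketch builds the triples explicitly only to show what they look like. The table contents and the `http://example.org/onto#` IRIs are invented, the `instantiate` helper is hypothetical, the column is assumed to be named `crct_id` so that it matches the placeholder, and the `ORDER BY` clause is added only to make the output order predictable.

```python
import sqlite3
from typing import List


def instantiate(template: str, row: sqlite3.Row) -> str:
    """Replace each {column} placeholder in the template with the row's value."""
    triple = template
    for column in row.keys():
        triple = triple.replace("{" + column + "}", str(row[column]))
    return triple


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us access columns by name

# Hypothetical table contents; only the column names come from the mapping.
conn.execute("CREATE TABLE annotation (crct TEXT, crct_id INTEGER)")
conn.executemany(
    "INSERT INTO annotation VALUES (?, ?)",
    [
        ("http://example.org/onto#Height", 1),
        ("http://example.org/onto#Height", 1),  # duplicate, dropped by DISTINCT
        ("http://example.org/onto#Circumference", 2),
    ],
)

TEMPLATE = ":crct_{crct_id} a <{crct}> ."
SOURCE = "SELECT DISTINCT crct, crct_id FROM annotation ORDER BY crct_id"

triples: List[str] = [instantiate(TEMPLATE, row) for row in conn.execute(SOURCE)]
for triple in triples:
    print(triple)
```

Because the query uses `DISTINCT`, the two identical annotation rows yield a single class assertion.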


Supported Queries

Show me all datasets that refer to measurements of the nitrate concentration in groundwater.

At which locations within the Hainich transect has soil moisture been measured?

On which dates has the concentration of nitrate been measured at well H31?

Has species XY been observed in aquifer moTK?

Which characteristics have been measured at H31?


User Interface


From Keywords to SPARQL

Search: groundwater concentration_of nitrate

Mappings

groundwater       EntityTypeClass
concentration_of  CharacteristicClass
nitrate           EntityTypeClass

SELECT DISTINCT ?dset
WHERE {
  ?dset ad:refersToObservationType ?obstype .
  ?obstype ad:refersToEntityType ?enttype .
  ?enttype a <http://../someontology.owl#Nitrate> .
  ?obstype ad:refersToMeasurementType ?meastype .
  ?meastype ad:ofCharacteristic ?crct .
  ?crct a <http://../someontology.owl#concentration_of>
}

SELECT DISTINCT ?dset
WHERE {
  ?dset ad:refersToObservationType ?obstype .
  ?obstype ad:refersToEntityType ?enttype .
  ?enttype a <http://../someontology.owl#Groundwater>
}
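The translation step can be sketched as a small query builder. This is a guess at the general idea, not the component's actual algorithm. The `LEXICON` table, the `http://example.org/someontology.owl#` base IRI and both function names are hypothetical, and the grouping rule (a characteristic keyword attaches to the entity keyword that follows it) is an assumption made for illustration. The `ad:` prefix is used as on the slide; its namespace is not given there, so a matching PREFIX declaration would have to be added before the queries could run.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical lookup: keyword -> (kind, class IRI). The IRIs are placeholders.
ONTOLOGY = "http://example.org/someontology.owl#"
LEXICON: Dict[str, Tuple[str, str]] = {
    "groundwater": ("entity", ONTOLOGY + "Groundwater"),
    "nitrate": ("entity", ONTOLOGY + "Nitrate"),
    "concentration_of": ("characteristic", ONTOLOGY + "concentration_of"),
}


def build_query(entity_iri: str, characteristic_iri: Optional[str] = None) -> str:
    """Build one SELECT query that follows the graph pattern shown on the slide."""
    patterns = [
        "?dset ad:refersToObservationType ?obstype .",
        "?obstype ad:refersToEntityType ?enttype .",
        f"?enttype a <{entity_iri}> .",
    ]
    if characteristic_iri is not None:
        patterns += [
            "?obstype ad:refersToMeasurementType ?meastype .",
            "?meastype ad:ofCharacteristic ?crct .",
            f"?crct a <{characteristic_iri}> .",
        ]
    body = "\n  ".join(patterns)
    return f"SELECT DISTINCT ?dset\nWHERE {{\n  {body}\n}}"


def keywords_to_queries(keywords: List[str]) -> List[str]:
    """Assumed rule: a characteristic attaches to the next entity keyword."""
    queries: List[str] = []
    pending: Optional[str] = None
    for word in keywords:
        kind, iri = LEXICON[word.lower()]
        if kind == "characteristic":
            pending = iri
        else:
            queries.append(build_query(iri, pending))
            pending = None
    return queries


queries = keywords_to_queries(["groundwater", "concentration_of", "nitrate"])
for query in queries:
    print(query, end="\n\n")
```

For the example input this yields two queries, one per entity keyword, and only the nitrate query carries the characteristic pattern.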


Architecture

[Architecture diagram; component labels:]

BExIS 2 Search Interface (including autocompletion)

Query Translation

SPARQL Endpoint

Quest Reasoner (Ontop)

TBox (domain knowledge)

Virtual ABox (Ontop)

Mappings

SA

MD, DS, PD

Publication Search

Publication Annotator (extraction of class mentions, named entities and relations)

JCoRe, GeNO, LINNAEUS, BioSem

PubMed/Medline, PubMed Central

BExIS 2 Module ADOnIS


Questions?

Remarks?

Suggestions?

Ideas?


Knowledge Base & Semantic Annotation


measurement_values (materialized view)

msmt instance id  value
id                literal

annotations

dataset column id  entity      characteristic      standard
id                 IRI entity  IRI characteristic  IRI measurement standard