A Semantic Search Component for BExIS 2

Friederike Klan, Alsayed Algergawy, Erik Fäßler, Udo Hahn, Birgitta König-Ries

BExIS DevConf 2017

Keyword-Based Search

supports search queries of the type

“Find me all datasets that mention keyword1, keyword2 and … .”

does not work if the search terms are not explicitly mentioned in the (meta-)data

Example

Search: data referring to alkaline milieu

samp_ID  loc  date        time   pH   Fe2+  NO2-   temp_1  temp_2  r_mm
23       S1   10.12.2011  15:00  7,1  0,01  0,006  9,2     8,9     2,4
18       S2   14/10/2011  17:30  7,2  0,03  0,004  9,4     9,1     10,3
4        S3   16/10/2011  11:55  7,3  0,02  0,003  10,1    8,9     1,8
7        S1   19/10/2011  17:25  7,3  0,04  0,004  9,6     9,2     7,1
12       S2   21/10/2011  14:50  7,1  0,08  0,005  9,4     9,4     2,3
35       S3   23/10/2011  11:40  7,1  0,15  0,003  9,6     9,3     8,1
5        S4   25/10/2011  15:20  7,1  0,04  0,005  10,3    9,9     21,2
8        S2   28/10/2011  16:45  7,1  0,02  0,006  10,1    9,8     10,3
9        S3   30/10/2011  15:30  7,1  0     0,007  9,3     8,7     23,5
17       S4   11.01.2011  13:30  7,2  0,05  0,03   9,3     8,8     1,8

"alkaline" is a Thing which has a pH value > 7
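The gap between the two kinds of search can be illustrated with a minimal Python sketch. The three rows reuse the loc and pH values from the example table; the helpers `keyword_search`, `parse_decimal` and `is_alkaline` are hypothetical names introduced for illustration only and are not part of BExIS 2.

```python
# Three rows from the example table (decimal commas kept as in the data).
rows = [
    {"loc": "S1", "pH": "7,1"},
    {"loc": "S2", "pH": "7,2"},
    {"loc": "S3", "pH": "7,3"},
]


def keyword_search(rows, keyword):
    """Return rows in which the keyword occurs literally in a column name or cell."""
    kw = keyword.lower()
    return [
        row for row in rows
        if any(kw in str(k).lower() or kw in str(v).lower() for k, v in row.items())
    ]


def parse_decimal(text):
    """Convert a number written with a decimal comma into a float."""
    return float(text.replace(",", "."))


def is_alkaline(row):
    """Assumed background knowledge: alkaline means a pH value above 7."""
    return parse_decimal(row["pH"]) > 7


keyword_hits = keyword_search(rows, "alkaline")       # literal match only
semantic_hits = [r for r in rows if is_alkaline(r)]   # uses the definition

print(len(keyword_hits), len(semantic_hits))
```

The literal search finds nothing because the word "alkaline" never appears in the data, while the rule derived from the definition matches every row.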

Keyword-Based Search

supports search queries of the type

“Find me all datasets that mention keyword1, keyword2 and … .”

works if the search terms are explicitly mentioned in the (meta-)data

is fast, even if primary data are indexed

Challenge

provide a mechanism for knowledgeable data search, which

supports other kinds of relevant queries (Has species XY been observed in soil?)

is also applicable to unstructured data

Semantic Search

Search: keywords

[Figure: the search keywords are translated with the help of a Knowledge Base and a reasoner and then matched against the data. Knowledge base entries shown: "alkaline is ...", "S1, S2 located in Bechstedter Grund".]

samp_ID  loc  date        time   pH   Fe2+  NO2-   temp_1  temp_2  r_mm
23       H43  12/10/2011  15:00  7,1  0,01  0,006  9,2     8,9     2,4
18       H41  14/10/2011  17:30  7,2  0,03  0,004  9,4     9,1     10,3
4        H51  16/10/2011  11:55  7,3  0,02  0,003  10,1    8,9     1,8
7        H43  19/10/2011  17:25  7,3  0,04  0,004  9,6     9,2     7,1
12       H41  21/10/2011  14:50  7,1  0,08  0,005  9,4     9,4     2,3
35       H51  23/10/2011  11:40  7,1  0,15  0,003  9,6     9,3     8,1
5        H51  25/10/2011  15:20  7,1  0,04  0,005  10,3    9,9     21,2
8        H51  28/10/2011  16:45  7,1  0,02  0,006  10,1    9,8     10,3
9        H43  30/10/2011  15:30  7,1  0     0,007  9,3     8,7     23,5
17       H41  01/11/2011  13:30  7,2  0,05  0,03   9,3     8,8     1,8

Knowledge Base & Semantic Annotation

oboe-temporal

ChEBI-light-module

oboe-chemistry

ENVO

OBI

NCIT-module

Knowledge Base & Semantic Annotation

SF_ml  Tree-No  Species  TreeCircum_m  BHD_m  TreeHeight_m
212    56       Birke    1,56          0,48   26,34
35     28       Eiche    1,19          0,35   24,86
62     34       Eiche    1,55          0,52   28,46
43     96       Buche    1,43          0,43   31,72
334    12       Buche    1,57          0,49   31,87

Highlighted values: 1,57; 31,87; 334

Virtual ABox

Mappings

mappingId  CHARACTERISTIC-TYPE
target     :crct_{crct_id} a <{crct}> .
source     SELECT DISTINCT crct, crct_id FROM annotation

mappingId  MEASUREMENT_TYPE-and-VALUE
target     :msmt_{value} a oboe-core:Measurement ; oboe-core:hasValue {value}^^xsd:string .
source     SELECT DISTINCT value FROM measurement_values

Highlighted values: 1,57; 31,87; 334

SPARQL Endpoint

Quest Reasoner

Annotation+
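What such a mapping denotes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `expand` is a hypothetical helper, the rows stand in for the result of the SQL source query, and the values are the ones highlighted on the slide. Ontop itself keeps the ABox virtual and answers SPARQL queries by rewriting them into SQL rather than generating triples like this.

```python
import re


def expand(template, row):
    """Fill every {column} placeholder of a mapping target with the row's value."""
    return re.sub(r"\{(\w+)\}", lambda m: str(row[m.group(1)]), template)


# Target template of the MEASUREMENT_TYPE-and-VALUE mapping.
target = (
    ":msmt_{value} a oboe-core:Measurement ; "
    "oboe-core:hasValue {value}^^xsd:string ."
)

# Stand-in for: SELECT DISTINCT value FROM measurement_values
rows = [{"value": "1,57"}, {"value": "31,87"}, {"value": "334"}]

triples = [expand(target, row) for row in rows]
for triple in triples:
    print(triple)
```

Each distinct value in the source table yields one measurement instance, which is why the mapping selects with DISTINCT.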

Supported Queries

Show me all datasets that refer to measurements of the nitrate concentration in groundwater.

At which locations within the Hainich transect has soil moisture been measured?

On which dates has the concentration of nitrate been measured at well H31?

Has species XY been observed in aquifer moTK?

Which characteristics have been measured at H31?

User Interface

From Keywords to SPARQL

Search: groundwater concentration_of nitrate

Mappings

groundwater -> EntityTypeClass
concentration_of -> CharacteristicClass
nitrate -> EntityTypeClass

SELECT DISTINCT ?dset
WHERE {
  ?dset ad:refersToObservationType ?obstype .
  ?obstype ad:refersToEntityType ?enttype .
  ?enttype a <http://../someontology.owl#Nitrate> .
  ?obstype ad:refersToMeasurementType ?meastype .
  ?meastype ad:ofCharacteristic ?crct .
  ?crct a <http://../someontology.owl#concentration_of>
}

SELECT DISTINCT ?dset
WHERE {
  ?dset ad:refersToObservationType ?obstype .
  ?obstype ad:refersToEntityType ?enttype .
  ?enttype a <http://../someontology.owl#Groundwater>
}
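The step from mapped keywords to queries of this shape can be sketched in Python. This is a rough sketch, not the component's actual translation logic: `build_query` is a hypothetical function, `ONTO` repeats the elided ontology IRI exactly as it appears on the slide, and the `ad:` prefix would still need a PREFIX declaration (its namespace is not given here) before the query could be sent to an endpoint.

```python
ONTO = "http://../someontology.owl#"  # elided base IRI, kept as on the slide


def build_query(entity_class, characteristic_class=None):
    """Build a dataset query for an entity class, optionally restricted to a
    measured characteristic of that entity."""
    patterns = [
        "?dset ad:refersToObservationType ?obstype .",
        "?obstype ad:refersToEntityType ?enttype .",
        f"?enttype a <{ONTO}{entity_class}> .",
    ]
    if characteristic_class is not None:
        patterns += [
            "?obstype ad:refersToMeasurementType ?meastype .",
            "?meastype ad:ofCharacteristic ?crct .",
            f"?crct a <{ONTO}{characteristic_class}> .",
        ]
    body = "\n".join("  " + p for p in patterns)
    return f"SELECT DISTINCT ?dset\nWHERE {{\n{body}\n}}"


# "groundwater concentration_of nitrate" mapped to classes as on the slide.
nitrate_query = build_query("Nitrate", "concentration_of")
groundwater_query = build_query("Groundwater")

print(nitrate_query)
print(groundwater_query)
```

Keeping one query per entity class mirrors the two queries shown for this search; how their results are combined is not specified in the slides.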

Architecture

MD DS PDMD PD

TBox (domain knowledge)

Virtual ABox (Ontop)

SA

Mappings

PubMed/Medline PubMed Central

Publication Search

Publication Annotator (extraction of class mentions, named entities and relations)

Query Translation

SPARQL Endpoint

Quest Reasoner (Ontop)

JCoRe, GeNO, LINNAEUS, BioSem

BExIS 2 Search Interface (including autocompletion)

BExIS 2 Module ADOnIS

Questions?

Remarks?

Suggestions?

Ideas?

Knowledge Base & Semantic Annotation

measurement_values (materialized view)

  msmt instance id: id
  value: literal

annotations

  dataset column id: id
  entity: IRI entity
  characteristic: IRI characteristic
  standard: IRI measurement standard
