A Semantic Search Component for BExIS 2
Friederike Klan, Alsayed Algergawy, Erik Fäßler, Udo Hahn, Birgitta König-Ries
BExIS DevConf 2017
Keyword-Based Search
supports search queries of the type
“Find me all datasets that mention keyword1, keyword2 and … .”
does not work if the search terms are not explicitly mentioned in the (meta-)data
Example
Search: data referring to alkaline milieu

samp_ID  loc  date        time   pH   Fe2+  NO2-   temp_1  temp_2  r_mm
23       S1   10.12.2011  15:00  7,1  0,01  0,006  9,2     8,9     2,4
18       S2   14/10/2011  17:30  7,2  0,03  0,004  9,4     9,1     10,3
4        S3   16/10/2011  11:55  7,3  0,02  0,003  10,1    8,9     1,8
7        S1   19/10/2011  17:25  7,3  0,04  0,004  9,6     9,2     7,1
12       S2   21/10/2011  14:50  7,1  0,08  0,005  9,4     9,4     2,3
35       S3   23/10/2011  11:40  7,1  0,15  0,003  9,6     9,3     8,1
5        S4   25/10/2011  15:20  7,1  0,04  0,005  10,3    9,9     21,2
8        S2   28/10/2011  16:45  7,1  0,02  0,006  10,1    9,8     10,3
9        S3   30/10/2011  15:30  7,1  0     0,007  9,3     8,7     23,5
17       S4   11.01.2011  13:30  7,2  0,05  0,03   9,3     8,8     1,8
alkaline is a Thing which has a pH value > 7
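The gap between the two kinds of search can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows are a three-row excerpt as read from the example table (decimal commas converted to points), and the function names `keyword_search` and `is_alkaline` are hypothetical. Only the rule "pH > 7" comes from the slide.

```python
# Three-row excerpt of the example table (decimal commas converted to points).
columns = ["samp_ID", "loc", "pH"]
rows = [
    {"samp_ID": 23, "loc": "S1", "pH": 7.1},
    {"samp_ID": 18, "loc": "S2", "pH": 7.2},
    {"samp_ID": 4, "loc": "S3", "pH": 7.3},
]


def keyword_search(keyword, columns, rows):
    """Return True if the keyword literally occurs in a column name or a cell."""
    needle = keyword.lower()
    cells = [str(value) for row in rows for value in row.values()]
    return any(needle in text.lower() for text in columns + cells)


def is_alkaline(row):
    """Background knowledge from the slide: alkaline means a pH value > 7."""
    return row["pH"] > 7


# The literal keyword never appears in the data, so keyword search misses it.
keyword_hit = keyword_search("alkaline", columns, rows)

# Applying the background rule to the pH column finds the matching samples.
semantic_hits = [row["samp_ID"] for row in rows if is_alkaline(row)]

print(keyword_hit)    # False
print(semantic_hits)  # [23, 18, 4]
```

The point of the sketch is that the word "alkaline" occurs nowhere in the table, while the knowledge that alkaline means pH > 7 makes every excerpted row a match.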
Keyword-Based Search
works if the search terms are explicitly mentioned in the (meta-)data
is fast, even if primary data are indexed
Challenge
provide a mechanism for knowledgeable data search which
supports other kinds of relevant queries (e.g. "Has species XY been observed in soil?")
is also applicable to unstructured data
Semantic Search
Search: keywords
[Diagram labels: Translate, Knowledge Base, reasoner, Data. Knowledge base examples: "S1, S2 located in Bechstedter Grund"; "alkaline is ..."]

Data:
samp_ID  loc  date        time   pH   Fe2+  NO2-   temp_1  temp_2  r_mm
23       H43  12/10/2011  15:00  7,1  0,01  0,006  9,2     8,9     2,4
18       H41  14/10/2011  17:30  7,2  0,03  0,004  9,4     9,1     10,3
4        H51  16/10/2011  11:55  7,3  0,02  0,003  10,1    8,9     1,8
7        H43  19/10/2011  17:25  7,3  0,04  0,004  9,6     9,2     7,1
12       H41  21/10/2011  14:50  7,1  0,08  0,005  9,4     9,4     2,3
35       H51  23/10/2011  11:40  7,1  0,15  0,003  9,6     9,3     8,1
5        H51  25/10/2011  15:20  7,1  0,04  0,005  10,3    9,9     21,2
8        H51  28/10/2011  16:45  7,1  0,02  0,006  10,1    9,8     10,3
9        H43  30/10/2011  15:30  7,1  0     0,007  9,3     8,7     23,5
17       H41  01/11/2011  13:30  7,2  0,05  0,03   9,3     8,8     1,8
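One way to read the "Translate" step is as query expansion against the knowledge base. The following Python sketch assumes that reading. The fact list `LOCATED_IN` encodes the diagram label "S1, S2 located in Bechstedter Grund"; the rows use the S-style site IDs from the first example, and the function names are hypothetical.

```python
# Hypothetical knowledge base: "located in" facts as (site, region) pairs.
LOCATED_IN = [("S1", "Bechstedter Grund"), ("S2", "Bechstedter Grund")]

# Small excerpt of sample rows; only the location column matters here.
rows = [
    {"samp_ID": 23, "loc": "S1"},
    {"samp_ID": 18, "loc": "S2"},
    {"samp_ID": 4, "loc": "S3"},
]


def expand_location(term):
    """Return the term itself plus every site known to be located in it."""
    return {term} | {site for site, region in LOCATED_IN if region == term}


def search_by_location(term, rows):
    """Match rows whose location is the term or a site located in it."""
    sites = expand_location(term)
    return [row["samp_ID"] for row in rows if row["loc"] in sites]


result = search_by_location("Bechstedter Grund", rows)
print(result)  # [23, 18]
```

A plain keyword match on "Bechstedter Grund" would return nothing for these rows, because the data only mention the site IDs.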
Knowledge Base & Semantic Annotation
oboe-temporal
ChEBI-light-module
oboe-chemistry
ENVO
OBI
NCIT-module
SF_ml  Tree-No  Species  TreeCircum_m  BHD_m  TreeHeight_m
212    56       Birke    1,56          0,48   26,34
35     28       Eiche    1,19          0,35   24,86
62     34       Eiche    1,55          0,52   28,46
43     96       Buche    1,43          0,43   31,72
334    12       Buche    1,57          0,49   31,87

Highlighted values: 1,57; 31,87; 334
Virtual ABox
Mappings

CHARACTERISTIC-TYPE
    target: :crct_{crct_id} a <{crct}> .
    source: SELECT DISTINCT crct, crct_id FROM annotation

MEASUREMENT_TYPE-and-VALUE
    target: :msmt_{value} a oboe-core:Measurement ; oboe-core:hasValue {value}^^xsd:string .
    source: SELECT DISTINCT value FROM measurement_values

Example values: 1,57; 31,87; 334
SPARQL Endpoint
Quest Reasoner
Annotation+
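To make the mapping idea concrete, the following Python sketch fills the two target templates with rows such as the source queries might return. It is an illustration only: Ontop keeps the ABox virtual and answers SPARQL queries by rewriting them to SQL, whereas this sketch writes the triples out as strings. The characteristic IRI is a made-up placeholder, the helper names are hypothetical, and the percent-encoding of the value inside the IRI is an assumption added here so the result is a valid prefixed name.

```python
from urllib.parse import quote

# Rows as the two source queries might return them. The characteristic IRI
# is a made-up placeholder; the values are the ones highlighted on the slide.
annotation_rows = [{"crct": "http://example.org/onto#TreeHeight", "crct_id": 1}]
measurement_rows = [{"value": "1,57"}, {"value": "31,87"}, {"value": "334"}]

# Target templates, following the two mappings on the slide.
CHARACTERISTIC_TYPE = ":crct_{crct_id} a <{crct}> ."
MEASUREMENT_TYPE_AND_VALUE = (
    ":msmt_{value_iri} a oboe-core:Measurement ; "
    'oboe-core:hasValue "{value}"^^xsd:string .'
)


def characteristic_triples(rows):
    """Instantiate the CHARACTERISTIC-TYPE template once per source row."""
    return [CHARACTERISTIC_TYPE.format(**row) for row in rows]


def measurement_triples(rows):
    """Instantiate the MEASUREMENT_TYPE-and-VALUE template once per source row."""
    return [
        MEASUREMENT_TYPE_AND_VALUE.format(
            value=row["value"],
            # Percent-encode the value where it becomes part of an IRI.
            value_iri=quote(row["value"], safe=""),
        )
        for row in rows
    ]


triples = characteristic_triples(annotation_rows) + measurement_triples(measurement_rows)
for triple in triples:
    print(triple)
```

Each placeholder in a target template has to match a column returned by the source query, which is why the first mapping selects both `crct` and `crct_id`.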
Supported Queries
Show me all datasets that refer to measurements of the nitrate concentration in groundwater.
At which locations within the Hainich transect has soil moisture been measured?
On which dates has the concentration of nitrate been measured at well H31?
Has species XY been observed in aquifer moTK?
Which characteristics have been measured at H31?
User Interface
From Keywords to SPARQL
Search: groundwater concentration_of nitrate

groundwater: EntityTypeClass
concentration_of: CharacteristicClass
nitrate: EntityTypeClass

Mappings

    SELECT DISTINCT ?dset
    WHERE {
      ?dset ad:refersToObservationType ?obstype .
      ?obstype ad:refersToEntityType ?enttype .
      ?enttype a <http://../someontology.owl#Nitrate> .
      ?obstype ad:refersToMeasurementType ?meastype .
      ?meastype ad:ofCharacteristic ?crct .
      ?crct a <http://../someontology.owl#concentration_of>
    }

    SELECT DISTINCT ?dset
    WHERE {
      ?dset ad:refersToObservationType ?obstype .
      ?obstype ad:refersToEntityType ?enttype .
      ?enttype a <http://../someontology.owl#Groundwater>
    }
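The translation from classified keywords to the two queries above can be sketched as a small query builder. This is a minimal sketch, not the component's actual implementation: the function name `build_query` and the namespace `http://example.org/someontology.owl#` are placeholders (the slide elides the real ontology IRI), and the `ad:` prefix is assumed to be declared elsewhere. How the results of the individual queries are combined is not stated on the slide and is left out.

```python
def build_query(entity_iri, characteristic_iri=None):
    """Build a dataset query for an entity type, optionally with a characteristic."""
    patterns = [
        "?dset ad:refersToObservationType ?obstype .",
        "?obstype ad:refersToEntityType ?enttype .",
        f"?enttype a <{entity_iri}> .",
    ]
    if characteristic_iri is not None:
        # A characteristic keyword adds the measurement-type patterns.
        patterns += [
            "?obstype ad:refersToMeasurementType ?meastype .",
            "?meastype ad:ofCharacteristic ?crct .",
            f"?crct a <{characteristic_iri}> .",
        ]
    body = "\n".join("  " + pattern for pattern in patterns)
    return "SELECT DISTINCT ?dset\nWHERE {\n" + body + "\n}"


# Placeholder namespace; the real ontology IRI is not given on the slide.
ONTO = "http://example.org/someontology.owl#"

# "concentration_of nitrate": an entity type with a characteristic.
nitrate_query = build_query(ONTO + "Nitrate", ONTO + "concentration_of")

# "groundwater": an entity type on its own.
groundwater_query = build_query(ONTO + "Groundwater")

print(nitrate_query)
print(groundwater_query)
```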
Architecture
[Architecture diagram; component labels:]
MD, DS, PD
TBox (domain knowledge)
Virtual ABox (Ontop)
SA
Mappings
PubMed/Medline, PubMed Central
Publication Search
Publication Annotator (extraction of class mentions, named entities and relations)
Query Translation
SPARQL Endpoint
Quest Reasoner (Ontop)
JCoRe, GeNO, LINNAEUS, BioSem
BExIS 2 Search Interface (including autocompletion)
BExIS 2 Module ADOnIS
Questions?
Remarks?
Suggestions?
Ideas?
Virtual ABox
measurement_values (materialized view)
msmt instance id  value
id                literal

annotations
dataset column id  entity      characteristic      standard
id                 IRI entity  IRI characteristic  IRI measurement standard