Knowledge Subproject Deana Pennington, Natalia Villanueva Rosales, and Paulo Pinheiro da Silva University of Texas at El Paso 2/16/2012 Board of Advisers 2012




Outline

Change in personnel

Highlights of first five years

Plans for next five

Dr. Paulo Pinheiro da Silva => Pacific Northwest National Lab

New co-leads:

Dr. Deana Pennington

Dr. Natalia Villanueva Rosales


Knowledge about events: time awareness

Natalia (www.natalia-villanueva.com)

Natalia is Mexican. All Mexicans love Mexican food.

"I am giving a lecture in Cyber-ShARE, UTEP", 11/12/2011, 14:00hrs

[Figure: Natalia "is presenting" an Event (today) "happening at" Cyber-ShARE, UTEP.]

Natalia is currently at UTEP.
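The time-awareness idea on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation: the `Event` class, the one-hour default duration, and the `current_location` helper are assumptions made for the example. Only the presenter, venue, and start time come from the slide, and reading the date as month/day/year is also an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    presenter: str
    venue: str
    start: datetime
    # Assumed duration; the slide gives only a start time.
    duration: timedelta = timedelta(hours=1)

    def is_happening(self, now: datetime) -> bool:
        """True while `now` falls inside the event's time interval."""
        return self.start <= now < self.start + self.duration


def current_location(person: str, events: List[Event], now: datetime) -> Optional[str]:
    """Infer where `person` is from the events they are presenting at `now`."""
    for event in events:
        if event.presenter == person and event.is_happening(now):
            return event.venue
    return None


lecture = Event("Natalia", "Cyber-ShARE, UTEP", datetime(2011, 11, 12, 14, 0))

during = current_location("Natalia", [lecture], datetime(2011, 11, 12, 14, 30))
after = current_location("Natalia", [lecture], datetime(2011, 11, 12, 16, 0))
print(during)  # Cyber-ShARE, UTEP
print(after)   # None
```

The location is only inferred while the event interval contains the current time, which is the point of attaching time to the event rather than to the person.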


Vision: First five years

[Figure: vision diagram. Recoverable labels, grouped as they appear:]

Scientific Discovery: Experimental Science; Theoretical Science; Computational Science; Data Exploration Science

Data, product understanding and acceptance (trust and uncertainty management); Data curation; Data and product attribution; Product derivation understanding; Data and product discovery/reuse; Process understanding; Information integration

Domain ontologies; Task ontologies; Provenance; General-purpose ontologies

Knowledge of data, product derivation and assertion; Process knowledge; Domain knowledge; Common knowledge


Organized (Connected) Data

[Figure: connected data. Recoverable labels:]

Category labels: provenance; Abstract Workflows; Domain Ontologies; General-Purpose Ontologies

Item labels: Crustal Modeling SAW; Crustal Modeling WDO; Seismology WDO; Hole’s Code Tomography; South NM Tomography; British Columbia Tomography; OWL Time ontology; SWEET ontology; PML ontology


Accomplishments: Development of CI-Miner infrastructure


Accomplishments: Development of CI-Miner methodology

Development of three complementary integrated languages:

Workflow Driven Ontologies

Semantic Abstract Workflows

Proof Markup Language

Capable of enhancing scientific processes with:

Semantic search

Knowledge-based visualization

Information management services


Students

Two Computer Science PhD students graduating this spring/early summer:
- Nick Del Rio: knowledge-based visualization
- Leo Salayandia: semantic abstract workflows

Research presentations to follow

Three more over the next year:
- Aida Gandara: semantically enabled collaboration
- Jitin Arora: triple stores
- Hugo Porras: visualizing provenance
- Antonio: representing human processes


Building on the past

Continue work on ontologies, provenance, and other knowledge representation tools

Extend work to incorporate new areas of expertise with Natalia Villanueva Rosales

Consider new directions in human factors and the semantics of human processes

Semantic abstract workflows capture human processes

Other mechanisms to capture semantics (Deriva)

Semantic collaboration focuses on supporting human collaboration

Deana Pennington’s research on collaboration and knowledge synthesis

Consider new directions linking visualization and semantics


Pharmacogenomics ontology

PharmGKB database: genes, gene variants, SNPs, drugs, measures and outcomes, interactions, treatments.

Manual augmentation with literature-curated pharmacogenomics knowledge of depression.

- Effective drug treatment
- Favorable outcome?
- Possible side effects?
- Gene variants affect therapeutic outcomes?

Semantic Web for retrieving and integrating scientific knowledge for pharmacogenomics

[Ferres et al., 2009, JWS.]


Statistical graphs query answering: same methodology, different data sources.

Retrieve all time series graphs.

TimeSeriesGraph: EquivalentTo Graph and hasPart some TimeSeries

E.g.
series1 hasPart datapoint7,
datapoint7 hasPart x7,
x7 type SecondQuarter,
SecondQuarter subClassOf Quarter,
Quarter subClassOf TimeInterval,
graph1 hasPart series1

Using Protégé 4 alpha (build 53), FaCT++ DL reasoner and Manchester Syntax. Across graphs.

[Dumontier & Villanueva Rosales, 2009, BiB.]
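The class definition and example assertions on this slide can be imitated with a tiny in-memory triple set. This is only a closed-world lookup written for illustration, not a description logic reasoner such as the one named on the slide. The two typing assertions marked as assumptions (`graph1 type Graph`, `series1 type TimeSeries`) are not on the slide and are added so that the query has something to match.

```python
# Triples from the slide, plus two typing assertions (marked below) that the
# slide leaves implicit and that this sketch assumes.
TRIPLES = {
    ("graph1", "hasPart", "series1"),
    ("series1", "hasPart", "datapoint7"),
    ("datapoint7", "hasPart", "x7"),
    ("x7", "type", "SecondQuarter"),
    ("SecondQuarter", "subClassOf", "Quarter"),
    ("Quarter", "subClassOf", "TimeInterval"),
    ("graph1", "type", "Graph"),        # assumption
    ("series1", "type", "TimeSeries"),  # assumption
}


def superclasses(cls):
    """Return cls and every class reachable from it through subClassOf."""
    seen, todo = set(), [cls]
    while todo:
        current = todo.pop()
        if current not in seen:
            seen.add(current)
            todo.extend(o for s, p, o in TRIPLES
                        if s == current and p == "subClassOf")
    return seen


def instances_of(cls):
    """Individuals whose asserted type is cls or a subclass of cls."""
    return {s for s, p, o in TRIPLES
            if p == "type" and cls in superclasses(o)}


def time_series_graphs():
    """Evaluate `Graph and hasPart some TimeSeries` over TRIPLES."""
    series = instances_of("TimeSeries")
    return {g for g in instances_of("Graph")
            if any((g, "hasPart", s) in TRIPLES for s in series)}


print(time_series_graphs())          # {'graph1'}
print(instances_of("TimeInterval"))  # {'x7'}
```

The second query shows the subclass chain at work: `x7` is only asserted to be a `SecondQuarter`, yet it is returned for `TimeInterval`.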


Summary of first approach: manual creation of ontologies

Ontologies:
- Functional groups [Villanueva-Rosales & Dumontier, 2007, OWLED]
- yOWL [Villanueva-Rosales & Dumontier, 2009, JBI]
- Pharmacogenomics
- Statistical graphs

Involved:
- Developing customized parsers to obtain data.
- Analysis of database, tab files structure, web services definition.

Disadvantages:
- Not very scalable approach (time consuming).
- Parsers hard to reuse or maintain.


Can we automate the process of creating ontologies from relational databases?

Goal

Automatically create an expressive OWL ontology using a set of rule-based heuristics represented in Semantic Web Languages over a normalized relational database.

Represent, create and execute mappings between relational databases and ontologies.

[Figure: Database => Ontology]
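One way to picture rule-based heuristics of this kind is as a small set of schema-to-ontology rules. The three rules below (table to class, plain column to datatype property, foreign key to object property) are commonly used in relational-to-ontology mapping and are assumed here for illustration. The slides do not list the actual rule set, and the heuristics in this work are represented in Semantic Web languages rather than Python. The `Table` structure, the property naming scheme, and the toy schema are hypothetical, and the output is Manchester-like text that has not been validated by a parser.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# SQL type -> XSD datatype (a deliberately tiny, assumed mapping).
XSD = {"VARCHAR": "xsd:string", "INT": "xsd:integer", "DOUBLE": "xsd:double"}


@dataclass
class Table:
    name: str
    columns: Dict[str, str]  # column name -> SQL type
    foreign_keys: Dict[str, str] = field(default_factory=dict)  # column -> referenced table


def class_name(table: str) -> str:
    """Derive a class name from a table name."""
    return table.capitalize()


def schema_to_axioms(tables: List[Table]) -> List[str]:
    """Apply three illustrative heuristics and return Manchester-like frames."""
    axioms = []
    for t in tables:
        cls = class_name(t.name)
        # Rule 1: every table becomes a class.
        axioms.append(f"Class: {cls}")
        for col, sql_type in t.columns.items():
            if col in t.foreign_keys:
                # Rule 3: a foreign key becomes an object property.
                target = class_name(t.foreign_keys[col])
                axioms.append(
                    f"ObjectProperty: has{target} Domain: {cls} Range: {target}")
            else:
                # Rule 2: any other column becomes a datatype property.
                axioms.append(
                    f"DataProperty: has{cls}.{col} Domain: {cls} Range: {XSD[sql_type]}")
    return axioms


schema = [
    Table("department", {"name": "VARCHAR"}),
    Table("employee",
          {"name": "VARCHAR", "salary": "DOUBLE", "dept_id": "INT"},
          foreign_keys={"dept_id": "department"}),
]

axioms = schema_to_axioms(schema)
for axiom in axioms:
    print(axiom)
```

Keeping each rule as a separate branch makes it easy to see which schema feature produced which axiom, which matters when the generated ontology is later mapped to a domain ontology.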


DBOwlizer enables the creation and execution of mappings between DB and OWL ontologies

[Figure: Employees database schema diagram (excluding views), auto-generated by MySQL Workbench ver. 5.1.18. Pipeline labels: Schema; Relational-model; Relational-to-ontology-mapping; Heuristics; DBOwlizer; Ontology. Ontology fragment labels: Employee; hasName; string; worksfor; Department.]
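Executing a mapping, as opposed to creating one, means turning rows into individuals and property assertions. The sketch below assumes a hand-written mapping for the Employee, hasName, and worksfor labels shown in the figure. The row data, the identifier scheme, and `execute_mapping` are hypothetical and are not taken from DBOwlizer.

```python
# Hypothetical mapping for one table: class, key column, and column -> property.
EMPLOYEE_MAPPING = {
    "class": "Employee",
    "key": "id",
    "data_properties": {"name": "hasName"},
    "object_properties": {"dept": ("worksfor", "Department")},
}


def execute_mapping(rows, mapping):
    """Turn relational rows into (subject, predicate, object) triples."""
    triples = []
    for row in rows:
        # One individual per row, named after the class and the key value.
        subject = f"{mapping['class']}_{row[mapping['key']]}"
        triples.append((subject, "type", mapping["class"]))
        for column, prop in mapping["data_properties"].items():
            triples.append((subject, prop, row[column]))
        for column, (prop, target) in mapping["object_properties"].items():
            triples.append((subject, prop, f"{target}_{row[column]}"))
    return triples


rows = [{"id": 1, "name": "Ana", "dept": "d01"}]
triples = execute_mapping(rows, EMPLOYEE_MAPPING)
for triple in triples:
    print(triple)
```

Because the mapping is plain data, the same function can be reused for other tables by supplying a different dictionary.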


DBOwlizer maps information in views (queries)

[Figure: DB View => Ontology; Employee, hasEmployee.salary >= 80,000]

CREATE VIEW `employees`.`high_salary_employee_salary` AS
SELECT
  `employees`.`employee`.`name` AS `name`,
  `employees`.`employee`.`salary` AS `salary`
FROM `employees`.`employee`
WHERE (`employees`.`employee`.`salary` >= 80000)

Class: dl:High_salary_employee_salary
  EquivalentTo:
    (Employee
      and (dl:hasEmployee.salary some xsd:double[>= 80000]))
    and (dl:hasEmployee.salary some rdfs:Literal)
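To see what the view and the derived class have in common, the sketch below recreates the view in an in-memory SQLite database and checks the same condition in plain Python. SQLite stands in for MySQL here, the schema prefix is dropped, the sample rows are invented, and the threshold is written as `>= 80000` to match the class restriction. None of this is DBOwlizer code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name TEXT, salary REAL);
    INSERT INTO employee VALUES ('Ana', 95000), ('Luis', 80000), ('Eva', 62000);
    CREATE VIEW high_salary_employee_salary AS
        SELECT name, salary FROM employee WHERE salary >= 80000;
""")

# Rows returned by the view (the database side of the mapping).
view_rows = conn.execute(
    "SELECT name, salary FROM high_salary_employee_salary ORDER BY name"
).fetchall()


def in_high_salary_class(salary):
    """Membership test mirroring `hasEmployee.salary some xsd:double[>= 80000]`."""
    return salary is not None and float(salary) >= 80000.0


# Individuals selected by the restriction (the ontology side of the mapping).
all_rows = conn.execute("SELECT name, salary FROM employee ORDER BY name").fetchall()
class_members = [(n, s) for n, s in all_rows if in_high_salary_class(s)]

print(view_rows)                   # [('Ana', 95000.0), ('Luis', 80000.0)]
print(view_rows == class_members)  # True
conn.close()
```

The boundary row (`Luis`, exactly 80000) is included on both sides, which is the case where a mismatch between `>` and `>=` would show up.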


Bottom-up (semi-)automated methodology

- Better capture of intended semantics of the data than other approaches.
- Exposing contents + mappings on the semantic web (deep web).
- Enables query of database(s) with terminology from domain ontologies (with mapping).

[Figure labels: Geospatial ontology; query about temperature in specific locations; query about C fluxes in a specific region; query about temperature and C fluxes in a specific region to identify correlations.]
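Querying a database with domain-ontology terminology can be sketched as a term-to-column lookup followed by ordinary SQL. Everything concrete below is hypothetical: the ontology terms, the table and column names, the sample readings, and the `query_by_terms` helper are invented for illustration, and a real system would derive the mapping from the ontology rather than hard-code it.

```python
import sqlite3

# Hypothetical mapping from domain-ontology terms to (table, column).
TERM_TO_COLUMN = {
    "geo:Temperature": ("temperature", "celsius"),
    "geo:CarbonFlux": ("carbon_flux", "flux"),
}

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE temperature (region TEXT, day INTEGER, celsius REAL);
    CREATE TABLE carbon_flux (region TEXT, day INTEGER, flux REAL);
    INSERT INTO temperature VALUES ('R1', 1, 10.0), ('R1', 2, 20.0), ('R1', 3, 30.0);
    INSERT INTO carbon_flux VALUES ('R1', 1, 1.0), ('R1', 2, 2.0), ('R1', 3, 3.0);
""")


def query_by_terms(term_a, term_b, region):
    """Return paired values of two ontology terms for one region, joined by day."""
    table_a, col_a = TERM_TO_COLUMN[term_a]
    table_b, col_b = TERM_TO_COLUMN[term_b]
    # Identifiers come from the fixed mapping above; only the region is a parameter.
    sql = (f"SELECT a.{col_a}, b.{col_b} FROM {table_a} AS a "
           f"JOIN {table_b} AS b ON a.region = b.region AND a.day = b.day "
           f"WHERE a.region = ? ORDER BY a.day")
    return conn.execute(sql, (region,)).fetchall()


def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / (var_x * var_y) ** 0.5


pairs = query_by_terms("geo:Temperature", "geo:CarbonFlux", "R1")
print(pairs)
print(round(pearson(pairs), 6))
```

The caller never names a table or column; it asks in ontology terms, and the mapping decides which tables are joined.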


Future directions

Bottom-up approach in scientific domains of knowledge (e.g. environmental sciences, geo)
- Develop use cases for data integration, exchange and question answering.
- Mapping extracted ontologies to domain ontologies.

Improve scalability and robustness.
- Computer science research questions (i.e. complexity, expressivity, optimizations).

Map and contribute to the W3C RDB2RDF Working Group (Semantic Web community)
- Benchmark for Relational Databases to Ontology

Include human factor (Deana’s work)


New directions: Human factors

[Figure: approaches linking Ontologies and Data/Process. Recoverable labels:]

Ontologies <=> Data/Process: Connect (Top down); Generate (Bottom up)

- Dataset: Data driven (Natalia)
- Documents: Document driven (IBM Watson; PNNL)
- Human Actions, Tags, Links: Socially driven (Deana)
- Analysis: Workflow driven (Semantic abstract workflows)

Grudin 1994 Groupware challenges (8)
- Work vs benefit (perceived usefulness)
- Critical mass
- Disrupt social processes
- Failure of intuition
- Difficult adoption process

Human reasoning about scientific data, processes, and documents
- Semantic-enabled collaboration
- Boundary Negotiating Objects (Lee, 2010; Pennington, 2010)

Bada (2004)
- Clear goals
- Limited scope
- Simple, intuitive

How can we semantically enable this process?


Analysis driven

Semantic Workflow Negotiation ???

Semantic Abstract Workflow

Executable Scientific Workflow


Bottom-up (semi-)automated methodology in Cyber-ShARE

[Figure labels: Geospatial ontology; query about temperature in specific locations; query about C fluxes in a specific region; query about temperature and C fluxes in a specific region to identify correlations.]


Negotiation artifact driven: Initial Model (Pennington 2010)

[Figure: initial model. Recoverable labels:]

- Boundary Negotiating Objects
- Method Alignment & Standardization
- Boundary Concepts
- Conceptual Linkages
- Boundary Objects (Star & Griesemer 1989)
- Standardized Concepts & Processes => Ontologies
- Support negotiating purpose/scope via boundary negotiating objects
- Standardization
- Alignment

Clear goals; Limited scope; Simple; Intuitive to the scientist


Potential Research Objectives

1. Model boundary negotiation process (Deana)

2. Knowledge representation approaches (Natalia)

3. New approaches to bridge gaps between

4. Evaluate usefulness


Questions?
