Integration of Data Ninja Services with Oracle Spatial and Graph

INTEGRATION OF TEXT ANALYTICS WITH ORACLE SPATIAL AND GRAPH
Oracle OpenWorld 2016, San Francisco, September 18 to 22
© 2016 DOCOMO Innovations, Inc. All rights reserved.



Oracle Big Data Spatial and Graph


• Distributed graph with analytic functions

• Examples of graph analysis use cases

*Data Sheet: Oracle Big Data Spatial and Graph

Semantic Indexing

• An index type that uses information extractors to semantically link data stored in relational tables


*Presentation: Oracle Spatial and Graph - Semantic Indexing

Using Data Ninja as Custom Semantic Extractor


[Architecture diagram; component labels as shown]
• Unstructured Big Data
• Preprocessing - Ingestion
• Text Analytics
• Ontology (RDF)
• Data Ninja RDF Loader (Java)
• Data Ninja RDF Custom Extractor (PL/SQL)
• RDF Extractor (PL/SQL)
• RDF Graph Access Layer
• Big Data Spatial and Graph
• Graph Analytics
• Other RDF Components
• Fuseki Server
• Cytoscape Server
• Graph Visualization

Callout: Programmable API to plug third-party extractors into Oracle database

News Crawling Example

newsID | newsArticle | newsSource
20160902_555 | A new study says that parts of Africa and the Asia-Pacific region may be vulnerable to outbreaks of the Zika virus, including some of the world's most populous countries and many with limited resources to identify and respond to the mosquito-borne disease. [more] | http://www.newkerala.com/news/2016/fullnews-113309.html
20160903_1317 | Hurricane Hermine, set to cause flooding and damage when it hits Florida overnight, will make it harder for the state to fight Zika, a mosquito-borne virus shown to cause birth defects, experts in infectious diseases and mosquitoes said on Thursday. [more] | http://kelo.com/news/articles/2016/sep/01/hurricane-hermine-will-complicate-floridas-zika-fight-experts/
20160904_2209 | Singapore confirmed 26 more cases of locally transmitted Zika infections, the health ministry and National Environment Agency (NEA) said in a joint statement on Saturday, bringing the tally to 215. Of the 26 new cases, 24 were linked to existing clusters while two cases have no known links to any existing cluster, they said. [more] | https://www.yahoo.com/news/singapore-says-confirms-26-more-local-transmission-zika-052937119--finance.html
… | … | …

• Domain-specific, health-related news crawling
  – English language only
  – Worldwide coverage
  – Healthcare-related keywords in news titles
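The title-keyword criterion above can be illustrated with a minimal Python sketch. The keyword list, the sample titles, and the function name `is_health_title` are all hypothetical; the slides do not say which keywords or matching rules the crawler actually uses, and the language and coverage criteria are not modeled here.

```python
import re

# Hypothetical keyword list; the deck does not list the real keywords.
HEALTH_KEYWORDS = ("zika", "virus", "outbreak", "vaccine", "infection")

def is_health_title(title: str, keywords=HEALTH_KEYWORDS) -> bool:
    """Return True if any word in the title starts with one of the keywords."""
    words = re.findall(r"[a-z]+", title.lower())
    return any(word.startswith(key) for word in words for key in keywords)

# Made-up titles for illustration only.
titles = [
    "Singapore confirms 26 more Zika infections",
    "Local team wins championship",
]
health_titles = [t for t in titles if is_health_title(t)]
print(health_titles)
```

Prefix matching is used so that "infections" matches the keyword "infection"; a production crawler would likely need stemming or a curated vocabulary instead.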


RDF Example of Extracted Entities

Subject | Property | Object
http://www.newkerala.com/news/2016/fullnews-113309.html | http://dataninja.net/occurrence | urn:uuid:e47c4916-e7c1-4a3b-b650-f243e0d7ba33
urn:uuid:e47c4916-e7c1-4a3b-b650-f243e0d7ba33 | http://dataninja.net/entity | http://dataninja.net/entity/Zika+virus
urn:uuid:e47c4916-e7c1-4a3b-b650-f243e0d7ba33 | http://dataninja.net/occurrence/entity/sentiment | http://dataninja.net/entity/sentiment/negative
urn:uuid:e47c4916-e7c1-4a3b-b650-f243e0d7ba33 | http://dataninja.net/occurrence/entity/count | "12"^^xsd:integer
urn:uuid:e47c4916-e7c1-4a3b-b650-f243e0d7ba33 | http://dataninja.net/occurrence/entity/sentiment_score | "-1.0"^^xsd:float
urn:uuid:e47c4916-e7c1-4a3b-b650-f243e0d7ba33 | http://dataninja.net/occurrence/entity/score | "1.0"^^xsd:float
urn:uuid:e47c4916-e7c1-4a3b-b650-f243e0d7ba33 | http://dataninja.net/occurrence/entity/entity_locations | "(135,145) (565,575) (777,787) (950,960) (1142,1152) (1535,1545) (1696,1706) (1755,1765) (1887,1891) (2191,2195) (2352,2362) (2376,2386)"

(265 more for the same news article)
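As a rough illustration of how rows like these could be generated, the Python sketch below builds the subject/property/object rows for one entity occurrence. The property URIs are copied from the slide; the function name `occurrence_triples`, its parameters, and the use of `quote_plus` to form the entity URI are assumptions, not the actual Data Ninja extractor. The `entity_locations` property is left out for brevity.

```python
import uuid
from urllib.parse import quote_plus

DN = "http://dataninja.net"

def occurrence_triples(article_url, entity, sentiment, count,
                       sentiment_score, score, node=None):
    """Return (subject, property, object) rows for one entity occurrence.

    Hypothetical helper: a fresh urn:uuid node is created per occurrence
    unless one is passed in.
    """
    node = node or f"urn:uuid:{uuid.uuid4()}"
    return [
        (article_url, f"{DN}/occurrence", node),
        # Assumption: the entity URI is the entity name with spaces as '+'.
        (node, f"{DN}/entity", f"{DN}/entity/{quote_plus(entity)}"),
        (node, f"{DN}/occurrence/entity/sentiment",
         f"{DN}/entity/sentiment/{sentiment}"),
        (node, f"{DN}/occurrence/entity/count", f'"{count}"^^xsd:integer'),
        (node, f"{DN}/occurrence/entity/sentiment_score",
         f'"{sentiment_score}"^^xsd:float'),
        (node, f"{DN}/occurrence/entity/score", f'"{score}"^^xsd:float'),
    ]

rows = occurrence_triples(
    "http://www.newkerala.com/news/2016/fullnews-113309.html",
    "Zika virus", "negative", 12, -1.0, 1.0,
    node="urn:uuid:e47c4916-e7c1-4a3b-b650-f243e0d7ba33",
)
for subject, prop, obj in rows:
    print(subject, prop, obj)
```

Passing the occurrence node explicitly, as in the usage above, reproduces the first six rows of the table; omitting it yields a new random node per call.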


RDF Graphs for Extracted Entities (one news article)


[Graph diagram; labels as shown]
Nodes:
• http://www.newkerala.com/news/2016/fullnews-113309.html
• urn:uuid:e47c4916-e7c1-4a3b-b650-f243e0d7ba33
• http://dataninja.net/entity/Zika+virus
• negative
• 12 …
Edge labels:
• http://dataninja.net/occurrence
• http://dataninja.net/entity
• http://dataninja.net/occurrence/entity/sentiment
• http://dataninja.net/occurrence/entity/count
Other entity nodes:
• http://dataninja.net/entity/Philippines
• http://dataninja.net/entity/Thailand
• http://dataninja.net/entity/Nigeria

Callout: One occurrence blank node for each extracted entity

RDF Graphs for Extracted Entities (multiple articles)


[Graph diagram; labels as shown]
Article and occurrence nodes, in the order shown:
• http://www.newkerala.com/news/2016/fullnews-113309.html
• urn:uuid:e47c4916-e7c1-4a3b-b650-f243e0d7ba33
• http://www.newkerala.com/news/2016/fullnews-113309.html
• urn:uuid:68282cbb-b70c-4f6e-8157-5ef6b1d34d31
• https://www.yahoo.com/news/singapore-says-confirms-26-more-local-transmission-zika-052937119--finance.html
• urn:uuid:ab7b9e43-710f-436e-b6ff-15abad71ca15
Entity node:
• http://dataninja.net/entity/Zika+virus

Callout: Same URI for same entity
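Because every occurrence node points at the same entity URI, articles that mention the same entity can be found by following occurrence and entity edges. The Python sketch below shows one way to do that over plain (subject, property, object) tuples. The function name `articles_by_entity` is hypothetical, and the pairing of each article with its occurrence node is an assumption read off the slide layout.

```python
from collections import defaultdict

OCCURRENCE = "http://dataninja.net/occurrence"
ENTITY = "http://dataninja.net/entity"

def articles_by_entity(triples):
    """Map each entity URI to the set of article URLs that mention it."""
    # article --occurrence--> node, inverted so a node leads back to its article
    node_to_article = {o: s for s, p, o in triples if p == OCCURRENCE}
    index = defaultdict(set)
    for s, p, o in triples:
        # node --entity--> entity URI
        if p == ENTITY and s in node_to_article:
            index[o].add(node_to_article[s])
    return index

NEWKERALA = "http://www.newkerala.com/news/2016/fullnews-113309.html"
YAHOO = ("https://www.yahoo.com/news/singapore-says-confirms-26-more-"
         "local-transmission-zika-052937119--finance.html")
ZIKA = "http://dataninja.net/entity/Zika+virus"

# Article-to-node pairing assumed from the slide layout.
triples = [
    (NEWKERALA, OCCURRENCE, "urn:uuid:e47c4916-e7c1-4a3b-b650-f243e0d7ba33"),
    ("urn:uuid:e47c4916-e7c1-4a3b-b650-f243e0d7ba33", ENTITY, ZIKA),
    (YAHOO, OCCURRENCE, "urn:uuid:ab7b9e43-710f-436e-b6ff-15abad71ca15"),
    ("urn:uuid:ab7b9e43-710f-436e-b6ff-15abad71ca15", ENTITY, ZIKA),
]
index = articles_by_entity(triples)
print(sorted(index[ZIKA]))
```

In a deployed system the same join would more likely be written as a SPARQL query against the RDF store; the in-memory version only illustrates why a shared entity URI makes the join possible.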

Ontology for Extracted Entities


[Ontology diagram; labels as shown]
Entity nodes:
• http://dataninja.net/entity/Philippines
• http://dataninja.net/entity/Thailand
• http://dataninja.net/entity/Nigeria
Category nodes:
• http://dataninja.net/entity/Location
• http://dataninja.net/entity/Country
• http://dataninja.net/entity/Kingdom
Edge label: rdfs:subClassOf
Label: Smart Sentiment

Callout: Ontology extracted for categories of entities

Ontology for Extracted Entities (with more categories)


[Ontology diagram; labels as shown]
Entity nodes:
• http://dataninja.net/entity/Philippines
• http://dataninja.net/entity/Thailand
• http://dataninja.net/entity/Nigeria
Category nodes:
• http://dataninja.net/entity/category/Location
• http://dataninja.net/entity/category/Country
• http://dataninja.net/entity/category/Kingdom
• http://dataninja.net/category/Southeast+Asia
• http://dataninja.net/category/Regions+of+Asia
• http://dataninja.net/category/Africa
Edge label: rdfs:subClassOf
Label: Smart Data

Callout: Additional categories of entities added to ontology by using Smart Data knowledge graph
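The rdfs:subClassOf edges in these ontology slides are transitive, so a query for a broad category should also reach its narrower categories. The Python sketch below walks such edges upward from one category. The transcript does not preserve which category is a subclass of which, so the edge directions in `SUB_CLASS_OF` are assumptions chosen for illustration, and `superclasses` is a hypothetical helper name.

```python
CAT = "http://dataninja.net/entity/category"
REGION = "http://dataninja.net/category"

# Assumed edges (child -> parents); the slide only shows the edge label.
SUB_CLASS_OF = {
    f"{CAT}/Kingdom": [f"{CAT}/Country"],
    f"{CAT}/Country": [f"{CAT}/Location"],
    f"{REGION}/Southeast+Asia": [f"{REGION}/Regions+of+Asia"],
}

def superclasses(cls, sub_class_of):
    """Return every class reachable from cls via rdfs:subClassOf edges."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in sub_class_of.get(current, ()):
            if parent not in seen:  # also guards against cycles
                seen.add(parent)
                stack.append(parent)
    return seen

print(sorted(superclasses(f"{CAT}/Kingdom", SUB_CLASS_OF)))
```

An RDF store with RDFS inference enabled would normally compute this closure itself; the explicit traversal is shown only to make the semantics of the edge label concrete.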

RDF Graphs for Extracted Concepts (one news article)


[Graph diagram; labels as shown]
Nodes:
• http://www.newkerala.com/news/2016/fullnews-113309.html
• urn:uuid:3f365159-2572-4c91-99ea-0f7ec7c0b7bc
• http://dataninja.net/concept/Zika+virus
• 0.33
• http://dataninja.net/entity/Zika+fever
• http://dataninja.net/entity/Zika+virus
Edge labels:
• http://dataninja.net/occurrence
• http://dataninja.net/concept
• http://dataninja.net/occurrence/concept/score
• owl:sameAs
Label: Smart Content

Callout: Same URI for same concepts, but not for entities with same names

RDF Graphs for Extracted Concepts (with categories)


[Graph diagram; labels as shown]
Nodes:
• http://dataninja.net/concept/Zika+virus
• http://dataninja.net/entity/Zika+fever
• http://dataninja.net/category/Flaviviruses
• http://dataninja.net/category/Zoonoses
• http://dataninja.net/category/Viral+diseases
• http://dataninja.net/category/Infectious+diseases
Edge label: rdfs:subClassOf (shown twice)
Label: Smart Content

Callout: More categories of concepts added to improve richness of ontology

RDF Graphs for Extracted Relationships


[Graph diagram; labels as shown]
Nodes:
• https://www.yahoo.com/news/singapore-says-confirms-26-more-local-transmission-zika-052937119--finance.html
• http://dataninja.net/entity/Zika+virus
• http://dataninja.net/entity/Singapore
Edge labels:
• http://dataninja.net/occurrence
• http://dataninja.net/entity
• http://dataninja.net/relationship/Outbreak
• http://dataninja.net/relationship/Mosquitoes
• http://dataninja.net/relationship/Infections
• owl:intersectionOf

Callout: New relationships discovered over time to enrich the ontology further

Newsbot Ninja App in Beta


[Pipeline diagram; labels as shown]
• Fetching/Streaming
• Pre-process
• Data Ninja Client
• Structured Data
• Text Extraction
• Text Analytics
• Topic Extraction
• Post-process
• Hourly
• View Builder Newsbot Service
• Automotive
• News Crawling Service
• Health

Callout: Continuous discovery of new relationships