sumeet vij enterprise_knowledge_graph

Enterprise Knowledge Graph (EKG) Mining an Enterprise’s Systems of Engagements Sumeet Vij Senior Associate Booz Allen Hamilton

Upload: open-analytics

Post on 12-May-2015


Category:

Technology



DESCRIPTION

BAH's OA DC Summit Prese

TRANSCRIPT

Page 1: Sumeet vij enterprise_knowledge_graph

Enterprise Knowledge Graph (EKG)
Mining an Enterprise’s Systems of Engagements

Sumeet Vij
Senior Associate

Booz Allen Hamilton

Page 2: Sumeet vij enterprise_knowledge_graph

Can you make your decisions on just 20% of your data?

◦ According to IDC Research, less than 20% of an enterprise’s information is structured data that fits neatly into the rows and columns of traditional relational databases

◦ The remaining 80% is unstructured or semi-structured: documents, web pages, emails, images and videos, all growing at a tremendous rate

◦ Current Enterprise Systems of Record (ERPs, CRMs) capture only a minuscule share of the information generated within an Enterprise

◦ However, the Systems of Record remain the main focus of the IT team and the main source of information for the enterprise leadership

Page 3: Sumeet vij enterprise_knowledge_graph

Unstructured Data creates Enterprise Information Management Challenges

• Information is scattered and inaccessible
  ◦ Spread across documents, spreadsheets, emails

• Data is stored in multiple, often incompatible formats

• Data sources are not linked
  ◦ No documented relationships between pieces of information

• No easy way to harness data from external sources including social networks

• Information is hard to understand
  ◦ Different terminology and vocabularies

Page 4: Sumeet vij enterprise_knowledge_graph

How do employees create and share information? Through Systems of Engagement

These systems are the primary way employees in an Enterprise communicate and share information, namely:
◦ Email
◦ IM (Lync)
◦ Social collaboration tools like Yammer, Tibbr, Jive

Not surprisingly, these systems generate unstructured data at high velocity and variety

Page 5: Sumeet vij enterprise_knowledge_graph

Systems of Engagement
• Loosely structured knowledge flows
• Conversational
• Dynamic and in flux

Page 6: Sumeet vij enterprise_knowledge_graph

How does the industry extract information from unstructured text? Google Knowledge Graph

The Google Knowledge Graph provides “Things not just Strings”, that is, it enhances its search results with semantic information gathered from multiple sources. It provides structured information about entities and links to other related entities. Its goal is to help people

• Find the right thing: Find the right entity, understand the difference between Taj Mahal the monument and Taj Mahal the musician

• Get the best summary: Summarize relevant content related to the entity, key facts and other related entities

• Go deeper and broader: Help make unexpected discoveries and uncover unexpected relationships

Page 7: Sumeet vij enterprise_knowledge_graph

How does an Enterprise extract information from these Systems of Engagement? Enter the Enterprise Knowledge Graph (EKG)

Along the lines of the Google Knowledge Graph, the EKG aims to help enterprises extract and explore information created by systems of engagement. Core EKG concepts are:

• Knowledge Capture: Extract key concepts and relationships from unstructured documents using an Enterprise Ontology. This allows concept-based indexing of content
  ◦ Example: An employee submits a trip report in the form of an email. EKG automatically extracts the Who, What, When and Where information and links it to other relevant resources.

• Knowledge Discovery: Search multiple data sources for information using a relevant Enterprise Ontology
  ◦ Example: A proposal manager can ask, “Who has background information about the Army CIO/G6?”

• Knowledge Exploration: Expose information to a host of graphical tools to visualize and further analyze relationships between data
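The Knowledge Capture example above (pulling the Who, What, When and Where out of a trip report) can be sketched as a dictionary lookup against a tiny ontology. This is only a minimal illustration, not the EKG's actual extraction method: the `ONTOLOGY` contents, the name "Jane Doe", the date, and the sample sentence are all invented for the example, and a real system would use a proper entity-extraction component.

```python
import re

# Hypothetical mini-ontology: concept type -> known surface forms.
ONTOLOGY = {
    "Person": ["Jane Doe"],
    "Organization": ["Army CIO/G6"],
    "Event": ["DoD SOA & Semantic Technology Symposium"],
}

# Only ISO-style dates are recognized, to keep the sketch short.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def extract_facts(text: str) -> dict[str, list[str]]:
    """Return the ontology concepts and dates mentioned in `text`."""
    found: dict[str, list[str]] = {}
    lowered = text.lower()
    for concept_type, names in ONTOLOGY.items():
        hits = [name for name in names if name.lower() in lowered]
        if hits:
            found[concept_type] = hits
    dates = DATE_PATTERN.findall(text)
    if dates:
        found["Date"] = dates
    return found


# Invented sample trip report.
trip_report = (
    "On 2015-03-10 Jane Doe presented at the DoD SOA & Semantic "
    "Technology Symposium. Staff from the Army CIO/G6 attended."
)
facts = extract_facts(trip_report)
```

Indexing each document under the concepts returned by `extract_facts`, rather than under its raw words, is what the slide calls concept-based indexing.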

Page 8: Sumeet vij enterprise_knowledge_graph

How is the EKG seeded? Crowd-source the creation

The major source of information generation in an enterprise is email. The process to seed the EKG with email would be:
◦ The sender copies their email to a monitored EKG mailbox
◦ The EKG parses and analyzes the email, then adds the extracted facts to the Knowledge Graph
◦ The EKG then sends an automated email back to the sender, listing the extracted facts and providing a link to correct them

Start with a specific Ontology geared towards a high-value use case and then build out the entities and their relationships

Integrate with Linked Data sources like Freebase and DBpedia to provide external context
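The three-step email seeding process above can be sketched with Python's standard `email` package. Everything specific here is an assumption made for illustration: the mailbox address, the correction URL, the `KNOWN_ENTITIES` list standing in for the ontology, and the simple "mentions" relationship. The slides do not say how the EKG actually stores facts or builds its reply.

```python
from email import message_from_string
from email.message import EmailMessage

EKG_MAILBOX = "ekg@example.com"                    # hypothetical monitored mailbox
CORRECTION_URL = "https://ekg.example.com/facts/"  # hypothetical correction page
KNOWN_ENTITIES = ["Army CIO/G6", "Semantic Technologies"]  # stand-in for the ontology

# The "Knowledge Graph" here is just a list of (subject, predicate, object) facts.
graph: list[tuple[str, str, str]] = []


def ingest(raw_email: str) -> EmailMessage:
    """Parse a copied email, add extracted facts to the graph, and build the reply."""
    msg = message_from_string(raw_email)
    sender = msg["From"]
    body = msg.get_payload()  # plain, non-multipart messages only in this sketch

    # Step 2: parse, analyze, and add the extracted facts to the graph.
    facts = [(sender, "mentions", entity) for entity in KNOWN_ENTITIES if entity in body]
    graph.extend(facts)

    # Step 3: automated reply listing the facts plus a link to correct them.
    reply = EmailMessage()
    reply["From"] = EKG_MAILBOX
    reply["To"] = sender
    reply["Subject"] = "EKG extracted facts: " + msg["Subject"]
    lines = [f"{s} {p} {o}" for s, p, o in facts]
    lines.append("Correct these facts at " + CORRECTION_URL)
    reply.set_content("\n".join(lines))
    return reply


# Step 1: the sender copies the EKG mailbox on an ordinary email.
raw = (
    "From: employee@example.com\n"
    "To: customer@example.com\n"
    "Cc: ekg@example.com\n"
    "Subject: Trip report\n"
    "\n"
    "Met the Army CIO/G6 team to discuss Semantic Technologies.\n"
)
reply = ingest(raw)
```

Sending the reply (for example with `smtplib`) is left out so the sketch runs without a mail server.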

Page 9: Sumeet vij enterprise_knowledge_graph

Benefits of adding email to the EKG

◦ Bigger insights, as we can leverage the collective interactions of all the employees (not just the respondent); subsequent interactions enrich the EKG further, allowing even more questions to be answered

◦ Liberate employee knowledge, expertise and interactions from the mailbox and make them available for the enterprise to leverage.

Page 10: Sumeet vij enterprise_knowledge_graph

EKG Benefits

• Utilize all available knowledge sources
  ◦ Allows documents, spreadsheets and emails to serve as “top-level” information sources

• Integration
  ◦ Ties disconnected pieces of data together into meaningful wholes that provide a basis for planning and decision making

• Meaning-Centric
  ◦ Facts around an object or an entity can be easily explored
  ◦ Search phrases are better “understood” as they are based upon concepts and not literals

• Serendipity
  ◦ Related searches allow the formerly “unknown” to be discovered
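The point that search phrases are matched on concepts and not literals can be illustrated with a small concept index. This is a sketch under stated assumptions: the `CONCEPTS` vocabulary, the concept identifiers, and both sample documents are made up, and grouping "ontology" under the same concept as "semantic web" is an arbitrary choice for the demo.

```python
from collections import defaultdict

# Hypothetical vocabulary: surface term -> canonical concept identifier.
CONCEPTS = {
    "army cio/g6": "ArmyCIOG6",
    "cio/g6": "ArmyCIOG6",
    "semantic technologies": "SemanticTech",
    "semantic web": "SemanticTech",
    "ontology": "SemanticTech",
}


def concepts_in(text):
    """Return the set of concepts whose surface terms occur in `text`."""
    lowered = text.lower()
    return {concept for term, concept in CONCEPTS.items() if term in lowered}


def build_index(docs):
    """Map each concept to the ids of the documents that mention it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for concept in concepts_in(text):
            index[concept].add(doc_id)
    return index


def search(index, query):
    """Return documents sharing at least one concept with the query."""
    results = set()
    for concept in concepts_in(query):
        results |= index.get(concept, set())
    return results


docs = {
    "trip-report-1": "Demonstrated our ontology tools to the CIO/G6.",
    "minutes-7": "Budget review for the cafeteria.",
}
index = build_index(docs)
hits = search(index, "semantic web")
```

The query "semantic web" never appears literally in `trip-report-1`, yet the document is returned because both map to the same concept.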

Page 11: Sumeet vij enterprise_knowledge_graph

How we discover information within an Enterprise today

[Figure: A Proposal Manager asks “Who has information about the Army CIO/G6?” and runs separate searches against Systems of Record (Resume System, Opportunity Management System, CRM) and Systems of Engagement (Trip Report, Social Network, Web). The underlying facts are drawn as a graph whose nodes include Sumeet Vij, Cliff Daus, CIO/G6, Customer, DoD SOA & Semantic Technology Symposium, Follow on Meeting and Semantic Technologies, connected by edges labeled Presented at, Attended, Demonstration at, Employee of and Topic.]
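Facts like the ones on this slide are naturally stored as subject-predicate-object triples. The sketch below reuses the slide's edge labels, but the people ("Employee A", "Employee B", "Customer X"), the event names, and every pairing of subject and object are assumed for illustration, since the flattened figure does not preserve which nodes each edge joins.

```python
# Hypothetical facts, using edge labels that appear on the slide.
TRIPLES = [
    ("Employee A", "Presented at", "Symposium"),
    ("Customer X", "Attended", "Symposium"),
    ("Customer X", "Employee of", "CIO/G6"),
    ("Employee B", "Attended", "Follow on Meeting"),
    ("Customer X", "Attended", "Follow on Meeting"),
]


def staff_with_contact(org, triples, staff):
    """Find staff who shared an event with someone employed by `org`."""
    # People the graph records as employees of the organization.
    members = {s for s, p, o in triples if p == "Employee of" and o == org}
    # Everything those people are linked to, minus the organization itself.
    events = {o for s, p, o in triples if s in members} - {org}
    # Staff linked to any of the same events.
    return sorted({s for s, p, o in triples if o in events and s in staff})


contacts = staff_with_contact("CIO/G6", TRIPLES, staff={"Employee A", "Employee B"})
```

A single traversal answers the proposal manager's question without searching each source system separately.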

Page 12: Sumeet vij enterprise_knowledge_graph

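Knowledge Discovery using EKG

[Figure: Two flows around a shared Knowledgebase.
Knowledge Discovery: the Proposal Manager asks “Who has information about the Army CIO/G6?”; the steps shown are Parse, Determine Sources for Information, and Query; the sources shown are the CRM, the Opportunity Management System, the Resume System and the Knowledgebase; a node labeled Sumeet Vij also appears.
Knowledge Capture: Trip Reports, Meeting Minutes, etc. are submitted (Submit) through Email Submission or Web Submission, pass through Entity Extraction, and update (Update) the Knowledgebase.]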
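The Parse, Determine Sources for Information, and Query steps can be sketched as a small federated lookup. The catalogs, the source contents, and the function names below are hypothetical stand-ins; the slides name the steps and the source systems but not how each step is implemented.

```python
# Hypothetical concept catalog: surface term -> concept type.
CONCEPT_CATALOG = {"army cio/g6": "Organization"}

# Hypothetical data source catalog: concept type -> sources worth querying.
SOURCE_CATALOG = {
    "Organization": ["CRM", "Opportunity Management System", "Knowledgebase"],
    "Person": ["Resume System", "Knowledgebase"],
}

# Stand-in source contents; real sources would sit behind connectors.
SOURCES = {
    "CRM": {"army cio/g6": ["Account owner: Employee A"]},
    "Opportunity Management System": {},
    "Resume System": {},
    "Knowledgebase": {"army cio/g6": ["Trip report filed by Employee B"]},
}


def parse(question):
    """Parse: find known concept terms in the question."""
    lowered = question.lower()
    return [term for term in CONCEPT_CATALOG if term in lowered]


def determine_sources(term):
    """Determine Sources for Information: pick sources by concept type."""
    return SOURCE_CATALOG.get(CONCEPT_CATALOG[term], [])


def query(question):
    """Query: ask each selected source and keep the non-empty answers."""
    answers = {}
    for term in parse(question):
        for source in determine_sources(term):
            hits = SOURCES[source].get(term, [])
            if hits:
                answers[source] = hits
    return answers


result = query("Who has information about the Army CIO/G6?")
```

Because the question mentions an organization, the Resume System is never queried; the catalog lookup is what keeps the search targeted.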

Page 13: Sumeet vij enterprise_knowledge_graph

Conceptual EKG Architecture

[Figure: Four layers: User Interface Layer, Semantic Processing Layer, Integration Layer and Persistence Layer. Components shown: E-Mail Connector, Database Connector and Web Services Client (Integration Layer); NoSQL Store (Persistence Layer); Entity Extraction, Knowledge Browser, Query UI, Document Upload, Concept Catalog and Data Source Catalog.]

• An open architecture composed of reusable open source components
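One way to read the layered architecture is as a pipeline in which interchangeable connectors feed an extraction step that writes to a store. The sketch below is an assumption-laden illustration of that idea: the `Connector` protocol, `InMemoryEmailConnector`, `DictStore` (standing in for the NoSQL store), and the substring-based `extract_entities` are all hypothetical names, not components named in the slides.

```python
from typing import Iterable, Protocol


class Connector(Protocol):
    """Integration layer contract: anything that can yield raw text."""

    def fetch(self) -> Iterable[str]: ...


class InMemoryEmailConnector:
    """Hypothetical connector that serves pre-loaded message bodies."""

    def __init__(self, messages):
        self.messages = list(messages)

    def fetch(self):
        return iter(self.messages)


class DictStore:
    """Stand-in for the persistence layer's NoSQL store."""

    def __init__(self):
        self.docs = {}

    def save(self, doc_id, record):
        self.docs[doc_id] = record


def extract_entities(text, concept_catalog):
    """Semantic processing: naive substring match against the catalog."""
    lowered = text.lower()
    return sorted(c for c in concept_catalog if c.lower() in lowered)


def run_pipeline(connectors, concept_catalog, store):
    """Pull text from every connector, extract entities, persist the result."""
    count = 0
    for connector in connectors:
        for text in connector.fetch():
            record = {"text": text, "entities": extract_entities(text, concept_catalog)}
            store.save(count, record)
            count += 1
    return count


store = DictStore()
count = run_pipeline(
    [InMemoryEmailConnector(["Briefed the CIO/G6 on ontologies."])],
    ["CIO/G6", "Yammer"],
    store,
)
```

Because `run_pipeline` depends only on the `fetch` method, a database connector or web services client could be added without touching the other layers.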

Page 14: Sumeet vij enterprise_knowledge_graph

Questions?