IBM Watson Knowledge Studio

Watson Knowledge Studio

Cognitive Solutions

Unstructured data

• Every day we produce 2.5 quintillion bytes of new data

• Most of this new data is unstructured data

•Books, journals, health records, e-mails, blogs, tweets, etc.

• Holds many possibilities, but is vastly underutilized due to challenges in understanding and using the data

•Typical organizations only leverage 8% of this data!

Extracting value from unstructured data

• Natural Language Processing (NLP) is a core function for parsing and identifying significant words in language.

• Most organizations need to mine unstructured text for specific information that is unique to their industry or business needs

• Organizations must have the ability to customize the NLP model in order to realize the full value/benefit of mining the unstructured text

• Helps organizations generate business insights

Introducing IBM Watson Knowledge Studio

• Software-as-a-Service (SaaS) offering available exclusively through the IBM Cloud Marketplace

• Intended to accelerate the training and adaptation of Watson with specific industry and organizational domain knowledge

• Leverages state-of-the-art supervised machine learning techniques that allow you to create machine-learning models that understand the linguistic nuances, meaning, and relationships specific to your industry

• http://ibm.biz/ibmwatsonknowledgestudio


• Enables developers and domain experts to collaborate on the creation of custom annotator components that can be used to identify mentions and relations in unstructured text


[Diagram: target users are subject matter experts (SME) and developers (DEV); also shown are Watson Explorer, AlchemyLanguage on WDC, and Analytics Exchange]

Domain Adaptation with Knowledge Studio

Challenges

• Expensive: The process of training machines to extract information from a new domain is fragmented, making it expensive

• Isolated: Isolated development environments make it challenging for domain experts and developers to work together

• Complex: The ambiguous nature of natural language makes it complex for people to program machines

IBM Watson Knowledge Studio

• Collaborative: SMEs work together to infuse domain knowledge in cognitive applications

• Intuitive: Use a guided experience to teach Watson nuances of natural language without writing a single line of code

• Cost effective: Create and deploy domain-knowledge-infused annotators faster than ever before using an integrated development environment

Example: Auto manufacturer

• Use case: Identify safety defects using traffic incident reports

• Solution: Create an NLP model that understands relationships between manufacturer, make, model, type of incident, and date of incident

Watson Knowledge Studio terminology

• An Annotator adds annotations (metadata) to text that appears in natural language content. Applications use these annotations to analyze and process text.

• A Type System is an inventory of everything we want WKS to understand about the unstructured text

•Mentions = any span of text relevant to the current domain

– Example: airbag, child restraint system, etc.

•Entities = group of Mentions that refer to the same thing

– Example: CarMake, AccidentLocation

•Relation = a binary relationship between two entities

– Example: occurredAt defines a relationship between CarMake and

AccidentLocation
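
A type system like the one described above can be pictured as a small data structure. The sketch below is illustrative only: the class names `EntityType`, `RelationType`, and `TypeSystem` are hypothetical and do not reflect how WKS stores or exports a type system. Only the type names (CarMake, AccidentLocation, occurredAt) come from the examples above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EntityType:
    name: str


@dataclass(frozen=True)
class RelationType:
    name: str
    source: EntityType  # first argument of the binary relation
    target: EntityType  # second argument of the binary relation


@dataclass
class TypeSystem:
    """Hypothetical in-memory inventory of entity and relation types."""
    entities: dict = field(default_factory=dict)
    relations: dict = field(default_factory=dict)

    def add_entity(self, name):
        entity = EntityType(name)
        self.entities[name] = entity
        return entity

    def add_relation(self, name, source, target):
        # A relation may only connect entity types already in the inventory.
        for arg in (source, target):
            if arg not in self.entities:
                raise ValueError(f"unknown entity type: {arg}")
        relation = RelationType(name, self.entities[source], self.entities[target])
        self.relations[name] = relation
        return relation


type_system = TypeSystem()
type_system.add_entity("CarMake")
type_system.add_entity("AccidentLocation")
occurred_at = type_system.add_relation("occurredAt", "CarMake", "AccidentLocation")
```

Requiring both entity types to exist before a relation is added mirrors the idea that a relation is defined between two entities in the same inventory.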

Annotation example

John Smith works for IBM. He has been with Big Blue for 20 years.

Entity: PERSON (John Smith)

Entity: ORG (IBM Corp)

Relations: employedBy (two instances)

Creating an Annotator

• Knowledge curation (performed outside of WKS)

•Collect and maintain content relevant to a specific domain

• Ground truth generation

•Produce a collection of vetted data to train Watson on a specific domain

• Annotator component development

•Human annotations used to further train Watson

• Annotator component evaluation

•Determine which documents are promoted to ground truth

• Annotator component deployment

•Export model into machine-learning runtime environments

1 – Create a project

• Defines the resources required to create a machine-learning annotator

• training documents, type system, dictionaries, human annotations

2 – Create a type system

• Inventory of everything you want WKS to understand in unstructured text

•Mentions, entities, relations

3 – Add documents

• Documents that are representative of your domain content (i.e., the corpus)

• Create document sets and assign to human annotators

4 – Pre-annotate using dictionaries

• IBM Bluemix Analytics Exchange provides industry-specific dictionaries that can be used to automatically annotate documents before human annotators review them

•https://console.ng.bluemix.net/data/exchange
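
Dictionary pre-annotation can be thought of as a lookup of known terms in each document. The following is a minimal sketch of that idea, not the WKS implementation: the `DICTIONARY` contents, the `VehiclePart` label, the `pre_annotate` helper, and the sample report sentence are all made up for illustration (the surface forms reuse the mention examples from the terminology slide).

```python
import re

# Hypothetical dictionary: surface form -> entity type label.
DICTIONARY = {
    "airbag": "VehiclePart",
    "child restraint system": "VehiclePart",
}


def pre_annotate(text, dictionary):
    """Return (start, end, surface, entity_type) for every dictionary hit."""
    annotations = []
    for surface, entity_type in dictionary.items():
        # Whole-word, case-insensitive match of the dictionary entry.
        pattern = r"\b" + re.escape(surface) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            annotations.append(
                (match.start(), match.end(), match.group(0), entity_type)
            )
    return sorted(annotations)


# Made-up sentence standing in for a traffic incident report.
report = "The airbag failed to deploy and the child restraint system detached."
hits = pre_annotate(report, DICTIONARY)
```

Human annotators would then confirm, correct, or extend these machine-suggested spans rather than starting from blank documents.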


5 – Annotate documents

• Human annotators use the Ground Truth Editor to apply type system labels to unstructured text

• Multiple users will perform this task across document sets
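
What a human annotation amounts to is a set of labeled text spans plus relations between them. The sketch below records the "John Smith" sentence from the annotation example as character offsets. The dictionary layout and the `span` helper are hypothetical and are not the Ground Truth Editor's storage format. Treating "He" and "Big Blue" as additional mentions of the same two entities is an assumption based on the sentence itself.

```python
text = "John Smith works for IBM. He has been with Big Blue for 20 years."


def span(text, surface, entity_type):
    """Locate the first occurrence of `surface` and label it."""
    start = text.index(surface)
    return {
        "start": start,
        "end": start + len(surface),
        "text": surface,
        "type": entity_type,
    }


mentions = [
    span(text, "John Smith", "PERSON"),
    span(text, "IBM", "ORG"),
    span(text, "He", "PERSON"),      # assumed to refer to John Smith
    span(text, "Big Blue", "ORG"),   # assumed to refer to IBM
]

# Relations point at mentions by their position in the list above.
relations = [
    {"type": "employedBy", "source": 0, "target": 1},
    {"type": "employedBy", "source": 2, "target": 3},
]
```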

6 – Analyze results

• Inter-Annotator Agreement (IAA) scores can be used to determine whether humans are annotating overlapping documents consistently

• Documents with a passing score are promoted to ground truth

7 – Create a machine learning annotator

• Select document sets that will be used to train the annotator

• Can only train using documents that have been promoted to ground truth
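
The agreement check that gates which documents reach ground truth, and therefore which documents can be used for training, can be sketched as follows. This is a simplified illustration that assumes exact-match span F1 between two annotators; the scoring WKS actually uses is not described here, and the `THRESHOLD` value, the document names, and the spans are made up.

```python
def agreement_f1(spans_a, spans_b):
    """F1 overlap between two annotators' sets of (start, end, type) spans."""
    set_a, set_b = set(spans_a), set(spans_b)
    if not set_a and not set_b:
        return 1.0  # both annotators agree there is nothing to label
    matched = len(set_a & set_b)
    if matched == 0:
        return 0.0
    precision = matched / len(set_b)
    recall = matched / len(set_a)
    return 2 * precision * recall / (precision + recall)


THRESHOLD = 0.8  # hypothetical passing score

# Each document maps to (annotator A spans, annotator B spans).
documents = {
    "report-1": (
        [(0, 4, "CarMake"), (20, 26, "AccidentLocation")],
        [(0, 4, "CarMake"), (20, 26, "AccidentLocation")],
    ),
    "report-2": (
        [(0, 4, "CarMake"), (10, 16, "AccidentLocation")],
        [(0, 4, "CarMake"), (30, 36, "AccidentLocation")],
    ),
}

scores = {doc: agreement_f1(a, b) for doc, (a, b) in documents.items()}
# Only documents with a passing score become eligible training data.
ground_truth = [doc for doc, score in scores.items() if score >= THRESHOLD]
```

Here the annotators agree fully on the first report and on only one of two spans in the second, so only the first would be promoted under the assumed threshold.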

End-to-end domain adaptation

Watson Knowledge Studio trial

• http://ibm.biz/ibmwatsonknowledgestudio

• Free 30-day trial

– 5 authorized users

– 10 projects

– Leverage artifacts from IBM Analytics Exchange

– Deploy models directly to the Watson Developer Cloud


AlchemyLanguage

• A collection of APIs that offer text analysis using NLP

• Helps you understand sentiment, keywords, entities, concepts, and more

• Available via the Watson Developer Cloud

• Knowledge domains

•By default, uses a public IBM-provided model

– Trained using billions of English websites and news content

•May use custom domain models created using WKS

AlchemyLanguage: custom domain model

• In Bluemix:

•Create AlchemyLanguage service instance using Advanced plan

•Obtain AlchemyAPI key

• In Watson Knowledge Studio:

•Create and train custom annotator model

•Deploy model to AlchemyLanguage service instance using API key

• In your application

•Based on the desired text analysis, choose the API(s) relevant to your app

•Append the following parameter/payload to API request(s):

model=name_of_model
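
Appending the `model` parameter to a request might look like the sketch below. It only builds the request URL and makes no network call. The endpoint in `BASE_URL`, the `apikey`, `text`, and `outputMode` parameter names, and the sample text are assumptions for illustration and should be checked against the credentials and documentation of your own service instance; only `model=name_of_model` comes from the slide above.

```python
from urllib.parse import urlencode

# Assumed endpoint for an entity-extraction call; verify before use.
BASE_URL = (
    "https://gateway-a.watsonplatform.net/calls/text/TextGetRankedNamedEntities"
)


def build_request_url(api_key, text, model=None):
    """Build the request URL, adding `model` only for a custom model."""
    params = {"apikey": api_key, "text": text, "outputMode": "json"}
    if model is not None:
        # Omitting this parameter falls back to the public default model.
        params["model"] = model
    return BASE_URL + "?" + urlencode(params)


url = build_request_url(
    "YOUR_API_KEY",            # placeholder for the AlchemyAPI key
    "The Ford lost control.",  # made-up sample text
    model="name_of_model",
)
```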

AlchemyLanguage API demo

• URL: https://alchemy-language-demo.mybluemix.net/

{
  "count": "2",
  "emotions": {
    "anger": "0.396646",
    "disgust": "0.602397",
    "fear": "0.502285",
    "joy": "0.020129",
    "sadness": "0.074501"
  },
  "sentiment": {
    "score": "-0.185101",
    "type": "negative"
  },
  "text": "Ford",
  "type": "MANUFACTURER"
}

• Review API results using different models

•Public: understands general entities such as “automobile”

•Custom: trained on traffic incident reports and will find specific entities such as part of car, accident outcome, or impact

Sentiment analysis on custom entities
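
To act on the sentiment reported for a custom entity, an application has to parse the entity object from the response. The sketch below parses the demo output shown earlier. Note that the numeric values arrive as strings and need converting. The `summarize` helper and the keys of the summary it returns are hypothetical.

```python
import json

# Entity object copied from the demo output in this deck.
response_entity = json.loads("""
{
  "count": "2",
  "emotions": {
    "anger": "0.396646",
    "disgust": "0.602397",
    "fear": "0.502285",
    "joy": "0.020129",
    "sadness": "0.074501"
  },
  "sentiment": {"score": "-0.185101", "type": "negative"},
  "text": "Ford",
  "type": "MANUFACTURER"
}
""")


def summarize(entity):
    """Convert string-encoded numbers and pick the strongest emotion."""
    emotions = {name: float(value) for name, value in entity["emotions"].items()}
    return {
        "entity": entity["text"],
        "type": entity["type"],
        "mentions": int(entity["count"]),
        "sentiment": float(entity["sentiment"]["score"]),
        "sentiment_label": entity["sentiment"]["type"],
        "dominant_emotion": max(emotions, key=emotions.get),
    }


summary = summarize(response_entity)
```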