IBM Watson Knowledge Studio
Watson Knowledge Studio
Cognitive Solutions
Unstructured data
• Every day we produce 2.5 quintillion bytes of new data
• Most of this new data is unstructured data
•Books, journals, health records, e-mails, blogs, tweets, etc.
• Holds many possibilities, but is vastly underutilized due to challenges in
understanding and using the data
•Typical organizations only leverage 8% of this data!
Extracting value from unstructured data
• Natural Language Processing (NLP) is a
core function for parsing and identifying
significant words in language.
• Most organizations need to mine unstructured text for specific
information that is unique to their industry or business needs
• Organizations must have the ability to customize the NLP model in order
to realize the full value/benefit of mining the unstructured text
• Helps organizations generate business insights
Introducing IBM Watson Knowledge Studio
• Software-as-a-Service (SaaS) offering available exclusively through the
IBM Cloud Marketplace
• Intended to accelerate the training and adaptation of Watson with specific
industry and organizational domain knowledge
• Leverages state-of-the-art supervised machine learning techniques that
allow you to create machine-learning models that understand the
linguistic nuances, meaning, and relationships specific to your industry
• http://ibm.biz/ibmwatsonknowledgestudio
Watson Knowledge Studio
• Enables developers and domain experts to collaborate on the creation of
custom annotator components that can be used to identify mentions and
relations in unstructured text
[Diagram: Watson Knowledge Studio, with Watson Explorer, AlchemyLanguage on WDC, and Analytics Exchange; target users: SME and DEV]
Domain Adaptation with Knowledge Studio
Challenges
• Expensive: Process of training machines to extract information from a new
domain is fragmented, making it expensive
• Isolated: Isolated development environments make it challenging for
domain experts & developers to work together
• Complex: Ambiguous nature of natural language makes it complex for
people to program machines
IBM Watson Knowledge Studio
• Collaborative: SMEs work together to infuse domain knowledge in
cognitive applications
• Intuitive: Use a guided experience to teach Watson nuances of natural
language without writing a single line of code
• Cost Effective: Create and deploy domain knowledge infused annotators
faster than ever before using an integrated development environment
Example: Auto manufacturer
• Use case: Identify safety defects using traffic incident reports
• Solution: Create an NLP model that understands relationships between
manufacturer, make, model, type of incident, and date of incident
Watson Knowledge Studio terminology
• An Annotator adds annotations (metadata) to text that appears in natural
language content. Used by applications to analyze and process text.
• A Type System is an inventory of everything we want WKS to
understand about the unstructured text
•Mentions = any span of text relevant to the current domain
– Example: airbag, child restraint system, etc.
•Entities = group of Mentions that refer to the same thing
– Example: CarMake, AccidentLocation
•Relation = a binary relationship between two entities
– Example: occurredAt defines a relationship between CarMake and
AccidentLocation
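The terminology above can be sketched as a small data structure. This is only an illustration of the idea of a type system, not the format WKS uses internally; the class names `TypeSystem` and `RelationType` and the method `add_relation` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RelationType:
    """A binary relation between two entity types (hypothetical structure)."""
    name: str
    source: str  # entity type on one side of the relation
    target: str  # entity type on the other side of the relation


@dataclass
class TypeSystem:
    """Inventory of the entity and relation types a model should recognize."""
    entity_types: set = field(default_factory=set)
    relation_types: list = field(default_factory=list)

    def add_relation(self, name, source, target):
        # A relation may only connect entity types already in the inventory.
        for entity in (source, target):
            if entity not in self.entity_types:
                raise ValueError(f"unknown entity type: {entity}")
        relation = RelationType(name, source, target)
        self.relation_types.append(relation)
        return relation


# Entity and relation names taken from the slide's examples.
type_system = TypeSystem(entity_types={"CarMake", "AccidentLocation"})
occurred_at = type_system.add_relation("occurredAt", "CarMake", "AccidentLocation")
print(occurred_at)
```

Rejecting relations that reference unknown entity types keeps the inventory consistent as it grows.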
Annotation example
John Smith works for IBM. He has been with Big Blue for 20 years.
• Entity: PERSON (John Smith)
• Entity: ORG (IBM Corp)
• Relation: employedBy (two instances)
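One way to picture the annotations on this sentence is as character spans plus links between them. The sketch below assumes that "He" is grouped with the John Smith entity, that "Big Blue" is grouped with the IBM Corp entity, and that the two employedBy relations connect those mention pairs; the slide does not spell this out. The `Mention` class and `mention` helper are hypothetical.

```python
from dataclasses import dataclass

TEXT = "John Smith works for IBM. He has been with Big Blue for 20 years."


@dataclass(frozen=True)
class Mention:
    text: str
    begin: int  # character offset where the span starts
    end: int    # character offset just past the span
    entity_type: str


def mention(surface, entity_type):
    """Locate the first occurrence of `surface` in TEXT and wrap it as a Mention."""
    begin = TEXT.index(surface)
    return Mention(surface, begin, begin + len(surface), entity_type)


john, he = mention("John Smith", "PERSON"), mention("He", "PERSON")
ibm, big_blue = mention("IBM", "ORG"), mention("Big Blue", "ORG")

# Entities group mentions that refer to the same thing (assumed grouping).
entities = {
    "John Smith": [john, he],
    "IBM Corp": [ibm, big_blue],
}

# Each relation links two mentions: (relation type, source, target).
relations = [
    ("employedBy", john, ibm),
    ("employedBy", he, big_blue),
]

for name, source, target in relations:
    print(f"{name}({source.text} -> {target.text})")
```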
Creating an Annotator
• Knowledge curation (performed outside of WKS)
•Collect and maintain content relevant to a specific domain
• Ground truth generation
•Produce a collection of vetted data to train Watson on a specific domain
• Annotator component development
•Human annotations used to further train Watson
• Annotator component evaluation
•Determine which documents are promoted to ground truth
• Annotator component deployment
•Export model into machine-learning runtime environments
1 – Create a project
• Defines the resources required to create a machine-learning annotator
•Training documents, type system, dictionaries, human annotations
2 – Create a type system
• Inventory of everything you want WKS to understand in unstructured text
•Mentions, entities, relations
3 – Add documents
• Documents that are representative of your domain content (i.e., the corpus)
• Create document sets and assign to human annotators
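Splitting a corpus into document sets might look like the sketch below. It assumes every annotator receives a shared slice of the corpus (so their work can later be compared on overlapping documents) plus a disjoint share of the rest. The function name, the `overlap` fraction, and the annotator names are all hypothetical, and WKS does this through its UI rather than through code.

```python
import random


def make_document_sets(doc_ids, annotators, overlap=0.2, seed=0):
    """Give every annotator the same shared subset plus a disjoint share of the rest."""
    rng = random.Random(seed)
    shuffled = list(doc_ids)
    rng.shuffle(shuffled)
    n_shared = round(len(shuffled) * overlap)
    shared, rest = shuffled[:n_shared], shuffled[n_shared:]
    sets = {}
    for i, name in enumerate(annotators):
        # Deal the non-shared documents out round-robin.
        sets[name] = shared + rest[i::len(annotators)]
    return sets


corpus = [f"report-{n:03d}" for n in range(10)]
document_sets = make_document_sets(corpus, ["alice", "bob"])
for name, docs in document_sets.items():
    print(name, docs)
```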
4 – Pre-annotate using dictionaries
• IBM Bluemix Analytics Exchange provides industry-specific dictionaries
that can be used to automatically annotate documents before human annotators begin
•https://console.ng.bluemix.net/data/exchange
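Dictionary pre-annotation can be approximated as a term lookup over the text. The sketch below is a simplified stand-in for what a dictionary annotator does, not the WKS implementation; the `pre_annotate` function and the `PartOfCar` entity type are hypothetical, and the dictionary terms come from the mention examples earlier in the deck.

```python
import re


def pre_annotate(text, dictionary):
    """Return (begin, end, surface, entity_type) for each dictionary term found in text."""
    annotations = []
    for term, entity_type in dictionary.items():
        # Whole-word, case-insensitive match; re.escape keeps terms literal.
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            annotations.append((match.start(), match.end(), match.group(), entity_type))
    return sorted(annotations)


dictionary = {"airbag": "PartOfCar", "child restraint system": "PartOfCar"}
report = "The airbag did not deploy and the child restraint system failed."
pre_annotations = pre_annotate(report, dictionary)
for annotation in pre_annotations:
    print(annotation)
```

Human annotators would then correct and extend these spans rather than start from an empty document.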
5 – Annotate documents
• Human annotators use the Ground Truth Editor to apply type system
labels to unstructured text
• Multiple users will perform this task across document sets
6 – Analyze results
• Inter-Annotator Agreement (IAA) scores can be used to determine
whether humans are annotating overlapping documents consistently
• Documents with a passing score are promoted to ground truth
7 – Create a machine learning annotator
• Select document sets that will be used to train the annotator
• Can only train using documents that have been promoted to ground truth
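The agreement check and promotion to ground truth described above can be sketched as follows. The example scores agreement as pairwise F1 over exact span matches, which is one common way to compare annotators; the slides do not say how WKS computes IAA, and the `THRESHOLD` value, document IDs, and spans are made up for illustration.

```python
def pairwise_f1(annotations_a, annotations_b):
    """F1 agreement between two annotators' sets of (begin, end, type) spans."""
    if not annotations_a and not annotations_b:
        return 1.0  # both annotators agree that nothing should be labeled
    matches = len(annotations_a & annotations_b)
    if matches == 0:
        return 0.0
    precision = matches / len(annotations_a)
    recall = matches / len(annotations_b)
    return 2 * precision * recall / (precision + recall)


THRESHOLD = 0.8  # hypothetical passing score

# Overlapping documents, each annotated by two people: doc id -> (spans A, spans B).
overlap = {
    "report-001": ({(4, 10, "PartOfCar"), (34, 56, "PartOfCar")},
                   {(4, 10, "PartOfCar"), (34, 56, "PartOfCar")}),
    "report-002": ({(0, 4, "CarMake"), (20, 28, "AccidentLocation")},
                   {(0, 4, "CarMake")}),
}

scores = {doc: pairwise_f1(a, b) for doc, (a, b) in overlap.items()}
# Only documents that pass the agreement check become training candidates.
ground_truth = [doc for doc, score in scores.items() if score >= THRESHOLD]
print(scores)
print(ground_truth)
```

Documents that fall below the threshold would go back to the annotators to resolve their disagreements before being used for training.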
End-to-end domain adaptation
Watson Knowledge Studio trial
• http://ibm.biz/ibmwatsonknowledgestudio
• Free 30-day trial
– 5 authorized users
– 10 projects
– Leverage artifacts from IBM Analytics Exchange
– Deploy models directly to the Watson Developer Cloud
AlchemyLanguage
• A collection of APIs that offer text analysis using NLP
• Helps you understand sentiment, keywords, entities,
concepts, and more
• Available via the Watson Developer Cloud
• Knowledge domains
•By default, uses a public, IBM-provided model
– Trained using billions of English websites and news content
•May use custom domain models created using WKS
AlchemyLanguage: custom domain model
• In Bluemix:
•Create AlchemyLanguage service instance using Advanced plan
•Obtain AlchemyAPI key
• In Watson Knowledge Studio:
•Create and train custom annotator model
•Deploy model to AlchemyLanguage service instance using API key
• In your application
•Based on the desired text analysis, choose the API(s) relevant to your app
•Append the following parameter/payload to API request(s):
model=name_of_model
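Appending the `model` parameter to a request might look like the sketch below, which only builds the URL and does not send it. The endpoint, the `apikey`, `text`, and `outputMode` parameter names, and the `build_request_url` helper are assumptions that should be checked against the AlchemyLanguage API reference; only `model=name_of_model` comes from the slide.

```python
from urllib.parse import urlencode

# Assumed entity-extraction endpoint; confirm against the AlchemyLanguage API reference.
BASE_URL = "https://gateway-a.watsonplatform.net/calls/text/TextGetRankedNamedEntities"


def build_request_url(api_key, text, model=None):
    """Build a request URL, adding the custom model only when one is named."""
    params = {"apikey": api_key, "text": text, "outputMode": "json"}
    if model is not None:
        # Without this parameter the public model is used.
        params["model"] = model
    return BASE_URL + "?" + urlencode(params)


url = build_request_url("YOUR_API_KEY", "The airbag did not deploy.", model="name_of_model")
print(url)
```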
AlchemyLanguage API demo
• URL: https://alchemy-language-demo.mybluemix.net/
{
  "count": "2",
  "emotions": {
    "anger": "0.396646",
    "disgust": "0.602397",
    "fear": "0.502285",
    "joy": "0.020129",
    "sadness": "0.074501"
  },
  "sentiment": {
    "score": "-0.185101",
    "type": "negative"
  },
  "text": "Ford",
  "type": "MANUFACTURER"
}
• Review API results using different models
•Public: understands general entities such
as “automobile”
•Custom: trained on traffic incident reports
and will find specific entities such as part
of car, accident outcome, or impact
Sentiment analysis on custom entities
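Reading sentiment and emotion for a custom entity from a response like the one in the demo can be sketched as below. The JSON is the single entity object from the slide, embedded here so the example is self-contained; a real response would likely wrap such objects in a larger structure, which is not shown in the deck.

```python
import json

RESPONSE = """
{
  "count": "2",
  "emotions": {
    "anger": "0.396646",
    "disgust": "0.602397",
    "fear": "0.502285",
    "joy": "0.020129",
    "sadness": "0.074501"
  },
  "sentiment": {
    "score": "-0.185101",
    "type": "negative"
  },
  "text": "Ford",
  "type": "MANUFACTURER"
}
"""

entity = json.loads(RESPONSE)

# Numeric values arrive as strings, so convert them before comparing.
emotions = {name: float(value) for name, value in entity["emotions"].items()}
dominant = max(emotions, key=emotions.get)
score = float(entity["sentiment"]["score"])

print(entity["text"], entity["type"], entity["sentiment"]["type"], score, dominant)
```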
Watson Knowledge Studio
http://ibm.biz/ibmwatsonknowledgestudio