using construction grammar in conversational systems

Using Construction Grammar in Conversational Systems Marie-Claire Jenkins, PhD Thesis (High level overview)

Upload: cj-jenkins

Post on 11-May-2015


DESCRIPTION

PhD Thesis Overview - This is a very high level presentation about my PhD research.

TRANSCRIPT

Page 1: Using construction grammar in conversational systems

Using Construction Grammar in Conversational Systems

Marie-Claire Jenkins, PhD Thesis

(High level overview)

Page 2: Using construction grammar in conversational systems

Overview

This thesis was motivated by the limitations of machines in understanding natural language and in forming responses. The limitations and complexity of current search engine querying were also a factor.

Conversational systems are good for testing possible solutions and are useful on the web.

We used methods that are not common in these systems:

- Construction Grammar (CxG)
- OWL ontologies
- Lexical semantics
- A new stemmer (UEA-Lite)

Page 3: Using construction grammar in conversational systems

What I'm going to talk about

• Conversational systems: what they are and how they work & what their limitations are

•  The Turing test and the Loebner prize

•  2 early experimental systems that we built

• OWL ontologies vs databases

•  Construction grammar and Fluid construction grammar

•  UEA-Lite stemmer

• Machine learning component

•  KIA system diagram

• Evaluation methods and learnings

Page 4: Using construction grammar in conversational systems

Things I covered in my research:

- Natural language understanding
- Natural language generation
- Human computer interaction
- Service oriented systems

Things I didn't cover in my research:

- Knowledge acquisition
- Open domains
- Affective behaviour
- Everything else

Page 5: Using construction grammar in conversational systems

Conversational systems

They are more commonly referred to as "chatbots" or "Artificial Conversational Entities".

They converse with a user in natural language and simulate a human-human conversation.

They need to:

- "Understand" the user input
- Retrieve relevant information
- Generate a natural language response

There are 3 different kinds of chatbots...

Page 6: Using construction grammar in conversational systems

Social chatbots

Their purpose is to chat freely about anything at all with a user, much like you would with a friend. They are used online for fun.

Page 7: Using construction grammar in conversational systems

Educational chatbots

Their purpose is to help the user learn about something such as a new language, history or geography. They are often used in schools.

Page 8: Using construction grammar in conversational systems

Service oriented chatbots

Their purpose is to help customers find their way around the website and also to answer questions about their products & services.

Page 9: Using construction grammar in conversational systems

How they work

There are a variety of methods used but the most popular are:

- Database driven
- AIML (Artificial Intelligence Markup Language, XML-based)
- Canned responses
- Stochastic methods
- Supervised learning
- Named entity recognition
- Templates
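As a rough illustration of the canned-response and template methods listed above, here is a minimal Python sketch of wildcard pattern matching. It is written in the spirit of AIML wildcards but is not AIML itself; the rules, the responses and the function names are all hypothetical.

```python
import re

# Hypothetical rules: (pattern, response template). "*" is a wildcard
# whose matched text can be reused in the response via {0}.
RULES = [
    ("WHAT ARE YOU", "I am a chatbot."),                      # canned response
    ("DO YOU SELL *", "Let me check whether we sell {0}."),   # template
    ("*", "Sorry, I did not understand that."),               # catch-all
]


def normalise(text):
    """Upper-case the input and drop punctuation, as pattern matchers often do."""
    return re.sub(r"[^A-Z0-9 ]", "", text.upper()).strip()


def pattern_to_regex(pattern):
    """Turn a wildcard pattern into an anchored regular expression."""
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("^" + "(.+)".join(parts) + "$")


def respond(user_input):
    """Return the response of the first rule whose pattern matches."""
    text = normalise(user_input)
    for pattern, template in RULES:
        match = pattern_to_regex(pattern).match(text)
        if match:
            captured = [group.strip().lower() for group in match.groups()]
            return template.format(*captured)
    return ""  # nothing matched (only possible for empty input)


print(respond("Do you sell red shoes?"))
```

Every new way of phrasing a question needs a new rule, which is one reason this kind of approach scales poorly.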

Page 10: Using construction grammar in conversational systems

Phrase-based systems

"Phrase-based systems" can be seen as generalized templates at the sentence level (like phrase structure rules) or at the discourse level.

1 - A phrasal pattern is selected [subject noun verb]

2 - Each part of the pattern is expanded [noun modifiers]

3 - Generation ends when every phrasal pattern has been replaced by one or more words

They are very difficult to build because the phrasal interrelationships must be clearly specified; otherwise there can be inappropriate phrase expansions.
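The three expansion steps above can be sketched in a few lines of Python. This is only a toy under stated assumptions: the grammar, its symbol names and its vocabulary are invented for illustration and do not come from the thesis.

```python
import random

# Hypothetical toy grammar: each phrasal pattern maps to its possible expansions.
# Anything that is not a key of GRAMMAR is treated as a plain word.
GRAMMAR = {
    "SENTENCE": [["SUBJECT", "VERB", "OBJECT"]],
    "SUBJECT": [["the", "NOUN"]],
    "OBJECT": [["a", "MODIFIER", "NOUN"], ["a", "NOUN"]],
    "NOUN": [["customer"], ["product"]],
    "MODIFIER": [["new"], ["small"]],
    "VERB": [["wants"], ["returns"]],
}


def expand(symbol, rng):
    """Recursively replace a phrasal pattern until only words remain."""
    if symbol not in GRAMMAR:
        return [symbol]                      # step 3: a word, nothing left to expand
    words = []
    for part in rng.choice(GRAMMAR[symbol]):  # step 1: select a pattern
        words.extend(expand(part, rng))       # step 2: expand each part
    return words


def generate(seed=0):
    """Generate one sentence; the seed makes the random choices repeatable."""
    rng = random.Random(seed)
    return " ".join(expand("SENTENCE", rng))


print(generate(1))
```

Because nothing constrains which noun fills which slot, this grammar can also produce sentences such as "the product wants a customer", which is the kind of inappropriate expansion mentioned above.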

Page 11: Using construction grammar in conversational systems

Feature-based systems

In "feature-based systems" each possible alternative is represented by a feature, and each sentence is specified by a set of features.

Sentence generation is achieved by selecting features until the sentence is fully determined.

Features may include: positive/negative, past/present, statement/question…

Strength: any distinction in language can be a feature.

Weakness: it is very hard to maintain feature inter-relationships and to control feature selection.
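A minimal Python sketch of the idea, assuming only the three example features named above (positive/negative, past/present, statement/question). The function name, the verb table and the English rules are illustrative assumptions, not the design of any particular system.

```python
# Hypothetical verb entry: the forms needed by the three features below.
ORDER = {"base": "order", "present": "orders", "past": "ordered"}


def realise(subject, verb, obj, negative=False, past=False, question=False):
    """Build a sentence from a subject, verb and object plus three features."""
    aux = "did" if past else "does"
    if question:
        # Questions put the auxiliary first; negation is contracted onto it.
        words = [aux + ("n't" if negative else ""), subject, verb["base"], obj]
        end = "?"
    elif negative:
        words = [subject, aux, "not", verb["base"], obj]
        end = "."
    else:
        words = [subject, verb["past"] if past else verb["present"], obj]
        end = "."
    sentence = " ".join(words) + end
    return sentence[0].upper() + sentence[1:]


print(realise("the user", ORDER, "a book", negative=True, past=True))
```

Even with three features the branches already depend on one another (the tense feature is expressed differently once negation or a question is selected), which hints at why feature inter-relationships become hard to maintain.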

Page 12: Using construction grammar in conversational systems

Observations from live data

Tests on dialogue from the human-human customer service system on a large commercial website reveal that there is no consistency in language or phrase formulation.

There is only a very small amount of formulaic language (canned responses).

Apart from the formulaic language, no question was ever formulated in the same way twice, and none was answered in the same way twice.

This makes it hard for us to produce templates or anticipate user utterances.

Page 13: Using construction grammar in conversational systems

More Limitations

Main issues with existing systems:

- Scalability
- Knowledge & information storage
- User input disambiguation
- Response generation (word order, vocabulary, etc.)
- Knowledge/information retrieval
- Anaphora
- Managing the dialogue
- Displaying appropriate behaviour (affective issues)
- Knowledge assimilation
- Evaluation

Page 14: Using construction grammar in conversational systems

Turing test

“A machine is termed capable of thinking if it can, under certain prescribed conditions imitate a human by answering questions sufficiently well to deceive a human questioner for a reasonable period of time.” (Turing)

Objections to the test include proving intelligence, "understanding" and other things.

My personal opinion has changed since the beginning of my PhD research:

“The question of whether a computer can think is no more interesting than the question of whether a submarine can swim.” (Dijkstra)

Page 15: Using construction grammar in conversational systems

Turing test illustration (image: Wikipedia)

Page 16: Using construction grammar in conversational systems

[Image: XKCD comic]

Page 17: Using construction grammar in conversational systems

Loebner prize

This yearly contest is run by Hugh Loebner, who has offered a $100,000 prize for the first chatbot to pass the Turing test.

This test is controversial. Marvin Minsky said:

“I do hope that someone will volunteer to violate this proscription so that Mr. Loebner will indeed revoke his stupid prize, save himself some money, and spare us the horror of this obnoxious and unproductive annual publicity campaign.”

Page 18: Using construction grammar in conversational systems

Loebner prize diagram

Michael Mauldin, Carnegie Mellon

Page 19: Using construction grammar in conversational systems

John

We built a conversational chatbot and entered it into the Loebner prize (2006). It was designed & built in 2 months and operated on a closed domain.

Reason: to run on a small database requiring little manual labour. We used n-grams, weighted responses, a vector approach, Perl, Brill, UEA-Lite, wildcards and AIML.

We were a finalist and we learned that:

- A small database worked for a small amount of time
- A database system makes for a laborious build and limited information (well used systems work much better)
- Template methods are limited
- Canned responses are awkward
- AIML is restrictive

Page 20: Using construction grammar in conversational systems

KIA: the HCI tests

We designed a system to research human-machine interaction and human behaviour: this is a test on humans and not on the system.

We included functions that were meant to test user persistence with query repair, emotive response, language, etc.

Results: users persist, are emotive, sensitive to interface design and more.

Details available in our paper

Page 21: Using construction grammar in conversational systems

KIA – a CxG & OWL driven system

Page 22: Using construction grammar in conversational systems

Databases vs OWL ontologies:

Databases focus on local semantics and ontologies on global semantics.

In ontologies the semantics are explicit and in databases implicit.

Ontologies allow data to be reused whereas database schemas cannot be reused.

Ontologies are portable between websites, which facilitates maintenance and construction.

Restrictions in databases do not allow for all of the necessary relations to be built into the data.

Page 23: Using construction grammar in conversational systems

Database (Wordpress Bits)

OWL Ontology (Richard Durban)

Page 24: Using construction grammar in conversational systems

OWL flavour

We used OWL (Web Ontology Language) as it is more expressive than other semantic web languages and is built to enable ontologies to be created easily.

It is a semantic markup language and an extension of RDF (Resource Description Framework).

There are different subsets of OWL: OWL Full, OWL Lite and OWL DL (Description Logic).

We chose to use OWL DL.

Page 25: Using construction grammar in conversational systems

Why Ontologies & why OWL DL?

Taxonomies are not as expansive as ontologies.

“At one extreme there are ontologies and the other mind maps and pathfinder networks, and in between taxonomies and browsable hierarchies”. (Brewster and Wilks)

Ontologies have a greater potential for inference and a greater degree of formality.

OWL DL has stricter restrictions which are necessary in our type of system.

It has maximum expressiveness without losing the computational completeness (all entailments will be computed) and decidability (all computations will finish in finite time) of reasoning systems.
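To make the point about inference concrete, here is a small Python sketch that stores facts as explicit subject/predicate/object triples and derives a fact that was never stated directly. It is not an OWL DL reasoner, and the class and property names are hypothetical examples.

```python
# Hypothetical facts stored as (subject, predicate, object) triples.
TRIPLES = {
    ("Koala", "subClassOf", "Marsupial"),
    ("Marsupial", "subClassOf", "Animal"),
    ("Koala", "eats", "EucalyptusLeaves"),
}


def superclasses(cls, triples):
    """Collect every class reachable from cls by following subClassOf links."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "subClassOf" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def is_a(cls, ancestor, triples):
    """True if cls is the same as ancestor or one of its (inferred) subclasses."""
    return cls == ancestor or ancestor in superclasses(cls, triples)


# "Koala is an Animal" is not stored anywhere, but it can be inferred.
print(is_a("Koala", "Animal", TRIPLES))
```

In a relational schema the same relationship would usually live implicitly in table design and application code, which is the contrast with explicit ontology semantics drawn earlier.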

Page 26: Using construction grammar in conversational systems

OWL Ontology example: Koala

Page 27: Using construction grammar in conversational systems

What do we store in there?

- All of the domain knowledge (e.g. all about koalas)

- The collection of constructions (commonly used when discussing koalas)

- Canned responses (formulaic language)

Page 28: Using construction grammar in conversational systems

KIA system domain knowledge

Page 29: Using construction grammar in conversational systems

Construction Grammar 

It is a cognitive linguistic method and it:

- Is constraint based
- Is generative
- Is non-derivational
- Is a monostratal grammatical model
- Incorporates the cognitive and interactional foundations of language
- Consists of taxonomies of families of constructions
- Uses entire constructions as the primary unit of grammar
- Is a pairing of form and meaning (metonomic)
- Uses frames that are not the same as regular frames, because the argument structure types invoke frames which designate event types
- Treats the construction itself, not the verb alone, as the main unit of meaning

Page 30: Using construction grammar in conversational systems

Constructions make sense in computing

[Diagram labels: Constructions, Words, Sentences]

Page 31: Using construction grammar in conversational systems

Example of CxG

Semantics: relational predicate involving a singer

Syntactics: predicate requires arguments and "Heather" is the subject

Generative Grammar

Construction Grammar

Page 32: Using construction grammar in conversational systems

Advantages of CxG

- Adapts to changing language patterns easily

- Takes into consideration both semantics and syntactics

- Constructions are easier to manage than words as the atomic unit

- Allows for integration into bigger collections of constructions

- Can be computed

Page 33: Using construction grammar in conversational systems

UEA-Lite stemmer

After testing the system with all available stemmers, we realised that we needed to design our own to facilitate topic/construction detection.

UEA-Lite stems conservatively to orthographically correct word forms and recognizes words which do not need to be stemmed.

There are Perl, Java and Ruby versions.

More information here (an updated paper to follow soon)
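To illustrate what "conservative" stemming means, here is a toy Python sketch. It is emphatically not the UEA-Lite rule set: the handful of plural rules and the function name are assumptions made up for this example. It shows the two behaviours described above, producing real word forms and leaving alone words that do not need stemming.

```python
def conservative_stem(word):
    """Strip a few plural endings, returning a correctly spelled word form."""
    # Leave short words and capitalised words (likely proper nouns) untouched.
    if len(word) <= 3 or word[0].isupper():
        return word
    # Words ending in these letters are usually singular already.
    if word.endswith(("ss", "us", "is")):
        return word
    if word.endswith("ies"):
        return word[:-3] + "y"          # queries -> query
    if word.endswith(("ches", "shes", "xes", "sses")):
        return word[:-2]                # boxes -> box, classes -> class
    if word.endswith("s"):
        return word[:-1]                # koalas -> koala
    return word


for example in ["koalas", "queries", "classes", "status", "Paris"]:
    print(example, "->", conservative_stem(example))
```

An aggressive stemmer would typically reduce "queries" to a non-word such as "queri"; keeping real word forms makes the stems easier to match against topics and constructions.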

Page 34: Using construction grammar in conversational systems

Machine learning 

It identifies constructions (NP or VP). The syntactic pole and the semantic pole feed in information so that constructions are loaded with both meaning and form information.

The machine learning engine finds sets of constructions which commonly work in conjunction with each other or that have been used in conjunction in the past.

The weights are adjusted each time a new construction is added. This happens when the system encounters a new instance.

The engine runs through this data and calculates the probability that the right matches to the query information will be found.

Page 35: Using construction grammar in conversational systems

Algorithms

- Jaccard Distance to weight the constructions (how often different constructions are found in conjunction, partial or complete)

- Naive Bayes algorithm clusters all of the constructions according to their different features in our training set (requires little training data)

Once the data has been processed through the Naive Bayes algorithm we know which constructions are often found with others, and in what order. We not only look at the syntax but also at the semantic aspect both in isolation and in conjunction with each other.

The role of the classifier is to determine which categories future constructions belong to, and also to tell us which constructions are a likely match to a query.
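The Jaccard weighting can be sketched as follows. The slides do not say how constructions are represented, so this sketch assumes that each construction is described by the set of utterance IDs in which it was observed; the variable names and the data are hypothetical.

```python
def jaccard_distance(a, b):
    """Jaccard distance: 1 - |A intersect B| / |A union B| (0 = identical sets)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0                      # two empty sets are treated as identical
    return 1.0 - len(a & b) / len(a | b)


# Hypothetical data: the utterances in which each construction appeared.
seen_in = {
    "NP:koala": {1, 2, 3},
    "VP:eat": {2, 3, 4},
    "VP:live": {7, 8},
}

# A small distance means the two constructions are often found together.
print(jaccard_distance(seen_in["NP:koala"], seen_in["VP:eat"]))
print(jaccard_distance(seen_in["NP:koala"], seen_in["VP:live"]))
```

A partial overlap gives a value strictly between 0 and 1, which matches the "partial or complete" co-occurrence mentioned above.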

Page 36: Using construction grammar in conversational systems

Naïve Bayes for CxG

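P(constructions) does not depend on the category, so it can be ignored when comparing categories. Naive Bayes estimates a multinomial distribution over categories, which is the prior distribution of categories, P(cat). We can therefore say that:

\[ \text{best category} = \arg\max_{\mathit{cat} \in \mathit{cats}} P(\mathit{constructions} \mid \mathit{cat})\, P(\mathit{cat}) \]

If c_1, c_2, ..., c_n are the constructions in the document, then:

\[ \text{best category} = \arg\max_{\mathit{cat} \in \mathit{cats}} P(c_1 \mid \mathit{cat})\, P(c_2 \mid \mathit{cat}) \cdots P(c_n \mid \mathit{cat})\, P(\mathit{cat}) \]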
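The category selection above can be sketched in Python as follows. This is a generic multinomial Naive Bayes, not the KIA implementation: the add-one (Laplace) smoothing, the use of log probabilities, the category names and the training examples are all assumptions added to keep the example small and runnable.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count categories and constructions from (constructions, category) pairs."""
    cat_counts = Counter()
    cx_counts = defaultdict(Counter)
    vocabulary = set()
    for constructions, cat in examples:
        cat_counts[cat] += 1
        cx_counts[cat].update(constructions)
        vocabulary.update(constructions)
    return cat_counts, cx_counts, vocabulary


def best_category(constructions, model):
    """Return argmax over cat of P(c1|cat) * ... * P(cn|cat) * P(cat)."""
    cat_counts, cx_counts, vocabulary = model
    total = sum(cat_counts.values())
    best, best_score = None, -math.inf
    for cat in cat_counts:
        # Sum of logs instead of a product, to avoid numerical underflow.
        score = math.log(cat_counts[cat] / total)            # log P(cat)
        denominator = sum(cx_counts[cat].values()) + len(vocabulary)
        for c in constructions:
            # Add-one smoothing (an assumption) so unseen constructions
            # do not force the probability to zero.
            score += math.log((cx_counts[cat][c] + 1) / denominator)
        if score > best_score:
            best, best_score = cat, score
    return best


# Hypothetical training set: constructions seen in utterances of each category.
EXAMPLES = [
    (["NP:koala", "VP:eat"], "diet"),
    (["NP:leaf", "VP:eat"], "diet"),
    (["NP:koala", "VP:live"], "habitat"),
    (["NP:tree", "VP:live"], "habitat"),
]
MODEL = train(EXAMPLES)
print(best_category(["VP:eat", "NP:leaf"], MODEL))
```

The same scores can be used to rank which stored constructions are a likely match to a query, as described for the classifier above.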

Page 37: Using construction grammar in conversational systems

System diagram

As you can see, there are many more components to the system than are covered in this presentation.

Page 38: Using construction grammar in conversational systems

Evaluation methods

There are not any robust evaluation methods for conversational systems but we found that a mixture of the following worked well:

- Human evaluation (feedback form)
- "Pourpre" to evaluate sentence complexity (Jimmy Lin)
- Expected vs given response score

Evaluation is not finished as yet but the initial results are encouraging with good knowledge retrieval and construction selection.

Page 39: Using construction grammar in conversational systems

Things that didn't work

Using LSI/PLSI to determine the similarity between individual utterances in order to extract useful constructions failed.

The reasons:

LSI is an information retrieval method and Q&A systems require a higher level of accuracy.

Information retrieval uses a hammer and every problem is a nail. Subtler systems require a more delicate approach.

It is very hard to get LSI to scale to sentence level, which is interesting as it has been proven that it doesn't scale.

The fact that it can't capture polysemy is acceptable because we disambiguate prior to this and append information to constructions.

Page 40: Using construction grammar in conversational systems

Fluid Construction Grammar (FCG) (also didn't work!)

- Bi-directional (using rules)

- Selects meanings and maps them into the real world.

- "fluid" because it takes into consideration the fact that users change and update their grammars often.

- User input can be broken down syntactically in order to gain meaning from the grammatical components, whilst also being able to map the semantic relationships

BUT: not developed enough to work well in our system 

Also: bi-directional rules are very hard to write

Page 41: Using construction grammar in conversational systems

Some Outcomes & Learnings

- Construction Grammar is a useful method for NLU & NLG

- OWL ontologies are well suited to these systems

- Stemming affects the system greatly

- Fluid CxG is not practical at this time

- Better evaluation methods need to be developed

- The Turing test is not useful as it does not prove machine intelligence or understanding

- User perception is a crucial area of research

Page 42: Using construction grammar in conversational systems

Applications & Future work

- Assisted search
- Summarization systems
- Content creation
- Speech systems
- Sentiment analysis
- More powerful AI module
- Anaphora resolution
- Open domain testing
- Improved machine learning
- Further work on query disambiguation methods

Page 43: Using construction grammar in conversational systems

Thank you

Find me at:

http://www.scienceforseo.com

http://twitter.com/missmcj

Google reader