Language Technology: Research and Development
Topic: Verbal Multiword Expressions
Fabienne Cap
Aug. 31st, 2016
What is this bear doing?
The red bear marches to a different drummer.
Der rote Bär tanzt aus der Reihe. (= dances out of the order)
L’ours rouge nage à contre-courant. (= swims against the stream)
El oso rojo sale de la fila. (= steps out of the line)
But:
Where is the drummer?
Is it really dancing?
... or swimming?
Is it a line it was standing in?
Fabienne Cap, Language Technology: Research and Development, Topic: Verbal Multiword Expressions
Verbal Multiword Expressions
Such expressions are called verbal multiword expressions (VMWEs).
They form a unit that crosses word boundaries (whitespaces).
Just like the bear, VMWEs march to a different drummer: they behave unlike literal expressions of the same kind.
Consider, for example, their meaning:
to kick the bucket → to die (VMWE)
to kick the ball → to kick the ball (literal combination)
Definitions
Selected definitions from the literature:
Multiword Expressions (MWEs) are idiosyncratic interpretations that cross word boundaries. (Sag et al. 2002)
A Multiword Expression is usually taken to be any word combination (adjacent or otherwise) that has some feature (syntactic, semantic or purely statistic) that cannot be predicted on the basis of its component words and/or the combinatorial process of language. (Bannard, 2007)
... there are many more definitions and terms used to describe the phenomenon, e.g. Idioms, Fixed Expressions, Collocations.
For here and now, we stick with verbal MWEs
“Special” at different levels
MWEs are special at one or more levels of linguistic description:
non-compositional semantics:
let the cat out of the bag (VMWE) ≠ the literal reading
fixed syntax:
to go bananas (VMWE) vs. *bananas are gone
fixed morphology:
feed to the lions (VMWE) vs. feed to the lion (literal)
frequency: combination occurs more frequently than expected
vaguely remember (VMWE) vs. *vaguely recall
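To make "more frequently than expected" concrete, here is a minimal sketch using pointwise mutual information (PMI), one common association measure. The counts are invented for illustration and the helper names (`pmi`, `pair_pmi`) are hypothetical; the slides do not prescribe a particular measure.

```python
import math
from collections import Counter


def pmi(pair_count, w1_count, w2_count, total_pairs):
    """Pointwise mutual information (base 2) of a word pair."""
    p_pair = pair_count / total_pairs
    p_w1 = w1_count / total_pairs
    p_w2 = w2_count / total_pairs
    return math.log2(p_pair / (p_w1 * p_w2))


# Toy adverb-verb pairs; the counts are invented, not corpus data.
pairs = (
    [("vaguely", "remember")] * 8
    + [("vaguely", "recall")] * 1
    + [("clearly", "remember")] * 4
    + [("clearly", "recall")] * 7
)
pair_counts = Counter(pairs)
adv_counts = Counter(adv for adv, _ in pairs)
verb_counts = Counter(verb for _, verb in pairs)
total = len(pairs)


def pair_pmi(adv, verb):
    """PMI of one adverb-verb pair in the toy data above."""
    return pmi(pair_counts[(adv, verb)], adv_counts[adv], verb_counts[verb], total)


# A positive score means the pair co-occurs more often than chance predicts.
print(pair_pmi("vaguely", "remember") > 0)
print(pair_pmi("vaguely", "recall") < 0)
```

With real data, the counts would come from a parsed or POS-tagged corpus, and low-frequency pairs would need care because PMI tends to overrate rare combinations.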
Identification of VMWEs
Their special behaviour can be used to identify them!
feature                        identification
non-compositional semantics    word alignment
fixed syntax                   count occurring variants
fixed morphology               count occurring variants
frequency                      statistical association measures
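"Count occurring variants" could look roughly like the sketch below: collect the surface forms seen in one slot of a candidate expression and check how strongly a single form dominates. The observations are invented and `variant_share` is a hypothetical helper, not a method taken from the slides.

```python
from collections import Counter


def variant_share(observed_forms):
    """Return the most frequent surface variant and its share of all occurrences."""
    counts = Counter(observed_forms)
    form, freq = counts.most_common(1)[0]
    return form, freq / len(observed_forms)


# Invented observations of the noun slot (illustration only).
lions_forms = ["lions"] * 19 + ["lion"] * 1   # feed to the lions / lion
ball_forms = ["ball"] * 11 + ["balls"] * 9    # kick the ball / balls

form_a, share_a = variant_share(lions_forms)
form_b, share_b = variant_share(ball_forms)

# A share close to 1.0 suggests fixed morphology; a flatter distribution
# suggests an ordinary, freely inflecting combination.
print(form_a, share_a)
print(form_b, share_b)
```

The same counting works for syntactic variants (passive, modification, word order) once each occurrence has been mapped to a variant label.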
Compositionality
The most prominent feature of VMWEs is their non-compositionality: you cannot tell the meaning of the whole from its parts.
But: not all MWEs are completely non-compositional. It can be difficult to draw the line:
to cut the mustard → to perform well
to sweep under the rug → to hide
to spill the beans → to reveal (spill) a secret (the “beans”)
to bring somebody down → to make somebody sad
to go ahead → to start/proceed
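Since the line is hard to draw, compositionality is often treated as a degree rather than a yes/no property. The sketch below assumes one possible approach that is not taken from the slides: compare a vector for the whole expression with the sum of the vectors of its parts. The three-dimensional vectors are hand-made placeholders, not real embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def compositionality(phrase_vec, part_vecs):
    """Cosine between the phrase vector and the sum of its parts' vectors."""
    summed = [sum(dims) for dims in zip(*part_vecs)]
    return cosine(phrase_vec, summed)


# Hypothetical toy vectors, chosen by hand for illustration only.
vectors = {
    "spill": [0.9, 0.1, 0.3],
    "beans": [0.8, 0.2, 0.0],
    "spill the beans": [0.3, 0.1, 0.9],
    "go": [0.1, 0.9, 0.2],
    "ahead": [0.2, 0.8, 0.3],
    "go ahead": [0.2, 0.9, 0.3],
}

idiom_score = compositionality(
    vectors["spill the beans"], [vectors["spill"], vectors["beans"]]
)
transparent_score = compositionality(
    vectors["go ahead"], [vectors["go"], vectors["ahead"]]
)

# Higher score = the whole behaves more like the combination of its parts.
print(idiom_score < transparent_score)
```

In practice the vectors would be learned from a corpus in which the expression is treated as a single token, and the score would be validated against human compositionality ratings.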
Context
In some contexts, expressions that are taken to be VMWEs might be used compositionally. Imagine:
A soccer player kicking a bucket
A cat that has been caught in a bag and is later released
A janitor sweeping some dirt under some rug
A child spilling some beans on the floor
→ not very likely, probably not very frequent, but not impossible!
→ VMWE decisions have to be made context-dependently
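A deliberately naive sketch of a context-dependent decision: count cue words in the sentence that point to the literal or the idiomatic reading. The cue lists and the function `classify_usage` are invented for illustration; a real system would learn such evidence from annotated data.

```python
# Hypothetical cue words, hand-picked for this example only.
LITERAL_CUES = {"kick the bucket": {"soccer", "player", "water", "ball", "foot"}}
IDIOM_CUES = {"kick the bucket": {"died", "funeral", "old", "illness", "inherited"}}


def classify_usage(expression, sentence):
    """Label one occurrence as 'literal', 'idiomatic' or 'undecided'."""
    cleaned = sentence.lower().replace(".", " ").replace(",", " ")
    tokens = set(cleaned.split())
    literal = len(tokens & LITERAL_CUES.get(expression, set()))
    idiom = len(tokens & IDIOM_CUES.get(expression, set()))
    if literal > idiom:
        return "literal"
    if idiom > literal:
        return "idiomatic"
    return "undecided"


print(classify_usage("kick the bucket",
                     "The soccer player kicked the bucket across the field."))
print(classify_usage("kick the bucket",
                     "After a long illness, the old man kicked the bucket."))
print(classify_usage("kick the bucket", "He kicked the bucket."))
```

The last call shows the limit of the approach: without informative context, the occurrence stays undecided.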
High Variety
VMWEs come in many different shapes, e.g.:
Idioms: let the cat out of the bag, spill the beans
Light-verb constructions: make a mistake vs. *do a mistake
Particle verbs: to bring down vs. to pick up (literal)
Inherently reflexive verbs:
sich enthalten (to abstain) vs. enthalten (literal: to contain)
Problematic for many NLP applications
Everyone knows that VMWEs cause problems in NLP, but they are still often ignored in many applications.
many different shapes
long-distance dependencies between parts
data sparsity: many types, not so many tokens
compositionality: where to draw the line?
translation: VMWEs do not always translate into VMWEs
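The long-distance problem can be illustrated with a small matcher for a particle verb whose parts are separated by other words. This is a sketch under simplifying assumptions: whitespace tokenisation, a hand-listed set of verb forms instead of a lemmatiser, and a hypothetical function name and `max_gap` parameter.

```python
def find_particle_verb(tokens, verb_forms, particle, max_gap=3):
    """Return (verb_index, particle_index) pairs with at most max_gap tokens in between."""
    matches = []
    for i, tok in enumerate(tokens):
        if tok.lower() not in verb_forms:
            continue
        # Look ahead, allowing up to max_gap intervening tokens.
        for j in range(i + 1, min(i + 2 + max_gap, len(tokens))):
            if tokens[j].lower() == particle:
                matches.append((i, j))
                break
    return matches


tokens = "The news brought the whole team down".split()
bring_forms = {"bring", "brings", "brought", "bringing"}

matches = find_particle_verb(tokens, bring_forms, "down")
too_narrow = find_particle_verb(tokens, bring_forms, "down", max_gap=2)

print(matches)     # verb at index 2, particle at index 6
print(too_narrow)  # empty: three words separate the two parts
```

A fixed window is a crude stand-in for syntax; with a dependency parse, the verb-particle relation could be read off directly, however far apart the parts are.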
Projects
Research on VMWEs:
different languages
different linguistic description levels
different applications
different approaches for identification (e.g. linguistics-based, heuristic, machine learning, ...)
Possible Projects:
identification of VMWEs
improved processing of VMWEs (e.g. in SMT)
determine the degree of compositionality of VMWEs
determine the presence of VMWEs in different contexts
PARSEME
PARSEME = PARSing and Multi-word Expressions, a European project devoted to multiword expressions.
http://typo.uni-konstanz.de/parseme/
upcoming shared task on identification of VMWEs
dates: t.b.a. (February-March 2017)
Workshop at EACL 2017 in Valencia, Spain (April 2017)
Languages:
Germanic: English, German, Swedish, Yiddish
Romance: French, Italian, Romanian, Spanish, Brazilian Portuguese
Balto-Slavic: Bulgarian, Czech, Croatian, Lithuanian, Polish, Slovene
Other: Farsi, Greek, Hebrew, Hungarian, Maltese, Turkish
Questions?