Post on 15-Jan-2016
P.1
Center for PersonKommunikation
Do you need speech in your project?
• Speech recognition?
• Speech synthesis?
• English or Danish or … (who will be the test subjects in your user trials?)
• Multilingual system?
P.2
Available speech software
• javax.speech, JSAPI, JSAPI-compliant speech engines, e.g. IBM’s ViaVoice
• MS SAPI and SAPI-compliant speech engines, e.g. MS Whisper
---------------------------------------
• HTK: graphVite (recogniser)
• CPK SLANG (recogniser, open source, network service)
• Danish Speech Synthesis (network service)
P.3
Project proposals 1
• User Interaction Paradigms on Portable Devices – Lars Bo Larsen – Speech recognition required
• A Search Engine for Images on the Web – Lars Bo Larsen – Speech recognition can be considered (“Voice-enabled HTML”)
• 3D Scanner and Face Animator – Henning Nielsen – Speech rec./synth. not relevant (?)
• Outdoor Navigation System for Blind Pedestrians – Ove Andersen – Speech synthesis required, recognition to be considered
• Electronic Reception Desk – Ove Andersen – Speech recognition/synthesis obvious
P.4
Project proposals 2
• A decision support system for assessing critically ill patients – Steve Rees, Steen Andreassen
• Decision support system for advice on antibiotic therapy – Steen Andreassen – Speech not obvious, however the user interface may include speech
• Multi Modal Mediator – Gael Rosset – Speech recognition/synthesis required; VoiceXML?
• Beyond WAP – Gael Rosset – Speech not obvious
• An Internet-based decision support tool for diabetes patients – Speech not obvious, however the user interface may include speech
P.5
Natural Language Processing
Tom Brøndsted, CPK
• Symbols on the slides:
– this point may be brought up at the examination!
– this will NOT be brought up at the examination!
• Linguistic terms (noun, verb, noun phrase, verb phrase etc.) are explained in
– http://www.sil.org/LINGUISTICS/glossary
!
P.6
Dialogue System (text)
James Allen: Natural Language Understanding, 1995
P.7
Recognition and parsing
[Figure: speech -> decoding (language model, vocabulary) -> text -> parsing (grammar, lexicon) -> semantic representation]
P.8
MM1
• Chomsky: Types of grammars used in NLP
• Young: Types of grammars used in speech recognition
• Winograd: lexical ambiguity, structural ambiguity; what the simple grammar types can be used for (postponed to MM2)
P.9
Background for NLP
• Questions brought up by N. Chomsky in the 1950s:
– Can a natural language like English be described (“parsed”, “compiled”) with the same methods as used for formal/artificial (programming) languages in computer science?
– Can we use simple finite state grammars or context-free grammars for the description of English?
– Or does linguistics need to invent its own, more powerful grammar type for the description of natural languages?
• Offshoots: “The Chomsky Hierarchy of Grammars”, “Natural Language Processing”, “Generative Transformational Grammar”
P.10
Chomsky: Grammar Theory 0
• Some key extracts/quotations from “Syntactic Structures”
– A language is an (infinite) set of sentences, each finite in length and constructed out of a finite set of elements.
– A grammar is a device that separates the grammatical sequences from the ungrammatical ones and generates the structures of the grammatical ones.
– A grammar is a reconstruction of the native speaker’s competence, his ability to generate (produce and understand) an infinite number of sentences.
– A grammar is a theory of a language. It must comply with the empiricist axioms: The theory must be adequate and simple.
P.11
Chomsky: Grammar Theory 1
[Figure: sentences from the English native speaker are passed to the English grammar/language theory]
Have you a book on modern music? -> Sentence parsable!
The book seems interesting. -> Sentence parsable!
…
The grammar must generate (“parse”) ALL sentences acceptable to the native speaker and ….
!
P.12
Chomsky: Grammar Theory 2
[Figure: the English grammar/language theory generates sentences at random; the English native speaker judges each one]
Random sentence generation:
1) Colorless green ideas sleep furiously. -> “According to my intuition this sentence is OK!”
2) Have you a book on modern music? -> “According to my intuition this sentence is OK!”
…
… the grammar must generate NOTHING BUT sentences acceptable to the native speaker and ...
!
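The two requirements (ALL acceptable sentences, and NOTHING BUT acceptable sentences) can be illustrated with a minimal Python sketch. The toy grammar and the set of "acceptable" sentences are assumptions made up for this illustration; the set merely stands in for the native speaker's intuition.

```python
from itertools import product

# Toy phrase-structure rules (hypothetical, for illustration only).
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "book"], ["the", "books"]],
    "VP": [["seems", "interesting"], ["seem", "interesting"]],
}

def generate(symbol):
    """Return every word sequence the grammar derives from `symbol`."""
    if symbol not in GRAMMAR:            # terminal symbol: a word
        return [[symbol]]
    results = []
    for alternative in GRAMMAR[symbol]:
        # Expand each symbol of the alternative, then combine the expansions.
        parts = [generate(s) for s in alternative]
        for combo in product(*parts):
            results.append([word for part in combo for word in part])
    return results

generated = {" ".join(words) for words in generate("S")}

# Stand-in for the native speaker's intuition (assumed judgements).
acceptable = {
    "the book seems interesting",
    "the books seem interesting",
    "have you a book on modern music",
}

missing = acceptable - generated         # violates "ALL"
overgenerated = generated - acceptable   # violates "NOTHING BUT"
print("not generated:", sorted(missing))
print("wrongly generated:", sorted(overgenerated))
```

This toy grammar fails both tests: it misses the question and it generates sentences with wrong number agreement.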
P.13
Chomsky: Grammar Theory 3
[Figure: Grammar A and Grammar B are equivalent grammars: both generate the same set of sentences, Language l. One of the two is marked as the preferable grammar.]
… the grammar must be as SIMPLE (e.g. “small”) as possible
!
P.14
Chomsky: Grammar Theory 4
What’s in the “Black Box”? What type of grammar can “generate” a natural language like English?
– A Finite State Grammar without/with loops?
• (No! “Syntactic Structures” pp. 18 ff.)
– A Phrase Structure Grammar?
• (No! “Syntactic Structures” pp. 26 ff.)
– A Transformational Grammar?
• (Yes/Maybe! According to “Syntactic Structures” pp. 34 ff. BUT “Generative Transformational Grammar” has turned out to be a “blind alley” in computational linguistics)
?
!
!
!
P.15
Chomsky: Hierarchy of Grammars
• Type 3: Regular Grammars
– Equivalent to finite state automata, finite state transition networks, Markov models (probabilistic type).
• Type 2: Context-free Grammars
– E.g. recursive transition networks (RTNs), phrase structure grammars (PSGs), and unification grammars where attributes take values drawn from a finite table.
• Type 1: Context-sensitive Grammars
– Augmented transition networks (ATNs), transformational grammars, some unification grammars
• Type 0: Unrestricted Grammars
!
!
P.16
Finite State Grammar
• Structure:
– Directed Graph/Transition Network structure
– All transitions are terminals
– The terminal symbols are either words or POS (word class) names like Noun, Verb, Pronoun.
– The network structure may involve loops (iterations) and “empty” transitions (jumps, skips)
[Figure: transition network with the labels: nodes, transition, loop, jump]
!
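The structure described on this slide can be sketched in Python as follows. The network itself (an optional Det, any number of Adj, then Noun and Verb) is a made-up example, not the network shown on the slide; a transition labelled None plays the role of a jump.

```python
# Hypothetical network over POS names: Det? Adj* Noun Verb
# Each transition is (source node, label, target node); label None is a jump.
TRANSITIONS = [
    (0, "Det", 1),
    (0, None, 1),     # jump: the determiner is optional
    (1, "Adj", 1),    # loop: any number of adjectives
    (1, "Noun", 2),
    (2, "Verb", 3),
]
START, FINAL = 0, {3}

def closure(nodes):
    """Add every node reachable through jumps (empty transitions)."""
    stack, seen = list(nodes), set(nodes)
    while stack:
        node = stack.pop()
        for src, label, dst in TRANSITIONS:
            if src == node and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(symbols):
    """True if the symbol sequence labels a path from START to a final node."""
    current = closure({START})
    for symbol in symbols:
        step = {dst for src, label, dst in TRANSITIONS
                if src in current and label == symbol}
        current = closure(step)
    return bool(current & FINAL)

print(accepts(["Det", "Adj", "Adj", "Noun", "Verb"]))
print(accepts(["Det", "Verb"]))
```

Because of the loop on node 1, this network accepts an infinite set of sequences even though it has only four nodes.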
P.17
Recursive Transition Network Grammar
• Structure:
– A SET of named Directed Graph/Transition Network structureS
– Transitions are terminals or NON-TERMINALS
– Terminal symbols/loops/jumps -> see the FSN slide
– A non-terminal symbol is the name of a network in the set included in the RTN
[Figure: network X with transitions a, X, b and a jump]
Equivalent BNF/PSG:
X -> a b
X -> a X b
AnBn-problem, “Syntactic Structures”, p. 30
!
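A minimal Python sketch of an RTN recogniser for the rules X -> a b and X -> a X b is given below. As a simplifying assumption, each named network is stored as a list of alternative paths rather than as a graph with loops and jumps; the function names are hypothetical.

```python
# One named network per non-terminal; each network is a list of paths,
# and each path is a sequence of terminals ("a", "b") or network names ("X").
RTN = {
    "X": [["a", "b"], ["a", "X", "b"]],
}

def traverse(name, tokens, pos):
    """Yield every position reachable after traversing network `name` from `pos`."""
    for path in RTN[name]:
        yield from follow(path, tokens, pos)

def follow(path, tokens, pos):
    """Yield every position reachable after matching `path` from `pos`."""
    if not path:
        yield pos
        return
    head, rest = path[0], path[1:]
    if head in RTN:
        # Non-terminal: traverse the named sub-network, then continue here.
        for mid in traverse(head, tokens, pos):
            yield from follow(rest, tokens, mid)
    elif pos < len(tokens) and tokens[pos] == head:
        # Terminal: consume exactly one matching token.
        yield from follow(rest, tokens, pos + 1)

def accepts(tokens):
    """True if network X covers the whole token sequence."""
    return any(end == len(tokens) for end in traverse("X", tokens, 0))

print(accepts(list("aaabbb")))
print(accepts(list("aab")))
```

The recursive call into network X is what lets the recogniser match each a with a later b, which a finite state network cannot do for unbounded n.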
P.18
What’s wrong with FSNs & RTNs according to Chomsky?
• FSNs without loops can only generate a finite set of sentences. English is an infinite set.
• FSNs with loops generate infinite sets of sentences but cannot describe AnBn sequences found in constructions with “respectively”.
• RTNs (PSGs/BNFs) generate infinite sets of sentences, can describe AnBn sequences, but applied to English a huge number of symbols is required (conflict with simplicity)
!
P.20
Young et al.: Grammar types in speech recognition
• Level Building [obsolete] (Young p. 8 f.):
– finite state grammar without loops
• Viterbi/Token passing (Young p. 8 f., p. 11 ff.):
– finite state grammar with loops
– context-free grammar provided that every non-terminal refers to a unique instance of a sub-network (No recursions!)
Conclusion: Decoding algorithms used in modern speech recognition technology can only be applied to the weakest grammar type within the Chomsky Hierarchy
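Why a context-free grammar without recursion is no stronger than a finite state grammar can be sketched in Python: every non-terminal is replaced by its own copy of the sub-network until only terminals remain. The command grammar below is a made-up example, and a regular expression is used here as a stand-in for the resulting finite state network; this is not the procedure of any particular recogniser.

```python
import re

# Hypothetical command grammar; upper-case names are non-terminals.
GRAMMAR = {
    "COMMAND": [["call", "NAME"], ["dial", "DIGIT", "DIGIT"]],
    "NAME": [["anna"], ["peter"]],
    "DIGIT": [["one"], ["two"], ["three"]],
}

def expand(symbol, active=()):
    """Return a regular expression for `symbol`, copying sub-networks inline."""
    if symbol not in GRAMMAR:
        return re.escape(symbol)          # terminal word
    if symbol in active:
        # The non-terminal refers back to itself: no finite expansion exists.
        raise ValueError(f"recursive non-terminal: {symbol}")
    alternatives = []
    for sequence in GRAMMAR[symbol]:
        parts = [expand(s, active + (symbol,)) for s in sequence]
        alternatives.append(" ".join(parts))
    return "(?:" + "|".join(alternatives) + ")"

pattern = re.compile(expand("COMMAND"))

def accepts(sentence):
    """True if the sentence is generated by the expanded grammar."""
    return pattern.fullmatch(sentence) is not None

print(accepts("dial one three"))
print(accepts("dial one"))
```

Note that DIGIT is copied twice into the expansion, once per occurrence, which corresponds to "a unique instance of a sub-network". A rule such as X -> a X b would make expand raise ValueError, since the copying would never terminate.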
P.21
Exercise 1
• The following extremely simple grammar generates all (typed) English sentences. What is wrong with it according to the Chomsky theory?
• Describe the concepts “native speaker”, “intuition”, “generate” as used in the Chomsky theory.
char
Lexicon: char = a, b, c, … z, ’;’, ’.’, ’?’, … etc.
P.22
Exercise 2
Consider the "correct" (or "grammatical") ANSI C printf sequences in I and compare them with the "false" (or "ungrammatical") ones in II:
I
printf("%d %s",integer,string)
printf("%s %d %d",string,integer,integer)
printf("%d",integer)
etc.
II
printf("%d %s",integer,string,string)
printf("%d %s",integer)
printf("%d %s",string,integer)
etc.
Is it possible to design a regular (finite state) grammar that generates the correct sequences in I without generating II? If not, can a context-free grammar be designed that meets these conditions?