formal language


Upload: brian

Post on 05-Jan-2016

Page 1: Formal Language

7/16/2019 Formal Language

http://slidepdf.com/reader/full/formal-language 1/48

CogSci 131

Language as a formal system

Tom Griffiths

Page 2: Formal Language


Admin

•  Problem Set 0 is due tomorrow at 5pm

•  Problem Set 1 will be out tomorrow, a little harder than Problem Set 0 and more representative of what to expect

 –  the Turing machine problem is tricky!

Page 3: Formal Language


Token manipulation systems

•  System is defined fully by

 –  a set of tokens

 –  starting positions for those tokens

 –  formal rules stating how token positions can be changed into other token positions

•  Rules depend only on current positions, and define only the next positions

Page 4: Formal Language


Language as a formal system

Noam Chomsky

Page 5: Formal Language


Studying the mind in 1950

•  Behaviorism

 –  explaining complex behaviors through simple associative learning mechanisms

 –  constructing theories of behavior without internal mental states or representations

•  e.g. language (= “verbal behavior”)

 –  speech acts are a response to environmental stimuli, with learned sequential structure

Page 6: Formal Language


The cognitive revolution

•  Chomsky provided evidence for the idea that we can model the mind as a formal system

 –  rigorous treatment of mental representations

 –  using human data to evaluate formal proposals

•  This was part of a more general revolution in the way we approach behavior

 –  making the study of cognition respectable

Page 7: Formal Language


Symposium on Information Theory

•  Often considered the birth of cognitive science (on 9/11/56, at MIT)

•  Three famous papers presented:

 –  Allen Newell & Herbert Simon, “The Logic Theory Machine: A complex information processing system”

 –  Noam Chomsky, “Three models of language”

 –  George Miller, “The magical number seven”

Page 8: Formal Language


Behaviorist view of language

•  People form associations between words and things (semantics)

•  People form associations between words and other words (syntax)

 –  “the” followed by “word” makes it more likely that “the” will be followed by “word”

“sheep”

Page 9: Formal Language


What was Chomsky attacking?

•  Simplistic behaviorist notions of syntax

•  Models of language as sequential

 –  e.g., n-th order Markov chains: $P(w_{i+n} \mid w_i, \ldots, w_{i+n-1})$

Page 10: Formal Language


Markov chains

$w_{i+1}$ is independent of its history given $w_i$

[Figure: a chain of word variables w]

Transition matrix: $P(w_{i+1} \mid w_i)$
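
A minimal sketch of how such a transition matrix could be estimated, assuming simple relative-frequency counts over adjacent word pairs. The toy corpus and the name `transition_matrix` are illustrative assumptions, not part of the lecture.

```python
from collections import Counter, defaultdict

def transition_matrix(words):
    """Estimate P(next word | current word) from adjacent pairs by relative frequency."""
    pair_counts = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        pair_counts[current][nxt] += 1
    matrix = {}
    for current, counts in pair_counts.items():
        total = sum(counts.values())
        matrix[current] = {nxt: c / total for nxt, c in counts.items()}
    return matrix

# Toy corpus (an assumption for illustration only).
corpus = "the dog runs the dogs run the dog runs".split()
P = transition_matrix(corpus)
print(P["the"])  # distribution over the words that follow "the"
```

Each row `P[w]` depends only on the current word `w`, which is the first-order Markov assumption stated on the slide.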

Page 11: Formal Language


Markov chains 

 A. A. Markov

Chomsky's work in linguistics imply

concomitant understandings of aspects of

 mental processing and human nature. His

theory of a universal grammar was seen by

 many as a direct challenge to the

established behaviorist theories of the

external environment. The link between

human innate aptitude to language and mind

are innate. The acquisition and

development of innate propensities

triggered by the experiential input of the

time and in later discussions, we are

still far from understanding the genetic

setup of humans and aptitude to language

have been suggested at that time and had

 major consequences for understanding how

language is learned by children.

Page 12: Formal Language


What was Chomsky attacking?

•  Simplistic behaviorist notions of syntax

•  Models of language as sequential

 –  e.g., n-th order Markov chains: $P(w_{i+n} \mid w_i, \ldots, w_{i+n-1})$

 –  or, n-grams: $P(w_i, \ldots, w_{i+n})$
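
The two quantities are linked by the definition of conditional probability. The count-based approximation on the last line is one common way to estimate them and is an assumption here; $C(\cdot)$, the number of times a word sequence occurs in a corpus, is notation introduced for this sketch.

```latex
P(w_{i+n} \mid w_i, \ldots, w_{i+n-1})
  = \frac{P(w_i, \ldots, w_{i+n})}{P(w_i, \ldots, w_{i+n-1})}
  \approx \frac{C(w_i, \ldots, w_{i+n})}{C(w_i, \ldots, w_{i+n-1})}
```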

Page 13: Formal Language


P (model, of, language)

Page 14: Formal Language


P (model, of, quickly)

Page 15: Formal Language


Language

“a set (finite or infinite) of sentences, each finite in length and constructed out of a finite set of elements”

[Figure: the set of all sequences, with the language L as a subset. “This is a good sentence” is in L (1); “Sentence bad this is” is not (0).]

linguistic analysis aims to separate the grammatical sequences which are sentences of L from the ungrammatical sequences which are not

Page 16: Formal Language


Grammatical ≠ meaningful

(1) “Colorless green ideas sleep furiously.”

(2) “Furiously sleep ideas green colorless.”

“It is fair to assume that neither sentence (1) nor (2) (nor indeed any part of these sentences) has ever occurred in an English discourse. Hence, in any statistical model for grammaticalness, these sentences will be ruled out on identical grounds as equally ‘remote’ from English.”

≠ probable*

Page 17: Formal Language


Page 18: Formal Language


Grammar

•  A formal system…

 –  tokens

 –  initial positions

 –  rules for moving between positions

•  … such that final positions are sentences

“a device that generates all of the grammatical sequences of L and none of the ungrammatical ones”

Page 19: Formal Language


Syntax

•  Atomic formulas: proposition symbols (e.g. P, Q), True and False

•  Complex formulas built out of simple formulas via rules

 –  if α and β are okay, (α∧β) is okay

 –  if α and β are okay, (α∨β) is okay

 –  if α and β are okay, (α→β) is okay

 –  if α and β are okay, (α↔β) is okay

 –  if α is okay, ¬α is okay
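
The formation rules above can be sketched as a recursive check. This is a minimal illustration, assuming formulas are written as nested tuples and that the four binary connectives are named `and`, `or`, `implies` and `iff`. Those names, the tuple encoding and the function `is_okay` are assumptions made for the example.

```python
ATOMS = {"P", "Q", "True", "False"}          # atomic formulas from the slide
BINARY = {"and", "or", "implies", "iff"}     # assumed names for the four binary connectives

def is_okay(formula):
    """Return True if the formula can be built from atoms using the formation rules."""
    if isinstance(formula, str):
        return formula in ATOMS
    if isinstance(formula, tuple):
        # Negation rule: if f is okay, ("not", f) is okay.
        if len(formula) == 2 and formula[0] == "not":
            return is_okay(formula[1])
        # Binary rules: if f and g are okay, (connective, f, g) is okay.
        if len(formula) == 3 and formula[0] in BINARY:
            return is_okay(formula[1]) and is_okay(formula[2])
    return False

print(is_okay(("implies", ("and", "P", "Q"), ("not", "P"))))
print(is_okay(("and", "P")))
```

The check never looks at what a formula means, only at how it was put together, which is the sense in which syntax is a formal system.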

Page 20: Formal Language


Finite state grammar

[Figure: state diagram running from START to FINISH, with arcs labeled THE, DOG, DOGS, RUNS, RUN]

Page 21: Formal Language


Finite state grammar

[Figure: state diagram running from START to FINISH, with arcs labeled THE, DOG, DOGS, RUNS, RUN]

THE

Page 22: Formal Language


Finite state grammar

[Figure: state diagram running from START to FINISH, with arcs labeled THE, DOG, DOGS, RUNS, RUN]

THE DOG

Page 23: Formal Language


Finite state grammar

[Figure: state diagram running from START to FINISH, with arcs labeled THE, HAIRY, DOG, DOGS, RUNS, RUN]

THE DOG RUNS

THE DOGS RUN

THE HAIRY HAIRY HAIRY HAIRY HAIRY H

Page 24: Formal Language


Finite state grammar

[Figure: state diagram running from START to FINISH, with arcs labeled THE, HAIRY, DOG, DOGS, RUNS, RUN]

The languages generated by finite state grammars are called “regular” languages
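
A minimal sketch of the finite state grammar as a transition table. The intermediate state names (`AFTER_THE`, `SINGULAR`, `PLURAL`) are hypothetical, and the HAIRY arc is assumed to loop on the state reached after THE, which is consistent with the example string “THE HAIRY HAIRY …”.

```python
# (current state, word) -> next state; intermediate state names are assumptions.
TRANSITIONS = {
    ("START", "THE"): "AFTER_THE",
    ("AFTER_THE", "HAIRY"): "AFTER_THE",   # loop: any number of HAIRYs
    ("AFTER_THE", "DOG"): "SINGULAR",
    ("AFTER_THE", "DOGS"): "PLURAL",
    ("SINGULAR", "RUNS"): "FINISH",
    ("PLURAL", "RUN"): "FINISH",
}

def accepts(sentence):
    """Follow one arc per word; accept only if the walk ends in FINISH."""
    state = "START"
    for word in sentence.split():
        state = TRANSITIONS.get((state, word))
        if state is None:          # no arc for this word from this state
            return False
    return state == "FINISH"

print(accepts("THE HAIRY HAIRY DOG RUNS"))
print(accepts("THE DOG RUN"))
```

The machine keeps no memory beyond its current state, so the only way it tracks singular versus plural is by being in a different state.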

Page 25: Formal Language


English is not a regular language

•  Many simple languages are not regular

 –  e.g. $a^n b^n$ = { ab, aabb, aaabbb, aaaabbbb, … }

•  English exhibits similar dependencies

 –  e.g. the dog the cat chased runs

•  This “center embedding” indicates that English is not a regular language

 –  (provided we allow arbitrarily long sentences)

Page 26: Formal Language


Phrase structure grammar

Tokens: S, NP, VP, T, N, V; the, man, ball, hit, took

Starting positions: S

Formal rules:

S → NP VP
NP → T N
VP → V NP
T → the
N → man, ball, …
V → hit, took, …

Page 27: Formal Language


Phrase structure grammar

S → NP VP
NP → T N
VP → V NP
T → the
N → man, ball, …
V → hit, took, …

[Figure: parse tree for “the man hit the ball”]

[S [NP [T the] [N man]] [VP [V hit] [NP [T the] [N ball]]]]
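
A minimal sketch of the phrase structure grammar as a generator: start from S and apply the rewrite rules until only words remain. It uses only the words listed on the slide (the “…” entries are left out), and the names `RULES` and `expand` are assumptions for illustration.

```python
from itertools import product

# Rewrite rules from the slide; only the listed words are included.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["T", "N"]],
    "VP": [["V", "NP"]],
    "T": [["the"]],
    "N": [["man"], ["ball"]],
    "V": [["hit"], ["took"]],
}

def expand(symbol):
    """Return every word sequence derivable from symbol (finite here: no rule is recursive)."""
    if symbol not in RULES:              # a word, nothing left to rewrite
        return [[symbol]]
    results = []
    for right_side in RULES[symbol]:
        parts = [expand(s) for s in right_side]
        for combo in product(*parts):
            results.append([word for part in combo for word in part])
    return results

sentences = [" ".join(words) for words in expand("S")]
for sentence in sentences:
    print(sentence)
```

With these rules the language is finite. Adding a recursive rule would make it infinite, and `expand` as written would then not terminate.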

Page 28: Formal Language


Phrase structure grammar

•  Context-free languages

 –  rules of the form X → Y

 –  e.g. $a^n b^n$: S → aSb, S → ab

•  Context-sensitive languages

 –  rules of the form Z X W → Z Y W

 –  e.g. $a^n b^n c^n$

Page 29: Formal Language


Transformational grammar

•  Phrase structure grammars miss some structural connections between sentences

 –  e.g. active and passive forms

•  Hence transformations

 –  e.g. if active form is grammatical, so is passive

•  Transformational grammar is complicated!

 –  complexity of identifying grammatical sentences:

Regular              $O(n)$
Context-free         $O(n^3)$
Context-sensitive    worse
Transformational     undecidable
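
One way to see where the cubic cost for context-free languages comes from is the CYK recognition algorithm, sketched here for the small phrase structure grammar from the earlier slides. Naming CYK is an addition (the slide gives only the bound), and the sketch assumes each pair of categories has at most one parent, which holds for this grammar.

```python
# Binary rules (right-hand side -> left-hand side) and word categories.
BINARY_RULES = {("NP", "VP"): "S", ("T", "N"): "NP", ("V", "NP"): "VP"}
LEXICON = {"the": {"T"}, "man": {"N"}, "ball": {"N"}, "hit": {"V"}, "took": {"V"}}

def cyk_recognize(words):
    """Return True if the word list is a sentence (category S) of the grammar."""
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the categories that can span words[i..j] inclusive.
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        table[i][i] = set(LEXICON.get(word, ()))
    for span in range(2, n + 1):                 # length of the span
        for start in range(n - span + 1):        # where the span begins
            end = start + span - 1
            for split in range(start, end):      # where it divides in two
                for left in table[start][split]:
                    for right in table[split + 1][end]:
                        parent = BINARY_RULES.get((left, right))
                        if parent:
                            table[start][end].add(parent)
    return "S" in table[0][n - 1]

print(cyk_recognize("the man hit the ball".split()))
print(cyk_recognize("the man hit".split()))
```

The three nested loops over span length, start position and split point are what give the $n^3$ factor; the innermost loops depend on the grammar, not on sentence length.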

Page 30: Formal Language


Chomsky’s project

•  Identifying a formal system that captures the structure of human language (and thought)

•  Ignoring the limitations imposed by finite human memory resources

•  This project was distinctive in

 –  postulating rich structures involved in cognition

 –  using human data (linguistic intuitions) to rule out certain kinds of formal systems

 –  also having implications for computer science

Page 31: Formal Language


Marr’s three levels

Computation: “What is the goal of the computation, why is it appropriate, and what is the logic of the strategy by which it can be carried out?”

Representation and algorithm: “What is the representation for the input and output, and the algorithm for the transformation?”

Implementation: “How can the representation and algorithm be realized physically?”

[Figure: arrows labeled “constrains” connect adjacent levels]

Page 32: Formal Language


Break  

Up next:

The Chomsky hierarchy

Page 33: Formal Language


 

If language were finite

(Sung to the tune of “If I were a rich man”, from Fiddler on the Roof. With apologies to Noam Chomsky.)

If language were finite,
One could memorize
All sentences as if they were just lists.
But it’s not. True novelty exists.
Language is no finite sys-tem.

A finite state grammar
Is a tempting second thought
But clearly isn’t what we’ve got
The first words in almost any phrase
Can constrain the end in many ways

So how about a push-down automaton
Might that be the very thing?
Just like a finite-state with a proper stack.
There would be just one symbol popped from the top
Plus one as input from the string
And others there just waiting to pop back.

But hu-man language cannot be context-free
Swiss-German shows why this is true
Thanks to the cross seri-al de-pen-den-cy
So much for the very thought that language could be finite
I’ve shown why the notion just won’t do
But now onto what language has to be! Oy!

Language isn’t finite
Nor is it finite state
Or even possibly push-down
Nor just strings composed of verb and noun
Language is a complex sys-tem.

There are transformations
Over parse-trees that you
simply cannot code as linear strings
Every sentence is hierarchical
With deep structure that can be revealed

I see language as governed by universal grammar
With a hefty set of rules
And acquisition guided by a device
I see grammar as context sensitive and complex
Not like the grammar taught in schools
Wouldn’t such a system be quite nice?

Page 34: Formal Language


The Chomsky hierarchy

Languages            Machines

Computable           Turing machine
Context sensitive    Bounded TM
Context free         Push-down automaton
Regular              Finite state automaton

Page 35: Formal Language


Turing machine

[Figure: Turing machine with labeled parts: state, read/write head, tape, rules]

rules: (state, read, move, state, write)
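
A minimal sketch of a Turing machine simulator that uses the rule format from the slide: a rule maps the current (state, read) pair to a (move, state, write) triple. The bit-flipping machine, the blank symbol `_`, the state names and the `max_steps` cutoff are all assumptions chosen to keep the example small.

```python
BLANK = "_"   # assumed blank symbol

def run_turing_machine(rules, tape_input, start="q0", halt="halt", max_steps=10_000):
    """Run the rules until the halt state (or max_steps) and return the tape contents."""
    tape = dict(enumerate(tape_input))        # sparse tape, unbounded in both directions
    state, head = start, 0
    for _ in range(max_steps):
        if state == halt:
            break
        symbol = tape.get(head, BLANK)
        move, state, write = rules[(state, symbol)]   # (state, read) -> (move, state, write)
        tape[head] = write
        head += 1 if move == "R" else -1
    cells = [tape.get(i, BLANK) for i in range(min(tape), max(tape) + 1)]
    return "".join(cells).strip(BLANK)

# Hypothetical machine: sweep right, flipping each bit, and halt at the first blank.
FLIP_BITS = {
    ("q0", "0"): ("R", "q0", "1"),
    ("q0", "1"): ("R", "q0", "0"),
    ("q0", BLANK): ("R", "halt", BLANK),
}

result = run_turing_machine(FLIP_BITS, "0110")
print(result)
```

Each step consults only the current state and the symbol under the head, so the machine is a token manipulation system in the sense used earlier in the lecture.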

Page 36: Formal Language


Finite state automaton

[Figure: finite state automaton with labeled parts: state, read/write head, tape, rules]

rules: (state, read, move, state, write)

Page 37: Formal Language


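Finite state: $a^n b^m$

Pushdown: $a^n b^m$, $a^n b^n$

Readable Stack / Bounded TM: $a^n b^m$, $a^n b^n$, $a^n b^n c^n$

[Figure: each machine is drawn with a state s; the pushdown and readable-stack machines also carry a stack holding b, b, a]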
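
A minimal sketch of how a single stack is enough to recognize $a^n b^n$: push one symbol for every a, pop one for every b, and accept if the stack empties exactly at the end. The function name and the choice to require n ≥ 1 are assumptions for this example.

```python
def pushdown_accepts(string):
    """Accept strings of the form a^n b^n (n >= 1) using one stack and a two-valued state."""
    stack = []
    seen_b = False                     # finite-state part: still reading a's, or now reading b's
    for symbol in string:
        if symbol == "a":
            if seen_b:
                return False           # an a after a b breaks the a...ab...b shape
            stack.append("a")
        elif symbol == "b":
            seen_b = True
            if not stack:
                return False           # more b's than a's
            stack.pop()
        else:
            return False               # symbol outside the alphabet
    return seen_b and not stack        # every a matched by exactly one b

print(pushdown_accepts("aaabbb"))
print(pushdown_accepts("aab"))
```

A finite state machine has no stack to count with, which is why it is listed above as handling only $a^n b^m$. Matching a third block of c's is beyond what this single pop-only stack can do, in line with $a^n b^n c^n$ appearing only at the next level.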

Page 38: Formal Language


Non-context-free constructions

•  “Cross-serial dependencies”

 –  occur in Swiss German and Dutch

 –  in English: “respectively”

“Bob, Jim, and Ted earned $3, $4, and $5 respectively”

•  Cannot be produced by context-free grammar

Page 39: Formal Language


The Chomsky hierarchy

Languages            Machines

Computable           Turing machine
Context sensitive    Bounded TM
Context free         Push-down automaton
Regular              Finite state automaton

[Figure annotation: “Humans” marked on the hierarchy]

Page 40: Formal Language


The power of rules and symbols

•  Generativity

 –  “infinite use of finite means”

 –  from tokens, initial positions, and rules, infinitely many outcomes result

 –  captures (constrained) novelty of language

•  Structured representations

 –  e.g. hierarchical representations, expressing relationships at multiple levels of abstraction

Page 41: Formal Language


Structured representations

•  Driving

Page 42: Formal Language


[Figure: action hierarchy]

Start car
    Step on gas
        Move leg
        Push gas
    Turn on ignition
        Take out key
        Insert key in ignition
        Turn key
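
A minimal sketch of the action hierarchy as a nested data structure, with a function that reads off the primitive actions at the leaves. The tuple encoding, the name `primitive_actions` and the order of the two subgoals (ignition before gas) are assumptions, since the figure's layout does not survive in the transcript.

```python
# Each node is (name, list of child nodes); leaves have no children.
PLAN = ("Start car", [
    ("Turn on ignition", [
        ("Take out key", []),
        ("Insert key in ignition", []),
        ("Turn key", []),
    ]),
    ("Step on gas", [
        ("Move leg", []),
        ("Push gas", []),
    ]),
])

def primitive_actions(node):
    """Walk the hierarchy depth-first and return the leaf actions in order."""
    name, children = node
    if not children:
        return [name]
    actions = []
    for child in children:
        actions.extend(primitive_actions(child))
    return actions

print(primitive_actions(PLAN))
```

The flat list is what an observer sees. The tree records the relationships at higher levels of abstraction that the flat list leaves out.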

Page 43: Formal Language


Structured representations

•  Driving

•  Cooking

Page 44: Formal Language


(Humphreys & Forde, 1999)

Page 45: Formal Language


(Cooper & Shallice, 2000)

Page 46: Formal Language


Structured representations

•  Driving

•  Cooking

•  Music and dance

•  Is any behavior not hierarchically organized?

Page 47: Formal Language


The power of rules and symbols

•  Generativity

 –  “infinite use of finite means”

 –  from tokens, initial positions, and rules, infinitely many outcomes result

 –  captures (constrained) novelty of language

•  Structured representations

 –  e.g. hierarchical representations, expressing relationships at multiple levels of abstraction

Page 48: Formal Language


Next week

•  Take a look at Problem Set 1!

 –  it’s harder, may take longer, plan accordingly

•  Tuesday: Learning structured representations

 –  (or: Not learning structured representations)

 –  The Poverty of the Stimulus argument