
PHRASE STRUCTURE

IN A

COMPUTATIONAL MODEL

OF

CHILD LANGUAGE ACQUISITION

Helen Louise Gaylard

A thesis submitted to the

School of Computer Science,

Faculty of Science,

University of Birmingham

for the degree of

Doctor of Philosophy

(Cognitive Science)

March 1995


Abstract

This thesis describes a computational model of child language acquisition

which acquires a recursive phrase-structure grammar in the absence of X-Bar

Theory. The model assumes no grammar, lexicon, or segmentation. Input

utterances include phrases as well as sentences, of no more than two levels of

embedding, paired with their semantic representations. The initial products of

acquisition are a lexicon of unanalysed utterances and a finite-state grammar.

The lexical items acquired guide further lexical acquisition, which results in

their segmentation, and thus triggers the acquisition of a phrase-structure

grammar. The Lexical-Functional Grammar formalism is used, so that

acquiring C-Structure, or phrase structure, can be viewed as mapping the

ordered utterance onto the unordered F-Structure, a shallow semantic

representation. Generalization over the phrase-structure rules acquired

results in the induction of syntactic categories, and it is this which gives rise to

recursion in the grammar. The model demonstrates both Degree-2 learnability

and incremental learning in accordance with the gradual nature of child

language development.


Acknowledgements

To Peter, for your enormous generosity.

For invaluable support, Muni, Keith, Jon and many others.

To my Thesis Group, Ros Bradbury and Aaron Sloman, and everyone else who

has provided useful advice and encouragement.

To Iain, for being unbelievably patient and understanding.

A special thanks to Kathryn; hugs and kisses to Emma; to Chloe, love and no

kisses at all!

He who has suffer’d you to impose on him knows you.

William Blake, Proverbs of Hell


Contents Page

1. Introduction 1

1. 1. Overview 1

1. 2. Grammar Acquisition and Induction 1

1. 3. The Computational Paradigm 2

1. 4. Evaluation of Existing Models 4

1. 5. The Approach to Phrase Structure Acquisition 4

1. 6. The Integrated Model of Acquisition Processes 5

1. 7. The Approach to Evaluation 6

1. 8. Secondary Issues Addressed 7

1. 8. 1. Top-Down Lexical Acquisition 8

1. 8. 2. Issues in Functional Morpheme Acquisition 8

1. 8. 3. The Bases of Syntactic Structure 8

1. 8. 4. Knowledge and Acquisition 9

1. 8. 5. Representation and Acquisition 9

1. 8. 6. Implications for Innateness and Autonomy 9

1. 9. Summary 10

2. Approaches to Child Language 11

2. 1. Aims 11

2. 2. Learnability Theory 11

2. 3. Psycholinguistics 14

2. 4. Computational Models 15

2. 4. 1. The Objective of Learnability 16

2. 4. 2. The Objective of Child Language Development 17

2. 4. 2. 1. A Stage Classification 17

2. 4. 2. 2. The Role of the Empirical Data 18

2. 5. Summary 19

3. Computational Models of Child Language Acquisition 20

3. 1. Introduction 20

3. 2. The Objectives for Evaluation 21

3. 2. 1. The Categories in the Child and Adult Grammars 22

3. 2. 1. 1. The Semantic Approach 23

3. 2. 1. 2. The Innatist Approach 24

3. 2. 1. 3. The Lexical Approach 24


3. 2. 1. 4. The Hybrid Innatist/Lexical Approach 25

3. 2. 2. The Gradual Acquisition of Phrase Structure 26

3. 2. 2. 1. The Empiricist Approach 27

3. 2. 2. 2. The Innatist Approach 29

3. 2. 3. An Overall Evaluation 30

3. 3. Conclusion 32

4. Development of the Computational Model 33

4. 1. Introduction 33

4. 2. Comprehension and Production 33

4. 3. Comprehension and Acquisition 34

4. 4. Architecture of the Model 34

4. 5. Parsing 35

4. 5. 1. Requirements of Acquisition 35

4. 5. 2. Theories of Syntactic Processing 37

4. 5. 2. 1. Ambiguity Resolution 41

4. 5. 2. 1. 1. Parsing Preferences 41

4. 5. 2. 1. 2. Lookahead 43

4. 5. 2. 1. 3. Predicate-Argument Frames and Thematic Roles 44

4. 5. 2. 1. 4. Context 45

4. 5. 2. 1. 5. Prosody 45

4. 5. 2. 2. Conclusion 46

4. 5. 3. Parsing and Representation in the Model 46

4. 5. 3. 1. Lexical-Functional Grammar 46

4. 5. 3. 1. 1. Levels of Representation 47

4. 5. 3. 1. 1. 1. C-Structure and F-Structure 47

4. 5. 3. 1. 2. Unification 49

4. 5. 3. 1. 3. Well-Formedness Conditions 50

4. 5. 3. 1. 4. LFG and Acquisition 50

4. 5. 3. 2. Development of the Parser 50

4. 5. 3. 2. 1. Implementation of Determinism 52

4. 5. 3. 2. 2. Summary of Parsing in the Model 57

4. 6. Acquisition in the Model 57

4. 6. 1. Representation in Acquisition 57

4. 6. 2. Acquisition as Parsing 58

4. 6. 3. Induction of Syntactic Categories 63

4. 6. 4. Summary of Grammar Rule Acquisition 64


4. 7. The Integration of Parsing and Acquisition 65

4. 7. 1. Meta-Knowledge and Acquisition 65

4. 7. 2. A Basic Assumption Which is Too Simple 66

4. 7. 3. A Problem in Acquisition 66

4. 7. 4. The Solution Implemented 68

4. 7. 5. Implications 70

4. 8. Comparison with L.PARSIFAL 70

4. 9. Summary and Conclusions 71

5. Phrase Structure Acquisition in the Computational Model 73

5. 1. Introduction 73

5. 2. The Problem of Phrase Structure Acquisition 73

5. 3. The Problem of Functional Morpheme Acquisition 75

5. 4. The Approach to the Problem of Phrase Structure Acquisition 76

5. 4. 1. The Innatist Approach 76

5. 4. 2. An Empiricist Solution 78

5. 4. 2. 1. The Role of F-Structure 78

5. 4. 2. 2. Revising the Model to Exploit F-Structure 81

5. 4. 2. 2. 1. The Role of Different Types of Input Utterance 81

5. 4. 2. 2. 2. The Role of Lexical Acquisition and Segmentation 82

5. 4. 2. 3. Development of the Phrase Structure Grammar 82

5. 4. 2. 4. Summary of the Empiricist Solution 83

5. 5. Lexical Recognition, Lexical Acquisition, and Segmentation 83

5. 5. 1. Bottom-Up Approaches to Lexical Recognition 84

5. 5. 2. Top-Down Approaches to Lexical Recognition 86

5. 5. 3. Bottom-Up Approaches to Lexical Acquisition 87

5. 5. 4. The Novel Top-Down Approach to Lexical Acquisition 90

5. 5. 5. Summary of Approaches to Lexical Processing 91

5. 6. Implementing Phrase Structure Acquisition 91

5. 6. 1. Assumptions 92

5. 6. 2. Lexical Recognition 93

5. 6. 3. Lexical Acquisition 94

5. 6. 4. Lexical Uncertainty and Lexical Lookup 98

5. 6. 5. Lexical Recognition During Acquisition 99

5. 6. 6. Modelling Phrase Structure Acquisition 100

5. 6. 7. Generalization and Syntactic Categorization 103

5. 6. 7. 1. Syntactic Category Splitting 104


5. 6. 8. The Acquisition of Recursive Phrase-Structure Rules 105

5. 6. 8. 1. Modelling the Acquisition of Recursive Constructions 105

5. 6. 8. 2. The Role of Generalization and Recursive Structures in Acquisition 112

5. 6. 9. Summary of Acquisition in the Revised Model 113

5. 7. Further Issues in Acquisition 113

5. 7. 1. Closure in Acquisition 113

5. 7. 2. Representation in the Revised Model 116

5. 7. 2. 1. Prepositional Phrases and Grammatical Functions 116

5. 7. 2. 2. Representing Well-formedness Conditions 117

5. 7. 3. The Bases for the Acquisition of Syntactic Structure 120

5. 7. 3. 1. F-Structure May Underdetermine C-Structure 120

5. 7. 3. 1. 1. The Role of Utterance Type and Input Ordering in Acquisition 121

5. 7. 3. 1. 2. The Case of the Verb Phrase 122

5. 7. 3. 2. Summary 124

5. 8. Comparison with Wolff’s SNPR 124

5. 9. Summary and Conclusions 125

6. Acquisition in the Model and in Children 127

6. 1. Introduction 127

6. 2. Some Issues in Functional Morpheme Acquisition 127

6. 2. 1. Order of Acquisition 127

6. 2. 2. Functional Morpheme Omission 128

6. 2. 3. Overregularization 128

6. 2. 4. Comprehension and Production 129

6. 2. 5. Issues in Learnability 129

6. 3. Existing Models 130

6. 3. 1. The Telegraphic Perception Hypothesis 130

6. 3. 2. Discrimination-Based Learning 131

6. 3. 3. Paradigm-Based Learning 132

6. 3. 4. Competition-Based Learning 133

6. 3. 5. Summary of Functional Morpheme Acquisition in Existing Models 133

6. 4. Functional Morpheme Acquisition in the Model Developed 133

6. 4. 1. Order of Acquisition in the Model 134

6. 4. 2. Functional Morpheme Omission in the Model 139

6. 4. 3. Comprehension and Production in the Model 141

6. 4. 4. Overregularization in the Model 142

6. 4. 5. Representational Adequacy in the Model 143


6. 4. 5. 1. Implicit Features in Lexical Entries 143

6. 4. 5. 2. Generalization and Syncretism 145

6. 5. Conclusion 149

7. Summary and Conclusions 151

7. 1. Overview 151

7. 2. Summary 151

7. 3. What is Innate and What is Acquired 153

7. 4. Implications and Future Directions 154

7. 4. 1. Extending the Model 155

7. 4. 1. 1. The Role of Semantics in Acquisition 155

7. 4. 1. 2. Later Stages of Acquisition 156

7. 4. 2. Knowledge and Acquisition 156

7. 5. Conclusion 157

Appendices 158

Appendix A An Example of Grammar Acquisition in the Prototype Model 158

Appendix B An Example of Grammar Acquisition in the Prototype Model,

Illustrating Data Structures Used in Parsing 160

Appendix C An Example of Acquisition in the Integrated Model of

Acquisition Processes 167

Appendix D An Example of Acquisition in the Integrated Model of

Acquisition Processes, Illustrating Data Structures Used in

Parsing 170

Appendix E An Example of the Acquisition of a Grammar and Lexicon

Enhanced with Semantic Information (to Replace LFG's

Well-Formedness Conditions) 177

Appendix F An Example of the Acquisition of a Grammar and Lexicon

Enhanced with Semantic Information, Illustrating Data

Structures Used in Parsing 180

Appendix G An Example of the Acquisition of Recursive Constructions 188

Appendix H An Example of the Acquisition of Recursive Constructions,

Illustrating Data Structures Used in Parsing 191

Appendix I An Example Illustrating That the Acquisition of a Recursive

Np Rule is Dependent upon the Ordering of Input Utterances:

A Sequence of Inputs Which Supports the Acquisition of a

Recursive Rule 202


Appendix J An Example Illustrating That the Acquisition of a Recursive

Np Rule is Dependent upon the Ordering of Input Utterances:

A Sequence of Inputs Which Supports the Acquisition of a

Non-Recursive Rule 207

References 212


List of Illustrations Page

Figure 4.1 Architecture of the Model 35

Figure 4.2a Preferred Analysis 38

Figure 4.2b Non-Preferred Analysis 38

Figure 4.3a Preferred Analysis 39

Figure 4.3b Non-Preferred Analysis 40

Figure 4.4 C-Structure 48

Figure 4.5 F-Structure 48

Figure 4.6 The Chart Data Structure 51

Figure 4.7 Attachment versus Invocation 53

Figure 4.8 Argument versus Adjunct 54

Figure 4.9a Attachment versus Attachment 55

Figure 4.9b Attachment versus Attachment 56

Figure 4.10 Representation in Acquisition 58

Figure 4.11 Partial Parse Tree 61

Figure 4.12 Completed Parse Tree 62

Figure 4.13 Failure to Trigger Acquisition Mode 68

Figure 4.14 Empiricism and Robustness 68

Figure 4.15 Acquisition in the Absence of Top-Down Constraints 69

Figure 4.16 Top-Down Constraints Ensure Early Recognition of Parsing Failure 69

Figure 5.1 The Problem of Phrase Structure Acquisition 74

Figure 5.2 Matching Lexical Items to the Semantic Input 74

Figure 5.3 Phrase Structure in Functional Morpheme Acquisition 75

Figure 5.4 X-Bar Theory Guides Rule Invocation 77

Figure 5.5 Directed Acyclic Graphs Represent F-Structure 79

Figure 5.6 Invalid Representation of F-Structure 80

Figure 5.7 The F-Structure as a Phrase-Structure Tree Template 80

Figure 5.8 The Integrated Model of Acquisition Processes 92

Figure 5.9 “Reversing” Unification to Extract Common Features 96

Figure 5.10 Extracting Common Features from Nested F-Structures 97

Figure 5.11 Failure to Extract a Unique Set of Common Features 98

Figure 5.12 Flexible Lexical Lookup 99

Figure 5.13 Input F-Structure 101

Figure 5.14 Functional Morpheme Acquisition 102

Figure 5.15 Phrase Structure Acquisition 103


Figure 5.16 Lexical Entry Revision Through Flexible Lexical Lookup 104

Figure 5.17 Analysis of Utterance Triggering Acquisition of a Recursive Rule 109

Figure 5.18 Recursive Rules Enable Parsing of Novel Complex Utterances 112

Figure 5.19 Closure and Completeness 114

Figure 5.20 Closure and Non-Standard Segmentation 115

Figure 5.21 Closure is Required for Isomorphism 116

Figure 5.22 Prepositional Phrases in LFG 116

Figure 5.23 Alternative Representation of Prepositional Phrases 117

Figure 5.24 Analysis of Prepositional Phrase Follows Analysis of Noun Phrase 120

Figure 5.25 Analysis of Prepositional Phrase Precedes Analysis of Noun Phrase 121

Figure 5.26 Analysis of Sentence Precedes Analysis of Verb Phrase 123

Figure 5.27 Competing Analyses of the Verb Phrase 123

Figure 6.1 Paradigm Representation of Inflected Forms 132

Figure 6.2 English Verb Inflections 135

Figure 6.3 Implicit Features in F-Structures of Uninflected Forms 143

Figure 6.4 Case-Marking Noun Inflections in Serbo-Croatian 148


1. Introduction

1. 1. Overview

The work described here can be viewed as addressing an important issue in child language

acquisition by means of the development of a computational model. The issue addressed is

that of how a phrase-structure grammar, the target grammar of natural language

acquisition, may be acquired. A distinction is made between acquiring phrase structure

and assuming it in the form of innate linguistic knowledge, such as the X-Bar Theory of

Phrase Structure. The model described here differs from other models in that it acquires a

phrase-structure grammar without assuming such innate linguistic knowledge. It is argued

that the empiricist account of phrase structure acquisition offered here provides a better

account of the observed features of child language development than do the innatist

alternatives.

1. 2. Grammar Acquisition and Induction

One way of characterising a natural language is in terms of its syntactic grammar. Given

this grammar, it is possible to deduce, for any string, whether or not it is a well-formed

string in the language. The problem facing the language learner has been viewed as the

opposite of this, induction as opposed to deduction. Given examples of legal strings in the

language, the learner is required to arrive at the underlying grammar. This leads to the

logical problem of language acquisition.

The logical problem of language acquisition arises from the fact that any sample of the

(infinite) set of well-formed strings in a natural language underdetermines the grammar,

being compatible with a number of alternative grammars (Gold 1967). The source of this

problem is induction which, unlike deduction, is logically invalid. In fact, the logical problem

of language acquisition is an example of the more general problem of reasoning by

induction.

There have been a number of responses to the logical problem of language acquisition.

One response involves the proposal of innate linguistic knowledge which constrains

acquisition such that only the appropriate grammar may be hypothesised. An alternative

response involves the rejection of the assumption that children receive as input to learning

only positive evidence, consisting of examples of well-formed strings in the language.

Gold (1967) showed that natural languages could be acquired given, additionally, negative

evidence, that is, examples of illegal strings in the language. Negative evidence could take

a number of forms, so that the question often framed of whether it plays a role in

acquisition is inherently ambiguous. Recent debates on the role of negative evidence have


centred on parents’ differential responses to children’s grammatical and ungrammatical

utterances (Bohannon et al 1990). The final response to the logical problem of language

acquisition mentioned here involves a rejection of the logical, or learnability-theoretic,

approach, with the goal of acquisition being viewed as the ability to use language rather

than the possession of a particular grammar.

The response taken here to the logical problem of language acquisition differs from all of

those outlined above. While accepting the logical problem of language acquisition, we

reject the assumption that acquisition-enabling constraints need be in the form of innate

linguistic knowledge. We propose that, by revising the current acquisition paradigm so as

to meet the requirements of learnability, factors internal to the processes of acquisition

may be identified which act as acquisition-enabling constraints.

The reasons for rejecting the standard responses outlined above are mentioned briefly

here and expanded upon in later sections. We argue against the assumption of innate

linguistic knowledge on the grounds that it results in models of acquisition which are too

powerful to account for the observed features of child language development.

Considerations of learnability cannot be ignored, since it is possible to characterise some

existing models as clearly deficient with respect to the grammar acquired. In developing a

computational model we are, at this stage, concerned only with utterances directed at the

child and not with feedback to the child’s own utterances. In this respect it is relatively

uncontroversial to assume a lack of negative evidence, since the overwhelming majority of

utterances directed towards children are grammatical (Newport 1977).

1. 3. The Computational Paradigm

The starting point for development of the model was an evaluation of existing models

which fall within a certain broadly-defined paradigm. These are concerned primarily with

the acquisition of syntax, in the form of a set of grammar rules for the language. The issue

of syntax acquisition is essentially abstracted away from wider issues in acquisition, on

the assumption that specific issues can be usefully explored within a very simple and

restricted model. An advantage of the computational approach is that any learning

processes proposed by a theory being modelled must be specified in detail. Generality is

to a certain extent sacrificed to this aim.

Another feature shared by the models discussed is the assumption that semantic

information plays an important role in the acquisition of syntax. It can be viewed as

facilitating acquisition by constraining the possible grammars proposed. Typical inputs to


acquisition include an input utterance paired with its semantic representation. It is

assumed that the latter can be inferred from context in the case of simple, child-directed

utterances. This is one example of a gross, but intentional, over-simplification. Learning

the grammar involves mapping the utterance onto its semantic representation. Another

assumption made for the purposes of enabling this mapping is that a lexicon of content

words has been acquired prior to syntax acquisition.

Models also differ along several dimensions. There is the distinction, already mentioned,

between models which do and do not assume various kinds of innate linguistic knowledge.

Here, the terms innatist and empiricist are used to distinguish models which do and do not

assume the innateness of the X-Bar Theory of Phrase Structure and the syntactic

categories in terms of which it is formulated. A related distinction is that innatist models

tend to be concerned with formal questions of learnability, while empiricist models tend to

be more concerned with capturing the observed features of child language development.

Models of acquisition may be either comprehension- or production-based. In the case of

the former, inputs consist of the utterance and its semantic representation, the task facing

the learner being to match the two to derive the syntactic structure. In the case of the

latter, the initial input is the semantic representation, from which, along with any existing

grammar rules, the model attempts to produce an utterance. Learning takes place when

the model’s production is compared with a further input, the target utterance. It is argued

here that, on the whole, existing models have failed to give due consideration to the

question of the different roles of comprehension- and production-related processes in

acquisition.

Comprehension-based models may be further divided into those which assume parsing

using existing rules and grammar rule acquisition are separate processes, and those which

view the two as essentially the same process. The integration of the processes would

seem to be necessary if the intention is to model children’s experience of language. We

argue that children’s lack of meta-knowledge as to when to use existing rules and when

to attempt acquisition is an important constraint on acquisition.

One of the issues addressed here is the inadequacy of the existing paradigm. This is

elucidated below.


1. 4. Evaluation of Existing Models

The models examined are a diverse enough group to make evaluation difficult. To a certain

extent, each has to be examined independently with respect to its particular set of

objectives. The approach taken here to a more general evaluation of the models as a group

specified two kinds of objectives in relation to which an assessment could be made. The

first of these was the learnability-theoretic objective of accounting for the endstate of

acquisition. For instance, if the categories in the target grammar require a syntactic

description, then a model which is only able to account for the acquisition of a semantic

grammar will be regarded as, in this respect, deficient. The second objective was

accounting for the observed features of child language development. This objective is

obviously vague as stated, but it does allow us to recognise those cases where some

criterion is clearly violated. For instance, if the model has adult syntactic categories from

the earliest stages of development, and children appear not to, then this is considered an

inadequacy.

The evaluation of existing models based upon the principles outlined above revealed the

lack of a satisfactory account of the acquisition of a phrase-structure grammar. Models

fell, broadly speaking, into three classes. One class of empiricist models could account

only for the acquisition of finite-state grammars. In order to acquire a phrase-structure

grammar, another class of essentially empiricist models relied on ad hoc, in effect

language-specific, assumptions, while recognising the inadequacy of this approach to the

problem. The final class of models incorporated the proposed innatist solution of assuming

the X-Bar Theory of Phrase Structure to be innate. However, while these models met the

endstate objective of acquiring a phrase-structure grammar, the learning mechanisms

were too powerful to account for the gradual nature of children’s language development.

The need for an empiricist model of phrase structure acquisition was thus suggested.

1. 5. The Approach to Phrase Structure Acquisition

The approach taken to the problem of phrase structure acquisition involved examining the

simplified paradigm of acquisition shared by existing models in order to identify

acquisition-enabling constraints. One aspect which was examined and retained was the

assumption of a correspondence between semantic and syntactic constituency. This

assumption is crucial since it provides those basic constraints upon which the possibility

of the acquisition of phrase structure in an empiricist model would seem to depend.

There is an apparent paradox in the innatist claim that X-Bar Theory is required to

account for the existence of phrase structure. The claim implies that the semantic inputs to


acquisition do not provide the basis for phrase structure, and yet the basis of X-Bar

Theory is semantic. The role of X-Bar Theory in innatist models was examined, revealing

that, while X-Bar Theory contributed information with a semantic basis, this was

expressed in syntactic terms. The same information was present in the semantic input to

acquisition, but here it wasn’t translated into syntactic terms. The question to be

addressed was, thus, whether the semantic relations underlying phrase-structure

relations could be acquired directly from the inputs to acquisition, given a revised

acquisition paradigm.

Revising the acquisition paradigm involved examining and rejecting some of the ostensibly

simplifying assumptions of existing models. One of these was the assumption that all

input to acquisition consists of grammatical sentences. This was replaced with the more

realistic assumption that input includes, for instance, noun phrases and isolated nouns, as

well as sentences (Newport 1977). The other assumption rejected was essentially that

syntax acquisition could be considered independent of related processes in acquisition.

This could be broken down into the assumptions that a lexicon and segmentation abilities

were acquired prior to syntax acquisition. The latter assumption was reflected in existing

models both in the lexicon and in the segmentation of utterances input to learning into

their constituent words or morphemes. The independent model of syntax acquisition was

replaced with an integrated model of acquisition processes, including grammar acquisition,

lexical acquisition, and segmentation.

1. 6. The Integrated Model of Acquisition Processes

The integrated model of acquisition processes was implemented in Prolog. It consists

essentially of a deterministic left-corner parser in which the switch to acquisition mode is

triggered when parsing using existing rules fails. No initial lexicon is assumed and input

utterances are represented as phonological strings. Lexical acquisition in the model is

triggered by failure of lexical lookup. Segmentation takes place as a side-effect of lexical

acquisition.
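
As a rough illustration of this control structure, the sketch below is given in Prolog, the implementation language of the model, but with hypothetical predicate names rather than the code of the thesis itself: parsing with existing knowledge is attempted first, and it is only the failure of that attempt which switches the system into acquisition mode, at which point the utterance is stored unanalysed with its F-Structure.

% Minimal sketch of the parse-then-acquire control flow described above.
% The predicates process/2, parse/2 and lex/2 are hypothetical; parse/2
% merely stands in for the deterministic left-corner parser.
:- dynamic lex/2.

parse(Utterance, FStructure) :-
    lex(Utterance, FStructure).               % succeeds only if already known

process(Utterance, FStructure) :-
    (   parse(Utterance, FStructure)          % comprehension with existing rules
    ->  true
    ;   assertz(lex(Utterance, FStructure))   % failure triggers acquisition:
    ).                                        % store the utterance unanalysed

% ?- process([the, cat], np(def, cat)).       % first encounter: acquired
% ?- process([the, cat], F).                  % second encounter: parsed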

The model initially acquires a finite-state grammar and a lexicon of large, unanalysed

chunks of language, such as phrases and sentences. Lexical acquisition uses existing

lexical knowledge. If a lexical entry for “cat” is required to parse an input utterance,

lexical acquisition might use an existing lexical entry for “the cat”. When “the” has also

been acquired, the lexical item “the cat” becomes redundant. This means that a grammar

rule is required to construct the noun phrase from its lexical constituents. As the units in

the lexicon are segmented into their constituent units, a phrase-structure grammar comes


to be acquired, replacing the earlier finite-state grammar. Syntactic categories are induced

in the model through generalization over the rules acquired, and through this process the

grammar becomes a recursive phrase-structure grammar.
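
The fragment below is a hedged Prolog sketch of this segmentation step (the predicate and structure names are illustrative, not those of the implementation): an unanalysed entry for "the cat", together with a later-acquired entry for "cat", leaves "the" as the residue proposed as a new lexical item, after which a rule is needed to rebuild the noun phrase from its parts.

% Sketch only: segmentation of a stored chunk against a newly acquired item.
:- dynamic lex/2.

lex([the, cat], np(def, cat)).      % early, unanalysed chunk
lex([cat], n(cat)).                 % smaller unit acquired later

% segment(+Chunk, -Residue): strip off a known item occurring at the end of a
% stored chunk, leaving the residue as a candidate new lexical item.
segment(Chunk, Residue) :-
    lex(Chunk, _),
    lex(Known, _),
    Known \= Chunk,
    append(Residue, Known, Chunk),
    Residue \= [].

% ?- segment([the, cat], R).        % R = [the]; an Np -> Det Noun rule is
%                                   % then required to reconstruct "the cat"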

Acquisition of the lexicon, segmentation, and acquisition of the grammar are intimately

related in this account, such that acquisition of the lexicon cannot be assumed without

affecting the course of syntax acquisition. Were lexical items like “cat” assumed, lexical

items like “the cat”, implicated in the acquisition of lexical entries for functional

morphemes like “the”, would never be proposed. This is one example of what is meant by

the claim that acquisition-enabling constraints are internal to acquisition, evolving

dynamically from it.

Learning in the model offers a better account of child language development than do the

innatist alternatives. Development, driven by the requirements of comprehension, is

smooth and continuous. It is gradual since innate linguistic knowledge and other

simplifying assumptions are removed. Acquisition of the target lexicon and of phrase

structure necessarily builds upon initially simple representations of knowledge. The units

in the initial finite-state grammar are not traditional linguistic units; however, they seem

consistent with the finding of extensive under-segmentation in children’s language

(Peters 1983, 1985). An interesting feature of acquisition is that it is non-deterministic

insofar as different inputs (including the same inputs in different orders) will result in

different units in the developing grammar and lexicon, with gradual convergence towards a

more uniform system of representation.

1. 7. The Approach to Evaluation

The intended status of the computational model is as a crude implementation of the

empiricist approach to phrase structure acquisition outlined, illustrating, most importantly,

that the integrated approach does make acquisition of a phrase-structure grammar

possible. There are certain aspects of the model which are clearly in need of refinement.

The model demonstrates that a phrase-structure grammar may be acquired given an ideal

semantic representation of utterances input. Consideration needs to be given to the

question of how acquisition processes may exploit imperfect semantic representations of

utterances. This is an issue which has to a certain extent been addressed by the

development of computational models focussing on children’s acquisition of lexical

semantics (Siskind 1992). The current implementation of lexical acquisition and

segmentation in the model is, again, primarily intended to demonstrate that integration of

these processes with syntax acquisition helps to resolve the main issue addressed. These


processes currently rely heavily on semantic information. A more detailed representation

of input utterances is required, so that learning processes can exploit multiple sources of

information. Nevertheless, the general approach to the lexical acquisition and

segmentation problem embodied in the model is novel and interesting in its own right. This

issue is returned to below.

The aim in developing the model was to work towards the formal objective of acquiring a

phrase-structure grammar, and then to evaluate the resulting model as an account of child

language development. This is the way in which the twin objectives of accounting for learnability and for the empirical data on child language development are combined. The high-level, qualitative description of the model given above suggests that learning is more

child-like than in the innatist alternatives. The further step taken was to examine in detail

a specific aspect of learning in the model, that of functional morpheme acquisition.

The evaluation of functional morpheme acquisition in the model was a non-trivial task.

While the model is comprehension-based, most available child language data relates to

production. In the case of functional morpheme acquisition, the predictions of the model

with respect to order of acquisition of functional morphemes were examined. When

previous metrics for evaluation were considered, the predictions of the model appeared

incorrect. The model, being generalization-based, seemed to predict that lexical entries for

functional morphemes defined in terms of a complex of features would be acquired before

those for functional morphemes defined in terms of a single feature. However, this

analysis was concerned with the acquisition of competence, or, rather, a set of target

lexical entries. An attempt was made to predict performance in language production during

functional morpheme acquisition. The result of taking language production into account

was that the model made correct predictions with respect to functional morpheme

acquisition. In the case of a feature like the progressive which maps onto and defines a

single morpheme, “ing”, the model predicted correct usage from the start. Where a

combination of three features was required to determine the appropriate morpheme to be

used, as in the case of the third person singular “s”, there was a period of uncertainty in

which errors or omissions were expected. Further issues related to functional morpheme

acquisition examined included functional morpheme omission, overgeneralization, and the

development of comprehension of functional morphemes ahead of their production.

1. 8. Secondary Issues Addressed

There are a number of issues addressed in addition to the main question of phrase

structure acquisition, some of which have already been touched upon above.


1. 8. 1. Top-Down Lexical Acquisition

The implementation of lexical acquisition and segmentation in the model proposes a

solution to a problem as important as that of phrase structure acquisition. The issue

addressed is that of how top-down accounts of lexical recognition, which are heavily

dependent on information stored in the lexicon, can be reconciled with an account of lexical

acquisition, which, it has been argued, presupposes bottom-up segmentation. The model

developed here implements a top-down approach to lexical acquisition which does not

presuppose segmentation. This is made possible simply by the adoption of a flexible

criterion as to what may constitute a lexical item during acquisition, a criterion which

appears to be supported by the observed features of early child language. The possibility

of reconciling the top-down approach to lexical recognition and segmentation with an

account of acquisition removes what has perhaps been the major objection to it.

1. 8. 2. Issues in Functional Morpheme Acquisition

The issue of functional morpheme acquisition has already been mentioned. We offer a new

analysis of the predictions of generalization-based accounts of functional morpheme

acquisition, as outlined above. The model also offers a novel but very simple account of the

onset of both functional morpheme omission and overgeneralization following periods of

correct use. This is made possible by the model’s assumption that segmentation abilities

are acquired alongside, rather than prior to, the grammar.

1. 8. 3. The Bases of Syntactic Structure

The model developed, as an empiricist model, relies heavily upon semantic inputs to

acquisition. However, we also suggest other sources of syntactic structure. Recursive

constructions, not explicitly represented in the semantic input to acquisition, are acquired

through generalization. As recursive rules may be acquired on the basis of utterances with

no more than two levels of embedding, it is only for these utterances that semantic inputs

need be assumed. We further suggest that constituents like the verb phrase, which do not

correspond to semantic constituents, may be proposed on the basis of non-sentential

input utterances. If greater use were made by the model of the evidence available as to

what constitutes a grammatical utterance, it is possible that the role of semantics in

bootstrapping could be further reduced and that the syntax acquired could even play a role

in the development of semantic representation.


1. 8. 4. Knowledge and Acquisition

The learner’s lack of meta-knowledge as to when to use existing knowledge and when to

attempt acquisition is identified as one important constraint on acquisition in a model

which aims at psychological plausibility. It means that constraints as to what can further

be acquired evolve dynamically during the course of acquisition as a direct result of the

knowledge acquired. This is an idea which may have implications beyond the model

developed. The constraints we identify in the model affect the learning of a single

language, but similar constraints may also affect multilingual acquisition, explaining how

the possession of one language inhibits the acquisition of another language.

1. 8. 5. Representation and Acquisition

The Lexical-Functional Grammar (LFG) formalism was chosen to represent the

knowledge acquired in the model. Its suitability is an issue necessarily addressed during

the course of developing the model. An interesting issue concerns LFG's well-formedness conditions of completeness and coherence. The model either had to assume

these as innate or demonstrate their acquisition. For convenience these were initially

assumed to be innate, but various considerations necessitated their removal from the

model of phrase structure acquisition developed. An enhanced grammar rule notation

which matches lexical items on the basis of subcategorization as well as syntactic

category information was proposed. Like LFG’s well-formedness conditions, this rules

out ungrammatical sentences like “the girl handed”. The advantage of the notation is that

the acquisition of subject optionality is facilitated, since constraints on grammaticality are

viewed as language-specific, and thus a property of the grammar acquired. Furthermore,

non-sentential constituents such as verb phrases may be characterized as well-formed or

malformed, a distinction not provided by LFG’s well-formedness conditions.
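
The DCG fragment below is a rough, illustrative rendering of this idea in Prolog, not the enhanced notation itself: the verb-phrase rule consults the verb's subcategorization frame, so "the girl handed" is rejected while "the girl slept" and "the girl handed the boy the book" are accepted.

% Sketch only: a rule matching lexical items on subcategorization as well as
% syntactic category.
s  --> np, vp.
vp --> [V], { verb(V, Subcat) }, arguments(Subcat).

arguments([])        --> [].
arguments([np|Rest]) --> np, arguments(Rest).

np   --> [the], noun.
noun --> [girl] ; [boy] ; [book].

verb(slept,  []).
verb(handed, [np, np]).

% ?- phrase(s, [the, girl, handed]).                      % fails
% ?- phrase(s, [the, girl, handed, the, boy, the, book]). % succeeds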

1. 8. 6. Implications for Innateness and Autonomy

The question of what is innate and what is acquired in language acquisition is an important

issue addressed by the development of the model. It is argued that it is not necessary to

assume X-Bar Theory and syntactic categories to be innate, and, similarly, that

grammaticality conditions are acquired rather than innate. The empiricist solution to the

problem of phrase structure acquisition involves considering the relationship of syntax

acquisition to other issues in acquisition. The implication of this is that syntax is not to be

viewed as autonomous. With respect to learnability, the model builds upon recent work

(e.g., Morgan 1986) which has been concerned with how acquisition may be constrained

and learnability achieved, other than by the assumption of innate linguistic knowledge.


1. 9. Summary

We accept the logical problem of language acquisition, while rejecting the assumption that

acquisition-enabling constraints need be innate. Phrase structure acquisition is a

particular issue in learnability which existing computational models have failed to address

satisfactorily. Our approach to the problem involves replacing the autonomous paradigm of

syntax acquisition with an integrated model of acquisition processes. The relationship of

syntax acquisition to other issues in language acquisition is important because, as any

kind of knowledge is acquired, it affects future acquisition.


2. Approaches to Child Language

2. 1. Aims

A statement of the aims of child language research which is broad enough to encompass a

variety of theoretical and methodological approaches is given by Levelt:

“The aim is to account for the acquisition of natural language skills on the basis

of the learner’s initial state of knowledge, the nature of the input, and the

learner’s inductive abilities …”

(Levelt 1992, p.291)

Here, we consider how computational models may contribute to this process. We examine

two major methodologies in child language research, observational/experimental

psycholinguistics, and the formal approach of Learnability Theory. There is surprisingly

little common ground between the two, each having a rather narrow set of objectives. The

result of this, we argue, is that each is in some respects insufficiently constrained. We

view computational models as ideally suited to address some of the same concerns as

both psycholinguistics and Learnability Theory. Their utility can thus be seen, at least

partly, in helping to bridge the gap between the other approaches discussed. In analysing

the role of computational models, we at the same time identify a set of objectives for their

evaluation.

2. 2. Learnability Theory

Learnability Theory is a formal approach to the problem of language acquisition. Its

objectives relate to setting out the conditions necessary for the acquisition of grammatical

competence. Those conditions focussed upon are the input to learning and the learner’s

pre-existing knowledge. The proposal of learning mechanisms and modelling the time

course of acquisition are viewed as objectives beyond the scope of Learnability Theory,

which takes an idealised, instantaneous view of acquisition. The approach also tends to be

linked with the assumption of the autonomy of syntax which licenses the view that the

acquisition of grammar rules can be logically separated from other issues in acquisition.

Here, we attempt to show how the limitations of Learnability Theory go beyond the

acknowledged limitations outlined.

Gold presented a formal proof that a natural language cannot be acquired in a finite amount

of time from positive evidence alone (Gold 1967). Given that the language input

underdetermines the grammar, the major recourse allowed by Learnability Theory is to the

proposal of innate linguistic principles. These are viewed as enabling acquisition by

constraining the possible grammars hypothesised. Wexler and Culicover (1980)

established a number of such principles, by means of formal proofs. Theirs was a proof of


Degree-2 learnability; that is, the principles proposed enabled acquisition given input with

two levels of embedding. We aim to show that there is an alternative response possible to

Gold’s proof.

Gold’s proof rests on the assumption that what is crucial is what is learned, not how it is

learned, and so the endproduct of language learning is abstracted away from the issues of

learning mechanisms and stages in acquisition. This methodological assumption appears

convenient, but it is not clear that it is valid. Learnability for natural languages is

evaluated by Gold in a situation where learning is viewed as a formal procedure whereby

possible grammars for a natural language are enumerated one by one and rejected as they

fail to account for the input. There are a number of grammars which accord with any finite

sample of language and selection of the appropriate grammar is made impossible in the

absence of negative evidence. We argue that this formal approach is not a neutral

description of learning which ignores questions about how learning proceeds. It is simply a

false description of how human language learning proceeds. Given that this description

renders language learning impossible, it seems that we do need to consider how learning

in the child differs from the process described. An alternative approach to the proposal of

innate linguistic knowledge is, thus, the consideration of alternative language learning

mechanisms. Outside of the framework of Learnability Theory, the predictions of

alternative learning mechanisms may be evaluated with respect to the empirical data on

child language development.

There is a general consensus that innate abilities play a role in language acquisition. What

is controversial is the proposal of abstract principles of a specifically linguistic nature.

Schlesinger (1982) argues that what is required is an explanation of how such principles

derive from humans’ information-processing capacities and limitations. Returning to the

point made above, a general, cognitive explanation of innate linguistic principles may be

equivalent to the proposal of alternative learning mechanisms. Braine argues against the

need to account for learnability in terms of abstract principles, on the grounds that the

function of language ensures that the appropriate conditions will be met:

“suppose we assume, plausibly, that only sentences that are processed

contribute to learning, and that the learner’s working memory is limited so that

the average complexity of the sentences that can be processed is less than two

degrees of embedding. Then the rules whose learnability depends on processing

of sentences with more than this complexity will not be learned, and languages

that contain such rules will be filtered out from the set of human languages.


Thus, no innate principles are needed just to guarantee learnability from simple

input.”

(Braine 1988, p.220)

The criticisms made above have been partly overcome by later work in Learnability

Theory. Morgan (1986) argues that the conditions required for Degree-2 learnability don’t

actually obtain, based on an analysis of speech directed at children. An alternative

Degree-1 proof is offered, which relies on the assumption that information about the

bracketing of phrasal constituents in utterances is available to children. Morgan argues

that prosody and the different types of utterance directed at children might be sources of

such bracketing information. Here, the available empirical evidence is thus used to

constrain the account given. However, insofar as the focus remains on the input to

acquisition, the limitations of Learnability Theory remain apparent.

Above, we have attempted to illustrate the limitations of a strictly learnability-theoretic

approach to language acquisition. The source of these can be seen as the methodology’s

neglect of questions concerning learning mechanisms and the non-instantaneous nature of

language acquisition. Computational models are clearly suitable for the implementation of

learning mechanisms, and these in turn generate empirically testable predictions

concerning the course of acquisition. This is illustrated by the L.PARSIFAL model

(Berwick & Weinberg 1984, Berwick 1985) in which Wexler and Culicover’s learnability

principles are claimed to arise naturally from the attempt to implement a psychologically

plausible model of parsing in acquisition. The model also generates predictions concerning

the relative order of acquisition of certain grammatical constructions.

An advantage of computational implementation is that it requires that the underlying

assumptions of theories be spelled out in detail. This has led to the view, taken by Pinker

(1979) and Wilkins (1989), of computational models as tools for the rigorous specification

of theories of the acquisition of grammatical competence, and thus as (merely)

alternatives to proofs in Learnability Theory:

“a theory that is powerful enough to account for the fact of language acquisition

may be a more promising first approximation of an ultimately viable theory than

one which is able to describe the course of language acquisition, which has

been the traditional focus of developmental psycholinguistics.”

(Pinker 1979, p.220)

The discussion above aims to illustrate that, rather, computational models have a role to

play in overcoming the deficiencies of a narrow, formal approach.


2. 3. Psycholinguistics

Psycholinguistics is concerned with the collection and characterization of empirical data on

child language development, and with the proposal of empirically testable theories of

acquisition. Our concern here will not be with the methods used to collect data; rather, we

focus upon the objectives of characterizing the data collected and using it in the testing of

hypotheses. In the same way that work in Learnability Theory has been insufficiently

constrained by empirical considerations, we argue that the inferences drawn from child

language data may be underconstrained due to a lack of attention to the formal objective of

learnability. We illustrate this point by examining grammars which have been proposed for

early child language.

Semantic grammars have been put forward in an attempt to characterize the earliest

utterances produced by children (e.g., Schlesinger 1982, 1988). It is argued here that the

problem with these is that, while they appear suitable with respect to the sample of

language in question, they are problematic for a theory of acquisition. The case against

semantic grammars has been made by Pinker (1984). He argues that they are not suitable

for inclusion in a continuous theory of acquisition, since the adult, syntactic grammar

cannot be acquired by building upon child grammars characterized in terms of semantic

categories. Semantic grammars of early child language are, thus, ruled out by the

constraint of learnability.

The telegraphic stage is so termed due to the fact that certain constituents, like functional

morphemes, are omitted from the utterances children produce. We use the term

“telegraphic grammars” to refer to those grammars which evince the assumption that any

functional morphemes omitted from the utterances children produce at this stage are also

not present as constituents in the grammar rules (e.g., Brown 1973; Hill 1983). There are

a number of problems with assuming such a grammar, some of which are put forward here.

Assuming that the grammar rules are acquired through comprehension, then some

explanation is required of how constituents present in input utterances come to be omitted

from the grammar acquired. It is thus commonly inferred that children fail to perceive or

notice grammatical morphemes, either because they tend to be unstressed, or because

they are semantically unimportant. The overall view of acquisition arrived at is problematic

from the viewpoint of learnability. It is not clear what role the telegraphic grammar plays in

the development of the adult grammar. Furthermore, a theory is required of how the child’s

perceptual abilities develop over time to enable acquisition of the adult grammar.


We have tried to illustrate above how inadequacy in accounts of child language may result

from insufficient attention to the formal requirement of learnability. However, there is no

principled reason why this consideration cannot be taken into account in order to constrain

the representations of knowledge proposed. Like Learnability Theory, psycholinguistics

can be seen as often failing to give due consideration to the learning mechanisms which

account for development from one stage to the next. Computational models, being required

to specify learning mechanisms as well as grammars, tend to be inherently more

constrained by the empirical data than are characterizations of individual stages in

development. Modelling the time course of acquisition makes it clear that any grammars

proposed must be suitable for modelling development and not just for describing the

language produced at a particular stage. Computational models can thus be viewed as a

useful addition to the methodology of psycholinguistics.

2. 4. Computational Models

We view Learnability Theory and psycholinguistics as each insufficiently constrained in

certain respects, such that a fusion of the objectives of each may enable us to maximize

the advantages and minimize the disadvantages of each. Our methodological framework

thus takes as its starting point the borrowing of objectives from both approaches. This

gives us the twin objectives, broadly stated, of accounting for learnability and for the

observed features of child language development. These are used to evaluate existing

models, resulting in the identification of the main problem addressed by the model

developed here, that of phrase structure acquisition. Since this is a formal, learnability-type problem, the solution implemented must be constrained by, and evaluated with

respect to, the child language acquisition data.

Combining the two kinds of objective in relation to the model developed is non-trivial. We

assume the need for a continuous model of acquisition; that is, one in which the same

basic mechanisms and representations are involved at earlier and later stages in the

acquisition process. This means that evaluation of the earlier stages of acquisition, with

respect to the endstate objective of learnability, is facilitated. A constraint which is not

necessarily available to evaluation is information concerning the relationship which exists

between the observed stages of child language development and the attainment of

learnability objectives. In fact, discovering at what observed stages certain developments

in the grammar take place is one of the tasks faced by a theory of language acquisition.

Our approach to making an evaluation with respect to the child language data involves a

high-level, qualitative assessment of learning in the model. Where specific phenomena,

such as functional morpheme acquisition, are modelled, it is possible to give a more


detailed evaluation of such aspects of the model and to predict other phenomena which the

model should account for at the same stage, or at preceding or following stages.

So far, our objectives have only been stated in the broadest terms. Below, we outline how

computational models may be evaluated with respect to the objectives identified of

learnability and of accounting for the observed features of child language development.

2. 4. 1. The Objective of Learnability

One way in which models of acquisition may be characterised is in terms of the type of

grammar acquired. A finite-state grammar is one in which all the constituents in rules are

lexical categories. A sentence, such as “the cat sat on the mat”, is described by a single

rule in a finite-state grammar:

S → Det Noun Verb Prep Det Noun

In the corresponding phrase-structure grammar, constituents in rules include non-terminal

categories which have further rules associated with them:

S → Np Vp

Vp → Verb Pp

Pp → Prep Np

Np → Det Noun

Recursion is a property characteristic of phrase-structure grammars typically used to

describe natural language. A recursive grammar is one in which a constituent of a certain

category may dominate (either directly or indirectly) another constituent of the same

category. For example, “The boy fell down” and “The girl said that the boy fell down” can

both be considered instances of the sentence-level category S.
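
The Prolog DCG below (a minimal sketch for illustration, not the grammar acquired by the model) makes the same point executable: because a Vp may dominate an S, the category S indirectly dominates another S, and embedded utterances such as the second example can be parsed. (Non-terminal names are written in lower case because Prolog non-terminals must be atoms.)

% Recursion arises because vp may expand to include another s.
s    --> np, vp.
np   --> det, noun.
vp   --> verb.
vp   --> verb, pp.
vp   --> verb, comp, s.      % "said that the boy fell down"
pp   --> prep, np.

det  --> [the].
noun --> [cat] ; [mat] ; [boy] ; [girl].
verb --> [sat] ; [fell, down] ; [said].
prep --> [on].
comp --> [that].

% ?- phrase(s, [the, boy, fell, down]).
% ?- phrase(s, [the, girl, said, that, the, boy, fell, down]).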

The endstate of natural language acquisition cannot be characterized in terms of a finite-state grammar (Chomsky 1957, pp.21-33). Models, therefore, need to account for the

acquisition of a recursive phrase-structure grammar, in which the constituents in rules are

syntactic (as opposed to, e.g., semantic) categories¹. A question which needs to be

addressed is that of how this endstate objective is to constrain models of early child

language. It can be interpreted as the requirement that the knowledge acquired should be

consistent with a smooth, continuous development towards the target grammar.

Alternatively, some fairly detailed explanation is required, in terms of learning or

maturational processes, of how the transition to the endstate is to be made.

¹ Chomsky goes on to argue that a phrase-structure grammar is also inadequate, in the weak sense that it is inelegant relative to alternative forms of linguistic description. We go on to use the Lexical-Functional Grammar formalism, rather than the kind of naïve context-free phrase-structure grammar to which this criticism applies.


2. 4. 2. The Objective of Child Language Development

The stages and aspects of acquisition of interest will vary with the model being examined.

Here, we focus upon the earliest stages, corresponding to Brown’s Stages I & II, as

outlined below. These are the stages most often modelled, and the language produced at

these stages appears to provide various interesting contrasts with the language input to

acquisition and, similarly, with the endstate of acquisition.

2. 4. 2. 1. A Stage Classification

Brown (1973) has given a very general account of child language development, describing

it in terms of a number of recognisable stages. His analysis is based on longitudinal

studies of three children, Eve, Adam, and Sarah, along with suitable data generated by a

number of other researchers. Brown’s stages form one suitable basis for a comparison of

the aspects of language acquisition addressed by the various models, as they are primarily

descriptive or theory-neutral, being based on actual samples of child language. Brown’s

stages, I to V, are associated with the “five major processes of sentence construction”

Brown identifies as constitutive of language acquisition (Brown 1973, p.32). While the

development of each of these actually spans a number of the five stages, their

development is orderly, and, thus, each process is identified as the major development of

one of the five stages. The characteristic processes of the stages are outlined below.

Stage I

Brown’s stages cover language development from the child’s first productive word

combinations. At Stage I the child is viewed as learning how basic semantic relations are

expressed in the language. The child learning English acquires this knowledge in the form

of the language’s canonical word order. While most of the utterances produced by the child

at this stage will express a single relation obtaining between two constituents, the child is

viewed as aware that the English main verb paradigm is ordered thus:

Agent - Action - Dative - Object - Location

Stage I speech is termed telegraphic as grammatical morphemes are generally absent and

obligatory major constituents are also omitted from individual utterances.

Stage II

This is identified as the stage at which the child begins to acquire grammatical morphemes

such as (in English) noun and verb inflections, prepositions, articles, and the copular verb

“be”. The order of acquisition of the different grammatical morphemes is remarkably

consistent across children. A prominent feature of children’s early use of the functional

morphemes is a stage of overgeneralization, or overregularization, of the rules governing

the application of these. This phenomenon is most apparent in the production of the past

tenses of verbs where forms like “goed” may be produced for irregular verbs, and in the

formation of plurals where “mans” and/or “mens” may be used to indicate the plural of

“man” (Brown 1973, p.325). Children apparently have the ability to recover from such

errors without explicit correction. The present progressive “ing” is the first of the

grammatical morphemes to be mastered by the children studied by Brown, and there is no

evidence of its being overgeneralized. Brown argues that this is because there is a

semantic, as opposed to an apparently arbitrary, distinction between those verbs which do

(e.g., “to walk”) and do not (e.g., “to like”) take this form. It is also the case that the

progressive is completely regular in that there is no irregular alternative form to “ing”.

Stage III

Stage III is identified with the acquisition of grammatical means for expressing non-

declarative sentence modalities, such as interrogatives, imperatives, and negatives. The

interrogative form is expressed in the earlier stages by means of rising intonation.

Stage IV

At this stage the child acquires complementation and control, the knowledge required for

the production of complex, embedded grammatical constructions.

Stage V

The acquisition of sentence coordination is the focus of this stage. Acquisition of

coordinated forms is ordered. For instance, the English-speaking child produces “and”

constructions before those containing “but”, “because”, or “so”.

What has been given above is a brief overview of the distinguishing features of Brown’s

developmental stages which may serve to constrain possible accounts of acquisition. The

question of how it may perform this function is considered below.

2. 4. 2. 2. The Role of the Empirical Data

At the stages of interest identified there are features which children’s language shares

with the adult language as well as obvious contrasts. With respect to word order, the

earliest utterances appear adult-like. An obvious discrepancy to be explained between the

language input and the utterances children produce concerns children’s omission of certain

constituents. Another difference is the lexical appearance of child language (Ninio 1988),

whereby the use of lexical items appears to be restricted to certain contexts, as

contrasted with the syntactic categorization appropriate to characterizing adults’

grammars. The challenge for computational models is to account both for the phenomena

characterizing early stages and the transition to later stages.

In evaluating computational models of child language acquisition, most of the empirical

data available to us relates to children’s performance in language production. This is a

potential source of difficulty, as models tend to focus upon comprehension and the

acquisition of grammatical competence. In relating learning in models to the empirical data,

the “Competence Assumption” is often implicitly made:

“Assume that the child’s linguistic competence is relatively close to the child’s

linguistic performance. That is, do not propose a linguistic construct until there

is evidence for it in the child’s performance.”

(Ingram 1989, p.76)

However, while this assumption may sometimes hold, it cannot be assumed to do so. The

proposal of telegraphic grammars, discussed above, provides one example of the uncritical

use of the Competence Assumption. Gerken (1987) provides experimental evidence to

suggest that the inference from the omission of functional morphemes in language

production to the failure to perceive them in comprehension is an invalid one. In evaluating

models against the empirical data, it is thus important to maintain a clear distinction

between comprehension and production.

2. 5. Summary

We have attempted to examine the role of computational models within the wider context

of research in child language development and to identify a set of objectives in relation to

which models may be evaluated. Computational models are able to address and be

constrained by both the formal requirement of learnability and the empirical data on child

language development. They are able to combine the objectives of the very different

approaches of psycholinguistics and Learnability Theory, since, unlike both of these, they

place emphasis upon the learning mechanisms whereby development is driven.

3. Computational Models of Child Language Acquisition

3. 1. Introduction

In this section we present an evaluative review of recent computational models of child

language acquisition. Our discussion takes as its starting point the objectives for

evaluation set out in the previous chapter, rather than a detailed description of the models

in question. This is because the primary objective of the review is to identify important

issues which remain to be addressed, rather than to provide a catalogue of existing

models. A detailed description of each of the individual models examined would not

provide an ideal basis for attempting an overall evaluation. This is because, while the

models discussed share a number of aims and assumptions, they nevertheless form a

diverse group.

The introductory chapter mentioned the common aims and assumptions of the models

discussed. The major shared aim is that of syntax acquisition. A major shared assumption

is that semantic information plays an important role in the acquisition of syntax. Such

information may be given in the form of a semantic representation of the input utterance,

assumed to be inferred from context, and also in a lexicon of content words, assumed to be

previously acquired. In assuming a lexicon, models also presuppose the ability to segment

the input utterance. A further assumption implicitly shared by the models is that syntax

acquisition may be usefully explored in isolation from other issues in child language

acquisition.

Models diverge in relation to a number of important issues. Various different assumptions

are made about processing, which may be either comprehension- or production-based.

Certain models assume, in addition to previously acquired knowledge like the lexicon, the

existence of innate linguistic knowledge. Models further differ from each other in their

specific objectives and in the assumptions made concerning evaluation.

Below we attempt an overall evaluation of the models in relation to the twin objectives,

identified in the previous chapter, of accounting for learnability and the observed features

of children’s language development. Features of the individual models are expanded upon

as is appropriate to the purpose of evaluation. An essential aspect of evaluation is an

examination of the validity of models’ underlying assumptions. We conclude by

highlighting an important issue which, we argue, the models discussed fail to address

satisfactorily. This is the problem of phrase structure acquisition, the main issue to be

addressed by the model developed here.

3. 2. The Objectives for Evaluation

In the previous chapter we presented two general objectives against which the different

computational models of child language acquisition may be evaluated. The objective of

learnability requires that an account be given of how the endstate of acquisition may be

achieved. The objective of child language development requires that models be consistent

with, and thus suggest accounts of, the observed phenomena of first language acquisition.

Although our discussion of objectives for evaluation precedes the review of the actual

models, it is to a certain extent dependent upon an examination of the models. It is

because the models, considered as a group, evince such obviously contrasting objectives

that we take into account both empirical and learnability-theoretic objectives in attempting

an overall evaluation. Similarly, we focus upon the type of grammar acquired as a

convenient metric for evaluation of the constraint of learnability, since this is one

dimension in which the models examined clearly differ.

In presenting our evaluation, we could consider each of the general objectives in turn.

However, the result of this would be, predictably, that one group of models would fare well

in relation to the objective of learnability, while a different group would appear to account

for certain observed features of early child language. Instead, we attempt a unified

evaluation, by taking pairs of apparently opposing and thus mutually constraining

objectives (one in each pair empirically-based, the other learnability-theoretic), and

comparing models’ alternative approaches to the unified objectives resulting from this

interaction. We focus upon two such pairs of objectives, as outlined below. These

objectives are chosen since they are sufficiently general to apply to all of the models

examined.

One objective we propose is that models should account for both the categories

appropriate to a description of children’s early grammars and the syntactic categories

characterising the target grammar. The first part of this objective is stated here in general

terms since the issue of what constitutes an appropriate description of children’s

grammars is considered below. A further objective for models is that they should account

for the initially simple appearance of children’s grammatical constructions as well as the

gradual acquisition of the target phrase-structure grammar, possibly from an initially less

powerful form of representation, such as a finite-state grammar. As well as being of wide

applicability, these objectives have the property of combining into a single objective, that

of gradually acquiring a syntactic phrase-structure grammar from an initially lexical

grammar.

3. 2. 1. The Categories in the Child and Adult Grammars

It is relatively uncontroversial to propose that models should eventually acquire a

grammar framed in terms of syntactic categories. As will become apparent below, this is

an objective which tends to be recognised even in those cases where it is not met. This is

because, although correspondences exist between semantic and syntactic levels of

representation, there is nevertheless a clear distinction to be made between the two. This

is apparent in the difference between violations of form, or well-formedness (e.g., “Bit

into the apple the boy”), and violations of content, or meaningfulness (e.g., “The apple bit

into the boy”). We thus assume the learnability-theoretic objective of acquiring a

syntactic grammar.

With respect to child language, it has long been argued that traditional syntactic

categorizations are inappropriate. Were children’s grammars framed in terms of the same

categories as adult grammars, we would expect them to employ words in a greater range

of grammatical constructions than they are observed to do:

“it can be safely generalized that children do not apply every combinatorial rule

to every term in their vocabulary to which such a rule could apply.”

(Ninio 1988, p.103)

This suggests that children do not initially acquire a syntactic grammar, rather than

providing conclusive evidence, since the grammar is clearly not the only determinant of the

utterances children produce. Below, we address the question of whether a more

appropriate characterization is possible.

Semantic grammars are the most popular alternative to syntactic grammars to have been

suggested for early child language. However, the problem with these, as outlined in the

previous chapter, is that they fail to meet the constraint of learnability, since they do not

provide a suitable basis for the acquisition of the target grammar (Pinker 1984). Ninio

(1988) presents an analysis of child language, on the basis of which she argues that the

most appropriate characterization of child grammars is lexical. Lexical categories are

similar to syntactic categories, except that a number of lexical categories are subsumed by

a single syntactic category. This means that a lexical grammar is, as required, less

powerful than the corresponding syntactic grammar, generating a smaller number of

sentences. However, it also appears to provide a suitable basis for the induction of a

syntactic grammar, syntactic categories being viewed as simply generalizations over

lexical categories:

“what is shared by the class of all adjectives, or of all verbs, etc., is precisely

their logical-semantic-syntactic common behaviour in higher-order word-

combinations.”

(Ninio 1988, p.116)

We thus assume the further objective of initially acquiring a lexical grammar.

Taking our learnability-theoretic and empirical objectives together results in the single,

highly constrained objective of acquiring a syntactic grammar on the basis of an initially

lexical grammar. Below, we outline the three major approaches suggested by the

computational models examined to the problem of categorization in the developing

grammar. We argue that only one of these approaches meets the requirements outlined

above.

3. 2. 1. 1. The Semantic Approach

Examples of the semantic approach to categorization in the grammar acquired are provided

by CHILD (Selfridge 1982, 1986) and AMBER (Langley 1982). In CHILD, words are

labelled with categories like actor, and lexical entries have a slot-filler notation to record

how the word order of the language acquired corresponds to language-neutral Conceptual

Dependency relations. In Langley’s AMBER, the production rules acquired are framed in

terms of semantic categories,

e.g.,

“If you want to describe node1, and node2 is the agent of node1, and you have

described node2, and node3 is the object of node1, then describe node3.”

(Langley 1982, p.233)

The motivation for using semantic grammars seems to be the recognition of the

inappropriateness of syntactic categorizations of child language, since Langley recognises

the inadequacy of AMBER with respect to the constraint of learnability:

“Another difficulty relates to the semantic basis of AMBER’s grammatical

rules. Although young children seem to employ semantically-based classes,

such as agent and action, they eventually replace these categories with more

syntactic ones …”

(Langley 1982, p.249)

However, the analysis given above suggests that semantic grammars are also not the

most appropriate for characterising early child language. The semantic approach to the

problem of categorization is thus ruled out.

3. 2. 1. 2. The Innatist Approach

In the work of Pinker (1982, 1984), the acquisition of the target syntactic grammar relies

on the assumption that syntactic categories are innate. Tied to this assumption is the

further assumption of innate knowledge of semantic-syntactic category correspondences,

these being viewed as necessary to enable semantic bootstrapping of the initial syntactic

grammar. L.PARSIFAL (Berwick & Weinberg 1984, Berwick 1985), which shares some

of Pinker’s assumptions but in a weaker form, is discussed in a separate section below.

A number of criticisms can be made of the innatist approach. The obvious problem is that it

leads to the prediction, not borne out by the empirical data, that, as children acquire lexical

items, they will use them in the whole range of possible contexts licensed by their

syntactic category labels. The approach can also be criticized as relatively unparsimonious

in its assumptions, compared with the alternative lexical approach outlined below. Given

these difficulties, it is useful to examine the rationale underlying the approach. One

motivation for assuming syntactic categories is that this is required for the assumption of

related linguistic knowledge, such as the X-Bar Theory of Phrase Structure. We argue

below that this is a further assumption that needs to be rejected.

3. 2. 1. 3. The Lexical Approach

The lexical approach is exemplified by, for example, LAS (Anderson 1977) and Hill’s

model of acquisition in a two-year-old child (Hill 1983). LAS acquires an Augmented

Transition Network (ATN) grammar. The “principle of minimal contrast” leads to

generalization over words which can occupy the same position in a sentence, with these

being given a shared category label. Where, for example, subject and object noun phrases

are recognized to be interchangeable, categorization is implemented by collapsing the

networks representing each into a single network. This process of generalization in LAS

is to a certain extent constrained by semantic considerations. Hill’s model acquires a slot-

filler grammar, in which categories are induced for both slots and fillers. Development from

the initially word-specific grammar is by the “Classification through Word Use

Hypothesis”:

“My hypothesis is the relatively radical one that the child may acquire a

multitude of small word classes by projecting word classes based on word

use.”

(Hill 1983, p.301)

Words are classified on the basis of the slots they may fill and, similarly, slots are

classified on the basis of words that may fill them. The lexical classes that emerge may

merge over time, with the induction of syntactic categories being the result of this process

of generalization.

While the models described differ in implementational features, the general approach is

the same. Each lexical item starts out belonging to a unique category, with equivalent

categories being merged on the basis of the distributional information acquired. This

information is, in fact, not only the basis for categorization, but the driving force behind it.

Grammars are initially lexically based and thus appropriate for characterizing early child

language. Generalization over word classes gradually merges relatively specific

categories, resulting in the development of a grammar with fewer and more general

classes. This induction of syntactic categories means that the requirement of learnability

is also satisfied.
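The general character of this distributional merging can be sketched as follows (an illustrative sketch under assumptions of our own, namely singleton initial categories, local word contexts, and merging on any shared context, rather than the specific mechanisms of LAS or of Hill's model):

# Illustrative sketch only: words start in singleton categories, which are merged
# when they are observed in overlapping distributional contexts.
from collections import defaultdict

class LexicalCategories:
    def __init__(self):
        self.category_of = {}               # word -> category id
        self.contexts = defaultdict(set)    # category id -> set of (left, right) word contexts

    def observe(self, utterance):
        words = utterance.split()
        for i, word in enumerate(words):
            cat = self.category_of.setdefault(word, len(self.category_of))
            left = words[i - 1] if i > 0 else "<s>"
            right = words[i + 1] if i + 1 < len(words) else "</s>"
            self.contexts[cat].add((left, right))
        self._merge_equivalent()

    def _merge_equivalent(self, overlap=1):
        # Merge any two categories sharing at least `overlap` contexts: a crude stand-in
        # for generalization over words which can occupy the same position.
        for a in sorted(self.contexts):
            for b in sorted(self.contexts):
                if a >= b or a not in self.contexts or b not in self.contexts:
                    continue
                if len(self.contexts[a] & self.contexts[b]) >= overlap:
                    self.contexts[a] |= self.contexts.pop(b)
                    for word, cat in self.category_of.items():
                        if cat == b:
                            self.category_of[word] = a

model = LexicalCategories()
for utterance in ["the cat sat", "the dog sat", "the cat ran"]:
    model.observe(utterance)
print(model.category_of)    # "cat" and "dog" (and "sat" and "ran") come to share a category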

Models incorporating the lexical approach appear to best meet the twin objectives of

accounting for the categories in both the initial child grammar and the target adult

grammar. Models incorporating this approach, in addition to those discussed above,

include Anderson’s later model, ACT* (Anderson 1983), and the Competition Model

(MacWhinney & Anderson 1986, MacWhinney 1987) which combines ACT* with

MacWhinney’s earlier model of the acquisition of morphophonology (MacWhinney 1978).

3. 2. 1. 4. The Hybrid Innatist/Lexical Approach

L.PARSIFAL (Berwick & Weinberg 1984, Berwick 1985) can be seen to incorporate

elements of both the innatist and lexical approaches outlined above. Syntactic category

information is assumed insofar as words in input sentences are labelled with features such

as [+Noun -Verb +Head] (in the case of object-naming words, or nouns). However,

syntactic category labels are not assumed, so that L.PARSIFAL can be viewed as

acquiring a syntactic grammar on the basis of an earlier lexical grammar.

In assessing L.PARSIFAL, it is necessary to examine the underlying assumptions as

well as its superficially meeting the objectives specified. The power of the initial grammar

acquired by L.PARSIFAL can be seen to be artificially restricted by the use of syntactic

features which bear no relation to category labels. The induction of a syntactic grammar is

redundant where the target syntactic grammar rules are already implicit in the earliest

rules acquired, given the syntactic features shared by lexical items. As in the innatist

approach, the justification underlying the assumption of syntactic features seems to be

their role in X-Bar Theory, the assumption of which is itself, we argue below, unjustified.

3. 2. 2. The Gradual Acquisition of Phrase Structure

The second objective in relation to which we evaluate models is the requirement that they

should account for the gradual acquisition of a phrase-structure grammar. This unified

objective is constituted by the learnability-theoretic objective that a phrase-structure

grammar be acquired, taken together with the empirical objective that complex

grammatical constructions be acquired by building upon initially simple constructions. We

briefly re-capitulate upon these objectives before considering to what extent they are met

by the models examined.

The reason for requiring the acquisition of a phrase-structure grammar is that a less

powerful finite-state grammar is inadequate for capturing certain natural language

constructions (Chomsky 1957). This objective is relatively uncontroversial, as is made

apparent by the inadequacies of certain models recognized by those responsible for their

development. An examination of the stages in child language development also makes

clear the need for a phrase-structure grammar, although without necessarily indicating the

very earliest stage at which such a description becomes appropriate. Examining Brown’s

five-stage classification of child language development, as outlined in the previous

chapter, enables us to suggest the very latest stage at which a phrase-structure

description of children’s language may become necessary. By Stage IV, the child is

generating utterances of the form “I said that S”, which can be assumed to imply that

there is a non-terminal symbol S in the grammar. Phrase structure needs to be accounted

for by this stage, and may be emerging during the earlier stages at which it is not evident.

The argument in favour of the need to account for complex grammatical constructions on

the basis of simpler constructions acquired earlier in acquisition is based upon observation

of the regularities in child language development. Some caution needs to be exercised,

however, with respect to this argument since, while it is based upon the observed stages

in children’s language production, models are often concerned, rather, with the acquisition

of grammatical competence. However, it does suggest the usefulness of exploring

alternative representations to the phrase-structure grammar for characterizing early child

language. These are, of course, to be constrained by the need to account for the eventual

acquisition of a phrase-structure grammar.

We can summarise the objective examined here as the need to acquire a phrase-structure

grammar on the basis of a simpler, but as yet unspecified, form of grammatical

representation. We outline below the two approaches offered by existing models, and

argue that neither of these offers a satisfactory account. While certain of the models

discussed acquire a phrase-structure grammar, these fail to account simultaneously for

the observed characteristics of early child language.

3. 2. 2. 1. The Empiricist Approach

The empiricist approach is that taken by the majority of models examined: LAS (Anderson

1977), CHILD (Selfridge 1982, 1986), Hill’s model (Hill 1983), AMBER (Langley 1982),

ACT* (Anderson 1983), the Competition Model (MacWhinney & Anderson 1986,

MacWhinney 1987), and BUD (Satake 1990). These models share the view that no

assumptions should be made about the child’s knowledge which are not justified by the

empirical evidence. One consequence of this is that the assumption of innate linguistic

knowledge, motivated by learnability-theoretic considerations, is disallowed. This is,

however, a principle which may be more or less strictly adhered to; for example, Satake’s

claim to such a viewpoint seems to depend, to an extent, upon a confused identification of

innate linguistic knowledge with linguistic theory:

“BUD has its own definitions for a main word of a sentence, subcategorization

information, a content word, a function word, and a phrase…”

(Satake 1990, p.46)

Below, we discuss two models which exemplify the main alternatives within this

approach.

Hill’s model acquires a simple finite-state grammar, apparently adequate for representing

the earliest utterances produced by children. The grammar acquired is telegraphic, it

having been assumed, on the basis of the language they produce, that children fail to

perceive grammatical morphemes in the earliest stages of acquisition. This approach has

already been mentioned as problematic but, for the purpose of discussing the issue at

hand, we assume that the type of grammar acquired, finite-state or phrase-structure, is

the relevant issue. The relevant criticism, in this respect, is that the model fails to account

for the eventual need to acquire a phrase-structure grammar. Hill, while arguing that the

finite-state grammar is adequate and appropriate in relation to the earliest stages of

acquisition, at the same time recognises:

“There is no doubt that the simple mechanisms proposed here as adequate for

the language of the 2-year-old will prove to be inadequate for language of any

complexity.”

(Hill 1983, p.315)

Hill suggests that progression beyond the two-word stage to hierarchical sentence

structure may be envisaged in terms of the embedding of templates in the slots provided

by other templates and in the concatenating of templates containing a common element,

e.g., Actor-Action + Action-Object → Actor-Action-Object. The problem with this

suggestion is that it omits reference to the crucial issue of what triggers, or drives, this

development.
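The concatenation operation itself is straightforward to state; the following toy illustration is based only on the description given here, not on Hill's implementation:

# Illustrative sketch only: two two-term templates sharing an element are
# concatenated into a single three-term template.
def concatenate(template1, template2):
    """Concatenate two templates which share their final/initial element."""
    if template1[-1] == template2[0]:
        return template1 + template2[1:]
    return None

print(concatenate(("Actor", "Action"), ("Action", "Object")))
# ('Actor', 'Action', 'Object')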

The alternative empiricist approach we examine is that of LAS (Anderson 1977).

Although LAS was not specifically intended as a model of child language acquisition, it can

be seen to have strongly influenced the development of AMBER (Langley 1982) and

ACT* (Anderson 1983), the latter in turn being incorporated into the Competition Model

(MacWhinney & Anderson 1986, MacWhinney 1987). An essential contribution to

learning in LAS is made by the “Graph Deformation Condition”:

“The claim is that the surface structure interconnecting the content words of the

sentence can always be represented as a graph deformation of the underlying

semantic structure. This implies that certain word orders will be unacceptable

ways to express certain semantic intentions.”

(Anderson 1977, p.333)

The constraints of the Graph Deformation Condition are built into a program, BRACKET,

which takes the input utterance, along with its meaning representation and topic, and

outputs a bracketing of the surface structure of the sentence, corresponding to its phrase

structure. Unlike Hill’s model, LAS and its successors acquire phrase-structure

grammars. Crucially, however, BRACKET is only able to arrive at the phrase structure for

a sentence by relying on ad hoc assumptions:

“Currently, all the function words to the left of a content word are placed at the

same level as the content word.”

(Anderson 1977, p.337)
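The effect of the quoted heuristic can be sketched as follows (based only on the description quoted above, not on Anderson's BRACKET program; the content-word list is assumed purely for illustration):

# Illustrative sketch only: group each content word with the function words
# immediately to its left, yielding a flat bracketing of the utterance.
CONTENT_WORDS = {"boy", "left", "dog", "park"}    # assumed for this example

def bracket(words):
    groups, pending = [], []
    for word in words:
        pending.append(word)
        if word in CONTENT_WORDS:
            groups.append(pending)
            pending = []
    if pending:                                   # trailing function words, if any
        groups.append(pending)
    return groups

print(bracket("the boy left the dog in the park".split()))
# [['the', 'boy'], ['left'], ['the', 'dog'], ['in', 'the', 'park']]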

This heuristic, in effect, constitutes language-specific knowledge built into the model, and

thus does not offer a solution to the general problem of how a phrase-structure grammar

may be acquired.

The empiricist models examined do not provide a satisfactory solution to the problem of

acquiring a phrase-structure grammar. Certain models acquire a finite-state grammar,

adequate only for early child language, while others acquire the target grammar, but only

by relying on ad hoc assumptions. There is no attempt to model the transition from the

former type of grammar to the latter. This is important since a phrase-structure grammar

may be inappropriate in relation to the earliest stages of child language. The need to

consider less powerful representations is suggested by, for example, Anderson’s

arbitrarily restricting learning in ACT* to one rule per utterance, in order to arrive at a

more reasonable model of child language.

3. 2. 2. 2. The Innatist Approach

This approach is found in the work of Pinker (1982, 1984) and in the development of

L.PARSIFAL (Berwick & Weinberg 1984, Berwick 1985). It is distinguished by the

assumption that innate linguistic knowledge, in the form of the X-Bar Theory of Phrase

Structure, provides the solution to the problem of acquiring a phrase-structure grammar.

We discuss L.PARSIFAL and the work of Pinker separately, since they provide radically

different examples of the innatist approach.

Evaluation of Pinker’s work is made difficult by the fact that he proposes various

computational principles implicated in acquisition, rather than describing an implemented

computational model. However, a clear difficulty is that the assumption of X-Bar Theory

means that relatively complex grammatical constructions can be represented from the

start of acquisition, whereas children appear to acquire these only by building upon

knowledge of simple sentences. That is, Pinker fails to demonstrate the gradual

acquisition of a phrase-structure grammar. Since X-Bar Theory is what enables

instantaneous acquisition, it seems that this constitutes too strong an assumption.

However, it is also worthwhile examining L.PARSIFAL, since this incorporates X-Bar

Theory in a much more restricted model of acquisition.

L.PARSIFAL attempts to incorporate humans’ working memory limitations. These are

reflected in deterministic parsing, the failure of which results in a switch to acquisition

mode. The parser is a Marcus parser (Marcus 1980), with an Active Node Stack and

three-cell Input Buffer which provide the local context within which parsing decisions must

be made. Acquisition is itself tightly constrained, since it may not be attempted

recursively. The leftmost constituent in the Input Buffer is attached to the currently active

node at the top of the Stack, where this is permitted by X-Bar Theory, resulting in the

acquisition of a base phrase-structure rule. Since the grammar acquired is a

transformational-type grammar, three alternative actions are also permitted in acquisition:

switching the contents of the leftmost two Buffer cells, inserting a lexical item, or inserting

a trace. Where one of these actions is successful, it results in the acquisition of a

movement-type rule.
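The acquisition step just described can be paraphrased schematically as follows (the data structures and the xbar_permits predicate are invented for illustration; this is a sketch of the description above, not Berwick's implementation):

# Schematic paraphrase only: try the four acquisition actions in a fixed order.
class Constituent:
    def __init__(self, category, children=None):
        self.category = category
        self.children = children if children is not None else []

def acquisition_step(stack, buffer, xbar_permits, insertable_items):
    """Attempt the four acquisition actions in order; return any rule acquired."""
    active_node = stack[-1]

    # 1. Attach the leftmost Buffer constituent to the active node, where X-Bar Theory permits.
    if buffer and xbar_permits(active_node, buffer[0]):
        constituent = buffer.pop(0)
        active_node.children.append(constituent)
        return ("base-rule", active_node.category, constituent.category)

    # 2. Switch the contents of the two leftmost Buffer cells.
    if len(buffer) >= 2 and xbar_permits(active_node, buffer[1]):
        buffer[0], buffer[1] = buffer[1], buffer[0]
        return ("movement-rule", "switch")

    # 3. Insert a lexical item into the Buffer.
    for item in insertable_items:
        if xbar_permits(active_node, item):
            buffer.insert(0, item)
            return ("movement-rule", "insert-lexical-item", item.category)

    # 4. Insert a trace.
    trace = Constituent("trace")
    if xbar_permits(active_node, trace):
        buffer.insert(0, trace)
        return ("movement-rule", "insert-trace")

    return None    # acquisition fails; it may not be re-attempted recursively

# e.g., with a permissive stand-in for X-Bar Theory, the first action succeeds:
print(acquisition_step([Constituent("Vp")], [Constituent("Np")],
                       xbar_permits=lambda node, c: True, insertable_items=[]))
# ('base-rule', 'Vp', 'Np')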

L.PARSIFAL does appear to model the gradual acquisition of a phrase-structure

grammar. Specific predictions are made about the order in which various grammatical

constructions will be acquired; for instance (from earliest to latest acquired), simple

declarative, passive, yes-no question, embedded sentence, wh question, and embedded

missing subject sentence (Berwick 1985, p.12). While this appears to suggest that the

order of acquisition is from simple to complex, even the earliest rules acquired constitute a

phrase-structure grammar which may be more powerful than children’s earliest grammars.

To the extent that L.PARSIFAL does appear to meet the objective of gradual phrase

structure acquisition, it is necessary to examine the validity of the assumptions which give

rise to this result.

The problem with L.PARSIFAL, we argue, is that it consists of a powerful model in which

acquisition is restricted by the imposition of arbitrary constraints. Acquisition is

potentially powerful due to X-Bar Theory which, in effect, constitutes a superset of the

rules to be acquired, and due also to the four actions allowed in acquisition. The constraint

of deterministic parsing has a relatively sound basis; however, the same cannot be said of

the constraints on acquisition itself. The appropriate order of acquisition is predicted

because the four actions are attempted in a fixed order and because recursive acquisition

is disallowed. There is no independent justification for the order in which actions are

attempted. The constraint of non-recursive acquisition is viewed as simply a parallel

constraint to that of deterministic parsing. However, the analogy is clearly problematic:

were recursive acquisition disallowed by working memory limitations, there would be no

need for an additional constraint since that of determinism would suffice. The constraint of

non-recursive acquisition is, rather, necessitated by the existence of actions in acquisition

which mean that the underlying model is wildly unconstrained.

The innatist approach outlined above does not offer a satisfactory solution to the problem

of gradually acquiring a phrase-structure grammar. The assumption of X-Bar Theory tends

to result in models too powerful to account for early child language. To the extent that

acquisition is gradual, and the order of acquisition is from simple to complex, this is due to

the imposition of arbitrary constraints. There is no attempt to model the transition to a

phrase-structure grammar from a less powerful type of grammar, since the latter is ruled

out by the assumption of X-Bar Theory.

3. 2. 3. An Overall Evaluation

The above discussion suggests that phrase structure acquisition is an important issue

which existing models have failed to address satisfactorily. While certain of the models

acquire a phrase-structure grammar, these fail to simultaneously meet the requirement of

accounting for the observed features of child language development. A solution to the

problem of gradually acquiring a phrase structure grammar should take into account the

broader objective of acquiring a syntactic phrase-structure grammar on the basis of an

earlier lexical grammar. Those models which induce syntactic categories on the basis of

lexical categories suggest a partial solution to this overall problem. Below, we extend our

analysis to consider the issue of what kind of approach is required to the problem of

phrase structure acquisition. For this purpose it is convenient to retain the distinction,

made above in discussing approaches to phrase structure acquisition, between innatist

and empiricist models.

While all models assume certain innate abilities, we use the term innatist here specifically

to refer to those models which assume two kinds of innate linguistic knowledge, the X-

Bar Theory of Phrase Structure (of the innatist approach to phrase structure acquisition)

and syntactic categories (of the innatist approach to categorization). These assumptions

tend to be found together, since it is impossible to assume X-Bar Theory without also

assuming syntactic categories, as opposed to modelling their induction. We use the term

empiricist to denote any approach which is not innatist. The empiricist approach thus

subsumes both the semantic and lexical approaches to the problem of categorization in

acquisition, as well as the empiricist approach to phrase structure acquisition.

We consider two approaches to the inadequacies identified in existing models. An innatist

approach would involve proposing yet further innate constraints on models which are too

powerful as models of child language development. An empiricist approach would involve

finding a principled solution to the problem of gradually acquiring a phrase-structure

grammar on the basis of initially simple representations. We suggest that a number of

considerations favour the empiricist approach.

The innatist approach of proposing further constraints on powerful mechanisms has

already been implemented in L.PARSIFAL where, we argue, these further constraints are

arbitrary and unjustifiable. Even were alternative constraints to be proposed, there are

good reasons to suppose that this would not result in a satisfactory solution to the

problems identified. Modelling the induction of syntactic categories is incompatible with

the innatist approach, as is assuming an intrinsically less powerful form of representation

than the phrase-structure grammar in the early stages of acquisition. A further problem is

that any proposed innate constraint must be truly universal, or models risk sacrificing the

aim of language neutrality. For example, Braine (1992, p.88) has criticized Pinker’s model

as reliant on the assumption that in the language being acquired, as in English, the

canonical order of constituents is Subject-Verb-Object.

Certain empiricist models already offer a satisfactory solution to the problem of

categorization in acquisition, by modelling the acquisition of a syntactic grammar on the

basis of an initially lexical grammar. Furthermore, unlike the innatist approach, the

empiricist approach is compatible with the initial acquisition of a less powerful form of

representation than a phrase-structure grammar. The question which remains to be

addressed is whether such a representation may form the basis for acquiring the target

phrase-structure grammar, in the same way that lexical categories form the basis for the

induction of syntactic categories.

3. 3. Conclusion

Given the twin objectives of accounting for the endstate of acquisition and the observed

stages in children’s linguistic development, phrase structure acquisition emerges as an

important issue which existing models have failed to address satisfactorily. While gradual

acquisition in certain empiricist models appears reasonable in relation to early child

language, the transition to a phrase-structure grammar is missing from such accounts. The

instantaneous acquisition of phrase-structure grammars by innatist models, on the other

hand, means that these are too powerful as models of child language development. We

argue in favour of an empiricist approach to the inadequacies identified. Further reasons

underlying the approach chosen are outlined in the following chapter.

4. Development of the Computational Model

4. 1. Introduction

This chapter is concerned with the development of the parser which forms the basis of the

model used to explore the issue of phrase structure acquisition in the following chapter.

The claim is not that all acquisition of grammar rules takes place during parsing; rather,

the current model is restricted to those aspects of acquisition which it is appropriate to

view as taking place during comprehension. This idea is expanded upon below. It having

been determined that a parser will form the basis of the model, issues underlying the

choice of parser are discussed. While parsing is not central to the issues in acquisition

addressed, a reasonable parser is required if the relationship between parsing and

acquisition is to be explored. Parsing and acquisition in the model are then described,

acquisition being viewed as parsing in the absence of the appropriate grammar rules. For

the purposes of developing the model, a number of standard, simplifying assumptions are

made; for example, segmentation of the utterances input to acquisition, a lexicon of content

words, and the pairing of utterances with fully specified semantic representations. Some of

these oversimplifications are removed in the following chapter, while the basic architecture

of the model described here remains unchanged.

4. 2. Comprehension and Production

The majority of models described in the review of existing models are comprehension-

based. That is, grammar rules are acquired through the analysis of utterances which form

the input to the model. A question which is not often explicitly addressed in the

development of computational models concerns the nature of the relationship assumed to

exist between the grammar acquired through comprehension and the utterances output in

language production. For example, Hill (1983) assumes that the utterances produced must

be transparently reflected in the grammar acquired. Ingram (1989) has termed this view,

that competence can be inferred from performance, the “Competence Assumption”. Pinker

(1984) does explicitly refer to the issue of the relationship between competence and

performance. He suggests that a production bottleneck may account for the apparent

discrepancies between the grammar acquired in his model and the utterances children are

observed to produce.

Production-based models include Langley’s AMBER (Langley 1982) and Anderson’s

ACT* (Anderson 1983). These have been offered as alternatives to comprehension-based

models. As such, they are clearly problematic. Learning involves utterances being

produced by the model and then compared with target utterances. The earliest utterances

produced may bear no obvious resemblance to grammatical sentences in the language.

That is, unlike children, these models appear to lack an underlying competence as evinced

by, for example, appropriate word order. The source of the problem would seem to be the

fact that no consideration has been given to the role of comprehension in acquisition. A

further apparent problem is the reliance on negative evidence. However, here again it

seems that a clear distinction between comprehension and production needs to be made.

The suggestion that input to comprehension includes negative evidence (e.g., examples of

ungrammatical utterances) is clearly controversial. However, re-castings of children’s

utterances (Bohannon et al 1990) may constitute negative evidence relevant to acquisition

in production.

The model developed here is concerned with the comprehension-based aspects of

acquisition. As such, it should be viewed as a component in a larger model of acquisition in

which production-related processes also have a role. There is no commitment to the

Competence Assumption. However, this does not mean that any kind of output from

comprehension may be licensed and, thus, evaluation made impossible. The grammar

acquired will constrain the possible accounts of child language phenomena which are

compatible with the model. An evaluation of how learning in the model compares with

certain observed features of child language is given in a later chapter.

4. 3. Comprehension and Acquisition

The majority of models discussed acquire a grammar through the analysis of input

utterances. However, only L.PARSIFAL (Berwick & Weinberg 1984, Berwick 1985) is

strictly comprehension-based, in that acquisition is triggered during the course of normal,

rule-based parsing if this cannot continue using existing rules. In other models,

comprehension and acquisition are implemented as separate processes.

Intuitively, comprehension using existing rules and comprehension utilising semantic

information during acquisition are one and the same process. It is the nature of the input

which determines whether or not any learning takes place. If the utterance can be parsed

using existing rules, no acquisition will take place. Where parsing on the basis of existing

rules is not successful, successful acquisition will depend upon the appropriate

combination of existing knowledge and contextual information being available. Thus, in the

model developed, comprehension and acquisition are fully integrated.

4. 4. Architecture of the Model

The top-level architecture of the model is given below (Figure 4.1). At this stage what is

indicated is the central role played by the parser in the model of acquisition.

Parsing uses the grammar and lexicon to construct a parse tree and semantic

representation for the utterance. The switch to acquisition mode can take place, in the

event of parsing failure, if a semantic representation of the utterance is available. Parsing

in acquisition involves inferring the syntactic relations which hold amongst constituents on

the basis of semantic information. This semantic information has two sources: the lexicon,

and the context in which the utterance is encountered. Successful acquisition involves

performing mappings between the two kinds of information. Where a parse tree for the

utterance can be constructed in this way, new grammar rules are acquired. At this stage

the question of lexical acquisition is not addressed.

[Figure 4.1 Architecture of the Model: the PARSING component, drawing on the GRAMMAR and LEXICON, takes the utterance and a semantic structure as inputs and outputs a syntactic structure and a semantic structure.]

The remainder of this chapter is concerned with the issues of parsing, representation, and

acquisition in the model.

4. 5. Parsing

The development of the parser is determined jointly by consideration of psycholinguistic

theories of syntactic processing in humans and the specific requirements imposed by the

model of acquisition.

4. 5. 1. Requirements of Acquisition

The integration of comprehension and acquisition raises the question of how the switch to

acquisition mode is to be made when parsing using existing rules fails. It is necessary to

detect the point of failure, recover the parse tree built at the point of failure, and ascertain

what input remains to be consumed. Meeting these requirements is facilitated if the

model, like L.PARSIFAL, is based upon a deterministic parser (although not necessarily a

Marcus parser). That is, if a single parse tree is constructed, and the syntactic structure,

once built, cannot be undone through backtracking. As Berwick and Weinberg (1984,

p.232) point out, failure could result in either backtracking or the switch to acquisition

mode in a non-deterministic parser. Their claim assumes a depth-first strategy. A

breadth-first non-deterministic parser could be used to construct all possible partial

parses. The difficulty would then, however, be selecting the appropriate partial analysis for

acquisition to build upon in the case of parsing failure. Deterministic parsing is convenient

since it eliminates the possibility of backtracking and the need to select from amongst

alternative analyses when parsing failure arises. However, the unique point of parsing

failure is only as reliable as the principles underlying ambiguity resolution in the

deterministic parser, an issue which we discuss in the following section.

Determinism in turn suggests that a largely bottom-up approach to parsing is appropriate.

Pereira (1985, p.318) notes that the class of (non-natural) languages which can be parsed

deterministically bottom-up is larger than the corresponding top-down class. Natural

languages are, of course, not in a class of languages which can strictly be parsed

deterministically using either top-down or bottom-up approaches:

“There is a fair-sized subset of the context-free grammars for which

deterministic parsing algorithms can be built. While natural language is

certainly not one of these because of its ambiguous nature, these techniques

may be able to be extended in various ways to make them applicable to natural

language parsing.”

(Allen 1987, p.166)

However, the advantage for bottom-up over top-down approaches extends to natural

languages, as can be illustrated below by a simple example.

The verb phrase can be characterised by a number of rules, for example:

Vp → Verb Np

Vp → Verb Comp

Vp → Verb Np Pp

Vp → Verb Pp

Vp → Verb

In the top-down parsing of a sentence, e.g., “Mary sang to John”, the choice of Vp rule is

made before any of its constituents are encountered. If any rule other than

Vp → Verb Pp

is selected, then failure will result when the preposition “to” is encountered. In contrast,

in bottom-up parsing, rules are only selected so long as they match the constituents

found. When the verb is found, no commitment need be made to a particular Vp rule since

control of the parsing process is independent of such rules. The dotted rule convention may

be used:

Vp → Verb . Np

Vp → Verb . Comp

...

The prepositional phrase is built bottom-up, and the appropriate rule matched, so that

parsing is successfully completed:

Vp → Verb Pp .
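The flavour of this bottom-up control, in which rules are matched only as their constituents are found, can be illustrated with a toy recogniser (an illustrative sketch only: it is non-deterministic and exhaustive, and is not the deterministic, left-corner parser adopted for the model):

# Illustrative sketch only: naive bottom-up recognition of "Mary sang to John",
# with no early commitment to any particular Vp rule.
RULES = {
    "S":  [["Np", "Vp"]],
    "Vp": [["Verb", "Np"], ["Verb", "Comp"], ["Verb", "Np", "Pp"], ["Verb", "Pp"], ["Verb"]],
    "Pp": [["Prep", "Np"]],
    "Np": [["Pn"]],
}
LEXICON = {"Mary": "Pn", "John": "Pn", "sang": "Verb", "to": "Prep"}

def recognise(words, goal="S"):
    n = len(words)
    complete = [[set() for _ in range(n + 1)] for _ in range(n + 1)]   # categories over words[i:j]
    for i, word in enumerate(words):
        complete[i][i + 1].add(LEXICON[word])
    changed = True
    while changed:                      # repeatedly advance dotted rules over found constituents
        changed = False
        for lhs, expansions in RULES.items():
            for rhs in expansions:
                for start in range(n):
                    ends = {start}
                    for constituent in rhs:     # move the dot across the right-hand side
                        ends = {j for i in ends for j in range(n + 1)
                                if constituent in complete[i][j]}
                    for end in ends:
                        if lhs not in complete[start][end]:
                            complete[start][end].add(lhs)
                            changed = True
    return goal in complete[0][n]

print(recognise("Mary sang to John".split()))   # True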

Control in existing deterministic parsers is similar to the example of bottom-up parsing

above. Pereira’s shift-reduce parser (Pereira 1985) and the family of Marcus-type parsers

(Marcus 1980; Berwick & Weinberg 1984, Berwick 1985; Milne 1982, 1986) can be

characterised as left-corner parsers. That is, rules are invoked when their leftmost

constituents are found. Control of parsing proceeds in a bottom-up fashion, but may be

constrained by top-down prediction.

A further consideration favours a bottom-up approach to parsing. In acquisition mode, the

rules which would normally guide top-down parsing are not available. Parsing in

acquisition is thus most readily envisaged as data-driven, albeit constrained top-down by

the restrictions on tree-building provided by the semantic input to acquisition. This is

because, while the semantic input does provide information of a top-down nature, it is

language-neutral and thus can only constrain rather than predict the shape of the parse

tree. The switch to acquisition mode will be facilitated if parsing using existing rules is

similarly bottom-up.

4. 5. 2. Theories of Syntactic Processing

Theories of syntactic processing are concerned with characterising human linguistic

performance, as contrasted with grammatical competence. In comprehension, one aim is to

explain how it is that, given an input sentence which is structurally ambiguous according

to the grammar, the parser arrives at one particular interpretation rather than another.

Ambiguity resolution is focussed upon, since it is in relation to ambiguity that competing

theories make different, testable predictions.

[Figure 4.2a Preferred Analysis: parse tree for "the boy left the dog in the park", with the Pp "in the park" attached to the Vp.]

[Figure 4.2b Non-Preferred Analysis: parse tree for the same sentence, with the Pp attached to the object Np "the dog".]

It is difficult to separate the data on preferred analyses from the different theories it is

used in support of, but the following examples are illustrative and relatively

uncontroversial1. Where a prepositional phrase can be attached to the verb phrase (Figure

4.2a) or to the preceding noun phrase (Figure 4.2b), the former analysis tends to be

preferred. Where the correct and preferred analyses differ, recovery may be difficult, as

illustrated by the (somewhat artificial) phenomenon of garden path sentences. An

example is the sentence “We painted the walls with cracks”, where the only semantically

acceptable interpretation requires a non-preferred syntactic analysis (Figure 4.2b).

A further example of preferred and non-preferred analyses is provided by the attachment

of prepositional phrases to verb phrases. Where a prepositional phrase can be attached to

alternative verb phrases, attachment to the lower, rightmost Vp node (Figure 4.3a), rather

than that above it and to its left (Figure 4.3b), is predicted.

[Figure 4.3a Preferred Analysis: parse tree for "Sarah noted that Emily had written in her diary", with the Pp "in her diary" attached to the lower, embedded Vp.]

1 The validity of using isolated sentences to illustrate preferences is itself controversial, as will be made clear below.

A shared assumption of theories of parsing is that ambiguity resolution takes place within

a local, as opposed to sentence-global, context. The locality restriction may be attributed

both to the limited capacity of working memory and to the on-line requirements of the task

of comprehension. The on-line nature of comprehension means that the outputs from

syntactic analysis must eventually be passed onto the semantic processor. It seems likely

that recovery from misanalysis is made more difficult once processing has passed beyond

the syntactic stage. However, most current theories focus on accounting for the first

analysis arrived at.

[Figure 4.3b Non-Preferred Analysis: parse tree for the same sentence, with the Pp attached to the higher, matrix Vp.]

Theories of parsing differ in a number of major respects. Gorrell (1989) distinguishes

serial, parallel, and delay (i.e., not strictly left-to-right) models. Differences in structure-

building will not be the primary concern here. It is important to note the closeness

between serialism and fine-grained parallelism (Altmann & Steedman 1988) which

differences in terminology may obscure. For example, Frazier (1989), while arguing for a

strictly serial approach to constituent-building, concedes that multiple lexical items may

be activated in parallel during the course of parsing.

In the following discussion what is of interest is the nature of the information implicated in

ambiguity resolution. A dichotomy exists between theories which view syntactic

processing as strictly autonomous and those which consider the roles played by various

kinds of extra-syntactic information. A related issue is the size of the context, or search

space, within which parsing decisions are made. The more kinds of information which are

implicated in ambiguity resolution, the smaller may be the context required to explain how

(usually appropriate) choices are made.

It is necessary to consider to what extent the constraints on processing in children are the

same as those in adults, since discussions of determinism tend to focus upon the latter. If

deterministic constraints arise from a limited working memory capacity, the question to be

examined is that of how much more limited, if at all, is the child’s working memory relative

to the adult’s. Carey (1990) argues that, whereas 4-year-olds appear on the basis of,

e.g., digit memory span, to have only half the working memory capacity of adults,

development involves increases in knowledge rather than in memory span per se. Thus, if

a plausible model of adult language processing is adopted, it can be expected to serve as a

reasonable model of children’s processing, with certain qualifications. For instance, a

strategy whereby the input buffer is manipulated in order to relax the restriction of left-to-

right parsing (e.g., Marcus 1980) may be classified as a mnemonic strategy, in which case

the child would not be expected to acquire this ability until above the age of five (Carey

1990, p.152).

4. 5. 2. 1. Ambiguity Resolution

4. 5. 2. 1. 1. Parsing Preferences

According to this approach, parsing decisions are guided by grammar-independent

structural principles. A specific preference for one structural analysis over another may be

rooted in a general preference for ease, or speed, of computation:

“It appears that most known structural parsing preferences can be explained as

a consequence of the extremely general preference for the syntactic processor

to take the first analysis available of each word in the input string.”

(Frazier 1985, p.135)


Specific structural principles which are proposed include “Minimal Attachment” and

“Right Association” (or “Late Closure”). Minimal Attachment is characterised as a

preference for the analysis which results in the parse tree with fewer nodes; for example,

(Figure 4.2a) above, as contrasted with (Figure 4.2b). Right Association involves

attaching a new constituent to the constituent immediately under construction (Figure

4.3a) rather than to a node higher up in the parse tree (Figure 4.3b).

Pereira (1985) describes an elegant implementation of the principles of Minimal

Attachment and Right Association in a deterministic shift-reduce parser. “Shift” and

“Reduce” are the two possible parsing actions. The first of these involves consuming the

next terminal symbol by shifting it onto the Stack, which is the main data structure.

Reduction involves replacing the string of symbols on top of the Stack, which correspond

to the right-hand side of a phrase-structure rule, with a single symbol. Two kinds of

conflicts may arise from syntactic ambiguities: Shift-Reduce and Reduce-Reduce.

Favouring Shift over Reduce is equivalent to the principle of Right Association, whereas

favouring a longer over a shorter reduction results in a strategy of Minimal Attachment.
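This mapping from conflict types to preferences can be made concrete with a minimal sketch (not Pereira's implementation; the action and rule names are assumptions made purely for illustration), in which the set of competing parser actions is reduced to a single choice:

% Candidate actions are shift or reduce(Rule, Length), where Length is the
% number of right-hand-side symbols consumed by the reduction.

% Shift-Reduce conflicts: Shift is preferred, giving Right Association.
prefer(Actions, shift) :-
    member(shift, Actions), !.

% Reduce-Reduce conflicts: the longest reduction is preferred, giving
% Minimal Attachment.
prefer(Actions, reduce(Rule, Len)) :-
    findall(L-R, member(reduce(R, L), Actions), Reductions),
    sort(0, @>=, Reductions, [Len-Rule|_]).

% e.g., ?- prefer([reduce(np_pp, 2), shift], Action).
%       Action = shift.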

There are a number of problems with the parsing preferences account of ambiguity

resolution. It appears tautological if the preferred and most efficient analyses are, by

definition, one and the same thing. Some independent characterization of efficiency is,

therefore, required. There is an a priori plausibility in the suggestion that constructing a

parse tree with fewer nodes (i.e., Minimal Attachment) involves fewer actions. The

analysis predicted by the principle of Right Association may be more efficient than

alternative analyses since it doesn’t involve an attention shift. However, a more detailed

characterization of the parsing process appears to be called for, in terms of which notions

like efficiency and attention can be grounded.

A further major problem is the characterization of structural preferences as absolute. As

Ford et al (1982, p.730) argue, a theory of linguistic performance must incorporate an

explanation of how the analysis of a sentence will vary with the listener and context.

Furthermore, Crain and Steedman (1985) have demonstrated that any structural

preferences which exist are not absolute; rather, they can be overridden through the

manipulation of contextual factors.


4. 5. 2. 1. 2. Lookahead

According to Marcus (1980), parsing is guided by pattern-action lookahead rules. The

pattern part of these rules represents the local syntactic context in which the action is

appropriate. It is this local syntactic context which delineates those structures which can

and cannot be disambiguated. Where more than one rule is matched, different priorities

attached to competing lookahead rules form the basis of the parsing decisions made.

Garden paths are predicted to occur where the correct analysis requires selection of a low

priority rule.

In Marcus’ deterministic parser, PARSIFAL, local syntactic context is defined in terms of

the parser’s data structures. The Active Node Stack of constituents being built gives the

left context, while the right context is provided by the three-cell Input Buffer. Of the Stack,

the current node and its dominating S or NP node may be examined. Non-terminal as well

as terminal constituents may, in certain circumstances, be held in the Input Buffer and

examined in ambiguity resolution.

As in the parsing preferences account, only syntactic information is used in ambiguity

resolution, and thus the criticism again applies that due consideration has not been given

to the range of factors which may affect human performance. The difference between this

account of ambiguity resolution and that outlined above is that, here, preferences are

grammar-dependent as opposed to grammar-independent. The implication of grammar-

dependence for an account of acquisition is that preferences cannot be assumed to be a

property of the parser; rather, their acquisition needs to be accounted for. This means

acquiring the priorities attached to rules, as well as the rules themselves. The acquisition

of such priorities has not been modelled and, thus, may present difficulties for an account

of acquisition. For instance, in Berwick’s model of acquisition, L.PARSIFAL, which is

based upon a modified Marcus parser, the priorities attached to the different possible

actions are fixed.

Milne’s ROBIE (Milne 1982, 1986) is a variant of the Marcus parser in which local

context is reduced and a further kind of information is incorporated into ambiguity

resolution. Lexical preferences are said to account, for example, for the interpretation of

“screws” as a noun in “the aluminium screws …” but as a verb in “the boy screws …”

(Milne 1982, p.366). These are supported by experiments reported by Milne (1982) which

suggest that semantics better predict ambiguity than does lookahead. Local context is

reduced in Milne’s parser, relative to PARSIFAL, in that left context consists of only the

current node under construction on the Active Node Stack, and the Input Buffer has one


fewer cell. The effect of incorporating lexical preferences and reducing lookahead is that

ROBIE, unlike PARSIFAL, correctly predicts a garden path effect for a sentence like “The

prime number few”.

Lexical semantics are, thus, a further source of information which can be exploited in

ambiguity resolution, although it is important to keep in mind the distinction between

lexical ambiguity and the kind of structural ambiguity with which Marcus was primarily

concerned. Milne (1986) does also consider structural ambiguities, showing how these

can be reduced by using agreement to constrain possible interpretations.

4. 5. 2. 1. 3. Predicate-Argument Frames and Thematic Roles

A kind of lexical information which has been used in accounts of structural ambiguity

resolution consists of the predicate-argument frames of verbs and other argument-taking

words. The kind of information predicate-argument frames provide, being semantically

based, can be represented in the semantic input to acquisition, and thus incorporated into

an account of parsing in acquisition.

Predicate-argument information has been implicated in both parallel (e.g., Tanenhaus &

Carlson 1989) and serial (e.g., Ford et al 1982) accounts of the parsing process. The latter

account is outlined here. Ford et al describe ambiguity resolution in Kaplan’s General

Syntactic Processor. The data structures are the Chart of parsed constituents and the

Agenda which contains rules to be applied. Search is depth-first, the preferred analysis

thus being determined by the order in which rules are placed on the Agenda. The lexical

forms of verbs are used to determine what this order should be. Generally, there is a

preference for that lexical form which interprets a constituent as an argument of the verb

rather than as an adjunct. The preference for arguments over adjuncts offers an alternative

explanation to the Principle of Minimal Attachment of the preferred interpretation of a

sentence like “Joe bought the book for Susan” (Ford et al 1982, p.728). This kind of

explanation of preferences is supported by the experimental work of Britt et al (1992).

They report that the Principle of Minimal Attachment can be overridden by contextual

information but that this is unlikely where a third argument is strongly predicted by the

verb (Britt et al 1992, p.304).

Predicate-argument frames appear to have an important role to play in ambiguity

resolution. Like grammar-independent parsing preferences, however, they may constitute

only part of a complete account. Clifton et al (1991) find, in addition to an advantage for


arguments over adjuncts, a preference for verb attachments over noun attachments; they

conclude:

“a mixed phrase-structure and frame-driven model is necessary.”

(Clifton et al 1991, p.265)

4. 5. 2. 1. 4. Context

Crain and Steedman (1985) have highlighted the role played by contextual information in

parsing decisions. They found that, presented with structurally identical sentences,

subjects were more likely to garden path on those that were semantically implausible,

e.g., The teachers taught by the Berlitz method passed the test.

(cf. The children taught by the Berlitz method passed the test.)

However, the garden path effect could be overcome by manipulating the prior context in

which the sentence was presented. This result suggests that experiments which focus

upon the processing of isolated sentences give a distorted view of human parsing by

assuming the irrelevance of contextual factors. Crain and Steedman’s more reasonable

assumption that parsing normally takes place within a discourse context leads to the view

that:

“referential context is made use of during parsing and is the principal

determinant of garden path effects.”

( Crain & Steedman 1985, p.343)

Pragmatic knowledge inferred from the discourse context may, thus, be considered a

further source of information relevant to ambiguity resolution.

4. 5. 2. 1. 5. Prosody

Prosody is a further kind of information which may be taken account of in the processing of

spoken sentences (and possibly written sentences, in context). Nespor and Vogel’s

argument, which they back up with experimental evidence, is that:

“the possibility of disambiguating ambiguous sentences depends on the

prosodic structures rather than on the syntactic structures corresponding to the

different interpretations of a given sentence.”

(Nespor & Vogel 1986, p.268)

Prosody is especially of interest as it appears to make disambiguating information

available at a much earlier stage than is, for example, assumed by Marcus (1980) for,

e.g., Have the students who missed the exam take it today.

Have the students who missed the exam taken it today?


Given the availability of prosodic information, the size of context required for successful

disambiguation will be reduced.

4. 5. 2. 2. Conclusion

Implicit in the discussion of theories of parsing has been the assumption that ambiguity is

an important feature of natural language. However, when consideration is given to the

various kinds of information available in the normal circumstances in which parsing takes

place, it appears that redundancy, rather than ambiguity, is the norm. Disambiguation is

still an important notion, but it needs to be clarified. In normal processing, disambiguation

involves using the various kinds of information available to discern the correct, intended

interpretation. Where the relevant information is not available, such as when an

experimental subject is presented with isolated, written sentences, a preferred

interpretation may be returned but there is a real sense in which disambiguation cannot

take place. An adequate account of human syntactic processing needs to account for how

the function of disambiguation is achieved, rather than for some afunctional notion of

preferred interpretation.

4. 5. 3. Parsing and Representation in the Model

The design of the model at this stage precludes the use of certain kinds of information in

parsing, namely, referential context and prosody. That is, the assumption is made that

input is in the form of the isolated utterance. No suprasegmental information is

represented for the input utterance, which is initially assumed to be segmented into its

constituent morphemes. The kinds of information it is possible to incorporate into

ambiguity resolution include the predicate-argument frames of words, given the

appropriate representation (see below), along with grammar-independent parsing

preferences, assuming some quasi-objective measure of the relative efforts or costs of

different parsing actions.

4. 5. 3. 1. Lexical-Functional Grammar

The Lexical-Functional Grammar Formalism (Kaplan & Bresnan 1982) is used for the

representation of grammatical knowledge in the model. The reasons underlying this choice

are outlined below.

The LFG formalism has a basis in linguistic theory and has been used to account for a

variety of linguistic phenomena in a range of languages (Bresnan 1982b). It should,

therefore, meet the requirement of being adequate to represent the target grammar of

acquisition. At the same time, the formalism appears relatively theory-neutral as relates


to issues in acquisition. For instance, it has been used by Pinker (1982, 1984) in an

innatist account of acquisition; here, it is employed in the development of an empiricist

model.

LFG has a place in psycholinguistic theories as well as in abstract accounts of linguistic

phenomena. LFG’s role in an account of human parsing (Ford et al 1982) has been

outlined above. Furthermore, LFG (Ford et al 1982) and other lexically-based formalisms

(Bock et al 1992) are argued to provide a better basis for accounts of language production

than grammars in which constructions are generated by movement-type rules. Steedman

(1985) points out that movement rules are used to describe competence, not performance,

and could be compiled to produce the kind of performance associated with lexically-based

grammars. Nevertheless, LFG is associated with a more parsimonious account of

processing. It is suitable for incorporation into a parsing-based model of acquisition since

predicate-argument information, of the kind implicated in ambiguity resolution, is

contained in its lexical entries, some examples of which are given below.

4. 5. 3. 1. 1. Levels of Representation

There are two levels of representation in LFG, Constituent (“C-”) Structure and

Functional (“F-”) Structure. While the C-Structure represents relations amongst

syntactic constituents, the F-Structure is a shallow semantic representation. These are

both described in more detail below. While the C-Structure contains language-specific

information concerning, e.g., the ordering of constituents, the F-Structure is viewed as a

universal level of representation (Bresnan 1982a). It seems that F-Structure cannot be

regarded as universal in terms of the information represented, since different languages

may encode different semantic features; rather, F-Structure must be regarded as

language-neutral in the narrower sense of containing no language-specific syntactic

information.

F-Structure is used to represent the semantic input to acquisition in the model. This is

considered appropriate since the primary concern in developing the model is to illustrate

that the kind of information represented in C-Structure (i.e., phrase structure) may be

acquired from an ideal semantic input. Having demonstrated this, it will be appropriate to

investigate acquisition from more realistic semantic inputs. Representation in acquisition

is described in more detail below when the processes involved in learning in the model are

described.

4. 5. 3. 1. 1. 1. C-Structure and F-Structure


The C-Structure is a phrase-structure tree, in which each node has an F-Structure

associated with it, and is annotated with an equation describing how its F-Structure is to

be unified with the F-Structure of the node above. F-Structures represent information in

terms of feature:value pairs. Where the F-Structure’s predicate governs grammatical

functions, such as the subject, these are also represented as features, paired with their

F-Structures. These can be termed feature:function pairs, to distinguish them from

features which are paired with simple values. An example C-Structure is illustrated here

(Figure 4.4), along with its top-level F-Structure (Figure 4.5), i.e., that associated with

the S node.

[S [Np (↑subj) = ↓ [Det ↑ = ↓ the] [Noun ↑ = ↓ mouse]]
   [Vp ↑ = ↓ [Verb ↑ = ↓ ate]
             [Np (↑obj) = ↓ [Det ↑ = ↓ the] [Noun ↑ = ↓ cheese]]]]

Figure 4.4 C-Structure

subj:  [pred: mouse, num: sg, def: pos]
pred:  eat(subj, obj)
tense: past
obj:   [pred: cheese, def: pos]

Figure 4.5 F-Structure


The F-Structure’s feature:value pairs derive from lexical entries. The pred feature encodes

semantic information. In the case of verbs, information about predicate-argument structure

is encoded in the lexical form, which forms the value of the pred feature. The arguments,

such as subj and obj, are termed grammatical functions. F-Structures of non-lexical

constituents are built through unification in accordance with the equations, derived from

grammatical rules, which annotate nodes in the C-Structure. For example, the equation

“( ↑ subj) = ↓” indicates that the F-Structure contributes the subject function to that

above. The equation “↑ = ↓” is used for straightforward unification, as outlined below.

The small lexicon and grammar required to parse the example sentence (for which the C-

and F- Structures are given above) are included here for illustrative purposes.

Det → the [def:pos]

Noun → mouse [pred:mouse, num:sg]

Noun → cheese [pred:cheese]

Verb → ate [pred:eat(subj,obj), tense:past]

S → Np Vp

(↑ subj) = ↓ ↑ = ↓

Vp → Verb Np
↑ = ↓ (↑obj) = ↓

Np → Det Noun
↑ = ↓ ↑ = ↓

4. 5. 3. 1. 2. Unification

Unification embodies the constraint of uniqueness, i.e., a feature can only take a single

value. The F-Structure which results from the unification of two F-Structures inherits all

the feature:value pairs of both F-Structures, with the restriction that each feature may

only be represented once,

e.g.,

[num:sg] U [num:sg] = [num:sg]

[num:sg] U [ ] = [num:sg]

[num:sg] U [case:nom] = [num:sg, case:nom]


[num:sg] U [num:pl] = FAILS
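A minimal sketch of this operation, over the feature:value lists used in the examples above (the predicate names are illustrative and are not those used in the model), is as follows:

:- use_module(library(lists)).                 % for select/3

% unify_fs(+FS1, +FS2, -FS3): FS3 inherits all the feature:value pairs of
% FS1 and FS2; each feature is represented once (Uniqueness), and the
% operation fails if the two F-Structures assign a feature conflicting values.
unify_fs([], FS, FS).
unify_fs([F:V1|Rest], FS2, [F:V3|Out]) :-
    (   select(F:V2, FS2, FS2Rest)             % feature also present in FS2
    ->  unify_val(V1, V2, V3)
    ;   V3 = V1, FS2Rest = FS2                 % feature present in FS1 only
    ),
    unify_fs(Rest, FS2Rest, Out).

% Simple values must be identical; embedded F-Structures are unified recursively.
unify_val(V, V, V)    :- \+ is_list(V), !.
unify_val(V1, V2, V3) :- unify_fs(V1, V2, V3).

% e.g., ?- unify_fs([num:sg], [case:nom], F).   F = [num:sg, case:nom].
%       ?- unify_fs([num:sg], [num:pl], F).     fails.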

4. 5. 3. 1. 3. Well-Formedness Conditions

LFG has further well-formedness conditions in addition to the constraint of Uniqueness

implied by unification. Completeness is the requirement that an F-Structure contain all

those grammatical functions governed by its pred feature, the F-Structures of which must

in turn be complete. Coherence requires that no ungoverned grammatical functions be

present in the F-Structure, and an F-Structure is only coherent if the F-Structures nested

within it are also coherent.
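Both conditions can be stated directly over the same feature:value representation. The sketch below is illustrative only: the list of governable grammatical functions is an assumption, and governed/2 simply reads the governed functions off the value of the pred feature:

% Governable grammatical functions (assumed list, for illustration only).
grammatical_function(F) :- member(F, [subj, obj, obj2, comp, xcomp, loc]).

% governed(+FS, -GFs): the grammatical functions governed by the pred of FS,
% e.g., pred:eat(subj,obj) governs [subj, obj].
governed(FS, GFs) :-
    member(pred:Pred, FS), compound(Pred), !,
    Pred =.. [_|GFs].
governed(_, []).

% Completeness: every governed function is present, and is itself complete.
complete(FS) :-
    governed(FS, GFs),
    forall(member(GF, GFs), (member(GF:Sub, FS), complete(Sub))).

% Coherence: every grammatical function present is governed, and is itself coherent.
coherent(FS) :-
    governed(FS, GFs),
    forall((member(F:Sub, FS), grammatical_function(F)),
           (member(F, GFs), coherent(Sub))).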

4. 5. 3. 1. 4. LFG and Acquisition

Developing a model of acquisition requires that we assume some form of representation.

We mention above some reasons for regarding LFG as relatively suitable. It should be

adequate for representing the target grammar of acquisition. At the same time, it does not

seem to require that we assume either that syntactic categories are innate or that the

earliest rules acquired constitute a phrase-structure, as opposed to finite-state, grammar.

LFG’s compatibility with a theory of acquisition is one determinant of its adequacy as a

linguistic theory or formalism, and is an issue which we necessarily further address in

using it in a model of acquisition. For example, the question of whether the constraints of

completeness and coherence should be considered innate or acquired is one which is

addressed in the following chapter.

4. 5. 3. 2. Development of the Parser

The parser is implemented as a version of the Active Chart Parser. Deterministic parsers

often have a Stack as the main data structure. The Chart data structure was chosen for its

flexibility: since it is neutral with respect to parsing strategy, it was possible to develop

other versions of the parser than the one settled upon here as the basis for the model of

acquisition. There is, however, a trade-off between such flexibility and the possibility of

representing, e.g., memory limitations, in a more rigid data structure tailored to a particular

parsing strategy. The parser’s other data structures are the Input List of words to be

consumed and the Agenda, a list of constituents to be processed.

Parsing can be characterised as follows. An item of input is consumed whenever the

Agenda is empty, parsing being strictly left-to-right. Lexical lookup results in the

generation of an inactive edge (or edges) which is placed on the Agenda for processing.

Parsing progresses through rule invocation and application of the fundamental rule. Where

the constituent on the Agenda matches the leftmost right-hand constituent of a grammar


rule, an edge categorized by the left-hand side of the rule is proposed. The parsing

strategy can thus be characterized as left-corner. The fundamental rule creates a new

edge by matching an inactive edge on the Agenda with the next constituent required by an

active edge in the Chart. Since parsing is bottom-up, only inactive edges generated are

placed on the Agenda for processing, after which they are placed in the Chart. Active

edges are placed directly into the Chart. Parsing halts when the Agenda is empty and all

input has been consumed. It is successful if an inactive edge or edges spanning the input

consumed can be retrieved from the Chart.
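In outline, and using a deliberately simplified edge representation assumed here for illustration (the full term used in the implementation is shown below), the two structure-building operations can be written as relations over edges:

% edge(From, To, Cat, Expected, Found): an edge spanning vertices From..To,
% of category Cat, with Expected the constituents still required and Found
% those already recognised.

% The fundamental rule: the inactive edge on the Agenda is combined with an
% active edge in the Chart whose next expected constituent it matches.
fundamental(edge(J, K, Cat, [], FoundI),            % inactive edge on the Agenda
            edge(I, J, Mother, [Cat|Rest], FoundA), % active edge ending at vertex J
            edge(I, K, Mother, Rest, Found)) :-
    append(FoundA, [Cat-FoundI], Found).

% Left-corner rule invocation: the inactive edge proposes a new edge for any
% rule whose leftmost right-hand constituent it matches.
invoke(edge(I, J, Cat, [], Found),
       rule(Mother, [Cat|Rest]),
       edge(I, J, Mother, Rest, [Cat-Found])).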

Vertices:  0 the 1 mouse 2 ate 3 the 4 cheese 5

Inactive edges:
0,1 Det → the .      1,2 N → mouse .    2,3 V → ate .
3,4 Det → the .      4,5 N → cheese .
0,2 Np → Det N .     3,5 Np → Det N .
2,5 Vp → V Np .      0,5 S → Np Vp .

Active edges:
0,1 Np → Det . N     3,4 Np → Det . N
2,3 Vp → V . Np      0,2 S → Np . Vp

Figure 4.6 The Chart Data Structure

The parsing operations having been outlined, a more detailed example of representation is

now given. Edges are indexed by their initial and final vertices and labelled with

information about their syntactic category, constituents expected, constituents found, and

F-Structure built1. In the Chart illustrated, the dotted rule notation is used to indicate

those constituents of edges found and expected. This information may also be represented

in the arguments of a Prolog predicate,

e.g.,

Active Edge

2, 3 Vp → Verb(ate). Np

↑ = ↓ (↑obj) = ↓

1 The C-Structure of an edge can be inferred from its syntactic category and constituents found.


edge(2, 3, vp, [(np, (↑obj) = ↓)], [(verb(ate), ↑ = ↓)], [pred: …], …)

Inactive Edge

2, 5 Vp → Verb(ate) Np(Det(the),Noun(cheese)).

↑ = ↓ (↑obj) = ↓

edge(2, 5, vp, [], [(np((det(the),↑ = ↓), (noun(cheese), ↑ = ↓)), (↑obj) = ↓), (verb(ate), ↑ = ↓)], …)

Other information represented in additional arguments on edges is referred to below, as is

relevant.

4. 5. 3. 2. 1. Implementation of Determinism

In non-deterministic parsing, both rule invocation and the fundamental rule apply to the

inactive edge at the front of the Agenda, and all resulting edges are placed into the Chart

or onto the Agenda, as appropriate. If the input is globally ambiguous, then multiple

interpretations are recovered. In deterministic parsing, either rule invocation or the

fundamental rule may apply. Multiple edges may be generated so long as they agree on

the constituents found; that is, edges must agree on initial and final vertices, syntactic

category, and constituents found, but need not agree on constituents expected since

control of parsing is bottom-up. The implementation of ambiguity resolution is outlined

below. The parser is itself neutral with respect to the kinds of information implicated in

ambiguity resolution. The left-corner parsing strategy does, however, affect the point in

parsing at which ambiguity resolution takes place.

Determinism is implemented by successively applying different kinds of information to

prune all possible edges generated until the criterion of deterministic structure-building is

met. The first kind of information used is a top-down check combined with lookahead to

the next terminal constituent. Lookahead is relatively straightforward in the case of active

edges: the syntactic category of the next expected constituent must match or link with that

of the next word to be consumed. In the case of an inactive edge, lookahead is

straightforward where no input remains to be consumed; otherwise, we hypothesise an

active edge resulting from the application of either rule invocation or the fundamental rule

to the inactive edge, and assume this as the basis for lookahead.
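For active edges, the check amounts to something like the following sketch, in which word/2 is lexical lookup and links/2 is an assumed left-corner link table recording which lexical categories can begin which constituents:

% passes_lookahead(+ActiveEdge, +Input): the next expected constituent must
% match, or be linked to, the category of the next word to be consumed.
passes_lookahead(edge(_, _, _, [Expected|_], _), [NextWord|_]) :-
    word(NextWord, Cat),
    (   Expected = Cat
    ;   links(Cat, Expected)                    % Cat can be the left corner of Expected
    ).
passes_lookahead(_, []).                        % no input remains to be consumed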


The top-down constraints and lookahead outlined above apply to all proposed edges and

serve to remove implausible analyses prior to the application of ambiguity resolution.

Where ambiguity remains following this pruning of the edges proposed, if this consists of a

choice between attaching the current constituent to that above as either an argument or an

adjunct, the former interpretation is preferred. Thus, predicate-argument information is one

kind of knowledge employed in ambiguity resolution. The source of this information is the

lexicon, which determines in what circumstances (i.e., for which predicates) the argument

interpretation is licensed. The remaining preferences which can be applied are similar to

the structural principles of Minimal Attachment and Right Association. It is important to

note, however, that here these preferences do not preclude consideration of other kinds of

information. Minimal Attachment is implemented as the preference for extending an

existing edge, through application of the fundamental rule, over the alternative of

proposing a new edge, through rule invocation1. Right Association means always choosing

the rightmost possible attachment when applying the fundamental rule. The rationale

behind each of these preferences is the same. It is assumed that attention is normally

focussed upon extending the latest active edge and that returning to an earlier active

(violating Right Association) or invoking a new constituent (violating Minimal

Attachment) involves a shift in attention.
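Taken together, the preferences applied after pruning amount to an ordering over the surviving candidate edges, sketched below. The cand/2 wrapper, which records whether an edge arises from attachment at a given vertex (as argument or adjunct) or from rule invocation, is an assumption made for illustration; the ordering shown makes argument attachment overriding, as adopted later in this section.

% Candidates: cand(attach(Vertex, arg), Edge), cand(attach(Vertex, adjunct), Edge),
% or cand(invoke, Edge).
prefer_edge(Cands, Edge) :-
    (   member(cand(attach(_, arg), Edge), Cands)   % fill a predicted argument first
    ->  true
    ;   rightmost_attach(Cands, Edge)               % else attach rather than invoke
    ->  true                                        % (Minimal Attachment), as far to the
    ;   member(cand(invoke, Edge), Cands)           % right as possible (Right Association)
    ).

% Among possible attachments, choose the one at the rightmost vertex.
rightmost_attach(Cands, Edge) :-
    findall(V-E, member(cand(attach(V, _), E), Cands), Vs),
    Vs \= [],
    sort(0, @>=, Vs, [_-Edge|_]).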

[Figure 4.7 Attachment versus Invocation: partial parse trees for “the boy left the dog in the park” and “the man put the dog in the house”, in which the Np following the verb is attached to the Vp as (↑obj) = ↓ rather than being used to invoke a new constituent.]

1 Edges are labelled with a further argument indicating whether they are generated by application of the fundamental rule or through rule invocation.


[Figure 4.8 Argument versus Adjunct: completed parse trees in which the Pp is attached to the Vp as an adjunct, (↑adjunct) = ↓, in “the boy left the dog in the park”, and as an argument, (↑loc) = ↓, in “the man put the dog in the house”.]

Ambiguity resolution in the parser is illustrated with a number of examples. In parsing

“the boy left the dog in the park”, “the man put the dog in the house”, or “we painted the

walls with cracks”, ambiguity resolution takes place as illustrated (Figure 4.7). In all

cases, the Np on the Agenda functions as an expected argument of the verb. Ambiguity


resolution is thus determined by the preference for attachment over rule invocation, and

the Np is attached to the Vp. When the Pp has been built, there remains the choice of

whether to attach it as an argument or adjunct (Figure 4.8). Attachment is as an argument

where the verb, like “put”, has an obligatory third argument, but as an adjunct in the case

of a verb like “left” or “painted”. “We painted the walls with cracks” is a garden-path

sentence. The semantically anomalous interpretation results, not from attachment of the

Pp as an adjunct, but from the earlier decision to attach Np to Vp.

[Figure 4.9a Attachment versus Attachment: parse tree for “Sarah noted that Emily had written in her diary”, with the Pp “in her diary” attached to the lower Vp as an adjunct, (↑adjunct) = ↓.]

This hybrid account of ambiguity resolution seems to meet the requirement of

incorporating both the preference for arguments over adjuncts and that for attachment to

verbs (verb phrases) over attachment to nouns (noun phrases) (Clifton et al 1991).

However, it is the latter preference which is crucial in the above examples, due to the

point at which ambiguity resolution takes place, which is dependent on the parsing


strategy. The argument/adjunct distinction is not relevant to the first choice of whether to

attach the Np or use it as the basis for rule invocation. The distinction becomes relevant in

relation to attachment of the Pp. Here, the attachment as an argument is preferred where

the argument is predicted by the predicate-argument structure of the verb. (Prepositional

phrases which appear before the verb, in English, tend to be adjuncts; thus, “In the garden

the children play” (adjunct) is acceptable, but “In the garden put the milk bottle”

(argument) is not.) The account of ambiguity resolution predicts that a garden path arises

in the case of “we painted the walls with cracks” since there is no information to prevent

attachment of “the walls” to the Vp at the point at which this decision is made.

In parsing a sentence such as “Sarah noted that Emily had written in her diary” (Figure

4.9a) or “George said that Henry put the apples in his car” (Figure 4.9b) ambiguity

resolution takes place in attaching the Pp, as illustrated.

[Figure 4.9b Attachment versus Attachment: parse tree for “George said that Henry put the apples in his car”, with the Pp “in his car” attached to the lower Vp as an argument, (↑loc) = ↓.]


Attachment of the Pp can be to either the lower or upper Vp and as an argument or

adjunct. The first example sentence is globally ambiguous which implies that neither of the

verbs takes an obligatory argument. Ambiguity resolution is thus determined by the

preference to continue building the lower, rightmost Vp, to which the Pp is attached as an

adjunct. In the second example sentence “put” obligatorily takes a further argument and

attachment to the rightmost node is jointly determined by this consideration and the

preference to attach to the lower Vp. Thus, in this case, the preference for the argument

interpretation and the structural preference coincide. There is a theoretically possible

conflict between attachment to the higher Vp as an argument or to the lower Vp as an

adjunct, but verbs which take complements do not tend to take further arguments1. Since

the two kinds of preferences applied in ambiguity resolution do not conflict, it appears to

be immaterial in which order the different preferences are applied. In the implementation of

the parser, the somewhat arbitrary decision was made to make the need to fill arguments

overriding.

The further examples reinforce the point made above. This is that, while information

concerning the predicate-argument structures of verbs is incorporated into ambiguity

resolution, it is not possible to isolate the effects of this from those of the structural

preferences which underlie the choices made. The parser thus predicts that the first

interpretation of a sentence will be that dictated by structural preferences.

4. 5. 3. 2. 2. Summary of Parsing in the Model

The basis of the model is formed by a Chart Parser which, using a left-corner strategy,

may compute all possible parses in parallel. We implement determinism by pruning the

edges generated by the current edge on the Agenda in order to ensure that a single

syntactic analysis is selected. Top-down checking and lookahead are used to eliminate

any edges which are licensed by only the narrowest context. The preferences then applied

in ambiguity resolution utilize some of the kinds of information implicated in the discussion

of human syntactic processing, the predicate-argument frames of verbs and grammar-

independent structural preferences.

4. 6. Acquisition in the Model

4. 6. 1. Representation in Acquisition

1 An argument versus argument conflict is ruled out: assuming a grammatical sentence, it cannot be the case that both verbs obligatorily take a further argument.


Inputs include the utterance, and a semantic representation in the form of the underlying

F-Structure, the rationale for the latter choice having been outlined in the section on LFG.

For the purposes of developing the model, standard simplifying assumptions were made

concerning representation in acquisition. One of these was that input utterances were

segmented into their constituent morphemes, i.e., the ability to segment was

presupposed.

No grammar or innate linguistic knowledge (syntactic categories, X-Bar Theory) was

presupposed. The simplifying assumption was, however, made that the meanings of

certain content words (only) had been acquired prior to grammar acquisition. Acquisition

involves performing mappings between such semantic knowledge and the semantic input

to acquisition: detailed examples are given below.

[Figure 4.10 Representation in Acquisition: the inputs, the utterance and its semantic structure (F-Structure), are parsed using the lexicon and grammar, with the C-Structure as output.]

Where acquisition succeeds, the C-Structure for the utterance is output, and the grammar

rules corresponding to this are retained for future use.

4. 6. 2. Acquisition as Parsing

Acquisition is characterized as parsing in the absence of the appropriate grammar rules.

Left-corner parsing involves either attaching the inactive on the Agenda to an active to its

left, according to the fundamental rule, or using it as the basis for proposing a new edge

above. These processes are normally guided by grammar rules. Where existing such rules

fail, appropriate rules for constructing the C-Structure must be inferred. This involves


matching the F-Structure of the current inactive, which consists of features derived from

lexical entries, to that part of the input F-Structure for the utterance to which it

corresponds. Where matching is successful, it determines the appropriate action. Possible

actions in acquisition correspond to those in rule-guided parsing. The options are thus

attaching the inactive on the Agenda to an active in the Chart or proposing a new edge

above it, of which the inactive is the first subconstituent. Where acquisition is successful,

the information embodied in the C-Structure output can be seen to have two sources.

Information about the semantic relations which exist amongst constituents, which underlie

certain syntactic relations, derives from the semantic input to acquisition. Essentially

arbitrary language-specific information concerning the orderings of constituents is given by

the input utterance. The information contained in the grammar rules acquired has an

additional source, syntactic categorization, outlined below. Applied to the appropriate

phrase-structure rules, this process gives rise to the property of recursion in the grammar,

as we illustrate in the following chapter.
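The matching step itself can be sketched as a search for the grammatical-function path within the input F-Structure whose value is subsumed by the F-Structure of the current inactive (an illustrative simplification, not the model's code):

% locate(+EdgeFS, +InputFS, -Path): Path is the sequence of grammatical
% functions leading to the part of the input F-Structure matched by EdgeFS.
% Path = [] corresponds to the annotation ↑ = ↓ (the edge maps onto the
% current F-Structure); Path = [subj] corresponds to (↑subj) = ↓, and so on.
locate(EdgeFS, InputFS, []) :-
    subsumes_fs(EdgeFS, InputFS).
locate(EdgeFS, InputFS, [GF|Path]) :-
    member(GF:Sub, InputFS),
    is_list(Sub),
    locate(EdgeFS, Sub, Path).

% Every feature:value pair of FS1 is also present in FS2.
subsumes_fs(FS1, FS2) :-
    forall(member(Pair, FS1), member(Pair, FS2)).

% e.g., with the input F-Structure of Input 1 below,
%   ?- locate([pred:fido], InputFS, Path).                           Path = [subj].
%   ?- locate([pred:chase(subj,obj), tense:present], InputFS, Path). Path = [].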

The switch to acquisition mode is triggered by parsing failure, which may be recognised as

follows. If the inactive on the Agenda fails to generate any edges through either rule

invocation or application of the fundamental rule (given the application of top-down

constraints and lookahead), then parsing failure is normally inferred. There are, however,

possible exceptions to be considered. For example, where alternative inactive edges exist

which have the same syntactic category but different F-Structures, deriving from different

lexical entries, only one of these alternatives is required to generate new edges if parsing

is to continue with the possibility of success. The section below on the integration of

parsing and acquisition serves to illustrate that the early recognition of parsing failure,

which results from the pruning of the edges proposed, is a desirable feature in a model of

acquisition.

Acquisition in the model is illustrated below by means of a few trivial examples. These

same examples are also reproduced in Appendices A & B which provide transcripts of

acquisition in the basic model. As mentioned previously, the only existing knowledge

assumed is a lexicon of content words. All grammar rules are acquired on the basis of this

information together with the input utterance/F-Structure pairs. Grammar acquisition as

outlined at this stage is necessarily uninteresting due to the limitations of the model as it

stands. These limitations are outlined below, and are addressed in the next chapter.

Lexicon

Lex1 → fido [pred:fido]


Lex2 → tigger [pred:tigger]

Lex3 → tabby [pred:tabby]

Lex4 → chases [pred:chase(subj,obj), tense:present]

Lex5 → chased [pred:chase(subj,obj), tense:past]

Lexical items are initially assigned unique syntactic labels, since no syntactic

categorization is assumed.

Input 1

Utterance: [fido, chases, tigger]

F-Structure: [pred:chase(subj,obj),

tense:present,

subj:[pred:fido],

obj:[pred:tigger]]

The first case we consider involves straightforward acquisition in the absence of a

grammar. The first edge to enter onto the Agenda results from lexical lookup of “fido”:

edge(0, 1, lex1, [], [fido], [pred:fido], …)

The F-Structure of this edge is matched against the input F-Structure. It can be inferred

from the nature of unification that the constituent on the Agenda contributes the subject

function to the F-Structure above, which is the top-level F-Structure for the utterance.

This is because the feature:value pair “pred:fido” is uniquely shared by the edge’s F-

Structure and the subject function within the input F-Structure. A constraint built into

grammar rule acquisition is that lexical nodes cannot be annotated with grammatical

function assigning equations such as “(↑subj) = ↓”. The rationale behind this constraint

is the fact that more than one lexical entry may contribute to a single grammatical function.

Therefore, an edge is proposed above the lexical edge which has an identical F-Structure

and the lexical item as its first subconstituent:

edge(0, 1, node1, [], [(lex1(fido), ↑ = ↓)], [pred:fido], …)

This inactive is placed onto the Agenda and the equivalent “active” is placed into the

Chart1:

edge(0, 1, node1, acq, [(lex1(fido), ↑ = ↓)], [pred:fido], …)

1 When the model was developed, the “active” was only placed into the Chart if the inactive failed to generate any edges, since determinism implies that a choice need be made between an inactive edge and any alternative actives. We are, in fact, here assuming a later version of the model, for consistency with later examples. A full discussion of the issue of closure in acquisition is delayed until the following chapter.


[Figure 4.11 Partial Parse Tree: Node2 above Node1, (↑subj) = ↓, above Lex1, ↑ = ↓, spanning “fido”; “chases tigger” remains to be consumed.]

In this case, the F-Structure of the inactive matches that of the subject function in the

input F-Structure, and a new edge corresponding to the top-level F-Structure is

proposed:

edge(0, 1, node2, acq, [(node1((lex1(fido), ↑ = ↓)), (↑subj) = ↓)], [subj:[pred:fido]],

… )

The resulting edge may be placed directly into the Chart, since it can be inferred from its

F-Structure that the edge is an active rather than an inactive. After consuming “fido” and

processing all the resulting edges, the parse tree is as illustrated (Figure 4.11).

The next edge placed on the Agenda results from lexical lookup of “chases”:

edge(1, 2, lex4, [], [chases], [pred:chase(subj,obj), tense:present], …)

The F-Structure of this edge matches that of the top-level input F-Structure, as does that

of the existing active 0,1 Node2, so the F-Structures of the inactive and active edges are

directly unified. The action is equivalent to applying the fundamental rule, and results in a

new active in the Chart:

edge(0, 2, node2, acq, [(lex4(chases), ↑ = ↓),(node1((lex1(fido), ↑ = ↓)),(↑subj) = ↓)], [subj:[pred: fido], pred:chase(subj,obj), tense:present], …)

The edge generated by lexical lookup of “tigger” invokes a non-lexical edge

corresponding to the object function, in a manner analogous to the creation of the edge

corresponding to the subject function:

edge(2, 3, node3, [], [(lex2(tigger), ↑ = ↓)], [pred:tigger], …)

The F-Structure of this edge matches the object function of the input F-Structure, and so

contributes the object function to the active edge 0,2 Node2 to which it is attached.


[Node2 [Node1 (↑subj) = ↓ [Lex1 ↑ = ↓ fido]]
       [Lex4 ↑ = ↓ chases]
       [Node3 (↑obj) = ↓ [Lex2 ↑ = ↓ tigger]]]

Figure 4.12 Completed Parse Tree

The resulting edge, being an inactive which spans the input consumed, is placed into the

Chart:

edge(0, 3, node2, [], [(node3((lex2(tigger), ↑ = ↓)), (↑obj) = ↓),
(lex4(chases), ↑ = ↓), (node1((lex1(fido), ↑ = ↓)), (↑subj) = ↓)],
[subj:[pred:fido], pred:chase(subj,obj), tense:present,
obj:[pred:tigger]], …)

The phrase-structure tree for the utterance can be recovered from this edge (Figure 4.12).

Additionally, the following grammar rules can be recovered from this edge together with

the inactive edges in the Chart which correspond to its constituents:

Node2 → Node1 Lex4 Node3
(↑subj) = ↓ ↑ = ↓ (↑obj) = ↓

Node1 → Lex1
↑ = ↓

Node3 → Lex2
↑ = ↓

Input2

Utterance: [Fido, chases, Tabby]

F-Structure: [pred:chase(subj,obj),

tense:present,

subj:[pred:fido],

obj:[pred:tabby]]


This case differs from that above due to the rules which have been acquired. The left

corner of the tree is constructed using the newly acquired grammar rules. Parsing failure is

recognised as soon as Lex4 is attached to Node2, since the resulting edge predicts the

next lexical item to be of category Lex2 rather than Lex3. Acquisition mode being

triggered, completion of the parse tree is identical to the previous example, except that

different labels are assigned to the nodes in the C-Structure. The further rules acquired

are:

Node2 → Node1 Lex4 Node4
(↑subj) = ↓ ↑ = ↓ (↑obj) = ↓

Node4 → Lex3
↑ = ↓

4. 6. 3. Induction of Syntactic Categories

Syntactic categorization in the model can now be illustrated. Node3 and Node4 exist in

identical immediate left and right contexts in the two rules labelled Node2, and so these

rules may be conflated and all instances of Node3 and Node4 in lexicon and grammar

replaced by a single category, Node5:

Node2 → Node1 Lex4 Node5
(↑subj) = ↓ ↑ = ↓ (↑obj) = ↓

Node5 → Lex2
↑ = ↓

Node5 → Lex3
↑ = ↓

Lex2 and Lex3 now also occur in identical contexts and can be similarly generalized over1:

Node5 → Node6
↑ = ↓

Node6 → tigger [pred:tigger]

Node6 → tabby [pred:tabby]
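The conflation step itself can be sketched as follows (an illustrative reconstruction rather than the model's code; rule/2 holds each rule as a left-hand-side category and a list of right-hand-side categories, ignoring the annotations):

% mergeable(+A, +B, +Rules): A and B occur in identical immediate left and
% right contexts in two rules with the same left-hand side, and so may be
% replaced throughout the lexicon and grammar by a single category.
mergeable(A, B, Rules) :-
    A \== B,
    member(rule(Lhs, Rhs1), Rules),
    member(rule(Lhs, Rhs2), Rules),
    same_position(A, Rhs1, B, Rhs2).

% A and B occupy the same position; the surrounding categories are identical.
same_position(A, [A|Rest], B, [B|Rest]).
same_position(A, [X|R1],   B, [X|R2]) :- same_position(A, R1, B, R2).

% Replacing B by A in every right-hand side then conflates the two rules.
relabel(_, _, [], []).
relabel(A, B, [B|T], [A|T2]) :- !, relabel(A, B, T, T2).
relabel(A, B, [X|T], [X|T2]) :- relabel(A, B, T, T2).

% e.g., ?- mergeable(node3, node4, [rule(node2, [node1, lex4, node3]),
%                                   rule(node2, [node1, lex4, node4])]).
%       succeeds, licensing the replacement of Node3 and Node4 by Node5.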

Assuming that a further input utterance, “Fido chased tabby”, has been parsed, further

generalization, resulting in a single top-level rule for the three inputs parsed, may be

illustrated. The further rules acquired, prior to generalization, are as follows:

Node8 → Node7 Lex5 Node9

(↑subj) = ↓ ↑ = ↓ (↑obj) = ↓

1 In the case of single-constituent rules, the left-hand side of the rule is an additional constraining factor, since all constituents in single-constituent rules exist in identical immediate left and right contexts.


Node9 → Node6
↑ = ↓

Node7 → Lex1
↑ = ↓

The labels initially attached to constituents are arbitrary and thus we merge any

categories which label rules with identical right-hand-side constituents. Thus Node1 is

merged with (or replaced by) Node7, and Node5 is merged with Node9. The rules in the

grammar having been re-labelled accordingly, an available generalization is apparent:

Node2 → Node7 Lex4 Node9
(↑subj) = ↓ ↑ = ↓ (↑obj) = ↓

Node8 → Node7 Lex5 Node9
(↑subj) = ↓ ↑ = ↓ (↑obj) = ↓

Lex4 and Lex5 are generalized over, giving the following entries in the final lexicon and

grammar:

Node2 → Node7 Node10 Node9
(↑subj) = ↓ ↑ = ↓ (↑obj) = ↓

Node7 → Lex1
↑ = ↓

Node9 → Node6
↑ = ↓

Lex1 → fido [pred:fido]

Node6 → tigger [pred:tigger]

Node6 → tabby [pred:tabby]

Node10 → chases [pred:chase(subj,obj), tense:present]

Node10 → chased [pred:chase(subj,obj), tense:past]

The above examples illustrate how a lexicon and grammar may be acquired which cover a

greater number of utterances than those actually input to the model. Having demonstrated

the acquisition of a phrase-structure grammar in the following chapter, we show how the

same simple mechanisms of syntactic categorization may give rise to a recursive phrase-

structure grammar.

4. 6. 4. Summary of Grammar Rule Acquisition

Grammar rule acquisition in the model is fully integrated with rule-based parsing, and is

viewed as parsing which utilises semantic information in the absence of the appropriate

rules. The examples used to illustrate grammar rule acquisition above appear trivial and

uninteresting for two reasons. The first reason is the simplifying assumption of a complete


semantic representation of utterances input to acquisition. The second reason is that the

examples given avoid the difficult issues in phrase structure acquisition which are

addressed in the following chapter. Even the very simple model of acquisition outlined

above, however, raises some important issues which we discuss in detail in the following

section.

4. 7. The Integration of Parsing and Acquisition

A difficult issue addressed in developing the model concerns the nature of the relationship

between parsing and acquisition and how these processes are to be integrated. As we

illustrate below, if it is assumed that parsing using existing rules always takes priority

over acquisition, then acquisition may not always be triggered when required. Our solution

to this problem involves early recognition of parsing failure through the enforcement of

top-down constraints and lookahead in parsing. We further use the problem of integrating

parsing and acquisition to demonstrate that, not only is the innatist assumption of

syntactic categories not required for acquisition, it may even inhibit acquisition.

4. 7. 1. Meta-Knowledge and Acquisition

In general, existing models of acquisition have assumed a separate language acquisition

mode, with its own inputs and outputs, distinct from the normal processes of

comprehension and production. The exceptions are L.PARSIFAL (Berwick & Weinberg

1984, Berwick 1985) and the model developed here, in both of which failure of

deterministic rule-guided parsing results in a switch to acquisition mode. The separate

acquisition mode in other cases is intended only as a simplifying assumption but, it is

argued here, models which make this assumption necessarily avoid an important issue.

This is the question of the role of meta-knowledge in language acquisition.

One kind of meta-knowledge required by the language learner is knowledge of when it is

appropriate to use existing knowledge and when acquisition of new knowledge is required.

Models which assume a separate acquisition mode side-step this issue, since inputs to

acquisition trigger acquisition, while inputs to other processes use existing knowledge.

The meta-knowledge to which we refer is a kind of meta-knowledge which, in fact, the

language learner cannot be assumed to possess, since it is parasitic upon a knowledge of

the language being acquired. Only having acquired the language ourselves, are we in a

position to make the distinction, for a particular learner, between what is known and what

has yet to be acquired. This distinction could be a trivial one, were it the case that

whenever existing knowledge appeared, from the viewpoint of the learner, to apply, this


was indeed the case. We seek to illustrate below that this is not the case, and that the

learner’s lack of meta-knowledge is thus an issue which needs to be addressed.

4. 7. 2. A Basic Assumption Which is Too Simple

In the absence of the kind of meta-knowledge referred to above, it is necessary for models

to make some assumption concerning when acquisition is triggered. The initial

development of the model reflected the basic assumption that acquisition is only

attempted after rule-based parsing has been attempted and has failed. That is, given that

the current edge on the agenda had to be processed without any possibility of

backtracking, acquisition could only apply given that both rule invocation and the

fundamental rule had failed to generate any new edges. Top-down constraints and

lookahead were not used to prune the edges generated in this early version of the model.

We made the assumption that parsing takes priority over acquisition because it seemed

more reasonable than the converse assumption that acquisition takes priority over parsing

and it was simpler than any other alternative.

The problem outlined below shows the assumption initially made to have been too simple

and demonstrates one rationale underlying the application of top-down constraints and

lookahead to edges generated during rule-based parsing. A further rationale for these

constraints is that they facilitate ambiguity resolution. This means that, while the

relationship between parsing and acquisition is not as simple as originally presupposed,

the relationship between parsing in the infant and adult does appear relatively simple.

That is, while the adult requires highly constrained parsing for the purposes of ambiguity

resolution, the infant also requires highly constrained parsing if acquisition is to be

triggered as required. Thus we do not propose the development of tightly constrained

syntactic processing in the adult from an initially unconstrained system, but, rather,

continuity in processing.

4. 7. 3. A Problem in Acquisition

We assume, as in the model described here, two basic parsing operations, concerned with

extending existing constituents and proposing new constituents. However, for the

purposes of illustrating the problem of integration, we assume no top-down constraints or

lookahead, so that parsing is given a clear priority over acquisition. Taking into account

acquisition, we can thus distinguish four operations:

(i) rule-guided attachment;

(ii) rule-guided proposal;

(iii) acquisition-guided attachment; and,


(iv) acquisition-guided proposal.

The operations are presented here in the order in which we would normally expect them to

apply: that is, failure of attachment triggers rule invocation, the failure of which, in turn,

triggers acquisition. Were the learner’s lack of meta-knowledge a trivial problem, then

(iii) and (iv) would be triggered whenever acquisition was required. Below we show how

acquisition mode may fail to be triggered when required.

We assume the following lexicon and grammar:

Np → Det Noun

↑ = ↓ ↑ = ↓

Det → the [def:pos]

Det → an [def:neg]

Noun → boy [pred:boy]

Noun → apple [pred:apple]

Verb → threw [pred:throw(subj,obj), tense:past, prog:neg]

Verb → fell [pred:fall(subj), tense:past, prog:neg]

If the utterance “the boy fell” is input, along with its semantic representation, then the

following rule may be acquired:

Node1 → Np Verb

(↑subj) = ↓ ↑ = ↓

Failure of acquisition, however, occurs if the utterance “the boy threw an apple”, and the

appropriate semantic representation, are now input. There is failure to parse the utterance

and acquire the sentence-level rule since action (ii) above, rule-guided proposal, is

assumed always to take precedence over the required action (iii), acquisition-guided

attachment, as illustrated below (Figure 4.13).


[Figure 4.13 Failure to Trigger Acquisition Mode: after “the boy threw” has been analysed under Node1, the Np “an apple” invokes a second Node1 via the existing rule instead of being attached, in acquisition, as the object of “threw”.]

4. 7. 4. The Solution Implemented

To a certain extent the model developed, as an empiricist model, is resistant to the kind of

problem outlined. Acquisition need not always be confounded in the model developed,

since syntactic categorization is one kind of knowledge which is not assumed. That is,

given that “an” and “apple” are Det1 and Noun1 respectively, a different rule will apply

(we assume the further rule “Np1 → Det1 Np1”) and thus rule invocation will not be

erroneously preferred to acquisition (Figure 4.14). What is required for successful

acquisition of the C-Structure is that, in at least some such cases, there is the opportunity

for attachment in acquisition before the competing rule is acquired either directly or

through syntactic categorization.

[Figure 4.14 Empiricism and Robustness: with no shared syntactic categories, “an apple” is analysed as Np1 (Det1 Noun1), the existing rule does not apply, and the Np1 is attached in acquisition as (↑obj) = ↓ under Node1.]

For the example problem presented, the empiricist approach of inducing rather than

assuming syntactic categories appears an adequate solution. The model, however,

incorporates an additional solution. The top-down constraints and lookahead added to

rule-based parsing ensure that failure of the existing rules is recognised, even given the


assumption of syntactic categorization, as soon as the Np is proposed. The rationale

underlying the imposition of these additional constraints is, however, not the prevention of

failure to trigger acquisition in those cases where categorization has taken place, since a

failure to process some input utterances is an acceptable feature of the model. The early

recognition of parsing failure addresses a further problem, the possibility of which only

becomes apparent once the issue of phrase structure acquisition has been addressed, but

which we outline briefly below.

[Figure 4.15 Acquisition in the Absence of Top-Down Constraints: in parsing “the dog”, the single-constituent rule fires, so that Node2 is built above Node1 (“dog”) before being attached, with Node3 (“the”), under the utterance-level Node4.]

In the following chapters, we assume a variety of input utterances. Thus the word “dog”

may be analysed as an input utterance, giving rise to a rule of the form “Node2 → Node1”

where Node2 is the utterance-level node and Node1 is the lexical category of the word

“dog”1. In the later parsing of an input such as “the dog”, the early recognition of failure

of the existing rule is required if the C-Structure built is to be isomorphic to that which

would be proposed in the absence of the rule2 (Figures 4.15, 4.16). This is ensured by the

imposition of top-down constraints and lookahead.

[Figure 4.16 Top-Down Constraints Ensure Early Recognition of Parsing Failure: “the dog” is analysed as Node4 over Node3, ↑ = ↓ (“the”), and Node1, ↑ = ↓ (“dog”), isomorphic to the structure that would be proposed in the absence of the single-constituent rule.]

1 The model incorporates a distinction between lexical and non-lexical nodes, which is discussed in the following chapter.
2 Isomorphism is required for syntactic categorization, as discussed in more detail in Chapter 5.


4. 7. 5. Implications

The above discussion is intended to demonstrate that the integration of parsing and

acquisition is not a trivial issue. In the model developed, the imposition of top-down

constraints and lookahead ensures early recognition of parsing failure and this facilitates

the integration of rule-based parsing and acquisition. The solution to the problem of the

learner’s lack of meta-knowledge also partly depends, in the model developed, upon the

assumptions made about the learner’s domain-specific linguistic knowledge, in this case

knowledge of syntactic categories, at the start of acquisition. The idea that existing

knowledge may constrain acquisition, and thus facilitate it, is familiar. Here we suggest

how the assumption of syntactic categories may, rather, inhibit acquisition. This is an

issue to which we return in later chapters.

4. 8. Comparison with L.PARSIFAL

It will be useful to outline parsing in L.PARSIFAL, since this model comes closest to

addressing the issues examined here. In the previous chapter we critically examined

L.PARSIFAL’s assumptions relating to acquisition. Here, we focus upon the parser which

forms the basis of the model.

L.PARSIFAL is a variant of the Marcus parser (Marcus 1980). Data structures are an

Active Node Stack of constituents being built, and a three-cell Buffer containing completed

constituents and input waiting to be consumed. Control of parsing is basically bottom-up,

with top-down prediction, an active node being proposed by its leftmost constituent (i.e.,

that in the first cell of the Buffer). Other basic parsing actions include attaching the item in

the first Buffer cell to the current active node, and popping the (completed) current node

from the Stack into the first cell of the Buffer. Parsing is deterministic in that all structures

built contribute to the final parse. Decisions are made on the basis of local information.

This consists of the left context, which is constituted by the current active node and its

dominating S or NP node, and the right context, supplied by the contents of the three-cell

Buffer. Together these form the pattern in pattern-action type grammar rules.
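For illustration only, the data structures just described might be rendered in Prolog (the language in which the present model is implemented) roughly as follows. The predicate and structure names are hypothetical and are not taken from L.PARSIFAL itself; the sketch merely shows the kind of local information its parsing decisions consult.

% A rough sketch of Marcus-style parser state: an Active Node Stack of
% constituents under construction and a Buffer of at most three cells.
% state(Stack, Buffer): Stack holds node(Category, Daughters) terms.

% Attach: the item in the first Buffer cell becomes the rightmost
% daughter of the current active node (the top of the Stack).
attach(state([node(Cat, Ds)|Stack], [First|Buffer]),
       state([node(Cat, Ds1)|Stack], Buffer)) :-
    append(Ds, [First], Ds1).

% Pop: the completed current active node is returned to the first
% Buffer cell, provided the Buffer has a free cell.
pop(state([Done|Stack], Buffer), state(Stack, [Done|Buffer])) :-
    length(Buffer, N), N < 3.

% A pattern-action rule consults only local information: the current
% active node (left context) and the Buffer contents (right context).
rule(attach, state([node(np, _)|_], [word(det, _)|_])).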

Determinism means that parsing failure does not allow the option of backtracking, and

thus always leads to acquisition mode being triggered. In this case, (up to) four possible

actions in acquisition are examined, in order of priority, an action being applied if it results

in a situation whereby parsing can continue using existing rules (i.e., acquisition is non-

recursive). These actions involve attaching an item from the Buffer to the current active

node, switching items in the leftmost Buffer cells, inserting a lexical item into the Buffer, or

dropping trace into the Buffer. Only the first of these actions is basic, the others leading to


the acquisition of transformation-type rules where attachment fails. The “Attach” action

attaches the leftmost constituent from the Input Buffer to the current active node, the

constraints of X-Bar Theory permitting. X-Bar Theory is viewed as having the essential

role of constraining possible attachments. It can be applied by virtue of the fact that the

syntactic features of lexical items are assumed known; for instance, a noun has the

features [+Noun, -Verb, +Head]. X-Bar Theory’s constraints in effect constitute phrase-

structure rule templates.

The model developed here is, like L.PARSIFAL, based upon a left-corner deterministic

parser, although each uses different data structures. Some important differences between

the models are highlighted below. In the empiricist model developed here, X-Bar Theory is

not assumed and thus acquisition is not constrained by phrase-structure rule templates1.

The model developed also differs from L.PARSIFAL in that there are only two

possible actions, attachment and invocation. Since these are considered simultaneously,

and semantic input determines which is appropriate, the need for a constraint against

recursive acquisition is eliminated. In L.PARSIFAL such a constraint is viewed as a

necessary addition to the constraint of no backtracking if the behaviour of the model is to

be reasonably constrained.

A similarity between the two models of acquisition is that both assume grammar-

independent parsing preferences2. The problem in the case of L.PARSIFAL is that, as a

Marcus-type parser, it ought to acquire grammar-dependent preferences; that is, priorities

attached to rules are required in order to disambiguate between those which share the

same pattern but propose different actions. In the case of the model developed here, the

assumption of grammar-independent preferences does not result in any such

inconsistencies.

4. 9. Summary and Conclusions

This chapter has been concerned with the development of a model of parsing in acquisition.

The rationale behind basing the model upon a parser having been given, consideration was

given as to how this parser was to meet the joint requirements of acquisition and

psycholinguistic theories of human syntactic processing. On this basis, it was determined

that the parser should be a deterministic left-corner parser. Existing deterministic parsers

have also tended to use the left-corner strategy, including L.PARSIFAL (Berwick &

1 The problem addressed in the following chapter is that of acquiring a phrase-structure, as opposed to a finite-state, grammar in such a model. It is hypothesised that, given the correct assumptions, the semantic input to acquisition will provide sufficient information from which to infer phrase structure.
2 In the case of L.PARSIFAL, the four possible actions in acquisition are attempted in a fixed sequence.


Weinberg 1984, Berwick 1985) which formed the basis of a model of acquisition. Studies of

human syntactic processing suggest that various kinds of information may be implicated in

syntactic ambiguity resolution. The parser incorporated two such kinds of information,

predicate-argument frames and grammar-independent structural preferences. The

limitations of the model precluded the use of other kinds of information, such as referential

context or prosody.

Ambiguity resolution in the model is summarised as follows. A top-down check and

lookahead over a single terminal constituent were used to prune the proposed edges.

Other kinds of information used were the preference for continuing to construct the current

active node, and the preference for filling all open arguments of verbs. Interestingly, due to the

points at which decisions were made given a left-corner parsing strategy, structural

preferences and the preference for arguments over adjuncts did not conflict. This suggests

that theories of parsing need to consider explicitly the issue of control.

The most important issue addressed by the development of the basic model concerned the

integration of parsing and acquisition. We identified a lack of meta-knowledge as the

source of the difficulty in integrating these processes. Constraints on rule-based parsing

were essential to ensuring that acquisition was triggered when required. Our solution to

the problem of integration also suggested that assuming innate knowledge, such as that of

syntactic categories, may actually inhibit acquisition. The role of existing knowledge in

constraining learning is an issue of general importance, to which we later return.

There are deficiencies with the model, apart from those which are addressed in the

following chapters. Unreasonably strong assumptions have been made about the semantic

input available in acquisition. These make sense, however, in relation to the issues

addressed in the following chapter: if a solution cannot be demonstrated given an ideal

semantic input, it makes little sense to attempt a more realistic simulation of the

acquisition problem. Given that the parser is to be used as the basis for the model, in

retrospect the Chart may not be the optimal choice of data structure; for instance, a more

constrained form of storage for parsed constituents, such as the Stack, would facilitate the

imposition of memory limitations by restricting the depth of the nodes

which may be examined.


5. Phrase Structure Acquisition in the Computational Model

5. 1. Introduction

The subject of this chapter is how the problem of phrase structure acquisition is addressed

in the model. Examining objectives for computational models of child language acquisition,

and evaluating existing models against these, we identified phrase structure acquisition

as an issue which needed to be addressed. Acquisition in the model described in the

previous chapter is used to illustrate how the problem arises. The existing innatist

solution is examined and rejected. An alternative empiricist solution is proposed and

implemented in the model, the basic architecture of which remains unchanged. It is argued

that this better accounts for the observed features of child language development than

does the innatist solution. The model developed here thus makes an important advance

upon existing models by demonstrating that the solution of the problem of phrase structure

acquisition does not require the assumption of innate linguistic knowledge. We suggest

that the solution of the problem does, on the other hand, require a change in the

assumptions that have normally been made concerning the relationship of syntax

acquisition to other aspects of language acquisition, such as segmentation and lexical

acquisition.

5. 2. The Problem of Phrase Structure Acquisition

Here, we illustrate how the problem of phrase structure acquisition arises in existing

empiricist models by using an example of acquisition in the model described in the

previous chapter.

Lexicon

Lex1 → Daddy [pred:daddy]

Lex2 → threw [pred:throw(subj,obj), tense:past]

Lex3 → ball [pred:ball]

Input

Utterance: [Daddy, threw, the, ball]

F-Structure: [pred:throw(subj,obj),

tense:past,

subj:[pred:daddy],

obj:[pred:ball, def:pos]]

Successful acquisition requires that the C-Structure for the utterance be output.


[Figure 5.1 The Problem of Phrase Structure Acquisition: the partially constructed C-Structure for “Daddy threw the ball”, with Node1(↑subj) = ↓ and Lex2↑ = ↓ (“threw”) attached under Node2; a dashed line indicates the node still to be proposed, Node3(↑obj) = ↓, above “the ball”.]

In this case, acquisition mechanisms allow the C-Structure to be only partially constructed

(Figure 5.1). The dashed line illustrates the next step required, which is proposing a node,

Node3, corresponding to the object function.

We focus here upon the problem of acquiring the following rule for the noun phrase:

Node3 → F1 Lex3 (e.g., “the ball”)

Embedded within the F-Structure for the utterance is that for the object, which

corresponds to the phrasal constituent to be acquired, and the meaning of “ball” can be

recognised as in turn embedded within this:

[Figure 5.2 Matching Lexical Items to the Semantic Input: the F-Structure for the utterance, [pred:throw(subj,obj), tense:past, subj:[pred:daddy], obj:[pred:ball, def:pos]], with the meaning of “ball” shown embedded within the object F-Structure.]

Given the assumption that semantic constituency underlies syntactic constituency, it can

be inferred that “ball” is a subconstituent of some constituent of the sentence

corresponding to the object function, Node3. What we cannot infer is that “the” is also a

subconstituent of the same constituent, Node3. In order to recognise this, we must be able

to map the meaning of “the” onto the object function of the input F-Structure. This is not

possible, however, since it is not considered reasonable to assume that the meanings of


function words have been acquired prior to syntax acquisition. The argument in favour of

the innateness of X-Bar Theory has thus been based upon the apparent impossibility of

accounting for the integration of function words into phrases in the absence of such

knowledge.

It can be seen that phrase structure acquisition would be possible were we equipped with

the meanings of functional morphemes. It has generally been assumed that the meanings

of certain content words, like “ball”, but not function words, like “the”, may be acquired

prior to syntax acquisition, since meaning is closely associated with use in the latter case.

Functional morpheme acquisition is itself problematic since, in a model constrained by

psychologically plausible processing limitations, phrase structure appears to be exactly

the kind of local information needed to guide this process, as we illustrate below.

5. 3. The Problem of Functional Morpheme Acquisition

If we knew that “the” was a subconstituent of Node3, as could be inferred given the

phrase-structure tree, we could narrow down its possible defining features to those in the

F-Structure corresponding to Node3, that of the object function (Figure 5.3).

[Figure 5.3 Phrase Structure in Functional Morpheme Acquisition: the C-Structure for “Daddy threw the ball”, in which “the” (F1↑ = ↓) and “ball” (Lex3↑ = ↓) are attached under Node3(↑obj) = ↓, so that the defining features of “the” can be sought within the object F-Structure [pred:ball, def:pos].]

In the absence of this kind of information, however, we know only that the feature:value

pair or pairs defining “the” are to be located somewhere within the input F-Structure for

the utterance as a whole. We are presented with too large a search space for this

information alone to be really useful. What phrase structure acquisition requires in this

case is precisely that we be able to narrow down the defining feature:value pairs to those

in the object F-Structure.


The problem of functional morpheme acquisition can be seen to present empiricist models

with a problem analogous to that of phrase structure acquisition. That is, we have two

problems that appear to be linked in such a way that the solution to each presupposes the

solution of the other.

5. 4. The Approach to the Problem of Phrase Structure Acquisition

In the previous chapter we mentioned two possible ways of addressing the inadequacies

of existing models. The innatist approach rejected involves the proposal of further innate

constraints designed to make innatist models of phrase structure acquisition less powerful

and thus reasonable models of child language development. The empiricist approach taken

here involves addressing the problem of phrase structure acquisition in a model which

does not assume innate linguistic knowledge. Some reasons underlying the choice of

approach have already been given. We argued that the empiricist approach was more

likely to provide a solution to the broader problem of acquiring the target syntactic phrase-

structure grammar from an earlier, less powerful lexical grammar. There are further

problems with the existing innatist approach, and these we outline below. We suggest

that the assumption of X-Bar Theory is unreasonable and that it also ought to be

unnecessary given the semantic input to acquisition. Thus an empiricist solution is

favoured.

5. 4. 1. The Innatist Approach

The innatist view is that phrase structure cannot be acquired and thus various kinds of

innate linguistic knowledge are proposed as accounting for phrase structure. This view has

already been rejected on the grounds that it results in poor models of child language

development. However, here we examine the role of X-Bar Theory in innatist models in

order to determine whether, and if so how, it may be replaced in an empiricist model.

The essentials of X-Bar Theory are briefly summarized here. Phrasal constituents of the

form X'' (or XP) are viewed as maximal projections of the head of the phrase, X. X may be

constituted by any of the major lexical categories:

“The main point of this theory … is that all major lexical categories (noun, verb,

adjective, adverb, and preposition) admit of essentially the same range of

types of modification.”

(Jackendoff 1983, p.57)

X-Bar Theory further specifies at what levels of projection various specifiers and modifiers

are to be attached to the phrase-structure tree.


X-Bar Theory constrains acquisition in innatist models by providing a small set of

possible rule templates:

“the problem for the learner is reduced essentially to figuring out what items

may go in the slots on either side of the X.”

(Berwick & Weinberg 1984, p.209)

Since X-Bar Theory is defined in terms of syntactic categories, the syntactic categories

(Pinker 1982, 1984) or features (Berwick & Weinberg 1984, Berwick 1985) of lexical

items are also assumed available to the learner; for example, in L.PARSIFAL, verbs are

marked by the features [-Noun, +Verb, +Head]. The role of X-Bar Theory is essentially

to provide information about the relations amongst constituents, the basis of which is

semantic. What is language-specific, and thus acquired from the utterance, is the ordering

of constituents within the rule templates provided by X-Bar Theory.

Returning to the example problem outlined above, X-Bar Theory effectively gives us a set

of all possible rules in the grammar. In this case, we are concerned with those for the noun

phrase:

NP → Noun Det

NP → Det Noun

In parsing our input utterance, the empiricist model fails to either attach “the” to the

parse tree or to propose a new node above it. Acquisition is straightforward, however,

given the set of possible rule templates and the knowledge that “the” is a determiner.

The appropriate rule is simply matched and selected from the set of possible rules

provided by X-Bar Theory, enabling proposal of the noun phrase node (Figure 5.4).

[Figure 5.4 X-Bar Theory Guides Rule Invocation: the parse tree for “Daddy threw the ball”, in which S dominates Np and Vp, and the known category of “the” (Det) matches a rule template, allowing the object Np node to be proposed.]
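The following fragment is a minimal sketch, not drawn from any of the models cited, of how such rule invocation could operate once syntactic categories are assumed known: the stored templates play the role of X-Bar Theory, and the hypothetical category/2 facts stand in for the assumed lexical knowledge.

% X-Bar-style rule templates for the noun phrase (cf. NP → Det Noun,
% NP → Noun Det) together with assumed knowledge of lexical categories.
template(np, [det, noun]).
template(np, [noun, det]).

category(the,  det).
category(ball, noun).

% The observed word order selects the template whose category sequence
% it matches, licensing the proposal of the Np node above the words.
select_template(Words, Phrase) :-
    maplist(category, Words, Cats),
    template(Phrase, Cats).

% ?- select_template([the, ball], Phrase).
% Phrase = np.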


X-Bar Theory does offer a solution to the problem of how a phrase-structure grammar is

to be acquired; however, its role in innatist models appears somewhat paradoxical. If it

provides information of a semantic nature about the relations amongst constituents, then

this role ought to be fulfilled by a semantic representation of the utterance, information of a

kind which is available to both empiricist and innatist models. That is, X-Bar Theory ought

to be redundant. The success of innatist, but not empiricist, models in acquiring phrase

structure, however, demonstrates that X-Bar Theory isn’t redundant. We suggest that,

while the basis of X-Bar Theory is indeed semantic, it enables acquisition by expressing

this information in syntactic terms. It provides information of the kind that is missing in

empiricist models; for example, explicitly encoding the fact that the determiner is a

subconstituent of the noun phrase. It has generally been considered unreasonable to

assume the meanings of functional morphemes, like determiners, in acquisition, since their

meanings are closely linked with their syntactic functions. Given the close link between

the meaning of a function word and its use, the information encoded in X-Bar Theory must

similarly be viewed as unreasonable.

5. 4. 2. An Empiricist Solution

X-Bar Theory enables phrase structure acquisition by expressing information of a

semantic nature in syntactic terms. The solution to the problem of phrase structure

acquisition proposed here takes as its starting point the hypothesis that, given an

appropriate empiricist model, we ought to be able to directly derive the kind of information

supplied by X-Bar Theory from the semantic input to acquisition. We illustrate that the

kind of information required for phrase structure acquisition is to be found within an ideal

semantic input to acquisition such as the F-Structure for the utterance. Since existing

models do not make effective use of such information, we suggest that an alternative

model is required which is able to exploit the structural information contained in the input

to acquisition. To this end, we re-examine the assumptions underlying existing models

and propose an alternative model with a revised set of assumptions.

5. 4. 2. 1. The Role of F-Structure

The two levels of representation in LFG are C-Structure and F-Structure. In normal

parsing, grammar rules are used in constructing the C-Structure for the utterance, which in

turn determines the F-Structure. C-Structure contains all the information needed to

construct F-Structure which essentially consists of unordered feature:value pairs. The

converse is not, however, true, since F-Structure does not contain the ordering

information relevant at the level of C-Structure. In acquisition, the task is to make this

mapping from F-Structure to C-Structure. We suggest that the input utterance itself


provides the necessary ordering information which is absent from the F-Structure.

Therefore, while neither the F-Structure alone nor the utterance alone can be mapped onto
the C-Structure, it should be possible to make both mappings

simultaneously. The F-Structure is viewed here as, like X-Bar Theory, giving rise to all

possible phrase-structure tree templates. The utterance matches and selects one of

these, and the C-Structure is thereby derived. The difference between the F-Structure and

X-Bar Theory is that the latter, but not the former, supplies syntactic category

information.

[Figure 5.5 Directed Acyclic Graphs Represent F-Structure: two of the ordered DAGs derivable from the F-Structure for “Anna eats the apple”, differing in the order of the subj, pred and obj arcs, but each keeping the feature:value pairs of the embedded object F-Structure (pred:apple, def:pos) together.]

We illustrate that phrase-structure relations are implicit in F-Structures, by representing

the latter as Directed Acyclic Graphs (DAGs). Each unordered F-Structure gives rise to

a number of ordered DAGs which function as potential phrase-structure tree templates.

For example, the above two DAGs are both valid re-representations of the F-Structure

for the sentence “Anna eats the apple” (Figure 5.5). The DAG which splits up the

feature:value pairs within the object function is, however, considered irregular (Figure


5.6). The utterance orders the F-Structure by selecting that DAG which corresponds to

its phrase-structure tree (Figure 5.7).

[Figure 5.6 Invalid Representation of F-Structure: a DAG in which the feature:value pairs of the object F-Structure (pred:apple and def:pos) are split up, and which therefore does not correspond to a well-formed phrase-structure tree template.]

[Figure 5.7 The F-Structure as a Phrase-Structure Tree Template: the DAG selected by the utterance “Anna eats the apple”, shown with the corresponding C-Structure in which Node1(↑subj) = ↓ dominates “anna” and Node3(↑obj) = ↓ dominates “the” (F1↑ = ↓) and “apple” (Lex3↑ = ↓).]


The phrase-structure tree above is not one built by the model. The diagrams at this point

merely serve to illustrate the argument put forward that the semantic relations in the F-

Structure input to acquisition underlie the phrase-structure relations required in the output

from acquisition. It remains to demonstrate how the selection of the appropriate phrase-

structure tree template by the utterance may be implemented in a computational model.
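As a first, highly simplified indication that this is possible in principle, the following Prolog sketch treats each ordering of an F-Structure’s feature:value pairs, with embedded F-Structures kept intact, as a candidate template, and lets the left-to-right order of the recognised content words select amongst them. The predicate names are hypothetical, and this is not the mechanism adopted in the model, which is developed in the remainder of this chapter.

% Toy lexicon of content words whose meanings are assumed acquired; the
% determiner contributes nothing at this stage.
meaning(anna,  anna).
meaning(eats,  eat(subj, obj)).
meaning(apple, apple).

% Each ordering of the feature:value pairs at every level, keeping each
% embedded F-Structure intact, is one candidate template (one DAG).
candidate(FStruct, Ordered) :-
    permutation(FStruct, Perm),
    maplist(order_value, Perm, Ordered).

order_value(Feat:Val, Feat:Ordered) :-
    is_list(Val), !,
    candidate(Val, Ordered).
order_value(Pair, Pair).

% The left-to-right sequence of pred values in a candidate ordering.
pred_fringe([], []).
pred_fringe([pred:P|Rest], [P|Ps]) :- !, pred_fringe(Rest, Ps).
pred_fringe([_:Val|Rest], Ps) :-
    is_list(Val), !,
    pred_fringe(Val, P1), pred_fringe(Rest, P2), append(P1, P2, Ps).
pred_fringe([_|Rest], Ps) :- pred_fringe(Rest, Ps).

% The utterance selects a candidate: the preds of its recognised
% content words must appear in the same left-to-right order.
select_template(Words, FStruct, Candidate) :-
    findall(M, (member(W, Words), meaning(W, M)), WordPreds),
    candidate(FStruct, Candidate),
    pred_fringe(Candidate, Fringe),
    Fringe == WordPreds.

% ?- select_template([anna, eats, the, apple],
%                    [pred:eat(subj,obj), subj:[pred:anna],
%                     obj:[pred:apple, def:pos]], T).
% T = [subj:[pred:anna], pred:eat(subj,obj), obj:[pred:apple, def:pos]]
% (one of the orderings consistent with the utterance).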

5. 4. 2. 2. Revising the Model to Exploit F-Structure

We approach the problem of phrase structure acquisition by examining the assumptions of

existing models which give rise to it and replacing these with a set of assumptions which

allows us to exploit the structural information implicit in the semantic representations of

utterances. In very abstract terms the solution arrived at can be characterized as follows.

While existing models of syntax acquisition have viewed it as independent of various

other acquisition processes in order to simplify the problems addressed, it is argued here

that an integrated model of acquisition processes is required in which syntax acquisition is

viewed as taking place alongside, for instance, lexical acquisition and segmentation. This

idea is expanded upon below.

Existing empiricist models suggest that we cannot acquire a rule for a noun phrase under

the following conditions:

(i) input includes a grammatical sentence, e.g., “Fido chased the cat”,

segmented into its constituent morphemes, and paired with a semantic

representation;

(ii) the meaning of the noun, i.e., “cat”, is known; and,

(iii) the meaning of the determiner, i.e., “the”, is unknown.

These assumptions are reviewed in the following sections.

5. 4. 2. 2. 1. The Role of Different Types of Input Utterance

Existing models have invariably assumed that the acquisition problem is simplified by
treating all input utterances as grammatical sentences. However, if we relax this

assumption, allowing noun phrases like “the cat” as input utterances, then acquisition of

the noun phrase is facilitated. This is because, as the semantic structure for the utterance

is simple (i.e., with no nesting of arguments), we can infer that the determiner is

immediately dominated by the utterance-level node. We can also narrow down the

meaning of “the” to some subset of those features in the semantic structure for the

utterance.


The assumption that well-formed phrases may constitute input utterances turns out to be

an appropriate one, since phrasal utterances, especially noun phrases (and also isolated

nouns), are found in the speech directed at children (Newport 1977). The first alternative

assumption we suggest is, then, that input utterances include phrases as well as

sentences.

5. 4. 2. 2. 2. The Role of Lexical Acquisition and Segmentation

Assuming different kinds of input utterances transforms the related problems of phrase

structure and functional morpheme acquisition into a single, tractable problem. Since

functional morpheme acquisition is a case of lexical acquisition, the meaning of content

words like “cat” could also be acquired in this way. This assumes a view of lexical

acquisition in which lexical entries may initially contain redundant features which are ruled

out by later utterances in which the words are encountered. What is required is that the

original lexical entry be sufficiently specific to (sometimes) enable matching of the word to

the appropriate part of the semantic structure of an utterance containing the word, since

this will enable acquisition of the phrase-structure grammar. This function would be

further facilitated if, instead of acquiring lexical entries for “the” and “cat”, the model

initially acquired a single lexical entry for “the cat”. This lexical entry, having no

redundant features, would always be successful, although in relation to a smaller

proportion of input utterances. The assumption that a lexical entry in acquisition may

subsume a number of words is consistent with one of the observed features of child

language development, which is the extensive evidence of under-segmentation:

“The first units of language acquired by children do not necessarily correspond to

the minimal units (morphemes) of language described by conventional

linguistics. They frequently consist of more than one (adult) word or

morpheme.”

(Peters 1983, p.89)

The further alternative assumption we make, then, is that syntax acquisition may actually

be facilitated if we initially assume no lexical acquisition or segmentation, instead

modelling these alongside acquisition of the grammar.
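As a minimal illustration of this view of lexical acquisition (the predicate names are hypothetical; the actual mechanism is described later in this chapter), an over-specific initial entry could be narrowed to those feature:value pairs which recur in the relevant part of a later F-Structure containing the same word.

% Ruling out redundant features: an initial entry pairing "cat" with
% redundant features is narrowed on a later encounter with the word.
refine_entry(Entry0, LaterFS, Entry) :-
    intersection(Entry0, LaterFS, Entry).

% ?- refine_entry([pred:cat, def:pos], [pred:cat, num:pl], Entry).
% Entry = [pred:cat].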

5. 4. 2. 3. Development of the Phrase-Structure Grammar

Assuming units like “the cat” during acquisition is only appropriate if these may form the

basis for development of the adult lexicon and grammar. We suggest that development is

possible given different kinds of input utterance. Segmentation of units like “the cat” will

be induced by the presence in the input of isolated nouns like “cat”. Similarly, “the cat sat


on the mat” may be entertained as a lexical item which may be gradually broken down into

its constituents due to the presence in the input of units like “the cat”.

The emerging account of acquisition suggests a novel view of the distribution of

knowledge between lexicon and grammar1. The adult segmentation does not define what

constitutes a lexical item but, rather, minimises the number of units stored in the lexicon

and results in a maximally general grammar. The acquisition of the grammar is closely

bound up with the acquisition of the lexicon. Segmentation takes place as a kind of side-

effect of lexical acquisition; for instance, having acquired “the cat” and “cat”, the learner

infers that “the” is a lexical item, and “the cat” is segmented into its constituents.

Acquisition of the grammar is in turn linked to segmentation, in that the acquisition of

rules linking constituents serves to replace combinatory information lost through

segmentation.
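A minimal sketch of this side-effect, using the phonological representation introduced below (Section 5.6) and hypothetical predicate names, is the following: where one stored unit ends with another stored unit, the residue is proposed as a further lexical item, and the larger unit is thereby segmented.

% Stored units: "the cat" and "cat", each paired with an F-Structure.
stored([dd, sw, k, a, t], [pred:cat, def:pos]).
stored([k, a, t],         [pred:cat]).

% Where a smaller stored unit forms the tail of a larger one, the
% residue is proposed as a candidate lexical item ("the").
residue(Larger, Smaller, Residue) :-
    append(Residue, Smaller, Larger),
    Residue \= [].

segment(NewUnit) :-
    stored(Larger, _), stored(Smaller, _),
    Larger \= Smaller,
    residue(Larger, Smaller, NewUnit).

% ?- segment(Unit).
% Unit = [dd, sw]        (i.e. "the")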

5. 4. 2. 4. Summary of the Empiricist Solution

The empiricist solution offered here to the problem of phrase structure acquisition can be

summarized as that of replacing the independent model of syntax acquisition with an

integrated model of acquisition processes, and sentential input with input utterances of

various grammatical types.

5. 5. Lexical Recognition, Lexical Acquisition and Segmentation

Given the view of acquisition outlined above, there are a number of processes to be added

to the basic model described in the previous chapter. Since input utterances will not be

segmented into their lexical constituents, processes responsible for lexical recognition and

segmentation are required. In acquisition, processes are required to segment the input and

acquire new lexical items when processing using existing knowledge fails. Before

describing the implementation of these processes in the model, we discuss the underlying

approach taken to issues in lexical processing.

We assume a top-down approach to lexical recognition and segmentation. In top-down

accounts (e.g., McClelland & Elman 1986), segmentation is predicted from knowledge of

possible lexical items. In contrast, in bottom-up accounts (e.g., Grosjean & Gee 1987;

Cutler & Mehler 1993), segmentation is guided by prosodic cues and precedes lexical

lookup. The approach taken here to lexical recognition and segmentation was suggested

1 The work of Wolff (1988), which we discuss below, similarly takes an unconventional view on this issue, but it differs in important respects from that put forward here. The units in Wolff’s initial lexicon are smaller, rather than larger, than adult morphemes. Thus, unlike lexical items in the model developed here, they are not, and could not be, meaningful units.


by the account of lexical acquisition outlined above as part of the solution to the problem of

phrase structure acquisition. This account is top-down, with lexical acquisition guided by

existing lexical items (e.g., “the cat”) and segmentation the result of lexical acquisition

(of smaller units, like “cat”). This is a novel approach licensed by the assumption that

units like “the cat” may constitute lexical items during acquisition. It contrasts with

existing bottom-up approaches (e.g., Cairns et al 1994a, 1994b; Cartwright & Brent 1994,

Brent et al 1994) in which the acquisition of segmentation is assumed necessarily to

precede lexical acquisition.

Given the top-down approach, lexical processing in the model can be summarized as

follows. Lexical acquisition replaces lexical recognition where the latter fails, and

segmentation of the input results from both of these processes. Below, we consider the

issue of top-down versus bottom-up approaches, including the validity of the novel top-

down approach to acquisition introduced here.

5. 5. 1. Bottom-Up Approaches to Lexical Recognition

These have in common the feature that segmentation, or “juncture detection” (Norris &

Cutler 1985), precedes (and thus tightly constrains) lexical recognition. One approach to

the segmentation of English proposes that lexical lookup is initiated on the basis of

stressed syllables, which provide a fairly reliable cue to the start of lexical (as opposed to

grammatical) words:

“strong syllables trigger segmentation of continuous speech signals.”

(Cutler & Norris 1988, p.120)

Other bottom-up approaches include that of Grosjean and Gee (1987) who propose that

words are accessed by stressed syllables regardless of their position in the word, and

Church (1987) who emphasises the importance of phonotactic constraints.

The role of stressed syllables in triggering segmentation has been tested experimentally.

Cutler and Norris (1988) presented subjects with real words embedded in strings like

“mintesh” and “mintayve”. Recognition of the word was faster in those cases, like

“mintesh”, where the second syllable was weak, than for cases like “mintayve” which

had a strong second syllable. It was argued that the predicted lexical access initiated for

“tayve” accounted for the longer reaction time in this case. Further evidence in favour of a

processing advantage for stressed syllables comes in the form of an analysis of “slips of

the ear”. The predictions of the metrical segmentation strategy confirmed were that:

“errors involving insertion of a word boundary before a strong syllable or

deletion of a boundary before a weak syllable should prove to be relatively


common, whereas errors involving insertion of a boundary before a weak

syllable or deletion of a boundary before a strong syllable should be relatively

rare.”

(Cutler & Butterfield 1992, p.221)

Other evidence cited in favour of this approach is a comparative study of lexical access

strategies by Briscoe (1989). This favours the metrical segmentation strategy on the

assumption that it is the only strategy considered which keeps to a reasonable number the

lexical candidates considered. The difficulty here is, of course, knowing what constitutes a

reasonable search space. There is also the issue of the trade-off between constraining

lexical access and actually achieving accurate lexical recognition.

A number of criticisms have been made of bottom-up approaches. It is argued that the

information required for segmentation simply isn’t available in the input so that, while the

metrical segmentation strategy may provide a very good heuristic, it is nonetheless

unacceptable in terms of the number of false junctures proposed and actual junctures

neglected. Bard views the inability to exploit top-down information as an inherent

weakness:

“While function and content words have metrical characteristics, the distribution

of such words is controlled by syntax. Any prelexical strategy for characterizing

words which has as its strength the fact that it is autonomous will have as its

weakness the fact that it fails to use the appropriate higher-level information.”

(Bard 1990, p.204)

In contrast, the alternative approach outlined below, being interactive rather than

autonomous, is able to make use of both higher- and lower-level information, and to

account for the appearance of a metrical segmentation strategy.

A major weakness of the bottom-up approach is that the strategies proposed are

language-specific. While the stressed syllable has an important role in segmentation for

English-speakers, the syllable is the corresponding unit in French1, and the mora in

Japanese (Cutler & Mehler 1993). One implication of language-specificity is that an

account is required of the acquisition of the segmentation strategy itself. A language-

specific strategy, such as the metrical segmentation strategy for English, appears to

derive its efficacy from the fact that it incorporates useful distributional information

concerning the language. What is unclear is how the stressed syllable cue, reliant on the

1 If Briscoe’s arguments against the viability of lexical access triggered by syllables hold, does it follow that there can be no system of lexical access for French?


fact that stressed syllables tend to indicate the start of a lexical word in English, could be

acquired prior to the acquisition of the lexicon. There is the additional problem in

acquisition of explaining the motivation to acquire units which, from the viewpoint of the

segmentation strategy, cannot be regarded as meaningful. Nevertheless, existing

approaches to acquisition have tended to be in the bottom-up paradigm. These are

outlined below.

5. 5. 2. Top-Down Approaches to Lexical Recognition

In early versions of the top-down model (e.g., Cole & Jakimik 1980; Tyler & Marslen-

Wilson 1982), a phoneme consumed from the input stream activates a cohort of matching

lexical items. As further input is consumed, the cohort is reduced so that it remains

consistent with the input so far consumed, until there is a unique match with a particular

lexical item. At the end of one lexical item, recognition of the next is initiated so that

segmentation emerges from the process of lexical recognition.

One criticism of early top-down models is that they generate an implausible number of

lexical candidates. Another is that they naïvely assume a left-to-right strategy with

postlexical segmentation. Frauenfelder (1985) points out that statistical analyses of

lexicons demonstrate that it is often not possible to recognise a lexical item before its

acoustic offset. A further problem with such models is that they are not sufficiently robust

to deal with realistically noisy input.

The interactive-activation architecture of the TRACE model of word recognition

(McClelland & Elman 1986) provides a more sophisticated example of the top-down

approach. In this PDP model, activation of units at the phonemic level leads to activation

of connected units at the lexical level. Within levels there is competition amongst units as

the links are inhibitory. Activation is interactive since, as a lexical item increases in

activation, activation spreads back from it to those units at the phonemic level supporting

the same lexical hypothesis, and so on. Eventually, a pattern of activation is arrived at

which is interpretable as recognition of a particular lexical item. The advantages of this

model over its predecessors include robustness and a view of recognition which does not

entail a strictly postlexical approach to segmentation. TRACE does entertain a large

number of incorrect hypotheses, but experiments on human speech processing suggest

that in this respect it is a good model (Shillcock 1990).

One strength of the interactive-activation model is its use of a single mechanism to model

the two processes of lexical recognition and segmentation (Bard 1990). It also appears to


be able to take advantage of the kinds of information exploited by bottom-up approaches

to segmentation; for example:

“Like human subjects, the model exhibits apparent phonotactic rule effects on

phoneme identification, though it has no explicit representation of the

phonotactic rules.”

(McClelland & Elman 1986, p.71)

Persuasive arguments have also been put forward that a segmentation strategy which

actively seeks out, for example, stressed syllables, is not required: a model like TRACE

will naturally exploit the relative intelligibility (Bard 1990) and informativeness (Altmann

1990) of stressed syllables.

Top-down models of lexical recognition and segmentation have no associated account of

the development of these processes in children. Those who argue in favour of the

alternative bottom-up approach have thus focused upon this deficiency. Their argument is

that, while top-down approaches to recognition rely on the lexicon, the acquisition of the

lexicon itself presupposes segmentation (Mehler et al 1990; Cairns et al 1994b).

Acquisition of the lexicon from isolated words in the input is not regarded as plausible,

since function words, for instance, are not used in this way (Jusczyk 1993b). Below we

discuss how the top-down approach may be extended to incorporate an account of

acquisition.

5. 5. 3. Bottom-Up Approaches to Lexical Acquisition

The bottom-up approach to acquisition can be summarized as the proposal that

segmentation abilities, which precede lexical acquisition, are bootstrapped on the basis of

prosodic/suprasegmental and phonotactic/segmental information in the language:

“We have suggested that it may be the case that the characteristic pattern of a

language is sufficiently salient to assist the newborn child in segmenting the

continuous speech stream into discrete units.”

(Cutler & Mehler 1993, p.105)

Prosodic information is viewed as useful in acquisition due to the correlations which exist

between prosodic units and syntactic or lexical units. Phonotactic information is viewed as

useful at a lower level where knowledge of legal and illegal phoneme clusters can be used

to distinguish phonemes within the same unit (syllable or word) from those which belong

to different units. Below we evaluate accounts of the roles of each of these kinds of

information in acquisition.


There is empirical evidence to support the claim that infants are sensitive to prosody and

to correlations between prosody and syntax. Hirsh-Pasek et al (1987) found that infants

(7-10 months) oriented longer to speech segmented at clause boundaries than to that

segmented within clauses. Nelson et al (1989) compared the sensitivity of infants of the

same age range to prosodic-marking of clausal units in motherese and in adult-directed

speech, and found that the preference for segmentation at clause boundaries held only for

motherese. Jusczyk et al (1992) found that, while sensitivity to clausal units was apparent

at 6 months, sensitivity to phrasal units developed later, at about 9 months. The

experimental evidence has been used to infer that infants may be able to use prosodic

information:

“to do at least a rough parsing of the ongoing speech stream into clauses.”

(Hirsh-Pasek et al 1987, p.281)

However, evidence that children are aware of prosodic-syntactic correlations is not

evidence that awareness of prosodic information develops first and is used to segment the

input into syntactic units. The differences between child-directed and adult-directed

speech and infants’ perception of these is, however, persuasive in favour of the argument

that motherese may aid acquisition by highlighting (prosodically, or otherwise)

grammatical units within a continuous stream of speech. Important issues which accounts

of the role of prosody in acquisition still need to address include the question of how

correlations which are both language-specific and imperfect may form the basis for the

bootstrapping of segmentation abilities.

Further problems exist for accounts which emphasise the role of prosody in the acquisition

of those segmentation abilities underlying acquisition. Lexical acquisition is

underspecified, with important questions remaining unanswered. There is a need to

demonstrate infants’ sensitivity to prosodic markers of units like syllables and words, as

well as phrases and clauses. Mehler et al (1990) propose that the syllable (or rather a

universal, syllable-like unit) is a “prelexical unit” available to the child prior to, and for

use in, lexical acquisition. It is assumed that sensitivity to word and syllable boundaries

follows sensitivity to phrasal boundaries, in the same way that the latter follows

sensitivity to clausal boundaries. However, it is not clear that the prosodic basis of

syllabic and lexical recognition may be inferred from the role of prosody in the recognition

of larger syntactic units: different kinds of information may be implicated in each case.

Furthermore, nothing is said about how the different kinds of grammatical unit may be

distinguished, something which seems to be required by an account in which the syllable

is viewed as a unit with special status. Mehler et al do suggest that knowledge of the

distributional properties of languages at the phonemic level may also be required for


lexical recognition. However, it is recognised that these are language-specific and that an

account of their acquisition is required which suggests a paradox for the bottom-up

approach:

“some cases even require the acquisition of the lexicon to become functional.”

(Mehler et al 1990, p.249)

Having examined the role of prosody in accounts of lexical acquisition, we go on to

examine accounts of the roles of lower-level kinds of information.

A number of computational models have been developed which simultaneously attempt to

acquire phonotactic knowledge about a language and use this in segmenting the input into

those syllabic and lexical units viewed as providing the basis for lexical acquisition. In the

absence of a lexicon, phonotactic information is derived from a statistical analysis of the

input. Segmentation works on the assumption that frequent sequences of phonemes are

likely to be word-internal, whereas infrequent phoneme sequences are likely to indicate

word or syllable boundaries. Examples of this approach are provided by SNPR (Wolff

1988) which chunks together frequently adjacent elements to form words, and the work of

Cairns et al (1994b) which, by contrast, analyses input for infrequent sequences

corresponding to lexical or syllabic boundaries. Cartwright and Brent (1994) model

acquisition using both these kinds of information, singly and in combination, with both

child- and adult-directed speech. They find that performance in segmentation is optimised

when both kinds of information are used in the analysis of child-directed speech. Under

these conditions, 72.3% of segmentation points are located (recall) and 88.3% of the

segmentation points proposed are correct (accuracy). The advantage for child-directed

speech is attributed to the large number of repetitions it contains, e.g.,

“Do you see the kitty? See the kitty? Do you like the kitty?”

(Cartwright & Brent 1994, p.2)

The idea that infants (6-9 months) are sensitive to phonotactic as well as prosodic

information is supported by experiments carried out by Friederici and Wessels (1993).

They found a preference existed for legal over illegal phoneme clusters, even when

materials were filtered to eliminate prosodic cues, suggesting sensitivity to segmental as

well as suprasegmental features of language. Cartwright and Brent’s approach of

analysing input for both frequent and infrequent sequences is supported by the finding that

children appear to use both bracketing and clustering strategies in segmentation and

lexical acquisition (Goodsitt et al 1993). A further issue to be addressed in evaluating

phonotactic approaches to acquisition concerns the validity of their underlying

assumptions. Wolff’s SNPR and the approach of Cartwright and Brent appear too powerful


for psychological plausibility, since analyses operate upon input samples several

utterances long. By contrast, the model described by Cairns et al, based on a recurrent

neural network architecture, analyses input as it is consumed, comparing the sub-

phonemic features of a unit consumed with those predicted by the model. The problem with

this model, as it stands, is that it only finds 20% of actual word boundaries, although

syllable boundaries are detected in addition to these.

One interesting result of the work of Cairns et al stems from its use of a complex matrix of

sub-phonemic features to represent input. Given this accurate representation of

segmental information, but no explicit encoding of prosodic information, the model tends to

place boundaries before strong rather than weak syllables, one result predicted by

accounts which focus upon the role of prosody in acquisition. Cairns et al thus suggest that

the segmental/suprasegmental distinction may be an artificial one.

Phonotactic approaches appear, on the whole, to provide a better basis for a bottom-up

account of lexical acquisition than do prosodic approaches, since the former suggest how

language-specific segmentation strategies may be bootstrapped on the basis of the input

alone. A general weakness shared by all of the bottom-up accounts discussed, however,

is that they focus on the acquisition of segmentation while paying insufficient attention to

issues in lexical acquisition. That is, given an account of the extraction of segments from

the input, even assuming that these segments correspond to meaningful units in the

language, it remains to be explained how meanings are to be attached to these segments.

5. 5. 4. The Novel Top-Down Approach to Lexical Acquisition

Lexical recognition and segmentation in the model developed here represent a version of

the standard top-down approach discussed above. It seems that the top-down approach

is a viable alternative to the bottom-up approach, providing that the issue of acquisition

can be resolved. The approach to lexical acquisition in the model, where the acquisition of

novel lexical items is guided top-down by knowledge of existing lexical items, offers a

solution. Segmentation takes place in the model whenever a word is recognised or

acquired.

The novel and distinctive feature of the acquisition of segmentation as described here is

that it is top-down, guided by lexical items defined as functional units of meaning. The

function of units is something overlooked by bottom-up approaches to the acquisition of

segmentation, which assume that segmentation (into adult lexical items) must precede

lexical acquisition:


“it is difficult to reconcile the interactionist approach with the development of

segmentation since a lexicon is presupposed.”

(Cairns et al 1994b, p.4)

The model developed here avoids this difficulty by assuming a more flexible concept of

lexical item during acquisition. Its suggestion of how top-down information may be

incorporated into an account of the acquisition of segmentation is important for top-down

approaches to lexical recognition. It may also point the way towards hybrid accounts of

lexical acquisition and segmentation in which the acquisition of bottom-up cues is

facilitated where the learner has already acquired some lexical items top-down.

We briefly examine the assumptions underlying the proposed top-down model of lexical

acquisition. Processing in the model is not overly powerful, with input being analysed

incrementally as it is consumed. With respect to the empirical data, the account of

acquisition offered appears to be consistent both with the extensive evidence of children’s

under-segmentation and with the finding that sensitivity to clauses precedes sensitivity

to phrases. Both of these observations suggest, more generally, that a wholly bottom-up

phonotactic approach may be inappropriate.

An advantage of the account outlined over bottom-up approaches is that it incorporates

both lexical acquisition and segmentation. Furthermore, while segmentation in the top-

down account is driven by the requirements of comprehension, bottom-up approaches fail

to incorporate any explanation of what motivates the child to segment the input into as-

yet-meaningless units.

5. 5. 5. Summary of Approaches to Lexical Processing

Above, we discuss top-down and bottom-up approaches to lexical recognition,

segmentation, and lexical acquisition, including the novel top-down approach to lexical

acquisition introduced here. We argue that top-down approaches have certain advantages;

for example, combining lexical recognition and segmentation, and, similarly, lexical

acquisition and segmentation, into a single process. However, top-down and bottom-up

approaches need not be considered mutually exclusive, and some suggestion has been

given of how they may be combined in acquisition.

5. 6. Implementing Phrase Structure Acquisition

In implementational terms, the solution to the problem of phrase structure acquisition

involves rejecting a number of standard, “simplifying” assumptions of existing models.

This means that the model developed differs from existing models primarily in its input


assumptions rather than its architecture. It is assumed that neither lexical acquisition nor

segmentation has taken place prior to syntax acquisition, so mechanisms for lexical

acquisition are added to the model. In order that lexical acquisition, and thus

segmentation, may be triggered, it is necessary to assume that input to the model includes

phrases as well as sentences.

[Figure 5.8 The Integrated Model of Acquisition Processes: the inputs to the model, the utterance and its semantic structure (the F-Structure), are related by parsing, which draws upon and adds to the lexicon and grammar, to the output C-Structure.]

Lexical acquisition is, like grammar acquisition, need-driven, being triggered when lexical

lookup fails. The semantic input to acquisition becomes the F-Structure of the lexical entry

acquired where this lexical entry corresponds to the whole of the input utterance. Where it

constitutes only part of the input utterance, lexical acquisition involves determining the

corresponding part of the input F-Structure by matching the input with existing lexical

entries. The examples of learning given below illustrate lexical acquisition and show how

segmentation arises in the model.

5. 6. 1. Assumptions

In order to implement the model of acquisition outlined, it is necessary to change the

assumptions made about the representation of utterances input to the model and the

knowledge assumed acquired at the start of syntax acquisition. Changes are also required

to the model so that lexical acquisition and segmentation may take place.

One standard, simplifying assumption of the original model was a complete lexicon of

content words. This assumption, if retained, would confound acquisition in the integrated


model of acquisition processes, preventing the acquisition of units like “the cat” which are

precursors to the acquisition of function words.1 No lexicon is assumed by the revised

model, and this allows us to explore the issue of what kinds of input utterance are required

for acquisition in the model. Segmentation in the model requires that isolated nouns may

constitute input utterances: given nouns and noun phrases as input utterances, the latter

can be segmented into nouns and determiners. The existence of noun phrases in the input

may trigger the segmentation of sentences into verbs and noun phrases. In the case of

English verbs, the existence of uninflected forms enables us to segment inflected forms

into verb stem and inflection. The model thus appears to require input utterances of the

kind that are actually available to children. Newport (1977) found that utterances directed

at children consisted of phrases, especially noun phrases, and also isolated nouns, in

addition to sentences.

Rejecting the assumption of segmentation further requires, in addition to the removal of

the lexicon, that utterances input to the model be represented as unsegmented

phonological strings. These are represented as lists in Prolog, in which a unique atom

corresponds to each phone,

e.g., the cat sat on the mat

[dd, sw, k, a, t, s, a, t, o, n, dd, sw, m, a, t]

The requirement for lexical acquisition and segmentation is met by adding to the model

processes responsible for lexical acquisition. Segmentation is not implemented as a

separate process; rather, it takes place as a side-effect of lexical acquisition, as will

become apparent below.

5. 6. 2. Lexical Recognition

In order to demonstrate how lexical entries are acquired, it is first necessary to outline

how lexical recognition takes place in the revised model, given unsegmented phonological

strings as input. In consuming the input string, all possible candidates for the next lexical

item are calculated. For example, the following input string (“the cats chase the mouse”)

may match two lexical entries:

[dd, sw, k, a, t, z, ch, aa, s, dd, sw, m, ow, s]

LexA → [dd, sw, k, a, t] [pred:cat, …]

1 It would be possible to assume that some, but not all, content words had been acquired prior to syntax acquisition, without confounding functional morpheme and phrase structure acquisition.


LexB → [dd, sw, k, a, t, z] [pred:cat, num:pl, …]

Thus, representation in the revised model immediately makes apparent ambiguities that

were not evident given the assumption that input was pre-segmented1. Where, as in this

case, more than one lexical entry is matched, lookahead over up to two following lexical

items is used in ambiguity resolution. This works under the assumption that the

appropriate lexical item will be followed in the utterance by a valid lexical item, whereas

the choice of an inappropriate lexical item may prevent recognition of the remainder of the

input utterance. There is no recourse to, for example, prosodic, syntactic, or semantic

information. An example is given here. The following further lexical entries are assumed:

LexC → [ch, aa, s] [pred:chase(subj,obj), …]

LexD → [z, ch, aa, s, i, ng] [pred:chase(subj,obj), prog:pos, …]

LexE → [dd, sw, m, ow, s] [pred:mouse, …]

Lookahead will be successful for “the cats” but not “the cat”; hence, the former lexical

item is selected.
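To make the mechanism concrete, the following is a minimal Prolog sketch of candidate calculation and two-item lookahead. The lex/3 facts and the predicate names (candidate/3, select_lexical/3, lookahead/2) are illustrative assumptions rather than the thesis's actual code, and the F-Structures are abbreviated.

:- dynamic lex/3.

% Lexical entries as lex(Category, Phones, FStructure) facts.
lex(lexA, [dd, sw, k, a, t],      [pred:cat]).
lex(lexB, [dd, sw, k, a, t, z],   [pred:cat, num:pl]).
lex(lexC, [ch, aa, s],            [pred:chase(subj,obj)]).
lex(lexD, [z, ch, aa, s, i, ng],  [pred:chase(subj,obj), prog:pos]).
lex(lexE, [dd, sw, m, ow, s],     [pred:mouse]).

% candidate(+Input, -Cat, -Rest): a lexical entry whose phone string is a
% prefix of the remaining input.
candidate(Input, Cat, Rest) :-
    lex(Cat, Phones, _),
    append(Phones, Rest, Input).

% select_lexical(+Input, -Cat, -Rest): prefer a candidate whose continuation
% can itself be recognised for up to two further lexical items.
select_lexical(Input, Cat, Rest) :-
    candidate(Input, Cat, Rest),
    lookahead(Rest, 2).

lookahead(_, 0) :- !.
lookahead([], _) :- !.          % the utterance may simply end here
lookahead(Input, N) :-
    candidate(Input, _, Rest),
    N1 is N - 1,
    lookahead(Rest, N1).

% ?- select_lexical([dd,sw,k,a,t,z,ch,aa,s,dd,sw,m,ow,s], Cat, Rest).
% Cat = lexB: choosing lexA ("the cat") leaves [z,ch,aa,s|...], which matches
% no entry, so lookahead fails and "the cats" (lexB) is selected instead.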

5. 6. 3. Lexical Acquisition

Lexical acquisition in the model involves matching a phonological string with the

corresponding F-Structure and storing the pair as a lexical entry. There are no constraints

built into the model as to what phonological string may constitute a lexical entry. Thus, for

example, an unanalysed sentence may be stored as a lexical entry.

In the simplest case, lexical acquisition is triggered when no part of the input utterance

can be recognised on the basis of existing lexical entries. A lexical entry is acquired

consisting of the input utterance paired with its F-Structure (and given a unique category

label):

Input 1

Utterance: [dd, sw, k, a, t] “the cat”

F-Structure: [pred:cat, def:pos]

Acquired

Node1 → [dd, sw, k, a, t] [pred:cat, def:pos]

1 The simplified representation of utterances and lexical items as strings of phones, however, means that there is still no variability between different instances of the use of a word.


Input 2

Utterance: [d, a, d, ee, th, r, oo, z, dd, sw, b, o, l] “daddy throws the ball”

F-Structure: [subj:[pred:daddy],

pred:throw(subj,obj),

tense:present,

obj:[pred:ball, def:pos]]

Acquired

Node2 → [d, a, d, ee, th, r, oo, z, dd, sw, b, o, l] [subj:[pred:daddy],

pred:throw(subj,obj),

tense:present,

obj:[pred:ball, def:pos]]
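A minimal sketch of this simplest case, under the same assumed lex/3 representation as above (gensym/2, from SWI-Prolog's library(gensym), stands in here for whatever the model actually uses to create fresh category labels):

:- use_module(library(gensym)).

% acquire_unanalysed(+Phones, +FStructure): store the whole input utterance,
% paired with its F-Structure, under a newly created category label.
acquire_unanalysed(Phones, FStructure) :-
    gensym(node, Cat),                      % e.g. node1, node2, ...
    assertz(lex(Cat, Phones, FStructure)).

% ?- acquire_unanalysed([dd, sw, k, a, t], [pred:cat, def:pos]).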

In other cases, part of the utterance may be recognisable on the basis of existing lexical

entries. Lexical acquisition then involves deducing which features in the input F-Structure

are to be paired with that remaining part of the input utterance for which a lexical entry is

required. Lexical acquisition in cases of partial recognition is outlined below in two stages:

discerning the phonological string for which a lexical entry is required, and acquiring its

defining feature:value pairs.

Normal lexical recognition uses knowledge of possible lexical items. It is not strictly left-

to-right, but is constrained by the need to recognise items to both the left and the right of

any particular lexical item. Where recognition using existing lexical entries fails, lexical

acquisition is triggered. In this case, the phonological string of the lexical entry to be

acquired is inferred on the basis of those lexical items in the utterance which can be

recognised. The lexical entry to be acquired is taken to start either at the end of a lexical

item deterministically recognised or at the beginning of the utterance, and to end at the

start of a deterministically recognised lexical item or where the utterance ends.

Having determined the phonological form of the lexical entry to be acquired, acquisition of

the corresponding F-Structure uses the input F-Structure and existing lexical entries. The

lexical entry or entries used are those which subsume the lexical item to be acquired such

that it forms their head or tail. For example, in the acquisition of “chased fido” we may

use existing lexical entries for “the cat chased fido” and “a dog chased fido”. Lexical

acquisition utilises the fact that both the input F-Structure and the F-Structures of these


lexical entries represent the unification of the F-Structures of their subconstituents. In the

example used here, “chased fido” can be considered a common subconstituent of the input

utterance and the relevant lexical entries. It can be inferred from the nature of unification

that only feature:value pairs shared by the F-Structures examined are possible defining

features of the common subconstituent “chased fido”. Different feature:value pairs must

derive from subconstituents which are not shared; however, shared feature:value pairs

may also derive from different subconstituents where these share certain feature:value

pairs (e.g., “the dog”, “the cat”). Meaning acquisition involves comparing the input F-

Structure and the F-Structures of the appropriate lexical entries, and “reversing”

unification by extracting only common features, as illustrated below:

F-Structure for “the cat chased fido”:
[subj:[pred:cat, def:pos],
 pred:chase(subj,obj),
 tense:past,
 obj:[pred:fido]]

F-Structure for “a dog chased fido”:
[subj:[pred:dog, def:neg],
 pred:chase(subj,obj),
 tense:past,
 obj:[pred:fido]]

Figure 5.9 “Reversing” Unification to Extract Common Features

The result of meaning acquisition is a tentative lexical entry:

Node3 → [ch, aa, s, t, f, ii, d, oo] [pred:chase(subj,obj),

tense:past,

obj:[pred:fido]]
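The extraction of common features can be sketched as follows, again assuming the flat Feature:Value list representation of F-Structures (the predicate names are illustrative; the sketch simply intersects the two F-Structures, recursing into nested ones):

% common_features(+FS1, +FS2, -Common): keep a feature:value pair only if it
% occurs in both F-Structures; nested F-Structures are intersected
% recursively and kept only if they share something.
common_features(FS1, FS2, Common) :-
    findall(F:V,
            ( member(F:V1, FS1),
              member(F:V2, FS2),
              shared_value(V1, V2, V) ),
            Common).

shared_value(V1, V2, V) :-
    ( is_list(V1), is_list(V2) ->
        common_features(V1, V2, V),
        V \== []
    ;   V1 == V2,
        V = V1
    ).

% ?- common_features(
%        [subj:[pred:cat, def:pos], pred:chase(subj,obj),
%         tense:past, obj:[pred:fido]],                    % "the cat chased fido"
%        [subj:[pred:dog, def:neg], pred:chase(subj,obj),
%         tense:past, obj:[pred:fido]],                    % "a dog chased fido"
%        Common).
% Common = [pred:chase(subj,obj), tense:past, obj:[pred:fido]]
% (With "the dog chased fido" as the second exemplar, the shared but
%  non-defining subj:[def:pos] would also survive, as discussed below.)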

It is tentative since it may contain some non-defining feature:value pairs. Unification

allows us to infer that unshared features derive from different subconstituents; however,

shared features may derive from (i) a shared subconstituent, (ii) different subconstituents,

or (iii) both:

(i) [f1:v1] U [f0:v0] = [f1:v1, f0:v0]

[f2:v2] U [f0:v0] = [f2:v2, f0:v0]

(ii) [f1:v1, f0:v0] U [ ] = [f1:v1, f0:v0]

[f2:v2, f0:v0] U [ ] = [f2:v2, f0:v0]

(iii) [f1:v1,f0:v0] U [f0:v0] = [f1:v1, f0:v0]

[f2:v2,f0:v0] U [f0:v0] = [f2:v2, f0:v0]


In case (ii) above, our initial assumption that the shared feature:value pair “f0:v0” derived

from the common subconstituent would be incorrect and the proposed lexical entry would

contain non-defining features. The issue of how such examples are dealt with in the model

is discussed below.

Meaning acquisition is more complicated than in the above example where the shared

feature:value pairs which we are interested in recovering are to be found in an F-Structure

nested somewhere within the top-level F-Structure:

First F-Structure:
[subj:[pred:fido],
 pred:chase(subj,obj),
 tense:present,
 obj:[pred:tabby]]

Second F-Structure:
[subj:[pred:tabby],
 pred:sit(subj,loc),
 tense:past,
 loc:[pcase:on, pred:mat]]

Figure 5.10 Extracting Common Features from Nested F-Structures

Were the top-level F-Structures compared, no feature:value pairs in common would be

discerned. It is thus necessary to find some way of ensuring that the appropriate F-

Structures are compared with each other, where these can be recognised. This is possible

in the case of the present example, since the feature:value pair “pred:tabby” is uniquely

shared by the object function of the first F-Structure and the subject function of the second

F-Structure. “Reverse unification” is applied to the appropriate nested F-Structures,

enabling the acquisition of a lexical entry for the common string:

Node4 → [t, a, b, ee] [pred:tabby]

Were there ambiguity as to which were the appropriate F-Structures for comparison, then

lexical acquisition would fail, as illustrated (Figure 5.11).


First F-Structure:
[subj:[pred:dog, def:neg],
 pred:chase(subj,obj),
 tense:past,
 prog:neg,
 obj:[pred:cat, def:pos]]

Second F-Structure:
[subj:[pred:dog, def:pos],
 pred:eat(subj,obj),
 tense:present,
 prog:pos,
 obj:[pred:bone, def:neg]]

Figure 5.11 Failure to Extract a Unique Set of Common Features
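The requirement that the F-Structures to be compared are located uniquely can be sketched in the same style (sub_fstructure/2 and locate/4 are illustrative names; common_features/3 is the sketch given earlier):

% sub_fstructure(+FS, -Sub): FS itself, or any F-Structure nested within it.
sub_fstructure(FS, FS).
sub_fstructure(FS, Sub) :-
    member(_:V, FS),
    is_list(V),
    sub_fstructure(V, Sub).

% locate(+FS1, +FS2, -Sub1, -Sub2): succeeds only if exactly one pair of
% (possibly nested) F-Structures shares any features.
locate(FS1, FS2, Sub1, Sub2) :-
    findall(S1-S2,
            ( sub_fstructure(FS1, S1),
              sub_fstructure(FS2, S2),
              common_features(S1, S2, C),
              C \== [] ),
            [Sub1-Sub2]).            % demand a unique candidate pair

% For the F-Structures of Figure 5.10, the only pair sharing features is
% [pred:tabby]-[pred:tabby]; for those of Figure 5.11, several pairs share
% features, the findall/3 result is not a singleton, and locate/4 fails.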

5. 6. 4. Lexical Uncertainty and Lexical Lookup

As mentioned above, meaning acquisition may result in lexical entries which contain non-

defining feature:value pairs. While these were shared by all exemplars of use of the lexical

item at the point of acquisition of its lexical entry, they may cause failure of unification in

the processing of later inputs. For example, if the meaning of the lexical entry for “chased

fido” were inferred on the basis of the F-Structures for “the cat chased fido” and “the dog

chased fido”, then it would contain the non-defining feature:value pair subj-def:pos. If this

lexical entry were then used in parsing “a dog chased fido”, unification would fail:

[subj:[def:pos]] U [subj:[def:neg]] = FAILS

This problem is dealt with in the model by making lexical lookup flexible enough to cope

with the tentative lexical entries acquired. An alternative approach might involve setting

an arbitrary threshold on the number of exemplars of use required as the basis for

proposing a lexical entry. The advantage of the approach chosen is that it should result in

a more robust model.

Making lexical lookup flexible requires viewing lexical lookup and acquisition, not as

separate processes, but as variations of the same process, differing along the dimension of

knowledge acquired/required. At the initial stage of lexical acquisition, both the

phonological string and the corresponding F-Structure have to be acquired. At the

intermediate stage, the phonological string consumed matches that of an existing lexical

entry but the latter’s feature:value pairs may require pruning. Finally, no further changes

are required to the lexical entry so that the process of flexible lexical lookup appears

identical with ordinary, uncritical lexical lookup.

Flexible lexical lookup is implemented as follows. When a lexical entry is accessed during

the parsing of an input utterance, its F-Structure is compared with the input F-Structure


(in those cases where an input F-Structure exists). If the lexical F-Structure unifies with

the input F-Structure, or with an F-Structure nested within it, then the lexical entry

remains unchanged. However, if the F-Structures fail to unify, then a process aimed at

refining the lexical entry is triggered. The lexical entry is, where possible, matched to the

appropriate part of the input F-Structure, as in lexical acquisition. The new lexical entry

created contains just those feature:value pairs common to the two F-Structures. In this

way, all feature:value pairs which would lead to failure of unification in the parsing of the

current input utterance are removed. An example is outlined below.

The existing lexical entry for “chased fido” is assumed to be as follows:

Node5 → [ch, aa, s, t, f, ii, d, oo] [pred:chase(subj,obj),

tense:past,

subj:[def:pos],

obj:[pred:fido]]

Assuming “a dog chased fido” as the input utterance, the above lexical entry fails to unify

with the input F-Structure. Features are, however, shared by both F-Structures:

Lexical entry F-Structure (“chased fido”):
[pred:chase(subj,obj),
 tense:past,
 subj:[def:pos],
 obj:[pred:fido]]

Input F-Structure (“a dog chased fido”):
[subj:[pred:dog, def:neg],
 pred:chase(subj,obj),
 tense:past,
 obj:[pred:fido]]

Figure 5.12 Flexible Lexical Lookup

Unshared feature:value pairs are removed, resulting in a new lexical entry on the basis of

which parsing may continue:

Node5 → [ch, aa, s, t, f, ii, d, oo] [pred:chase(subj, obj),

tense:past,

obj:[pred:fido]]
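A minimal sketch of this refinement step, reusing common_features/3 from the earlier sketch (unifiable_fs/2 and flexible_lookup/4 are illustrative helper names, lex/3 is the assumed dynamic lexicon, and only the simple case in which the entry is matched against the top-level input F-Structure is shown):

% unifiable_fs(+FS1, +FS2): no feature carries conflicting values.
unifiable_fs(FS1, FS2) :-
    \+ ( member(F:V1, FS1),
         member(F:V2, FS2),
         \+ compatible(V1, V2) ).

compatible(V1, V2) :-
    ( is_list(V1), is_list(V2) -> unifiable_fs(V1, V2) ; V1 == V2 ).

% flexible_lookup(+Cat, +Phones, +InputFS, -FS): use the stored entry if it is
% compatible with the input F-Structure; otherwise prune it to the shared
% features and store the refined entry.
flexible_lookup(Cat, Phones, InputFS, FS) :-
    lex(Cat, Phones, LexFS),
    (   unifiable_fs(LexFS, InputFS)
    ->  FS = LexFS
    ;   common_features(LexFS, InputFS, FS),
        retract(lex(Cat, Phones, LexFS)),
        assertz(lex(Cat, Phones, FS))
    ).

Applied to the entry for “chased fido” and the input F-Structure for “a dog chased fido”, the conflicting subj:[def:pos] is removed, giving the revised entry just shown.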

5. 6. 5. Lexical Recognition During Acquisition

Recognition of known lexical items may be impaired in the context of acquisition. This is

because the unknown lexical item cannot be used in constraining the interpretation of

adjacent lexical items. Furthermore, since known lexical items may be only a small subset

of actual lexical items, there may be a failure to recognise ambiguity (i.e., between a


known lexical item and unknown alternatives). An example of a possible mis-

segmentation resulting from the context of acquisition would be the segmentation of

“chased the dog”, given knowledge of the lexical item “chase” but not “chased”:

[ch, aa, s | t, dd, sw, d, o, g]

[ch, aa, s]

One way in which it would be possible to address such problems in the model would be by

selecting lexical items, in the acquisition context, on the basis of F-Structure as well as

phonological string. In the example above, the F-Structure for “chase” would fail to unify

with the F-Structure of the input utterance, lexical recognition would fail, and lexical

acquisition of “chased the dog” would be triggered. This idea has, however, not been

implemented since it is incompatible with the model’s flexible lexical lookup and

incremental lexical acquisition. It is also heavily reliant upon the overly strong assumption

of the availability of a full semantic representation for the input utterance. Weaker

information such as subject-verb agreement is, similarly, not easily used in the acquisition

context: were subject and verb to fail to unify, it could be the case either that any one (or

more) of the existing lexical entries used is in need of refinement or that a new lexical

entry is required. The problem is thus a further example of the more general problem of a

lack of meta-knowledge in acquisition.

To a certain extent, mis-segmentation in the model need not be regarded as problematic.

The analysis of “chased the dog” as containing the lexical item “chase” is appropriate.

While the target adult segmentation suggests that “ed the dog” is not an appropriate

unit, this may eventually be further segmented and rules linking the resulting constituents

acquired. That is, because there are multiple possible routes to achieving the appropriate

segmentation in the model, separate mechanisms for recovery from mis-segmentation

may not be required. However, given a more detailed representation of the utterance it

should also be possible, in future versions of the model, to use bottom-up information to

further constrain what is currently a purely top-down approach to lexical acquisition and

segmentation.

5. 6. 6. Modelling Phrase Structure Acquisition

Here, we illustrate the acquisition of phrase structure under the alternative input

assumptions outlined above and given the revised model incorporating lexical acquisition.

No initial lexicon is assumed. We represent utterances input to the model as

unsegmented phonological strings. Appendices C & D provide actual transcripts of

acquisition in the revised model, as outlined below.


The first utterances input to the model are parsed as if they consist of a single constituent.

The first lexical entries acquired consist of an input utterance paired with its F-Structure.

Assuming that “dog”, “a mouse” and “the cat chased fido” have been parsed, the grammar and

lexicon are as follows:

Node2 → Node1
        ↑ = ↓
Node4 → Node3
        ↑ = ↓
Node6 → Node5
        ↑ = ↓

Node1 → [d, o, g] [ pred:dog ]

Node3 → [sw, m, ow, s] [ pred:mouse, def:neg ]

Node5 → [dd, sw, k, a, t, ch, aa, s, t, f, ii, d, oo] [pred:chase(subj,obj),

tense:past,

subj:[pred:cat, def:pos],

obj:[pred:fido]]

If the utterance “a mouse chased fido” is now input, “a mouse” is recognised and an

attempt is made to acquire a lexical entry for “chased fido”. The information used is the

input F-Structure for the utterance, along with existing lexical entries, in this case the

lexical entry for “the cat chased fido”. The input F-Structure is as illustrated (Figure

5.13).

[subj:[pred:mouse, def:neg],
 pred:chase(subj,obj),
 tense:past,
 obj:[pred:fido]]

Figure 5.13 Input F-Structure

It can be inferred from the nature of unification that only features common to the two

F-Structures may derive from the common string “chased fido”. A lexical entry is created


containing just these features, enabling the C-Structure to be built, and thus new rules to

be created1:

Node9 → [ch, aa, s, t, f, ii, d, oo] [pred:chase(subj,obj),

tense:past,

obj:[pred:fido]]

Node8 → Node7 Node9
        (↑subj) = ↓  ↑ = ↓
Node7 → Node3
        ↑ = ↓

In this case the lexical entry acquired contains no redundant features: if it did, these could

later be removed when found to lead to failure of unification, as outlined above.

Functional morpheme acquisition involves identical processes to other cases of lexical

acquisition, such as that just outlined. The input utterance “a dog chased fido” triggers

the acquisition of a lexical entry for “a”, which uses the input F-Structure and the lexical

item “a mouse”. In order for acquisition to take place successfully, it must be possible to

recognise that the possible defining features are located within the subject function of the

input F-Structure (Figure 5.14).

[subj:[pred:dog, def:neg],
 pred:chase(subj,obj),
 tense:past,
 obj:[pred:fido]]

Figure 5.14 Functional Morpheme Acquisition

This is possible, since this part of the input F-Structure uniquely shares a feature with the

lexical entry “a mouse”, and this goes to make up the F-Structure of the lexical entry

acquired:

Node10 → [sw] [ def: neg ]

1 As described in Chapter 4, the rules for Nodes 4 and 7 are recognised to be equivalent and are merged.


The acquisition of the functional morpheme in turn makes possible the completion of the

phrase-structure tree (Figure 5.15). The integrated model of acquisition processes thus

suggests a solution to the difficulty which functional morphemes have presented for

accounts of phrase structure acquisition.

[Node12
  [Node11 (↑subj) = ↓
    [Node10 ↑ = ↓  “a”]
    [Node1 ↑ = ↓  “dog”]]
  [Node9 ↑ = ↓  “chased fido”]]

Figure 5.15 Phrase Structure Acquisition

The further rules acquired (prior to generalization, outlined below) are:

Node12 → Node11 Node9
         (↑subj) = ↓  ↑ = ↓
Node11 → Node10 Node1
         ↑ = ↓  ↑ = ↓

5. 6. 7. Generalization and Syntactic Categorization

Those mechanisms described in the previous chapter are used in the revised model in the

induction of both lexical and phrasal categories. For example, the phrasal constituents

Node7 and Node11 occur in identical contexts in the rules labelled Node8 and Node12.

These rules are generalized over, with the merging of these categories and the rules in

which they occur, i.e., the categories Node8 and Node12 also come to be considered

identical:

Node12 → Node13 Node9
         (↑subj) = ↓  ↑ = ↓
Node13 → Node3
         ↑ = ↓


Node13 → Node10 Node1
         ↑ = ↓  ↑ = ↓
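The relabelling involved in merging two categories can be sketched as follows. Here grammar rules are assumed to be stored as rule(LHS, RHS) facts, with the right-hand side a list of Category-Annotation pairs; this representation and the predicate names are assumptions made for the illustration. Detecting that two categories occur in identical contexts, and collapsing rules that become identical after relabelling, are not shown.

:- dynamic rule/2.

% merge_categories(+Old, +New): rewrite every occurrence of category Old,
% whether as a rule label or as a right-hand-side constituent, to New.
merge_categories(Old, New) :-
    findall(rule(L, R), retract(rule(L, R)), Rules),
    forall(member(rule(L0, R0), Rules),
           ( rename(Old, New, L0, L1),
             maplist(rename_constituent(Old, New), R0, R1),
             assertz(rule(L1, R1)) )).

rename(Old, New, Old, New) :- !.
rename(_,   _,   Cat, Cat).

rename_constituent(Old, New, Cat0-Ann, Cat-Ann) :-
    rename(Old, New, Cat0, Cat).

% e.g., rewriting the categories discussed above to the single new label:
% ?- merge_categories(node7, node13), merge_categories(node11, node13).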

5. 6. 7. 1. Syntactic Category Splitting

It is possible for acquisition to result in a lexical entry such as the following example:

Node14 → [f, ii, d, oo] [tense:past, obj:[pred:fido]]

That is, in the absence of in-built constraints as to possible lexical entries, it is possible

to infer that “fido” is marked for tense if it has only been encountered in past tense

contexts. Eventually, later input utterances will provide the evidence required to deduce

that it is only features nested within the object F-Structure of the tentative lexical entry

which are required to define “fido” (Figure 5.16).

Input F-Structure:
[subj:[pred:tabby],
 tense:present,
 pred:bite(subj,obj),
 obj:[pred:fido]]

Tentative lexical entry F-Structure (“fido”):
[tense:past,
 obj:[pred:fido]]

Figure 5.16 Lexical Entry Revision Through Flexible Lexical Lookup

The appropriate lexical entry can then be discerned by the process of flexible lexical

lookup:

Node14 → [f, ii, d, oo] [pred:fido]

However, the object function having been removed from the lexical entry, it is now

required in the equations annotating the constituent in grammar rules. The task is thus to

recognise those cases where such a requirement applies and to amend the grammar

accordingly. The changes required to the grammar are not straightforward since there may

be other lexical items of category Node13 for which existing grammar rules are

appropriate. In the latter case, category-splitting must apply. This involves the following.

The revised lexical entry is given a unique syntactic category:

Node15 → [f, ii, d, oo] [pred:fido]

The grammar is amended accordingly: wherever a rule contains the constituent “Node13

↑ = ↓”, a duplicate rule is created containing the constituent “Node15 (↑obj) = ↓”. Where

a category equivalent to the new category is already present in the grammar, the new

category will be merged with this existing category through the process of generalization.


5. 6. 8. The Acquisition of Recursive Phrase-Structure Rules

The primary aim in developing the model has been to demonstrate that the semantic

inputs to acquisition constitute a sufficient basis for the acquisition of phrase structure,

given an appropriate model of acquisition, that is, an integrated model of acquisition

processes, as contrasted with an independent model of syntax acquisition which

presupposes prior lexical acquisition and segmentation. Our stated aim has been to focus

upon the very earliest stages of child language development, corresponding to Brown’s

Stages I & II, but also to take into account the need for learning processes which are

adequate for taking acquisition beyond these stages. The examples of learning in the

model outlined above result in the gradual acquisition of a phrase-structure grammar

which appears adequate for characterising children’s utterances at Stages I & II. By Stage

IV, however, which is concerned with the acquisition of complementation and control, it

appears that the simple phrase-structure grammar is inadequate and that a recursive

phrase-structure grammar is required.

Having demonstrated the acquisition of phrase structure, we are in a position to

demonstrate the acquisition of recursive constructions, since the model’s existing

mechanisms of generalization give rise to these in the presence of the appropriate inputs

to acquisition. Below we present an example of the acquisition of a recursive phrase-

structure grammar in the model. This does not constitute an extension of the model’s

account of child language development, which would be incomplete without reference to

such issues as the acquisition of those non-declarative sentence modalities characterising

Stage III. Rather, the account of the acquisition of recursive structures serves to illustrate

one way in which the model is equipped to meet the objective of learnability.

5. 6. 8. 1. Modelling the Acquisition of Recursive Constructions

We criticised innatist models, which presuppose X-Bar Theory and the syntactic

categories of the target grammar, as able to instantaneously acquire complex sentence

structures. The empiricist model developed contrasts with these in that a very large

number of inputs have to be processed before the kind of recursive rule in the adult

grammar may be acquired. It is for this reason that we illustrate the acquisition of a

grammar which, while it embodies the property of recursion, differs from the target

grammar in terms of its segmentation. Even given that lexical acquisition and

segmentation are kept to a minimum1, it can be seen that the model’s acquisition of a

recursive rule is non-instantaneous.

1 Since we are not primarily concerned with lexical acquisition in these examples, the number of features in the F-Structure is kept to a minimum.


The example presented here is concerned with the acquisition of a sentence-level rule

with a sentential complement. Since we presuppose no lexical acquisition or

segmentation, the first utterances input, Inputs 1-4 below, are acquired as lexical items

accompanied by a trivial finite-state grammar.

Input 1

Utterance: [k, l, oo, ee, th, i, n, k, s, dd, a, t, m, u, m, ee, z, i, n, dd, sw, g, ah, d, n]

“Chloe thinks that Mummy’s in the garden”

F-Structure: [subj:[pred:chloe],

pred:think(subj,comp),

comp:[subj:[pred:mummy],pred:be(subj,loc),loc:[pred:garden],pcase:in]]

Acquired

Node1 → [k, l, oo, ee, th, i, n, k, s, dd, a, t, m, u, m, ee, z, i, n, dd, sw, g, ah, d, n]

[subj:[pred:chloe],

pred:think(subj,comp),

comp:[subj:[pred:mummy],pred:be(subj,loc),loc:[pred:garden],pcase:in]]

Node2 → Node1

↑ = ↓

Input 2

Utterance: [m, u, m, ee, z, i, n, dd, sw, g, ah, d, n]

“Mummy’s in the garden”

F-Structure: [subj:[pred:mummy],pred:be(subj,loc),loc:[pred:garden],pcase:in]

Acquired

Node3 → [m, u, m, ee, z, i, n, dd, sw, g, ah, d, n]

[subj:[pred:mummy],pred:be(subj,loc),loc:[pred:garden],pcase:in]

Node4 → Node3

↑ = ↓

Input 3

Utterance: [e, m, sw, s, e, z, dd, a, t, d, a, d, ee, z, p, aa, n, t, i, ng,

dd, sw, k, i, ch, i, n]

“Emma says that Daddy’s painting the kitchen”


F-Structure: [subj:[pred:emma],

pred:say(subj,comp),

comp:[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]]]

Acquired

Node5 → [e, m, sw, s, e, z, dd, a, t, d, a, d, ee, z, p, aa, n, t, i, ng,

dd, sw, k, i, ch, i, n]

[subj:[pred:emma],

pred:say(subj,comp),

comp:[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]]]

Node6 → Node5

↑ = ↓

Input 4

Utterance: [d, a, d, ee, z, p, aa, n, t, i, ng, dd, sw, k, i, ch, i, n]

“Daddy’s painting the kitchen”

F-Structure: [subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]]

Acquired

Node7 → [d, a, d, ee, z, p, aa, n, t, i, ng, dd, sw, k, i, ch, i, n]

[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]]

Node8 → Node7

↑ = ↓

The next two input utterances differ from those above in that they can be partially

analysed using the newly acquired grammar and lexicon. The rules acquired have

sentential complements, but are non-recursive. No generalizations over the rules acquired

are as yet available; however, as rules with identical right-hand-side constituents are

acquired, these are merged1.

Input 5

Utterance: [k, l, oo, ee, th, i, n, k, s, dd, a, t, d, a, d, ee, z, p, aa, n, t, i, ng,

dd, sw, k, i, ch, i, n]

“Chloe thinks that Daddy’s painting the kitchen”

1 Node8 is replaced with the equivalent category Node11, and Node4 by the equivalent Node14.


F-Structure: [pred:think(subj,comp),

subj:[pred:chloe],

comp:[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]]]

Acquired

Node9 → [k, l, oo, ee, th, i, n, k, s, dd, a, t]

[pred:think(subj,comp),subj:[pred:chloe]]

Node10 → Node9 Node11
         ↑ = ↓  (↑comp) = ↓
Node11 → Node7
         ↑ = ↓

Input 6

Utterance: [e, m, sw, s, e, z, dd, a, t, m, u, m, ee, z, i, n, dd, sw, g, ah, d, n]

“Emma says that Mummy’s in the garden”

F-Structure: [pred:say(subj,comp),

subj:[pred:emma],

comp:[subj:[pred:mummy],pred:be(subj,loc),loc:[pred:garden],pcase:in]]

Acquired

Node12 → [e, m, sw, s, e, z, dd, a, t]

[pred:say(subj,comp),subj:[pred:emma]]

Node13 → Node12 Node14
         ↑ = ↓  (↑comp) = ↓
Node14 → Node3
         ↑ = ↓

Input 7 (Figure 5.17) provides the kind of input which is necessary for the acquisition of a

recursive rule.

Input 7

Utterance: [e, m, sw, s, e, z, dd, a, t, k, l, oo, ee, th, i, n, k, s, dd, a, t, m, u, m, ee, z,

i, n, dd, sw, g, ah, d, n]

“Emma says that Chloe thinks that Mummy’s in the garden”

F-Structure: [subj:[pred:emma],pred:say(subj,comp),comp:

[subj:[pred:chloe],pred:think(subj,comp),comp:

[subj:[pred:mummy],pred:be(subj,loc),loc:[pred:garden],pcase:in]]]


Rules Acquired (prior to generalization)

Node15 → Node12 Node16
         ↑ = ↓  (↑comp) = ↓
Node16 → Node9 Node17
         ↑ = ↓  (↑comp) = ↓
Node17 → Node3
         ↑ = ↓

[Node15
  [Node12 ↑ = ↓  “Emma says that”]
  [Node16 (↑comp) = ↓
    [Node9 ↑ = ↓  “Chloe thinks that”]
    [Node17 (↑comp) = ↓
      [Node3 ↑ = ↓  “Mummy’s in the garden”]]]]

Figure 5.17 Analysis of Utterance Triggering Acquisition of a Recursive Rule

Having built the C-Structure for Input 7, the model acquires a recursive sentence-level

rule through a series of generalizations applied to the rules acquired. The top-level rules

acquired through the analyses of Inputs 6 & 7 can be generalized over, as Node14 and

Node16 appear in these in identical immediate left- and right-hand contexts in the rules

labelled Node13 and Node15. All instances of Nodes 14 & 16 in the lexicon and grammar

are replaced with a new label, Node18:

Rules Prior to Generalization

Node13 → Node12 Node14
         ↑ = ↓  (↑comp) = ↓
Node15 → Node12 Node16
         ↑ = ↓  (↑comp) = ↓

1 As a result of the merging of the two rules generalized over, the category Node13 is also merged with Node15.


Generalized Rule
Node15 → Node12 Node18
         ↑ = ↓  (↑comp) = ↓

Related Rules1
Node18 → Node9 Node18          (Node16 → Node9 Node17)
         ↑ = ↓  (↑comp) = ↓     ↑ = ↓  (↑comp) = ↓
Node18 → Node3                 (Nodes 14/17 → Node3)
         ↑ = ↓                  ↑ = ↓

Here, a recursive rule is acquired simply through the filtering of the syntactic

categorization made through the lexicon and grammar. All instances of the categories

Node14 and Node16 are replaced with the generalized category Node18. Furthermore,

Node17 is also recognised to be an instance of the new category, through the merging of

rules with identical right-hand-side constituents. While a recursive rule has now been

acquired, it is not the most general one available. A further generalization is available over

one of the newly revised rules and the top-level rule resulting from the analysis of Input 5:

Rules Prior to Generalization

Node18 → Node9 Node18
         ↑ = ↓  (↑comp) = ↓
Node10 → Node9 Node11
         ↑ = ↓  (↑comp) = ↓

Generalized Rule

Node19 → Node9 Node19
         ↑ = ↓  (↑comp) = ↓

Related Rules

Node15 → Node12 Node19 (Node15 → Node12 Node18)

↑ = ↓ (↑comp) = ↓ ↑ = ↓ (↑comp) = ↓

Node19 → Node7 (Node11 → Node7)

↑ = ↓ ↑ = ↓

Node19 → Node3 (Node18 → Node3)

↑ = ↓ ↑ = ↓

1 Here the rule on the left represents the revised version of the rule on the right following the generalization.


The above generalization in turn licenses a generalization over Nodes 3 & 7, since the

additional constraint required for single-constituent rules is satisfied by the appearance of

these constituents in rules with identical labels:

Rules Prior to Generalization

Node19 → Node7
         ↑ = ↓
Node19 → Node3
         ↑ = ↓

Generalized Rule
Node19 → Node20
         ↑ = ↓

Related Rules

Node20 → [m, u, m, ee, z, i, n, dd, sw, g, ah, d, n]

(Node3) [subj:[pred:mummy],pred:be(subj,loc),loc:[pred:garden],pcase:in]

Node20 → [d, a, d, ee, z, p, aa, n, t, i, ng, dd, sw, k, i, ch, i, n]

(Node7) [subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]]

The final generalization available is over the sentence-level rules which can now be

generalized over to give a single, maximally general, recursive rule for sentences with

sentential complements1:

Rules Prior to Generalization

Node19 → Node9 Node19
         ↑ = ↓  (↑comp) = ↓
Node15 → Node12 Node19
         ↑ = ↓  (↑comp) = ↓

Generalized Rule
Node15 → Node21 Node15
         ↑ = ↓  (↑comp) = ↓

Related Rules

Node15 → Node20 (Node19 → Node20)

↑ = ↓ ↑ = ↓

Node21 → [k, l, oo, ee, th, i, n, k, s, dd, a, t]

(Node9) [pred:think(subj,comp),subj:[pred:chloe]]

1 Nodes 15 & 19 are recognised as equivalent. The choice of Node15 as the label to be retained is arbitrary.


Node21 → [e, m, sw, s, e, z, dd, a, t]

(Node12) [pred:say(subj,comp),subj:[pred:emma]]

Transcripts showing the model’s acquisition of the above recursive grammar are given in

Appendices G & H.

5. 6. 8. 2. The Role of Generalization and Recursive Structures in Acquisition

“Peter suspects that Emma says that Chloe thinks that Mummy’s in the garden”

[Node15
  [Node21 ↑ = ↓  “Peter suspects that”]
  [Node15 (↑comp) = ↓
    [Node21 ↑ = ↓  “Emma says that”]
    [Node15 (↑comp) = ↓
      [Node21 ↑ = ↓  “Chloe thinks that”]
      [Node15 (↑comp) = ↓
        [Node20 ↑ = ↓  “Mummy’s in the garden”]]]]]

Figure 5.18 Recursive Rules Enable Parsing of Novel Complex Utterances

The example given above of the acquisition of a recursive grammar serves to illustrate a

point made earlier in the chapter concerning the roles played in acquisition by the

F-Structure input and generalization processes. The C-Structure corresponding to a

particular utterance is recovered in acquisition by mapping the ordering information

contained in the utterance onto the F-Structure which, being unordered, underdetermines

C-Structure. Performing this mapping appears to present an intractable problem given a

standard, independent model of syntax acquisition which presupposes segmentation and a

lexicon of content words. This is thus the primary problem which the development of the

integrated model of acquisition processes addresses. A solution to this problem having

been implemented, existing mechanisms for generalization can be applied to the

C-Structures constructed from particular utterance/F-Structure pairs, resulting in the


acquisition of recursive rules. The importance of these recursive rules is that they provide

a means for analysing input utterances with further levels of embedding without the need

to assume that the F-Structures for these can be recovered from context (Figure 5.18)1.

The model thus avoids the criticism that2:

“too much thematic information ‘gives away’ the problem by indirectly encoding

the syntax of sentences.”

(Berwick 1985, p.24)

The issue of the role of semantics and other kinds of information in the acquisition of

syntax is one to which we return below.

5. 6. 9. Summary of Acquisition in the Revised Model

The model described initially acquires a finite-state grammar along with a lexicon

containing large, unproductive “chunks” of language. There is a gradual transition to a

recursive phrase-structure grammar and a lexicon of adult words. Development in the

model appears to be reasonably consistent with the observed features of child language

development, predicting a progression from the rote reproduction of utterances to the

acquisition of increasingly complex grammatical constructions. Syntactic categories are

gradually acquired in the model so that the lexical nature of early child language is also

captured.

5. 7. Further Issues in Acquisition

5. 7. 1. Closure in Acquisition

The issue of syntactic closure is made more difficult by the acquisition of non-standard lexical items in the revised model. As mentioned in the previous chapter,

we allow competing edges to co-exist in parsing so long as they agree on the constituents

found. An exception involves competing active and inactive edges. Were these allowed to

co-exist, the inactive might form the basis for rule invocation, while the next constituent

processed might extend one of the competing actives, such that there was no longer a

single analysis. Thus, closure is one of the issues addressed in ambiguity resolution. In

rule-based parsing, the choice between an inactive and the competing actives uses

information such as lookahead which is not available in acquisition. Below we outline how

1 This example presupposes that further lexical acquisition has taken place, but not that a construction of this complexity has previously been encountered.
2 The model seems to require, as in the example presented, inputs with two levels of embedding as the basis for the acquisition of recursive structure. Thus it agrees with Wexler & Culicover’s proposal of Degree-2 learnability (Wexler & Culicover 1980). Morgan (1986) has suggested that Degree-1 learnability is possible given the appropriate assumptions concerning the input, such as that “bracketing” information is given by different kinds of input utterance. While the model developed takes account of the latter kind of information, it does not assume syntactic categorization, and it is thus for the acquisition of recursive categories that two levels of embedding are required.


the issue of closure in acquisition was addressed in the initial and revised models of

acquisition.

the cat s chased the rat

[Node1 (↑subj) = ↓
  [Lex1 ↑ = ↓  “the”]
  [Lex2 ↑ = ↓  “cat”]]        F-Structure: [def:pos, pred:cat]
(“s chased the rat” remains to be attached)

Figure 5.19 Closure and Completeness

A completeness check (i.e., a check for the pred feature and its grammatical functions) is

one kind of information which could be applied to the F-Structure of an edge in order to

determine the earliest point at which closure might apply. This has the disadvantage that

it requires the assumption that LFG’s well-formedness condition of completeness is

innate, an assumption which we rule out below. Furthermore, it is not always appropriate

to close a phrase as soon as its pred has been found (Figure 5.19).

In initially developing the model, having rejected the assumption of completeness, we

relied heavily upon the assumption of syntactic-semantic constituency. This meant that a

phrase was not assumed complete unless it contained in its F-Structure all of the features

present in the corresponding input F-Structure. This removed the possibility of premature

closure that was associated with completeness. However, the solution settled upon was

unsatisfactory insofar as it relied upon the assumption of a complete semantic input to

acquisition.

The reliance on syntactic-semantic constituency for determining closure could not be

retained in the revised model, since it assumes a standard segmentation and lexicon.

Given a non-standard segmentation, the need for closure might not always be recognised

(Figure 5.20); alternatively, closure might take place prematurely. Thus an alternative

approach to closure had to be considered. This is outlined below.


[C-Structure diagram: a partial analysis of “the cats chased the rat” under a non-standard segmentation, in which the need to close the subject constituent is not recognised]

Figure 5.20 Closure and Non-Standard Segmentation

In the revised model we, in effect, relax the constraints of determinism (in acquisition

only), rather than specify some principled basis for deciding when to apply closure. While

in rule-based parsing the action of closure is deterministic, in acquisition it need not be.

This means that, having placed an inactive on the Agenda and applied an action (e.g.,

attachment) to it, we may later extend the same inactive, now treating it as an active, and

then re-apply the action to the resulting edge. Since relaxing determinism means that

recovery from premature closure is possible, inactives are preferred to actives in

acquisition. We also introduce a third action in acquisition: where both attachment and

proposal fail, an “active” equivalent to the inactive on the Agenda is placed in the Chart

to be extended. Assuming a preference for inactive edges does not present difficulties in

acquisition, as it would in rule-based parsing where it would mean neglecting to consider

certain analyses. However, this is only possible given that the assumption of a complete

semantic input to acquisition means that there are no ambiguities to be resolved.

We have not offered a satisfactory solution to the issue of closure in acquisition. One

question to be considered is whether the importance of closure is exaggerated due to the

notation used for representing the knowledge acquired. If this were the case, then the C-

Structure illustrated below (Figure 5.21) might be considered an appropriate output from

acquisition. There is a problem, here, however, which is that the portion of the tree

corresponding to “the cat” is not isomorphic with that which would be generated if only

the noun phrase was input. This cannot be considered merely a cosmetic issue, since it

would affect the ability of the model to make generalizations.


[C-Structure diagrams: the analysis of “the cat s chased the rat” built without closure, alongside the analysis of “the cat s” input in isolation; the subtrees corresponding to “the cat(s)” are not isomorphic]

Figure 5.21 Closure is Required for Isomorphism

5. 7. 2. Representation in the Revised Model

5. 7. 2. 1. Prepositional Phrases and Grammatical Functions

One aspect of Lexical-Functional Grammar which it was necessary to modify in using it in

a model of acquisition was its treatment of prepositions. In LFG, verbs which take

prepositional phrases as arguments have the appropriate preposition associated with an

argument represented in their lexical forms, e.g., put(subj,obj,in-obj). Prepositional

phrases and their corresponding grammatical functions are represented in C- and F-

Structure as illustrated (Figure 5. 22). The cyclic defining equation “(↑(↓ pcase)) = ↓”

means that the feature in in the F-Structure corresponding to the prepositional phrase

derives from the value of the feature pcase in the F-Structure below (i.e., that deriving

from the lexical entry for the preposition). That is, the value in of the feature pcase,

deriving from the lexical entry for “in”, instantiates the variable “↓” in the equation

“(↑(↓ pcase)) = ↓”, giving the more specific equation “(↑in) = ↓”.

C-Structure:
[Pp (↑(↓pcase)) = ↓
  [Prep ↑ = ↓  “in”]
  [Np (↑obj) = ↓
    [Det ↑ = ↓  “the”]
    [Noun ↑ = ↓  “car”]]]

F-Structure (fragment):
[pred:put(subj,obj,in-obj),
 in:[pcase:in,
     obj:[def:pos, pred:car, num:sg]]]

Figure 5.22 Prepositional Phrases in LFG


The main reason for rejecting the above form of representation is that it would result in

language-specific syntactic information in the semantic input to acquisition, in the form of

the identities of prepositions used with certain verbs1. There is a further consideration,

which is that the rules acquired might contain equations such as “(↑in) = ↓” and
“(↑to) = ↓”, but not the more general “(↑(↓pcase)) = ↓” (in the absence of any

explanation as to what should motivate this generalization).

C-Structure:
[Pp (↑loc) = ↓
  [Prep ↑ = ↓  “in”]
  [Np ↑ = ↓
    [Det ↑ = ↓  “the”]
    [Noun ↑ = ↓  “car”]]]

F-Structure (fragment):
[pred:put(subj,obj,loc),
 loc:[pcase:in, def:pos, pred:car, num:sg]]

Figure 5.23 Alternative Representation of Prepositional Phrases

In order to maintain both language-neutrality and generality in acquisition, we replace

features such as in in the F-Structures representing the semantic input to acquisition with

the more general case-like feature loc, which is treated analogously to grammatical

functions like subj and obj, as illustrated (Figure 5.23).

5. 7. 2. 2. Representing Well-formedness Conditions

During the development of the model, one aspect of the Lexical-Functional Grammar

Formalism, namely its well-formedness conditions of completeness and coherence,

emerged as particularly problematic. Completeness is the requirement that all grammatical

functions governed by pred be present in the F-Structure, while coherence requires that no

governable functions ungoverned by the main predicate be present. The purpose of these

constraints is the elimination of ungrammatical constructions such as “the girl fell the

boy” (coherence violated) and “the monkey put the banana” (completeness violated).

Completeness and coherence are normally thought of as post-parsing checks on

grammaticality.

1 There is also the more general implication that certain arguments are syntactically realised as prepositional phrases at the level of C-Structure.


One problem with LFG’s grammaticality conditions is that they suggest a bias towards

the task of comprehension, since it is not clear how they are to prevent the production of

ungrammatical utterances. In acquisition, a further question which arises is that of how the

constraints are to be acquired. The obvious recourse at this point is to the proposal that

they are innate. The assumption of innateness, however, creates another problem.

Assuming completeness doesn’t facilitate the acquisition of subject-optionality, a feature

of some natural languages which suggests that completeness is relative to the language

being acquired and not an absolute requirement as suggested by LFG.

In developing the model, the simplifying assumption initially made was that completeness

and coherence checking were built into the model. However, it was necessary to remove

the in-built constraints when it was proposed that input utterances included phrases as

well as sentences. Consideration of non-sentential constituents highlighted what seems

to be a serious deficiency of LFG, which is that the notions of completeness and coherence

only apply to the F-Structures of sentences. It is impossible, in LFG, to distinguish

between a grammatical and ungrammatical verb phrase: “fell” and “fell the boy” are

equally ungrammatical, since the F-Structure of any verb phrase is necessarily

incomplete.

While LFG’s well-formedness conditions clearly cannot be retained in the model

developed, there is a need to distinguish between grammatical and ungrammatical

constituents. We propose that constraints on grammaticality should be language-specific,

a property of the grammar acquired. To this end we propose an enhanced notation for

grammar rules which ensures that they may only apply in conjunction with the appropriate

lexical items. Well-formedness conditions are thus viewed as a constraint on the

relationship between lexicon and grammar. This means that grammaticality can be

maintained in the model without either assuming or acquiring the notions of completeness

and coherence.

LFG Notation

Vp → Verb

↑ = ↓

Vp → Verb Np

↑ = ↓ (↑obj) = ↓

Verb → fell [pred:fall(subj), …]


Verb → throws [pred:throw(subj,obj), …]

Enhanced Notation

Vp(subj:Subj^pred:Pred) → Verb(subj:Subj^pred:Pred)

↑ = ↓

Vp(subj:Subj^pred:Pred) → Verb(obj:Obj^subj:Subj^pred:Pred) Np(Obj)

↑ = ↓ (↑obj) = ↓

Verb(subj:Subj^pred:fall(Subj)) → fell [pred:fall(subj), …]

Verb(obj:Obj^subj:Subj^pred:throw(Subj,Obj)) → throws [pred:throw(subj,obj), …]

Under the LFG notation, either verb matches either rule, but, under the enhanced notation,

each verb matches only the appropriate rule. The notation is language-specific in that a

language with the causative concept “to fall” could be envisaged, for which “fell the boy”

was a grammatical verb phrase. The latest version of the model uses this notation and

eliminates separate checks of completeness and coherence. We outline the minor changes

to the model this entails below. Transcripts of acquisition in the model using the revised

notation are given in Appendices E & F.

In rule-based parsing, the constraints of the enhanced notation are implemented simply,

using Prolog’s matching. Minor changes are required in acquisition to model the

acquisition of the enhanced grammar rules. Building the “semantic” left-hand side of the

rule involves basically the same kinds of information already used in acquisition. However,

the rules acquired initially have variables like Subj instantiated to a particular value, and

these have to be replaced with variables to enable generalization over rules to take place.
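As an illustration of how term matching does this work, the following is a minimal DCG sketch in the spirit of the enhanced notation. It is not the thesis's grammar: the word-level terminals and the np//1 entry are invented for the example, and only the verb-phrase fragment is shown.

% The verb's subcategorisation is carried on its semantic argument, so an
% intransitive verb simply fails to match the transitive Vp rule; no separate
% completeness or coherence check is needed.
vp(subj:Subj^pred:Pred) --> verb(subj:Subj^pred:Pred).
vp(subj:Subj^pred:Pred) --> verb(obj:Obj^subj:Subj^pred:Pred), np(Obj).

verb(subj:Subj^pred:fall(Subj))              --> [fell].
verb(obj:Obj^subj:Subj^pred:throw(Subj,Obj)) --> [throws].

np(pred:ball^def:pos) --> [the, ball].

% ?- phrase(vp(Sem), [fell]).                % succeeds: "fell" is a grammatical Vp
% ?- phrase(vp(Sem), [throws, the, ball]).   % succeeds
% ?- phrase(vp(Sem), [fell, the, ball]).     % fails: "fell" takes no object
% ?- phrase(vp(Sem), [throws]).              % fails: the obligatory object is missing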

In appropriate cases, however, a lexical entry or grammatical rule may incorporate

particular values:

Verb(subj:Subj^pred:throw(Subj,ball)) → throws the ball [pred:throw(subj,obj), obj:[pred:ball, …], …]

The induction of syntactic categories is complicated, but not inhibited, by the revised

notation.


5. 7. 3. The Bases for the Acquisition of Syntactic Structure

We have demonstrated that the model is capable of acquiring recursive phrase-structure

rules and thus of representing information at the level of C-Structure which is not

explicitly encoded in the semantic input to acquisition. The question we consider here is

that of the extent to which semantic structure underlies syntactic structure. We propose

that constituents may appear in C-Structure which do not correspond to entities at the

level of F-Structure and that these have their basis in the kinds of non-sentential

utterances which may constitute inputs to acquisition. This idea is expanded upon below.

5. 7. 3. 1. F-Structure May Underdetermine C-Structure

In the previous section we explicitly rejected certain aspects of the LFG notation as

unsuitable for inclusion in an empiricist model of acquisition. One of these was the idea

that an argument of a predicate could be characterised, in F-Structure, as the argument of

a particular preposition in the language, e.g., in-obj. Such a representation of arguments

was rejected as language-specific. However, in rejecting this aspect of LFG in the

representation of semantic inputs to acquisition, we lose constituency information relevant

at the level of C-Structure. The implications of this for acquisition are illustrated below by

means of an example.

to the girl with a teddy bear

[Node15
  [Node16 ↑ = ↓  “to”]
  [Node9 ↑ = ↓
    [Node11 ↑ = ↓  “the girl”]
    [Node15 (↑adjunct) = ↓
      [Node16 ↑ = ↓  “with”]
      [Node11 ↑ = ↓  “a teddy bear”]]]]

Figure 5.24 Analysis of Prepositional Phrase Follows Analysis of Noun Phrase

Were we to assume LFG’s representation in the F-Structures input to acquisition, then

the appropriate inputs would result in the acquisition of a recursive rule for the noun

phrase, corresponding to:


Np → Np Pp
     ↑ = ↓  (↑adjunct) = ↓

However, given our alternative representation of F-Structure, which does not nest an

F-Structure corresponding to the Np within one corresponding to the Pp, the acquisition of

a recursive rule depends upon the ordering of input utterances. If a noun phrase, such as

“the girl with the teddy bear”, is input and analysed, then the rule thus acquired may be

used in the analysis of a later, more complex input (Figure 5.24)1, with the output of this

analysis forming the basis for the acquisition of the recursive rule. Here, the constituent

Node9 does not correspond to a distinct F-Structure nested within the F-Structure for the

utterance, but, rather, has as its source the fact that the noun phrase has previously been

encountered as a grammatical utterance. If the more complex utterance is encountered

before a rule for analysing the noun phrase within it has been acquired, then the analysis

does not facilitate the acquisition of a recursive rule (Figure 5.25).

to the girl with a teddy bear

[Node14
  [Node15 ↑ = ↓  “to”]
  [Node10 ↑ = ↓  “the girl”]
  [Node14 (↑adjunct) = ↓
    [Node15 ↑ = ↓  “with”]
    [Node10 ↑ = ↓  “a teddy bear”]]]

Figure 5.25 Analysis of Prepositional Phrase Precedes Analysis of Noun Phrase

Appendix I illustrates the acquisition of the recursive Np rule in the model. Appendix J

illustrates failure to acquire the rule when the identical inputs are presented but in a

different sequence.

5. 7. 3. 1. 1. The Role of Utterance Type and Input Ordering in Acquisition

Above we illustrate how the semantic inputs to acquisition may underdetermine

C-Structure. This raises the possibility that the representation of the semantic input to

the model is inadequate and in need of revision. There is, however, a more interesting

possibility which is that the acquisition of C-Structure may itself contribute to the

1 Here the labels of constituents are chosen to match those in the examples of acquisition in the appendices.


development of representation at the level of F-Structure, thus partially satisfying the

need for less reliance on overly strong assumptions concerning the semantic input to

acquisition. It is proposed that what enables C-Structure to contribute to the development

of F-Structure is that it is determined partly by semantic information and partly by the

empirical evidence available to the learner as to what may constitute a grammatical

utterance. The latter includes, in addition to sentences, anything which may provide the answer to a question, for example:

Who hit the boy? The girl with the teddy bear. (Np)

Where shall I put it? In the box. (Pp)

What’s Daddy doing? Cooking dinner. (Vp)1

Currently, the model is capable of acquiring competing structures (Figures 5.24, 5.25),

where these are built from different constituents which later come to be recognised as

equivalent through syntactic categorization. The issue of how one analysis should be

selected over another in the model has not yet been determined. There are a number of

possibilities to be considered. A mechanism similar to generalization could be proposed to

merge relatively specific rules with more general rules recognized to subsume them.

Alternatively, a task might be encountered, for instance in language production, for which

only a general, recursive rule sufficed, and this might thereby come to be preferred over the

alternatives. A more interesting issue than that of selecting from amongst competing

analyses may be the question of whether acquisition may be revised so that only the

appropriate structures are ever proposed. If the semantic input to acquisition were less

complete, acquisition might only be able to succeed in an incremental manner, thus

mimicking the effects of an increasingly complex sequence of inputs.

5. 7. 3. 1. 2. The Case of the Verb Phrase

The verb phrase provides the obvious case in which F-Structure appears to

underdetermine C-Structure. This means that, if a sentence is input to the model before a

rule for the verb phrase has been acquired, the analysis will not incorporate a constituent

corresponding to the verb phrase (Figure 5.26). If this constitutes an inadequate analysis,

then it can be argued that the model is currently underconstrained. However, it may

nevertheless have a role to play in the development of a theory of acquisition.

1 Verb phrases are discussed in the following section.


[S
  [Np (↑subj) = ↓  “Henry”]
  [Aux ↑ = ↓  “is”]
  [Verb ↑ = ↓  “baking”]
  [Np (↑obj) = ↓  “a cake”]]

Figure 5.26 Analysis of Sentence Precedes Analysis of Verb Phrase

Like the model, linguistic theories may propose competing representations (Figure 5.27).

A theory of acquisition, such as that embodied in the model, may be used to select

amongst these, since it constrains that which can be represented to that which can be

acquired.

[S
  [Np (↑subj) = ↓  “Henry”]
  [Aux ↑ = ↓  “is”]
  [Vp ↑ = ↓
    [Verb ↑ = ↓  “baking”]
    [Np (↑obj) = ↓  “a cake”]]]

[S
  [Np (↑subj) = ↓  “Henry”]
  [Vp ↑ = ↓
    [Aux ↑ = ↓  “is”]
    [Verb ↑ = ↓  “baking”]
    [Np (↑obj) = ↓  “a cake”]]]

Figure 5.27 Competing Analyses of the Verb Phrase

Either of the structures in Figure 5.27 could be acquired by the model, given the

appropriate inputs. However, while one of the analyses presupposes an input of the type

licensed by the model, the other can be rejected as requiring an ungrammatical input:

What is Henry doing?

Baking a cake.
    S → Np Aux Vp
        (↑subj) = ↓  ↑ = ↓  ↑ = ↓
    Vp → Verb Np
         ↑ = ↓  (↑obj) = ↓

* Is baking a cake.
    S → Np Vp
        (↑subj) = ↓  ↑ = ↓


    Vp → Aux Verb Np
         ↑ = ↓  ↑ = ↓  (↑obj) = ↓

In each case, an input utterance constitutes the source of the verb phrase constituent.

While the first of the above analyses requires a valid input, however, the second requires

an invalid input and is thus ruled out by the model.

5. 7. 3. 2. Summary

We suggest that syntactic structure is jointly determined by semantic structure and by the

nature of those utterances input to acquisition. The latter may give rise to the proposal of

syntactic constituents other than those based upon semantic structure. Possible

constituents are, however, tightly restricted by constraints on what may constitute a valid

input utterance. In developing the model so that it is less reliant upon semantic

information, it is proposed that that information which can be acquired concerning what is a

grammatical utterance in the language may play an important role in the development of

phrase structure.

5. 8. Comparison with Wolff’s SNPR

Wolff’s SNPR (Wolff 1988) embodies general learning principles of optimization by means

of redundancy abstraction. It is radically different from the other models of language

acquisition discussed in that learning relies, not on semantic information, but wholly on:

“distributional or cluster analysis designed to reveal statistical discontinuities

at the boundaries of words and other segments.”

(Wolff 1988, p.192)

Its interest lies in the fact that input consists of an unsegmented string of perceptual

primitives (either letters or phonemes, in the case of language acquisition) and output is in

the form of a context-free phrase-structure grammar. The issues of segmentation and

phrase structure acquisition are thus both addressed, as in the model developed here.

Learning in the model has already been described when discussing issues in segmentation

and is outlined briefly here. During the parsing of a string, the frequencies of all pairs of

adjacent elements are recorded, and the most frequent pair chunked. The model is quite

successful at recognising word boundaries, but performance declines markedly for phrases.

The successful recognition of phrase boundaries requires that input strings consist of

words labelled with their syntactic categories.


It is argued that SNPR, unlike the model developed here, does not provide a reasonable

model of child language development. Learning relies on exhaustive searches and

analyses, so that the processing demands of the learning mechanisms suggest that they

may be incompatible with the assumption of limited working memory. Another problem

relates to the exclusively bottom-up approach of successively clustering elements with

adjacent elements in order to discover the appropriate segmentation and (eventually) the

grammar. This seems to predict the prevalence of over-segmentation in child language, a

phenomenon much rarer than that of under-segmentation predicted by the model

developed here (Peters 1983).

5. 9. Summary and Conclusions

The revised model described above attempts to demonstrate that it is not necessary for

theories of language acquisition to assume the innateness of syntactic categories or the

X-Bar Theory of Phrase Structure. While previous models which rejected these

assumptions (as well as ad hoc language-specific assumptions) acquired only finite-state

grammars, the computational model described here acquires a phrase-structure grammar.

Development is gradual and continuous, as in children. The model differs from previous

models primarily in that no lexical acquisition or segmentation are assumed at the start of

acquisition. This suggests that syntax acquisition cannot be considered apart from those

processes which determine the nature of the units upon which it operates. Once phrase

structure has been acquired by the model, recursive phrase structure can also be acquired

through generalization processes used in the induction of syntactic categories. The

importance of recursive rules is that they can be acquired on the basis of relatively simple

sentences, with no more than two levels of embedding, and then used in the analysis of

more complex constructions, thus eliminating the need to assume that the semantics of

the latter can be recovered from context.

In addressing the problem of phrase structure acquisition by developing an integrated

model of acquisition processes, we necessarily consider issues in lexical acquisition. The

approach offered to lexical acquisition in the model is interesting in its own right, since,

uniquely, it is top-down and thus does not presuppose segmentation. The value of the

top-down approach to lexical acquisition is in addressing what has been a major criticism

of top-down approaches to lexical recognition, that they are not compatible with an

account of acquisition, and in overcoming some of the inadequacies of existing bottom-up

approaches to lexical acquisition. While bottom-up approaches have focussed upon the

acquisition of segmentation, neglecting issues in lexical acquisition, the top-down

approach to lexical acquisition eliminates the need for a separate account of the acquisition


of segmentation abilities. A more accurate representation of utterances input to the model,

in terms of features rather than discrete phones, is eventually required so that it may also

incorporate the kinds of information implicated by bottom-up accounts of segmentation.

In developing an empiricist model of acquisition, we have relied heavily on semantic inputs

to acquisition. However, we have also considered other bases of syntactic structure. One

of these is the acquisition of recursive rules through generalization. We also propose that

syntactic constituents, such as verb phrases, which are not derived from semantic

constituents, may be acquired on the basis of non-sentential inputs to acquisition. Since

the latter are restricted to grammatical utterances, the model only predicts a small number

of such constituents. The semantic input to learning remains an important aspect of the

model developed which is in need of refinement. The F-Structure semantic input provides

an idealised learning situation. In a more realistic simulation, the semantic features

encoded in the F-Structure will themselves have to be acquired, and learning will rely

upon partial and imperfect semantic representations of utterances. By proposing a role for

different kinds of input utterance in the acquisition of syntactic constituents which do not

correspond to semantic structures, we have suggested how the model’s dependence on

semantic input may be reduced.

The solution proposed to the problem of phrase structure acquisition re-introduces an

issue discussed in the previous chapter. This is the idea that existing knowledge may

constrain, rather than facilitate, learning. We showed in the previous chapter how the

assumption of syntactic categorization inhibited grammar rule acquisition. Similarly, the

assumption of a lexicon of content words and the ability to segment can be viewed as

inhibiting both phrase structure and functional morpheme acquisition. The development of

the model thus serves to illustrate how learning within a language is constrained by

existing knowledge of that language. Where the acquisition of one kind of knowledge

inhibits the acquisition of a different kind of knowledge, an integrated model of their

acquisition is required. Grammar acquisition needs to take place alongside the induction of

syntactic categories, and, similarly, alongside the acquisition of the lexicon and

segmentation abilities. We consider, in the concluding chapter, the implications for

learning across languages.


6. Acquisition in the Model and in Children

6. 1. Introduction

In the previous chapter we described an empiricist model of phrase structure acquisition.

This appeared to provide a reasonable model of child language development insofar as

acquisition was gradual, with the syntactic phrase-structure grammar being acquired on

the basis of an earlier lexical finite-state grammar. In this chapter we attempt to evaluate

in more detail a particular aspect of child language acquisition in the model, that of

functional morpheme acquisition. The model incorporates an account of functional

morpheme acquisition, since this is simply a case of lexical acquisition in the model

developed.

In evaluating functional morpheme acquisition in the model, we aim to combine empirical

and learnability-theoretic objectives. The model may be evaluated against the empirical

data on the order in which children tend to acquire different functional morphemes. Related

issues in functional morpheme acquisition include accounting for such phenomena as

functional morpheme omission, the overgeneralization of certain regular forms such as the

English past tense “ed”, and the development of comprehension ahead of production. It is

important to note that, while the model is concerned with the acquisition of grammatical

competence, the empirical data relates to language production. In evaluating the model it

is necessary, therefore, to extend it to at least consider issues in language production. An

important question in learnability is whether the target lexical entries for the functional

morphemes may be acquired. To this end we examine whether certain features which are

only implicit in lexical entries acquired by the model, such as those of uninflected verb

stems, need be regarded as problematic. The model also needs to be able to account for

phenomena such as syncretism:

“the use of one affix to encode different sets of features.”

(Pinker 1984, p.173)

Before evaluating the various issues in functional morpheme acquisition in the model, we

expand upon these in a little more detail and consider to what extent they are addressed

by the other models examined.

6. 2. Some Issues in Functional Morpheme Acquisition

6. 2. 1. Order of Acquisition

Functional morpheme acquisition has been viewed as providing relatively objective data

against which computational models can be evaluated. Despite the variations in language

learning across individuals, Brown (1973) found that order of acquisition of fourteen


English functional morphemes was remarkably consistent for three children studied. It is a

result which has since been replicated on the basis of more extensive data. The relative

ease of acquisition of a functional morpheme seems to be determined by the complexity of

the features it encodes. Cross-linguistic studies of functional morpheme acquisition also

support this idea. Slobin (1982) reports that the agglutinating Turkish system of

inflections, in which each functional morpheme encodes a single feature, is acquired earlier

than the fusional Serbo-Croatian system, in which each inflection encodes a complex of

features. Turkish also appears easier to acquire in that the kinds of errors and

overgeneralizations found in the acquisition of Serbo-Croatian are not observed.

6. 2. 2. Functional Morpheme Omission

Infants have been observed to pass through a stage which has been termed telegraphic

due to the omission of functional morphemes and major constituents from the utterances

produced. Where telegraphic speech is modelled, models need to incorporate an account of

how the child eventually comes to acquire an adult-like grammar. The data on child

language development also provides further constraints on accounts of functional

morpheme omission. There is the need to account for the observation that functional

morphemes are produced prior to the stage at which they are omitted. Experiments by

Gerken (1987) suggest that children at the telegraphic stage perceive functional

morphemes: in imitation, functional morphemes were systematically omitted, whereas

unstressed and phonologically similar nonsense functors were normally reproduced. For

example, the English functors were omitted 50% of the time in imitation of sentences like

“Pete pushes the dog”, while nonsense functors were omitted only 29% of the time from

sentences like “Pete pusheg le dog” (Gerken 1987, p.54). Children at this stage have

also been found to respond better to speech containing functional morphemes than to

telegraphic speech (Ingram 1989).

6. 2. 3. Overregularization

Children’s overextensions of rules for the application of functional morphemes have been

termed overregularizations, or overgeneralizations, since they are systematic rather than

random and unconstrained. In the acquisition of English, the regular past tense “ed” may

be applied to irregular verbs to produce, e.g., “goed”, and, similarly, the regular plural may

be applied to irregular nouns to produce, e.g., “sheeps”. However, overextension of the

progressive “ing” is not observed, since the progressive is completely regular (Brown

1973). A remarkable feature of overgeneralization is that it follows a period of correct use

of regular and irregular forms. Thus, models need to account for the onset of

overregularization. This has been linked with the acquisition of a generalized (as


contrasted with word-specific) rule for the formation of the regular past tense, in the case

of overregularization of the past tense “ed” (Marcus 1992). However, accounts which

seek to explain overgeneralization in terms of the acquisition of this rule fail, crucially, to

explain why the latter is acquired when it is (Plunkett 1994).

It has already been suggested that models need to account for overregularization following

a period of correct use. Overregularization has previously been characterized in terms of a

U-shaped curve, with correct use preceding and recovery following the stage of

overgeneralization (e.g., Rumelhart & McClelland 1986). More recently, in-depth

empirical studies have suggested that there is no stage where overregularization is

prevalent (Marcus 1992); rather, it exists, if at all, alongside largely correct use. This is,

then, the phenomenon for which we assume models need to account. A further

consideration to be used in constraining accounts of overregularization is its relation to

other phenomena in functional morpheme acquisition; for instance, it is a phenomenon

observed at a later stage than functional morpheme omission.

6. 2. 4. Comprehension and Production

The development of comprehension ahead of production is considered by Clark & Clark to

be one of:

“three...issues...critical for a general theory of the acquisition of language.”

Clark & Clark (1977, p.297)

There is some evidence to suggest that, even after a word is recognized in comprehension,

it is some time before it becomes available for use in production. Ingram (1989) reports

that children are estimated to comprehend on average 100 words at the time they produce

their first words. At the holophrastic (one-word) stage of production, there is evidence of

the ability to comprehend multiple word utterances. Later, at the telegraphic stage,

children respond better to speech containing grammatical morphemes than to utterances

similar to those they are themselves producing. The latter is a surprising and apparently

paradoxical finding which presents a challenge for accounts of functional morpheme

acquisition.

6. 2. 5. Issues in Learnability

A feature of some natural languages which accounts of functional morpheme acquisition

need to be able to incorporate is syncretism. Pinker (1984) argues that generalization-

based accounts of functional morpheme acquisition are inadequate since they assume a

single lexical entry for a morpheme in which any disjunctive sets of defining features will

be generalized over and lost. This is an issue which we need to consider since lexical


acquisition in the model developed can be characterized as generalization-based; i.e.,

features are removed from lexical entries in the model as they are found to conflict with the

input. We also consider a related issue in representational adequacy which is relevant to

the acquisition of English. This concerns whether there is the need to model the

acquisition of lexical entries for zero morphemes, as has been argued by Pinker (1984).

6. 3. Existing Models

Functional morpheme acquisition is an issue which has been addressed by a minority of

the models examined, since these have tended to focus upon the acquisition of syntax

rather than morphology. Issues in morphology are, nevertheless, crucial if models are to be

sufficiently general to account for the acquisition of inflectional as well as word order

languages. We thus examine to what extent the issues have been addressed. We

evaluate AMBER (Langley 1982) and the work of Pinker (1984) with respect to the

objectives of accounting for order of acquisition, the phenomena of functional morpheme

omission and overgeneralization, and issues in learnability. We also briefly examine the

accounts of functional morpheme omission offered by Hill (1983) and ACT* (Anderson

1983), and an implementation of functional morpheme acquisition in the Competition

Model (MacWhinney & Leinbach 1989).

6. 3. 1. The Telegraphic Perception Hypothesis

Hill (1983) and ACT* (Anderson 1983) account for functional morpheme omission in

terms of the telegraphic perception hypothesis. This can be summarized as the proposal

that functional morphemes are not produced since they are not perceived and thus not

present in the grammar acquired. A problem with this approach, as mentioned in Chapter

2, is that it fails to explain how functional morphemes eventually come to be perceived and

incorporated into the grammar. Thus, these models fail to offer any account of functional

morpheme acquisition. The criticism of these models is not that they are merely

underspecified in this respect, since the empirical evidence also conflicts with the

telegraphic perception hypothesis. As Gerken (1987) has argued, children’s systematic

omission of functional morphemes (as contrasted with unstressed non-functors) in

imitation experiments suggests that they are both perceived and recognised. The

observation that functional morphemes are produced before the stage at which they are

omitted is further evidence that they are perceived, as is the preference, in comprehension,

for non-telegraphic speech.


6. 3. 2. Discrimination-Based Learning

Langley’s production-based model of acquisition, AMBER (Langley 1982), incorporates a

discrimination-based account of functional morpheme acquisition. While generalization-

based learning uses the input evidence to remove features from overly specific lexical

entries, discrimination-based learning adds features to overly-general lexical entries. The

main motivation behind the discrimination-based account is the need to account for order

of acquisition. Given that only one feature is ever added to a lexical entry at a time, it is

predicted that those functional morphemes defined in terms of fewer features will be

acquired first. We outline discrimination-based learning in AMBER below.

The initial rules acquired by AMBER represent the model’s attempt to produce an

utterance on the basis of its semantic representation and a lexicon in the absence of any

grammar rules. The rules acquired are modified as the utterances produced by AMBER

are compared with target utterances. AMBER incorporates functional morphemes into

grammar rules when their presence is noted in the target utterance. Since AMBER

operates with the broad semantic categories action and object, the initial rules acquired

are overly general. AMBER is able to recognise this when there is a functional morpheme

in an utterance produced but no corresponding one in the target utterance. A search is

conducted for a feature which discriminates valid previous uses of the rule from the current

instantiation, and a new version of the rule is generated containing the discriminating

condition. Since only a single discriminating feature is acquired on any particular occasion,

morphemes with simple conditions of use are acquired before those with complex

conditions.
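The discrimination step can be sketched schematically as follows (our own rendering, with contexts simplified to flat sets of feature:value pairs; the names are invented): a feature:value pair present in every valid previous use of the rule but absent from the erroneous application is added as a condition on the rule.

    def discriminate(valid_contexts, erroneous_context):
        # Feature:value pairs common to every valid previous use of the rule.
        common = set.intersection(*(set(c.items()) for c in valid_contexts))
        # Candidate discriminating conditions are those absent from the error case;
        # only one condition is added on any particular occasion.
        candidates = common - set(erroneous_context.items())
        return next(iter(candidates), None)

    valid = [{"tense": "past", "prog": "neg"}, {"tense": "past", "prog": "pos"}]
    error = {"tense": "present", "prog": "neg"}
    print(discriminate(valid, error))   # ('tense', 'past')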

Functional morpheme omission in AMBER results from a production bottleneck. The task

in production is to traverse a semantic tree deterministically and output the word or

phrase corresponding to each node. The appropriate order in which to traverse the tree to

produce an utterance is encoded in the production rules acquired. Functional morphemes

and other constituents may be omitted in the early stages of acquisition due to

deterministic traversal in the absence of the full set of rules required. Recovery is gradual

and continuous since the production rules acquired by AMBER serve to enable

deterministic production without omission.

The main criticisms of AMBER’s account of functional morpheme acquisition relate to the

issues of overgeneralization and functional morpheme omission. Functional morphemes

are over-extended by AMBER, and there is recovery from this overextension through

discrimination, as outlined above. However, while children’s overgeneralizations are


systematic, AMBER’s overextensions of functional morphemes appear completely

random, so that the model cannot be viewed as offering a viable account of

overgeneralization. The reason underlying this would seem to be AMBER’s conflation of

comprehension (which is not explicitly modelled) with production. A potential problem

with discrimination-based learning, in general, is its heavy reliance on negative evidence;

however, the assumption of some feedback may be reasonable insofar as there appear to

be differential responses to children’s grammatical and ungrammatical utterances

(Bohannon et al 1990). The possibility is thus not ruled out that discrimination may have a

role to play in functional morpheme acquisition. With respect to the issue of functional

morpheme omission, the model is unable to account for the observation that functional

morphemes are produced before the stage at which they are omitted.

6. 3. 3. Paradigm-Based Learning

The account of functional morpheme acquisition presented by Pinker (1984) uses the

paradigm representation for the inflectional knowledge acquired. The paradigm consists of

a matrix in which rows and columns are labelled with features (Figure 6.1). The task in

acquisition is to fill each cell in a word-specific paradigm with a uniquely inflected form of

the stem. Further features are added to the paradigm when more than one form appears in

a cell, violating the uniqueness constraint. The uniqueness constraint also accounts for

how recovery from overregularization is envisaged: where there are no features

distinguishing two competing forms, that licensed by the input (i.e., the irregular) is

preferred. The claimed advantages of the paradigm representation are that it predicts the

correct order of acquisition and that syncretism is easily accommodated, since the same

form may appear in more than one cell. It is proposed that a production bottleneck

accounts for functional morpheme omission; however, no details of this are given since

Pinker is concerned primarily with accounting for the acquisition of grammatical

competence.

                            Feature 2
                      Value 1        Value 2
    Feature 1
      Value 1         stem-a         stem-b
      Value 2         stem-c         stem-d

Figure 6.1 Paradigm Representation of Inflected Forms
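The paradigm can be pictured as a matrix keyed by feature values, with a uniqueness constraint on its cells. The sketch below is only meant to make the data structure concrete; the class and its behaviour on a violation are our own simplification of Pinker's account.

    class Paradigm:
        # Word-specific paradigm: one cell per combination of feature values;
        # each cell may hold only one inflected form (the uniqueness constraint).
        def __init__(self):
            self.cells = {}

        def add(self, feature_values, form):
            key = tuple(sorted(feature_values.items()))
            if key in self.cells and self.cells[key] != form:
                # A violation would lead Pinker's learner to add a further
                # distinguishing feature (splitting the cell) rather than fail.
                raise ValueError("uniqueness constraint violated")
            self.cells[key] = form

    go = Paradigm()
    go.add({"tense": "past"}, "went")
    go.add({"tense": "present", "person": 3, "num": "sg"}, "goes")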

Pinker’s account of functional morpheme omission is open to the same criticisms as

AMBER; that is, a production bottleneck does not explain why functional morphemes


should be produced before they are omitted. Further difficulties arise in relation to the

insufficiently precise account of functional morpheme acquisition. For instance,

segmentation and the acquisition of general paradigms are assumed to follow the

acquisition of word-specific paradigms; however, it is clear that segmentation is

presupposed by the acquisition of word-specific paradigms, since each is set up for a

specific word stem.

6. 3. 4. Competition-Based Learning

The time course of children’s acquisition of German articles is demonstrated in a PDP

implementation of this aspect of the Competition Model (MacWhinney & Leinbach 1989).

This attempts to overcome some of the criticisms made of the earlier connectionist model

of past tense acquisition and overregularization (Rumelhart & McClelland 1986), such as

that stages in acquisition were simply induced by changes in the input data (Pinker &

Prince 1988). The model is interesting in that nouns input are represented in terms of

various phonological, morphological, and semantic cues, and the mapping from nouns to

articles is achieved via a hidden layer of units viewed as acquiring and encoding traditional

features like case, person, number, and gender. However, since only cues relevant to the

acquisition of each article are included in the input, the problem of acquiring the meanings

of functional morphemes in the context of a more general model of acquisition is not

addressed.

6. 3. 5. Summary of Functional Morpheme Acquisition in Existing Models

The discussion of functional morpheme acquisition reveals that this is an issue which has

received insufficient attention in the context of developing computational models of syntax

acquisition. Various proposals have been made as to learning mechanisms underlying the

order of acquisition of different functional morphemes. Some interesting phenomena which

remain to be explained include the onset of functional morpheme omission and

overregularization following periods of correct use.

6. 4. Functional Morpheme Acquisition in the Model Developed

Functional morpheme acquisition in the model is identical to any other case of lexical

acquisition. The meaning of a morpheme is acquired through the process of “reversed

unification”, which is applied to F-Structures corresponding to different instances of its

use. Feature:value pairs which are not shared by different exemplars of the morpheme’s

use are inferred to be non-defining features and are removed from the F-Structure of its

lexical entry, with only common feature:value pairs being retained. The model is thus an

example of the generalization-based approach to functional morpheme acquisition. In


evaluating the model we pay particular attention to criticisms which have been made of

previous generalization-based approaches (e.g., Pinker 1982), and thus we outline some

of these below.
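The effect of "reversed unification" is to retain only those feature:value pairs shared by every exemplar of a form. A minimal sketch, using flat feature sets for simplicity (the model's F-Structures are nested) and invented names:

    def reversed_unification(exemplar_fstructures):
        # Keep only the feature:value pairs common to every exemplar of a form;
        # conflicting pairs are inferred to be non-defining and are dropped.
        common = set.intersection(*(set(f.items()) for f in exemplar_fstructures))
        return dict(common)

    # Two observed uses of "-ing": the tense and subject features conflict,
    # so only prog:pos survives as a defining feature.
    uses_of_ing = [
        {"prog": "pos", "tense": "present", "num": "sg", "person": 3},
        {"prog": "pos", "tense": "past",    "num": "pl", "person": 1},
    ]
    print(reversed_unification(uses_of_ing))   # {'prog': 'pos'}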

One criticism which has been made of the generalization-based approach is that it

predicts the incorrect order of acquisition (Langley 1982; Pinker 1984). That is, if learning

involves features being removed from lexical entries as they are found to conflict with the

input evidence, then it is (incorrectly) predicted that the lexical entries for those functional

morphemes defined in terms of a complex of features will be arrived at before those for

morphemes which encode a simple feature. Another difficulty suggested by Pinker (1984)

is that the overly-specific lexical entries of the generalization-based account predict

undergeneralization of functional morphemes, whereas child language development

suggests that it is a stage of overgeneralization for which we need to account. A further

criticism which needs to be addressed is the claim that generalization-based models are

inadequate to represent phenomena like syncretism.

Our response to the above criticisms involves, in part, a re-analysis of the predictions of

the generalization-based account. The basis for this is a close examination of the issue of

the relationship between the grammatical competence acquired and predictions concerning

language production. We argue that, if performance in language production during

acquisition is taken into account, then the correct order of acquisition is predicted by the

generalization-based account. The same analysis is extended to incorporate a novel

account of functional morpheme omission (and recovery from this) and the development of

the comprehension of functional morphemes ahead of their production. We show further

that, as well as being consistent with the prediction of overgeneralization, a

generalization-based model which presupposes no segmentation can also account for its

onset. We propose that the generalization-based account is representationally adequate

for English, and that syncretism can also be accommodated so long as syntactic

categorization isn’t presupposed.

6. 4. 1. Order of Acquisition in the Model

In this section, we seek to illustrate that acquisition in the model developed is consistent

with the observation that functional morphemes which encode a simple feature are

acquired before those which encode a complex of features. For example, the model

predicts that the present progressive is used correctly from the start, whereas the third

person singular can only be used correctly once the target lexical entry is arrived at. The

source of these predictions is an examination of the problem of how functional morphemes


are produced during acquisition, before the acquisition of their target lexical entries is

complete. It will be recalled that lexical acquisition in the model is gradual, such that prior

to the acquisition of the target lexical entry for a functional morpheme there may be earlier

lexical entries which contain non-defining features. We specify in detail below the point in

acquisition at which the predictions of the model are examined.

We focus upon a number of functional morphemes to be acquired which are related to each

other as English verb inflections. The target lexical entries for these are as follows1:

Infl → ing [prog:pos]

Infl → ed [prog:neg, tense:past]

Infl → s [subj:[num:sg, person:3]]

We can represent the relations amongst the different inflections to be acquired in a tree of

feature:value pairs (Figure 6.2). If we traverse the tree from the root downwards, we can

determine the inflection required to express any set of features. Conversely, if we start

from one of the leaves of the tree, we can traverse the tree upwards to the root in order to

discover the features encoded by a particular functional morpheme.

    prog: pos                            →  “-ing”
    prog: neg
        tense: past                      →  “-ed”
        tense: present
            subj: [num: sg, person: 3]   →  “-s”
            otherwise                    →  Ø

Figure 6.2 English Verb Inflections

1 We make the simplifying assumption that functional morpheme acquisition is independent from segmentation, referring to the interaction between these processes only where this is an important consideration. Lexical entries for zero morphemes are not acquired by the model: discussion of the issue of representational adequacy is deferred to a later section.


The language production task we investigate may be characterized as follows. The task is

to determine the appropriate verb inflection for expressing a complex of three feature:value

pairs which has not yet been encountered in the input to acquisition. The stage of

acquisition is such that examples of the appropriate inflections used to express any two of

the three features have been given in the input to acquisition, and this fact is reflected in

the current lexical entries for functional morphemes. The three cases we examine are

where the target inflection is defined in terms of one, two, and three features.

Case 1

Progressive: Single Feature

Features to be Expressed

Here, the task is to determine the correct inflection to be used to express the following

features:

    prog:  pos
    tense: past
    subj:  [num: sg, person: 3]

Exemplars

We assume that the subject and tense feature:value pairs have been encountered together

in an input sentence, e.g., “Dad walked to work”, the subject and progressive features in,

e.g., “Dad is walking home”, and the tense and progressive features in “We were baking

a cake”. The crucial kind of input which is not assumed to have been encountered is a

sentence like “Dad was baking a cake” which contains all three of the features to be

expressed. Given these inputs, the state of the model’s lexicon (taking into account only

the three above features) is as follows:

Infl → ing [prog:pos]

Infl → ed [prog:neg, tense:past, subj:[num:sg, person:3]]

Choice of Inflection

In this case, while the tense and subject features select (i.e., will unify with) either

morpheme, the progressive feature (i.e., prog:pos) selects only the appropriate morpheme:


    prog:  pos                    →  “-ing”
    tense: past                   →  “-ed” / “-ing”
    subj:  [num: sg, person: 3]   →  “-ed” / “-ing”

In fact, in this case, the target lexical entry for the inflection has been acquired on the

basis of the (relatively impoverished) input, since both the tense and subject features

have been generalized over.

Case 2

Past Tense: Pair of Features

Features to be Expressed

The F-Structure below represents those features to be expressed by the appropriate

choice of inflection:

    prog:  neg
    tense: past
    subj:  [num: sg, person: 3]

Exemplars

Example input sentences assumed in this case are “Dad was baking a cake” (subject and

tense as above), “She walks to work” (subject and progressive), and “We baked a

cake” (tense and progressive). The kind of input which has not been encountered is a

sentence like “Dad baked a cake”. Given these inputs, the lexicon is as follows:

Infl → ing [prog:pos, tense:past, subj:[num:sg, person:3]]

Infl → ed [prog:neg, tense:past, subj:[num:pl, person:1]]

Infl → s [prog:neg, tense:present, subj:[num:sg, person:3]]

Choice of Inflection

The triple of features to be expressed will unify with none of the existing lexical entries.

Taking each feature to be expressed singly, we find that each selects two inflections,

resulting in a combined total of three candidate morphemes, none of which is preferred:


    prog:  neg                    →  “-s” / “-ed”
    tense: past                   →  “-ed” / “-ing”
    subj:  [num: sg, person: 3]   →  “-s” / “-ing”

Thus in this case, unlike that above, further input to acquisition is required before the

target lexical entry for “ed” is arrived at.

Case 3

Third Person Singular: Triple of Features

Features to be Expressed

Here, the task is to select the inflection corresponding to the features given below:

    prog:  neg
    tense: present
    subj:  [num: sg, person: 3]

Exemplars

The example input sentences we assume are “Dad is baking a cake” (subject and tense

as above), “She walked to work” (subject and progressive) and “We walk to school”

(tense and progressive). The crucial kind of exemplar which is missing in this case is

“She walks to work”. Our lexical entries are as follows1:

Infl → ing [prog:pos, tense:present, subj:[num:sg, person:3]]

Infl → ed [prog:neg, tense:past, subj:[num:sg, person:3]]

Infl → Ø [prog:neg, tense:present, subj:[num:pl, person:1]]

Choice of Inflection

    prog:  neg                    →  Ø / “-ed”
    tense: present                →  Ø / “-ing”
    subj:  [num: sg, person: 3]   →  “-ed” / “-ing”

1 We assume the zero morpheme notation as a temporary representational convenience.


Here, again, the triple of features fails to select any of the lexical entries. There are three

candidate morphemes suggested by the features considered singly; however, the correct

inflection “s” isn’t included amongst them. This isn’t surprising since, in this case, saying

that the appropriate complex of features hasn’t been encountered is equivalent to saying

that the morpheme “s” itself has not been encountered. Dropping the assumption of zero

morphemes, which are not acquired by the model, does not alter the fact that the correct

inflection cannot be determined. However, given that the uninflected form replaces the

zero morpheme option, it is likely to be selected, since it can be assumed to have none of

the features in the lexical entry for the zero morpheme1.

The above examples demonstrate that, contrary to what has been suggested by previous

analyses, a generalization-based account does make the correct predictions regarding

order of acquisition of functional morphemes. Morphemes which encode a single feature

are acquired before those which encode a complex of features. Another way of stating the

model’s prediction, which makes explicit the role of production, is that it is based upon the

number of morphemes selected by a feature. Where a feature, like “prog:pos”, selects a

single morpheme, correct usage of the morpheme selected is predicted from the start. The

third person singular takes much longer to acquire since none of the defining feature:value

pairs are highly selective and they only select “s” in combination. Order of acquisition is

correlated with the number of features defining a morpheme since only a highly selective

feature can single-handedly perform this function, unselective feature:value pairs only

being able to define in combination.
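The selection behaviour underlying the three cases can be made concrete with a small sketch (our own rendering, not the thesis implementation; the subj features are flattened for brevity): an inflection is a candidate for a feature:value pair whenever that pair does not conflict with the inflection's current lexical entry.

    def compatible(feature, value, entry):
        # A feature:value pair unifies with a lexical entry unless the entry
        # carries a conflicting value for the same feature.
        return entry.get(feature, value) == value

    def candidates_per_feature(features, lexicon):
        return {f: [m for m, entry in lexicon.items() if compatible(f, v, entry)]
                for f, v in features.items()}

    # Case 1 lexicon: tense and subject have already been generalized over in
    # the entry for "-ing", so prog:pos is highly selective.
    lexicon = {
        "-ing": {"prog": "pos"},
        "-ed":  {"prog": "neg", "tense": "past", "num": "sg", "person": 3},
    }
    to_express = {"prog": "pos", "tense": "past", "num": "sg", "person": 3}
    print(candidates_per_feature(to_express, lexicon))
    # prog selects only "-ing"; tense, num and person each select either morpheme.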

6. 4. 2. Functional Morpheme Omission in the Model

Above we discussed the two kinds of accounts of functional morpheme omission offered by

existing models. We rejected the telegraphic perception hypothesis as both inconsistent

with the child language acquisition data and inadequate with respect to the objective of

learnability. The alternative explanation of functional morpheme omission was a

performance-based account which proposed the existence of a production bottleneck. This

was also problematic as it could not account for why functional morphemes should be

produced before the stage at which they were omitted. We thus suggest that an account of

omission is required which links it to the development of the child’s competence, perhaps

to the onset of functional morpheme recognition2. Here, we use acquisition in the model

developed to offer such a competence-based account.

1 A verb stem which had any of the features in the lexical entry for the zero morpheme would fail to unify with inflections like “ing” and “ed”, and thus such features quickly come to be generalized over.
2 We assume that the recognition of functional morphemes involves more than their perception.


Our analysis takes as its starting point the grammar rules and lexical entries acquired in

comprehension and uses these as the basis for making predictions concerning functional

morpheme omission in language production. The account links omission to that stage in

functional morpheme acquisition at which there is indeterminacy concerning the

appropriate morpheme to be used for expressing certain sets of features, as illustrated in

the previous section. Recovery from omission is envisaged by means of further functional

morpheme meaning acquisition, so that there is continuous development towards the

endstate of acquisition.

In the model described, those units which can be either produced or omitted correspond to

lexical entries in the model. The nature of these units changes during the course of

acquisition. The first prediction to arise from the model is that, so long as “Daddy goes”

exists as an unanalysed lexical unit, the verb inflection will not be omitted from it. That is,

the model is consistent with the observation that functional morphemes are produced

during the early stages of acquisition. Functional morpheme omission only becomes a

possibility once “goes” has been segmented into the constituent verb stem and inflection.

When the state of acquisition in the model is such that there are “adult” content words

and functional morphemes in the lexicon, it is predicted that certain of these will

immediately be available for use in production whereas others will require further

examples of use. Ease of acquisition of content words is predicted. This is because they

are defined in terms of the pred feature, each value of which selects a single content word.

As outlined in the section on the order of functional morpheme acquisition, where a

feature:value pair selects a single lexical item, error-free acquisition of the lexical item is

predicted. The same argument applies to those functional morphemes defined in terms of a

single feature:value pair, since for a feature:value pair to be defining implies that it is

highly selective. Where a functional morpheme is defined in terms of a complex of

feature:value pairs, acquisition is predicted to be more gradual. The defining feature:value

pairs being less selective, there may be indeterminacy as to the appropriate choice of

morpheme for expressing a given combination of feature:value pairs. This indeterminacy is

the explanation offered of functional morpheme omission. As further input utterances are

processed, the knowledge is acquired as to the appropriate morphemes for expressing

different combinations of feature:value pairs, so that there is eventually recovery from such

omissions.


There may be another factor which operates in conjunction with incomplete inflectional

acquisition to produce the phenomenon of functional morpheme omission. Below, we

characterize uninflected forms of verbs, for which certain feature:value pairs are implicit

rather than explicit, as default forms which apply when the alternative inflected forms fail

to apply. Thus, for example, the lexical entry for “walk” contains only the pred

feature:value pair. If it contained values for tense, for example, it would fail to unify with

inflections. We do not assume the convention of zero inflections, for reasons outlined

below. Given this framework, it can be assumed, from the start of acquisition, that there is

a unique inflection that is appropriate for expressing any set of features. What this means

is that, during acquisition, the uninflected form of a verb (where its acquisition is

complete) may apply, by default, whenever the appropriate inflected form cannot be

determined. The use of the uninflected form is equivalent to, or indistinguishable from,

inflectional omission. As further inflections are acquired, the scope for the use of the

default uninflected form, or omission, is gradually narrowed. Thus the account of functional

morpheme omission offered by the model predicts gradual recovery from the telegraphic

stage.
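On this account, production during acquisition might be sketched roughly as follows (a schematic rendering of ours, not the model's actual production mechanism): an inflection is attached only when the features to be expressed determine it uniquely; otherwise the bare stem applies by default, which surfaces as omission.

    def unifies(features, entry):
        return all(entry.get(f, v) == v for f, v in features.items())

    def produce(stem, features, inflections):
        # Attach an inflection only when the features determine a unique choice;
        # otherwise fall back to the default uninflected form (apparent omission).
        matches = [m for m, entry in inflections.items() if unifies(features, entry)]
        return stem + matches[0] if len(matches) == 1 else stem

    # Over-specific entries part-way through acquisition: no inflection unifies
    # with the features of a third person singular past, so the bare stem is used.
    inflections = {
        "-ing": {"prog": "pos"},
        "-ed":  {"prog": "neg", "tense": "past", "num": "pl", "person": 1},
        "-s":   {"prog": "neg", "tense": "present", "num": "sg", "person": 3},
    }
    print(produce("walk", {"prog": "neg", "tense": "past", "num": "sg", "person": 3},
                  inflections))   # "walk"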

A further aspect of the account of functional morpheme omission is briefly mentioned here.

The account of the order of acquisition of functional morphemes can also be applied to the

earlier stages of acquisition to predict the order of acquisition of different inflected forms.

An inflected verb form, such as “goes”, defined in terms of a number of feature:value

pairs, will be relatively difficult to acquire, and thus may sometimes be omitted. This

prediction may be more difficult to evaluate against the child language data than is

functional morpheme omission. However, children’s Agent-Object constructions (Brown

1973) suggest that this prediction of the model cannot be ruled out. Alternatively, where

there is failure to determine the appropriate inflected form, the default uninflected form

could be produced, resulting in the appearance of functional morpheme omission at a stage

when functional morphemes do not yet exist as units in the model.

The competence-based account of functional morpheme omission in the model as outlined

above presents a viable alternative to the accounts previously offered. Its obvious

strength is that both the onset of functional morpheme omission and recovery from it are

accounted for solely by the characterization of the child’s developing knowledge.

6. 4. 3. Comprehension and Production in the Model

An account of functional morpheme omission should be consistent with the finding that

children at the telegraphic stage in language production respond better, in comprehension,


to adult-like than to telegraphic utterances. The account offered here is consistent with

this observation since it is assumed that functional morphemes are perceived from the

start of acquisition. A delay in producing certain functional morphemes after they have

been recognized is also predicted. During acquisition, an inflection may be used

appropriately some of the time, as well as omitted some of the time, as certain of the

conditions licensing its use are acquired before others. The model thus accounts for the

observation that children’s comprehension develops ahead of their production.

6. 4. 4. Overregularization in the Model

The features we require in an account of overgeneralization have been outlined above. The

onset of overgeneralization, following a period of correct use, needs to be explained. If the

explanation is in terms of the acquisition of the past tense rule, then an account of this is

itself required. After the onset of overgeneralization, we need to account for the nature of

the overregularizations predicted, for the observation that regular and irregular forms may

co-exist, and for the gradual recovery from overgeneralization. Existing models have failed

to offer a satisfactory account of overregularization. The emphasis in analysing the model

developed here is on the overgeneralizations predicted and on the onset of

overgeneralization. We are able to offer a simple account of the latter since, due to the fact

that segmentation is not presupposed by the model, stages appear to emerge from the

continuous model of syntax acquisition.

Here we offer a simple account of how the acquisition of the past tense rule and the

possibility of overregularization arise following a period of correct use of regular and

irregular past tenses. In the early stages of acquisition in the model developed, inflected

verbs like “walked” and “went” exist as unanalysed units. There is no rule for the

formation of the past tense and thus no possibility of overregularization. Lexical

acquisition and segmentation in the model eventually result in the acquisition of lexical

items corresponding to verb stems such as “walk” and “go” and inflections like “ing”

and “ed”. Syntax acquisition in turn results in rules for the formation of the progressive

and the past tense. Since the rule for the formation of the progressive applies to both

“walk” and “go”, these come to be considered instances of the same syntactic category.

The consequence is that the rule for the formation of the past tense, acquired in relation

to verb stems like “walk”, applies equally to a verb like “go” for which an irregular past

tense form exists, and thus the possibility of overregularization arises. An explanation of

the onset of overgeneralization thus arises naturally from an integrated model of

acquisition processes in which progression is predicted from a relatively unproductive


repertoire of utterances, which does not allow for the possibility of errors, to a more

productive system in which overregularization becomes a possibility.

We briefly consider the kind of account of recovery from overgeneralization consistent with

the model. If we assume that a unique form is appropriate for expressing, for example, the

past tense of a particular verb, then recovery should be possible. What recovery requires

is that an irregular alternative to the regular past tense be experienced and recognised as

such. A prediction which correctly follows is that adults may overregularize in the case of

an infrequent irregular verb (Marcus 1992).
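The progression, and the blocking of the regular form by a recognised irregular, can be rendered schematically as follows (our own sketch, with invented names; the induced verb category stands in for the model's distributionally induced syntactic category):

    def past_tense(verb, stored_irregulars, verb_category, rule_acquired):
        # Early stage: no past tense rule, so only rote-learned wholes like
        # "went" are available.
        if not rule_acquired:
            return stored_irregulars.get(verb)
        # Later stage: the "-ed" rule applies to any member of the induced
        # category, unless a recognised irregular alternative pre-empts it.
        if verb in verb_category:
            return stored_irregulars.get(verb, verb + "ed")
        return None

    category = {"walk", "bake", "go"}    # induced from shared "-ing" behaviour
    print(past_tense("go", {}, category, rule_acquired=True))               # "goed"
    print(past_tense("go", {"go": "went"}, category, rule_acquired=True))   # "went"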

6. 4. 5. Representational Adequacy in the Model

We noted above that generalization-based accounts of functional morpheme acquisition,

such as that offered by the model developed here, have been criticised as inadequate to

represent the kind of knowledge which has to be acquired. Here, we attempt to show that

such criticisms are unjustified.

6. 4. 5. 1. Implicit Features in Lexical Entries

Certain defining features are not present in the lexical entries acquired. Lexical acquisition

and segmentation in the model result in the acquisition of the following kinds of lexical

entry:

Verb → walk [pred: walk(subj, obj)]

Noun → cat [pred: cat]

Certain implied features of the uninflected form are not explicitly represented in the lexical

entry acquired; if they were, the F-Structure of the stem would fail to unify with the

appropriate inflections or functional morphemes. The existence of such forms also allows

for the possibility of alternative routes to segmentation.

    “walk”:   pred:  walk(subj, obj)
              tense: present
              prog:  neg
              ~subj: [num: sg, person: 3]

    “cat”:    pred: cat
              num:  sg

Figure 6.3 Implicit Features in F-Structures of Uninflected Forms


The implied F-Structures for the example uninflected forms are given above (Figure 6.3).

Features which are implicit, rather than explicit, have been viewed as problematic since

they mean that, in language production, “walk” can be overgeneralized to incorporate the

third person singular and, similarly, “cat” to incorporate the plural. As mentioned above,

we view this feature of the generalization-based model as one part of the competence-

based account offered of functional morpheme omission. The problem addressed here is

whether a generalization-based account, which results in lexical entries which can be

overgeneralized, is compatible with the eventual attainment of adult performance. While

the exact mechanisms for dealing with uninflected forms are not considered here, it is

argued that these need not be viewed as problematic. They can be viewed as default

cases which apply only when the conditions for use of inflected forms do not hold. Thus

before the acquisition of inflections is complete, they may be overgeneralized, resulting in

functional morpheme omission. However, once functional morpheme acquisition is

complete, it is possible to infer, through a process of elimination, the conditions for use of

the uninflected form, given the assumption that a unique form corresponds to each set of

features to be expressed. If the conditions for use of the uninflected form can be inferred,

then it is possible to make these explicit by creating a lexical entry for a zero morpheme.

However, this further step seems unnecessary and unparsimonious since, in the case of a

zero verb inflection, it would be necessary either to (i) introduce negative features into the

F-Structure of the lexical entry or to (ii) propose multiple lexical entries corresponding to

alternative F-Structures:

(i) Infl → Ø [~subj: [num: sg, person: 3]]

(ii) Infl → Ø [subj: [num: pl, person: 3]]

Infl → Ø [subj: [person: 1]]

Infl → Ø [subj: [person: 2]]

It is argued above that implicit features need not be viewed as problematic in the case of

uninflected forms where these can be regarded as a default form. The same argument

extends to possible default inflected forms, with the constraint that there can only be a

single default form1, whether uninflected or inflected.

1 This is an over-simplification, which we qualify below.


6. 4. 5. 2. Generalization and Syncretism

If an account of functional morpheme acquisition is to be adequate, it must allow for the

possibility that the same morpheme may encode distinct sets of features. Generalization-

based learning, of which lexical acquisition in the model developed is an example, appears

to rely on the assumption that a word or functional morpheme is only ever associated with

a single set of features. The existence of syncretism shows this assumption to be invalid.

Our response to the problem this presents is to illustrate that the assumptions of the

model differ in an important respect from the invalid assumption above. Lexical entries in

the model consist essentially of the word or morpheme and its defining features together

with a syntactic category label. Crucially, syntactic categories in the model are induced on

the basis of distributional information, rather than assumed. This means that what lexical

acquisition in the model assumes is that a word or morpheme with a particular

distributional pattern or of a particular syntactic category only ever encodes a single set of

features.

In order to illustrate what difference the model’s qualified assumption makes to learning,

we examine case-marking noun inflections in Serbo-Croatian (Slobin 1982). Translating

the features defining inflections into the F-Structure representation used in the model, we

get the following target lexical entries1:

Infl → a [gen:m, ani:pos, case:acc, num:sg]

Infl → a [gen:n, num:pl]

Infl → a [gen:f, class:a, case:nom, num:sg]

Infl → i [gen:m, case:nom, num:pl]

Infl → i [gen:f, class:o, num:pl]

Infl → i [gen:f, class:cons, num:pl]

Infl → u [gen:f, class:a, case:acc, num:sg]

Infl → e [gen:m, case:acc, num:pl]

Infl → e [gen:f, class:a, num:pl]

Infl → e [gen:n, num:sg]

1 Again, we use the zero morpheme notation as a convenient form of representation. Below, we discuss the implications of dropping the assumption of zero morphemes for consistency with the model developed.


Infl → o [gen:f, class:o, num:sg]

Infl → o [gen:n, num:sg]

Infl → Ø [gen:m, case:nom, num:sg]

Infl → Ø [gen:m, ani:neg, num:sg]

Infl → Ø [gen:f, class:cons, num:sg]

By performing “reversed unification” over the target lexical entries for each inflection, we

can predict the lexical entries which would be acquired in the generalization-based model

assuming a uniform syntactic categorization:

Infl → a [ ]

All defining features are lost.

Infl → i [num:pl]

Important distinguishing features are lost.

Infl → u [gen:f, class:a, case:acc, num:sg]

Since there is a single set of features, none are lost.

Infl → e [ ]

All defining features are lost.

Infl → o [num:sg]

Important distinguishing features are lost.

Infl → Ø [num:sg]

Important distinguishing features are lost.

The possibility of a solution to this problem relies upon the argument that, while all cases

of an inflection, such as “a”, are homonyms, they are not syntactically equivalent, differing

with respect to the gender of the noun inflected. If we apply “reversed unification” to only

those cases of an inflection with the same gender value, the result we get very closely

resembles the target lexical entries. We demonstrate this by presenting the target lexical

entries above together with those predicted by the generalization-based model given

differential syntactic categorization:


Infl → a [gen:m, ani:pos, case:acc, num:sg]

MInfl → a [gen:m, ani:pos, case:acc, num:sg]

Infl → a [gen:n, num:pl]

NInfl → a [gen:n, num:pl]

Infl → a [gen:f, class:a, case:nom, num:sg]

FInfl → a [gen:f, class:a, case:nom, num:sg]

Infl → i [gen:m, case:nom, num:pl]

MInfl → i [gen:m, case:nom, num:pl]

Infl → i [gen:f, class:o, num:pl]

Infl → i [gen:f, class:cons, num:pl]

FInfl → i [gen:f, num:pl]

Here, the class feature is lost; however, the idea of a default form may apply (see below).

Infl → u [gen:f, class:a, case:acc, num:sg]

FInfl → u [gen:f, class:a, case:acc, num:sg]

Infl → e [gen:m, case:acc, num:pl]

MInfl → e [gen:m, case:acc, num:pl]

Infl → e [gen:f, class:a, num:pl]

FInfl → e [gen:f, class:a, num:pl]

Infl → e [gen:n, num:sg]

NInfl → e [gen:n, num:sg]

Infl → o [gen:f, class:o, num:sg]

FInfl → o [gen:f, class:o, num:sg]

Infl → o [gen:n, num:sg]

NInfl → o [gen:n, num:sg]


Feminine
    num: sg
        class: a, case: nom     →  “-a”
        class: a, case: acc     →  “-u”
        class: o                →  “-o”
        class: cons             →  Ø
    num: pl
        class: a                →  “-e”
        otherwise               →  “-i”

Masculine
    num: sg
        case: acc, ani: pos     →  “-a”
        otherwise               →  Ø
    num: pl
        case: nom               →  “-i”
        case: acc               →  “-e”

Neuter
    num: sg                     →  “-e” / “-o”
    num: pl                     →  “-a”

Figure 6.4 Case-Marking Noun Inflections in Serbo-Croatian


Infl → Ø [gen:m, case:nom, num:sg]

Infl → Ø [gen:m, ani:neg, num:sg]

MInfl → Ø [gen:m, num:sg]

Here, the case and ani features are lost; however, the idea of a default form may apply.

Infl → Ø [gen:f, class:cons, num:sg]

FInfl → Ø [gen:f, class:cons, num:sg]

For each of the categories masculine and feminine, there is a single form in relation to

which features are lost. Given that there is only a single case, the notion of a default form,

introduced in the section above, may apply. However, we must also consider the

implications of dropping the assumption of zero morphemes which are not explicitly

represented in the model’s lexicon. In the case of the masculine, the default and

uninflected forms are one and the same, so there is no problem. However, in the feminine

case, it appears that there are now two defaults, the inflection “i” and the uninflected

form. Interestingly, however, these do not present a problem, being mutually exclusive, as

illustrated (Figure 6.4): “i” is the default plural form and the uninflected form is the

ultimate default form which is, appropriately, only required to apply in the singular case.

What the above analysis of syncretism is intended to show is that a generalization-based

account of functional morpheme acquisition may be able to account for this feature of

natural language if syntactic categorization is not assumed, as in the model developed.

Future work will need to include actual simulations of learning of natural languages which

exhibit syncretism. This will enable us to determine whether syntactic categorization in

the model does enable us, as suggested above, to distinguish those cases of a functional

morpheme associated with different sets of features.
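The difference the qualified assumption makes can be sketched as follows (a schematic rendering of ours, using the Serbo-Croatian “-a” entries; gender stands in for the distributionally induced category): generalizing over all homonymous entries loses every defining feature, whereas generalizing only within each induced category preserves them.

    from collections import defaultdict

    def reversed_unification(fstructures):
        return dict(set.intersection(*(set(f.items()) for f in fstructures)))

    uses_of_a = [
        {"gen": "m", "ani": "pos", "case": "acc", "num": "sg"},
        {"gen": "n", "num": "pl"},
        {"gen": "f", "class": "a", "case": "nom", "num": "sg"},
    ]

    # Uniform categorization: a single entry for "-a"; all defining features lost.
    print(reversed_unification(uses_of_a))        # {}

    # Differential categorization: generalize within each distributional class.
    by_category = defaultdict(list)
    for use in uses_of_a:
        by_category[use["gen"]].append(use)
    for gender, uses in by_category.items():
        print(gender, reversed_unification(uses))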

6. 5. Conclusion

In this chapter we have examined in detail how the model accounts for one aspect of child

language development, functional morpheme acquisition. As well as correctly predicting

order of acquisition, the model offers explanations of a number of phenomena related to

functional morpheme acquisition. The integrated model of acquisition processes explains

why the past tense rule, which has been linked with the onset of overgeneralization, is

acquired when it is. A competence-based account of functional morpheme omission is

offered which, unlike alternative accounts, is consistent with the observation that

functional morphemes are produced before the stage at which they are omitted. The model

also correctly predicts that the stage of omission precedes that of overgeneralization:


whereas omission is predicted when acquisition of the functional morpheme is incomplete,

overregularization takes place after the inflection and the associated syntactic rule have

been acquired. With respect to the issue of representational adequacy, we propose that a

generalization-based account of lexical acquisition, like that incorporated in the model,

may be able to deal with syncretism so long as syntactic categorization isn’t assumed.

This reinforces the suggestion made in previous chapters that the assumption of innate

linguistic knowledge may confound acquisition.


7. Summary and Conclusions

7. 1. Overview

This chapter takes as its starting point a summary of the major issues addressed by the

development of the model. One of these is the argument, underlying the development of

the empiricist model, that the assumption of too much knowledge may actually inhibit

acquisition. Since this is an issue of general importance, we go on to make explicit in what

respects the model can be considered empiricist and how its assumptions relate to the aim

of providing an account of acquisition. We then consider future directions. These include

the need to model the acquisition of a variety of natural languages. The idea that

knowledge constrains learning may be extendable to learning across languages (e.g., to

issues in second language acquisition). Finally, there is the need to further develop the

model to consider important issues such as later stages in acquisition and the role of

semantics in acquisition.

7. 2. Summary

The model developed acquires a recursive phrase-structure grammar without assuming

innate linguistic knowledge in the form of X-Bar Theory and the syntactic categories in

terms of which it is defined. This result was achieved by removing from the model

developed certain ostensibly simplifying assumptions of existing models. These were the

assumptions that content word acquisition and segmentation take place prior to syntax

acquisition, and that all input utterances consist of grammatical sentences. The standard

independent model of syntax acquisition was replaced with an integrated model of

acquisition processes, input utterances to which were assumed to be of various

grammatical types. No change to the basic architecture of the model was required.

Mechanisms for lexical acquisition were added, with segmentation taking place as a kind

of side-effect of lexical acquisition, given the assumption of different types of input

utterance. No lexicon was assumed prior to acquisition, and inputs consisted simply of the

utterance, represented as an unsegmented phonological string, paired with its semantic

representation. The problem of building phrase structure having been addressed, a

recursive phrase-structure grammar was acquired through the induction of syntactic

categories.
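As a concrete illustration, a single input to the integrated model (taken verbatim from the simulation reproduced in Appendix C) pairs the unsegmented phonological string for “a dog chased Fido” with its F-Structure:

| ?- learn([sw,d,o,g,ch,aa,s,t,f,ii,d,oo],
           [pred:chase(subj,obj),obj:[pred:fido],tense:past,subj:[def:neg,pred:dog]]).

No word boundaries are marked in the string; the segment [sw], which carries def:neg, is isolated only in the course of acquisition, as the lexicon in Appendix C shows.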

In rejecting innatist assumptions, empiricist models are at risk of replacing these with

over-reliance on semantic inputs. The model developed avoids this criticism insofar as the

acquisition of recursive constructions, on the basis of relatively simple inputs, means that

semantic inputs need not be assumed for utterances with more than two levels of

embedding. It is also proposed that the acquisition of syntactic constituents, like verb


phrases, which do not correspond to semantic constituents, may be triggered by the

presence in the input of non-sentential utterances.

It is argued that the empiricist model of phrase structure acquisition developed provides a

better account of child language development than do the innatist alternatives. Whereas

acquisition is powerful in the case of the latter, the model developed here gradually

acquires syntactic categories, and its phrase-structure grammar gradually emerges from

an earlier finite-state grammar. Gradual acquisition is just one of a number of objectives

the development of the model aimed to meet. Acquisition in the model is also continuous,

with all learning driven by the requirements of comprehension, and the appearance of

stages in acquisition an emergent property of the model. Thus, the model does not make

ad hoc assumptions in order to account for certain observed phenomena. In developing the

model, we have focussed, as have other models, upon the acquisition of English. However,

language-neutrality is clearly an important goal. In the previous chapter we examined the

issue of whether the model was representationally adequate with respect to certain

natural language phenomena not found in English. Future work will need to involve actual

simulations with a variety of natural languages.

A number of important issues were addressed by the development of the model in addition

to the main issue of phrase structure acquisition. A top-down model of lexical acquisition

was developed, compatible with a lexically-driven account of lexical recognition and

segmentation. This removes what has been a major objection to such accounts, that they

are incompatible with an account of acquisition, it having been assumed that bottom-up

segmentation must precede lexical acquisition. Lexical acquisition prior to segmentation is

made possible in the model developed by the adoption of a relatively flexible criterion as to

what may constitute a lexical item during the course of acquisition. The nature of the

target lexicon remains unchanged. There is thus an obvious parallel with the development

of the target phrase-structure grammar from the earlier finite-state grammar.
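The flexible criterion is visible in the lexicon produced by the simulation in Appendix C, where a wholly unanalysed utterance and a verb-phrase-sized fragment are both held as lexical items during acquisition:

lex(node5,[dd,sw,k,a,t,ch,aa,s,t,f,ii,d,oo],[subj:[pred:cat,def:pos],obj:[pred:fido],tense:past,pred:chase(subj,obj)])
lex(node9,[ch,aa,s,t,f,ii,d,oo],[obj:[pred:fido],tense:past,pred:chase(subj,obj)])

On the account given above, entries of this interim kind are segmented by later inputs, so that the nature of the target lexicon itself is unaffected.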

In relation to functional morpheme acquisition, accounts are offered of a number of related

phenomena. These include order of acquisition, functional morpheme omission,

overgeneralization, and the appearance of comprehension developing ahead of production.

The model also correctly predicts that omissions precede overgeneralizations. Unlike

competing accounts of functional morpheme acquisition, that incorporated in the model is

able to offer a simple explanation of the onset of both functional morpheme omission and

overgeneralization following an earlier stage of correct use. This is because segmentation


isn’t assumed by the model, and the possibility of these behaviours only arises after

segmentation has taken place.

The model developed provides a number of examples of how existing knowledge plays an

important role in constraining learning. We suggest, in Chapters 4 and 6, reasons why the

assumption of syntactic categorization inhibits grammar rule and functional morpheme

acquisition. Similarly, the assumption of a lexicon of content words and the ability to

segment can be viewed as inhibiting both phrase structure and functional morpheme

acquisition, as argued in Chapter 5. The development of the model thus serves to

illustrate how learning within a language is constrained by existing knowledge of that

language. Where the acquisition of one kind of knowledge inhibits the acquisition of a

different kind of knowledge, an integrated model of their acquisition is required. Grammar

acquisition needs to take place alongside the induction of syntactic categories, and,

similarly, alongside the acquisition of the lexicon and segmentation abilities. The idea that

knowledge constrains acquisition may also apply to learning across languages, an issue to

which we return below.

7. 3. What is Innate and What is Acquired

The model has been characterized as empiricist. This implies a commitment to the view

that syntactic categories and phrase structure are both acquired. It has also been argued

that grammatical well-formedness conditions are language-specific, and thus acquired.

We aim to make explicit here further ways in which the assumptions of the model reflect

the view that presupposing too much knowledge, as either innate or previously acquired,

may inhibit acquisition.

There is a single aim assumed to be innate, that of processing input utterances, and all

learning processes in the model are driven by this one aim. Syntax acquisition and lexical

acquisition are both triggered by failure of normal processing, and segmentation takes

place as a result of lexical acquisition. There is thus no assumption of the existence of

processes which actively seek to acquire a grammar or lexicon, or to segment incoming

utterances.

No principled division is assumed between grammar and lexicon, or between content

words and functional morphemes. Inputs to acquisition determine what is encoded in the

lexicon and grammar at different stages in acquisition. This is an important point since this

assumption is necessary to the accounts offered by the model of both phrase structure and

lexical acquisition. It would also seem to be required for language neutrality.
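Again, the outputs reproduced in Appendix C illustrate the point: the functional morpheme [sw] and the content word [d,o,g] are encoded in exactly the same lexical format, and the grammar rule which combines them refers to their node labels in the same way as it refers to any other category:

lex(node10,[sw],[def:neg])
lex(node1,[d,o,g],[pred:dog])
rule(node13,[(node10,unifies(up,down)),(node1,unifies(up,down))])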


There are a number of features embodied in the model which can be characterized as

innate. The parsing preferences assumed are grammar-independent, and thus an innate

property of the parser1. In addition, it was found necessary to incorporate a distinction into

the model, required in acquisition, between lexical and phrasal nodes. This does not

prevent the acquisition of the initial finite-state grammar in which only lexical nodes exist.

The constraint which is imposed by the distinction is that a grammatical-function-

assigning equation can never be attached to a lexical node, such that the presence of such

an equation induces the bottom-up proposal of a phrasal node corresponding to the

grammatical function. This means that nodes corresponding to grammatical functions are

treated equivalently with those corresponding to predicates.
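The effect of the constraint can be seen in the grammar acquired in Appendix C. The lexical item node3 (the unsegmented noun phrase [sw,m,ow,s]) cannot itself carry the subj-assigning equation; instead a phrasal node, node7, is proposed above it, and it is node7 to which the equation attaches in the sentence rule:

rule(node8,[(node7,unifies(up(subj),down)),(node9,unifies(up,down))])
rule(node7,[(node3,unifies(up,down))])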

The LFG notation is assumed by the development of the model, with F-Structures

corresponding to utterances represented in the input to acquisition, so that only the

acquisition of ordering information at the level C-Structure is modelled. The intention is

that semantic knowledge of features encoded by a language, but not syntactic knowledge

of how these features are represented, might be assumed (but as previously acquired

rather than innate). While F-Structure has been claimed to be language-neutral, we find it

necessary to alter the representation of prepositional phrases and their grammatical

functions so as not to encode syntactic knowledge in the semantic input to acquisition. A

further problem presented by LFG is that, while we consider assuming its well-

formedness conditions to be innate unjustifiable, they are not, as abstract principles,

amenable to acquisition. The solution implemented to this involves enhancing the notation

for rules acquired so that separate grammaticality checks are not required. This, in effect,

means that a language-specific version of completeness (i.e., the arguments of a predicate

which the language requires be expressed) is acquired. Coherence is still effectively

assumed in the form of predicate-argument information in semantic inputs, so that its

acquisition is an issue which remains to be addressed by future versions of the model.
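The enhanced notation can be seen in the grammar and lexicon acquired in Appendix E. The verb-phrase item node9 carries the annotation subj:X^pred:chase(X,fido), recording that its predicate still requires a subj argument, and the sentence rule node12 is satisfied only when the subj contributed by node13 is bound into that predicate, so that no separate completeness check is needed:

rule(node12,pred:_6801,[(node13,pred:_6812,unifies(up(subj),down)),(node9,subj:_6812^pred:_6801,unifies(up,down))])
lex(node9,subj:_6998^pred:chase(_6998,fido),[ch,aa,s,t,f,ii,d,oo],[obj:[pred:fido],tense:past,pred:chase(subj,obj)])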

7. 4. Implications and Future Directions

We view the work described here as being extendable in a variety of ways. The

discussion of the representation of syncretism in Chapter 6 suggests the need for actual

simulations of acquisition in a variety of natural languages. The further issues we consider

here concern the removal of further simplifications from the model, modelling the later

1 They are not merely abstract principles, however, being characterized in terms of a notion of efficiency grounded in the parsing strategy used.


stages of acquisition, and looking beyond the issue of learning within languages to that of

learning between languages.

7. 4. 1. Extending the Model

An issue addressed in developing the model is the appropriateness of the existing

computational model paradigm. The changes required to this in order to implement phrase

structure acquisition can be viewed as an implicit criticism of models which make too many

simplifying assumptions. At the same time, however, it can be seen that the model

developed itself makes oversimplifying assumptions which future work will need to

address. A more sophisticated representation of input utterances is required, as is a

model of lexical acquisition capable of exploiting various sources of information and, by

implication, redundancy. More realistic assumptions about the semantic input to

acquisition are also required, an issue which we expand upon below. Other limitations to

the model suggest that a more holistic model of acquisition will eventually be required. For

instance, declarative input utterances overemphasize the descriptive role of language,

ignoring its communicative function. There is also the need to implement a model of

language production in acquisition.

The issues outlined above are concerned with refining inadequacies identified in the

current model. Another aspect of extending the model involves using it to address further

issues. This involves adding to the model and, at the same time, re-evaluating its

assumptions in relation to new considerations which arise. We consider below the issue

of modelling later stages of acquisition.

7. 4. 1. 1. The Role of Semantics in Acquisition

A major implication of the solution offered to the problem of phrase structure acquisition is

that syntax is not to be viewed as autonomous. To an extent this seems to diminish the

importance of demonstrating the acquisition of a phrase-structure grammar. That is, given

that semantic considerations are viewed as the basis of syntactic constraints, the

acquisition of semantics remains a vital issue which needs to be addressed.

Existing computational models have examined the problem of referential uncertainty

concerning the semantic input to lexical acquisition (Siskind 1992); however, it was

assumed that language-neutral semantic representations are involved. Studies of

children’s lexical acquisition suggest that what is further required is an account of the

acquisition of concepts which are, at least partly, language-specific:


“we have shown that the meanings of children’s early spatial words are

language specific. This means that language learners do not map spatial words

onto nonlinguistic spatial concepts, as has often been proposed, but instead are

sensitive to the semantic structure of the language virtually from the

beginning.”

(Choi & Bowerman 1991, pp.117-8)

The model developed already embodies the assumption of language-specificity at the

syntactic level; however, current fully specified semantic inputs have embodied the

simplifying assumption of language-neutrality at the conceptual level. Language-

specificity requires that, in future versions of the model, semantic inputs be underspecified

with respect to, for example, information concerning the arguments taken by predicates.

Input utterances, initially analysed on the basis of such partial semantic information, will

determine the concepts acquired. This should be possible given that structures may be

acquired at the syntactic level that are not present in the semantic input to acquisition. We

have already identified phrasal utterances as providing one non-semantic source of

syntactic structure.
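As a purely hypothetical illustration of the kind of underspecification intended (the participants: notation below is invented for this sketch and is not an input format used by the model), a fully specified input of the sort used in Appendix C might be weakened so that the learner is no longer told which participant the language expresses as subj and which as obj:

% Fully specified input, as used in Appendix C:
%   learn([sw,d,o,g,ch,aa,s,t,f,ii,d,oo],
%         [pred:chase(subj,obj),obj:[pred:fido],tense:past,subj:[def:neg,pred:dog]]).
%
% Hypothetical underspecified variant, in which argument structure is left open:
%   learn([sw,d,o,g,ch,aa,s,t,f,ii,d,oo],
%         [pred:chase,participants:[[pred:dog,def:neg],[pred:fido]],tense:past]).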

7. 4. 1. 2. Later Stages of Acquisition

While we have demonstrated the acquisition of a recursive phrase-structure grammar, we

have mainly been concerned with simple utterances and the earliest stages of acquisition.

The challenge remains of acquiring means of representing, for example, interrogatives and

co-ordination. We view the task not as simply modelling the acquisition of these

phenomena, but as also determining how they are to be represented. While representation

might seem to be an issue which is well-understood, acquisition is one kind of constraint

on representation which is too often neglected. For example, LFG represents

interrogatives as involving movement of constituents and long-distance dependencies. It

seems unlikely that the acquisition of rules involving movement can be justified in an

empiricist model, so either the commitment to empiricism must be dropped or new ways of

representing the phenomena must be found.

7. 4. 2. Knowledge and Acquisition

The idea that knowledge constrains acquisition is one that may apply to learning across

languages, accounting, for example, for the observation that adults fluent in one language

have difficulty in acquiring the phonemic contrasts of another language (Clifton 1993). An

erroneous match with existing noise-tolerant rules for the native language could inhibit

the acquisition of rules for a different language. It is thus predicted that, if fluency of the


different languages is to be achieved, their acquisition has to be simultaneous, in the same

way that different learning processes within each language have to operate concurrently.

The idea that the loss of the ability to distinguish non-native phonetic contrasts is

associated with the acquisition of knowledge appears to be consistent with the following

findings. While the ability is one normally lost at around 10-12 months (Werker 1993), the

ability to recognise a particular contrast may be inhibited at a much earlier age than this:

“when 4-day-olds were exposed to a set of widely separated vowel contrasts

(e.g. [bi], [ba], [bu]), they gave no evidence of detecting the addition of a new

syllable ([b^]) that was very similar to one of the familiar ones (i.e. [ba]).

Most importantly, the failure of the infants to detect the new item is not the

result of their inability to discriminate the contrast, because newborns tested

on the [ba]/[b^] contrast did discriminate these items.”

(Jusczyk 1993a, p.38)

Similarly, some non-native contrasts are not lost by adults because they are so dissimilar

from familiar phonemes (Clifton 1993).

7. 5. Conclusion

Specifically, the work described demonstrates the gradual acquisition of a recursive

phrase-structure grammar in an empiricist computational model. At a more general level,

the development of the model serves to illustrate that computational models have a role to

play alongside other approaches to child language acquisition research. The framework in

which the model has been developed suggests possibilities for future research,

introducing, for example, the idea that complex phenomena in acquisition may be modelled

in terms of the interactions amongst processes which are, in themselves, relatively

simple.


Appendices

Appendix A An Example of Grammar Acquisition in the Prototype Model

The following is a script file of grammar acquisition in the prototype model described in Chapter 4, showing inputs (utterance, F-Structure) and output (grammar). The purpose of this appendix is to support the examples of acquisition outlined in 4. 6. 2.

Script started on Fri May 5 14:40:42 1995
To use QUI, setenv DISPLAY to machine:0.0
~~~~~~~~~~~~~~ ~~~~~~~~~~~
fat-controller% quintus
Quintus Prolog Release 3.1.3 (Sun-4, SunOS 4.1)
Copyright (C) 1993, Quintus Corporation. All rights reserved.
2100 Geng Road, Palo Alto, California U.S.A. (415) 813-3800

| ?- [learn_basic_td].

yes| ?- learn([fido,chases,tigger],[subj:[pred:fido],tense:present,pred:chase(subj,obj),obj:[pred:tigger]]).

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up(subj),down)),(lex4,unifies(up,down)),(node3,unifies(up(obj),down))])

rule(node1,[(lex1,unifies(up,down))])

rule(node3,[(lex2,unifies(up,down))])

yes| ?- learn([fido,chases,tabby],[pred:chase(subj,obj),subj:[pred:fido],obj:[pred:tabby],tense:present]).

CURRENT GRAMMAR

rule(node1,[(lex1,unifies(up,down))])

rule(node2,[(node1,unifies(up(subj),down)),(lex4,unifies(up,down)),(node5,unifies(up(obj),down))])

rule(node5,[(node6,unifies(up,down))])

yes| ?- learn([fido,chased,tabby],[obj:[pred:tabby],tense:past,subj:[pred:fido],pred:chase(subj,obj)]).

CURRENT GRAMMAR

rule(node7,[(lex1,unifies(up,down))])


rule(node9,[(node6,unifies(up,down))])

rule(node2,[(node7,unifies(up(subj),down)),(node10,unifies(up,down)),(node9,unifies(up(obj),down))])

yes| ?- full_update.

CURRENT GRAMMAR

rule(node7,[(lex1,unifies(up,down))])

rule(node9,[(node6,unifies(up,down))])

rule(node2,[(node7,unifies(up(subj),down)),(node10,unifies(up,down)),(node9,unifies(up(obj),down))])

CURRENT LEXICON

lex(lex1,fido,[pred:fido])

lex(node6,tabby,[pred:tabby])

lex(node6,tigger,[pred:tigger])

lex(node10,chases,[pred:chase(subj,obj),tense:present])

lex(node10,chased,[pred:chase(subj,obj),tense:past])

yes
| ?-
fat-controller%
script done on Fri May 5 14:46:11 1995


Appendix B An Example of Grammar Acquisition in the Prototype Model, Illustrating Data Structures Used in Parsing

The following is a script file of grammar acquisition in the prototype model described in Chapter 4, showing, in addition to inputs (utterance, F-Structure) and output (grammar) from acquisition, the Agenda during parsing and the Chart at the end of parsing, and the C-Structure and F-Structure outputs from parsing the utterance. The purpose of this appendix is to support the examples of acquisition outlined in 4. 6. 2.

Script started on Fri May 5 14:40:42 1995
To use QUI, setenv DISPLAY to machine:0.0
~~~~~~~~~~~~~~ ~~~~~~~~~~~
fat-controller% quintus
Quintus Prolog Release 3.1.3 (Sun-4, SunOS 4.1)
Copyright (C) 1993, Quintus Corporation. All rights reserved.
2100 Geng Road, Palo Alto, California U.S.A. (415) 813-3800

| ?- [learn_basic_td].

yes| ?- learn([fido,chases,tigger],[subj:[pred:fido],tense:present,pred:chase(subj,obj),obj:[pred:tigger]]).

AGENDA: [edge(0,1,lex1,[],[fido],[pred:fido],_7140,base)]

AGENDA: [edge(0,1,node1,[],[(lex1(fido),unifies(up,down))],[pred:fido],acq,[(lex1,unifies(up,down))])]

AGENDA: []

AGENDA: [edge(1,2,lex4,[],[chases],[pred:chase(subj,obj),tense:present],_7995,base)]

AGENDA: []

AGENDA: [edge(2,3,lex2,[],[tigger],[pred:tigger],_8567,base)]

AGENDA: [edge(2,3,node3,[],[(lex2(tigger),unifies(up,down))],[pred:tigger],acq,[(lex2,unifies(up,down))])]

AGENDA:[edge(0,3,node2,[],[(node3((lex2(tigger),unifies(up,down))),unifies(up(obj),down)),(lex4(chases),unifies(up,down)),(node1((lex1(fido),unifies(up,down))),unifies(up(subj),down))],[obj:[pred:tigger],pred:chase(subj,obj),tense:present,subj:[pred:fido]],acq,[(node3,unifies(up(obj),down)),(lex4,unifies(up,down)),(node1,unifies(up(subj),down))])]

AGENDA: []


CHART:edge(0,3,node2,[],[(node3((lex2(tigger),unifies(up,down))),unifies(up(obj),down)),(lex4(chases),unifies(up,down)),(node1((lex1(fido),unifies(up,down))),unifies(up(subj),down))],[obj:[pred:tigger],pred:chase(subj,obj),tense:present,subj:[pred:fido]],acq,[(node3,unifies(up(obj),down)),(lex4,unifies(up,down)),(node1,unifies(up(subj),down))])

edge(0,3,node2,acq,[(node3((lex2(tigger),unifies(up,down))),unifies(up(obj),down)),(lex4(chases),unifies(up,down)),(node1((lex1(fido),unifies(up,down))),unifies(up(subj),down))],[obj:[pred:tigger],pred:chase(subj,obj),tense:present,subj:[pred:fido]],acq,[(node3,unifies(up(obj),down)),(lex4,unifies(up,down)),(node1,unifies(up(subj),down))])

edge(2,3,node3,[],[(lex2(tigger),unifies(up,down))],[pred:tigger],acq,[(lex2,unifies(up,down))])

edge(2,3,node3,acq,[(lex2(tigger),unifies(up,down))],[pred:tigger],acq,[(lex2,unifies(up,down))])

edge(2,3,lex2,[],[tigger],[pred:tigger],_8567,base)

edge(0,2,node2,acq,[(lex4(chases),unifies(up,down)),(node1((lex1(fido),unifies(up,down))),unifies(up(subj),down))],[pred:chase(subj,obj),tense:present,subj:[pred:fido]],acq,[(lex4,unifies(up,down)),(node1,unifies(up(subj),down))])

edge(1,2,lex4,[],[chases],[pred:chase(subj,obj),tense:present],_7995,base)

edge(0,1,node2,acq,[(node1((lex1(fido),unifies(up,down))),unifies(up(subj),down))],[subj:[pred:fido]],acq,[(node1,unifies(up(subj),down))])

edge(0,1,node1,[],[(lex1(fido),unifies(up,down))],[pred:fido],acq,[(lex1,unifies(up,down))])

edge(0,1,node1,acq,[(lex1(fido),unifies(up,down))],[pred:fido],acq,[(lex1,unifies(up,down))])

edge(0,1,lex1,[],[fido],[pred:fido],_7140,base)

C-Structure:

node2((node1((lex1(fido),unifies(up,down))),unifies(up(subj),down)),(lex4(chases),unifies(up,down)),(node3((lex2(tigger),unifies(up,down))),unifies(up(obj),down)))

F-Structure:

[obj:[pred:tigger],pred:chase(subj,obj),tense:present,subj:[pred:fido]]


CURRENT GRAMMAR

rule(node2,[(node1,unifies(up(subj),down)),(lex4,unifies(up,down)),(node3,unifies(up(obj),down))])

rule(node1,[(lex1,unifies(up,down))])

rule(node3,[(lex2,unifies(up,down))])

yes| ?- learn([fido,chases,tabby],[pred:chase(subj,obj),subj:[pred:fido],obj:[pred:tabby],tense:present]).

AGENDA: [edge(0,1,lex1,[],[fido],[pred:fido],_7144,base)]

AGENDA: [edge(0,1,node1,[],[(lex1(fido),unifies(up,down))],[pred:fido],ri,[(lex1,unifies(up,down))])]

AGENDA: [edge(0,1,node2,[(lex4,unifies(up,down)),(node3,unifies(up(obj),down))],[(node1((lex1(fido),unifies(up,down))),unifies(up(subj),down))],[subj:[pred:fido]],ri,[(node1,unifies(up(subj),down))])]

AGENDA: []

AGENDA: [edge(1,2,lex4,[],[chases],[pred:chase(subj,obj),tense:present],_7868,base)]

AGENDA: []

AGENDA: [edge(2,3,lex3,[],[tabby],[pred:tabby],_8440,base)]

AGENDA: [edge(2,3,node4,[],[(lex3(tabby),unifies(up,down))],[pred:tabby],acq,[(lex3,unifies(up,down))])]

AGENDA:[edge(0,3,node2,[],[(node4((lex3(tabby),unifies(up,down))),unifies(up(obj),down)),(lex4(chases),unifies(up,down)),(node1((lex1(fido),unifies(up,down))),unifies(up(subj),down))],[obj:[pred:tabby],pred:chase(subj,obj),tense:present,subj:[pred:fido]],acq,[(node4,unifies(up(obj),down)),(lex4,unifies(up,down)),(node1,unifies(up(subj),down))])]

AGENDA: []

CHART:edge(0,3,node2,[],[(node4((lex3(tabby),unifies(up,down))),unifies(up(obj),down)),(lex4(chases),unifies(up,down)),(node1((lex1(fido),unifies(up,down))),unifies(up(subj),down))],[obj:[pred:tabby],pred:chase(subj,obj),tense:present,subj:[pred:fido]],acq,[(node4,unifies(up(obj),down)),(lex4,unifies(up,down)),(node1,unifies(up(subj),down))])


edge(0,3,node2,acq,[(node4((lex3(tabby),unifies(up,down))),unifies(up(obj),down)),(lex4(chases),unifies(up,down)),(node1((lex1(fido),unifies(up,down))),unifies(up(subj),down))],[obj:[pred:tabby],pred:chase(subj,obj),tense:present,subj:[pred:fido]],acq,[(node4,unifies(up(obj),down)),(lex4,unifies(up,down)),(node1,unifies(up(subj),down))])

edge(2,3,node4,[],[(lex3(tabby),unifies(up,down))],[pred:tabby],acq,[(lex3,unifies(up,down))])

edge(2,3,node4,acq,[(lex3(tabby),unifies(up,down))],[pred:tabby],acq,[(lex3,unifies(up,down))])

edge(2,3,lex3,[],[tabby],[pred:tabby],_8440,base)

edge(0,2,node2,acq,[(lex4(chases),unifies(up,down)),(node1((lex1(fido),unifies(up,down))),unifies(up(subj),down))],[pred:chase(subj,obj),tense:present,subj:[pred:fido]],acq,[(lex4,unifies(up,down)),(node1,unifies(up(subj),down))])

edge(1,2,lex4,[],[chases],[pred:chase(subj,obj),tense:present],_7868,base)

edge(0,1,node2,[(lex4,unifies(up,down)),(node3,unifies(up(obj),down))],[(node1((lex1(fido),unifies(up,down))),unifies(up(subj),down))],[subj:[pred:fido]],ri,[(node1,unifies(up(subj),down))])

edge(0,1,node1,[],[(lex1(fido),unifies(up,down))],[pred:fido],ri,[(lex1,unifies(up,down))])

edge(0,1,lex1,[],[fido],[pred:fido],_7144,base)

C-Structure:

node2((node1((lex1(fido),unifies(up,down))),unifies(up(subj),down)),(lex4(chases),unifies(up,down)),(node4((lex3(tabby),unifies(up,down))),unifies(up(obj),down)))

F-Structure:

[obj:[pred:tabby],pred:chase(subj,obj),tense:present,subj:[pred:fido]]

CURRENT GRAMMAR

rule(node1,[(lex1,unifies(up,down))])

rule(node2,[(node1,unifies(up(subj),down)),(lex4,unifies(up,down)),(node5,unifies(up(obj),down))])

rule(node5,[(node6,unifies(up,down))])

yes


| ?- learn([fido,chased,tabby],[obj:[pred:tabby],tense:past,subj:[pred:fido],pred:chase(subj,obj)]).

AGENDA: [edge(0,1,lex1,[],[fido],[pred:fido],_7156,base)]

AGENDA: [edge(0,1,node7,[],[(lex1(fido),unifies(up,down))],[pred:fido],acq,[(lex1,unifies(up,down))])]

AGENDA: []

AGENDA: [edge(1,2,lex5,[],[chased],[pred:chase(subj,obj),tense:past],_8032,base)]

AGENDA: []

AGENDA: [edge(2,3,node6,[],[tabby],[pred:tabby],_8604,base)]

AGENDA: [edge(2,3,node9,[],[(node6(tabby),unifies(up,down))],[pred:tabby],acq,[(node6,unifies(up,down))])]

AGENDA:[edge(0,3,node8,[],[(node9((node6(tabby),unifies(up,down))),unifies(up(obj),down)),(lex5(chased),unifies(up,down)),(node7((lex1(fido),unifies(up,down))),unifies(up(subj),down))],[obj:[pred:tabby],pred:chase(subj,obj),tense:past,subj:[pred:fido]],acq,[(node9,unifies(up(obj),down)),(lex5,unifies(up,down)),(node7,unifies(up(subj),down))])]

AGENDA: []

CHART:edge(0,3,node8,[],[(node9((node6(tabby),unifies(up,down))),unifies(up(obj),down)),(lex5(chased),unifies(up,down)),(node7((lex1(fido),unifies(up,down))),unifies(up(subj),down))],[obj:[pred:tabby],pred:chase(subj,obj),tense:past,subj:[pred:fido]],acq,[(node9,unifies(up(obj),down)),(lex5,unifies(up,down)),(node7,unifies(up(subj),down))])

edge(0,3,node8,acq,[(node9((node6(tabby),unifies(up,down))),unifies(up(obj),down)),(lex5(chased),unifies(up,down)),(node7((lex1(fido),unifies(up,down))),unifies(up(subj),down))],[obj:[pred:tabby],pred:chase(subj,obj),tense:past,subj:[pred:fido]],acq,[(node9,unifies(up(obj),down)),(lex5,unifies(up,down)),(node7,unifies(up(subj),down))])

edge(2,3,node9,[],[(node6(tabby),unifies(up,down))],[pred:tabby],acq,[(node6,unifies(up,down))])

edge(2,3,node9,acq,[(node6(tabby),unifies(up,down))],[pred:tabby],acq,[(node6,unifies(up,down))])

edge(2,3,node6,[],[tabby],[pred:tabby],_8604,base)


edge(0,2,node8,acq,[(lex5(chased),unifies(up,down)),(node7((lex1(fido),unifies(up,down))),unifies(up(subj),down))],[pred:chase(subj,obj),tense:past,subj:[pred:fido]],acq,[(lex5,unifies(up,down)),(node7,unifies(up(subj),down))])

edge(1,2,lex5,[],[chased],[pred:chase(subj,obj),tense:past],_8032,base)

edge(0,1,node8,acq,[(node7((lex1(fido),unifies(up,down))),unifies(up(subj),down))],[subj:[pred:fido]],acq,[(node7,unifies(up(subj),down))])

edge(0,1,node7,[],[(lex1(fido),unifies(up,down))],[pred:fido],acq,[(lex1,unifies(up,down))])

edge(0,1,node7,acq,[(lex1(fido),unifies(up,down))],[pred:fido],acq,[(lex1,unifies(up,down))])

edge(0,1,lex1,[],[fido],[pred:fido],_7156,base)

C-Structure:

node8((node7((lex1(fido),unifies(up,down))),unifies(up(subj),down)),(lex5(chased),unifies(up,down)),(node9((node6(tabby),unifies(up,down))),unifies(up(obj),down)))

F-Structure:

[obj:[pred:tabby],pred:chase(subj,obj),tense:past,subj:[pred:fido]]

CURRENT GRAMMAR

rule(node7,[(lex1,unifies(up,down))])

rule(node9,[(node6,unifies(up,down))])

rule(node2,[(node7,unifies(up(subj),down)),(node10,unifies(up,down)),(node9,unifies(up(obj),down))])

yes| ?- full_update.

CURRENT GRAMMAR

rule(node7,[(lex1,unifies(up,down))])

rule(node9,[(node6,unifies(up,down))])

rule(node2,[(node7,unifies(up(subj),down)),(node10,unifies(up,down)),(node9,unifies(up(obj),down))])


CURRENT LEXICON

lex(lex1,fido,[pred:fido])

lex(node6,tabby,[pred:tabby])

lex(node6,tigger,[pred:tigger])

lex(node10,chases,[pred:chase(subj,obj),tense:present])

lex(node10,chased,[pred:chase(subj,obj),tense:past])

yes
| ?-
fat-controller%
script done on Fri May 5 14:46:11 1995


Appendix C An Example of Acquisition in the Integrated Model of Acquisition Processes

The following is a script file of acquisition in the model, showing inputs (utterance, F-Structure) and outputs (grammar, lexicon). This version of the model assumes an LFG notation for the grammar acquired. The purpose of this appendix is to support the examples of acquisition outlined in 5. 6. 6.

Script started on Fri Apr 7 15:48:24 1995
To use QUI, setenv DISPLAY to machine:0.0
~~~~~~~~~~~~~~ ~~~~~~~~~~~
fat-controller% quintus
Quintus Prolog Release 3.1.3 (Sun-4, SunOS 4.1)
Copyright (C) 1993, Quintus Corporation. All rights reserved.
2100 Geng Road, Palo Alto, California U.S.A. (415) 813-3800

| ?- [learn_td].

yes| ?- learn([d,o,g],[pred:dog]).

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

yes| ?- learn([sw,m,ow,s],[pred:mouse,def:neg]).

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

rule(node4,[(node3,unifies(up,down))])

yes| ?-learn([dd,sw,k,a,t,ch,aa,s,t,f,ii,d,oo],[subj:[pred:cat,def:pos],obj:[pred:fido],tense:past,pred:chase(subj,obj)]).

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

rule(node4,[(node3,unifies(up,down))])

rule(node6,[(node5,unifies(up,down))])

yes| ?-learn([sw,m,ow,s,ch,aa,s,t,f,ii,d,oo],[obj:[pred:fido],tense:past,subj:[pred:mouse,def:neg],pred:chase(subj,obj)]).


CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

rule(node6,[(node5,unifies(up,down))])

rule(node8,[(node7,unifies(up(subj),down)),(node9,unifies(up,down))])

rule(node7,[(node3,unifies(up,down))])

yes| ?- learn([sw,d,o,g,ch,aa,s,t,f,ii,d,oo],[pred:chase(subj,obj),obj:[pred:fido],tense:past,subj:[def:neg,pred:dog]]).

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

rule(node6,[(node5,unifies(up,down))])

rule(node13,[(node3,unifies(up,down))])

rule(node12,[(node13,unifies(up(subj),down)),(node9,unifies(up,down))])

rule(node13,[(node10,unifies(up,down)),(node1,unifies(up,down))])

yes| ?- full_update.

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

rule(node6,[(node5,unifies(up,down))])

rule(node13,[(node3,unifies(up,down))])

rule(node12,[(node13,unifies(up(subj),down)),(node9,unifies(up,down))])

rule(node13,[(node10,unifies(up,down)),(node1,unifies(up,down))])

CURRENT LEXICON

lex(node5,[dd,sw,k,a,t,ch,aa,s,t,f,ii,d,oo],[subj:[pred:cat,def:pos],obj:[pred:fido],tense:past,pred:chase(subj,obj)])

lex(node3,[sw,m,ow,s],[pred:mouse,def:neg])


lex(node10,[sw],[def:neg])

lex(node1,[d,o,g],[pred:dog])

lex(node9,[ch,aa,s,t,f,ii,d,oo],[obj:[pred:fido],tense:past,pred:chase(subj,obj)])

yes
| ?-
fat-controller%
script done on Fri Apr 7 15:52:04 1995


Appendix D An Example of Acquisition in the Integrated Model of Acquisition Processes, Illustrating Data Structures Used in Parsing

The following is a script file of acquisition in the model, showing, in addition to inputs (utterance, F-Structure) and outputs (grammar, lexicon) from acquisition, the Agenda during parsing and the Chart at the end of parsing, and the C-Structure and F-Structure outputs from parsing the utterance. This version of the model assumes an LFG notation for the grammar acquired. The purpose of this appendix is to support the examples of acquisition outlined in 5. 6. 6.

Script started on Fri Apr 7 15:48:24 1995
To use QUI, setenv DISPLAY to machine:0.0
~~~~~~~~~~~~~~ ~~~~~~~~~~~
fat-controller% quintus
Quintus Prolog Release 3.1.3 (Sun-4, SunOS 4.1)
Copyright (C) 1993, Quintus Corporation. All rights reserved.
2100 Geng Road, Palo Alto, California U.S.A. (415) 813-3800

| ?- [learn_td].

yes| ?- learn([d,o,g],[pred:dog]).

AGENDA: [edge(0,1,node1,[],[dog],[pred:dog],_6957,base)]

AGENDA: [edge(0,1,node2,[],[(node1(dog),unifies(up,down))],[pred:dog],acq,[(node1,unifies(up,down))])]

AGENDA: []

CHART:edge(0,1,node2,[],[(node1(dog),unifies(up,down))],[pred:dog],acq,[(node1,unifies(up,down))])

edge(0,1,node2,acq,[(node1(dog),unifies(up,down))],[pred:dog],acq,[(node1,unifies(up,down))])

edge(0,1,node1,[],[dog],[pred:dog],_6986,base)

C-Structure:

node2((node1(dog),unifies(up,down)))

F-Structure:

[pred:dog]


CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

yes| ?- learn([sw,m,ow,s],[pred:mouse,def:neg]).

AGENDA: [edge(0,1,node3,[],[swmows],[pred:mouse,def:neg],_7047,base)]

AGENDA:[edge(0,1,node4,[],[(node3(swmows),unifies(up,down))],[pred:mouse,def:neg],acq,[(node3,unifies(up,down))])]

AGENDA: []

CHART:edge(0,1,node4,[],[(node3(swmows),unifies(up,down))],[pred:mouse,def:neg],acq,[(node3,unifies(up,down))])

edge(0,1,node4,acq,[(node3(swmows),unifies(up,down))],[pred:mouse,def:neg],acq,[(node3,unifies(up,down))])

edge(0,1,node3,[],[swmows],[pred:mouse,def:neg],_7092,base)

C-Structure:

node4((node3(swmows),unifies(up,down)))

F-Structure:

[pred:mouse,def:neg]

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

rule(node4,[(node3,unifies(up,down))])

yes| ?-learn([dd,sw,k,a,t,ch,aa,s,t,f,ii,d,oo],[subj:[pred:cat,def:pos],obj:[pred:fido],tense:past,pred:chase(subj,obj)]).

AGENDA:[edge(0,1,node5,[],[ddswkatchaastfiidoo],[subj:[pred:cat,def:pos],obj:[pred:fido],tense:past,pred:chase(subj,obj)],_7492,base)]


AGENDA:[edge(0,1,node6,[],[(node5(ddswkatchaastfiidoo),unifies(up,down))],[subj:[pred:cat,def:pos],obj:[pred:fido],tense:past,pred:chase(subj,obj)],acq,[(node5,unifies(up,down))])]

AGENDA: []

CHART:edge(0,1,node6,[],[(node5(ddswkatchaastfiidoo),unifies(up,down))],[subj:[pred:cat,def:pos],obj:[pred:fido],tense:past,pred:chase(subj,obj)],acq,[(node5,unifies(up,down))])

edge(0,1,node6,acq,[(node5(ddswkatchaastfiidoo),unifies(up,down))],[subj:[pred:cat,def:pos],obj:[pred:fido],tense:past,pred:chase(subj,obj)],acq,[(node5,unifies(up,down))])

edge(0,1,node5,[],[ddswkatchaastfiidoo],[subj:[pred:cat,def:pos],obj:[pred:fido],tense:past,pred:chase(subj,obj)],_7617,base)

C-Structure:

node6((node5(ddswkatchaastfiidoo),unifies(up,down)))

F-Structure:

[subj:[pred:cat,def:pos],obj:[pred:fido],tense:past,pred:chase(subj,obj)]

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

rule(node4,[(node3,unifies(up,down))])

rule(node6,[(node5,unifies(up,down))])

yes
| ?- learn([sw,m,ow,s,ch,aa,s,t,f,ii,d,oo],[obj:[pred:fido],tense:past,subj:[pred:mouse,def:neg],pred:chase(subj,obj)]).

AGENDA: [edge(0,1,node3,[],[swmows],[pred:mouse,def:neg],_7495,base)]

AGENDA:[edge(0,1,node7,[],[(node3(swmows),unifies(up,down))],[pred:mouse,def:neg],acq,[(node3,unifies(up,down))])]

AGENDA: []

AGENDA:[edge(1,2,node9,[],[chaastfiidoo],[obj:[pred:fido],tense:past,pred:chase(subj,obj)],_8752,base)]


AGENDA:[edge(0,2,node8,[],[(node9(chaastfiidoo),unifies(up,down)),(node7((node3(swmows),unifies(up,down))),unifies(up(subj),down))],[obj:[pred:fido],tense:past,pred:chase(subj,obj),subj:[pred:mouse,def:neg]],acq,[(node9,unifies(up,down)),(node7,unifies(up(subj),down))])]

AGENDA: []

CHART:edge(0,2,node8,[],[(node9(chaastfiidoo),unifies(up,down)),(node7((node3(swmows),unifies(up,down))),unifies(up(subj),down))],[obj:[pred:fido],tense:past,pred:chase(subj,obj),subj:[pred:mouse,def:neg]],acq,[(node9,unifies(up,down)),(node7,unifies(up(subj),down))])

edge(0,2,node8,acq,[(node9(chaastfiidoo),unifies(up,down)),(node7((node3(swmows),unifies(up,down))),unifies(up(subj),down))],[obj:[pred:fido],tense:past,pred:chase(subj,obj),subj:[pred:mouse,def:neg]],acq,[(node9,unifies(up,down)),(node7,unifies(up(subj),down))])

edge(1,2,node9,[],[chaastfiidoo],[obj:[pred:fido],tense:past,pred:chase(subj,obj)],_8829,base)

edge(0,1,node8,acq,[(node7((node3(swmows),unifies(up,down))),unifies(up(subj),down))],[subj:[pred:mouse,def:neg]],acq,[(node7,unifies(up(subj),down))])

edge(0,1,node7,[],[(node3(swmows),unifies(up,down))],[pred:mouse,def:neg],acq,[(node3,unifies(up,down))])

edge(0,1,node7,acq,[(node3(swmows),unifies(up,down))],[pred:mouse,def:neg],acq,[(node3,unifies(up,down))])

edge(0,1,node3,[],[swmows],[pred:mouse,def:neg],_7495,base)

C-Structure:

node8((node7((node3(swmows),unifies(up,down))),unifies(up(subj),down)),(node9(chaastfiidoo),unifies(up,down)))

F-Structure:

[obj:[pred:fido],tense:past,pred:chase(subj,obj),subj:[pred:mouse,def:neg]]

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

rule(node6,[(node5,unifies(up,down))])


rule(node8,[(node7,unifies(up(subj),down)),(node9,unifies(up,down))])

rule(node7,[(node3,unifies(up,down))])

yes| ?- learn([sw,d,o,g,ch,aa,s,t,f,ii,d,oo],[pred:chase(subj,obj),obj:[pred:fido],tense:past,subj:[def:neg,pred:dog]]).

AGENDA: [edge(0,1,node10,[],[sw],[def:neg],_7615,base)]

AGENDA: []

AGENDA: [edge(1,2,node1,[],[dog],[pred:dog],_8231,base)]

AGENDA:[edge(0,2,node11,[],[(node1(dog),unifies(up,down)),(node10(sw),unifies(up,down))],[pred:dog,def:neg],acq,[(node1,unifies(up,down)),(node10,unifies(up,down))])]

AGENDA: []

AGENDA:[edge(2,3,node9,[],[chaastfiidoo],[obj:[pred:fido],tense:past,pred:chase(subj,obj)],_9320,base)]

AGENDA:[edge(0,3,node12,[],[(node9(chaastfiidoo),unifies(up,down)),(node11((node10(sw),unifies(up,down)),(node1(dog),unifies(up,down))),unifies(up(subj),down))],[obj:[pred:fido],tense:past,pred:chase(subj,obj),subj:[pred:dog,def:neg]],acq,[(node9,unifies(up,down)),(node11,unifies(up(subj),down))])]

AGENDA: []

CHART:edge(0,3,node12,[],[(node9(chaastfiidoo),unifies(up,down)),(node11((node10(sw),unifies(up,down)),(node1(dog),unifies(up,down))),unifies(up(subj),down))],[obj:[pred:fido],tense:past,pred:chase(subj,obj),subj:[pred:dog,def:neg]],acq,[(node9,unifies(up,down)),(node11,unifies(up(subj),down))])

edge(0,3,node12,acq,[(node9(chaastfiidoo),unifies(up,down)),(node11((node10(sw),unifies(up,down)),(node1(dog),unifies(up,down))),unifies(up(subj),down))],[obj:[pred:fido],tense:past,pred:chase(subj,obj),subj:[pred:dog,def:neg]],acq,[(node9,unifies(up,down)),(node11,unifies(up(subj),down))])

edge(2,3,node9,[],[chaastfiidoo],[obj:[pred:fido],tense:past,pred:chase(subj,obj)],_9320,base)

edge(0,2,node12,acq,[(node11((node10(sw),unifies(up,down)),(node1(dog),unifies(up,down))),unifies(up(subj),down))],[subj:[pred:dog,def:neg]],acq,[(node11,unifies(up(subj),down))])


edge(0,2,node11,[],[(node1(dog),unifies(up,down)),(node10(sw),unifies(up,down))],[pred:dog,def:neg],acq,[(node1,unifies(up,down)),(node10,unifies(up,down))])

edge(0,2,node11,acq,[(node1(dog),unifies(up,down)),(node10(sw),unifies(up,down))],[pred:dog,def:neg],acq,[(node1,unifies(up,down)),(node10,unifies(up,down))])

edge(1,2,node1,[],[dog],[pred:dog],_8231,base)

edge(0,1,node11,acq,[(node10(sw),unifies(up,down))],[def:neg],acq,[(node10,unifies(up,down))])

edge(0,1,node10,[],[sw],[def:neg],_7644,base)

C-Structure:

node12((node11((node10(sw),unifies(up,down)),(node1(dog),unifies(up,down))),unifies(up(subj),down)),(node9(chaastfiidoo),unifies(up,down)))

F-Structure:

[obj:[pred:fido],tense:past,pred:chase(subj,obj),subj:[pred:dog,def:neg]]

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

rule(node6,[(node5,unifies(up,down))])

rule(node13,[(node3,unifies(up,down))])

rule(node12,[(node13,unifies(up(subj),down)),(node9,unifies(up,down))])

rule(node13,[(node10,unifies(up,down)),(node1,unifies(up,down))])

yes| ?- full_update.

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

rule(node6,[(node5,unifies(up,down))])

rule(node13,[(node3,unifies(up,down))])

rule(node12,[(node13,unifies(up(subj),down)),(node9,unifies(up,down))])

rule(node13,[(node10,unifies(up,down)),(node1,unifies(up,down))])


CURRENT LEXICON

lex(node5,[dd,sw,k,a,t,ch,aa,s,t,f,ii,d,oo],[subj:[pred:cat,def:pos],obj:[pred:fido],tense:past,pred:chase(subj,obj)])

lex(node3,[sw,m,ow,s],[pred:mouse,def:neg])

lex(node10,[sw],[def:neg])

lex(node1,[d,o,g],[pred:dog])

lex(node9,[ch,aa,s,t,f,ii,d,oo],[obj:[pred:fido],tense:past,pred:chase(subj,obj)])

yes
| ?-
fat-controller%
script done on Fri Apr 7 15:52:04 1995


Appendix E An Example of the Acquisition of a Grammar and Lexicon Enhanced with Semantic Information (to Replace LFG's Well-Formedness Conditions)

The following is a script file of acquisition in the model, showing inputs (utterance, F-Structure) and outputs (grammar, lexicon). This version of the model assumes the enhanced notation described in Chapter 5 for the grammar acquired. The examples of acquisition outlined in 5. 6. 6 are reproduced using the notation described in 5. 7. 2. 2.

Script started on Fri Apr 7 14:33:55 1995
To use QUI, setenv DISPLAY to machine:0.0
~~~~~~~~~~~~~~ ~~~~~~~~~~~
fat-controller% quintus
Quintus Prolog Release 3.1.3 (Sun-4, SunOS 4.1)
Copyright (C) 1993, Quintus Corporation. All rights reserved.
2100 Geng Road, Palo Alto, California U.S.A. (415) 813-3800

| ?- [wf_int_learn_td].

yes| ?- learn([d,o,g],[pred:dog]).

CURRENT GRAMMAR

rule(node2,pred:_8032,[(node1,pred:_8032,unifies(up,down))])

yes| ?- learn([sw,m,ow,s],[def:neg,pred:mouse]).

CURRENT GRAMMAR

rule(node2,pred:_8261,[(node1,pred:_8261,unifies(up,down))])

rule(node4,pred:_8226,[(node3,pred:_8226,unifies(up,down))])

yes| ?- learn([dd,sw,k,a,t,ch,aa,s,t,f,ii,d,oo],[pred:chase(subj,obj),tense:past,obj:[pred:fido],subj:[pred:cat,def:pos]]).

CURRENT GRAMMAR

rule(node2,pred:_9322,[(node1,pred:_9322,unifies(up,down))])

rule(node4,pred:_9287,[(node3,pred:_9287,unifies(up,down))])

rule(node6,pred:_9252,[(node5,pred:_9252,unifies(up,down))])

yes| ?- learn([sw,m,ow,s,ch,aa,s,t,f,ii,d,oo],[obj:[pred:fido],tense:past,pred:chase(subj,obj),subj:[pred:mouse,def:neg]]).


CURRENT GRAMMAR

rule(node2,pred:_12199,[(node1,pred:_12199,unifies(up,down))])

rule(node6,pred:_12164,[(node5,pred:_12164,unifies(up,down))])

rule(node8,pred:_12107,[(node7,pred:_12118,unifies(up(subj),down)),(node9,subj:_12118^pred:_12107,unifies(up,down))])

rule(node7,pred:_12072,[(node3,pred:_12072,unifies(up,down))])

yes| ?- learn([sw,d,o,g,ch,aa,s,t,f,ii,d,oo],[tense:past,subj:[pred:dog,def:neg],obj:[pred:fido],pred:chase(subj,obj)]).

CURRENT GRAMMAR

rule(node2,pred:_13694,[(node1,pred:_13694,unifies(up,down))])

rule(node6,pred:_13659,[(node5,pred:_13659,unifies(up,down))])

rule(node13,pred:_13624,[(node3,pred:_13624,unifies(up,down))])

rule(node12,pred:_13567,[(node13,pred:_13578,unifies(up(subj),down)),(node9,subj:_13578^pred:_13567,unifies(up,down))])

rule(node13,pred:_13521,[(node10,[],unifies(up,down)),(node1,pred:_13521,unifies(up,down))])

yes| ?- full_update.

CURRENT GRAMMAR

rule(node2,pred:_6928,[(node1,pred:_6928,unifies(up,down))])

rule(node6,pred:_6893,[(node5,pred:_6893,unifies(up,down))])

rule(node13,pred:_6858,[(node3,pred:_6858,unifies(up,down))])

rule(node12,pred:_6801,[(node13,pred:_6812,unifies(up(subj),down)),(node9,subj:_6812^pred:_6801,unifies(up,down))])

rule(node13,pred:_6755,[(node10,[],unifies(up,down)),(node1,pred:_6755,unifies(up,down))])


CURRENT LEXICON

lex(node5,pred:chase(cat,fido),[dd,sw,k,a,t,ch,aa,s,t,f,ii,d,oo],[pred:chase(subj,obj),tense:past,obj:[pred:fido],subj:[pred:cat,def:pos]])

lex(node3,pred:mouse,[sw,m,ow,s],[def:neg,pred:mouse])

lex(node10,[],[sw],[def:neg])

lex(node1,pred:dog,[d,o,g],[pred:dog])

lex(node9,subj:_6998^pred:chase(_6998,fido),[ch,aa,s,t,f,ii,d,oo],[obj:[pred:fido],tense:past,pred:chase(subj,obj)])

yes
| ?-
fat-controller%
script done on Fri Apr 7 14:41:38 1995


Appendix F An Example of the Acquisition of a Grammar and Lexicon Enhanced with Semantic Information, Illustrating Data Structures Used in Parsing

The following is a script file of acquisition in the model, showing, in addition to inputs (utterance, F-Structure) and outputs (grammar, lexicon) from acquisition, the Agenda during parsing and the Chart at the end of parsing, and the C-Structure and F-Structure outputs from parsing the utterance. This version of the model assumes the enhanced notation described in Chapter 5 for the grammar acquired. The examples of acquisition outlined in 5. 6. 6 are reproduced using the notation described in 5. 7. 2. 2.

Script started on Fri Apr 7 14:33:55 1995
To use QUI, setenv DISPLAY to machine:0.0
~~~~~~~~~~~~~~ ~~~~~~~~~~~
fat-controller% quintus
Quintus Prolog Release 3.1.3 (Sun-4, SunOS 4.1)
Copyright (C) 1993, Quintus Corporation. All rights reserved.
2100 Geng Road, Palo Alto, California U.S.A. (415) 813-3800

| ?- [wf_int_learn_td].

yes| ?- learn([d,o,g],[pred:dog]).

AGENDA: [edge(0,1,node1,pred:dog,[],[dog],[pred:dog],_6982,base)]

AGENDA:[edge(0,1,node2,pred:dog,[],[(node1(dog),pred:dog,unifies(up,down))],[pred:dog],acq,[(node1,pred:dog,unifies(up,down))])]

AGENDA: []

CHART:edge(0,1,node2,pred:dog,[],[(node1(dog),pred:dog,unifies(up,down))],[pred:dog],acq,[(node1,pred:dog,unifies(up,down))])

edge(0,1,node2,pred:dog,acq,[(node1(dog),pred:dog,unifies(up,down))],[pred:dog],acq,[(node1,pred:dog,unifies(up,down))])

edge(0,1,node1,pred:dog,[],[dog],[pred:dog],_7028,base)

C-Structure

node2((node1(dog),pred:dog,unifies(up,down)))

F-Structure

[pred:dog]


CURRENT GRAMMAR

rule(node2,pred:_8032,[(node1,pred:_8032,unifies(up,down))])

yes| ?- learn([sw,m,ow,s],[def:neg,pred:mouse]).

AGENDA:[edge(0,1,node3,pred:mouse,[],[swmows],[def:neg,pred:mouse],_7075,base)]

AGENDA:[edge(0,1,node4,pred:mouse,[],[(node3(swmows),pred:mouse,unifies(up,down))],[def:neg,pred:mouse],acq,[(node3,pred:mouse,unifies(up,down))])]

AGENDA: []

CHART:edge(0,1,node4,pred:mouse,[],[(node3(swmows),pred:mouse,unifies(up,down))],[def:neg,pred:mouse],acq,[(node3,pred:mouse,unifies(up,down))])

edge(0,1,node4,pred:mouse,acq,[(node3(swmows),pred:mouse,unifies(up,down))],[def:neg,pred:mouse],acq,[(node3,pred:mouse,unifies(up,down))])

edge(0,1,node3,pred:mouse,[],[swmows],[def:neg,pred:mouse],_7137,base)

C-Structure

node4((node3(swmows),pred:mouse,unifies(up,down)))

F-Structure

[def:neg,pred:mouse]

CURRENT GRAMMAR

rule(node2,pred:_8261,[(node1,pred:_8261,unifies(up,down))])

rule(node4,pred:_8226,[(node3,pred:_8226,unifies(up,down))])

yes| ?- learn([dd,sw,k,a,t,ch,aa,s,t,f,ii,d,oo],[pred:chase(subj,obj),tense:past,obj:[pred:fido],subj:[pred:cat,def:pos]]).

AGENDA:[edge(0,1,node5,pred:chase(cat,fido),[],[ddswkatchaastfiidoo],[pred:chase(subj,obj),tense:past,obj:[pred:fido],subj:[pred:cat,def:pos]],_7584,base)]


AGENDA: [edge(0,1,node6,pred:chase(cat,fido),[],[(node5(ddswkatchaastfiidoo),pred:chase(cat,fido),unifies(up,down))],[pred:chase(subj,obj),tense:past,obj:[pred:fido],subj:[pred:cat,def:pos]],acq,[(node5,pred:chase(cat,fido),unifies(up,down))])]

AGENDA: []

CHART:edge(0,1,node6,pred:chase(cat,fido),[],[(node5(ddswkatchaastfiidoo),pred:chase(cat,fido),unifies(up,down))],[pred:chase(subj,obj),tense:past,obj:[pred:fido],subj:[pred:cat,def:pos]],acq,[(node5,pred:chase(cat,fido),unifies(up,down))])

edge(0,1,node6,pred:chase(cat,fido),acq,[(node5(ddswkatchaastfiidoo),pred:chase(cat,fido),unifies(up,down))],[pred:chase(subj,obj),tense:past,obj:[pred:fido],subj:[pred:cat,def:pos]],acq,[(node5,pred:chase(cat,fido),unifies(up,down))])

edge(0,1,node5,pred:chase(cat,fido),[],[ddswkatchaastfiidoo],[pred:chase(subj,obj),tense:past,obj:[pred:fido],subj:[pred:cat,def:pos]],_7726,base)

C-Structure

node6((node5(ddswkatchaastfiidoo),pred:chase(cat,fido),unifies(up,down)))

F-Structure

[pred:chase(subj,obj),tense:past,obj:[pred:fido],subj:[pred:cat,def:pos]]

CURRENT GRAMMAR

rule(node2,pred:_9322,[(node1,pred:_9322,unifies(up,down))])

rule(node4,pred:_9287,[(node3,pred:_9287,unifies(up,down))])

rule(node6,pred:_9252,[(node5,pred:_9252,unifies(up,down))])

yes| ?- learn([sw,m,ow,s,ch,aa,s,t,f,ii,d,oo],[obj:[pred:fido],tense:past,pred:chase(subj,obj),subj:[pred:mouse,def:neg]]).

AGENDA:[edge(0,1,node3,pred:mouse,[],[swmows],[def:neg,pred:mouse],_7509,base)]

AGENDA:[edge(0,1,node7,pred:mouse,[],[(node3(swmows),pred:mouse,unifies(up,down))],[def:neg,pred:mouse],acq,[(node3,pred:mouse,unifies(up,down))])]

AGENDA: []


AGENDA:[edge(1,2,node9,subj:_8817^pred:chase(_8817,fido),[],[chaastfiidoo],[obj:[pred:fido],tense:past,pred:chase(subj,obj)],_9052,base)]

AGENDA: [edge(0,2,node8,pred:chase(mouse,fido),[],[(node9(chaastfiidoo),subj:mouse^pred:chase(mouse,fido),unifies(up,down)),(node7((node3(swmows),pred:mouse,unifies(up,down))),pred:mouse,unifies(up(subj),down))],[obj:[pred:fido],tense:past,pred:chase(subj,obj),subj:[def:neg,pred:mouse]],acq,[(node9,subj:mouse^pred:chase(mouse,fido),unifies(up,down)),(node7,pred:mouse,unifies(up(subj),down))])]

AGENDA: []

CHART:edge(0,2,node8,pred:chase(mouse,fido),[],[(node9(chaastfiidoo),subj:mouse^pred:chase(mouse,fido),unifies(up,down)),(node7((node3(swmows),pred:mouse,unifies(up,down))),pred:mouse,unifies(up(subj),down))],[obj:[pred:fido],tense:past,pred:chase(subj,obj),subj:[def:neg,pred:mouse]],acq,[(node9,subj:mouse^pred:chase(mouse,fido),unifies(up,down)),(node7,pred:mouse,unifies(up(subj),down))])

edge(0,2,node8,pred:chase(mouse,fido),acq,[(node9(chaastfiidoo),subj:mouse^pred:chase(mouse,fido),unifies(up,down)),(node7((node3(swmows),pred:mouse,unifies(up,down))),pred:mouse,unifies(up(subj),down))],[obj:[pred:fido],tense:past,pred:chase(subj,obj),subj:[def:neg,pred:mouse]],acq,[(node9,subj:mouse^pred:chase(mouse,fido),unifies(up,down)),(node7,pred:mouse,unifies(up(subj),down))])

edge(1,2,node9,subj:_8817^pred:chase(_8817,fido),[],[chaastfiidoo],[obj:[pred:fido],tense:past,pred:chase(subj,obj)],_9178,base)

edge(0,1,node8,_8471,acq,[(node7((node3(swmows),pred:mouse,unifies(up,down))),pred:mouse,unifies(up(subj),down))],[subj:[def:neg,pred:mouse]],acq,[(node7,pred:mouse,unifies(up(subj),down))])

edge(0,1,node7,pred:mouse,[],[(node3(swmows),pred:mouse,unifies(up,down))],[def:neg,pred:mouse],acq,[(node3,pred:mouse,unifies(up,down))])

edge(0,1,node7,pred:mouse,acq,[(node3(swmows),pred:mouse,unifies(up,down))],[def:neg,pred:mouse],acq,[(node3,pred:mouse,unifies(up,down))])

edge(0,1,node3,pred:mouse,[],[swmows],[def:neg,pred:mouse],_7509,base)

C-Structure

node8((node7((node3(swmows),pred:mouse,unifies(up,down))),pred:mouse,unifies(up(subj),down)),(node9(chaastfiidoo),subj:mouse^pred:chase(mouse,fido),unifies(up,down)))


F-Structure

[obj:[pred:fido],tense:past,pred:chase(subj,obj),subj:[def:neg,pred:mouse]]

CURRENT GRAMMAR

rule(node2,pred:_12199,[(node1,pred:_12199,unifies(up,down))])

rule(node6,pred:_12164,[(node5,pred:_12164,unifies(up,down))])

rule(node8,pred:_12107,[(node7,pred:_12118,unifies(up(subj),down)),(node9,subj:_12118^pred:_12107,unifies(up,down))])

rule(node7,pred:_12072,[(node3,pred:_12072,unifies(up,down))])

yes| ?- learn([sw,d,o,g,ch,aa,s,t,f,ii,d,oo],[tense:past,subj:[pred:dog,def:neg],obj:[pred:fido],pred:chase(subj,obj)]).

AGENDA: [edge(0,1,node10,[],[],[sw],[def:neg],_7632,base)]

AGENDA: []

AGENDA: [edge(1,2,node1,pred:dog,[],[dog],[pred:dog],_8318,base)]

AGENDA: [edge(0,2,node11,pred:dog,[],[(node1(dog),pred:dog,unifies(up,down)),(node10(sw),[],unifies(up,down))],[pred:dog,def:neg],acq,[(node1,pred:dog,unifies(up,down)),(node10,[],unifies(up,down))])]

AGENDA: []

AGENDA:[edge(2,3,node9,subj:_9688^pred:chase(_9688,fido),[],[chaastfiidoo],[obj:[pred:fido],tense:past,pred:chase(subj,obj)],_9681,base)]

AGENDA: [edge(0,3,node12,pred:chase(dog,fido),[],[(node9(chaastfiidoo),subj:dog^pred:chase(dog,fido),unifies(up,down)),(node11((node10(sw),[],unifies(up,down)),(node1(dog),pred:dog,unifies(up,down))),pred:dog,unifies(up(subj),down))],[obj:[pred:fido],tense:past,pred:chase(subj,obj),subj:[pred:dog,def:neg]],acq,[(node9,subj:dog^pred:chase(dog,fido),unifies(up,down)),(node11,pred:dog,unifies(up(subj),down))])]

AGENDA: []


CHART:edge(0,3,node12,pred:chase(dog,fido),[],[(node9(chaastfiidoo),subj:dog^pred:chase(dog,fido),unifies(up,down)),(node11((node10(sw),[],unifies(up,down)),(node1(dog),pred:dog,unifies(up,down))),pred:dog,unifies(up(subj),down))],[obj:[pred:fido],tense:past,pred:chase(subj,obj),subj:[pred:dog,def:neg]],acq,[(node9,subj:dog^pred:chase(dog,fido),unifies(up,down)),(node11,pred:dog,unifies(up(subj),down))])

edge(0,3,node12,pred:chase(dog,fido),acq,[(node9(chaastfiidoo),subj:dog^pred:chase(dog,fido),unifies(up,down)),(node11((node10(sw),[],unifies(up,down)),(node1(dog),pred:dog,unifies(up,down))),pred:dog,unifies(up(subj),down))],[obj:[pred:fido],tense:past,pred:chase(subj,obj),subj:[pred:dog,def:neg]],acq,[(node9,subj:dog^pred:chase(dog,fido),unifies(up,down)),(node11,pred:dog,unifies(up(subj),down))])

edge(2,3,node9,subj:_9688^pred:chase(_9688,fido),[],[chaastfiidoo],[obj:[pred:fido],tense:past,pred:chase(subj,obj)],_9681,base)

edge(0,2,node12,_9386,acq,[(node11((node10(sw),[],unifies(up,down)),(node1(dog),pred:dog,unifies(up,down))),pred:dog,unifies(up(subj),down))],[subj:[pred:dog,def:neg]],acq,[(node11,pred:dog,unifies(up(subj),down))])

edge(0,2,node11,pred:dog,[],[(node1(dog),pred:dog,unifies(up,down)),(node10(sw),[],unifies(up,down))],[pred:dog,def:neg],acq,[(node1,pred:dog,unifies(up,down)),(node10,[],unifies(up,down))])

edge(0,2,node11,pred:dog,acq,[(node1(dog),pred:dog,unifies(up,down)),(node10(sw),[],unifies(up,down))],[pred:dog,def:neg],acq,[(node1,pred:dog,unifies(up,down)),(node10,[],unifies(up,down))])

edge(1,2,node1,pred:dog,[],[dog],[pred:dog],_8318,base)

edge(0,1,node11,_7896,acq,[(node10(sw),[],unifies(up,down))],[def:neg],acq,[(node10,[],unifies(up,down))])

edge(0,1,node10,[],[],[sw],[def:neg],_7662,base)

C-Structure

node12((node11((node10(sw),[],unifies(up,down)),(node1(dog),pred:dog,unifies(up,down))),pred:dog,unifies(up(subj),down)),(node9(chaastfiidoo),subj:dog^pred:chase(dog,fido),unifies(up,down)))

F-Structure

[obj:[pred:fido],tense:past,pred:chase(subj,obj),subj:[pred:dog,def:neg]]

CURRENT GRAMMAR

rule(node2,pred:_13694,[(node1,pred:_13694,unifies(up,down))])

rule(node6,pred:_13659,[(node5,pred:_13659,unifies(up,down))])

rule(node13,pred:_13624,[(node3,pred:_13624,unifies(up,down))])

rule(node12,pred:_13567,[(node13,pred:_13578,unifies(up(subj),down)),(node9,subj:_13578^pred:_13567,unifies(up,down))])

rule(node13,pred:_13521,[(node10,[],unifies(up,down)),(node1,pred:_13521,unifies(up,down))])

yes| ?- full_update.

CURRENT GRAMMAR

rule(node2,pred:_6928,[(node1,pred:_6928,unifies(up,down))])

rule(node6,pred:_6893,[(node5,pred:_6893,unifies(up,down))])

rule(node13,pred:_6858,[(node3,pred:_6858,unifies(up,down))])

rule(node12,pred:_6801,[(node13,pred:_6812,unifies(up(subj),down)),(node9,subj:_6812^pred:_6801,unifies(up,down))])

rule(node13,pred:_6755,[(node10,[],unifies(up,down)),(node1,pred:_6755,unifies(up,down))])

CURRENT LEXICON

lex(node5,pred:chase(cat,fido),[dd,sw,k,a,t,ch,aa,s,t,f,ii,d,oo],[pred:chase(subj,obj),tense:past,obj:[pred:fido],subj:[pred:cat,def:pos]])

lex(node3,pred:mouse,[sw,m,ow,s],[def:neg,pred:mouse])

lex(node10,[],[sw],[def:neg])

lex(node1,pred:dog,[d,o,g],[pred:dog])

lex(node9,subj:_6998^pred:chase(_6998,fido),[ch,aa,s,t,f,ii,d,oo],[obj:[pred:fido],tense:past,pred:chase(subj,obj)])

yes

| ?-fat-controller%

script done on Fri Apr 7 14:41:38 1995

Appendix G An Example of the Acquisition of Recursive Constructions

The following is a script file of the acquisition of a recursive rule in the model, showing inputs (utterance, F-Structure) and outputs (grammar, lexicon). The purpose of this appendix is to support the examples of acquisition discussed in 5. 6. 8. 1.
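
Before the script itself, the following sketch is a gloss added for exposition and is not part of the thesis code: it restates the two rules acquired for node15 in the final grammar as an ordinary DCG over the phoneme strings stored in the lexicon, using hypothetical predicate names (s15, n20, n21) and orthographic glosses taken from the corresponding lexical entries.

s15 --> n20.                             % rule(node15,[(node20,...)]): a simple clause
s15 --> n21, s15.                        % rule(node15,[(node21,...),(node15,up(comp))]): recursive
n21 --> [e,m,sw,s,e,z,dd,a,t].           % lex(node21): "Emma says that"
n21 --> [k,l,oo,ee,th,i,n,k,s,dd,a,t].   % lex(node21): "Chloe thinks that"
n20 --> [m,u,m,ee,z,i,n,dd,sw,g,ah,d,n].              % lex(node20): "Mummy's in the garden"
n20 --> [d,a,d,ee,z,p,aa,n,t,i,ng,dd,sw,k,i,ch,i,n].  % lex(node20): "Daddy's painting the kitchen"

A recognition query such as ?- phrase(s15, [e,m,sw,s,e,z,dd,a,t,k,l,oo,ee,th,i,n,k,s,dd,a,t,m,u,m,ee,z,i,n,dd,sw,g,ah,d,n]). then succeeds: the recursion through node15 licenses the doubly-embedded final input, and any depth of further embedding, from a finite set of rules.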

Script started on Thu May 11 16:31:27 1995
To use QUI, setenv DISPLAY to machine:0.0
chloe% quintus
Quintus Prolog Release 3.1.3 (Sun-4, SunOS 4.1)
Copyright (C) 1993, Quintus Corporation. All rights reserved.
2100 Geng Road, Palo Alto, California U.S.A. (415) 813-3800

| ?- [learn_td].

yes| ?- learn([k,l,oo,ee,th,i,n,k,s,dd,a,t,m,u,m,ee,z,i,n,dd,sw,g,ah,d,n],[subj:[pred:chloe],pred:think(subj,comp),comp:[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]]]).

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

yes| ?- learn([m,u,m,ee,z,i,n,dd,sw,g,ah,d,n],[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]]).

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

rule(node4,[(node3,unifies(up,down))])

yes| ?- learn([e,m,sw,s,e,z,dd,a,t,d,a,d,ee,z,p,aa,n,t,i,ng,dd,sw,k,i,ch,i,n],[subj:[pred:emma],pred:say(subj,comp),comp:[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]]]).

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

rule(node4,[(node3,unifies(up,down))])

rule(node6,[(node5,unifies(up,down))])

yes

| ?- learn([d,a,d,ee,z,p,aa,n,t,i,ng,dd,sw,k,i,ch,i,n],[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]]).

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

rule(node4,[(node3,unifies(up,down))])

rule(node6,[(node5,unifies(up,down))])

rule(node8,[(node7,unifies(up,down))])

yes| ?- learn([k,l,oo,ee,th,i,n,k,s,dd,a,t,d,a,d,ee,z,p,aa,n,t,i,ng,dd,sw,k,i,ch,i,n],[pred:think(subj,comp),subj:[pred:chloe],comp:[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]]]).

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

rule(node4,[(node3,unifies(up,down))])

rule(node6,[(node5,unifies(up,down))])

rule(node10,[(node9,unifies(up,down)),(node11,unifies(up(comp),down))])

rule(node11,[(node7,unifies(up,down))])

yes| ?- learn([e,m,sw,s,e,z,dd,a,t,m,u,m,ee,z,i,n,dd,sw,g,ah,d,n],[pred:say(subj,comp),subj:[pred:emma],comp:[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]]]).

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

rule(node6,[(node5,unifies(up,down))])

rule(node10,[(node9,unifies(up,down)),(node11,unifies(up(comp),down))])

rule(node11,[(node7,unifies(up,down))])

rule(node13,[(node12,unifies(up,down)),(node14,unifies(up(comp),down))])

rule(node14,[(node3,unifies(up,down))])

yes| ?- learn([e,m,sw,s,e,z,dd,a,t,k,l,oo,ee,th,i,n,k,s,dd,a,t,m,u,m,ee,z,i,n,dd,sw,g,ah,d,n],[subj:[pred:emma],pred:say(subj,comp),comp:[subj:[pred:chloe],pred:think(subj,comp),comp:[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]]]]).

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

rule(node6,[(node5,unifies(up,down))])

rule(node15,[(node20,unifies(up,down))])

rule(node15,[(node21,unifies(up,down)),(node15,unifies(up(comp),down))])

yes| ?- full_update.

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

rule(node6,[(node5,unifies(up,down))])

rule(node15,[(node20,unifies(up,down))])

rule(node15,[(node21,unifies(up,down)),(node15,unifies(up(comp),down))])

CURRENT LEXICON

lex(node20,[m,u,m,ee,z,i,n,dd,sw,g,ah,d,n],[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]])

lex(node20,[d,a,d,ee,z,p,aa,n,t,i,ng,dd,sw,k,i,ch,i,n],[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]])

lex(node21,[e,m,sw,s,e,z,dd,a,t],[pred:say(subj,comp),subj:[pred:emma]])

lex(node21,[k,l,oo,ee,th,i,n,k,s,dd,a,t],[pred:think(subj,comp),subj:[pred:chloe]])

yes| ?-chloe%

script done on Thu May 11 16:41:17 1995

Appendix H An Example of the Acquisition of Recursive Constructions, Illustrating Data Structures Used in Parsing

The following is a script file of the acquisition of a recursive rule in the model, showing, in addition to inputs (utterance, F-Structure) and outputs (grammar, lexicon) from acquisition, the Agenda during parsing and the Chart at the end of parsing, and the C-Structure and F-Structure outputs from parsing the utterance. The purpose of this appendix is to support the examples of acquisition discussed in 5. 6. 8. 1.
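
Before the script itself, the following gloss is added for exposition and is not part of the thesis code. On the reading used here, an Agenda or Chart entry of the form edge(From, To, Node, _, Spanned, FStructure, ...) records that the material between string positions From and To has been analysed as a constituent labelled Node, where Spanned is either an unanalysed phoneme string (a lexical edge) or a list of annotated daughter constituents (a phrasal edge), and FStructure is the functional structure associated with the span. The remaining arguments, printed in the trace as [], acq, base or an extracted rule body, record how the edge was built (base for edges taken directly from the input or lexicon, acq for edges proposing structure still to be acquired) and are left uninterpreted in the hypothetical helper below.

% Hypothetical helper (not from the thesis code) summarising the first six
% argument positions of an edge term, whatever its arity.
describe_edge(Edge) :-
    Edge =.. [edge, From, To, Node, _, Spanned, FStructure | _],
    format("~w spans vertices ~w-~w~n  covering: ~w~n  F-Structure: ~w~n",
           [Node, From, To, Spanned, FStructure]).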

Script started on Thu May 11 16:31:27 1995
To use QUI, setenv DISPLAY to machine:0.0
chloe% quintus
Quintus Prolog Release 3.1.3 (Sun-4, SunOS 4.1)
Copyright (C) 1993, Quintus Corporation. All rights reserved.
2100 Geng Road, Palo Alto, California U.S.A. (415) 813-3800

| ?- [learn_td].

yes| ?- learn([k,l,oo,ee,th,i,n,k,s,dd,a,t,m,u,m,ee,z,i,n,dd,sw,g,ah,d,n],[subj:[pred:chloe],pred:think(subj,comp),comp:[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]]]).

AGENDA:[edge(0,1,node1,[],[klooeethinksddatmumeezinddswgahdn],[subj:[pred:chloe],pred:think(subj,comp),comp:[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]]],_7873,base)]

AGENDA:[edge(0,1,node2,[],[(node1(klooeethinksddatmumeezinddswgahdn),unifies(up,down))],[subj:[pred:chloe],pred:think(subj,comp),comp:[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]]],acq,[(node1,unifies(up,down))])]

AGENDA: []

CHART:edge(0,1,node2,[],[(node1(klooeethinksddatmumeezinddswgahdn),unifies(up,down))],[subj:[pred:chloe],pred:think(subj,comp),comp:[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]]],acq,[(node1,unifies(up,down))])

edge(0,1,node2,acq,[(node1(klooeethinksddatmumeezinddswgahdn),unifies(up,down))],[subj:[pred:chloe],pred:think(subj,comp),comp:[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]]],acq,[(node1,unifies(up,down))])

edge(0,1,node1,[],[klooeethinksddatmumeezinddswgahdn],[subj:[pred:chloe],pred:think(subj,comp),comp:[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]]],_8046,base)

C-Structure:

node2((node1(klooeethinksddatmumeezinddswgahdn),unifies(up,down)))

F-Structure:

[subj:[pred:chloe],pred:think(subj,comp),comp:[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]]]

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

yes| ?- learn([m,u,m,ee,z,i,n,dd,sw,g,ah,d,n],[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]]).

AGENDA:[edge(0,1,node3,[],[mumeezinddswgahdn],[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],_7678,base)]

AGENDA: [edge(0,1,node4,[],[(node3(mumeezinddswgahdn),unifies(up,down))],[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],acq,[(node3,unifies(up,down))])]

AGENDA: []

CHART:edge(0,1,node4,[],[(node3(mumeezinddswgahdn),unifies(up,down))],[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],acq,[(node3,unifies(up,down))])

edge(0,1,node4,acq,[(node3(mumeezinddswgahdn),unifies(up,down))],[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],acq,[(node3,unifies(up,down))])

edge(0,1,node3,[],[mumeezinddswgahdn],[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],_7787,base)

C-Structure:

node4((node3(mumeezinddswgahdn),unifies(up,down)))

F-Structure:

[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]]

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

rule(node4,[(node3,unifies(up,down))])

yes| ?- learn([e,m,sw,s,e,z,dd,a,t,d,a,d,ee,z,p,aa,n,t,i,ng,dd,sw,k,i,ch,i,n],[subj:[pred:emma],pred:say(subj,comp),comp:[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]]]).

AGENDA:[edge(0,1,node5,[],[emswsezddatdadeezpaantingddswkichin],[subj:[pred:emma],pred:say(subj,comp),comp:[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]]],_7878,base)]

AGENDA: [edge(0,1,node6,[],[(node5(emswsezddatdadeezpaantingddswkichin),unifies(up,down))],[subj:[pred:emma],pred:say(subj,comp),comp:[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]]],acq,[(node5,unifies(up,down))])]

AGENDA: []

CHART:edge(0,1,node6,[],[(node5(emswsezddatdadeezpaantingddswkichin),unifies(up,down))],[subj:[pred:emma],pred:say(subj,comp),comp:[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]]],acq,[(node5,unifies(up,down))])

edge(0,1,node6,acq,[(node5(emswsezddatdadeezpaantingddswkichin),unifies(up,down))],[subj:[pred:emma],pred:say(subj,comp),comp:[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]]],acq,[(node5,unifies(up,down))])

edge(0,1,node5,[],[emswsezddatdadeezpaantingddswkichin],[subj:[pred:emma],pred:say(subj,comp),comp:[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]]],_8035,base)

C-Structure:

node6((node5(emswsezddatdadeezpaantingddswkichin),unifies(up,down)))

F-Structure:

[subj:[pred:emma],pred:say(subj,comp),comp:[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]]]

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

rule(node4,[(node3,unifies(up,down))])

rule(node6,[(node5,unifies(up,down))])

yes| ?- learn([d,a,d,ee,z,p,aa,n,t,i,ng,dd,sw,k,i,ch,i,n],[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]]).

AGENDA:[edge(0,1,node7,[],[dadeezpaantingddswkichin],[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]],_7702,base)]

AGENDA: [edge(0,1,node8,[],[(node7(dadeezpaantingddswkichin),unifies(up,down))],[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]],acq,[(node7,unifies(up,down))])]

AGENDA: []

CHART:edge(0,1,node8,[],[(node7(dadeezpaantingddswkichin),unifies(up,down))],[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]],acq,[(node7,unifies(up,down))])

edge(0,1,node8,acq,[(node7(dadeezpaantingddswkichin),unifies(up,down))],[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]],acq,[(node7,unifies(up,down))])

edge(0,1,node7,[],[dadeezpaantingddswkichin],[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]],_7795,base)

C-Structure:

node8((node7(dadeezpaantingddswkichin),unifies(up,down)))

F-Structure:

[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]]

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

rule(node4,[(node3,unifies(up,down))])

rule(node6,[(node5,unifies(up,down))])

rule(node8,[(node7,unifies(up,down))])

yes| ?- learn([k,l,oo,ee,th,i,n,k,s,dd,a,t,d,a,d,ee,z,p,aa,n,t,i,ng,dd,sw,k,i,ch,i,n],[pred:think(subj,comp),subj:[pred:chloe],comp:[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]]]).

AGENDA:[edge(0,1,node9,[],[klooeethinksddat],[pred:think(subj,comp),subj:[pred:chloe]],_8435,base)]

AGENDA: []

AGENDA:[edge(1,2,node7,[],[dadeezpaantingddswkichin],[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]],_9204,base)]

AGENDA: [edge(1,2,node11,[],[(node7(dadeezpaantingddswkichin),unifies(up,down))],[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]],acq,[(node7,unifies(up,down))])]

AGENDA:[edge(0,2,node10,[],[(node11((node7(dadeezpaantingddswkichin),unifies(up,down))),unifies(up(comp),down)),(node9(klooeethinksddat),unifies(up,down))],[comp:[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]],pred:think(subj,comp),subj:[pred:chloe]],acq,[(node11,unifies(up(comp),down)),(node9,unifies(up,down))])]

AGENDA: []

CHART:edge(0,2,node10,[],[(node11((node7(dadeezpaantingddswkichin),unifies(up,down))),unifies(up(comp),down)),(node9(klooeethinksddat),unifies(up,down))],[comp:[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]],pred:think(subj,comp),subj:[pred:chloe]],acq,[(node11,unifies(up(comp),down)),(node9,unifies(up,down))])

edge(0,2,node10,acq,[(node11((node7(dadeezpaantingddswkichin),unifies(up,down))),unifies(up(comp),down)),(node9(klooeethinksddat),unifies(up,down))],[comp:[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]],pred:think(subj,comp),subj:[pred:chloe]],acq,[(node11,unifies(up(comp),down)),(node9,unifies(up,down))])

edge(1,2,node11,[],[(node7(dadeezpaantingddswkichin),unifies(up,down))],[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]],acq,[(node7,unifies(up,down))])

edge(1,2,node11,acq,[(node7(dadeezpaantingddswkichin),unifies(up,down))],[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]],acq,[(node7,unifies(up,down))])

edge(1,2,node7,[],[dadeezpaantingddswkichin],[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]],_9204,base)

edge(0,1,node10,acq,[(node9(klooeethinksddat),unifies(up,down))],[pred:think(subj,comp),subj:[pred:chloe]],acq,[(node9,unifies(up,down))])

edge(0,1,node9,[],[klooeethinksddat],[pred:think(subj,comp),subj:[pred:chloe]],_8496,base)

C-Structure:

node10((node9(klooeethinksddat),unifies(up,down)),(node11((node7(dadeezpaantingddswkichin),unifies(up,down))),unifies(up(comp),down)))

F-Structure:

[comp:[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]],pred:think(subj,comp),subj:[pred:chloe]]

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

rule(node4,[(node3,unifies(up,down))])

rule(node6,[(node5,unifies(up,down))])

rule(node10,[(node9,unifies(up,down)),(node11,unifies(up(comp),down))])

rule(node11,[(node7,unifies(up,down))])

yes| ?- learn([e,m,sw,s,e,z,dd,a,t,m,u,m,ee,z,i,n,dd,sw,g,ah,d,n],[pred:say(subj,comp),subj:[pred:emma],comp:[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]]]).

AGENDA:[edge(0,1,node12,[],[emswsezddat],[pred:say(subj,comp),subj:[pred:emma]],_8376,base)]

AGENDA: []

AGENDA:[edge(1,2,node3,[],[mumeezinddswgahdn],[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],_9112,base)]

AGENDA: [edge(1,2,node14,[],[(node3(mumeezinddswgahdn),unifies(up,down))],[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],acq,[(node3,unifies(up,down))])]

AGENDA:[edge(0,2,node13,[],[(node14((node3(mumeezinddswgahdn),unifies(up,down))),unifies(up(comp),down)),(node12(emswsezddat),unifies(up,down))],[comp:[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],pred:say(subj,comp),subj:[pred:emma]],acq,[(node14,unifies(up(comp),down)),(node12,unifies(up,down))])]

AGENDA: []

CHART:edge(0,2,node13,[],[(node14((node3(mumeezinddswgahdn),unifies(up,down))),unifies(up(comp),down)),(node12(emswsezddat),unifies(up,down))],[comp:[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],pred:say(subj,comp),subj:[pred:emma]],acq,[(node14,unifies(up(comp),down)),(node12,unifies(up,down))])

edge(0,2,node13,acq,[(node14((node3(mumeezinddswgahdn),unifies(up,down))),unifies(up(comp),down)),(node12(emswsezddat),unifies(up,down))],[comp:[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],pred:say(subj,comp),subj:[pred:emma]],acq,[(node14,unifies(up(comp),down)),(node12,unifies(up,down))])

edge(1,2,node14,[],[(node3(mumeezinddswgahdn),unifies(up,down))],[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],acq,[(node3,unifies(up,down))])

edge(1,2,node14,acq,[(node3(mumeezinddswgahdn),unifies(up,down))],[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],acq,[(node3,unifies(up,down))])

edge(1,2,node3,[],[mumeezinddswgahdn],[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],_9112,base)

edge(0,1,node13,acq,[(node12(emswsezddat),unifies(up,down))],[pred:say(subj,comp),subj:[pred:emma]],acq,[(node12,unifies(up,down))])

edge(0,1,node12,[],[emswsezddat],[pred:say(subj,comp),subj:[pred:emma]],_8437,base)

C-Structure:

node13((node12(emswsezddat),unifies(up,down)),(node14((node3(mumeezinddswgahdn),unifies(up,down))),unifies(up(comp),down)))

F-Structure:

[comp:[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],pred:say(subj,comp),subj:[pred:emma]]

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

rule(node6,[(node5,unifies(up,down))])

rule(node10,[(node9,unifies(up,down)),(node11,unifies(up(comp),down))])

rule(node11,[(node7,unifies(up,down))])

rule(node13,[(node12,unifies(up,down)),(node14,unifies(up(comp),down))])

rule(node14,[(node3,unifies(up,down))])

yes| ?- learn([e,m,sw,s,e,z,dd,a,t,k,l,oo,ee,th,i,n,k,s,dd,a,t,m,u,m,ee,z,i,n,dd,sw,g,ah,d,n],[subj:[pred:emma],pred:say(subj,comp),comp:[subj:[pred:chloe],pred:think(subj,comp),comp:[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]]]]).

AGENDA:[edge(0,1,node12,[],[emswsezddat],[pred:say(subj,comp),subj:[pred:emma]],_8207,base)]

AGENDA: []

AGENDA:[edge(1,2,node9,[],[klooeethinksddat],[pred:think(subj,comp),subj:[pred:chloe]],_8965,base)]

AGENDA: []

AGENDA:[edge(2,3,node3,[],[mumeezinddswgahdn],[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],_9718,base)]

AGENDA: [edge(2,3,node17,[],[(node3(mumeezinddswgahdn),unifies(up,down))],[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],acq,[(node3,unifies(up,down))])]

AGENDA:[edge(1,3,node16,[],[(node17((node3(mumeezinddswgahdn),unifies(up,down))),unifies(up(comp),down)),(node9(klooeethinksddat),unifies(up,down))],[comp:[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],pred:think(subj,comp),subj:[pred:chloe]],acq,[(node17,unifies(up(comp),down)),(node9,unifies(up,down))])]

AGENDA: [edge(0,3,node15,[],[(node16((node9(klooeethinksddat),unifies(up,down)),(node17((node3(mumeezinddswgahdn),unifies(up,down))),unifies(up(comp),down))),unifies(up(comp),down)),(node12(emswsezddat),unifies(up,down))],[comp:[comp:[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],pred:think(subj,comp),subj:[pred:chloe]],pred:say(subj,comp),subj:[pred:emma]],acq,[(node16,unifies(up(comp),down)),(node12,unifies(up,down))])]

AGENDA: []

CHART:edge(0,3,node15,[],[(node16((node9(klooeethinksddat),unifies(up,down)),(node17((node3(mumeezinddswgahdn),unifies(up,down))),unifies(up(comp),down))),unifies(up(comp),down)),(node12(emswsezddat),unifies(up,down))],[comp:[comp:[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],pred:think(subj,comp),subj:[pred:chloe]],pred:say(subj,comp),subj:[pred:emma]],acq,[(node16,unifies(up(comp),down)),(node12,unifies(up,down))])

edge(0,3,node15,acq,[(node16((node9(klooeethinksddat),unifies(up,down)),(node17((node3(mumeezinddswgahdn),unifies(up,down))),unifies(up(comp),down))),unifies(up(comp),down)),(node12(emswsezddat),unifies(up,down))],[comp:[comp:[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],pred:think(subj,comp),subj:[pred:chloe]],pred:say(subj,comp),subj:[pred:emma]],acq,[(node16,unifies(up(comp),down)),(node12,unifies(up,down))])

edge(1,3,node16,[],[(node17((node3(mumeezinddswgahdn),unifies(up,down))),unifies(up(comp),down)),(node9(klooeethinksddat),unifies(up,down))],[comp:[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],pred:think(subj,comp),subj:[pred:chloe]],acq,[(node17,unifies(up(comp),down)),(node9,unifies(up,down))])

edge(1,3,node16,acq,[(node17((node3(mumeezinddswgahdn),unifies(up,down))),unifies(up(comp),down)),(node9(klooeethinksddat),unifies(up,down))],[comp:[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],pred:think(subj,comp),subj:[pred:chloe]],acq,[(node17,unifies(up(comp),down)),(node9,unifies(up,down))])

edge(2,3,node17,[],[(node3(mumeezinddswgahdn),unifies(up,down))],[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],acq,[(node3,unifies(up,down))])

edge(2,3,node17,acq,[(node3(mumeezinddswgahdn),unifies(up,down))],[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],acq,[(node3,unifies(up,down))])

edge(2,3,node3,[],[mumeezinddswgahdn],[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],_9718,base)

edge(1,2,node16,acq,[(node9(klooeethinksddat),unifies(up,down))],[pred:think(subj,comp),subj:[pred:chloe]],acq,[(node9,unifies(up,down))])

edge(1,2,node9,[],[klooeethinksddat],[pred:think(subj,comp),subj:[pred:chloe]],_8965,base)

edge(0,1,node15,acq,[(node12(emswsezddat),unifies(up,down))],[pred:say(subj,comp),subj:[pred:emma]],acq,[(node12,unifies(up,down))])

edge(0,1,node12,[],[emswsezddat],[pred:say(subj,comp),subj:[pred:emma]],_8207,base)

C-Structure:

node15((node12(emswsezddat),unifies(up,down)),(node16((node9(klooeethinksddat),unifies(up,down)),(node17((node3(mumeezinddswgahdn),unifies(up,down))),unifies(up(comp),down))),unifies(up(comp),down)))

F-Structure:

[comp:[comp:[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]],pred:think(subj,comp),subj:[pred:chloe]],pred:say(subj,comp),subj:[pred:emma]]

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

rule(node6,[(node5,unifies(up,down))])

rule(node15,[(node20,unifies(up,down))])

rule(node15,[(node21,unifies(up,down)),(node15,unifies(up(comp),down))])

yes| ?- full_update.

CURRENT GRAMMAR

rule(node2,[(node1,unifies(up,down))])

rule(node6,[(node5,unifies(up,down))])

rule(node15,[(node20,unifies(up,down))])

rule(node15,[(node21,unifies(up,down)),(node15,unifies(up(comp),down))])

CURRENT LEXICON

lex(node20,[m,u,m,ee,z,i,n,dd,sw,g,ah,d,n],[subj:[pred:mummy],pred:be(subj,loc),pcase:in,loc:[pred:garden]])

lex(node20,[d,a,d,ee,z,p,aa,n,t,i,ng,dd,sw,k,i,ch,i,n],[subj:[pred:daddy],pred:paint(subj,obj),obj:[pred:kitchen]])

lex(node21,[e,m,sw,s,e,z,dd,a,t],[pred:say(subj,comp),subj:[pred:emma]])

lex(node21,[k,l,oo,ee,th,i,n,k,s,dd,a,t],[pred:think(subj,comp),subj:[pred:chloe]])

yes| ?-chloe%

script done on Thu May 11 16:41:17 1995

Appendix I An Example Illustrating That the Acquisition of a Recursive NP Rule is Dependent upon the Ordering of Input Utterances: A Sequence of Inputs Which Supports the Acquisition of a Recursive Rule

The following is a script file of the acquisition of a recursive rule for the noun phrase in the model, showing inputs (utterance, F-Structure) and outputs (grammar, lexicon). Here, the ordering of the input utterances (which are identical to those in Appendix J) determines whether a recursive or non-recursive rule is acquired. The purpose of this appendix is to support the examples of acquisition discussed in 5. 7. 3. 1.
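
Before the script itself, the following sketch is a gloss added for exposition and is not part of the thesis code: it restates the noun-phrase fragment of the final grammar as a DCG over the phoneme strings stored in the lexicon, using hypothetical predicate names (np17, pp15, p16). Here node17 behaves as the noun-phrase category, node15 as a prepositional phrase and node16 as its preposition, and it is the occurrence of node17 inside its own expansion that makes the noun-phrase rule recursive.

np17 --> [dd,sw,g,er,l].         % lex(node17): "the girl"
np17 --> [sw,t,e,d,ee,b,ai].     % lex(node17): "a teddy bear"
np17 --> np17, pp15.             % rule(node17): NP -> NP PP(adjunct), the recursive rule
pp15 --> p16, np17.              % rule(node15): PP -> P NP
p16  --> [t,ou].                 % lex(node16): "to"
p16  --> [w,i,dd].               % lex(node16): "with"

A recognition query such as ?- phrase(np17, [dd,sw,g,er,l,w,i,dd,sw,t,e,d,ee,b,ai]). succeeds, and the same pair of rules licenses any further stacking of to/with adjuncts. (Like the acquired rule itself, the third np17 clause is left-recursive, so recognition queries for strings outside the language may fail to terminate.)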

Script started on Thu Apr 6 16:22:44 1995
To use QUI, setenv DISPLAY to machine:0.0
fat-controller% quintus
Quintus Prolog Release 3.1.3 (Sun-4, SunOS 4.1)
Copyright (C) 1993, Quintus Corporation. All rights reserved.
2100 Geng Road, Palo Alto, California U.S.A. (415) 813-3800

| ?- [wf_int_learn_td].

yes| ?- learn([w,i,dd,dd,sw,ch,ii,l,d],[pcase:with,pred:child]).

CURRENT GRAMMAR

rule(node2,pred:_8306,[(node1,pred:_8306,unifies(up,down))])

yes| ?- learn([sw,t,e,d,ee,b,ai],[pred:teddy]).

CURRENT GRAMMAR

rule(node2,pred:_8164,[(node1,pred:_8164,unifies(up,down))])

rule(node4,pred:_8129,[(node3,pred:_8129,unifies(up,down))])

yes| ?- learn([w,i,dd,sw,t,e,d,ee,b,ai],[pred:teddy,pcase:with]).

CURRENT GRAMMAR

rule(node2,pred:_9442,[(node1,pred:_9442,unifies(up,down))])

rule(node4,pred:_9407,[(node3,pred:_9407,unifies(up,down))])

rule(node6,pred:_9361,[(node5,[],unifies(up,down)),(node3,pred:_9361,unifies(up,down))] )

yes| ?- learn([dd,sw,g,er,l],[pred:girl]).

CURRENT GRAMMAR

rule(node2,pred:_8215,[(node1,pred:_8215,unifies(up,down))])

rule(node4,pred:_8180,[(node3,pred:_8180,unifies(up,down))])

rule(node6,pred:_8134,[(node5,[],unifies(up,down)),(node3,pred:_8134,unifies(up,down))] )

rule(node8,pred:_8099,[(node7,pred:_8099,unifies(up,down))])

yes| ?-learn([dd,sw,g,er,l,w,i,dd,sw,t,e,d,ee,b,ai],[pred:girl,adjunct:[pcase:with,pred:teddy]]).

CURRENT GRAMMAR

rule(node2,pred:_12300,[(node1,pred:_12300,unifies(up,down))])

rule(node4,pred:_12265,[(node3,pred:_12265,unifies(up,down))])

rule(node6,pred:_12219,[(node5,[],unifies(up,down)),(node3,pred:_12219,unifies(up,down))])

rule(node8,pred:_12184,[(node7,pred:_12184,unifies(up,down))])

rule(node9,pred:_12133,[(node7,pred:_12133,unifies(up,down)),(node6,pred:_12158,unifies(up(adjunct),down))])

yes| ?- learn([w,i,dd,dd,sw,g,er,l],[pcase:with,pred:girl]).

CURRENT GRAMMAR

rule(node2,pred:_10903,[(node1,pred:_10903,unifies(up,down))])

rule(node4,pred:_10868,[(node11,pred:_10868,unifies(up,down))])

rule(node10,pred:_10822,[(node5,[],unifies(up,down)),(node11,pred:_10822,unifies(up,down))])

rule(node9,pred:_10771,[(node11,pred:_10771,unifies(up,down)),(node10,pred:_10796,unifies(up(adjunct),down))])

yes | ?-learn([sw,t,e,d,ee,b,ai,w,i,dd,dd,sw,g,er,l],[pred:teddy,adjunct:[pcase:with,pred:girl]]).

CURRENT GRAMMAR

rule(node2,pred:_10976,[(node1,pred:_10976,unifies(up,down))])

rule(node4,pred:_10941,[(node11,pred:_10941,unifies(up,down))])

rule(node10,pred:_10895,[(node5,[],unifies(up,down)),(node11,pred:_10895,unifies(up,down))])

rule(node9,pred:_10844,[(node11,pred:_10844,unifies(up,down)),(node10,pred:_10869,unifies(up(adjunct),down))])

yes| ?- learn([t,ou,dd,sw,ch,ii,l,d],[pcase:to,pred:child]).

CURRENT GRAMMAR

rule(node2,pred:_8477,[(node1,pred:_8477,unifies(up,down))])

rule(node4,pred:_8442,[(node11,pred:_8442,unifies(up,down))])

rule(node10,pred:_8396,[(node5,[],unifies(up,down)),(node11,pred:_8396,unifies(up,down))])

rule(node9,pred:_8345,[(node11,pred:_8345,unifies(up,down)),(node10,pred:_8370,unifies(up(adjunct),down))])

rule(node13,pred:_8310,[(node12,pred:_8310,unifies(up,down))])

yes| ?- learn([t,ou,sw,t,e,d,ee,b,ai],[pred:teddy,pcase:to]).

CURRENT GRAMMAR

rule(node2,pred:_10531,[(node1,pred:_10531,unifies(up,down))])

rule(node4,pred:_10496,[(node11,pred:_10496,unifies(up,down))])

rule(node13,pred:_10461,[(node12,pred:_10461,unifies(up,down))])

rule(node15,pred:_10415,[(node16,[],unifies(up,down)),(node11,pred:_10415,unifies(up,down))])

rule(node9,pred:_10364,[(node11,pred:_10364,unifies(up,down)),(node15,pred:_10389,unifies(up(adjunct),down))])

yes| ?- learn([t,ou,dd,sw,g,er,l,w,i,dd,sw,t,e,d,ee,b,ai],[pcase:to,pred:girl,adjunct:[pcase:with,pred:teddy]]).

CURRENT GRAMMAR

rule(node2,pred:_14767,[(node1,pred:_14767,unifies(up,down))])

rule(node13,pred:_14732,[(node12,pred:_14732,unifies(up,down))])

rule(node4,pred:_14697,[(node17,pred:_14697,unifies(up,down))])

rule(node15,pred:_14651,[(node16,[],unifies(up,down)),(node17,pred:_14651,unifies(up,down))])

rule(node17,pred:_14600,[(node17,pred:_14600,unifies(up,down)),(node15,pred:_14625,unifies(up(adjunct),down))])

yes| ?- full_update.

CURRENT GRAMMAR

rule(node2,pred:_6926,[(node1,pred:_6926,unifies(up,down))])

rule(node13,pred:_6891,[(node12,pred:_6891,unifies(up,down))])

rule(node4,pred:_6856,[(node17,pred:_6856,unifies(up,down))])

rule(node15,pred:_6810,[(node16,[],unifies(up,down)),(node17,pred:_6810,unifies(up,down))])

rule(node17,pred:_6759,[(node17,pred:_6759,unifies(up,down)),(node15,pred:_6784,unifies(up(adjunct),down))])

CURRENT LEXICON

lex(node1,pred:child,[w,i,dd,dd,sw,ch,ii,l,d],[pcase:with,pred:child])

lex(node12,pred:child,[t,ou,dd,sw,ch,ii,l,d],[pcase:to,pred:child])

lex(node16,[],[t,ou],[pcase:to])

lex(node16,[],[w,i,dd],[pcase:with])

lex(node17,pred:girl,[dd,sw,g,er,l],[pred:girl])

lex(node17,pred:teddy,[sw,t,e,d,ee,b,ai],[pred:teddy])

yes| ?-

fat-controller%

script done on Thu Apr 6 16:58:53 1995

Appendix J An Example Illustrating That the Acquisition of a Recursive NP Rule is Dependent upon the Ordering of Input Utterances: A Sequence of Inputs Which Supports the Acquisition of a Non-Recursive Rule

The following is a script file of the acquisition of a non-recursive rule for the noun phrase in the model, showing inputs (utterance, F-Structure) and outputs (grammar, lexicon). Here, the ordering of the input utterances (which are identical to those in Appendix I) determines whether a recursive or non-recursive rule is acquired. The purpose of this appendix is to support the examples of acquisition discussed in 5. 7. 3. 1.
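
Before the script itself, the following sketch is a gloss added for exposition and is not part of the thesis code: it restates the noun-phrase fragment of the final grammar as a DCG over the phoneme strings stored in the lexicon, using hypothetical predicate names (np10, p15, pp14, np16). The contrast with Appendix I is that the adjunct-bearing noun phrase here receives its own category (node16) rather than being generalized with the bare noun phrase (node10), so the noun-phrase rule itself is not recursive; stacked adjuncts are reachable only through the recursive rule for node14.

np10 --> [sw,t,e,d,ee,b,ai].     % lex(node10): "a teddy bear"
np10 --> [dd,sw,g,er,l].         % lex(node10): "the girl"
p15  --> [t,ou].                 % lex(node15): "to"
p15  --> [w,i,dd].               % lex(node15): "with"
pp14 --> p15, np10.              % rule(node14): PP -> P NP
pp14 --> p15, np10, pp14.        % rule(node14): PP -> P NP PP, recursive at the PP level
np16 --> np10, pp14.             % rule(node16): NP' -> NP PP; node16 never appears
                                 % on a right-hand side, so the NP is not recursive

A query such as ?- phrase(np16, [dd,sw,g,er,l,w,i,dd,sw,t,e,d,ee,b,ai]). succeeds, but, unlike the grammar acquired in Appendix I, there is no rule of the form NP -> NP PP for the bare noun-phrase category node10 itself.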

Script started on Tue May 2 14:21:18 1995
To use QUI, setenv DISPLAY to machine:0.0
fat-controller% quintus
Quintus Prolog Release 3.1.3 (Sun-4, SunOS 4.1)
Copyright (C) 1993, Quintus Corporation. All rights reserved.
2100 Geng Road, Palo Alto, California U.S.A. (415) 813-3800

| ?- [wf_int_learn_td].

yes| ?- learn([w,i,dd,dd,sw,ch,ii,l,d],[pcase:with,pred:child]).

CURRENT GRAMMAR

rule(node2,pred:_8306,[(node1,pred:_8306,unifies(up,down))])

yes| ?- learn([sw,t,e,d,ee,b,ai],[pred:teddy]).

CURRENT GRAMMAR

rule(node2,pred:_8164,[(node1,pred:_8164,unifies(up,down))])

rule(node4,pred:_8129,[(node3,pred:_8129,unifies(up,down))])

yes| ?- learn([w,i,dd,sw,t,e,d,ee,b,ai],[pred:teddy,pcase:with]).

CURRENT GRAMMAR

rule(node2,pred:_9442,[(node1,pred:_9442,unifies(up,down))])

rule(node4,pred:_9407,[(node3,pred:_9407,unifies(up,down))])

rule(node6,pred:_9361,[(node5,[],unifies(up,down)),(node3,pred:_9361,unifies(up,down))])

yes| ?- learn([dd,sw,g,er,l],[pred:girl]).

CURRENT GRAMMAR

rule(node2,pred:_8215,[(node1,pred:_8215,unifies(up,down))])

rule(node4,pred:_8180,[(node3,pred:_8180,unifies(up,down))])

rule(node6,pred:_8134,[(node5,[],unifies(up,down)),(node3,pred:_8134,unifies(up,down))])

rule(node8,pred:_8099,[(node7,pred:_8099,unifies(up,down))])

yes| ?- learn([w,i,dd,dd,sw,g,er,l],[pred:girl,pcase:with]).

CURRENT GRAMMAR

rule(node2,pred:_10570,[(node1,pred:_10570,unifies(up,down))])

rule(node4,pred:_10535,[(node10,pred:_10535,unifies(up,down))])

rule(node9,pred:_10489,[(node5,[],unifies(up,down)),(node10,pred:_10489,unifies(up,down))])

yes| ?- learn([t,ou,dd,sw,ch,ii,l,d],[pcase:to,pred:child]).

CURRENT GRAMMAR

rule(node2,pred:_8442,[(node1,pred:_8442,unifies(up,down))])

rule(node4,pred:_8407,[(node10,pred:_8407,unifies(up,down))])

rule(node9,pred:_8361,[(node5,[],unifies(up,down)),(node10,pred:_8361,unifies(up,down))])

rule(node12,pred:_8326,[(node11,pred:_8326,unifies(up,down))])

yes| ?- learn([t,ou,sw,t,e,d,ee,b,ai],[pcase:to,pred:teddy]).

CURRENT GRAMMAR

rule(node2,pred:_10330,[(node1,pred:_10330,unifies(up,down))])

rule(node4,pred:_10295,[(node10,pred:_10295,unifies(up,down))])

rule(node12,pred:_10260,[(node11,pred:_10260,unifies(up,down))])

rule(node14,pred:_10214,[(node15,[],unifies(up,down)),(node10,pred:_10214,unifies(up,down))])

yes| ?- learn([t,ou,dd,sw,g,er,l,w,i,dd,sw,t,e,d,ee,b,ai],[pred:girl,pcase:to,adjunct:[pred:teddy,pcase:with]]).

CURRENT GRAMMAR

rule(node2,pred:_13376,[(node1,pred:_13376,unifies(up,down))])

rule(node4,pred:_13341,[(node10,pred:_13341,unifies(up,down))])

rule(node12,pred:_13306,[(node11,pred:_13306,unifies(up,down))])

rule(node14,pred:_13260,[(node15,[],unifies(up,down)),(node10,pred:_13260,unifies(up,down))])

rule(node14,pred:_13198,[(node15,[],unifies(up,down)),(node10,pred:_13198,unifies(up,down)),(node14,pred:_13234,unifies(up(adjunct),down))])

yes| ?-learn([dd,sw,g,er,l,w,i,dd,sw,t,e,d,ee,b,ai],[pred:girl,adjunct:[pcase:with,pred:teddy]]).

CURRENT GRAMMAR

rule(node2,pred:_12771,[(node1,pred:_12771,unifies(up,down))])

rule(node4,pred:_12736,[(node10,pred:_12736,unifies(up,down))])

rule(node12,pred:_12701,[(node11,pred:_12701,unifies(up,down))])

rule(node14,pred:_12655,[(node15,[],unifies(up,down)),(node10,pred:_12655,unifies(up,down))])

rule(node14,pred:_12593,[(node15,[],unifies(up,down)),(node10,pred:_12593,unifies(up,down)),(node14,pred:_12629,unifies(up(adjunct),down))])

rule(node16,pred:_12542,[(node10,pred:_12542,unifies(up,down)),(node14,pred:_12567,unifies(up(adjunct),down))])

yes| ?-learn([sw,t,e,d,ee,b,ai,w,i,dd,dd,sw,g,er,l],[pred:teddy,adjunct:[pcase:with,pred:girl]]).

CURRENT GRAMMAR

rule(node2,pred:_11486,[(node1,pred:_11486,unifies(up,down))])

rule(node4,pred:_11451,[(node10,pred:_11451,unifies(up,down))])

rule(node12,pred:_11416,[(node11,pred:_11416,unifies(up,down))])

rule(node14,pred:_11370,[(node15,[],unifies(up,down)),(node10,pred:_11370,unifies(up,down))])

rule(node14,pred:_11308,[(node15,[],unifies(up,down)),(node10,pred:_11308,unifies(up,down)),(node14,pred:_11344,unifies(up(adjunct),down))])

rule(node16,pred:_11257,[(node10,pred:_11257,unifies(up,down)),(node14,pred:_11282,unifies(up(adjunct),down))])

yes| ?- full_update.

CURRENT GRAMMAR

rule(node2,pred:_6972,[(node1,pred:_6972,unifies(up,down))])

rule(node4,pred:_6937,[(node10,pred:_6937,unifies(up,down))])

rule(node12,pred:_6902,[(node11,pred:_6902,unifies(up,down))])

rule(node14,pred:_6856,[(node15,[],unifies(up,down)),(node10,pred:_6856,unifies(up,down))])

rule(node14,pred:_6794,[(node15,[],unifies(up,down)),(node10,pred:_6794,unifies(up,down)),(node14,pred:_6830,unifies(up(adjunct),down))])

rule(node16,pred:_6743,[(node10,pred:_6743,unifies(up,down)),(node14,pred:_6768,unifies(up(adjunct),down))])

CURRENT LEXICON

lex(node1,pred:child,[w,i,dd,dd,sw,ch,ii,l,d],[pcase:with,pred:child])

lex(node11,pred:child,[t,ou,dd,sw,ch,ii,l,d],[pcase:to,pred:child])

lex(node15,[],[t,ou],[pcase:to])

lex(node10,pred:teddy,[sw,t,e,d,ee,b,ai],[pred:teddy])

lex(node15,[],[w,i,dd],[pcase:with])

lex(node10,pred:girl,[dd,sw,g,er,l],[pred:girl])

yes| ?-fat-controller%

script done on Tue May 2 14:30:26 1995
