From Frame to Subframe: Collocational Asymmetry in Mandarin Verbs of Conversation

Foreword

Chin-Chuan Cheng

Academia Sinica

We felt in 2000 that it was time to take concerted action on Chinese lexical semantics. The study of word senses in the traditional discipline of Xungu appeared to be done piecemeal. The truth-theoretic model of propositions looked illusive at times. Colleagues working on natural language processing demanded a better dictionary for word disambiguation. Few theoretical claims about human manipulation of senses were available. We therefore organized a workshop on lexical semantics to discuss these issues and others at the City University of Hong Kong four years ago. The one-day gathering was fairly informal, but we had fun getting to know each of the score of colleagues from Taiwan, Hong Kong, Mainland China, and the United States. We did not call it the First Chinese Lexical Semantics Workshop. We simply called it a lexical semantics workshop, without knowing its consequences.

Somehow the gathering in Hong Kong made Professor Yu Shiwen of Beijing University happy. He should be, because he had worked on semantics for years. In 2001 he invited more people to a meeting titled the Second Workshop on Chinese Lexical Semantics. The Beijing meeting was enthusiastically followed by the Third Workshop in Taipei. Professor Huang Chu-ren was energetic enough to set up a mechanism for abstract submission and evaluation. Some submissions had to be left out because of the large number of excellent papers. The fourth workshop returned to the City University of Hong Kong in 2003. We were not daunted by SARS. Although travel restrictions made us stay home, our papers were exchanged and commented on via the internet. We received hundreds of comments, perhaps more than we would have at a face-to-face conference.

It is now 2004. I am pleased to see the workshop fully alive in its fifth year of existence. Our hosts, Drs. Ji Donghong and Lua Kim Teng, have kindly accepted our imposition and aptly made arrangements for us to see each other face to face here in Singapore. We are grateful to them for the arrangements and for gathering the papers in this volume for discussion. I am sure that during the workshop we will move away from piecemeal studies of words and come a step closer to theoretical generalizations about human cognition of words.

Table of Contents

The Sinica Sense Management System: Design and Implementation

From Frame to Subframe: Collocational Asymmetry in Mandarin Verbs of Conversation

Using WordNet and SUMO to Determine Source Domains of Conceptual Metaphors

From Lexical Semantics to Conceptual Metaphors: Mapping Principle Verification with WordNet and SUMO

Feature Representations and Logical Compatibility between Temporal Adverbs and Aspects

Multiple-layer Semantic Derivations of Two-part Allegorical Expressions in Taiwanese Southern Min (TSM)

Pan-Chinese Variation on Verbal Synonymy: A Study of Common Reportage Verbs in News Texts

The Usage and Perception of Judgement Terms in the Pan-Chinese Context

Taxonomy of Fine-grain Semantic Roles for Nominal Modifiers

Semantics-related Lexical Access Deficit of Mandarin-Chinese Dyslexia

Verbs of Urging in Hakka: A Perspective from Force-Dynamics

(Kilgarriff and Tugwell 2001)

(Huang 2003)

(Kuo et al. 2003)

*93-2524-S-001-003


. 1998. . In Benjamin K. Tsou, Tom B. Y. Lai, Samuel W. K. Chan, and William S-Y. Wang eds. (Quantitative and Computational Studies on the Chinese Language) 15-30. City University of Hong Kong.


Huang, Chu-Ren. 2003. SINICA BOW: Integrating Bilingual WordNet and SUMO ontology. Invited panel talk: Synergy Between Language Resources and Knowledge Resources. The 2003 IEEE International Conference on Natural Language Processing and Knowledge Engineering (NLPKE2003), Special Session on Upper Ontology and Natural Language Processing. Beijing. Oct. 28.

Kilgarriff, Adam and David Tugwell. 2001. WORD SKETCH: Extraction and display of significant collocations for lexicography. In Proceedings of the Workshop on COLLOCATION: Computational Extraction, Analysis and Exploitation 32-38. 39th ACL & 10th EACL, Toulouse, July 2001.

Kuo, W.J., T. C. Yeh, C. Y. Lee, Y. T. Wu, C. C. Chou, L. T. Ho, D. L. Hung, O. J. Tzeng, and J. C. Hsieh. 2003. Frequency effects of Chinese character processing in the brain: an event-related fMRI study. Neuroimage 18:720-730.


Dummy Verbs in Contemporary Chinese

YU Shiwen ZHU Xuefeng DUAN Huiming

Institute of Computational Linguistics, Peking University, 100871, China

[email protected] [email protected]

Abstract

In contemporary Chinese, there is a subclass of verbs called Dummy Verbs. After briefly introducing the lexical meanings of two typical dummy verbs, Jiayi and Jinxing, this paper discusses their grammatical attributes in detail and further explores their functions as markers of syntactic constituents and semantic roles.

Keywords: Dummy Verb, Lexical Meaning, Grammatical Attribute, Semantic Role


1. We are carrying on reforms on the state-owned enterprises in our country.

2. We are making reforms on the state-owned enterprises in our country.

3. The reforms on the state-owned enterprises are being carried out in our country.

4. We are reforming the state-owned enterprises in our country.

5. The state-owned enterprises are being reformed in our country.

51,2,3carry onmakereformcarry on, carry out, undertake, undergo, conduct, engage, make, hold, commit, have, hold discussion, make investigation


The Sinica Sense Management System: Design and Implementation

Chu-Ren Huang, Chun-ling Chen, Cui-Xia Weng, and Keh-jiann Chen

Academia Sinica

1. Background and Motivation

Constructing a sense-based lexical knowledgebase as a core foundation has become a trend in language engineering. WordNet and EuroWordNet are two well-known examples. There are two important criteria in constructing such a knowledgebase: linguistic felicity and data cohesion. Huang et al. (2003) discussed how to achieve linguistic felicity in building a comprehensive inventory of Chinese senses from corpus data, introducing five criteria as well as operational guidelines for sense distinction. In this paper, we discuss how to achieve data cohesion for the sense information thus collected by means of the Sinica Sense Management System (SSMS).

2. Introduction to the Content of the SSMS

The SSMS manages both lexical entries and word senses. The system was designed and implemented by the Chinese WordNet Team at Academia Sinica. It contains all the basic information that can be merged into the eventual Chinese WordNet. The basic structure of the system is meaning-driven: each sense of a lemma is identified specifically and given a separate entry. When further differentiation at the meaning facet level is called for, each facet of a sense is also described in a full entry (Ahrens et al., 1998). In addition to senses and meaning facets, the system includes the following information: POS, example sentences, corresponding English synset(s) from Princeton WordNet, and lexical semantic relations such as synonymy/antonymy and hypernymy/hyponymy. Moreover, the overarching structure of the system is managed by a sense serial number, and inter-entry structure is established by cross-references among synsets and homographs.

At the present stage, the Chinese WordNet Team focuses on analyzing mid-frequency words in the Sinica Corpus. We target mid-frequency words because, with only three to five senses per word, we can investigate the senses and meaning facets of each word deeply and accurately. This avoids both the trivial case of low-frequency words with a single sense and the complicated case of high-frequency words with numerous senses. So far, more than 1,000 lemmas have been analyzed and more than 2,000 senses have been distinguished. We have also published five technical reports presenting these results [4]. In the near future, these results will serve as a basis for natural language processing and e-learning applications.

3. The Design Principle of SSMS

A sense-based lexical knowledgebase with data cohesion must meet three requirements: unique identification of senses, trackability of senses, and consistent sense definitions. SSMS has four devices to meet these requirements.

3.1 The Unique Serial Number

First, each sense or meaning facet is identified by a unique serial number in SSMS. In Princeton WordNet (Fellbaum 1998), each synset is given a unique offset number. However, the offset number has no logical structure. Hence, although it guarantees unique identification, it is not very trackable. An alternative is to set up a base ontology and assign senses to an ontological node with a unique ID. However, this is not feasible, since we cannot pre-designate all the possible conceptual and semantic relations. And if the decision is made to encode only certain higher-level nodes, the random assignment issue is unavoidable, since more than one lexical sense will be assigned to the same node. In our system, the unique serial number of each sense is composed of three segments: the sequential information of when the lemma was processed, the lemma form, and the sense classification code for each lemma (including the meaning facet level). Take bao4 zhi3 (newspaper) as an example. Two senses and two meaning facets are distinguished for bao4 zhi3. Its lexical entry is as follows.

Example 3-1: The result of sense distinction for bao4 zhi3 (newspaper)

bao4 zhi3

Sense 1 (Na)

Meaning facet 1: newspaper, 03039218N

Meaning facet 2: newspaper, 04738466N

Sense 2 (Na): newspaper, 06009637N

The four-level unique serial number below shows the four segments of the serial number for one meaning of bao4 zhi3 (newspaper).

Lemma processing year: 03

Lemma form ID: 0018

The first sense: 01

The first meaning facet: 01

The unique serial number for the 1st meaning facet of the 1st sense of bao4 zhi3 => 0300180101

There are four advantages to managing the sense database with unique serial numbers. First, the sequential number not only gives a unique code to each lemma but also enables a project manager to track work progress more easily. Second, including the lemma in the serial number helps human users to identify the relevant senses quickly. It also facilitates the man-machine interface, for example in keyword searches for senses. Third, it gives the sense serial number a logical structure, since each lemma represents a small number of possible senses. Lastly, four digits are reserved to identify the senses and meaning facets belonging to each lemma: the first two digits for senses and the last two for meaning facets. These four digits also allow an exact sense to be identified in the database in minimal space. For instance, when stipulating a synonym, we can identify it as word0200, which refers to the second sense of a certain lemma. There is no need to repeat the complete sense serial number. The sense serial number thus enables unique identification and also contributes to trackability.
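The serial number layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `SenseSerial`, the function `parse_serial`, and the convention that facet `00` means "no meaning facet" are hypothetical and are not part of SSMS, which was implemented in Delphi.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SenseSerial:
    """Hypothetical model of the 10-digit SSMS sense serial number."""
    year: int       # two-digit lemma processing year, e.g. 3 for "03"
    lemma_id: int   # four-digit lemma form ID, e.g. 18 for "0018"
    sense: int      # two-digit sense number
    facet: int = 0  # two-digit facet number; 0 is assumed to mean "no facet"

    def full(self) -> str:
        """Complete serial number: year + lemma ID + sense + facet."""
        return f"{self.year:02d}{self.lemma_id:04d}{self.sense:02d}{self.facet:02d}"

    def short(self) -> str:
        """Four-digit sense/facet code, as in the 'word0200' shorthand."""
        return f"{self.sense:02d}{self.facet:02d}"


def parse_serial(serial: str) -> SenseSerial:
    """Split a 10-digit serial number into its four segments."""
    if len(serial) != 10 or not (serial.isascii() and serial.isdigit()):
        raise ValueError(f"expected 10 digits, got {serial!r}")
    return SenseSerial(
        year=int(serial[0:2]),
        lemma_id=int(serial[2:6]),
        sense=int(serial[6:8]),
        facet=int(serial[8:10]),
    )
```

With this sketch, `SenseSerial(3, 18, 1, 1).full()` yields the serial number of the example above, and `short()` yields the four-digit code used when a synonym is stipulated within a known lemma.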

3.2 The Cross-reference device

Second, SSMS automatically prompts all possible cross-references. When a lemma is called up for analysis, all existing records that contain the lemma are displayed. These include not only lexical semantic relations such as synonyms and hyponyms, but also any sense definition and any explanatory note that contains the lemma. This feature allows sense relations to be clearly defined and inconsistencies to be detected. In addition, any anomaly in definition or expression format will be discovered. The process also helps us narrow down a controlled vocabulary for sense definitions. This feature contributes to both the trackability of senses and the consistency of sense definitions.
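A minimal sketch of such a cross-reference lookup follows. The record layout (`SenseRecord` and its fields) and the function `cross_references` are hypothetical names chosen for illustration; the sketch simply reports, for every other record, the places where the queried lemma appears, and it uses plain substring matching for definitions and notes.

```python
from dataclasses import dataclass, field


@dataclass
class SenseRecord:
    """Hypothetical, simplified sense record."""
    serial: str
    lemma: str
    definition: str
    synonyms: list[str] = field(default_factory=list)
    hyponyms: list[str] = field(default_factory=list)
    note: str = ""


def cross_references(lemma: str, records: list[SenseRecord]) -> list[tuple[str, list[str]]]:
    """Return (serial, places) for each record of another lemma that mentions `lemma`."""
    hits = []
    for rec in records:
        if rec.lemma == lemma:
            continue  # the lemma's own entries are not cross-references
        places = []
        if lemma in rec.synonyms:
            places.append("synonym")
        if lemma in rec.hyponyms:
            places.append("hyponym")
        if lemma in rec.definition:
            places.append("definition")
        if lemma in rec.note:
            places.append("note")
        if places:
            hits.append((rec.serial, places))
    return hits
```

Listing every place where a lemma is mentioned is what makes inconsistent definitions visible: an analyst revising one entry sees at once which other entries depend on it.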

3.3 The concurrent lexical knowledgebase and corpus

Third, SSMS enables the lexical knowledgebase and the corpus to be developed in parallel. When a lemma is chosen in the system, all tagged examples of that lemma from the Sinica Corpus are retrieved. This allows closer examination of how the senses are used and distributed. It also allows automatic selection of corpus example sentences. In turn, when the sense classification is completed, SSMS allows all the corpus sentences to be sense-tagged and merged back into the original corpus. In other words, a sense-tagged corpus is built in parallel. This feature allows each lexical sense to be tracked to its actual uses in the corpus. It also allows linguists to examine the data supporting each sense classification.
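The round trip between knowledgebase and corpus can be illustrated with a small sketch. The corpus representation (a list of dictionaries with parallel `tokens` and `senses` lists) and the function names are assumptions made for this example only; the actual Sinica Corpus format is not described here.

```python
def retrieve_examples(lemma, corpus):
    """Return the indices of all sentences that contain `lemma` as a token."""
    return [i for i, sent in enumerate(corpus) if lemma in sent["tokens"]]


def tag_sense(corpus, index, lemma, serial):
    """Return a copy of the corpus in which `lemma` in sentence `index`
    carries the sense serial number `serial`."""
    # Copy the sense lists so that the original corpus stays unchanged.
    tagged = [dict(sent, senses=list(sent["senses"])) for sent in corpus]
    target = tagged[index]
    for pos, token in enumerate(target["tokens"]):
        if token == lemma:
            target["senses"][pos] = serial
    return tagged
```

Returning a tagged copy rather than editing in place mirrors the description above, in which the sense-tagged sentences are produced separately and then merged back into the original corpus.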

3.4 Linking to the Sinica BOW

Fourth, SSMS is linked to the bilingual wordnet information in Sinica BOW. Candidate English synset correspondences, including offset numbers, are shown after a Chinese lemma is chosen. This allows cross-lingual trackability and consistency.

4. The Implementation of SSMS

There are three major phases in implementing the system. In the lemma analysis phase, we distinguish senses and meaning facets for each word based on the criteria and operational guidelines proposed in Huang et al. (2003). At the same time, the Sinica Corpus and WordNet are consulted for POS, examples, and English translations. Then, with the help of dictionary resources or word mapping by the system, we decide the word relations. The second phase can be divided into two steps. First, we design the schema of the sense management system database for storing the analysis results of the first phase. Then, for data access, we develop an interface to help the Chinese WordNet Team insert into and query the database. We use Delphi to design the system interface. Through the interface, the data in the database can also be exported as Word documents. The third phase is the application phase. Our work project is to build Chinese WordNet web sites for user queries. The web pages are developed in HTML and ASP and are served through a web server. Via the Internet, people can retrieve data from our sense management database anywhere, at any time. The flow of the Sinica Sense Management System is displayed in the following chart.

[Flow chart. The First Phase, Lemma Analysis Phase: Sense Definition, Facet Definition, WordNet Synset, Example Sentences, POS, Word Relation. The Second Phase, Sinica Sense Management System Implementation Phase: Interface Development, Database Implementation. The Third Phase, Application Phase: Chinese WordNet Web Sites Implementation.]

Figure 1: The flow chart of the Sinica Sense Management System.

We can represent the overall framework of SSMS diagrammatically as in Fig. 2. As the diagram indicates, the Chinese WordNet Team uses SSMS to access the database and to produce electronic documents as Word reports. Moreover, users on the Internet can browse HTML/ASP pages to query the database through a web server.

[Diagram. Components: Chinese WordNet Team, Interface of the Sinica Sense Management System, Database, Word report, Users, Browser, Query, Web Server, Results, HTML/ASP Pages.]

Figure 2: The overall structure of SSMS.

4.1 The Schema of SSMS Database in Class Diagram

In this section, we discuss the design of the SSMS database schema. The Unified Modeling Language (UML) [5][6] is a graphical notation that provides the conceptual foundation for assembling a system out of components from the 4+1 views and nine diagrams. Each view is a projection into the organization and structure of the system, focused on a particular aspect of that system.

We employ the class diagram notation of UML to provide a static view of application concepts in terms of classes and their relationships, including generalization and association. We therefore introduce only class diagrams in detail, as follows.

Class diagrams [5][6][7] commonly contain the following features:

1. A class diagram shows a set of classes and their relationships. An example is the class diagram of the Suppliers-and-Parts database shown in Fig. 3. The terms in italics in Fig. 3 indicate the concepts of class diagrams.

[Class diagram. Classes and attributes: Suppliers (sno, sname, status, city; operation add()), Parts (pno, cno, pname, weight, color), Computers (cno, cname), Shipments (qry), Foreign (nation), Domestic (chinesename). Suppliers and Parts are linked by the association supply. Annotated concepts: association, class name, attribute, operation, inheritance, aggregation.]

Figure 3: A class diagram for the Suppliers-and-Parts Database.

2. A class is a description of a set of objects that share the same attributes, operations, relationships, and semantics. A class mainly contains three important parts: its name, attributes, and operations. We explain these terms as follows:

(a) Class name: every class must have a name to distinguish it from other classes. For example, Suppliers or Parts are class names.

(b) Attribute: an attribute represents some property that is shared by all objects of that class. A class may have any number of attributes or no attributes at all. For example, in Fig. 3, the Suppliers have some attributes such as sno, sname, city.

(c) Operation: an operation is the implementation of a service that can be requested from any object of the class to affect behavior. A class may have any number of operations or no operations at all. For example, in Fig. 3, the class of Suppliers has an operation add().

3. There are three kinds of relationships between classes:

(a) Association: an association is a structural relationship that specifies objects of one thing to be connected to objects of another. For example, in Fig. 3, a line drawn between the involved classes (Suppliers and Parts) represents an association named supply.

(b) Aggregation: an aggregation is a whole/part relationship, in which one class represents a larger thing (the whole class), which consists of smaller things (the parts class). Moreover, an aggregation represents a has-a relationship, which means that an object of the whole class has objects of the part class. To represent an aggregation, an empty diamond will be drawn at the whole class end of the line linking two classes.

(c) Inheritance: An inheritance relationship can be regarded as a generalization (or specialization), that is, a taxonomic relationship between a general element (the super class) and a special element (the subclass), where the special element adds properties to the general one and behaves in a way that is compatible with it. Therefore, it is sometimes called an is-a-kind-of relationship. An inheritance relation is represented by a large empty arrow pointing from the subclass to the super class. For example, in Fig. 3, Domestic and Foreign suppliers (two subclasses) are kinds of suppliers (the super class).
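The three kinds of relationship can also be read off a short Python sketch of the Suppliers-and-Parts example. The class and attribute names follow Fig. 3; the Python types, the behavior of `add()` (assumed here to record a supplied part), and the modeling of Computers as the whole class of an aggregation over Parts are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Part:
    pno: str
    pname: str
    weight: float
    color: str


@dataclass
class Supplier:
    sno: str
    sname: str
    status: int
    city: str
    # Association "supply": a supplier is connected to the parts it supplies.
    parts: list[Part] = field(default_factory=list)

    def add(self, part: Part) -> None:
        """Operation add(): assumed here to register a supplied part."""
        self.parts.append(part)


@dataclass
class DomesticSupplier(Supplier):
    # Inheritance: a domestic supplier is a kind of supplier.
    chinesename: str = ""


@dataclass
class ForeignSupplier(Supplier):
    # Inheritance: a foreign supplier is a kind of supplier.
    nation: str = ""


@dataclass
class Computer:
    cno: str
    cname: str
    # Aggregation (assumed): a computer, the whole, has parts.
    parts: list[Part] = field(default_factory=list)
```

A subclass instance can be used wherever a `Supplier` is expected, which is the "behaves in a way that is compatible with it" condition stated in (c).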

Following the content requirements and design principles of SSMS, Fig. 4 gives the schema of the SSMS database using the concepts of the class diagram.

[Class diagram. Classes and attributes:

CWN_Lemma: Lemma_id, CWN_lemma, CWN_pinyin, CWN_zhuyin

CWN_Sense: Sense_id, Lemma_id, Sense_def, Sense_Domain, Sense_synonym, Sense_antonym, Sense_varword, Sense_upword, Sense_nearword, Sense_relword

CWN_Facet: Facet_id, Sense_id, Facet_def, Facet_Domain, Facet_synonym, Facet_antonym, Facet_varword, Facet_upword, Facet_nearword, Facet_relword

CWN_POS: Cwn_id, Pos_sno, Epos2

cwn_example: cwn_id, example_sno, example_cont

cwn_note: cwn_id, note_sno, note_cont

cwn_synset: cwn_id, synset_word1, synset_offset, synset_cwnrel

The associations between the classes carry multiplicities of 0, 1, or *.]

Figure 4: The schema of the Sinica Sense database.
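Part of the schema in Fig. 4 can be sketched as relational tables. The table and column names below come from the figure; everything else is an assumption made for illustration: the use of SQLite through Python's `sqlite3` module (the original system used Delphi and ASP), the column types, the primary and foreign keys, and the helper names `open_db` and `senses_of`. Only a subset of the columns is shown.

```python
import sqlite3

# Column types and key constraints are assumptions; Fig. 4 gives names only.
SCHEMA = """
CREATE TABLE cwn_lemma (
    lemma_id   TEXT PRIMARY KEY,
    cwn_lemma  TEXT NOT NULL,
    cwn_pinyin TEXT,
    cwn_zhuyin TEXT
);
CREATE TABLE cwn_sense (
    sense_id  TEXT PRIMARY KEY,
    lemma_id  TEXT NOT NULL REFERENCES cwn_lemma(lemma_id),
    sense_def TEXT
);
CREATE TABLE cwn_facet (
    facet_id  TEXT PRIMARY KEY,
    sense_id  TEXT NOT NULL REFERENCES cwn_sense(sense_id),
    facet_def TEXT
);
CREATE TABLE cwn_example (
    cwn_id       TEXT NOT NULL,
    example_sno  INTEGER NOT NULL,
    example_cont TEXT,
    PRIMARY KEY (cwn_id, example_sno)
);
"""


def open_db(path=":memory:"):
    """Create a database with the sketched schema and return the connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def senses_of(conn, lemma):
    """Return (sense_id, sense_def) rows for a lemma form, ordered by sense_id."""
    cursor = conn.execute(
        "SELECT s.sense_id, s.sense_def "
        "FROM cwn_sense AS s "
        "JOIN cwn_lemma AS l ON l.lemma_id = s.lemma_id "
        "WHERE l.cwn_lemma = ? "
        "ORDER BY s.sense_id",
        (lemma,),
    )
    return cursor.fetchall()
```

Keeping senses and facets in separate tables, each pointing to its parent, corresponds to the lemma, sense, and meaning facet levels encoded in the serial number.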

4.2 The Function of SSMS

In this section, we discuss the interface design of SSMS. The SSMS interface is developed in Delphi 7.0. The functions of SSMS, based on the needs of program execution, are shown in Fig. 5. The programs provide many functions, which are presented through a Windows interface and ASP web pages. Sense management and sense visualization are the two major functions of SSMS. With the sense management function, the Chinese WordNet Team can insert, update, and delete data, including lexical entries, word senses, meaning facets, POS, example sentences, English synset(s), and lexical semantic relations. Sense visualization is provided by the SSMS interface and is divided into two parts: Sense Query and Word Report. The SSMS interface, shown in Fig. 6, is user-friendly to operate and maintain. With the Sense Query function, users can enter a serial number or a lexical entry to query senses. The Word Report function uses Crystal Reports 9 to produce electronic documents, as shown in Fig. 7.

[Class diagram. Elements: The program of SSMS, Function, Sense Management, Sense Visualization, Sense Query, Word Report, The users of the Chinese WordNet Team, Windows, ASP Web Pages. The associations carry multiplicities of 1 or *.]

Figure 5: The class diagram of SSMS function description.

Figure 6: The interface of SSMS.

Figure 7: The format of Word report.

5. Conclusion

In sum, SSMS is not only a versatile development tool and management system for a sense-based lexical knowledgebase; it can also serve as the database backend for both the Chinese WordNet and any sense-based application for Chinese language processing.

Online Resources:

Sinica BOW: http://BOW.sinica.edu.tw/

Sinica Corpus: http://www.sinica.edu.tw/SinicaCorpus/

WordNet: http://www.cogsci.princeton.edu/~wn/

References

[1] Ahrens, K., L. Chang, K. Chen, and C. Huang, 1998, Meaning Representation and Meaning Instantiation for Chinese Nominals. Computational Linguistics and Chinese Language Processing, 3, 45-60.

[2] Booch, G., J. Rumbaugh, and I. Jacobson, The Unified Modeling Language User Guide, Addison-Wesley, 1999.

[3] Fellbaum, Christine. Ed. 1998. WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.

[4] Huang, Chu-Ren (ed.), 2004, Sense and Sensibility series: Technical Report 03-01~04. CKIP, Taipei.

[5] Huang, Chu-Ren et al., 2003, Sense and Meaning Facet: Criteria and Operational Guidelines for Chinese Sense Distinction. Presented at the Fourth Chinese Lexical Semantics Workshop. June 23-25, City University of Hong Kong, Hong Kong.

[6] Muller, R.J., Database Design for Smarties: Using UML for Data Modeling, Morgan Kaufmann, 1999.

[7] Oestereich, B., Developing Software with UML Object-Oriented Analysis and Design in Practice, Addison-Wesley, 1999.



Representation and Computing of Cognate RoleFrame

Zhendong Dong, Qiang Dong, Changling Hao

Research Center of Computer & Language Information Engineering, CAS, Beijing, 100083

E-mail: [email protected]

Abstract: Borrowing from the term cognate object, we use Cognate RoleFrame to reveal a kind of semantic relation between nouns and verbs such as hatred and hate. By Cognate RoleFrame, we mean that the noun has the same role frame as its corresponding verb. In HowNet we use CoEvent as the identifier to describe all nouns with a Cognate RoleFrame. We demonstrate two HowNet-based tools to evaluate our treatment of Cognate RoleFrame.

Keywords: HowNet, valency grammar, event role, event role frame, cognate RoleFrame

1.

WordNet[2]

2. Role frame

WordNetVerbNetFrameNetWordNet

agent, possession, source, cost, beneficiary

agent, instrument, partner, cause

agent, patient, instrument, PatientValue={dirty|}

agent, patient, instrument, PartOfTouch

815

815

{fight|} {HaveContest|:agent={human|}{group|->},instrument={weapon|},

partner={human|}{group|->},cause={*}}

(typical actor)selectional restriction

3. (Cognate role-frame)

live a happy lifesleep a sound sleeplifesleepcognate object[1]

experiencer, degree, contentexperiencer, degree, content

4.

Cognate role-frame conceptCognate role-frame word

---

5.

4223805.6%

1

2

3

4

5

6

7

7

6.

6.1

123

6.2 -- CoEvent

CoEventCoEventCognate_Roleframe_Event

W_C=

G_C=N

E_C=

W_E=love

G_E=N

E_E=

DEF={emotion|:CoEvent={like|}}

W_C=

G_C=V

E_C=

W_E=cherish

G_E=V

E_E=

DEF={like|}

1234

1/

2DEF={fact|:CoEvent={fight|},domain={military|}}

3{fight|}{HaveContest|:agent={*},instrument={*},partner={*},cause={*}};

{HaveContest|:coagent={*},instrument={*},cause={*}}

4{fight|} {HaveContest|:agent={human|}{group|->},

instrument={weapon|},

partner={human|}{group|->},cause={*}};

CoEventDEF{fight|:domain={military|}}

W_C=

G_C=N

E_C=

W_E=environmental sanitation

G_E=N

E_E=

DEF={fact|:CoEvent={clean|:patient={Environment|:host={entity|}}}}

W_C=

G_C=N

E_C=

W_E=table tennis tournament

G_E=N

E_E=

DEF={fact|:CoEvent={compete|},domain={TableTennis|}}

6.3

0.009091

0.061538

0.380000

0.450000

DEF={emotion|:CoEvent={love|}}

DEF={fight|:domain={military|}}

DEF={FondOf|}

DEF={fact|:CoEvent={love|},modifier={first|}}

DEF={fact|:CoEvent={love|}}

[1] 1980pp60-61

[2] 1996pp29-58

From Frame to Subframe: Collocational Asymmetry in Mandarin Verbs of Conversation

Mei-Chun Liu Chun Edison Chang

Graduate Institute of Linguistics and Cultural Studies

National Chiao Tung University, Hsinchu 30050, Taiwan

[email protected] [email protected]

Abstract

This paper examines the collocational patterns of Mandarin verbs of conversation and proposes that a finer classification scheme than the flat structure of frames is needed to capture the semantic granularity of verb types. The notion of a subframe is introduced and utilized to explain the syntactic-semantic interdependencies among different groups of verbs in the conversation frame. The paper aims to provide detailed linguistic motivations for distinguishing subframes within a frame as a semantic anchor for further defining near-synonym sets.

1. Introduction

As the importance of lexical semantic research grows with the need to represent human knowledge, various lexically-based information networks have been proposed, such as the comprehensive work of differentiating word senses and sense relations in WordNet (Miller et al. 1990), the ontological hierarchy in SUMO (Das et al. 2002, Pease et al. 2002, Niles and Pease 2003), and the more linguistically-motivated model of FrameNet (Baker, Fillmore and Cronin 2003). While all of them provide valuable information regarding certain aspects of word meaning, the first two are constructed in a more intuitive manner. FrameNet, on the other hand, is based on the theory of Frame Semantics (Fillmore and Atkins 1992) and attempts to define meaning within a set of shared knowledge or background information, that is, a frame. However, as pointed out in Liu and Wu (2003), if meaning is anchored in the notion of frame, then we need independent motivations for postulating different frames. What seems to be lacking in the current framework is a cognitive linguistic explanation of how the individual frames are distinguished and interrelated. In other words, what are the semantic relations among all the frames? To answer this question, Liu and Wu (2003) proposed an overarching conceptual schema which incorporates all the core frame elements (FEs) and accounts for the interrelationship among various frames in the communication domain. With a cognitive schema as a macro-structure, the distinction of frames is well-motivated. However, there still remains another issue at the micro-level, as indicated by Liu and Wu (2003):

Within each frame, a wide range of verbs are found and one would wonder how these verbs differ from each other. For example, English verbs speak, discuss, quarrel, and gossip, are all found in the Conversation Frame, but obviously, these lemmas encode something different. What are the differences? There seem to be frame-internal features that also need to be characterized.

In this paper, we will show that within each frame, a more elaborated classification system is needed to account for a variety of verb behaviors. The notion of subframe is introduced and utilized to capture the syntactic-semantic interdependencies observed in the corpus data.

2. Motivation for the Conversation Frame

Compared with the other communication frames, the Conversation Frame is unique in that it profiles the property of reciprocality, or two-way communication. Verbs in the Conversation frame encode reciprocal events in which participants are involved as Interlocutors, such as tan 'talk', tanlun 'converse', taolun 'discuss', shangliang 'discuss', xietiao 'negotiate', xieyi 'negotiate', goutong 'communicate', chaojia 'quarrel', zhenglun 'argue', xianliao 'chat', and liaotian 'chat'. These verbal events highlight part of the conceptual schema discussed in detail in Liu and Wu (2003) and represented in (1) below. The core Frame Elements (with bold fonts and grayed areas) help define the frame as a bidirectional communicative activity conducted by both the Speaker and the Addressee as Interlocutor 1 and Interlocutor 2 (or Interlocutors), via a certain Medium, on a given Topic.

(1) Conceptual schema for the Conversation frame:

[Diagram labels: Encoding, Decoding, Noise]

The Conversation frame proves to be well motivated in relation to other communication frames, as most of its verbs share the same conceptual schema and realize similar constructions in coding the core frame elements. There is, however, a fundamental question to be answered: within the Conversation frame, are there semantic subtypes that are also syntactically motivated?

3. Motivation for Distinguishing Subframes

As mentioned above, verbs of conversation involve a set of core Frame Elements: Interlocutor1, Interlocutor2 (or combined as Interlocutors), Topic, and Medium. In most cases, the default Medium is face-to-face when not overtly mentioned, as in the following sentence: 'They are talking/discussing/exchanging views about/arguing about/chatting about the meaning of life.' Intuitively, these different lemmas seem to encode differences in manner, formality, or purpose, while sharing the same topic. But what are the grammatical correlates of the lexicalized meaning differences? Looking closely at their collocational patterns, we found asymmetrical distributions in five respects: 1) V+V pattern: some verbs may occur with a preceding verb such as jinxing 'proceed' or dacheng 'achieve'; 2) V+NP pattern: the core element Interlocutor2 may sometimes be coded as the direct object; 3) Metonymic subject: the subject of the event may be an inanimate entity taking the role of Interlocutor by the principle of metonymy; 4) V+Complement: some verbs take a postverbal complement or adverbial adjunct denoting effect evaluation, such as chenggong 'succeed' or shibai 'fail'; and 5) Nominalization: in terms of the distribution of grammatical functions, the verbs show different frequencies of nominalization. Based on these five criteria, verbs of conversation can be further divided into five groups with corresponding sets of near-synonyms. We address the syntactic-semantic interdependencies revealed by each pattern in the following sections.

3.1 V+V pattern: with the preceding verbs jinxing or dacheng

The use of the light verb jinxing entails a formal register and encodes a process or atelic event. It tends to occur with activity verbs that are compatible with the formal register and involve a durative process, as shown in (2):

(2)a. //

b. * /////

Below is the distributional tendency found in Sinica Corpus:

(3) Percentage with jinxing

V1 \ V2 |  |  | Other verbs
jinxing | 4% (3/83) | 6% (25/419) | 0%

Another verb, dacheng 'achieve', is also compatible with some conversation verbs; it requires a formal register but encodes a telic event. The verb dacheng is found only with the nominalized forms of certain verbs, i.e., activity verbs entailing a semantic endpoint with an incremental theme, as shown in (4):

(4) a. //

b. * / /////

Co-occurrence with the preceding verbs jinxing or dacheng thus serves to distinguish conversation events in terms of their pragmatic mode (formal vs. informal) and event type (telic vs. atelic).
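The two diagnostics in this section can be read as a small feature test. The following is a minimal sketch only: the `VerbFeatures` class, its three features, and the sample feature settings are hypothetical illustrations chosen by hand, not annotations taken from the Sinica Corpus.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VerbFeatures:
    """Hand-assigned features for one conversation verb (hypothetical)."""
    formal: bool        # compatible with the formal register
    durative: bool      # denotes a process that extends over time
    has_endpoint: bool  # entails a semantic endpoint (telic reading)


def preceding_verbs(v: VerbFeatures) -> List[str]:
    """Return which of jinxing / dacheng the features would license."""
    licensed = []
    if v.formal and v.durative:
        licensed.append("jinxing")   # formal register + durative process
    if v.formal and v.has_endpoint:
        licensed.append("dacheng")   # formal register + telic event
    return licensed


# Illustrative feature settings (assumptions, not corpus annotations).
discuss_like = VerbFeatures(formal=True, durative=True, has_endpoint=False)
negotiate_like = VerbFeatures(formal=True, durative=True, has_endpoint=True)
chat_like = VerbFeatures(formal=False, durative=True, has_endpoint=False)

print(preceding_verbs(discuss_like))    # ['jinxing']
print(preceding_verbs(negotiate_like))  # ['jinxing', 'dacheng']
print(preceding_verbs(chat_like))       # []
```

Keeping register and event type as separate features lets a verb be licensed by both preceding verbs, by one, or by neither.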

3.2 V+NP pattern: Interlocutor2 as the Direct Object

Another pattern that sets the verbs apart concerns the semantic role of the object NP. While most verbs can take only the Topic as the direct object, some verbs may encode Interlocutor2 as the direct object without an associative marker, as shown in (5):

(5) a. []Intl1 []Intl2

b. [] Intl1 [] Intl2

This suggests that with these verbs, the co-participant, Interlocutor2, may be viewed as the undergoer or the affected target of the event. Among the Sinica Corpus sentences in which these verbs are followed by an object NP, an average of 23 percent take Interlocutor2 as the direct object:

(6) Percentage with Interlocutor2 as DO

DO \ V |  | Other verbs
Interlocutor2 as DO | 23% (28/123) | 0%

3.3 Use of Inanimate Subject

Interlocutors in conversation events are by default human participants. However, some verbs may take inanimate subjects (place or institution names) as Interlocutors via metonymic extension from institution/building to human organization:

(7)a. ///

b. */*

The application of metonymy tends to be associated with verbs that comply with the formal register and also require a formal, non-personal topic (e.g., public affairs). The Sinica Corpus shows that verbs with marked manners tend not to be used with metonymic subjects:

(8) Percentage with Inanimate Subject:

Subj. type \ V |  |  |  | Other verbs
Inanimate subject | 2% (4/191) | 6% (5/83) | 25% (21/85) | 0%
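The percentages in table (8) follow directly from the raw counts given in parentheses. A minimal sketch of that arithmetic, in which the helper name `percentage` is an illustrative choice and rounding to whole percent is an assumption about how the figures were reported:

```python
def percentage(hits: int, total: int) -> float:
    """Share of corpus instances showing the pattern, in percent."""
    return 100.0 * hits / total


# (instances with an inanimate subject, all instances), from table (8)
counts = [(4, 191), (5, 83), (21, 85)]

rounded = [round(percentage(hits, total)) for hits, total in counts]
print(rounded)  # [2, 6, 25]
```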

3.4 Postverbal Complement with Effect Evaluation

Among the conversation verbs, only the negotiate verbs may collocate with an effect-evaluation complement such as 'successfully' or 'failingly', as shown below with examples and percentages from the Sinica Corpus:

(9) a. []

b. []

(10) Percentage of result evaluation complement:

Comp. \ V | Negotiate verbs | Other verbs
 | 12.5% (2/16) | 0%

Co-occurrence with an effect-evaluation complement indicates that the semantics of negotiate events encodes some kind of effect or result that is sought through the negotiation process. This also implies that the two-way communication in negotiate events is a solution-seeking process that is semantically bounded and may be evaluated as to whether the solution or purpose has been achieved.

This pattern also correlates with the use of dacheng 'achieve', as mentioned above, in signaling an evaluation of the attainment of the desired result.

3.5 Frequency of Nominalization

Some groups of verbs tend to be nominalized more frequently than the others. Comparing the high-frequency verbs and their distributions over grammatical functions, we see clear skewing in nominal uses:

(11) Distribution of Predicate vs. Nominal Uses

Func. \ V |  |  |  |  |
Predicate | 97% (680/701) | 52% (83/161) | 55% (415/1013) | 76% (123/162) | 94% (134/142)
Nominalized | 3% (21/701) | 48% (78/161) | 45% (598/1013) | 24% (39/162) | 6% (8/142)

Nominalization turns verbs into event nominals that may be referred to as quantifiable entities. Nominalization is also highly correlated with the formal register of written texts.

4. Subframes as an Anchor for Near-synonyms

The asymmetrical distributions of the conversation verbs over different collocational associations clearly suggest that verbs can be further divided into subtypes. Although sharing the same conceptual frame, subclasses of verbs show distinct patterns of syntactic-semantic interdependencies that may serve as the basis to further define near-synonym sets. These subtypes may be viewed as anchored in different subframes. Below is a summary of the collocational patterns associated with the 5 subframes within the conversation frame:

(12) Collocational Patterns associated with the Conversation Subframes

Subtypes \ CP | jinxing +V | dacheng +V | Intl2 as DO | Inanimate Subject | Complement | [+Nom]
1. Converse | No | No | No | Yes | No | Low
2. Discuss | Yes | No | No | Yes | No | High
3. Negotiate | Yes | Yes | Yes | Yes | Yes | High
4. Quarrel | No | No | No | Yes | No | Mid-High
5. Chat | No | No | No | No | No | Low

Based on the distributional variations of collocational patterns, we can group all the other conversation lemmas into the five subframes:

1) Converse subframe: , , , , , , , ,,,,

2) Discuss subframe: , , , ,

3) Negotiate subframe: , , , , , , ,

4) Quarrel subframe: , , , , ,

5) Chat subframe: , , , , .
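The summary in (12) can be treated as a feature matrix against which an observed collocational profile is matched. The sketch below is one possible encoding under stated assumptions: the pattern names in `PATTERNS` are illustrative labels (in particular, reading the two "+V" columns as jinxing and dacheng, in the order of Section 3.1, is an assumption), and `matching_subframes` is a hypothetical helper, not part of the authors' method.

```python
from typing import Dict, List, Tuple

# Column order follows table (12); the names are assumed labels.
PATTERNS = ("jinxing_V", "dacheng_V", "intl2_as_do",
            "inanimate_subject", "evaluation_complement")

# (pattern values, nominalization level) per subframe, copied from table (12).
SUBFRAMES: Dict[str, Tuple[Tuple[bool, ...], str]] = {
    "Converse":  ((False, False, False, True,  False), "Low"),
    "Discuss":   ((True,  False, False, True,  False), "High"),
    "Negotiate": ((True,  True,  True,  True,  True),  "High"),
    "Quarrel":   ((False, False, False, True,  False), "Mid-High"),
    "Chat":      ((False, False, False, False, False), "Low"),
}


def matching_subframes(values: Tuple[bool, ...], nominalization: str) -> List[str]:
    """Return the subframes whose row in (12) equals the observed profile."""
    return [name for name, (row, nom) in SUBFRAMES.items()
            if row == tuple(values) and nom == nominalization]


print(matching_subframes((True, True, True, True, True), "High"))     # ['Negotiate']
print(matching_subframes((False, False, False, True, False), "Low"))  # ['Converse']
```

In this encoding Converse and Quarrel share the same five yes/no values and are separated only by nominalization frequency, which is why the lookup takes both inputs.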

5. Concluding Remarks and Theoretical Implications

With the proposal of subframes within the theoretical construct of Frame Semantics, verb meanings may be defined with finer distinctions that are syntactically motivated. However, further fine-grained semantic distinctions are still needed to differentiate near-synonyms within each subframe. It is exactly at the subframe level that we may anchor all the near-synonym sets as closely related. In sum, to fully represent the meaning relations among verbs, we would like to propose the following classificational scheme for representing verb meanings:

Domain -> Frame -> Subframe -> Near-synonym Set -> Lemma

This five-layered structure allows verbs to be represented in a frame-based semantic hierarchy with detailed lexical information to further disambiguate near-synonyms.
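The five layers can be pictured as a nested mapping in which each lemma has a single path from domain to lemma. The sketch below is illustrative only: the near-synonym set names are hypothetical, and the placement of the romanized lemmas is an assumption based on their English glosses in Section 2, not on the authors' lemma lists.

```python
from typing import Dict, List, Optional, Tuple

# Domain -> Frame -> Subframe -> Near-synonym set -> [lemmas]
Hierarchy = Dict[str, Dict[str, Dict[str, Dict[str, List[str]]]]]

HIERARCHY: Hierarchy = {
    "Communication": {
        "Conversation": {
            # Set names and lemma placement are illustrative assumptions.
            "Negotiate": {"negotiate-set": ["xietiao", "xieyi"]},
            "Quarrel": {"quarrel-set": ["chaojia", "zhenglun"]},
            "Chat": {"chat-set": ["xianliao", "liaotian"]},
        }
    }
}


def path_to(lemma: str) -> Optional[Tuple[str, str, str, str, str]]:
    """Return the five-layer path for a lemma, or None if it is not listed."""
    for domain, frames in HIERARCHY.items():
        for frame, subframes in frames.items():
            for subframe, synsets in subframes.items():
                for synset, lemmas in synsets.items():
                    if lemma in lemmas:
                        return (domain, frame, subframe, synset, lemma)
    return None


print(" -> ".join(path_to("chaojia")))
# Communication -> Conversation -> Quarrel -> quarrel-set -> chaojia
```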

References

Baker, Collin F., Charles J. Fillmore, and Beau Cronin. 2003. The Structure of the FrameNet Database. International Journal of Lexicography 16(3). 281-296.

Das, Subrata, Kurt Shuster, and Curt Wu. 2002. Ontologies for Agent-Based Information Retrieval and Sequence Mining. Proceedings of the Workshop on Ontologies in Agent Systems (OAS02), held at the 1st International Joint Conference on Autonomous Agents and Multi-Agent Systems, Italy, July 15-19.

Fillmore, Charles J., and Beryl T. Atkins. 1992. Toward a Frame-Based Lexicon: The Semantics of RISK and Its Neighbors. Frames, Fields, and Contrasts, ed. by Adrienne Lehrer and Eva Feder Kittay. 75-102. Hillsdale, New Jersey: Lawrence Erlbaum.

Kennedy, C., and B. Levin. 2002. Telicity Corresponds to Degree of Change. Handout of a talk at Georgetown University.

Liu, Mei-Chun, and Yiching Wu. 2003. Beyond Frame Semantics: Insight from Mandarin Verbs of Communication. Paper presented at the 4th Chinese Lexical Semantics Workshop. City University of Hong Kong, Hong Kong. June 22-July 11. (http://icl.cityu.edu.hk/conference/4CLSW/BIG5/home.htm)

Miller, G. A., R. Beckwith, C. Fellbaum, D. Gross, and K. J. Miller. 1990. Introduction to WordNet: An On-line Lexical Database. International Journal of Lexicography 3. 235-244.

Niles, Ian, and Adam Pease. 2003. Linking Lexicons and Ontologies: Mapping WordNet to the Suggested Upper Merged Ontology. Proceedings of the 2003 International Conference on Information and Knowledge Engineering (IKE03), Las Vegas, Nevada, June 23-26, 2003.

Pease, A., I. Niles, and J. Li. 2002. The Suggested Upper Merged Ontology: A Large Ontology for the Semantic Web and its Applications. Working Notes of the AAAI-2002 Workshop on Ontologies and the Semantic Web, Edmonton, Canada, July 28-August 1, 2002.

Website Addresses:

FrameNet: http://www.icsi.berkeley.edu/~framenet/

HowNet: http://www.keenage.com/html/c_index.html/

Sinica Corpus: http://www.sinica.edu.tw/SinicaCorpus/

SUMO Ontology: http://ontology.teknowledge.com/

WordNet: http://www.cogsci.princeton.edu/~wn/

The Academia Sinica Bilingual Ontology WordNet (BOW): http://bow.sinica.edu.tw/


Li & Huang (1999)52716780014.8%(1a)(1b)(1c)(1d)(1e)

(1)a.

b.

c. d.

e.

(Word Sense Disambiguation)

Li & Huang (1999)Lam et al. (1997)45.5%(1999)52.13%Chen et al. (1999)

Choueka et al. (1983)(2002)Chuang (2003)setAhrens et al. (2003)MARVS(sense)(meaning facet)

(1)(1d)(1e)

(2)a.E-Mail

b.

(3)a.b.

(4)a.

b.

(2003)WASPS(http:// wasps.itri.bton.ac.uk/)(word sketch)

1.

2.

3.

1334

DOIORCDCIO

DO

-

IO

-

-

-

-

-

RC

DC

DO(5a-b)(6a-b)DCDOIOIO(7a)7(b)RC(8a)(8b)

(5)a.

b.

(6)a.b.

(7)a.

b.

(8)a.

b.

(9)a.

b.

(1d-e)

1334 (100%) | 629 (47.16%) | 530 (39.73%) | 124 (9.3%) | 37 (2.77%) | 9 (0.67%) | 5 (0.37%)

28.26%

8.41%

19.56%

0.29%

+L+VP

3.45%

+IO

10.49%

+DO+C

0.22%

+L+VP

2.5%

+IO+DO

5.7%

0.07%

+L

1.35%

+IO

1.8%

+L

0.9%

+IO+DO

1.35%

+L+DO

0.07%

+IO+VP

0.15%

+IO+VP

0.07%

+IO

0.07%

+DO+L+IO

0.07%

71.74% | 71.74% | 38.78% | 20.16% | 9% | 3% | 0.66% | 0.37%
+C | 35.99% | 33.36% | 1.28% | 0.68% | 0.38% | 0.22% | 0.07%
+DO | 16.49% | 2.2% | 13.1% | 0.82% | - | 0.37% | -
+DO+VP | 8.41% | 0.68% | 0.9% | 5.4% | 1.43% | - | -
 | 4.35% | 1.35% | 1.95% | 0.68% | - | 0.07% | 0.3%
+DO+L | 1.88% | 0.75% | - | 1.13% | - | - | -
+IO+VP | 1.65% | 0.3% | 1.35% | - | - | - | -
+DO+IO | 1.50% | 0.07% | 1.43% | - | - | - | -
+VP | 0.74% | - | - | 0.22% | 0.52% | - | -
+DO+L+VP | 0.51% | - | - | 0.07% | 0.45% | - | -
+DO+IO+VP | 0.22% | 0.07% | 0.15% | - | - | - | -

(10)a.

b.

c.

d.

e.f.

(11)a.

b.

c.

d.

(12)a.b.

c.

d.

(13)a.

b.

c.

d.

e.

(14)a.

b.

(15)a.E-Mail

b.

(16)a.b.

(17)a.

b.

c.

629 (100%)

530 (100%)

124 (100%)

37 (100%)

+C 70.6%

+DO 33%

+DO+VP 58%

+DO+VP 51.4%

+L+VP 7.3%

+IO 26.4%

+DO+L 12%

+VP 18.9%

+L+VP 5.2%

+IO+DO 14.3%

+DO 9%

+DO+L+VP 16.2%

+DO 4.6%

4.9%

7.3%

+C 13.5%

2.9%

+IO 4.5%

+C 7.3%

+L 2.9%

+DO+IO 3.6%

+DO+C 2.4%

+L 1.9%

+IO+DO 3.4%

+VP 2.4%

+DO+L 1.6%

+IO+VP 3.4%

0.8%

+DO+VP 1.4%

+C 3.2%

+DO+L+VP 0.8%

9 (100%)

+IO+VP 0.6%

+DO+VP 2.3%

+DO 55.6%

+DO+IO 0.2%

+DO+IO+VP 0.4%

+C 33.3%

+DO+IO+VP 0.2%

+IO+VP 0.4%

11.1%

+DO+L+IO 0.2%

IO 0.2%

5 (100%)

+IO+VP 0.2%

80%

+L+DO 0.2%

+C 20%

1.

(18a)(18b-c)(19a)(19b)

(18)a.

b.

c.

(19)a.

b.

(20)a.

b.*

(20)

2.

26.4%4.5%14.3%3.4%

(21)a.b.

3.

(22a)(22b)(19a)(21b)

(22)a.

b.

4.

(23a)(23b)

(23)a.

b.

[location]

Goal [destination]

Goal [organization]

Goal [direction]

[human]

Goal [departure]

Goal [destination]

Goal [activity]

[receipient]

[life]

(24)a.b.

c.

(25)a.b.

(26)

(27)a.

b.*

c.

(28)a.

b.

(29)a.

b.

c.

(30)

2001

2002CLCLP 7.2, 77-88.

1980

1999

2001

2003

20031220

1983

Ahrens, K., C.-R. Huang, and Y.-H. Chuang. 2003. Sense and meaning facets in verbal semantics: A MARVS perspective. Language and Linguistics 4.3: 469-484.

Chen, H.-H., G.-W. Bian, and W.-C. Lin. 1999. Resolving translation ambiguity and target polysemy in cross-language information retrieval. CLCLP 4.2, 21-38.

Choueka, Y., and S. Lusignan. 1983. A connectionist scheme for modeling word sense disambiguation. Cognition and Brain Theory 6.1, 89-120.

Chuang, Y.-H. 2003. Sense Distinction of Verbs in English and Mandarin Chinese: An Analysis of the Verbs Set and Bai3. MA thesis. National Taiwan University.

Lam, S.-S., K.-F. Wong, and V. Lum. 1997. LSD-C: A linguistics-based word sense disambiguation algorithm for Chinese. Computer Processing of Oriental Languages 10.4, 409-422.

Levin, B., and M. Hovav. 2001. What alternates in the dative alternation? CSSP.

Li, J., and C. Huang. 1999. A model for word sense disambiguation. CLCLP 4.2, 1-20.


2347

1

1998

1988119,50713,768

7,1052,347

2

2347

2.1

1

2

2.2

10

1

Count | (%)
616 | 31.21
437 | 22.14
381 | 19.30
182 | 9.22
145 | 7.35
107 | 5.42
73 | 3.70
16 | 0.81
12 | 0.61
5 | 0.25

31.2110NPNPVPVP

2.3

2.4

2

Count | (%)
146 | 23.70
73 | 11.85
70 | 11.36
54 | 8.77
37 | 6.01
36 | 5.84
35 | 5.68
25 | 4.06
20 | 3.25
17 | 2.76
11 | 1.79
9 | 1.46
8 | 1.30
7 | 1.14
7 | 1.14
6 | 0.97
5 | 0.81
4 | 0.65
4 | 0.65
4 | 0.65
3 | 0.49
3 | 0.49
3 | 0.49
2 | 0.32
2 | 0.32
2 | 0.32
2 | 0.32
2 | 0.32
2 | 0.32
2 | 0.32
2 | 0.32
1 | 0.16
1 | 0.16
1 | 0.16
1 | 0.16
1 | 0.16
1 | 0.16
1 | 0.16
1 | 0.16
1 | 0.16
1 | 0.16
1 | 0.16
1 | 0.16
1 | 0.16

14623.707311.857011.36

3

Count | 386 | 371 | 279 | 85 | 58 | 35 | 19
(%) | 31.31 | 30.09 | 22.63 | 6.89 | 4.70 | 2.84 | 1.54

384.02

2.5

48290

3 3.1

1998

(interaction)(Tenor) (Vehicle)2004 A B B(target domain)A(source domain)(ground)

3.2

A B B A

(Love is a journey)

A B

Target domain Source domain

Tenor Vehicle

(

F: A ( B

F: B(A

3.3

1998(dead and buried)(inactive metaphor)

1998

1)

[]

[]

[]

2)

[]

[]

[]

[]

[]

[]

[]

3.4

2003

1)

2)

3)

1

(

2

(

3

(

4

(

5

(

6

(

2

[]

[]

[]

3.5

(handles)

4

(1988)

(2004)http://www.skycn.com

(1998)6P10-19

(2003)352P119-121

(2003)4P19-2420042

(2004)

(2000)

(2002)

(1985)


X

1.

1.1

(1980[2003])1998

(1980[2003])

1998

(1980[2003])1998

1.2

(2004)(1)

(1)//

(2)

(2)//

( 1997)(1)-(2)

(2000)(3)

(3)()

(4)

vadj4217117

1612(V)(5)

(5)()

(6)-(8)

(6)

(7)

(8)

(8)

A

1

(B)

2

C

3

D

4

42

66

AB(C)(D)(9)-(13)

(9)()

(10)()

(11)()

(12)()

(13)()

2.

2.1-2.3

2.1

(2002)

(2002)

A

B

C

D

(2002)

(14)(A)

(15)(B)

(16)(C)

(17)(D)

2.2-2.3

2.2

(18)()

(19)()

(18)////(19)

(20)()

(20)

(21-23)

(21)()

(22)()

(23)()

(1992)

(24)-(25)

(24)()

(25)()

15(26)

(26)

24(27)(28)

(27) ()

(28)

(27)(28)D4(17)(29)(30)

(29)()

(30)

(31)()

2.3

3.

3

7

1

adv

38

10

12

3

adv

()

1

adv

6

1

adv

1

5

2

v

1

11

v

1

adv

3

adj

17

17

3

2000

66

15

16

0

42

0

4

;

1

2

adv

44

5

4

3

3

15

adv

5

1

adv

4

adv

3

1

138

56

6

4

3

5

15

1

0

A.

B.

C.2

D.

. 1997.

2004

. 2003.2:1-561-73

. 2002.

. 1998.

1992.

. 2003.19805

(D-Type Theory)

(czs,zhouq)@s1000e.cs.tsinghua.edu.cn

.-

1

Montague[11]Categorical Grammar [12]ete te e tNPSV1V2MontagueCategorical Grammar

[2] LOC(),TIM(),IND(),RELn (n),SIT(),INF(),TYP(),PAR(),POL()

/ [2][1][3]/[1][3]sssss /thesaurus(sort,class) (case, theme)[2]

(G-type),(D-type/O-type/)[7]

[7]/([7])//-//

/-2.3. .4.(). 5..

2.

2.1

/

[7]

4

(

2.2

r,l:Loc,I1,,In;p.

/--

. //((

3.

3.1

3.1. T1,T2

xx : T1x : T2T1T2T1T2T2T1.

. T1T2T1T2 T1=T2. T1=T2((T1=T2)T1T2.

T1>T2T1T2T1T2. T2T2T1T2T1T2T1=T2T1,T2T1T2(

( (( , )

3.2T1,T2T3T1 T3,T2 T3T1,T2T1T2T3T1,T2

(

3.3.T,T{Ti | TiT},T{Ti | TTi},T(

T (T, )TTT

.3.4.T1,T2T3T3 T1,T3 T2T1,T2T1T2T3T1,T2T1#T2.

(

3.5. xxVx{T| x : T}x(

xVx/

xVx =[x]. xx

(( ,)((,, /

3.2

A1,A2,,An. [9][m_][m_][m_][m_][m_][m_][m_][m_][m_][9]

3.6.`T1,T2T1T2,T1 T2,T1+T2,T1(T2

( : T1T2a1 : T1,a2 : T2 (=

( : T1T2a1 : T1,a2 : T2 (={a1,a2}

( : T1 + T2( : T1 ( : T2

{a1,a2}

( : T1 ( T2,((( : T1), ( : T2./*, ( : T1,( : T2. */

(

T1T2Tn,T1T2Tn,T1(T2(( Tn ,T1+ T2++ Tn.T1,T2,,Tn()

( :: T1T2a1 :: T1,a2 :: T2 (=

( :: T1T2a1 :: T1,a2 :: T2 (={a1,a2}

T1,T2 ,T1,T2T1T1, T2 T2T1T2 T1T2,T1T2TT2, T1+ T2 T1+ T2T1 ( T2 T1 ( T2.

(

nn(TT = T1T2TnT1T2Tnn/

T=T1T2 Tm, (TTi(i=1,,m)m-T=T1 + T2+ + Tk(TTk(T(T1 + (T2+ + (Tk, (T=(T1 + (T2+ + (TkT=T1 ( T2 ( ( Tk(TTk(T(T1 ( (T2 ( ( (Tk, (T=(T1 ( (T2 ( ( (Tk.

3.7. T1,T2T1 T2T1T2f :T1 T2f:a ( ba:T1 b:T2.(

3.3

/[7]

[7]

.

((()( (:T((()((T)(T((T).( = [(].()c_(T)[c_(T)]TT=[m_][m_]set[m_]set[m_]((_T.

( =r,(;p,( r((:T, TrrR_T(r),Tr. : r-T = R_T(r)TrrT

(r,(,p ((r,( ;p)( D([r]TP), PpD(),D(rTP)

-

R_T(r)rR_T-R_T=(R_T(r)|r:REL)= (Tr|r:REL)r:RELr

.((()(::T1,((()[(](T)=T1(T).

R_TR_T(r)-F_T(()(F_T(()T(. F_T

F_T=(F_T(()|(:FUN)=(T(|(:FUN)

.x|(,I)I[7]I

C=(,I),CTC

TC=T1T2TmD(T1)D(T2)D(Tn)

(#)

T1,T2,,Tm, D(T1),D(T2),,D(Tn ) I

xT,x|(,I)[x|(,I)] = T|TC,TC(#)

[7]

x|C[x|C].

C = {(T1,x1;1,(T2,x2;1},x=x1,x2;

[x1,x2|C]= T1T2, [bag(x1,x2|C)]= T1T2;

C = {(T1,x;1or(T2,x;1}, [x|C]= T1+ T2;

C = {if(T1,x;0then(T2,x;1}, [x|C]= T1( T2

-

r([r],[(]r=[r], (=[(].

3.8.T

(T(T((T)((T),([(](=T( ( (T()(=T( (( T()

rTr TD(rTP)PD(T).

(1(T1),, (m(Tm), D(T1),,D(Tn), T(1(T1)(m(Tm)D(T1)D(Tn), T|(1(T1)(m(Tm)D(T1)D(Tn)

(

1 (T1,T2,T( (()T1T2,((T1)((T2);T1=T2

2 (1,(2, (1>>

72AAAABABADAD> > >60%

7314413014130

BB5004>HH3509>EE2609>DD2556>EB2530>ED2105>HD1979>HB1381>BD1041>IB1026>II966>FB912>DB842>EA822>EH794>ID770>CC719>IH711>AA670>HI578>CB573>BK572>GG560>JD539>KH470>DA469>BC466>HE453>HA446>CD444>FF442>EI437>FH395>IE390>HJ389>DH377>KD375>HC365>DC344>GD331>BE309>AD301>JB289>KE284>HF279>JK279>EC272>EK267>KK264>FI263>EL262>JJ261>HK258>BH243>BA240>HG239>EG234>IF229>GH215>KJ208>DE208>IC203>BI203>EF202>KI199>FD179>JE175>EJ174>JI167>KG167>DK159>KC158>FK155>KB153>FE151>CH144>IJ138>IK134>GE130>IA121>DI120>DG108>GK107>KA102>JC100>AH99>AB99>GI95>JG93>JA91>FJ88>CA86>CE86>AK80>FC74>IG72>AE71>CI67>GJ66>KF64>GB64>GA56>DH55>GF54>AC50>FG48>FA46>JF40>BG28>BJ25>DF25>CJ24>AI23>GC19>CF17>AG16AJ16>CG14>LK6>LL5>LE4HL4AF4>LD3>KL2LG2>GL1LJ1IL1LI1

BBHHEEDD

74B21189>H17242>D16025>E15685>I7928>K5381>C5281>A4604>F4223>J3634>G3339>L5612BL

75E10763>H9928>B8267>D5325>I4788>K2808>F2750>C2296>J2176>G1735>A1433>L3774EEB12916> D10697> H7314>E4918>A3167>I3139>C2985>K2573>F1473>J1458>G1604>L1974BDHEA53%27%13%92%

76A

[1] 200331-10

[2]200272129-142

[3]

[4] 2002

[5]199619962003

19884

1996

2001

19853

1985

19982

20001

20016

1991

19957

1999,4


100083


semantic componentsememe2050(lexeme)(semantic component)(semantic marker )(sememe) (semantic feature)

(Jakobson)(Hjelmsev )Katz ,Lakoff, McCwley ,Ross Dowty(2000)

S

Cause Zhangsan S

Become S

Not S

Alive Lisi

S

Cause Zhangsan S

Become S

Not alive Lisi

S

Cause Zhangsan S

Become Not alive Lisi

S

Cause become not alive zhangsan Lisi

KILL

S

KILL ZHANGSAN LISI

KILLKILL(BECOME)DEAD(BECOME NOT) ALIVE

1968

human,adult,male

1)

(2002)

+++++

+++

converge+++

++++++

+(-)+(-)

++

+++

converge++

2

1957

(Colorless green idea sleeps furiously)

++++++

3

2000

a1 2 3 123

b1 342 1234

c3 4 1234

1 2

abc

+++

+++

+++

++{}+

++{}

++{}+{}

{} {}{}

++++++ ++ +

+++++++

{} {}{}

{} {}{}

+++++{}+++{}

++++{}+

{} ++{}+

abb+a

{} ({}{}

{} {}{}

++++{}+++{}

1

{}++{}+++{}

{}( {}({}

2000

2000

2002

19961http://yywz.jhun.edu.cn/xide.htm

Feature Representations and Logical Compatibility between Temporal Adverbs and Aspects

Shih-Min Li, Su-Chu Lin, Keh-Jiann Chen

CKIP, Institute of Information Science, Academia Sinica, Taipei

{shihmin, jess}@hp.iis.sinica.edu.tw

Abstract

In this paper, we propose clear-cut definitions to distinguish temporal adverbs and provide descriptive features for each class of temporal adverbs. Adopting a corpus-based approach and measuring time points on the temporal axis, we revise the temporal adverbs listed in Lu & Ma (1999) and reclassify them into four main classes, namely time, frequency, duration, and time manner. The descriptive features suffice to discriminate temporal relations and to predict logical compatibility between temporal adverbs and aspects.

1 Introduction

There are about 130 temporal adverbs in Mandarin Chinese. Lu & Ma classify the temporal adverbs into two groups: speaking-time related adverbs (abbr.: ST-related adverbs) and reference-time related adverbs (abbr.: RT-related adverbs). The ST-related adverbs consist of 27 temporal adverbs, which are subdivided into three subclasses. In the class of RT-related adverbs, 104 temporal adverbs are listed and subdivided into 18 subclasses. Lu and Ma's subdivision of temporal adverbs is based upon aspects of situations. However, the subdivision is vague and the definitions are ambiguous. For example, cengjing, ceng, yeyi, and yejing are grouped into two different subclasses of ST-related adverbs. The former two, cengjing and ceng, are grouped into one subclass, which expresses that the actions or situations existed or happened before the speaking time. The latter two, yeyi and yejing, are grouped into another subclass, which indicates that the actions or situations have been completed or have occurred. In fact, it is difficult to differentiate actions or situations that have happened from those that have been completed, especially when the situation type is achievement with SHORTLY-PRECEDE(t1,t2) or NEARLY-EQUAL(t1,t2). Moreover, temporal adverbs may not show the same syntactic behaviour even though they are classified into the same subclass. For instance, the ST-related adverbs cong, conglai, zhijin, xianglai, sulai, lilai, su, and yixiang are grouped into the same subclass. When co-occurring with the aspect markers le, guo, and zhe, cong, conglai, and zhijin are incompatible with le and zhe, whereas xianglai, sulai, lilai, su, and yixiang are incompatible with le and guo. The cause of this difference in the compatibility of temporal adverbs with aspects is also discussed.
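The incompatibilities just described amount to a small lookup table. The sketch below records only what the paragraph states; the names `INCOMPATIBLE` and `markers_not_ruled_out` are illustrative, and treating every marker that is not listed as incompatible as still available is an assumption, not a claim made in the text.

```python
ASPECT_MARKERS = {"le", "guo", "zhe"}

# Adverb -> aspect markers it is reported to be incompatible with.
INCOMPATIBLE = {}
for adverb in ("cong", "conglai", "zhijin"):
    INCOMPATIBLE[adverb] = {"le", "zhe"}
for adverb in ("xianglai", "sulai", "lilai", "su", "yixiang"):
    INCOMPATIBLE[adverb] = {"le", "guo"}


def markers_not_ruled_out(adverb: str) -> list:
    """Aspect markers the paragraph does not exclude for this adverb."""
    return sorted(ASPECT_MARKERS - INCOMPATIBLE[adverb])


print(markers_not_ruled_out("conglai"))   # ['guo']
print(markers_not_ruled_out("xianglai"))  # ['zhe']
```

The two groups come apart on exactly one marker each, which is the asymmetry that a single shared subclass cannot express.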

In this paper, we propose clear-cut definitions and provide descriptive features for each subclass of temporal adverbs. The descriptive features help to define temporal relations and to predict the compatibilities between temporal adverbs and aspect markers.

2 Literature Review and Methodology

To make a clear-cut differentiation, we use the Academia Sinica Balanced Corpus (Sinica Corpus) and adopt a corpus-based approach to analyzing Mandarin Chinese temporal adverbs. Time points on the temporal axis are used to define the temporal relations of the temporal adverbs in Lu & Ma (1999).

Smith (1991) discusses aspectual systems in language. She illustrates each situation type and viewpoint type with a temporal schema. Below are the temporal schemata of the Mandarin Chinese aspectual markers le, guo, and zhe, which are represented by the symbols I, F, F+1, ., and /.

(1) Temporal schema for the le Perfective (Smith 1991: 348)

IF

// (RVC)

(2) The Mandarin guo perfective viewpoint (Smith 1991: 353)

I .F F+1

/ /

(3) The zhe viewpoint (Smith 1991: 363)

I..

////State

Klein (1994) points out five temporal features and notes TT, TU, and TSit. TT, topic time, is the time span to which the speaker's claim on this occasion is confined. TU is the time of utterance, that is, the time at which the utterance is made. TSit is the time of situation, that is, the time at which the event occurs.

In addition to Smith's and Klein's temporal terminology of time points and time intervals, the event modules in the framework MARVS (Module-Attribute Representation of Verbal Semantics) proposed by Huang et al. (2000) can also be applied to analyze the temporal relations of temporal adverbs. Event modules are the basic building blocks of the event contour. There are five event modules, which stand alone or in combination: Boundary, Punctuality, Process, State, and Stage. The event module Boundary is defined as an event module that can be identified with a temporal point and must be regarded as a whole (including complete Event); it is adopted in this paper to define the notion of Boundary Point.

Yang & Bateman further discuss the semantic temporal relations of the aspect system and propose principled semantic conditions for aspect combination. In their view, the Chinese aspect system is actually composed of both aspect morphemes (-le, -zhe, -guo4, etc.) and aspect adverbials. Moreover, they propose that the Chinese aspect system has basically seventeen simple primary aspect forms. These simple primary aspect forms belong to three subsystems, perfective, imperfective, and future-existing, according to their semantic properties in individual cases. Some simple primary aspect forms can combine to form an aspect of secondary type if their temporal attributes are in harmony. The temporal relation of the combination is represented graphically by the time points ti, tf, tr, and ts.

In this paper, we adopt the terms proposed in the research above to clarify the temporal relation of each subclass of the temporal adverbs listed in Lu & Ma. We use the notations ST, RT, ET, BP, Start, and End to define temporal relations; these denote, respectively, the speaking time, the reference time, the event time, the boundary point, the start point of the event, and the end point of the event. For instance, the temporal features for le, guo, and zhe are defined as follows in our system, which is compatible with the notions of Smith (1991).

le: BP < ST, which means that the prominent boundary point of the referred event precedes the speaking time.

guo: End