From Frame to Subframe: Collocational Asymmetry in Mandarin Verbs of Conversation
Foreword
Chin-Chuan Cheng
Academia Sinica
We felt in 2000 that it was time to take concerted action on Chinese lexical semantics. The study of word senses in the traditional discipline of Xungu appeared to be done piecemeal. The truth-theoretic model of propositions looked illusive at times. Colleagues working on natural language processing demanded a better dictionary for word disambiguation. Few theoretical claims about human manipulation of senses were available. We therefore organized a workshop on lexical semantics to discuss these issues and others at the City University of Hong Kong four years ago. The one-day gathering was fairly informal, but we had fun getting to know each of the score of colleagues from Taiwan, Hong Kong, Mainland China, and the United States. We did not call it the First Chinese Lexical Semantics Workshop. We simply called it a lexical semantics workshop, without knowing its consequences.
Somehow the gathering in Hong Kong made Professor Yu Shiwen of Beijing University happy, as it should have, because he had worked on semantics for years. In 2001 he invited more people to a meeting titled the Second Workshop on Chinese Lexical Semantics. The Beijing meeting was enthusiastically followed by the Third Workshop in Taipei. Professor Huang Chu-ren was energetic enough to set up a mechanism for abstract submission and evaluation. Some submissions had to be left out because of the large number of excellent papers. The fourth workshop returned to the City University of Hong Kong in 2003. We were not daunted by SARS. Although travel restrictions made us stay home, our papers were exchanged and commented on via the internet. We received hundreds of comments, perhaps more than we would have in a face-to-face conference.
It is now 2004. I am pleased to see the workshop fully alive in its fifth year of existence. Our hosts, Drs. Ji Donghong and Lua Kim Teng, have kindly accepted our imposition and ably made arrangements for us to see each other face to face here in Singapore. We are grateful to them for the arrangements and for gathering the papers in this volume for discussion. I am sure that during the workshop we will move away from piecemeal studies of words and come a step closer to theoretical generalizations about human cognition of words.
Content Table
The Sinica Sense Management System: Design and Implementation
From Frame to Subframe: Collocational Asymmetry in Mandarin Verbs of Conversation
Using WordNet and SUMO to Determine Source Domains of Conceptual Metaphors
From Lexical Semantics to Conceptual Metaphors: Mapping Principle Verification with WordNet and SUMO
Feature Representations and Logical Compatibility between Temporal Adverbs and Aspects
Multiple-layer Semantic Derivations of Two-part Allegorical Expressions in Taiwanese Southern Min (TSM)
Pan-Chinese Variation on Verbal Synonymy: A Study of Common Reportage Verbs in News Texts
The Usage and Perception of Judgement Terms in the Pan-Chinese Context
Taxonomy of Fine-grain Semantic Roles for Nominal Modifiers
Semantics-related Lexical Access Deficit of Mandarin-Chinese Dyslexia
Verbs of Urging in Hakka: A Perspective from Force-Dynamics
1998
bankbankbank
2004
(Kilgarriff and Tugwell 2001)
(VG)(Neqa)(P)(Na)(VC)(DE)(VL)(Na)(VHC)[spv]
(Na)(D)(VE)(D)(VC)(Caa)(D)(VJ)(P)
(Na)(P)(A)(DE)(VC)(Na)(VC)(Caa)
(DE)(VC)(Na)(VC)(Caa)(VK)(SHI)(D)(Nh)(Dfa)(DE)(VC)(Na)(VL)(Na)
VCN50501998
2002
2004(Huang 2003)
2004
--(Kuo et al. 2003)
*93-2524-S-001-003
. . 2004..
. 1998. . In Benjamin K. Tsou, Tom B. Y. Lai, Samuel W. K. Chan, and William S-Y. Wang eds. (Quantitative and Computational Studies on the Chinese Language) 15-30. City University of Hong Kong.
. 2002. . .
. 2004. .
. 2004. . .
Huang, Chu-Ren. 2003. SINICA BOW: Integrating Bilingual WordNet and SUMO ontology. Invited panel talk: Synergy Between Language Resources and Knowledge Resources. The 2003 IEEE International Conference on Natural Language Processing and Knowledge Engineering (NLPKE2003), Special Session on Upper Ontology and Natural Language Processing. Beijing. Oct. 28.
Kilgarriff, Adam and David Tugwell. 2001. WORD SKETCH: Extraction and display of significant collocations for lexicography. In Proceedings of the Workshop on COLLOCATION: Computational Extraction, Analysis and Exploitation, 32-38. 39th ACL & 10th EACL, Toulouse, July 2001.
Kuo, W.J., T. C. Yeh, C. Y. Lee, Y. T. Wu, C. C. Chou, L. T. Ho, D. L. Hung, O. J. Tzeng, and J. C. Hsieh. 2003. Frequency effects of Chinese character processing in the brain: an event-related fMRI study. Neuroimage 18:720-730.
Dummy Verbs in Contemporary Chinese
YU Shiwen ZHU Xuefeng DUAN Huiming
Institute of Computational Linguistics, Peking University, 100871, China
[email protected] [email protected]
Abstract
In contemporary Chinese, there is a subclass of verbs called Dummy Verbs. After briefly introducing the lexical meanings of two typical dummy verbs, Jiayi and Jinxing, this paper discusses their grammatical attributes in detail and further explores their functions as markers of syntactic constituents and semantic roles.
Keywords: Dummy Verb, Lexical Meaning, Grammatical Attribute, Semantic Role
1.
Dummy Verb, DVDV[1]DVDV
[2][3]6[9][4-8]
1985DV3
_____________________________________________________________________________
* 8632001AA1142102002AA11701060173005
2.
DVDVDV
DVDV3*4343484 8484344384 48[10][11]in addition moreoverDVDV
DVDVDV head
sense[12-15]4
3.
[16]
[1]DVDV
DV
3.1
1DV23
3.2
DV
199810
1998
(:)
90[17]
3.3
DVDV
DV
*
*
DV
DV4.1
DV
3.4
DVDVDV
1 DV
2 DV
3 DV
4 DVDV
5 DV
6 DVAABBABAB
DV
4.
4.1
(1) /
(2)
(3)
(4)
(5)
1NPv(2)-(5)1NPvvvnDVvnNP2DV3DV4DVvn5DVvn
DVvDVvn
[9]
topicfocus
4.2
DVstop listDVWSDDV[18]DVDV[19]
DV[9]
[9] 5
1. We are carrying on reforms on the state-owned enterprises in our country.
2. We are making reforms on the state-owned enterprises in our country.
3. The reforms on the state-owned enterprises are being carried out in our country.
4. We are reforming the state-owned enterprises in our country.
5. The state-owned enterprises are being reformed in our country.
51,2,3carry onmakereformcarry on, carry out, undertake, undergo, conduct, engage, make, hold, commit, have, hold discussion, make investigation
5.
CLSW5, 200461416
5
2004337[20]CLSW5CLSW5
[1] 220032
[2] 19805
[3] 19855
[4] 1995V+468-71
[5] 1997DV04
[6] 199832291-93
[7] 200320111-16
[8] DV2003392-94
[9] 199951-57
[10] 198312
[11] 198811
[12] 2001327-33
[13] 2001498-104
[14] Vol.13, No.2, 2003159-176
[15] Vol.13, No.2, 2003177-194
[16] 20038161-162
[17] 2001321-26
[18] 20036132119-120
[19] 200211272003313189-98
[20] 20043
The Sinica Sense Management System: Design and Implementation
Chu-Ren Huang, Chun-ling Chen, Cui-Xia Weng, and Keh-jiann Chen
Academia Sinica
1. Background and Motivation
Constructing a sense-based lexical knowledgebase as a core foundation has become a trend in language engineering. WordNet and EuroWordNet are two well-known examples. There are two important criteria in constructing such a knowledgebase: linguistic felicity and data cohesion. Huang et al. (2003) discussed how to achieve linguistic felicity in building a comprehensive inventory of Chinese senses from corpus data, introducing five criteria as well as operational guidelines for sense distinction. In this paper, we discuss how to achieve data cohesion for the sense information thus collected through the Sinica Sense Management System (SSMS).
2. Introduction to the Content of the SSMS
The SSMS manages both lexical entries and word senses. The system was designed and implemented by the Chinese WordNet Team at Academia Sinica. It contains all the basic information that can be merged into the eventual Chinese WordNet. The basic structure of the system is meaning-driven: each sense of a lemma is identified specifically and given a separate entry. When further differentiation at the meaning facet level is called for, each facet of a sense is also described in a full entry (Ahrens et al., 1998). In addition to senses and meaning facets, the system includes the following information: POS, example sentences, corresponding English synset(s) from Princeton WordNet, and lexical semantic relations such as synonym/antonym and hypernym/hyponym. Moreover, the overarching structure of the system is managed by a sense serial number, and inter-entry structure is established by cross-references among synsets and homographs.
At the present stage, the Chinese WordNet Team focuses on analyzing mid-frequency words in Sinica Corpus. We target mid-frequency words because, with only three to five senses per word, we can investigate the senses and meaning facets of each word deeply and accurately. This avoids both the trivial case of low-frequency words with a single sense and the complicated case of high-frequency words with numerous senses. Up to now, more than 1,000 lemmas have been analyzed and more than 2,000 senses have been distinguished. We have also published five technical reports presenting these results [4]. In the near future, these results will be used as a basis for Natural Language Processing and E-learning applications.
3. The Design Principle of SSMS
A sense-based lexical knowledgebase with data cohesion must meet three requirements: unique identification of senses, trackability of senses, and consistent sense definitions. SSMS has four devices to meet these requirements.
3.1 The Unique Serial Number
First, each sense or meaning facet is identified by a unique serial number in SSMS. In Princeton WordNet (Fellbaum 1998), each synset is given a unique offset number. However, the offset number does not have any logical structure to it. Hence, although it guarantees unique identification, it is not very trackable. An alternative is to set up a base ontology and assign senses to an ontological node with a unique ID. However, this is not feasible, since we cannot pre-designate all the possible conceptual and semantic relations. And if the decision is made to encode only certain higher-level nodes, the random assignment issue is unavoidable, since more than one lexical sense will be assigned to the same node. In our system, the unique serial number of each sense is composed of three segments: the sequential information of when the lemma was processed, the lemma form, and the sense classification code for each lemma (including the meaning facet level). Take bao4 zhi3 (newspaper) for example. Two senses and two meaning facets are distinguished for bao4 zhi3. The lexical entry of bao4 zhi3 is as follows.
Example 3-1: The result of sense distinction for bao4 zhi3 (newspaper)
bao4 zhi3
Sense 1 (Na)
Facet 1: newspaper, 03039218N
Facet 2: newspaper, 04738466N
Sense 2 (Na): newspaper, 06009637N
The four levels of the unique serial number are shown below for one meaning of bao4 zhi3 (newspaper).
Lemma processing year: 03-
Lemma form ID: -0018-
The first sense: -01-
The first meaning facet: -01
The unique serial number for the 1st meaning facet of the 1st sense of bao4 zhi3 => 0300180101
There are four advantages to managing the sense database with unique serial numbers. First, the sequential number not only gives a unique code to each lemma, it also enables a project manager to track work progress more easily. Second, including the lemma in the serial number helps human users to quickly identify the relevant senses. It also facilitates man-machine interfaces, such as keyword search for senses. Third, it gives the sense serial number a logical structure, since each lemma represents a small number of possible senses. Lastly, four digits are reserved to identify the senses and meaning facets belonging to each lemma. The first two digits are reserved for senses and the last two for meaning facets. These four digits also allow an exact sense to be identified in the database with minimal space. For instance, when stipulating a synonym, we can identify it as word0200, which refers to the second sense of a certain lemma. There is no need to repeat the complete sense serial number. The sense serial number thus enables unique identification and also contributes to trackability.
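The serial-number layout described above can be illustrated with a minimal Python sketch. It assumes the 2 + 4 + 2 + 2 digit layout of the bao4 zhi3 example (0300180101); the names SenseSerial, compose, parse and short_ref are hypothetical and are not part of SSMS itself.

```python
from typing import NamedTuple


class SenseSerial(NamedTuple):
    """Hypothetical container for the segments of a sense serial number."""
    year: str      # two-digit lemma processing year, e.g. "03"
    lemma_id: int  # sequential lemma form ID, e.g. 18
    sense: int     # sense index within the lemma
    facet: int     # meaning-facet index (0 is assumed to mean "the sense itself")


def compose(serial: SenseSerial) -> str:
    """Build the 10-digit serial: year (2) + lemma ID (4) + sense (2) + facet (2)."""
    return f"{serial.year}{serial.lemma_id:04d}{serial.sense:02d}{serial.facet:02d}"


def parse(code: str) -> SenseSerial:
    """Split a 10-digit serial number back into its segments."""
    if len(code) != 10 or not code.isdigit():
        raise ValueError(f"expected a 10-digit serial number, got {code!r}")
    return SenseSerial(code[:2], int(code[2:6]), int(code[6:8]), int(code[8:10]))


def short_ref(lemma: str, sense: int, facet: int = 0) -> str:
    """Short cross-reference form: lemma followed by the four sense/facet digits."""
    return f"{lemma}{sense:02d}{facet:02d}"


bao_zhi = SenseSerial(year="03", lemma_id=18, sense=1, facet=1)
print(compose(bao_zhi))       # 0300180101
print(short_ref("word", 2))   # word0200
```

Because every segment has a fixed width, a serial number can be split back into its segments without any lookup table, which is one way to read the trackability argument made above.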
3.2 The Cross-reference Device
Second, SSMS automatically prompts all possible cross-references. When a lemma is called up for analysis, all existing records that contain this lemma are prompted. These include not only lexical semantic relations such as synonyms and hyponyms, but also any sense definitions and any explanatory notes that contain this lemma. This feature allows sense relations to be clearly defined and inconsistencies to be detected. In addition, any anomaly in definition or expression format will also be discovered. The process also helps us to narrow down to a set of controlled vocabulary for sense definition. This feature contributes to both the trackability of senses and the consistency of sense definitions.
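As a rough illustration of the cross-reference prompt, the sketch below scans a small in-memory list of sense records and reports every record that mentions a given lemma, together with where the mention occurs. The record layout, the sample data (including the serial 0300250100) and the function name cross_references are assumptions made for this example, not the actual SSMS implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SenseRecord:
    """Hypothetical, simplified sense record."""
    serial: str
    lemma: str
    definition: str
    synonyms: List[str] = field(default_factory=list)
    hyponyms: List[str] = field(default_factory=list)
    notes: str = ""


def cross_references(lemma: str, records: List[SenseRecord]) -> Dict[str, List[str]]:
    """Map each record serial to the places where `lemma` is mentioned."""
    hits: Dict[str, List[str]] = {}
    for rec in records:
        if rec.lemma == lemma:
            continue  # skip the lemma's own entries
        places = []
        if lemma in rec.synonyms:
            places.append("synonym")
        if lemma in rec.hyponyms:
            places.append("hyponym")
        if lemma in rec.definition:  # substring match on the definition text
            places.append("definition")
        if lemma in rec.notes:
            places.append("note")
        if places:
            hits[rec.serial] = places
    return hits


records = [
    SenseRecord("0300180101", "newspaper", "a printed daily publication"),
    SenseRecord("0300250100", "newsstand", "a stall that sells newspaper copies",
                notes="compare newspaper"),
]
print(cross_references("newspaper", records))
```

Substring matching is used here because definitions and notes are free text; a real system would presumably match on segmented words instead.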
3.3 The Concurrent Lexical Knowledgebase and Corpus
Third, SSMS enables parallel, concurrent processing of the lexical knowledgebase and the corpus. When a lemma is chosen in the system, all tagged examples of that lemma from Sinica Corpus are retrieved. This allows closer examination of how the senses are used and distributed. It also allows automatic selection of corpus example sentences. In turn, when the sense classification is completed, SSMS allows all the corpus sentences to be sense-tagged and merged back into the original corpus. In other words, a sense-tagged corpus is being processed in parallel. This feature allows each lexical sense to be tracked to its actual uses in the corpus. It also allows linguists to examine the data supporting each sense classification.
3.4 Linking to the Sinica BOW
Fourth, SSMS is also linked to the bilingual wordnet information at Sinica BOW. Candidate English synset correspondences, including offset numbers, are shown after a Chinese lemma is chosen. This allows cross-lingual trackability and consistency.
4. The Implementation of SSMS
The implementation of the system has three major phases. In the lemma analysis phase, we distinguish senses and meaning facets for each word based on the criteria and operational guidelines proposed in Huang et al. (2003). At the same time, Sinica Corpus and WordNet are consulted for POS, examples, and English translations. Then, with the help of dictionary resources or word mapping by the system, we decide the word relations. The second phase is divided into two steps. First, we design the database schema of the sense management system to store the analysis results of the first phase. Then, for data access, we develop an interface that helps the Chinese WordNet Team insert data into and query the database. We use DELPHI to design the system interface. Through the interface, the data in the database can also be exported as Word documents. The third phase is the application phase. Our work project is to build Chinese WordNet web sites that users can query. These web pages are developed in HTML and ASP and are served through a web server. Over the Internet, people can retrieve data from our sense management database from anywhere at any time. The flow of the Sinica Sense Management System is displayed in Fig. 1.
[Figure 1, flow chart. The First Phase (Lemma Analysis Phase), work projects: Sense Definition, Facet Definition, WordNet Synset, Example Sentences, POS, Word Relation. The Second Phase (Sinica Sense Management System Implement Phase), work projects: Interface Develop, Database Implement. The Third Phase (Application Phase: Web Sites Implement), work project: Chinese WordNet Web Sites Implement.]
Figure 1: The flow chart of the Sinica Sense Management System.
The overall framework of SSMS is represented diagrammatically in Fig. 2. As the diagram indicates, the Chinese WordNet Team uses SSMS to access the database and to produce electronic documents as Word reports. Moreover, users on the Internet can browse HTML/ASP pages to query the database through a web server.
[Figure 2, components: Chinese WordNet Team; Interface of Sinica Sense Management System; Database; Word report; Users; Browser; HTML/ASP Pages; Web Server; Query; Results.]
Figure 2: The overall structure of SSMS.
4.1 The Schema of SSMS Database in Class Diagram
In this section, we discuss and design the schema of the SSMS database. The Unified Modeling Language (UML) [5][6] is a graphical notation that provides the conceptual foundation for assembling a system out of components from the 4+1 views and nine diagrams. Each view is a projection into the organization and structure of the system, focused on a particular aspect of that system.
We employ the class diagram notations in UML to provide a static view of application concepts in terms of classes and their relationships including generalization and association. Therefore, we only introduce the details about class diagrams as follows.
Class diagrams [5][6][7] commonly contain the following features:
1. A class diagram shows a set of classes and their relationships. For example, the class diagram of the Suppliers-and-Parts database is shown in Fig. 3. The italicized terms in Fig. 3 indicate the concepts of class diagrams.
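[Figure 3, class diagram. Classes: Suppliers (attributes sno, sname, status, city; operation add()), with subclasses Foreign (nation) and Domestic (chinesename); Parts (pno, cno, pname, weight, color); Computers (cno, cname); Shipments (qry). Suppliers and Parts are linked by the association supply, with the role names Subject and Object. Multiplicities shown: 1 and 1..*. Annotated concepts: association, class name, attribute, operation, inheritance, aggregation.]
Figure 3: A class diagram for the Suppliers-and-Parts Database.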
2. A class is a description of a set of objects that share the same attributes, operations, relationships, and semantics. A class mainly contains three important parts: its name, attributes, and operations. We explain these terms as follows:
(a) Class name: every class must have a name to distinguish it from other classes. For example, Suppliers or Parts are class names.
(b) Attribute: an attribute represents some property that is shared by all objects of that class. A class may have any number of attributes or no attributes at all. For example, in Fig. 3, the Suppliers have some attributes such as sno, sname, city.
(c) Operation: an operation is the implementation of a service that can be requested from any object of the class to affect behavior. A class may have any number of operations or no operations at all. For example, in Fig. 3, the class of Suppliers has an operation add().
3. There are three kinds of relationships between classes:
(a) Association: an association is a structural relationship that specifies objects of one thing to be connected to objects of another. For example, in Fig. 3, a line drawn between the involved classes (Suppliers and Parts) represents an association named supply.
(b) Aggregation: an aggregation is a whole/part relationship, in which one class represents a larger thing (the whole class), which consists of smaller things (the parts class). Moreover, an aggregation represents a has-a relationship, which means that an object of the whole class has objects of the part class. To represent an aggregation, an empty diamond will be drawn at the whole class end of the line linking two classes.
(c) Inheritance: An inheritance relationship can be regarded as a generalization (or specialization), which is a taxonomic relationship between a general (super classes) and a special (subclasses) element, where the special element adds properties to the general one and behaves in a way that is compatible with it. Therefore, it is sometimes called an is-a-kind-of relationship. An inheritance relation is represented by means of a large empty arrow pointing from the subclass to the super class. For example, in Fig. 3, Domestic and Foreign suppliers (two subclasses) are a kind of suppliers (the super class).
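The three kinds of relationships can also be illustrated in code. The following Python sketch mirrors the class and attribute names of Fig. 3; which classes take part in the aggregation, and what add() does, are assumptions made for this example, since the text does not spell them out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Part:
    pno: str
    pname: str


@dataclass
class Computer:
    """Aggregation (has-a): the whole (Computer) holds its parts. Assumed pairing."""
    cno: str
    cname: str
    parts: List[Part] = field(default_factory=list)


@dataclass
class Supplier:
    sno: str
    sname: str
    city: str
    # Association "supply": a supplier is connected to the parts it supplies.
    supplies: List[Part] = field(default_factory=list)

    def add(self, part: Part) -> None:
        """Operation add(); assumed here to register a supplied part."""
        self.supplies.append(part)


@dataclass
class Domestic(Supplier):
    """Inheritance (is-a-kind-of): a domestic supplier is a kind of supplier."""
    chinesename: str = ""


@dataclass
class Foreign(Supplier):
    """Inheritance (is-a-kind-of): a foreign supplier is a kind of supplier."""
    nation: str = ""


bolt = Part(pno="P1", pname="bolt")
supplier = Foreign(sno="S1", sname="Smith", city="London", nation="UK")
supplier.add(bolt)
desktop = Computer(cno="C1", cname="desktop", parts=[bolt])
print(isinstance(supplier, Supplier), supplier.supplies, desktop.parts)
```

The subclasses add one attribute each (nation, chinesename) to the general Supplier, which corresponds to "the special element adds properties to the general one" in item (c).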
Following the content requirements and design principles of SSMS, Fig. 4 gives the schema of the SSMS database using the concepts of the class diagram.
[Figure 4, class diagram. Classes and attributes:
CWN_Lemma: Lemma_id, CWN_lemma, CWN_pinyin, CWN_zhuyin
CWN_Sense: Sense_id, Lemma_id, Sense_def, Sense_Domain, Sense_synonym, Sense_antonym, Sense_varword, Sense_upword, Sense_nearword, Sense_relword
CWN_Facet: Facet_id, Sense_id, Facet_def, Facet_Domain, Facet_synonym, Facet_antonym, Facet_varword, Facet_upword, Facet_nearword, Facet_relword
CWN_POS: Cwn_id, Pos_sno, Epos2
cwn_example: cwn_id, example_sno, example_cont
cwn_note: cwn_id, note_sno, note_cont
cwn_synset: cwn_id, synset_word1, synset_offset, synset_cwnrel
Association multiplicities are marked 0, 1 and *.]
Figure 4: The schema of the Sinica Sense database.
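To make the lemma, sense and facet levels of the schema concrete, the sketch below creates three of the tables in an in-memory SQLite database and retrieves all facets of a lemma. The table and column names come from Fig. 4; the key constraints, the ID formats (prefixes of the serial number) and the placeholder definitions are assumptions for illustration, and SQLite merely stands in for whatever database SSMS actually uses.

```python
import sqlite3

# Only a subset of the columns in Fig. 4 is reproduced here.
SCHEMA = """
CREATE TABLE CWN_Lemma (
    Lemma_id   TEXT PRIMARY KEY,
    CWN_lemma  TEXT NOT NULL,
    CWN_pinyin TEXT,
    CWN_zhuyin TEXT
);
CREATE TABLE CWN_Sense (
    Sense_id  TEXT PRIMARY KEY,
    Lemma_id  TEXT NOT NULL REFERENCES CWN_Lemma(Lemma_id),
    Sense_def TEXT
);
CREATE TABLE CWN_Facet (
    Facet_id  TEXT PRIMARY KEY,
    Sense_id  TEXT NOT NULL REFERENCES CWN_Sense(Sense_id),
    Facet_def TEXT
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create the tables and load the bao4 zhi3 example with placeholder text."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO CWN_Lemma VALUES (?, ?, ?, ?)",
                 ("030018", "bao4 zhi3", "bao4 zhi3", None))
    conn.executemany("INSERT INTO CWN_Sense VALUES (?, ?, ?)", [
        ("03001801", "030018", "sense 1 (placeholder definition)"),
        ("03001802", "030018", "sense 2 (placeholder definition)"),
    ])
    conn.executemany("INSERT INTO CWN_Facet VALUES (?, ?, ?)", [
        ("0300180101", "03001801", "facet 1 (placeholder definition)"),
        ("0300180102", "03001801", "facet 2 (placeholder definition)"),
    ])
    return conn


def facets_of_lemma(conn: sqlite3.Connection, lemma: str):
    """Return (Facet_id, Facet_def) rows for every facet of the given lemma."""
    cursor = conn.execute(
        """
        SELECT f.Facet_id, f.Facet_def
        FROM CWN_Lemma AS l
        JOIN CWN_Sense AS s ON s.Lemma_id = l.Lemma_id
        JOIN CWN_Facet AS f ON f.Sense_id = s.Sense_id
        WHERE l.CWN_lemma = ?
        ORDER BY f.Facet_id
        """,
        (lemma,),
    )
    return cursor.fetchall()


conn = build_demo_db()
for facet_id, facet_def in facets_of_lemma(conn, "bao4 zhi3"):
    print(facet_id, facet_def)
```

Chaining the IDs this way means a facet row can be traced back to its sense and lemma through foreign keys as well as through the digits of its identifier.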
4.2 The Function of SSMS
In this section, we discuss the interface design of SSMS. The SSMS interface is developed in DELPHI 7.0. The functions of SSMS, organized according to the needs of program execution, are shown in Fig. 5. These functions are available through a Windows interface and through ASP web pages. Sense management and Sense visualization are the two major functions of SSMS. With the Sense management function, the Chinese WordNet Team can insert, update, and delete data, including lexical entries, word senses, meaning facets, POS, example sentences, English synset(s), and lexical semantic relations. Sense visualization is provided by the SSMS interface and is divided into two parts: Sense Query and Word Report. The layout of the SSMS interface is shown in Fig. 6. The interface is designed to be easy to operate and maintain. With the Sense Query function, users can enter a serial number or a lexical entry to query senses in the SSMS interface. The other function, Word Report, uses Crystal Reports 9 to produce electronic documents, as shown in Fig. 7.
[Figure 5, class diagram. Elements: The program of SSMS; Function; Sense Management; Sense Visualization; Sense Query; Word Report; The users of the Chinese WordNet Team; Windows; ASP Web Pages. Multiplicities shown: 1 and *.]
Figure 5: The class diagram of SSMS function description.
Figure 6: The interface of SSMS.
Figure 7: The format of Word report.
5. Conclusion
In sum, SSMS is not only a versatile development tool and management system for a sense-based lexical knowledgebase; it can also serve as the database backend for both the Chinese WordNet and any sense-based application for Chinese language processing.
Online Resources:
Sinica BOW: http://BOW.sinica.edu.tw/
Sinica Corpus: http://www.sinica.edu.tw/SinicaCorpus/
WordNet: http://www.cogsci.princeton.edu/~wn/
References
[1] Ahrens, K., L. Chang, K. Chen, and C. Huang, 1998, Meaning Representation and Meaning Instantiation for Chinese Nominals. Computational Linguistics and Chinese Language Processing, 3, 45-60.
[2] Booch, G., J. Rumbaugh, and I. Jacobson, The Unified Modeling Language User Guide, Addison-Wesley, 1999.
[3] Fellbaum, Christine. Ed. 1998. WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
[4] Huang, Chu-Ren (ed.), 2004, Sense and Sensibility series: Technical Report 03-01~04. CKIP, Taipei.
[5] Huang, Chu-Ren et al., 2003, Sense and Meaning Facet: Criteria and Operational Guidelines for Chinese Sense Distinction. Presented at the Fourth Chinese Lexical Semantics Workshop. June 23-25, Hong Kong, City University of Hong Kong.
[6] Muller, R.J., Database Design for Smarties: Using UML for Data Modeling, Morgan Kaufmann, 1999.
[7] Oestereich, B., Developing Software with UML Object-Oriented Analysis and Design in Practice, Addison-Wesley, 1999.
Representation and Computing of Cognate RoleFrame
Zhendong Dong, Qiang Dong, Changling Hao
Research Center of Computer & Language Information Engineering, CAS, Beijing, 100083
E-mail: [email protected]
Abstract: Borrowing from the term cognate object, we use Cognate RoleFrame to describe a kind of semantic relation between nouns and verbs such as hatred and hate. By Cognate RoleFrame, we mean that the noun has the same role frame as its corresponding verb. In HowNet we use CoEvent as the identifier to describe all the nouns with a Cognate RoleFrame. We demonstrate two HowNet-based tools to evaluate our treatment of Cognate RoleFrame.
Keywords: HowNet, valency grammar, event role, event role frame, cognate RoleFrame
1.
WordNet[2]
2. Role frame
WordNetVerbNetFrameNetWordNet
agent, possession, source, cost, beneficiary
agent, instrument, partner, cause
agent, patient, instrument, PatientValue={dirty|}
agent, patient, instrument, PartOfTouch
815
815
{fight|} {HaveContest|:agent={human|}{group|->},instrument={weapon|},
partner={human|}{group|->},cause={*}}
(typical actor)selectional restriction
3. (Cognate role-frame)
live a happy lifesleep a sound sleeplifesleepcognate object[1]
experiencer, degree, contentexperiencer, degree, content
4.
Cognate role-frame conceptCognate role-frame word
---
5.
4223805.6%
1
2
3
4
5
6
7
7
6.
6.1
123
6.2 -- CoEvent
CoEventCoEventCognate_Roleframe_Event
W_C=
W_C=
G_C=N
G_C=V
E_C=
E_C=
W_E=love
W_E=cherish
G_E=N
G_E=V
E_E=
E_E=
DEF={emotion|:CoEvent={like|}}
DEF={like|}
1234
1/
2DEF={fact|:CoEvent={fight|},domain={military|}}
3{fight|}{HaveContest|:agent={*},instrument={*},partner={*},cause={*}};
{HaveContest|:coagent={*},instrument={*},cause={*}}
4{fight|} {HaveContest|:agent={human|}{group|->},
instrument={weapon|},
partner={human|}{group|->},cause={*}};
CoEventDEF{fight|:domain={military|}}
W_C=
G_C=N
E_C=
W_E=environmental sanitation
G_E=N
E_E=
DEF={fact|:CoEvent={clean|:patient={Environment|:host={entity|}}}}
W_C=
G_C=N
E_C=
W_E=table tennis tournament
G_E=N
E_E=
DEF={fact|:CoEvent={compete|},domain={TableTennis|}}
6.3
0.009091
0.061538
0.380000
0.450000
DEF={emotion|:CoEvent={love|}}
DEF={fight|:domain={military|}}
DEF={FondOf|}
DEF={fact|:CoEvent={love|},modifier={first|}}
DEF={fact|:CoEvent={love|}}
[1] 1980pp60-61
[2] 1996pp29-58
From Frame to Subframe: Collocational Asymmetry in Mandarin Verbs of Conversation
Mei-Chun Liu Chun Edison Chang
Graduate Institute of Linguistics and Cultural Studies
National Chiao Tung University, Hsinchu 30050, Taiwan
[email protected] [email protected]
Abstract
This paper examines the collocational patterns of Mandarin verbs of conversation and proposes that a finer classification scheme than the flat structure of frames is needed to capture the semantic granularity of verb types. The notion of a subframe is introduced and utilized to explain the syntactic-semantic interdependencies among different groups of verbs in the conversation frame. The paper aims to provide detailed linguistic motivations for distinguishing subframes within a frame as a semantic anchor for further defining near-synonym sets.
1. Introduction
As the importance of lexical semantic research grows with the need to represent human knowledge, various lexically-based information networks have been proposed, such as the comprehensive work of differentiating word senses and sense relations in WordNet (Miller et al. 1990), the ontological hierarchy in SUMO (Das et al. 2002, Pease et al. 2002, Niles and Pease 2003), and the more linguistically-motivated model of FrameNet (Baker, Fillmore and Cronin 2003). While all provide valuable information regarding certain aspects of word meaning, the first two are constructed in a more intuitive manner. FrameNet, on the other hand, is based on the theory of Frame Semantics (Fillmore and Atkins 1992) and attempts to define meaning within a set of shared knowledge or background information, that is, a frame. However, as pointed out in Liu and Wu (2003), if meaning is anchored in the notion of frame, then we need independent motivations for postulating different frames. What seems to be lacking in the current framework is a cognitive linguistic explanation of how the individual frames are distinguished and interrelated. In other words, what are the semantic relations among all the frames? To answer this question, Liu and Wu (2003) proposed an overarching conceptual schema which incorporates all the core frame elements (FEs) and accounts for the interrelationship among various frames in the communication domain. With a cognitive schema provided as a macro-structure, the distinction of frames is then well-motivated. However, there still remains another issue at the micro-level, as indicated by Liu and Wu (2003):
Within each frame, a wide range of verbs are found and one would wonder how these verbs differ from each other. For example, English verbs speak, discuss, quarrel, and gossip, are all found in the Conversation Frame, but obviously, these lemmas encode something different. What are the differences? There seem to be frame-internal features that also need to be characterized.
In this paper, we will show that within each frame, a more elaborated classification system is needed to account for a variety of verb behaviors. The notion of subframe is introduced and utilized to capture the syntactic-semantic interdependencies observed in the corpus data.
2. Motivation for the Conversation Frame
Compared with the other communication frames, the Conversation Frame is unique in that it profiles the property of reciprocity, or two-way communication. Verbs in the Conversation frame encode reciprocal events in which participants are involved as Interlocutors, such as tan 'talk', tanlun 'converse', taolun 'discuss', shangliang 'discuss', xietiao 'negotiate', xieyi 'negotiate', goutong 'communicate', chaojia 'quarrel', zhenglun 'argue', xianliao 'chat', and liaotian 'chat'. These verbal events highlight part of the conceptual schema discussed in detail in Liu and Wu (2003) and represented in (1) below. The core Frame Elements (with bold fonts and grayed areas) help define the frame as a bidirectional communicative activity conducted by both the Speaker and Addressee as Interlocutor 1 and Interlocutor 2 (or Interlocutors), via a certain Medium, on a given Topic.
(1) Conceptual schema for the Conversation frame:
[Schema diagram; labels: Encoding, Decoding, Noise.]
The Conversation frame proves to be well-motivated in relation to other communication frames, as most of its verbs share the same conceptual schema and realize similar constructions in coding the core frame elements. There is, however, a fundamental question to be answered: within the Conversation frame, are there semantic subtypes that are also syntactically motivated?
3. Motivation for Distinguishing Subframes
As mentioned above, verbs of conversation involve a set of core Frame Elements: Interlocutor1, Interlocutor2 (or combined as Interlocutors), Topic and Medium. In most cases, the default Medium is face-to-face when not overtly mentioned, as in a sentence glossed as 'They are talking about/discussing/exchanging views about/arguing about/chatting about the meaning of life.' Intuitively, these different lemmas seem to encode differences in manner, formality, or purpose while sharing the same Topic. But what are the grammatical correlates of the lexicalized meaning differences? Looking closely at their collocational patterns, we find asymmetrical distributions in five respects: 1) V+V pattern: some verbs may occur with a preceding verb such as jinxing 'proceed' or dacheng 'achieve'; 2) V+NP pattern: the core element Interlocutor2 may sometimes be coded as the direct object; 3) Metonymic subject: the subject of the event may be an inanimate entity taking the role of Interlocutors by the principle of metonymy; 4) V+Complement: some verbs take a postverbal complement or adverbial adjunct denoting effect evaluation, such as chenggong 'succeed' or shibai 'fail'; and 5) Nominalization: in terms of the distribution of grammatical functions, the verbs show different frequencies of nominalization. Based on these five criteria, verbs of conversation can be further divided into five groups with corresponding sets of near-synonyms. We will address the syntactic-semantic interdependencies revealed by each pattern in the following sections.
3.1 V+V Pattern: with the preceding verbs jinxing or dacheng
The use of the light verb jinxing entails a formal register and encodes a process or atelic event. It tends to occur with an activity verb that is compatible with the formal register and involves a durative process, as shown in (2):
(2)a. //
b. * /////
Below is the distributional tendency found in Sinica Corpus:
(3) Percentage with jinxing
V1 \ V2                                      Other Verbs
jinxing        4% (3/83)     6% (25/419)     0%
Another verb, dacheng 'achieve', is also compatible with some conversation verbs; it requires a formal register but encodes a telic event. The verb dacheng is found only with the nominalized forms of certain verbs, i.e., activity verbs entailing a semantic endpoint with an incremental theme, as shown in (4):
(4) a. //
b. * / /////
Co-occurrence with the preceding verbs jinxing or dacheng serves to distinguish the conversation events in terms of their pragmatic mode (formal vs. informal) and event type (telic vs. atelic).
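The two conditions described in this section can be sketched as simple feature checks. This is only an illustration: the feature names (formal, durative, bounded) and the feature values assigned to the three verb types are assumptions made for the example, not values given in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerbFeatures:
    formal: bool    # compatible with the formal register
    durative: bool  # denotes a durative process
    bounded: bool   # entails a semantic endpoint


def takes_jinxing(v: VerbFeatures) -> bool:
    # jinxing: formal register plus a durative process
    return v.formal and v.durative


def takes_dacheng(v: VerbFeatures) -> bool:
    # dacheng: formal register plus a semantic endpoint
    return v.formal and v.bounded


# Hypothetical feature assignments, chosen only for illustration.
discuss_type = VerbFeatures(formal=True, durative=True, bounded=False)
negotiate_type = VerbFeatures(formal=True, durative=True, bounded=True)
chat_type = VerbFeatures(formal=False, durative=True, bounded=False)

for name, v in [("discuss", discuss_type),
                ("negotiate", negotiate_type),
                ("chat", chat_type)]:
    print(name, takes_jinxing(v), takes_dacheng(v))
```

Under these assumed values, the discuss type accepts only jinxing, the negotiate type accepts both verbs, and the chat type accepts neither.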
3.2 V+NP pattern: Interlocutor2 as the Direct Object
Another pattern that sets the verbs apart concerns the semantic role of the object NP. While most verbs can take only the Topic as the direct object, some verbs may encode Interlocutor2 as the direct object without an associative marker, as shown in (5):
(5) a. []Intl1 []Intl2
b. [] Intl1 [] Intl2
This suggests that with these verbs, the co-participant, Interlocutor2, may be viewed as the undergoer or the affected target of the event. Among the sentences in the Sinica Corpus in which these verbs are followed by an object NP, an average of 23 percent take Interlocutor2 as the direct object:
(6) Percentage with Interlocutor2 as DO
DO \ V                                   Other Verbs
Interlocutor2 as DO     23% (28/123)     0%
3.3 Use of Inanimate Subject
Interlocutors in the conversation events are by default human participants. However, some verbs may take inanimate subjects (place or institution names) as Interlocutors via metonymic extension from institution/building to human organization:
(7)a. ///
b. */*
The application of metonymy tends to be associated with verbs that comply with the formal register, which also requires a formal, non-personal topic (e.g., public affairs). The Sinica Corpus shows that verbs with marked manners tend not to be used with metonymic subjects:
(8) Percentage with Inanimate Subject:
Subj. type \ V                                                   Other Verbs
Inanimate Subject     2% (4/191)     6% (5/83)     25% (21/85)   0%
3.4 Postverbal Complement with Effect Evaluation
Among the conversation verbs, only the negotiate verbs may collocate with an effect evaluation complement such as chenggong 'successfully' or shibai 'unsuccessfully', as shown below with examples and percentages from the Sinica Corpus:
(9) a. []
b. []
(10) Percentage of result evaluation complement:
Comp. \ V                                 Other Verbs
Evaluation complement     12.5% (2/16)    0%
Co-occurrence with an effect evaluation complement indicates that the semantics of the negotiate events encodes some kind of effect or result that is sought through the negotiation process. This also implies that the two-way communication in a negotiate event is a solution-seeking process which is semantically bounded and may be evaluated as to whether the solution or purpose has been achieved.
This pattern also correlates with the use of dacheng 'achieve', as mentioned above, in signaling an evaluation of the attainment of the desired result.
3.5 Frequency of Nominalization
Some groups of verbs tend to be nominalized more frequently than the others. Comparing the high-frequency verbs and their distributions over grammatical functions, we see clear skewing in nominal uses:
(11) Distribution of Predicate vs. Nominal Uses
Func. \ V
Predicate       97% (680/701)   52% (83/161)   55% (415/1013)   76% (123/162)   94% (134/142)
Nominalized     3% (21/701)     48% (78/161)   45% (598/1013)   24% (39/162)    6% (8/142)
Nominalization serves to change verbs to event nominals that may be referred to as a quantifiable entity. Nominalization is also highly correlated with the formal register of written texts.
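The percentages in (11) can be recomputed directly from the raw counts. The short sketch below does this; the counts are copied from the table in column order, and the function name is introduced here only for illustration.

```python
# (predicate uses, nominalized uses) per verb, in the column order of table (11)
counts = [
    (680, 21),
    (83, 78),
    (415, 598),
    (123, 39),
    (134, 8),
]


def shares(predicate: int, nominal: int) -> tuple[float, float]:
    """Return the predicate and nominalized shares as percentages."""
    total = predicate + nominal
    return 100 * predicate / total, 100 * nominal / total


for pred, nom in counts:
    p, n = shares(pred, nom)
    print(f"{pred + nom:5d} tokens: predicate {p:5.1f}%  nominalized {n:5.1f}%")
```

Four of the five columns reproduce the printed percentages after rounding. For the third column, the printed counts (415 and 598 out of 1013) give roughly 41% and 59% rather than the printed 55% and 45%, so either the counts or the percentages in that column may contain a transcription error.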
4. Subframes as an Anchor for Near-synonyms
The asymmetrical distributions of the conversation verbs over different collocational associations clearly suggest that verbs can be further divided into subtypes. Although sharing the same conceptual frame, subclasses of verbs show distinct patterns of syntactic-semantic interdependencies that may serve as the basis to further define near-synonym sets. These subtypes may be viewed as anchored in different subframes. Below is a summary of the collocational patterns associated with the 5 subframes within the conversation frame:
(12) Collocational Patterns associated with the Conversation Subframes
Subtypes \ CP   jinxing+V   dacheng+V   Intl2 as DO   Inanimate Subject   Complement   [+Nom]
1. Converse     No          No          No            Yes                 No           Low
2. Discuss      Yes         No          No            Yes                 No           High
3. Negotiate    Yes         Yes         Yes           Yes                 Yes          High
4. Quarrel      No          No          No            Yes                 No           Mid-High
5. Chat         No          No          No            No                  No           Low
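The summary in (12) can be read as a feature matrix. The sketch below encodes it and lists the patterns that separate any two subframes. The field names are invented for this example, and reading the two "+V" columns as jinxing followed by dacheng is an assumption based on the order of discussion in Section 3.1.

```python
from typing import NamedTuple


class Pattern(NamedTuple):
    jinxing_v: bool       # first "+V" column (assumed to be jinxing + V)
    dacheng_v: bool       # second "+V" column (assumed to be dacheng + V)
    intl2_as_do: bool     # Interlocutor2 coded as direct object
    inanimate_subj: bool  # metonymic inanimate subject
    complement: bool      # effect evaluation complement
    nominalization: str   # [+Nom] level as given in the table


SUBFRAMES = {
    "Converse": Pattern(False, False, False, True, False, "Low"),
    "Discuss": Pattern(True, False, False, True, False, "High"),
    "Negotiate": Pattern(True, True, True, True, True, "High"),
    "Quarrel": Pattern(False, False, False, True, False, "Mid-High"),
    "Chat": Pattern(False, False, False, False, False, "Low"),
}


def distinguishing_features(a: str, b: str) -> list[str]:
    """Names of the collocational patterns on which two subframes differ."""
    pa, pb = SUBFRAMES[a], SUBFRAMES[b]
    return [f for f in Pattern._fields if getattr(pa, f) != getattr(pb, f)]


print(distinguishing_features("Discuss", "Negotiate"))
print(distinguishing_features("Converse", "Quarrel"))
```

On this encoding, every pair of subframes differs in at least one pattern. Converse and Quarrel are separated only by their nominalization level, which suggests that this criterion carries real weight in the classification.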
Based on the distributional variations of collocational patterns, we can group all the other conversation lemmas into the five subframes:
1) Converse subframe: , , , , , , , ,,,,
2) Discuss subframe: , , , ,
3) Negotiate subframe: , , , , , , ,
4) Quarrel subframe: , , , , ,
5) Chat subframe: , , , , .
5. Concluding Remarks and Theoretical Implications
With the proposal of subframes within the theoretical construct of Frame Semantics, verb meanings may be defined with finer distinctions that are syntactically motivated. However, further fine-grained semantic distinctions are still needed to differentiate near-synonyms within each subframe. It is exactly at the subframe level that we may anchor all the near-synonym sets as closely related. In sum, to fully represent the meaning relations among verbs, we would like to propose the following classification scheme for representing verb meanings:
Domain -> Frame -> Subframe -> Near-synonym Set -> Lemma
This five-layered structure allows verbs to be represented in a frame-based semantic hierarchy with detailed lexical information to further disambiguate near-synonyms.
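As a rough illustration of how the five layers could be stored and queried, the sketch below records each lemma together with its full path. The class and function names are hypothetical, and the sample entries (the domain label "Communication", the near-synonym set names, and the subframe assigned to each romanized lemma) are assumptions chosen for the example rather than classifications stated in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LemmaEntry:
    domain: str
    frame: str
    subframe: str
    synonym_set: str
    lemma: str

    def path(self) -> tuple[str, ...]:
        # Domain -> Frame -> Subframe -> Near-synonym Set -> Lemma
        return (self.domain, self.frame, self.subframe,
                self.synonym_set, self.lemma)


# Hypothetical sample entries for illustration only.
LEXICON = [
    LemmaEntry("Communication", "Conversation", "Negotiate", "negotiate", "xietiao"),
    LemmaEntry("Communication", "Conversation", "Negotiate", "negotiate", "xieyi"),
    LemmaEntry("Communication", "Conversation", "Chat", "chat", "xianliao"),
    LemmaEntry("Communication", "Conversation", "Chat", "chat", "liaotian"),
]


def near_synonyms(lemma: str, lexicon: list[LemmaEntry]) -> list[str]:
    """Lemmas that share all four upper layers with the given lemma."""
    target = next(e for e in lexicon if e.lemma == lemma)
    return [e.lemma for e in lexicon
            if e.lemma != lemma and e.path()[:-1] == target.path()[:-1]]


print(" -> ".join(LEXICON[0].path()))
print(near_synonyms("xietiao", LEXICON))
```

Storing the full path with each lemma keeps the lookup simple; a nested tree would serve equally well if the upper layers need their own attributes.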
References
Baker, Collin F., Charles J. Fillmore, and Beau Cronin. 2003. The Structure of the FrameNet Database. International Journal of Lexicography 16(3).281-296.
Das, Subrata, Kurt Shuster, and Curt Wu. 2002. Ontologies for Agent-Based Information Retrieval and Sequence Mining. Proceedings of the Workshop on Ontologies in Agent Systems (OAS02), held at the 1st International Joint Conference on Autonomous Agents and Multi-Agent Systems, Italy, July 15-19.
Fillmore, Charles J., and Beryl T. Atkins. 1992. Toward a Frame-Based Lexicon: The Semantics of RISK and Its Neighbors. Frames, Fields, and Contrasts, ed. by Adrienne Lehrer and Eva Feder Kittay. 75-102. Hillsdale, New Jersey: Lawrence Erlbaum.
Kennedy, C., and B. Levin. 2002. Telicity Corresponds to Degree of Change. Handout for a talk at Georgetown University.
Liu, Mei-Chun, and Yiching Wu. 2003. Beyond Frame Semantics: Insight from Mandarin Verbs of Communication. Paper presented at the 4th Chinese Lexical Semantics Workshop. City University of Hong Kong, Hong Kong. June 22-July 11. (http://icl.cityu.edu.hk/conference/4CLSW/BIG5/home.htm)
Miller, G. A., R. Beckwith, C. Fellbaum, D. Gross, and K. J. Miller. 1990. Introduction to WordNet: An On-line Lexical Database. International Journal of Lexicography 3.235-244.
Niles, Ian, and Adam Pease. 2003. Linking Lexicons and Ontologies: Mapping WordNet to the Suggested Upper Merged Ontology. Proceedings of the 2003 International Conference on Information and Knowledge Engineering (IKE03), Las Vegas, Nevada, June 23-26, 2003.
Pease, A., Niles, I., and Li, J. 2002. The Suggested Upper Merged Ontology: A Large Ontology for the Semantic Web and its Applications. Working Notes of the AAAI-2002 Workshop on Ontologies and the Semantic Web, Edmonton, Canada, July 28-August 1, 2002.
Website Addresses:
FrameNet: http://www.icsi.berkeley.edu/~framenet/
HowNet: http://www.keenage.com/html/c_index.html/
Sinica Corpus: http://www.sinica.edu.tw/SinicaCorpus/
SUMO Ontology: http://ontology.teknowledge.com/
WordNet: http://www.cogsci.princeton.edu/~wn/
The Academia Sinica Bilingual Ontology WordNet (BOW): http://bow.sinica.edu.tw/
Li & Huang (1999)52716780014.8%(1a)(1b)(1c)(1d)(1e)
(1)a.
b.
c. d.
e.
(Word Sense Disambiguation)
Li & Huang (1999)Lam et al. (1997)45.5%(1999)52.13%Chen et al. (1999)
Choueka et al. (1983)(2002)Chuang (2003)setAhrens et al. (2003)MARVS(sense)(meaning facet)
(1)(1d)(1e)
(2)a.E-Mail
b.
(3)a.b.
(4)a.
b.
(2003)WASPS(http:// wasps.itri.bton.ac.uk/)(word sketch)
1.
2.
3.
1334
DOIORCDCIO
DO
-
IO
-
-
-
-
-
RC
DC
DO(5a-b)(6a-b)DCDOIOIO(7a)7(b)RC(8a)(8b)
(5)a.
b.
(6)a.b.
(7)a.
b.
(8)a.
b.
(9)a.
b.
(1d-e)
1334
(100%)
629
(47.16%)
530
(39.73%)
124
(9.3%)
37
(2.77%)
9
(0.67%)
5
(0.37%)
28.26%
8.41%
19.56%
0.29%
+L+VP
3.45%
+IO
10.49%
+DO+C
0.22%
+L+VP
2.5%
+IO+DO
5.7%
0.07%
+L
1.35%
+IO
1.8%
+L
0.9%
+IO+DO
1.35%
+L+DO
0.07%
+IO+VP
0.15%
+IO+VP
0.07%
+IO
0.07%
+DO+L+IO
0.07%
71.74%
71.74%
38.78%
20.16%
9%
3%
0.66%
0.37%
+C
35.99%
33.36%
1.28%
0.68%
0.38%
0.22%
0.07%
+DO
16.49%
2.2%
13.1%
0.82%
-
0.37%
-
+DO+VP
8.41%
0.68%
0.9%
5.4%
1.43%
-
-
4.35%
1.35%
1.95%
0.68%
-
0.07%
0.3%
+DO+L
1.88%
0.75%
-
1.13%
-
-
-
+IO+VP
1.65%
0.3%
1.35%
-
-
-
-
+DO+IO
1.50%
0.07%
1.43%
-
-
-
-
+VP
0.74%
-
-
0.22%
0.52%
-
-
+DO+L+VP
0.51%
-
-
0.07%
0.45%
-
-
+DO+IO+VP
0.22%
0.07%
0.15%
-
-
-
-
(10)a.
b.
c.
d.
e.f.
(11)a.
b.
c.
d.
(12)a.b.
c.
d.
(13)a.
b.
c.
d.
e.
(14)a.
b.
(15)a.E-Mail
b.
(16)a.b.
(17)a.
b.
c.
629 (100%)
530 (100%)
124 (100%)
37 (100%)
+C 70.6%
+DO 33%
+DO+VP 58%
+DO+VP 51.4%
+L+VP 7.3%
+IO 26.4%
+DO+L 12%
+VP 18.9%
+L+VP 5.2%
+IO+DO 14.3%
+DO 9%
+DO+L+VP 16.2%
+DO 4.6%
4.9%
7.3%
+C 13.5%
2.9%
+IO 4.5%
+C 7.3%
+L 2.9%
+DO+IO 3.6%
+DO+C 2.4%
+L 1.9%
+IO+DO 3.4%
+VP 2.4%
+DO+L 1.6%
+IO+VP 3.4%
0.8%
+DO+VP 1.4%
+C 3.2%
+DO+L+VP 0.8%
9 (100%)
+IO+VP 0.6%
+DO+VP 2.3%
+DO 55.6%
+DO+IO 0.2%
+DO+IO+VP 0.4%
+C 33.3%
+DO+IO+VP 0.2%
+IO+VP 0.4%
11.1%
+DO+L+IO 0.2%
IO 0.2%
5 (100%)
+IO+VP 0.2%
80%
+L+DO 0.2%
+C 20%
1.
(18a)(18b-c)(19a)(19b)
(18)a.
b.
c.
(19)a.
b.
(20)a.
b.*
(20)
2.
26.4%4.5%14.3%3.4%
(21)a.b.
3.
(22a)(22b)(19a)(21b)
(22)a.
b.
4.
(23a)(23b)
(23)a.
b.
[location]
Goal [destination]
Goal [organization]
Goal [direction]
[human]
Goal [departure]
Goal [destination]
Goal [activity]
[receipient]
[life]
(24)a.b.
c.
(25)a.b.
(26)
(27)a.
b.*
c.
(28)a.
b.
(29)a.
b.
c.
(30)
2001
2002CLCLP 7.2, 77-88.
1980
1999
2001
2003
20031220
1983
Ahrens, K., C.-R. Huang, and Y.-H. Chuang. 2003. Sense and meaning facets in verbal semantics: A MARVS perspective. Language and Linguistics 4.3: 469-484.
Chen, H.-H., G.-W. Bian, and W.-C. Lin. 1999. Resolving translation ambiguity and target polysemy in cross-language information retrieval. CLCLP 4.2, 21-38.
Choueka, Y., and S. Lusignan. 1983. A connectionist scheme for modeling word sense disambiguation. Cognition and Brain Theory 6.1, 89-120.
Chuang, Y.-H. 2003. Sense Distinction of Verbs in English and Mandarin Chinese: An Analysis of the Verbs Set and Bai3. MA thesis. National Taiwan University.
Lam, S.-S., K.-F. Wong, and V. Lum. 1997. LSD-C: A linguistics-based word sense disambiguation algorithm for Chinese. Computer Processing of Oriental Languages 10.4, 409-422.
Levin, B., and M. Hovv. 2001. What alternates in the dative alternation? CSSP.
Li, J., and C. Huang. 1999. A model for word sense disambiguation. CLCLP 4.2, 1-20.
2347
1
1998
1988119,50713,768
7,1052,347
2
2347
2.1
1
2
2.2
10
1
()
616    31.21
437    22.14
381    19.30
182    9.22
145    7.35
107    5.42
73     3.70
16     0.81
12     0.61
5      0.25
31.2110NPNPVPVP
2.3
2.4
2
(%)
146    23.70
73     11.85
70     11.36
54     8.77
37     6.01
36     5.84
35     5.68
25     4.06
20     3.25
17     2.76
11     1.79
9      1.46
8      1.30
7      1.14
7      1.14
6      0.97
5      0.81
4      0.65
4      0.65
4      0.65
3      0.49
3      0.49
3      0.49
2      0.32
2      0.32
2      0.32
2      0.32
2      0.32
2      0.32
2      0.32
2      0.32
1      0.16
1      0.16
1      0.16
1      0.16
1      0.16
1      0.16
1      0.16
1      0.16
1      0.16
1      0.16
1      0.16
1      0.16
1      0.16
14623.707311.857011.36
3
        386      371      279      85      58      35      19
(%)     31.31    30.09    22.63    6.89    4.70    2.84    1.54
384.02
2.5
48290
3 3.1
1998
(interaction)(Tenor) (Vehicle)2004 A B B(target domain)A(source domain)(ground)
3.2
A B B A
(Love is a journey)
A B
Target domain Source domain
Tenor Vehicle
(
F: A ( B
F: B(A
3.3
1998(dead and buried)(inactive metaphor)
1998
1)
[]
[]
[]
2)
[]
[]
[]
[]
[]
[]
[]
3.4
2003
1)
2)
3)
1
(
2
(
3
(
4
(
5
(
6
(
2
[]
[]
[]
3.5
(handles)
4
(1988)
(2004)http://www.skycn.com
(1998)6P10-19
(2003)352P119-121
(2003)4P19-2420042
(2004)
(2000)
(2002)
(1985)
X
1.
1.1
(1980[2003])1998
(1980[2003])
1998
(1980[2003])1998
1.2
(2004)(1)
(1)//
(2)
(2)//
( 1997)(1)-(2)
(2000)(3)
(3)()
(4)
vadj4217117
1612(V)(5)
(5)()
(6)-(8)
(6)
(7)
(8)
(8)
A
1
(B)
2
C
3
D
4
42
66
AB(C)(D)(9)-(13)
(9)()
(10)()
(11)()
(12)()
(13)()
2.
2.1-2.3
2.1
(2002)
(2002)
A
B
C
D
(2002)
(14)(A)
(15)(B)
(16)(C)
(17)(D)
2.2-2.3
2.2
(18)()
(19)()
(18)////(19)
(20)()
(20)
(21-23)
(21)()
(22)()
(23)()
(1992)
(24)-(25)
(24)()
(25)()
15(26)
(26)
24(27)(28)
(27) ()
(28)
(27)(28)D4(17)(29)(30)
(29)()
(30)
(31)()
2.3
3.
3
7
1
adv
38
10
12
3
adv
()
1
adv
6
1
adv
1
5
2
v
1
11
v
1
adv
3
adj
17
17
3
2000
66
15
16
0
42
0
4
;
1
2
adv
44
5
4
3
3
15
adv
5
1
adv
4
adv
3
1
138
56
6
4
3
5
15
1
0
A.
B.
C.2
D.
. 1997.
2004
. 2003.2:1-561-73
. 2002.
. 1998.
1992.
. 2003.19805
(D-Type Theory)
(czs,zhouq)@s1000e.cs.tsinghua.edu.cn
.-
1
Montague[11]Categorical Grammar [12]ete te e tNPSV1V2MontagueCategorical Grammar
[2] LOC(),TIM(),IND(),RELn (n),SIT(),INF(),TYP(),PAR(),POL()
/ [2][1][3]/[1][3]sssss /thesaurus(sort,class) (case, theme)[2]
(G-type),(D-type/O-type/)[7]
[7]/([7])//-//
/-2.3. .4.(). 5..
2.
2.1
/
[7]
4
(
2.2
r,l:Loc,I1,,In;p.
/--
. //((
3.
3.1
3.1. T1,T2
xx : T1x : T2T1T2T1T2T2T1.
. T1T2T1T2 T1=T2. T1=T2((T1=T2)T1T2.
T1>T2T1T2T1T2. T2T2T1T2T1T2T1=T2T1,T2T1T2(
( (( , )
3.2T1,T2T3T1 T3,T2 T3T1,T2T1T2T3T1,T2
(
3.3.T,T{Ti | TiT},T{Ti | TTi},T(
T (T, )TTT
.3.4.T1,T2T3T3 T1,T3 T2T1,T2T1T2T3T1,T2T1#T2.
(
3.5. xxVx{T| x : T}x(
xVx/
xVx =[x]. xx
(( ,)((,, /
3.2
A1,A2,,An. [9][m_][m_][m_][m_][m_][m_][m_][m_][m_][9]
3.6.`T1,T2T1T2,T1 T2,T1+T2,T1(T2
( : T1T2a1 : T1,a2 : T2 (=
( : T1T2a1 : T1,a2 : T2 (={a1,a2}
( : T1 + T2( : T1 ( : T2
{a1,a2}
( : T1 ( T2,((( : T1), ( : T2./*, ( : T1,( : T2. */
(
T1T2Tn,T1T2Tn,T1(T2(( Tn ,T1+ T2++ Tn.T1,T2,,Tn()
( :: T1T2a1 :: T1,a2 :: T2 (=
( :: T1T2a1 :: T1,a2 :: T2 (={a1,a2}
T1,T2 ,T1,T2T1T1, T2 T2T1T2 T1T2,T1T2TT2, T1+ T2 T1+ T2T1 ( T2 T1 ( T2.
(
nn(TT = T1T2TnT1T2Tnn/
T=T1T2 Tm, (TTi(i=1,,m)m-T=T1 + T2+ + Tk(TTk(T(T1 + (T2+ + (Tk, (T=(T1 + (T2+ + (TkT=T1 ( T2 ( ( Tk(TTk(T(T1 ( (T2 ( ( (Tk, (T=(T1 ( (T2 ( ( (Tk.
3.7. T1,T2T1 T2T1T2f :T1 T2f:a ( ba:T1 b:T2.(
3.3
/[7]
[7]
.
((()( (:T((()((T)(T((T).( = [(].()c_(T)[c_(T)]TT=[m_][m_]set[m_]set[m_]((_T.
( =r,(;p,( r((:T, TrrR_T(r),Tr. : r-T = R_T(r)TrrT
(r,(,p ((r,( ;p)( D([r]TP), PpD(),D(rTP)
-
R_T(r)rR_T-R_T=(R_T(r)|r:REL)= (Tr|r:REL)r:RELr
.((()(::T1,((()[(](T)=T1(T).
R_TR_T(r)-F_T(()(F_T(()T(. F_T
F_T=(F_T(()|(:FUN)=(T(|(:FUN)
.x|(,I)I[7]I
C=(,I),CTC
TC=T1T2TmD(T1)D(T2)D(Tn)
(#)
T1,T2,,Tm, D(T1),D(T2),,D(Tn ) I
xT,x|(,I)[x|(,I)] = T|TC,TC(#)
[7]
x|C[x|C].
C = {(T1,x1;1,(T2,x2;1},x=x1,x2;
[x1,x2|C]= T1T2, [bag(x1,x2|C)]= T1T2;
C = {(T1,x;1or(T2,x;1}, [x|C]= T1+ T2;
C = {if(T1,x;0then(T2,x;1}, [x|C]= T1( T2
-
r([r],[(]r=[r], (=[(].
3.8.T
(T(T((T)((T),([(](=T( ( (T()(=T( (( T()
rTr TD(rTP)PD(T).
(1(T1),, (m(Tm), D(T1),,D(Tn), T(1(T1)(m(Tm)D(T1)D(Tn), T|(1(T1)(m(Tm)D(T1)D(Tn)
(
1 (T1,T2,T( (()T1T2,((T1)((T2);T1=T2
2 (1,(2, (1>>
72AAAABABADAD> > >60%
7314413014130
BB5004>HH3509>EE2609>DD2556>EB2530>ED2105>HD1979>HB1381>BD1041>IB1026>II966>FB912>DB842>EA822>EH794>ID770>CC719>IH711>AA670>HI578>CB573>BK572>GG560>JD539>KH470>DA469>BC466>HE453>HA446>CD444>FF442>EI437>FH395>IE390>HJ389>DH377>KD375>HC365>DC344>GD331>BE309>AD301>JB289>KE284>HF279>JK279>EC272>EK267>KK264>FI263>EL262>JJ261>HK258>BH243>BA240>HG239>EG234>IF229>GH215>KJ208>DE208>IC203>BI203>EF202>KI199>FD179>JE175>EJ174>JI167>KG167>DK159>KC158>FK155>KB153>FE151>CH144>IJ138>IK134>GE130>IA121>DI120>DG108>GK107>KA102>JC100>AH99>AB99>GI95>JG93>JA91>FJ88>CA86>CE86>AK80>FC74>IG72>AE71>CI67>GJ66>KF64>GB64>GA56>DH55>GF54>AC50>FG48>FA46>JF40>BG28>BJ25>DF25>CJ24>AI23>GC19>CF17>AG16AJ16>CG14>LK6>LL5>LE4HL4AF4>LD3>KL2LG2>GL1LJ1IL1LI1
BBHHEEDD
74B21189>H17242>D16025>E15685>I7928>K5381>C5281>A4604>F4223>J3634>G3339>L5612BL
75E10763>H9928>B8267>D5325>I4788>K2808>F2750>C2296>J2176>G1735>A1433>L3774EEB12916> D10697> H7314>E4918>A3167>I3139>C2985>K2573>F1473>J1458>G1604>L1974BDHEA53%27%13%92%
76A
[1] 200331-10
[2]200272129-142
[3]
[4] 2002
[5]199619962003
19884
1996
2001
19853
1985
19982
20001
20016
1991
19957
1999,4
100083
semantic componentsememe2050(lexeme)(semantic component)(semantic marker )(sememe) (semantic feature)
(Jakobson)(Hjelmsev )Katz ,Lakoff, McCwley ,Ross Dowty(2000)
S
Cause Zhangsan S
Become S
Not S
Alive Lisi
S
Cause Zhangsan S
Become S
Not alive Lisi
S
Cause Zhangsan S
Become Not alive Lisi
S
Cause become not alive zhangsan Lisi
KILL
S
KILL ZHANGSAN LISI
KILLKILL(BECOME)DEAD(BECOME NOT) ALIVE
1968
human,adult,male
1)
(2002)
+++++
+++
converge+++
++++++
+(-)+(-)
++
+++
converge++
2
1957
(Colorless green idea sleeps furiously)
++++++
3
2000
a1 2 3 123
b1 342 1234
c3 4 1234
1 2
abc
+++
+++
+++
++{}+
++{}
++{}+{}
{} {}{}
++++++ ++ +
+++++++
{} {}{}
{} {}{}
+++++{}+++{}
++++{}+
{} ++{}+
abb+a
{} ({}{}
{} {}{}
++++{}+++{}
1
{}++{}+++{}
{}( {}({}
2000
2000
2002
19961http://yywz.jhun.edu.cn/xide.htm
Feature Representations and Logical Compatibility between Temporal Adverbs and Aspects
Shih-Min Li, Su-Chu Lin, Keh-Jiann Chen
CKIP, Institute of Information Science, Academia Sinica, Taipei
{shihmin, jess}@hp.iis.sinica.edu.tw
Abstract
In this paper, we propose clear-cut definitions to distinguish temporal adverbs and provide descriptive features for each class of temporal adverbs. By adopting a corpus-based approach and measuring time points on the temporal axis, the temporal adverbs listed in Lu & Ma 1999 are revised and reclassified into four main classes, namely time, frequency, duration, and time manner. The descriptive features suffice to discriminate temporal relations and to predict the logical compatibility between temporal adverbs and aspects.
1 Introduction
There are about 130 temporal adverbs in Mandarin Chinese. Lu & Ma classify the temporal adverbs into two groups, speaking-time related adverbs (abbr.: ST-related adverbs) and reference-time related adverbs (abbr.: RT-related adverbs). The ST-related adverbs consist of 27 temporal adverbs, which are subdivided into three subclasses. In the class of RT-related adverbs, 104 temporal adverbs are listed and subdivided into 18 subclasses. Lu and Ma's subdivision of temporal adverbs is based upon aspects of situations. However, the subdivision is vague and the definitions are ambiguous. For example, cengjing, ceng, yeyi and yejing are grouped into two different subclasses of ST-related adverbs. The former two, cengjing and ceng, are grouped into one subclass, which expresses that the actions or situations existed or happened before the speaking time. The latter two, yeyi and yejing, are grouped into another subclass, which indicates that the actions or situations have been completed or have occurred. In fact, it is difficult to differentiate actions or situations that have happened from those that have been completed, especially when the situation type is an achievement with SHORTLY-PRECEDE(t1,t2) or NEARLY-EQUAL(t1,t2). Moreover, temporal adverbs may not have the same syntactic behaviour even though they are classified into the same subclass. For instance, the ST-related adverbs cong, conglai, zhijin, xianglai, sulai, lilai, su, and yixiang are grouped into the same subclass. When co-occurring with the aspect markers le, guo and zhe, cong, conglai and zhijin are incompatible with le and zhe; however, xianglai, sulai, lilai, su and yixiang are incompatible with le and guo. The cause of this difference in the compatibility of temporal adverbs with aspects will also be discussed.
In this paper, we propose clear-cut definitions and provide descriptive features for each subclass of temporal adverbs. The descriptive features help to define temporal relations and to predict the compatibilities between temporal adverbs and aspect markers.
2 Literature Review and Methodology
To make a clear-cut differentiation, we use the Academia Sinica Balanced Corpus (Sinica Corpus) and adopt a corpus-based approach to analyze Mandarin Chinese temporal adverbs. Time points on the temporal axis will be used to define the temporal relations of the temporal adverbs in Lu & Ma 1999.
Smith (1991) discusses aspectual systems in language. She illustrates each situation type and viewpoint type with temporal schema. Below are the temporal schemata of Mandarin Chinese aspectual markers le, guo and zhe, which are represented by symbols I, F, F+1, . and /.
(1) Temporal schema for the le Perfective (Smith 1991: 348)
IF
// (RVC)
(2) The Mandarin guo perfective viewpoint (Smith 1991: 353)
I .F F+1
/ /
(3) The zhe viewpoint (Smith 1991: 363)
I..
////State
Klein (1994) points out five temporal features and introduces the notions TT, TU and TSit. TT, topic time, is the time span to which the speaker's claim on the occasion is confined. TU, time of utterance, is the time at which the utterance is made. TSit, time of situation, is the time at which the event occurs.
In addition to Smith's and Klein's temporal terminology of time points and time intervals, the event modules in the framework MARVS (Module-Attribute Representation of Verbal Semantics) proposed by Huang et al. (2000) can also be applied to analyze the temporal relations of temporal adverbs. Event modules are the basic building blocks of the event contour. There are five event modules, which stand alone or in combination: Boundary, Punctuality, Process, State and Stage. The event module Boundary is defined as an event module that can be identified with a temporal point and must be regarded as a whole (including a complete Event); it is adopted in this paper to define the notion of Boundary Point.
Yang & Bateman further discuss the semantic temporal relations of the aspect system and propose principled semantic conditions for aspect combination. In their opinion, the Chinese aspect system is actually composed of both aspect morphemes (-le, -zhe, -guo4, etc.) and aspect adverbials. Moreover, they propose that the Chinese aspect system has basically seventeen simple primary aspect forms. These simple primary aspect forms belong to the three subsystems of perfective, imperfective or future-existing according to the semantic properties in individual cases. Some simple primary aspect forms can combine to form an aspect of secondary type if their temporal attributes are in harmony. The temporal relation of the combination is represented graphically by the time points ti, tf, tr and ts.
In this paper, we adopt the terms proposed in the research above to help us clarify the temporal relation of each subclass of the temporal adverbs listed in Lu & Ma. We use the notations ST, RT, ET, BP, Start and End to define temporal relations; they denote, respectively, the speaking time, the reference time, the event time, the boundary point, the start point of the event, and the end point of the event. For instance, the temporal features for le, guo and zhe are defined as follows in our system, in a way that is compatible with the notions of Smith (1991).
le: BP < ST, which means the prominent boundary point of the referred event precedes the speaking time.
guo: End