
Computers and the Humanities 35: 421–438, 2001. © 2001 Kluwer Academic Publishers. Printed in the Netherlands.


A Method for Supporting Document Selection in Cross-language Information Retrieval and its Evaluation

MASAMI SUZUKI∗, NAOMI INOUE and KAZUO HASHIMOTO
KDD Research and Development Laboratories Inc., 2-1-15 Ohara, Kamifukuoka-shi, Saitama, 356–8502 Japan (∗author for correspondence: E-mail: [email protected])

Abstract. It is important to give useful clues for selecting desired content from a number of retrieval results obtained (usually) from a vague search request. Compared with monolingual retrieval, such a support framework is indispensable and much more significant for filtering translingual retrieval results. This paper describes an attempt to provide appropriate translation of major keywords in each document in a cross-language information retrieval (CLIR) result, as a browsing support for users. Our idea of determining appropriate translation of major keywords is based on word co-occurrence distribution in the translation target language, considering the actual situation of WWW content where it is difficult to obtain aligned parallel (multilingual) corpora. The proposed method provides higher quality of keyword translation to yield a more effective support in identifying the target documents in the retrieval result. We report the advantage of this browsing support technique through evaluation experiments including comparison with conditions of referring to a translated document summary, and discuss related issues to be examined towards more effective cross-language information extraction.

Key words: browsing support, cross-language information retrieval, partial translation, term list

1. Introduction

Cross-language information retrieval (cf. Oard, 1999) for ordinary users is becoming a realistic task with the recent explosive expansion of WWW environments. Currently, physically accessible WWW pages (many of them containing language-independent visual contents) are dramatically increasing in Asian countries as well as in other world areas. Nevertheless, a huge number of valuable documents are virtually impossible to reach due to high language barriers. In such a situation, some large-scale search engines offer language selection as one of the filtering parameters. However, usually no language support for browsing the listed (hit) results is given during the retrieval navigation, except in a few examples like TITAN (Kikui et al., 1995), while full-text translation (of limited quality) is available after the document is selected to be read; e.g. Altavista (http://www.altavista.com/), Davis and Ogden (1997). Our objectives are to examine support techniques for document selection in cross-language information retrieval, and to design a practical server system for non-specialist users. We also believe such a system will contribute to reducing the cost of full-text translation for any document and to promoting inter-lingual (cultural) information exchanges.

We consider that cross-language information retrieval (CLIR) should have the function of providing information useful enough in identifying the relevance of the retrieved document, in the user’s language. Though the ideal may be a cross-language text summary, a possible fallback position would be the indication of translated important keywords (phrases) in the text. To achieve this subgoal, we implemented a keyword-based cross-language search engine, which accepts keywords in English or Japanese (currently) and provides relevant documents in different languages including Chinese as well as English and Japanese, with word-level translation of major keywords extracted from each content. The basic idea and method are described in Section 2, and its evaluation (in comparison with the case of providing a translated summary of contents) will be reported in Section 3, followed by a discussion and a conclusion.

2. Browsing Support for Cross-Language Text Retrieval

2.1. BACKGROUND AND MOTIVATION

As we described in the Introduction, providing browsing support information is as crucial for document selection as extracting relevant documents in cross-language information retrieval. We find only a few approaches to this issue, like the example of TITAN, which provides roughly translated HTML (or section) titles of retrieved documents, though the effectiveness has not yet been reported. In other words, it may be more difficult to evaluate supporting methods than the retrieval performance itself. However, Resnik (1997) showed evidence of a browsing support effect caused by information created as “gisting” of original text. Its outline is as follows:

Resnik emphasizes the importance of “decision making” using provided information. In his experiments, Japanese yellow page descriptions¹ were used for original information source and their translated “gisting” was prepared as a listing of English translation candidates for each noun in the description. Thus, such a gisting is considered to help a user’s decision in estimating the abstract of each description. Resnik reported a result of an evaluation experiment, where subjects classified given descriptions into 6 pre-assigned service categories, referring to only the translated gisting. According to his view, it showed a sufficient effectiveness as “decision making” in such a task.

Our objective is to give more general support clues for retrieving foreign documents with a certain amount of text. The basic idea of “Enhancing source text (with translated major keywords) for WWW content distribution and retrieval” was proposed in Suzuki et al. (1996) based on the following scheme (refer to Figure 1).

Figure 1. Configuration.

1. Search Environments: After WWW content is collected by a spider, language and code identification of collected documents is performed based on a statistical method considering the code value distribution in each combination of language and code system, currently within Japanese, English and Chinese. Then, indexing for each document is carried out by language, based on freeWAIS and language-dependent morphological analyses.

The server accepts keywords in English or Japanese and returns the cross-language text retrieval result sorted by identified languages and scores respectively: Japanese, English and Chinese.²

2. Browsing Support Information: In our framework, each result item contains a list of major keywords (frequent terms except predefined stop words) in the retrieved document, together with the HTML title, URL, document size and so on. This will help users to grasp the result list as a reference for actual selection among them. Our current objective is how to generate the most appropriate translation of the above keywords in the user’s own language, in order to achieve more effective browsing support for identifying the relevance of the retrieved document.
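The extraction of major keywords as frequent terms outside a stop-word list can be illustrated with a minimal sketch. The tokenizer, the stop-word list and the function name `major_keywords` are assumptions made for this illustration; the paper does not specify them.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the predefined list used in the paper is not given.
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "for", "on"}

def major_keywords(text, top_n=5, stop_words=STOP_WORDS):
    """Return the top_n most frequent terms of text that are not stop words."""
    # Naive tokenization for English; a real system would use a
    # language-dependent morphological analyser instead.
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in stop_words)
    return [term for term, _ in counts.most_common(top_n)]

doc = ("The mobile market in HongKong is growing. Competition in the mobile "
       "market is strong, and the mobile enterprise is expanding.")
print(major_keywords(doc, top_n=2))  # ['mobile', 'market']
```

Terms with equal frequency are returned in order of first appearance, which is a property of `Counter.most_common` rather than a choice described in the paper.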

Under such a scheme, the browsing support information appears as in the following Figure 2:

Search Request (User’s) Language = English
Search Target Language = Japanese and Chinese
Input Query = Asia AND telecommunication AND Market
Display of one example item from a retrieval list:

(1) Chugoku keitai denwa shijou *1
http://www.foo.or.jp/~bar/report.html *2
Japan 12Kbyte *3
(enterprise, HongKong, mobile, competition, handover) *4

*1 {document title} = China mobile phone market
*2 {resource location}
*3 {domain / document size}
*4 {Translated major keyword list} Japanese → English

Figure 2. Translated major keyword list as browsing support information.

This example shows that our aim is to provide clue information for judging whether a retrieved document is relevant or not for the user, at the result listing stage. For this purpose, improvement of the translation quality is important, particularly in the case of words with multiple senses. In the next subsection, we describe our approach to selecting a reasonable translation candidate from multiple choices.


2.2. METHOD OF SELECTING APPROPRIATE TRANSLATION FOR MAJOR KEYWORDS

2.2.1. Basic Idea

When we examine a case of translation from English into Japanese (the opposite direction from the previous example in Figure 2) with a bilingual word lexicon, an English word entity can have several translation candidates in Japanese. Therefore, the question is how to select the most appropriate combination of translation candidates for the given 6 original keywords.

Carbonell et al. (1997) showed the advantage of their example-based MT approach to learn translation candidates using a volume of aligned parallel corpus, compared with using an existing machine-readable lexicon, or other statistical methods of generalized vector space models (GVSM), latent semantic indexing (LSI) and so on.

For practical use, however, it would be rather important to consider certain approximations when sufficient aligned parallel corpus is not available. Thus, we adopted the utilization of the corpus in the language into which the keywords should be translated. This is based on the following supposition in such cases as the above situation:

− If we define the co-occurrence of two words as the simultaneous appearance of those words near in the same document,³ the two English keywords and their correctly translated Japanese keywords will have relatively similar co-occurrence distributions in both corpora of a common topical domain in the two languages.

− Even if the two English keywords have several translation candidates in Japanese respectively, the co-occurrence of the most appropriate combination of the two candidates will be dominant in the Japanese corpus.

For instance, let us consider the English word “organ”. Its possible meanings are:

organ1 = a musical instrument
organ2 = a biological part of an organism
organ3 = a group or body of an organization

. . .

Though the sense of “organ” itself is ambiguous, it will be more definite when its co-occurring words are known. Table I shows example statistics of the sense distribution of “organ” when each other specific word appears within its neighborhood in a certain corpus. The above supposition means that such a word sense distribution is probably similar to the distribution of each corresponding translation in the target language (Japanese). Therefore, we can estimate the probable translation candidate of “organ” (corresponding to one of the above meanings), considering the co-occurrence of words in the Japanese corpus. For example, in the text where the word “music” frequently appears, the word “organ” in the same text is likely to be interpreted as a musical instrument (orugan).⁴

Table I. Sense co-occurrence matrix of two words

                     Organ1      Organ2     Organ3
                     (orugan)    (zouki)    (kikan)
hospital (byouin)    2           25         10
music (ongaku)       36          1          5

The terms in parentheses (italic in the original) indicate the corresponding Japanese terms.
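As a small illustration of this reasoning, the counts in Table I can be read as a lookup table: given a context word, the most probable translation of “organ” is the sense with the largest count in that row. The sketch below only restates Table I; the names `SENSE_COUNTS` and `likely_translation` are introduced here for illustration.

```python
# Counts copied from Table I: rows are context words,
# columns are the Japanese translations of the senses of "organ".
SENSE_COUNTS = {
    "hospital": {"orugan": 2, "zouki": 25, "kikan": 10},
    "music": {"orugan": 36, "zouki": 1, "kikan": 5},
}

def likely_translation(context_word, table=SENSE_COUNTS):
    """Pick the translation that co-occurs most often with context_word."""
    row = table[context_word]
    return max(row, key=row.get)

print(likely_translation("music"))     # orugan
print(likely_translation("hospital"))  # zouki
```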

2.2.2. Actual Method

To realize the above mentioned idea, we implemented the following procedure.⁵

1. Counting overall word frequency in a collected Japanese corpus (making the unigram)⁶ based on morphological analysis using ChaSen (Matsumoto et al., 1999).

2. Computing the co-occurrence as near appearance based on our definition for word pairs which appeared in the unigram within the top 3,000 frequency (for reducing computational cost and considering the sparseness of data). The result is a matrix-like co-occurrence table (bigram).

3. Extracting major keywords (in English) from each individual HTML document to be retrieved, according to their frequency and eliminating functional (stop) words.

4. Consulting a bilingual lexicon to prepare translation candidate(s) for the above original keywords.

5. Selecting the most dominant combination of the candidate words which tend to be co-occurrent in the Japanese corpus, referring to (2) and (4) as follows:

(a) Let the translation candidates of an English keyword E1 be J11, J12, and J13.

(b) As well as E1, the other English keywords have their translation possibilities: J21, J22, . . . for E2; J31, J32, . . . for E3; . . . ; Jn1, Jn2, . . . for En.

(c) While referring to the co-occurrence table for the Japanese words, the possible co-occurrences among the binary combinations are compared. For the term 1 (E1), the possible combinations are:

J11 with any of (J21, J22, . . . , J31, J32, . . . , Jn1, Jn2, . . . )
J12 with any of (J21, J22, . . . , J31, J32, . . . , Jn1, Jn2, . . . )
J13 with any of (J21, J22, . . . , J31, J32, . . . , Jn1, Jn2, . . . )

After these comparisons, the preferred selection of the translation of E1 is determined as the Japanese candidate word J1i giving the most frequent co-occurrence.

(d) Then, as the next step, selection among J21, J22, . . . for E2 is performed in the same way, except using the fixed J1i which was previously determined.

(e) In such a manner, translations (candidate selections) of the remaining English keywords are assigned in order.
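Steps 1 and 2 of the procedure (unigram counting and the co-occurrence table restricted to frequent words) might be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the input is assumed to be already tokenized (the paper uses ChaSen for this), and the `window` size standing in for "near appearance" is a hypothetical parameter, since the exact definition is given only in a footnote not reproduced here.

```python
from collections import Counter

def build_cooccurrence(docs, vocab_size=3000, window=5):
    """docs: list of token lists, one per document (already analysed).

    Returns (unigram, cooc), where cooc counts unordered word pairs that
    appear within `window` tokens of each other in the same document.
    """
    # Step 1: overall word frequency (the unigram).
    unigram = Counter(tok for doc in docs for tok in doc)
    # Step 2: keep only the most frequent words to limit cost and sparseness.
    vocab = {w for w, _ in unigram.most_common(vocab_size)}
    cooc = Counter()
    for doc in docs:
        for i, w in enumerate(doc):
            if w not in vocab:
                continue
            # Look ahead only, so each unordered pair is counted once per position.
            for v in doc[i + 1 : i + 1 + window]:
                if v in vocab and v != w:
                    cooc[frozenset((w, v))] += 1
    return unigram, cooc

docs = [["ongaku", "orugan", "ensou"],
        ["byouin", "zouki", "ishoku"],
        ["ongaku", "orugan"]]
unigram, cooc = build_cooccurrence(docs)
print(cooc[frozenset(("ongaku", "orugan"))])  # 2
```

Keying the table by `frozenset` makes the pair unordered, which matches the symmetric notion of co-occurrence used in the text.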

This procedure yields an approximation for selecting the most appropriate combination of translated keywords, giving priority to the frequent occurrence of two words. Recent research (Grefenstette, 1999) supports the hypothesis that frequent co-occurrence of candidate translations does indeed provide correct translations.
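One possible reading of sub-steps (a) to (e) is a greedy pass over the keywords, sketched below. The function names and the data layout (a dictionary keyed by unordered word pairs) are assumptions for this illustration, and the toy counts are borrowed from Table I rather than from the authors' corpus. Each keyword is assumed to have at least one candidate.

```python
def cooc_count(cooc, a, b):
    """Co-occurrence frequency of the unordered pair (a, b); 0 if unseen."""
    return cooc.get(frozenset((a, b)), 0)

def select_translations(candidates, cooc):
    """candidates: list of candidate lists, one per source keyword E1..En.

    Returns one chosen candidate per keyword, decided in order.
    """
    chosen = []
    for k, options in enumerate(candidates):
        # Partners: the fixed choices for earlier keywords (step d) plus
        # every candidate of the keywords not yet decided (step c).
        partners = list(chosen)
        for later in candidates[k + 1:]:
            partners.extend(later)

        def best_pair(option):
            return max((cooc_count(cooc, option, p) for p in partners), default=0)

        # max() keeps the first option on ties, i.e. the lexicon order.
        chosen.append(max(options, key=best_pair))
    return chosen

# Toy co-occurrence counts (hypothetical, modelled on Table I).
cooc = {
    frozenset(("orugan", "ongaku")): 36,
    frozenset(("zouki", "ongaku")): 1,
    frozenset(("kikan", "ongaku")): 5,
}
print(select_translations([["kikan", "zouki", "orugan"], ["ongaku"]], cooc))
# ['orugan', 'ongaku']
```

Because earlier choices are fixed before later ones are made, the result depends on keyword order; this is the approximation the text refers to, as opposed to scoring every full combination.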

3. Evaluation of Browsing Support Effectiveness

In the previous section, we described our motivation for providing browsing support information as translated major keywords in each retrieved document. Moreover, we showed a method of achieving higher quality of translation using the co-occurrence of two words in the translation target language. The effectiveness of a similar method has been reported by Kikui (1998), as far as the degree of improvement in translation quality is concerned. However, as we mentioned before, its contribution to users’ document selection has not yet been discussed.

Thus, in order to evaluate the effectiveness due to the above proposed mechanism of selecting appropriate translation of keywords, we carried out two kinds of experiments with human evaluation. First, we examined the effectiveness of referring to keyword lists for selecting appropriate content among the retrieved results, under different conditions (see Section 3.1). Next, we made an attempt to compare the above support method with indicating translated summary of contents as another potential means of browsing support (see Section 3.2).

3.1. EVALUATION EXPERIMENT 1

3.1.1. Experimental Procedure 1

We designed the following evaluation task to compare the different ways of displaying keywords in a search result list, when a user tries to find a document in a foreign language which is relevant to a certain topic theme shown in the user’s query language.

− Human subjects: 64 judges (31 males and 33 females) ranging in age between 18 and 30 (mostly students of Japanese universities); their native language is Japanese.

− Target text of retrieval: A set of newspaper articles on economic issues in China were prepared as 224 parallel texts in both English and Japanese.⁷ In our experimental supposition, the retrieval target is English text, and 10 articles were indicated as a set of a retrieval result list for each search topic that was given to the judges in Japanese.⁸ The prepared 10 results were shuffled and only one article matching with the given search topic was hidden among them.

Table II. Task topics

1   Current situation of retail business in China
2   New idea to reform possession of dwelling by GuangDong Province
3   Construction of infrastructure expects cooperation with foreign firms
4   The ministry of foreign economy and trade has suggested that the import of strategic industry technology will be “taken care of”
5   Asia: Consumption of rolled steels rises and production expands
6   Cereal imports obviously declined last year in China
7   To change to the international order of cement in our country
8   Great motorcycle market potential in the country

− Task of the subjects: After instruction, a search topic list (with 8 topics as shown in Table II) is given in Japanese to the judges. A judge selects, one by one, the topics to perform each evaluation task. In one evaluation session, according to a given topic theme, the retrieval result is shown as a table of 10 local URLs and major keyword lists corresponding to each text, in a browser window without scrolling. The task of the judge is to decide the topically relevant article numbers in the given list by referring to the indicated keyword lists. The way of indicating keywords is varied according to the following conditions. A judge is requested to select at least one candidate which seems to be the most appropriate, and can choose the second and the third best items.

− Comparison conditions: The major (frequent) keywords are indicated according to one of the types in Table III. Each judge performed all the types of keyword indication, A to C, in scrambled order (varying from person to person). Moreover, the number of displayed keywords was controlled for each topic to one of 3, 6, 9 or 12.⁹ For example, in the case of 6 English keywords (type A), the following formatted list is indicated for each retrieved document:

(E1, E2, E3, E4, E5, E6),

while 3 Japanese keywords (type B or C) are indicated as

(J1, J2, J3); each Ji is a translation of Ei.

The measure of effectiveness of a certain given keyword list as a clue for appropriate document selection is defined as the correct decision rate. If a judge’s choice (at most 3 items) includes the correct document number (only one for each topic), one point is given. Therefore, the correct decision rate ranges from 0/8 (minimum) to 8/8 (maximum) for each judge.
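The scoring rule can be made concrete with a short sketch. The function name and the data below are hypothetical and serve only to illustrate the definition: one point per topic when the correct document is among at most three chosen items.

```python
def correct_decision_rate(choices, answers):
    """choices: topic -> list of up to three selected document numbers.
    answers: topic -> the single correct document number for that topic.
    """
    points = sum(
        1
        for topic, correct in answers.items()
        if correct in choices.get(topic, [])[:3]  # at most 3 items count
    )
    return points / len(answers)

# Hypothetical judge: 8 topics, correct document found for 6 of them.
answers = {1: 4, 2: 7, 3: 2, 4: 9, 5: 1, 6: 10, 7: 3, 8: 6}
choices = {1: [4], 2: [1, 7], 3: [5, 6, 8], 4: [9, 3],
           5: [1], 6: [2, 10, 5], 7: [8], 8: [6, 4]}
print(correct_decision_rate(choices, answers))  # 0.75
```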


Table III. Keyword list types

(A) English (original language of the text)
(B) Japanese translation without considering the word co-occurrence (as presented at the first order in the dictionary)
(C) Japanese translation with consideration of the word co-occurrence (using the proposed method in 2.2.2)

Table IV. Overall experimental result

Condition    Correct decision    Time duration
A            67.2 %∗             145 sec
B            69.5 %∗             104 sec
C            75.8 %∗             105 sec

∗The correct decision rates are significantly different (p < 0.01).

3.1.2. Evaluation Result 1

Table IV shows the average percentage of correct decisions by the judges for the given topic and result list under each of the above comparison conditions (a total of 128 trials for each), with the average time duration for each session. This result suggests the following.

1. Effectiveness of browsing support by translated major keywords: The difference of correct decision rate among the three conditions is considered to reflect how helpful the indicated keyword list was to the judges in selecting the candidates. In particular, the observation that condition C is superior to condition B shows the advantage of our method using word co-occurrences in the language generating the translation. The generated Japanese keywords in condition C differed from those in condition B by 40.3 % (including synonyms) on average.¹⁰

As for the time consumed for identifying topically relevant article numbers for the given topic, it is about 70 % in the case of the subject’s native language (Japanese, conditions B and C) compared with the original language (English, condition A) on average.¹¹ This is regarded as the difference in efficiency of visual input volume of information to the judges between the two languages.

2. Comparison of correct decision rate according to topics and numbers of displayed keywords: The detailed results showed that the 8 topic themes can be divided into two groups: 4 higher and 4 lower correct decision scored topics (see Table V). Figure 3 illustrates these two groups and their average score. This result indicates that the effectiveness of browsing support by translated keyword list was prominent in the lower scored topics group.


Table V. Experimental results sorted by task topics

Topic number        Correct decision rate
                    Whole     A        B        C
3                   50 %
1                   53 %
6                   56 %
2                   64 %
(lower group)                 48 %     53 %     62 %
5                   81 %
7                   81 %
4                   87 %
8                   89 %
(higher group)                85 %     85 %     89 %

Figure 3. Two groups with higher and lower scored topics.

On the other hand, Table VI suggests that there was no definite correlation between the numbers of displayed keywords and the correct decision rates. The relationship between the number of displayed keywords and the time duration is also shown in Table VI; it seems that the judges spent less judging time in the conditions of fewer keywords. Moreover, average time durations for the 2 (lower and higher scored) topic groups were 116 and 94 seconds respectively.


Table VI. Experimental result classified with numbers of displayed keywords

Number of keywords      3         6          9          12
Lower scored topics     58 %      38 %       63 %       60 %
Higher scored topics    94 %      85 %       85 %       83 %
Time duration           77 sec    109 sec    112 sec    123 sec

Table VII. Experimental result classified by subjects’ English ability

Category (subjects)    Low (24)    Middle (30)    High (10)
Correct decision       65 %        73 %           69 %

Note: Classification categories for English test score: Low: less than 50, Middle: 50 ∼ 69, High: 70 or over. Scores are between 0 and 100.

Table VIII. Experimental result classified by familiarity with information systems

Category (subjects)    1 (17)    2 (17)    3 (15)    4 (15)
Correct decision       68 %      71 %      67 %      75 %

Note: Classification categories for familiarity with information systems: 1: not familiar, 2: a little experience, 3: considerable experience, 4: expert.

3. Additional observation: We investigated the subjects’ English language ability in a 10-minute paper test (designed for measuring personal attainments in English; error correction for given text, selection of appropriate expression for given situations, etc.) and familiarity with information systems (questionnaire). The classified experimental results are shown in Table VII and Table VIII respectively. Though we could not find a significant correlation between those parameters and scores, it was observed that some subjects with higher familiarity with information retrieval through the Internet (in the category 4) showed higher decision scores (Table VIII).

3.2. EVALUATION EXPERIMENT 2

Based on the results from Experiment 1, we designed another experiment. In this experiment, human subjects performed the same task, referring to a translated content summary for each result document, instead of referring to a major keyword list. This experiment was carried out as a paper-based writing task, unlike Experiment 1. The reason is that the texts of summarized contents are larger than those of keyword lists and it is impossible to display them in a window without scrolling, with readable fonts. The other aspects are the same as those of the former experiment.

Table IX. Summary types

(S1) First 2 lines (75 characters) from manually translated (Japanese) article.
(S2) Text extracted by a summarizing software M from the manually translated (Japanese) article.
(S3) Corresponding Japanese (partial text) for text extracted by a summarizing software P from the English article.
(S4) Extracted and translated by a “summarizing and translation” software Q

Note: Above M is a part of a major word processing software, and P was possible to use as an online service. Q is a popular translation software product for personal computers.

Table X. Experimental results with condensed proportion to the original text

Summary type    Average text amount (proportion)    Correct decision
S1              71 characters (9.4%)                78.3%
S2              88 characters (10.6%)               77.5%
S3              175 characters (23.3%)              72.5%
S4              227 characters (30.2%)              50.8%
Original        752 characters (100.0%)

3.2.1. Experimental Procedure 2

− Human subjects: 60 Japanese students (27 judges and 33 judges from two different universities) participated in the experiments held on their campuses.

− Comparison conditions: The summary of each content was given as one of the summary types in Table IX. We note that S1 ∼ S3 were given as a partial text of manually translated documents from an original text in English, while only S4 was generated by machine translation from a summarized original text.

3.2.2. Evaluation Result 2

Table X shows the condensed rate of each type of summary S1 ∼ S4, and the correct decision rate.¹² In this result, S1 ∼ S3 gave relatively higher decision rates with little difference, while S4 gave a much lower rate. The most crucial reason for the result seems to be that only S4 was generated using machine translation. Furthermore, the appearance of text and experimental result were similar between S1 and S2, where S1 was simply generated but often reasonable for such news articles, while S2 was output by a summarizing software M (a method based on important sentence extraction seems to be adopted).

Additionally, the average (overall) time duration for one decision task was 194 seconds, and this is about 1.8 times longer than that in Experiment 1 (conditions B and C). Though the experimental conditions of the two experiments were not strictly the same (WWW browser vs. paper), it may be caused by the difference of information content between the keyword lists and the summary indication. According to a questionnaire after the experiment, most of the subjects preferred the text length used in conditions S1 and S2 as the text retrieval result indication among S1 ∼ S4.

3.3. COMPARISON OF EVALUATION RESULTS 1 AND 2

1. Comparison between indicating conditions: Through the two evaluation experiments, we could compare the effectiveness of the different ways of supporting document selection for CLIR. Apart from the display of a text summary generated by machine translation, the two frameworks (keyword list vs. summary) indicated near levels of effectiveness on document selection from given retrieval results, referring to Table IV and Table X. We observed that even the translated major keywords list could be helpful for the purpose of “sifting” task in the text search domain. On the other hand, the (translated) summary could show only the same or slightly better effectiveness on the same task, though much broader information was given in a naive sense. Moreover, the translation quality was very crucial for accurate judgment.

2. Comparison of score distribution with individual task topics: As mentioned in Evaluation result 1 (Figure 3), the 8 task topics were divided into two groups: the higher (87 %) scored 4 topics and the lower (57 %) scored 4 topics, while such a partial distribution was not seen in Evaluation result 2. However, one task topic showed an exceptionally low decision rate (25 % in Topic 3, which also indicated the worst score in Experiment 1). The reason seems to be that the prominent keyword “infrastructure” in the task topic did not appear in each summary text, though its instances were actually described in the text; e.g. harbors, railway, power plant, etc.¹³

Related issues are also examined in the following sections.

4. Discussion

In our experiments, several human-related factors seemed to have influenced the results. In this section, we will discuss the following issues, as observed in this study.


− Quantity and Quality of Information: Compared with the keyword list, the text summary has much richer information. However, the two results showed only slight differences between their supporting effectiveness for document selection. This means that such a term list provides sufficient indicative information for a document selection, which the judges performed in Experiment 1. Moreover, our method of improving translation quality considering word co-occurrences enabled a more precise relevance decision. Furthermore, the observation that the number of displayed keywords has no definite correlation with the correct document selection (Table VI) reveals that increase of information does not always yield a more correct decision. It also holds in the case of providing a text summary in Experiment 2, where machine translation caused serious deterioration of document selection.

− Difficulty of Tasks: In Experiment 1, as shown in Table V, the subjects’ correct decision (document selection) rate varied strongly according to the topic. The reason seems to be that the discriminative strength of the keyword lists was largely different among the document sets. In other words, even simply extracted frequent keywords in a certain document set appear to be informative rather than indicative, while such a type of information is not sufficiently indicative in another document set. This may be due to the limitation of our current approach to extract frequent keywords within a document (see also the next section: 1. Extracting the important part of the text). However, it is still hard to predict which document set (as a retrieval result) causes difficulty in finding a suitable selection by human subjects.

− Individual Difference: The texts used for the experiments are newspaper articles on economic issues and are considered to be unfamiliar to most of the student judges. Therefore, they seemed to have little voluntary motivation of inquiry.¹⁴ In that sense, the given task situation was almost equal across the subjects. On the other hand, the additional observation in evaluation result 1 (Table VII and VIII) shows that (passive) English ability of the subjects has no definite relationship with the relevance decision task, while familiarity with information systems may have a certain influence.

5. Further Investigation

As we mentioned in the Introduction, our approach is also regarded as the first step towards much more intelligent cross-language information navigation. For the purpose of enhancing our current efforts, we still have to investigate the following subthemes:

1. Extracting the important part of the text: In our current (tentative) framework, only the term frequency (tf) is used for extracting major keywords from the content, because it can be easily prepared at individual indexing. We could use idf (inverse document frequency) after the retrieval at a certain computational cost, or utilize some structural information of the text (title, paragraph, sentence, and various structural markers in the case of certain restricted domains); e.g. network news digesting by Sato (1995). Furthermore, passage retrieval techniques are promising for browsing support methods as indicating the most relevant part of each document according to the request. Its recent state was reported by Mochizuki et al. (1999). Another attractive direction of interface for cross-language retrieval with summarization (sharing our motivation) has been demonstrated by Ogden et al. (1999).

2. Achieving an appropriate translation for the above extracted phrases: Our proposed method of selecting translation candidates using word co-occurrence showed an advantage that bilingual parallel corpora are not necessary. However, it might be better to prepare at least a comparable bilingual corpus to extract more precise translation, as the experimental result by Carbonell et al. (1997) suggested. From this viewpoint, one solution was proposed for Japanese-English cross-language text retrieval, where original (query) terms are transferred into those in the target language using co-occurrence frequency in a comparable corpus (Okumura et al., 1998; Ballesteros and Croft, 1997). Moreover, we could involve other kinds of techniques: e.g. knowledge-based processing of word-sense disambiguation, though it depends on the efficiency of describing or extracting such kinds of knowledge bases. A related study using a large-scale database of multilingual lexical entries is shown in Dorr and Oard (1998).

3. Customizing the browsing support information: If the system previously recognizes user-dependent parameters like language capacity, scope of interest, retrieval history and so on, we could provide various adaptive ways of indicating browsing support information. We suppose that such an adaptation technique for rendering information retrieval result will be much more important, because providing such a user-oriented customization should be included in the information retrieval task.

4. Displaying the cross-language keyword tracking: So far, we have reported our approach to cross-language information retrieval support for finding certain relevant documents in large archives. We noticed that cross-language search for individual documents is not always what users need; they often would like to obtain macroscopic trends of certain topics in each genre (politics, economy, society, etc.) in the target foreign countries. This triggered our attempt to analyze keyword distribution in a cross-language way. The concept is to extract significant trend information from a certain volume of documents, based on statistical calculation over the texts which contain given topic keywords.

For instance, we could show the chronological distribution (weekly or monthly) of keyword(s) for a selected target document set such as newspaper archives, with user interfaces for translating input keywords and displaying visual graphs.


Even such a simple mechanism often provides meaningful trend information for foreign topics. Similar studies for monolingual information visualization are found in the field of text mining, e.g. “Information Outlining” by Takeda and Nomiyama (1998). Our scheme may be regarded as its cross-language version.
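The chronological keyword distribution described in item 4 can be illustrated with a minimal sketch. It is not the system used in this study: the function name `monthly_keyword_counts`, the `(date, text)` archive format, and the sample articles are all hypothetical, and the keywords are assumed to have been translated into the target language beforehand. Plain substring matching is used for simplicity; languages written without word boundaries would need morphological analysis first.

```python
from collections import Counter
from datetime import date


def monthly_keyword_counts(documents, keywords):
    """Count, per month, the documents containing at least one keyword.

    documents: iterable of (date, text) pairs (hypothetical archive format).
    keywords: target-language keywords, assumed to be already translated.
    """
    lowered = [k.lower() for k in keywords]
    counts = Counter()
    for published, text in documents:
        body = text.lower()
        # Simple substring test; a real system would match analyzed tokens.
        if any(k in body for k in lowered):
            counts[(published.year, published.month)] += 1
    return dict(sorted(counts.items()))


# Invented sample archive, for illustration only.
archive = [
    (date(1999, 1, 5), "Election results announced"),
    (date(1999, 1, 20), "New election law debated"),
    (date(1999, 2, 3), "Stock market falls"),
    (date(1999, 3, 9), "Election turnout analysis"),
]

trend = monthly_keyword_counts(archive, ["election"])
for (year, month), n in trend.items():
    # A text bar stands in for the visual graph mentioned in item 4.
    print(f"{year}-{month:02d} {'#' * n}")
```

Months without any matching document are simply absent from the result; a display component could fill them with zero to keep the time axis continuous.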

6. Conclusion

In this paper, new techniques for browsing support in cross-language information retrieval were introduced. The proposed method provides useful information for document selection by displaying translated major keywords of the retrieved contents in the user’s language. The effectiveness of this browsing support technique was confirmed in evaluation experiments which compared various conditions displaying a major keyword list or a translated summary of the contents. In conclusion, translating keywords based on word co-occurrence distribution in the target language seems to be a reasonable current solution for creating effective clues for document selection, because its helpfulness was only slightly lower than that of an elaborate translation of summarized text.

Acknowledgements

The authors would like to thank the reviewers for many useful suggestions and the latest references.

Notes

1. 2 ∼ 3 line text indicating vendor name, address and service outline.
2. The retrieval technique using a multilingual lexicon is not the point of discussion in this paper. A related reference is Suzuki et al. (1998).
3. Hereafter, the definition of co-occurrence is as follows: if a document simultaneously contains words Wi and Wj which occur m and n times respectively, the co-occurrence of Wi and Wj is the lesser number of either m or n. Another restriction is that the distance of two words is less than 100 words. This value of co-occurrence is accumulated through all the documents in a learning set.
4. This example was simplified for explanation.
5. We note that a similar method based on a word co-occurrence vector model was independently proposed by Kikui (1998).
6. About 7,000 Web pages from various Asian information guide sites (including news pages) were used.
7. The original articles are in Chinese and they were translated into English and Japanese respectively by a human translator, maintaining the content equality.
8. Each result was created as a document set of relatively similar word distribution by a document clustering method (Aoki et al., 1998).
9. Each judge was assigned to one of 8 different patterns, avoiding factors on trial orders.
10. Under such circumstances using a lexicon without tuning, we cannot estimate that most of the changed keywords considering co-occurrences are better translations.
11. The time duration was much dependent on individual persons.
12. The two subject groups (universities) indicated very similar tendencies according to the 4 different conditions. It suggests that our experiments are sufficiently reproducible.
13. It seems that the word “infrastructure” was not familiar to most of the subjects (university students), or they would perform only surface looking up in the given task.
14. On the contrary, if we prepared more attractive materials for young students, like music, sports, etc., their knowledge and motivation might be rather individually different.
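The co-occurrence value defined in note 3 can be sketched as follows. This is one possible reading, not the authors' implementation: it assumes that a pair counts in a document when at least one occurrence of Wi lies fewer than 100 words from an occurrence of Wj, and that the document then contributes min(m, n). The function names, the tokenized-list input format, and the sample learning set are hypothetical.

```python
from collections import Counter
from itertools import combinations

MAX_DISTANCE = 100  # note 3: the two words must be less than 100 words apart


def within_distance(positions_a, positions_b, limit=MAX_DISTANCE):
    """True if some occurrence pair is fewer than `limit` words apart (assumed reading)."""
    return any(abs(a - b) < limit for a in positions_a for b in positions_b)


def accumulate_cooccurrence(documents):
    """Sum min(m, n) per word pair over all documents in a learning set.

    documents: iterable of token lists (hypothetical, already segmented).
    Returns a Counter keyed by alphabetically ordered word pairs.
    """
    totals = Counter()
    for tokens in documents:
        positions = {}
        for index, word in enumerate(tokens):
            positions.setdefault(word, []).append(index)
        for wi, wj in combinations(sorted(positions), 2):
            if within_distance(positions[wi], positions[wj]):
                # m and n are the occurrence counts of each word in this document.
                totals[(wi, wj)] += min(len(positions[wi]), len(positions[wj]))
    return totals


# Invented learning set, for illustration only.
learning_set = [
    ["bank", "loan", "bank", "interest", "loan"],
    ["bank", "river"],
    ["bank"] + ["filler"] * 120 + ["river"],  # "bank" and "river" are too far apart
]
totals = accumulate_cooccurrence(learning_set)
print(totals[("bank", "loan")], totals[("bank", "river")])
```

The pairwise loop is quadratic in the vocabulary of each document, which is acceptable for a sketch but would need an inverted index or windowed scan for a learning set of several thousand pages.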

References

Aoki, K., K. Matsumoto, K. Hoashi and K. Hashimoto. “A Study of Bayesian Clustering of a Document Set Based on GA”. Proceedings of The Second Asia-Pacific Conference on Simulated Evolution And Learning (SEAL98), 1998.

Ballesteros, L. and W. B. Croft. “Statistical Method for Cross-Language Information Retrieval”. In Cross-Language Information Retrieval. Ed. G. Grefenstette, Kluwer Academic Publishers, 1998.

Carbonell, J. G., Y. Yang, R. E. Frederking, R. D. Brown, Y. Geng and D. Lee. “Translingual Information Retrieval: A Comparative Evaluation”. Proceedings of International Joint Conference on Artificial Intelligence (IJCAI’97), 1997, pp. 708–715.

Davis, M. W. and W. C. Ogden. “Implementing Cross-Language Text Retrieval Systems for Large-scale Text Collections and the World Wide Web”. AAAI Spring Symposium on Cross-Language Text and Speech Retrieval Electronic Working Notes, 1997.

Dorr, B. J. and D. W. Oard. “Evaluating Resources for Query Translation in Cross-Language Information Retrieval”. Proceedings of the First International Conference on Language Resource Evaluation (LREC), Granada, Spain, 1998.

Grefenstette, G. “The World-Wide-Web as a Resource for Example-Based Machine Translation”. Proceedings of ASLIB ’99 Translating and the Computer 21, 1999.

Kikui, G., S. Suzaki, Y. Hayashi and R. Sunaba. “Cross-lingual Internet Navigation System: TITAN”. Proceedings of Symposium on Application of Natural Language Processing ’95, Information Processing Society of Japan, 1995, pp. 97–105.

Kikui, G. “Term-list Translation using Mono-lingual Word Co-occurrence Vectors”. Proceedings of COLING-ACL ’98, 1998, pp. 670–674.

Matsumoto, Y., A. Kitauchi, T. Yamashita and Y. Hirano. “Japanese Morphological Analyzer, ChaSen 2.0 Users Manual”. NAIST-IS-TR99009, Nara Institute of Science and Technology (NAIST), 1999.

Mochizuki, H., M. Iwayama and M. Okumura. “Passage-Level Document Retrieval Using Lexical Chains”. Journal of Natural Language Processing, 6(3) (1999), 101–126.

Oard, D. W. “Cross-Language Information Retrieval Resources”. http://www.clis.umd.edu/dlrg/clir/, 1999.

Ogden, W., J. Cowie, M. Davis, E. Ludovik, H. Molina-Salgado and H. Shin. “Getting Information from Documents You Cannot Read: An Interactive Cross-Language Text Retrieval and Summarization System”. Joint ACM Digital Library/SIGIR Workshop on Multilingual Information Discovery and AccesS (MIDAS) Electronic Working Notes, 1999.

Okumura, A., K. Ishikawa and K. Satoh. “GDMAX Query Translation Model for Cross-Language Information Retrieval”. Proceedings of Information Processing Society of Japan (IPSJ) 1998 Spring Meeting, Vol. 3, 1998, pp. 138–139.

Resnik, P. “Evaluating Multilingual Gisting of Web Pages”. AAAI Spring Symposium on Cross-Language Text and Speech Retrieval Electronic Working Notes, 1997.

Sato, S. “Automatic Digesting of the NetNews”. Proceedings of Symposium on Application of Natural Language Processing ’95, IPSJ, 1995, pp. 81–88.


Suzuki, M. and K. Hashimoto. “Enhancing Source Text for WWW Distribution”. Proceedings of Workshop on Information Retrieval with Oriental Languages (IROL-96), 1996, pp. 51–56.

Suzuki, M., N. Inoue and K. Hashimoto. “Effect on Displaying Translated Major Keyword of Contents as Browsing Support in Cross-Language Information Retrieval”. Technical Report of IEICE (Institute of Electronics, Information and Communication Engineers), NLC98-20, 1998, pp. 37–44.

Takeda, K. and H. Nomiyama. “Site Outlining”. Proceedings of ACM Digital Libraries 98 (DL’98), 1998, pp. 309–310.