efficient relevance feedback for content-based image retrieval by mining user navigation patterns

13
Efficient Relevance Feedback for Content-Based Image Retrieval by Mining User Navigation Patterns Ja-Hwung Su, Wei-Jyun Huang, Philip S. Yu, Fellow, IEEE, and Vincent S. Tseng, Member, IEEE Abstract—Nowadays, content-based image retrieval (CBIR) is the mainstay of image retrieval systems. To be more profitable, relevance feedback techniques were incorporated into CBIR such that more precise results can be obtained by taking user’s feedbacks into account. However, existing relevance feedback-based CBIR methods usually request a number of iterative feedbacks to produce refined search results, especially in a large-scale image database. This is impractical and inefficient in real applications. In this paper, we propose a novel method, Navigation-Pattern-based Relevance Feedback (NPRF), to achieve the high efficiency and effectiveness of CBIR in coping with the large-scale image data. In terms of efficiency, the iterations of feedback are reduced substantially by using the navigation patterns discovered from the user query log. In terms of effectiveness, our proposed search algorithm NPRFSearch makes use of the discovered navigation patterns and three kinds of query refinement strategies, Query Point Movement (QPM), Query Reweighting (QR), and Query Expansion (QEX), to converge the search space toward the user’s intention effectively. By using NPRF method, high quality of image retrieval on RF can be achieved in a small number of feedbacks. The experimental results reveal that NPRF outperforms other existing methods significantly in terms of precision, coverage, and number of feedbacks. Index Terms—Content-based image retrieval, relevance feedback, query point movement, query expansion, navigation pattern mining. Ç 1 INTRODUCTION M ULTIMEDIA contents are growing explosively and the need for multimedia retrieval is occurring more and more frequently in our daily life. Due to the complexity of multimedia contents, image understanding is a difficult but interesting issue in this field. Extracting valuable knowl- edge from a large-scale multimedia repository, so-called multimedia mining, has been recently studied by some researchers. Typically, in the development of an image requisition system, semantic image retrieval relies heavily on the related captions, e.g., file-names, categories, anno- tated keywords, and other manual descriptions [19], [20]. Unfortunately, this kind of textual-based image retrieval always suffers from two problems: high-priced manual annotation and inappropriate automated annotation. On one hand, high-priced manual annotation cost is prohibitive in coping with a large-scale data set. On the other hand, inappropriate automated annotation yields the distorted results for semantic image retrieval. As a result, a number of powerful image retrieval algorithms have been proposed to deal with such problems over the past few years. Content-Based Image Retrieval (CBIR) is the mainstay of current image retrieval systems. In general, the purpose of CBIR is to present an image conceptually, with a set of low-level visual features such as color, texture, and shape. These conventional approaches for image retrieval are based on the computation of the similarity between the user’s query and images via a query by example (QBE) system [21]. Despite the power of the search strategies, it is very difficult to optimize the retrieval quality of CBIR within only one query process. The hidden problem is that the extracted visual features are too diverse to capture the concept of the user’s query. To solve such problems, in the QBE system, the users can pick up some preferred images to refine the image explorations iteratively. The feedback procedure, called Relevance Feedback (RF), repeats until the user is satisfied with the retrieval results. Although a number of RF studies [1], [11], [12], [16] have been made on interactive CBIR, they still incur some common problems, namely redundant browsing and explora- tion convergence. First, in terms of redundant browsing, most existing RF methods focus on how to earn the user’s satisfaction in one query process. That is, existing methods refine the query again and again by analyzing the specific relevant images picked up by the users. Especially for the compound and complex images, the users might go through a long series of feedbacks to obtain the desired 360 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 23, NO. 3, MARCH 2011 . J.-H. Su, W.-J. Huang, and V.S. Tseng are with the Department of Computer Science and Information Engineering, National Cheng Kung University, No. 1, Ta-Hsueh Road, Tainan, Taiwan, R.O.C. E-mail: [email protected], [email protected], [email protected]. . P.S. Yu is with the Department of Computer Science (MC 152), University of Illinois at Chicago, 851 South Morgan Street, Chicago, IL 60607-7053. E-mail: [email protected]. Manuscript received 17 Feb. 2009; revised 26 Aug. 2009; accepted 7 Nov. 2009; published online 27 July 2010. Recommended for acceptance by X. Zhou. For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number TKDE-2009-02-0075. Digital Object Identifier no. 10.1109/TKDE.2010.124. 1041-4347/11/$26.00 ß 2011 IEEE Published by the IEEE Computer Society

Upload: vincent-s

Post on 18-Dec-2016

212 views

Category:

Documents


0 download

TRANSCRIPT

Efficient Relevance Feedbackfor Content-Based Image Retrievalby Mining User Navigation Patterns

Ja-Hwung Su, Wei-Jyun Huang, Philip S. Yu, Fellow, IEEE, and

Vincent S. Tseng, Member, IEEE

Abstract—Nowadays, content-based image retrieval (CBIR) is the mainstay of image retrieval systems. To be more profitable,

relevance feedback techniques were incorporated into CBIR such that more precise results can be obtained by taking user’s

feedbacks into account. However, existing relevance feedback-based CBIR methods usually request a number of iterative feedbacks

to produce refined search results, especially in a large-scale image database. This is impractical and inefficient in real applications. In

this paper, we propose a novel method, Navigation-Pattern-based Relevance Feedback (NPRF), to achieve the high efficiency and

effectiveness of CBIR in coping with the large-scale image data. In terms of efficiency, the iterations of feedback are reduced

substantially by using the navigation patterns discovered from the user query log. In terms of effectiveness, our proposed search

algorithm NPRFSearch makes use of the discovered navigation patterns and three kinds of query refinement strategies, Query Point

Movement (QPM), Query Reweighting (QR), and Query Expansion (QEX), to converge the search space toward the user’s intention

effectively. By using NPRF method, high quality of image retrieval on RF can be achieved in a small number of feedbacks. The

experimental results reveal that NPRF outperforms other existing methods significantly in terms of precision, coverage, and number of

feedbacks.

Index Terms—Content-based image retrieval, relevance feedback, query point movement, query expansion, navigation pattern

mining.

Ç

1 INTRODUCTION

MULTIMEDIA contents are growing explosively and theneed for multimedia retrieval is occurring more and

more frequently in our daily life. Due to the complexity ofmultimedia contents, image understanding is a difficult butinteresting issue in this field. Extracting valuable knowl-edge from a large-scale multimedia repository, so-calledmultimedia mining, has been recently studied by someresearchers. Typically, in the development of an imagerequisition system, semantic image retrieval relies heavilyon the related captions, e.g., file-names, categories, anno-tated keywords, and other manual descriptions [19], [20].Unfortunately, this kind of textual-based image retrievalalways suffers from two problems: high-priced manualannotation and inappropriate automated annotation. On onehand, high-priced manual annotation cost is prohibitive incoping with a large-scale data set. On the other hand,

inappropriate automated annotation yields the distortedresults for semantic image retrieval.

As a result, a number of powerful image retrievalalgorithms have been proposed to deal with such problemsover the past few years. Content-Based Image Retrieval (CBIR)is the mainstay of current image retrieval systems. Ingeneral, the purpose of CBIR is to present an imageconceptually, with a set of low-level visual features such ascolor, texture, and shape. These conventional approaches forimage retrieval are based on the computation of thesimilarity between the user’s query and images via a queryby example (QBE) system [21]. Despite the power of thesearch strategies, it is very difficult to optimize the retrievalquality of CBIR within only one query process. The hiddenproblem is that the extracted visual features are too diverseto capture the concept of the user’s query. To solve suchproblems, in the QBE system, the users can pick up somepreferred images to refine the image explorations iteratively.The feedback procedure, called Relevance Feedback (RF),repeats until the user is satisfied with the retrieval results.

Although a number of RF studies [1], [11], [12], [16] havebeen made on interactive CBIR, they still incur somecommon problems, namely redundant browsing and explora-tion convergence. First, in terms of redundant browsing, mostexisting RF methods focus on how to earn the user’ssatisfaction in one query process. That is, existing methodsrefine the query again and again by analyzing the specificrelevant images picked up by the users. Especially for thecompound and complex images, the users might gothrough a long series of feedbacks to obtain the desired

360 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 23, NO. 3, MARCH 2011

. J.-H. Su, W.-J. Huang, and V.S. Tseng are with the Department ofComputer Science and Information Engineering, National Cheng KungUniversity, No. 1, Ta-Hsueh Road, Tainan, Taiwan, R.O.C.E-mail: [email protected], [email protected],[email protected].

. P.S. Yu is with the Department of Computer Science (MC 152), Universityof Illinois at Chicago, 851 South Morgan Street, Chicago, IL 60607-7053.E-mail: [email protected].

Manuscript received 17 Feb. 2009; revised 26 Aug. 2009; accepted 7 Nov.2009; published online 27 July 2010.Recommended for acceptance by X. Zhou.For information on obtaining reprints of this article, please send e-mail to:[email protected], and reference IEEECS Log Number TKDE-2009-02-0075.Digital Object Identifier no. 10.1109/TKDE.2010.124.

1041-4347/11/$26.00 � 2011 IEEE Published by the IEEE Computer Society

images using current RF approaches. In fact, it is notpractical in real applications like online image retrieval in alarge-scale image database. Second, Fig. 1 illustrates theproblem of exploration convergence. In Fig. 1, suppose thattwo users query with the same image whose conceptconsists of “car” and “sunset.” In this example, however,the aimed concept for Query 1 and Query 2 is “car” and“sunset,” respectively. After a set of feedbacks for Query 1and Query 2, two different moving paths will be producedsince they will lead to images of aimed concepts, respec-tively. The involved problem, so-called visual diversity, isshown in Fig. 2. In this case, if the compound concept to aimat consists of “car,” “sunset,” and “sunset and car,” it is noteasy for traditional CBIR methods to capture the user’sintention. Especially for query point movement methods,this problem will result in that the features would convergetoward the specific point in the feature space during thequery session. Hence, it is still hard to cover the concepts of“car,” “sunset,” and “sunset and car” even by performingthe weighted K-Nearest Neighbors (KNNs) search.

To resolve the aforementioned problems, we propose anovel method named Navigation-Pattern-based Relevance

Feedback (NPRF) to achieve the high retrieval quality ofCBIR with RF by using the discovered navigation patterns.The expected scenario for effectiveness versus efficiency isclearly illustrated in Fig. 3. It shows that our intent is toreach high precision efficiently and effectively. In terms ofefficiency, the navigation patterns mined from the user

query log can be viewed as the shortest paths to the user’sinterested space. According to the discovered patterns, theusers can obtain a set of relevant images in an online queryrefinement process. Thus, the problem of redundantbrowsing is successfully solved. In terms of effectiveness,the proposed navigation-pattern-based search algorithm(NPRFSearch) merges three query refinement strategies,including Query Point Movement (QPM), Query Reweighting(QR), and Query Expansion (QEX), to deal with the problemof exploration convergence. In short, the discoverednavigation pattern in NPRFSearch can be regarded as anoptimized search path to converge the search space towardthe user’s intention effectively. As a whole, through NPRF,the optimal results can be attained in very few feedbacks.

The rest of the paper is organized as follows: A review ofpast studies is briefly described in Section 2. In Section 3, wedescribe the details of our proposed method for queryrefinement and relevance feedback. Empirical evaluationsof the proposed method are expressed in Section 4. Finally,we conclude the paper in Section 5.

2 RELATED WORK

Relevance feedback [5], [17], [25], in principle, refers to a setof approaches learning from an assortment of users’browsing behaviors on image retrieval [10]. Some earlierstudies for RF make use of existing machine learningtechniques to achieve semantic image retrieval, includingStatistics, EM, KNN, etc. Although these forerunners weredevoted to formulating the special semantic features forimage retrieval, e.g., Photobook [11], QBIC [1], VisualSEEK[16], there still have not been perfect descriptions forsemantic features. This is because of the diversity of visualfeatures, which widely exists in real applications of imageretrieval. Therefore, active query refinement, based on theanalysis of usage logs, attracts researchers’ attention in thisarea of RF.

2.1 Query Reweighting

Some previous work keeps an eye on investigating whatvisual features are important for those images (positiveexamples) picked up by the users at each feedback (alsocalled iteration in this paper). The notion behind QR isthat, if the ith feature fi exists in positive examplesfrequently, the system assigns the higher degree to fi. QR-like approaches were first proposed by Rui et al. [14],which convert image feature vectors to weighted-termvectors in early version of Multimedia Analysis and RetrievalSystem (MARS). Furthermore, Rui et al. [15] provided

SU ET AL.: EFFICIENT RELEVANCE FEEDBACK FOR CONTENT-BASED IMAGE RETRIEVAL BY MINING USER NAVIGATION PATTERNS 361

Fig. 1. Motivating example for the problem of exploration convergence.

Fig. 2. Example of visual diversity.

Fig. 3. The expected scenario for effectiveness versus efficiency.

another interactive approach that allows the user to submita coarse initial query to refine her/his need via a set ofrelevance feedbacks. In this work, the feature weights aredynamically updated to connect low-level visual featuresand high-level human concepts. NNEW, developed by Youet al. [24], learns the user’s query from positive andnegative examples by weighting the important features.For this kind of approach, no matter how the weighted orgeneralized distance function is adapted, the diverse visualfeatures extremely limit the effort of image retrieval. Fig. 4illustrates this limitation that although the search area iscontinuously updated by reweighting the features, sometargets could be lost.

2.2 Query Point Movement

Another solution for enhancing the accuracy of imageretrieval is moving the query point toward the contour ofthe user’s preference in feature space. QPM regards multi-ple positive examples as a new query point at eachfeedback. After several forceful changes of location andcontour, the query point should be close to a convex regionof the user’s interest. The well-known space-vector formulaproposed by Rocchio [13] is as follows:

Qi ¼ Qi�1 þ �Xnrj¼1

Rj

nr� �

Xnirj¼1

IRj

nir; ð1Þ

where Qi is the vector of the ith query, Rj is the vector of thejth relevant image, IRj is the vector of the jth irrelevantimage, nr is the cardinality of relevant images, and nir is thecardinality of irrelevant images. One of the QPM ap-proaches is the modified version of MARS [9]. MARSperforms weighted euclidean distance to compute thesimilarity between the query and the targets. Anotherwell-known study is MindReader [6], which took advantageof a generalized euclidean distance to look for the targets inwell ellipsoids. However, diverse visual contents shared bydifferent kinds of images damage the retrieval quality verymuch. Also, it is very difficult to derive an adaptive andperfect measuring function. A specific measuring functionindeed cannot cover all target groups with various visualcontents. Moreover, the modified query point of eachfeedback probably moves toward the local optimal centroid.Thus, global optimal results are not easily touched in QPM-like work.

2.3 Query EXpansion

Because QR and QPM cannot elevate the quality of RF, QEXhas been another hot technique in the solution space of RFrecently. That is, straightforward search strategies, such asQR and QPM, cannot completely cover the user’s interest

spreading in the broad feature space. As a result, diverseresults for the same concept are difficult to obtain. For thisreason, the modified version of MARS [9] groups thesimilar relevant points into several clusters, and selectsgood representative points from these clusters to constructthe multipoint query. Wu et al. [22] proposed FALCON,which is designed to handle disjunctive queries withinarbitrary metric spaces. Qcluster, developed by Kim andChung [8], intends to handle the disjunctive queries byemploying adaptive classification and cluster mergingmethods. As experimented in earlier studies, the effective-ness of QEX is better than those of QPM and QR.Nevertheless, there are still some problems unsolved forQEX. For MARS, inappropriate search regions cannot dealwith complex queries. For FALCON, the relevant querypoints are too many to be efficient. Adjusting the disjunctivequeries causes the expensive search cost and the resultscannot escape from the restricted range (clusters) that theusers are able to specify. On the whole, QEX brings outhigher computation cost and more feedbacks in RF.

2.4 Hybrid RF

In addition to past studies already described, another typeof RF approach emphasizes the integration of varioussearch strategies [2], [3], [4], [7], [18]. However, this kind ofmethod is instinctive, and very little hybridized workfocuses on the accumulated information (long-term usagelog) coming from various users. Moreover, the greatereffectiveness of the multisystem requires a higher computa-tion cost, due to multiple processings. One of the hybrid RFstrategies is IRRL. IRRL, proposed by Yin et al. [23],addresses the important empirical question of how toprecisely capture the user’s interest at each feedback. InIRRL, exploiting knowledge from the long-term experienceof users can facilitate the selection of multiple RF techniquesto get the best results. The derived problems from IRRL are:the selection of optimal RF technique cannot avoid theoverhead of long iterations of feedback. Also, the visualdiversity existing in the global feature space cannot beresolved with an optimal RF technique alone.

3 PROPOSED APPROACH

As elaborated above, the critical issue of RF can be chieflysummarized thus: how to achieve effective and efficientimage retrieval. To deal with this issue, we describe howour proposed approach NPRF integrates the discoverednavigation patterns and three RF techniques to achieveefficient and effective exploration of images.

3.1 Overview of Navigation-Pattern-BasedRelevance Feedback

The major difference between our proposed approach andother contemporary approaches is that we approximate anoptimal solution to resolve the problems existing in currentRF, such as redundant browsing and exploration convergence.To this end, the approximated solution takes advantage ofexploited knowledge (navigation patterns) to assist theproposed search strategy in efficiently hunting the desiredimages. Generally, the task of the proposed approach can bedivided into two major operations, namely offline knowledgediscovery and online image retrieval. As depicted in Fig. 5,each operational phase contains some critical components

362 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 23, NO. 3, MARCH 2011

Fig. 4. Relevance feedback with generalized QR technique [14].

for completing the specific process. For online operation,once a query image is submitted to this system, the systemfirst finds the most similar images without considering anysearch strategy, and then returns a set of the most similarimages. The first query process is called initial feedback.Next, the good examples picked up by the user deliver thevaluable information to the image search phase, includingnew feature weights, new query point, and the user’sintention. Then, by using the navigation patterns, threesearch strategies, with respect to QPM, QR, and QEX, arehybridized to find the desired images. Overall, at eachfeedback, the results are presented to the user and therelated browsing information is stored in the log database.After accumulating long-term users’ browsing behaviors,offline operation for knowledge discovery is triggered toperform navigation pattern mining and pattern indexing.The framework of the proposed approach is brieflydescribed as follows:

3.1.1 Online Image Retrieval

. Initial Query Processing Phase: Without consideringthe feature weight, this phase extracts the visualfeatures from the original query image to find thesimilar images. Afterward, the good examples (alsocalled positive examples in this paper) picked up bythe user are further analyzed at the first feedback(also called iteration 0 in this paper).

. Image Search Phase: Behind the search phase, ourintent is to extend the one search point to multiplesearch points by integrating the navigation patternsand the proposed search algorithm NPRFSearch.Thus, the diverse inclusion of the user’s interest canbe successfully implied. In this phase, a new querypoint at each feedback is generated by the precedingpositive examples. Then, the k-nearest images to thenew query point can be found by expanding theweighted query. The search procedure does not stopunless the user is satisfied with the retrieval results.

3.1.2 Offline Knowledge Discovery

. Knowledge Discovery Phase: Learning from users’behaviors in image retrieval can be viewed as one typeof knowledge discovery. Consequently, this phaseprimarily concerns the construction of the navigationmodel by discovering the implicit navigation patternsfrom users’ browsing behaviors. This navigationmodel can provide image search with a good supportto predict optimal image browsing paths.

. Data Storage Phase: The databases in this phase canbe regarded as the knowledge marts of a knowledgewarehouse, which store integrated, time-variant, andnonvolatile collection of useful data includingimages, navigation patterns, log files, and imagefeatures. The knowledge warehouse is very helpfulto improve the quality of image retrieval. Note thatthe procedure of constructing rule base from theimage databases can be conducted periodically tomaintain the validity of the proposed approach.

3.2 Offline Knowledge Discovery

In fact, usage mining has been made on how to generateusers’ browsing patterns to facilitate the web pagesretrieval. Similarly, for web image retrieval, the user hasto submit a query term to the search engine, so-calledtextual-based image search. Then the user can obtain a setof most relevant web images according to the metadata orthe browsing log. However, if the result does not satisfy theuser, the query refinement can be easily incorporated intothe query procedure. This is why CBIR using RF has beenthe focus of the researchers in the field of image retrieval.As far as the usage log of CBIR is concerned, the challengemainly lies on: how to generate and utilize the discoveredpatterns. In this paper, we develop a navigation-pattern-based data structure permeated by the query point move-ment aspect, which has never been proposed by paststudies. Through the special data structure, the user’sintention can be caught more quickly and precisely.

In detail, the data structure can be viewed as a hierarchy,including positive images, query points, and clusters. Aquery session contains a set of iterative feedbacks (itera-tions), which is referred to a navigation path. At eachfeedback, the positive examples, which indicate the resultspicked up by the user, are used to derive a referred visualquery point by averaging the positive visual features.Finally, the query sessions, iterations, positive examples,and visual query points are stored into the original logdatabase, as shown in Fig. 8. If the original log data areready, the next task is to discover navigation patterns fromthe original log data. Basically, navigation pattern discoveryconsists of two stages: data transformation and navigationpatterns mining. For data transformation, as shown in lines 1-6 of Fig. 6, the visual query points of the ith iteration aregrouped into m clusters, where n is the maximum sessionlength and 0 ¼< i ¼< n. Then, the visual query points in eachcluster are converted into a specific symbol, called item #.The transformed log table is also divided into severalsubtables at this time. For navigation patterns mining, asshown in lines 8-21 of Fig. 6, the frequent itemsets aremined from the navigation-transaction table.

SU ET AL.: EFFICIENT RELEVANCE FEEDBACK FOR CONTENT-BASED IMAGE RETRIEVAL BY MINING USER NAVIGATION PATTERNS 363

Fig. 5. Workflow of NPRF.

3.2.1 Data Transformation

To date, very few significant studies have succeeded insemantic image retrieval or image recognition because of thecomplicated visual contents. To handle the vagueness inimage presentation, data transformation for visual content is

a fundamental and important operation because it cansimplify both the description of visual query points and thediscovery of navigation patterns. In other words, without thedata transformation, we have to consider all positive images

of each query session in the log database. If all positiveimages are considered for navigation pattern mining, toomany items make the frequent itemsets (navigation patterns)hard to find. Also, the mining cost is expensive.

As a result, the aim of data transformation is to generateQuery Point Dictionary (QPD) to reduce the kinds of itemson the transaction list. The scenario of QPD can bepresented in Fig. 7. In Fig. 7, each trail stands for a querysession (browsing path) and each point represents thevisual query point of the iteration in a query session. ForFig. 7, given a set of iterations, ITN ¼ fQ; iteration 0;iteration 1; . . . :; iteration ng, where Q contains the set ofstarting query images related to different navigation trails,and iteration n contains the set of visual query points at thenth iteration. From Q to iteration n, the visual query points

of each iteration are grouped into several clusters, withrespect to the circles encoded as C11, C12; . . . . and Cnm inFig. 7, by utilizing k-means algorithm and visual similaritycalculations. For visual similarity calculation, the adoptedfeatures are Color Layout, Color Structure, Edge Histo-gram, Homogeneous Texture, and Region Shape. At last,each query point of each cluster is assigned the specificsymbol referred to the cluster number. The symbolizedcluster number is employed as the item number in thenavigation-transaction table for the mining procedure. Forexample, in Fig. 7, there are 2, 4, 3, 2, and 3 starting queryimages in clusters C11;C12;C13;C14, and C15, respectively.For C11, the related query transactions are fC11;C21;C32;C42g and fC11;C23; C32;C42g. That is, C11;C12;C13;C14 andC15 can be viewed as the starting items in these 14 querytransactions.

In this phase, the transformed log table is first generatedby the logged query sessions containing query session id,iteration number, positive images, and visual query pointnumber. Once a query point is projected onto the QPD, thereferred item number is stored into the transformed logtable. As shown in Fig. 8, the transformed log table has to befurther partitioned into three tables for various needs in thispaper, including QP table, Navigation-transaction table,and Partitioned Log table. Navigation-transaction table isused for navigation patterns mining. QP table and Parti-tioned Log table are necessary for image search discussed atlength in subsequent sections. In viewing the complete data,the jointed table can be derived with connecting the jointattributes of different tables.

3.2.2 Navigation Patterns Mining

This stage focuses on the discovery of relations among theusers’ browsing behaviors on RF. Basically, the frequentpatterns mined from the user logs are regarded as theuseful browsing paths to optimize the search direction onRF. In our NPRF approach, the users’ common interests canbe represented by the discovered frequent patterns (alsocalled frequent itemsets). Through these navigation pat-terns, the user’s intention can be precisely captured in ashorter query process. In this phase, the Apriori-likealgorithm is performed to exploit navigation patterns usingthe transformed data. The task for establishing the naviga-tion model can be decomposed into two steps:

364 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 23, NO. 3, MARCH 2011

Fig. 6. Procedure for offline knowledge discovery.

Fig. 7. The query point dictionary of the proposed approach.

Step 1: Construction of the navigation transaction table.From Fig. 7, let us select five query sessions as an exampleshown in Table 1. In Table 1, a query session can be considereda transaction. In this case, the transaction is composed of aquery item C1mj C1m 2 Qf g and several iteration itemsfC1mj C1m 2 iteration#4, where 2 ¼< i ¼< 4g. To exploit valuablenavigation patterns, all query sessions in the transformed logtable are collected as the navigation-transaction table.

From Table 1, we can discover some interestingphenomena. First, though passing through different itera-tions, the paths starting with the same query item lead to thesame destination, e.g., Session 001 and Session 002. Second,the paths starting with different query items lead to thesame destination, e.g., Session 002 and Session 004. Third,the paths starting with the same query item lead to differentdestinations, e.g., Session 003 and Session 004. Even thespecial path, Session 005, may be the other important trailfor image hunting. Such evidence implicates the mainaspect we want to demonstrate in this paper.

Step 2: Generation of navigation patterns. This operationconcentrates on mining valuable navigation patterns tofacilitate online image retrieval. As shown in lines 8-21 ofFig. 6, the frequent itemsets X whose supports support(X)exceed the presetting minimum support minsup are minedby Apriori-like algorithm. In fact, the generated frequentitemsets can be regarded as sequential navigation patternsdirectly since temporal continuities have been considered increating QPD. For example, as shown in Table 2, the

sequential navigation pattern C11 ! C32 ! C42f g derivedfrom frequent itemset C11;C32;C42f g under the minsup is 2.

3.2.3 Pattern Indexing

In this stage, we describe how to build the navigationpattern tree with the discovered navigation patterns. Asshown in Fig. 9, the navigation patterns can be regarded asthe branches of the navigation pattern tree. Once thenavigation patterns are generated, the query item C1m 2 Qin each navigation pattern is used as a seed (called queryseed) to plant the navigation pattern tree. Therefore, if thecardinality of the clusters is 7, there are seven navigationtrees generated in this stage. A tree contains a number ofnavigation paths, and each node of the paths stands for anitem consisting of several visual query points. A visualquery point indicates a set of positive images. In particular,to decrease the complexities of both pattern search andpattern storage, the redundant navigation patterns have tobe pruned further. The observed redundancy between twopatterns can be defined as follows:

Definition 1 (Pattern Redundancy). Consider two navigationpatterns, namely Fitemset1 ¼ Cab; . . . ;Cijf g and Fitemset2 ¼ Cpg; . . . ;Cxyf g. I f Cab ¼ Cpg, Cij ¼ Cxy, a n djFitemset1j ¼> jFitemset2j, Fitemset1 is called redundantnavigation pattern.

After eliminating the redundant patterns, the trimmednavigation pattern tree reduces the search cost significantly.Based on the navigation pattern tree, the desired images canbe captured more promptly without repeating the scan ofthe whole image database at each feedback, especially forthe large-scale image data.

SU ET AL.: EFFICIENT RELEVANCE FEEDBACK FOR CONTENT-BASED IMAGE RETRIEVAL BY MINING USER NAVIGATION PATTERNS 365

TABLE 1Example of Navigation-Transaction Table

TABLE 2Example of Navigation Patterns

Fig. 9. Example of navigation pattern trees.

Fig. 8. The entity-relationship data model for partitioning the log data.

3.3 Online Image Search

3.3.1 Basic Idea

As we can recall from previous explanations, the aim of thesearch strategy is to attack the weakness of the traditionalapproaches, including redundant browsing and explorationconvergence. Indeed, these unsolved problems result in largelimitation in RF. Perhaps, the aged hybrid systems fusingthe results generated by multiple query refinement systemscan look for the better results than individual systems.Nevertheless, the expensive computation cost makes itimpractical in real applications. Instead, we attempt toapproximate the optimal solution, namely NPRFSearch, toresolve such problems by using the generated navigationpatterns. For the problem of exploration convergence, ourproposed approach extends the search range from a querypoint to a number of relevant navigation paths. As a result,each iterative search can escape from the local optimalspace and further move toward the global optimal space forthe user’s interest. For the problem of redundant browsing,the discovered navigation patterns are adopted as theshortest paths to derive the superior results in a shorterfeedback process. Additionally, the expensive navigationcost is saved further, especially for the massive image data.In general, the NPRFSearch algorithm can be recognized as avery important part of our proposed iterative solution toRF, which merges QEX, QP, and QR strategies.

3.3.2 Algorithm NPRFSearch

As already described above, NPRFSearch is proposed to reachthe high precision of image retrieval in a shorter queryprocess by using the valuable navigation patterns. In thissection, we explain the details of NPRFSearch. As illustratedin Fig. 10, the NPRFSearch algorithm is triggered by receiving:1) a set of positive examples G and negative examples Ndetermined by the user at the preceding feedback, 2) a set ofnavigation patterns TR ¼ ftr1; tr2; . . . ; trhg, where each trhcontains a query seed rth and several patterns (e.g.,fC11;C32;C42g referred to Table 2), and 3) an accuracythreshold thrd. In brief, the iterative search procedure canbe decomposed into several steps as follows:

1. Generate a new query point by averaging the visual-features of positive examples.

2. Find the matching navigation pattern trees bydetermining the nearest query seeds (root).

3. Find the nearest leaf nodes (terminations of a path)from the matching navigation pattern trees.

4. Find the top s relevant visual query points from theset of the nearest leaf nodes.

5. Finally, the top k relevant images are returned tothe user.

From the aspect of NPRFSearch, step 1 can be regarded asQPM and steps 2-5 can be regarded as QEX. For QR, thefeature weights are iteratively updated based on the positiveexamples at each feedback. As a whole, the proposedNPRFSearch takes advantage of QPM, QEX, QR, andnavigation patterns to make RF more efficiently andeffectively. Without navigation patterns, NPRFSearch cannotreach the high quality of RF. From the viewpoint ofapplicability, the goal of our approach is to satisfy eachquery efficiently instead of providing personalized functionsfor each user. Hence, irrelevant queries from a user will not be

a problem. By collecting a large number of query transac-tions, most queries can be well answered for matching user’sinterests by NPRFSearch. The details of the NPRFSearchalgorithm are described as follows:

Query point generation. The basic idea of this operation isto find the images not only with the specific similarityfunction. By recursively modifying the query point, thesearch direction can move toward the targets gradually.Assume that a set of images is found by the query point qpold

at the preceding feedback. Next, the visual features of thepositive examples G picked up by the user are first averagedinto a new query point qpnew. As shown in lines 1-3 of Fig. 10,consider a set of positive examples G ¼ fg1; g2; . . . ; gkg and ddimensions of the ith feature Fi ¼ ffx1 ; fx2 ; . . . ; fxd g extracted

366 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 23, NO. 3, MARCH 2011

Fig. 10. Algorithm for NPRFSearch.

from the xth positive example. Thereupon, the new querypoint qpnew implied by G can be defined as

qpnew ¼ F1; F2; . . . ; Fb� �

; ð2Þ

where 1 ¼< i ¼< b,

Fi ¼ f1; f2; . . . ; fd� �

;

and

ft ¼P

1�x�k;fxt 2Fifxt

k:

Meanwhile, qpnew and the positive examples are stored intothe log database to enhance the knowledge database. Next,the negative examples are appended to the accumulatednegative set NIMG. At each feedback, eliminating MINGfrom the targets can increase the precision of image retrievalsignificantly. In addition to generating qpnew and MING, theweight of each feature has to be calculated to keep searchingthe images similar to qpnew. In this paper, the feature weightfor similarity computation is normalized as follows:

Feature reweighting. Consider a set of positive examplesG ¼ fg1; g2; . . . ; gkg found by the preceding query pointqpold. Given that a set of �1; �2; . . . ; �bf g is referred toF1; F2; . . . ; Fbf g. The new weight of the ith feature Fi is

defined as

wi ¼

Pb

y¼1�y

�i

Pbz¼1

Pb

y¼1�y

�z

; ð3Þ

where

� ¼Xkx¼1

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiPdj¼1

�fxj � f

qpoldj

�2q

d

and 1 ¼< i ¼< b.Query expansion. To keep an eye on the problem of

exploration convergence, the attempt of this stage is to coverall possible results by the relevant patterns discovered. Then,by performing a weighted KNN search, QEX-like procedurefirst determines the nearest query seed to each of G, calledpositive query seed, and the nearest query seed to each of N,called negative query seed. Lines 5-8 of Fig. 10 show how to findthe positive and negative query seed sets. As a result, a set ofpositive query seeds is selected to be the start of potentialsearch paths. Additionally, the slight loss of the informationembedded in the negative examples is also deliberated in thispaper. In theory, if the negative query seeds are all droppedat each feedback, the desired results could be captured moreprecisely. However, there exist some query seeds belongingto both of the positive query seed set and the negative queryseed set at each feedback. Dropping the negative query seedswould lead to the loss of positive query seeds. That is, thesedropped negative seeds may be the start of good searchpaths. To take account of both positive and negativeinformation simultaneously, every seed has its own tokenrth.chk. If the seed owns the maximum number of negativeexamples or owns no positive example, it will be tokenized asa bad manner, i.e., rth:chk ¼ 0, as shown in lines 4 and 15 ofFig. 10. Otherwise, rth.chk is 1 for any good manner. Yet,considering both positive and negative information elevates

the computation cost. To reduce the computation cost, ateach feedback, the algorithm assigns a bad manner to theseed that owns maximum number of negative examples ifand only if the satisfaction rate (jGj=jG [Nj) cannot reach thepresetting threshold thrd, as shown in lines 9-16 of Fig. 10.Finally, the good manners are the starts of the referrednavigation paths to find the relevant leaf nodes (terminationsof the relevant paths).

For example, consider that seven navigation pattern treesare seeded by seven query seeds, referred to Figs. 7 and 9.Assume that thrd is 20 percent. Fig. 11 is an example showingthat 10 resulting images are returned at a feedback. After theuser’s picking, six are positive examples and four arenegative examples. For each example, the proposed systemthen performs the weighted KNN search to determine thenearest query seed (root) of the pattern tree. Thus, the goodmanner set is {Tree 2, Tree 5, Tree 6, Tree 7}. In this case,Tree 2 owns two positive examples, Tree 5 owns two positiveexamples, Tree 6 owns one positive example and twonegative examples, and Tree 7 owns one positive exampleand one negative example. The potential bad manner is{Tree 6} because it owns maximum number of negativeexamples (that is, two negative examples). Because theprecision (jGj=jG [N j ¼ 60%) in this case exceeds thrd, thebad manner does not need to be eliminated. Otherwise, Tree6 has to be dropped from the good manner set.

Soon after the relevant query seeds (good manners) aredetermined, as shown in line 18 of Fig. 10, a set of matchingleaf nodes is found by traversing the navigation patterntree. Then the crucial search procedure starts looking for thedesired images, using the new feature weights derived by(3). In detail, the crucial search procedure can be decom-posed into two steps.

Step 1: Determination of the relevant visual querypoints. As shown in lines 17-22 of Fig. 10, from thecollection of the relevant navigation paths, the leaf node ofeach relevant navigation path is used to find the relevantquery points by scanning QP table shown in Fig. 8. In otherwords, the last item of each pattern represents thedestination of each search path. Fig. 9 is an appropriateexample to show the aspect. Thus, the user’s intention canbe repaid quickly and comprehensively through determin-ing the relevant visual query points contained in the founddestinations. At last, the most similar visual query pointsto qpnew are acquired by executing a weighted KNN search,as shown in line 23 of Fig. 10.

Step 2: Determination of the relevant images. To reducethe computation cost, top s similar visual query points are

SU ET AL.: EFFICIENT RELEVANCE FEEDBACK FOR CONTENT-BASED IMAGE RETRIEVAL BY MINING USER NAVIGATION PATTERNS 367

Fig. 11. Example of determining the relevant query seeds.

determined in step 1. Hence, the search space is narrowedto the images referred to the top s similar visual querypoints by scanning the partitioned log table shown in Fig. 8.Finally, top k images with the shorter distances to qpnew canbe found by performing a weighted KNN search. Note thatas shown in line 28 of Fig. 10, the negative examplesexisting at each feedback are all skipped in this paper.

Observation and discussion. In our proposed method,the data structure can be regarded as a three-layer hierarchyreferred to Fig. 9, with respect to images, query points, andpatterns (also called items or clusters). At the beginning ofimplementing NPRF, we tried to look for the desired imagesdirectly from the positive images in the relevant items (leafnodes), but in vain. This search strategy can be basicallyregarded as a Breadth-First Search (BFS)-based KNN. Themajor purpose of using BFS is to speed up the search withoutconsidering the relevant visual query points in the relevantitems (leaf nodes). However, an irreconcilable conflict existsbetween Depth-First-Search (DFS)-based KNN and BFS-based KNN search strategies. Traditional KNN takesadvantage of BFS-based search strategy to look for topk-nearest neighbors to the query in the feature space.Comparatively, we adopt DFS-based search to hierarchicallyfind the desired images from query point level to imagelevel. The experimental results show that a vertical search isbetter than a horizontal search in this case. This is due to thefact that the correct results do not spread across the querypoints. In other words, a query point represents a topic in theuser’s mind, containing a set of similar positive images. Ifperforming BFS KNN, the results will spread across thequery points/topics. Thus, the precision is inferior.

The above viewpoint can be clarified in Fig. 12. Given thatthree candidate query points {query Point 1, query Point 2,and query Point 3} are found in the relevant items, andI1; I2; . . . ; I14f g indicates the related image set. Assume that

the number of relevant query points is 1 and the number ofrelevant images is 5. If the correct results I4; I5; I6; I7; I8f g areall contained in the range of query Point 2, there exists alarge gap of results between DFS-based KNN and BFS-basedKNN. For BFS-based KNN, the nearest neighbors to qpnew

contain the images I1; I2; I3; I4; I5f g spreading on the rangesof three query points. Thus, the precision of BFS-based KNNis 2/5. In contrast, the irrelevant query points, with respectto query Point 1 and query Point 3, are dropped using DFS-based KNN. Thereby, the most relevant query Point 2 is

identified, and the actual desired images I4; I5; I6; I7; I8f glimited in the range of query Point 2 can be found.Accordingly, the precision of DFS-based KNN is 5/5. Inthis paper, DFS-based KNN brings out the better results thanBFS-based KNN.

4 EMPIRICAL EVALUATIONS

So far, we have presented our proposed approach NPRF inthe preceding section in great detail. In this section, weevaluate the effectiveness of NPRF.

4.1 Experimental Data

The experimental data came from the collection of the Corelimage database and the web images. We prepared sevendata sets composed of different kinds of categories, asshown in Table 3. Each category contains 200 images. In ourexperimental logs, we initially performed QPM to collect thelog on the queries. Then, the navigation patterns areobtained by adopting our pattern discovery mechanism.Incrementally, the knowledge discovered from the naviga-tion patterns can be enhanced once the new query issubmitted to NPRF. Indeed, it does need time to gather theusage logs. However, a larger log that needs longercollection time can help achieve higher retrieval quality.An alternative way to reduce the whole collection time is toincrease the size of collected logs incrementally such that theprecision will also be enhanced gradually. All the experi-ments were implemented in C++, running on a personalcomputer with Intel Dual Core Xeon 3050 2.13 GHzprocessor and 1 G MB RAM.

To analyze the effectiveness of our proposed approach,

two major criteria, namely precision and coverage, are used to

measure the related experimental evaluations. They are

defined as

368 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 23, NO. 3, MARCH 2011

Fig. 12. Example of the search space for BFS-based KNN and DFS-based KNN.

TABLE 3The Experimental Data Sets

precision ¼ jcorrectjjretrievedj � 100%;

Coverage ¼ jac correctjjrelevantj � 100%;

where correct is the positive image set to the query image ateach feedback, retrieved is the resulting image set exploitedby the proposed approach at each feedback, ac_correct is theunion set of all correct during a query session, and relevant isthe ground truth. The criterion precision delivers the abilityfor hunting the desired images in user’s mind and thecoverage represents the ability for finding the accumulatedpositive images in a query session.

4.2 Experimental Results

In practice, the primary intentions behind our experimentsare: 1) evaluations for parameter settings, 2) comparisonsbetween NPRF and other existing RF approaches in terms ofeffectiveness and efficiency, and 3) the promise of theperformance for the scale-up data.

4.2.1 Evaluations for Parameter Settings

Before evaluating NPRF, we have to elicit the appropriateparameter settings, as shown in Table 4, to set up the relatedexperiments. Based on data set 3, the evaluation measure ofthis experiment is the average precision of iterations 0-6. Thefirst parameter we concerned is the number of clusters cl.Fig. 13 reveals that the precision slightly degrades as thenumber of clusters increases. However, the impact of varied

settings on cl is not significant. The second parameter wetested is the number of logs and Fig. 14 depicts that, the largerthe number of logs, the higher the average precision and thehigher the retrieval cost. To keep balance between effective-ness and efficiency, 3,600 is adopted as our default setting forlog size in the experiments. The third parameter tested is theminimum support minsup for navigation patterns mining.Fig. 15 shows that 0.01 is the best setting. The properexplanation is that the smaller the minsup, the more the rulesand the more the noises. In other words, a smaller minsupresults in more noises, and thus, lower precision. The fourthparameter tested is the most relevant seed s. Fig. 16 illustratesthat a larger number of seeds (query points) could result inlower precision since more irrelevant images would beproduced. The fifth parameter tested is thrd. In our experi-mental results, the best setting is 0.1 (the figure is eliminateddue to the space limitation). However, the difference ofresulted precision is within 0.00185 with thrd varied from 0 to1. That is, the impact of thrd is not obvious. The last parameterwe tested is k (number of returned images) with the value setas 20 and 30. The experimental results show that higherprecision is produced when k ¼ 30. The potential reason isthat more returned images may provide more usefulinformation to support RF. However, few users would liketo do feedback on a large number of results in practice.Hence, k is set as 20 in our further experiments. Overall, themost important parameters are minimum support, thenumber of logs, and most relevant seeds.

4.2.2 Comparisons between NPRF and

Other Approaches

Evaluation of effectiveness. The first experimental evalua-tion shows the effectiveness of the proposed approach. Inthis experiment, we utilized the data set 7 as the experi-mental data set. Fig. 17 shows: First, the precision of NPRFperforms better than other existing RF approaches signifi-cantly whatever the iteration number is. Second, for ourproposed approach, the best improvement (steep slope)exists between iteration 0 and iteration 2. This is the majorpoint we want to demonstrate, as depicted in Fig. 3. Third,

SU ET AL.: EFFICIENT RELEVANCE FEEDBACK FOR CONTENT-BASED IMAGE RETRIEVAL BY MINING USER NAVIGATION PATTERNS 369

TABLE 4The Experimental Parameter Settings

Fig. 15. The average precisions of different minsup for data set 3.

Fig. 16. The average precisions of different s for data set 3.Fig. 13. The average precisions of different cl for data set 3.

Fig. 14. The average precisions of different numbers of log-transactionfor data set 3.

the naı̈ve approaches, such as Naı̈ve QR and Naı̈ve QPM,execute worse than the advanced approaches, such as QRand QPM, because the advanced approaches skip all thenegative examples occurring at each feedback. The similaroperation in our proposed approach is shown in line 28 ofFig. 10. Fourth, the hybrid approaches outperform theindividual approaches. Overall, QPM is the best individualapproach and NPRF is the best of all.

Moreover, we also evaluate the strength of our proposedapproach for solving the exploration convergence problemin terms of coverage. Fig. 18 exhibits that NPRF can findmore diverse images than other approaches. It indicatesthat the navigation patterns can effectively help the usersget away from the local optimal results and capture thedesired images in the global search space, as alreadydescribed in Section 1.

Evaluation of efficiency. In addition to the effectiveness,another issue we are interested in is the efficiency of ourproposed approach. Through this experiment, we canrealize whether the redundant browsing problem can bereally avoided or not. Table 5 reveals the comparison of theefficiency among different approaches. The result showsthat NPRF reaches the specific precision 80 percent onlyneeding two iterations. That is, according to the high qualityof search strategy, the users can obtain the desired imagesquickly without a needless exploration program. In contrastto the other compared approaches, our proposed approachcan be viewed as the best solution to resolve the redundantbrowsing problem.

4.2.3 Scale-Up Evaluation

In most former approaches, an important limitation forimage retrieval is that the explosive growth of images leadsto poor and unstable performance. This leaves us with the

motivation of approximating a good solution to cope withthe largeness of the data. To address this point, we enlargedthe data set to evaluate the performance of our proposedapproach. Obviously, Table 6 illustrates that the precision issubstantially stable even when enlarging the data set. Thisfurther proves that our approach is very robust in thesuccess of RF for the large-scale image data. In addition toprecision, the execution time of NPRF is shown in Table 7. Itreveals that the computation time increases slightly from 0.9to 1.5 seconds, with the number of categories increasedfrom 30 to 50 (i.e., the number of images increased from6,000 to 10,000). This shows that NPRF is scalable in termsof the execution time.

4.2.4 Illustrative Examples

Now let us consider some testing examples, as shown inFigs. 19, 20, and 21. In each example, top 20 images arereturned at three specialized iterations. In these illustrativeexamples, the images marked in box are called the “correctset.” Fig. 19 shows that even though the query image is acomplex image (jack-o’-lantern), NPRF can exactly capturethe concept in user’s mind at iteration 1. That is, through onlyone feedback, the precision can reach 12/20 at iteration 1.Furthermore, after six iterative feedbacks, all 20 resultingimages can be included in the correct set as the user desires. In

370 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 23, NO. 3, MARCH 2011

Fig. 17. The precisions of different approaches for data set 7.

Fig. 18. The coverage of different approaches for data set 7.

TABLE 5The Minimum Number of Feedbacks for Different Approaches to

Reach the Specific Precision 80 Percent

TABLE 6The Precisions for Different Amounts of Data by NPRF

TABLE 7The Execution Time for Different Amounts of Data by NPRF

this case, from iterations 0 to 6, the improvement of precisionby NPRF is

Improvement

¼ ðprecision of iteration 6Þ � ðprecision of iteration 0Þðprecision of iteration 0Þ

¼2020� 2

20220

� 100% ¼ 900%:

As shown in Figs. 20 and 21, other traditional approachescannot bring out the same effectiveness. The results revealthat our proposed approach outperforms other well-knownapproaches in terms of effectiveness and efficiency.

5 CONCLUSION

To deal with the long iteration problem of CBIR with RF,we have presented a new approach named NPRF byintegrating the navigation pattern mining and a naviga-tion-pattern-based search approach named NPSearch.

In summary, the main feature of NPRF is to efficientlyoptimize the retrieval quality of interactive CBIR. On onehand, the navigation patterns derived from the users’ long-term browsing behaviors are used as a good support for

minimizing the number of user feedbacks. On the other hand,the proposed algorithm NPRFSearch performs the naviga-tion-pattern-based search to match the user’s intention bymerging three query refinement strategies. As a result,traditional problems such as visual diversity and explorationconvergence are solved. For navigation-pattern-basedsearch, the hierarchical BFS-based KNN is employed tonarrow the gap between visual features and human conceptseffectively. In addition, the involved methods for special datapartition and pattern pruning also speed up the imageexploration. The experimental results reveal that the pro-posed approach NPRF is very effective in terms of precisionand coverage. Within a very short term of relevance feedback,the navigation patterns can assist the users in obtaining theglobal optimal results. Moreover, the new search algorithmNPRFSearch can bring out more accurate results than otherwell-known approaches.

In the future, there are some remaining issues toinvestigate. First, in view of very large data sets, we willscale our proposed method by utilizing parallel and dis-tributed computing techniques. Second, we will integrateuser’s profile into NPRF to further increase the retrievalquality. Third, we will apply the NPRF approach to morekinds of applications on multimedia retrieval or multimediarecommendation.

ACKNOWLEDGMENTS

This research was supported by the National ScienceCouncil, Taiwan, R.O.C., under grant no. NSC 98-2631-H-006-001, and by the Ministry of Economic Affairs, Taiwan,R.O.C., under grant no. 97-EC-17-A-02-S1-024.

REFERENCES

[1] M.D. Flickner, H. Sawhney, W. Niblack, J. Ashley, Q. Huang, B.Dom, M. Gorkani, J. Hafner, D. Lee, D. Steele, and P. Yanker,“Query by Image and Video Content: The QBIC System,”Computer, vol. 28, no. 9, pp. 23-32, Sept. 1995.

[2] R. Fagin, “Combining Fuzzy Information from Multiple Systems,”Proc. Symp. Principles of Database Systems (PODS), pp. 216-226, June1996.

[3] R. Fagin, “Fuzzy Queries in Multimedia Database Systems,” Proc.Symp. Principles of Database Systems (PODS), pp. 1-10, June 1998.

[4] J. French and X-Y. Jin, “An Empirical Investigation of theScalability of a Multiple Viewpoint CBIR System,” Proc. Int’l Conf.Image and Video Retrieval (CIVR), pp. 252-260, July 2004.

SU ET AL.: EFFICIENT RELEVANCE FEEDBACK FOR CONTENT-BASED IMAGE RETRIEVAL BY MINING USER NAVIGATION PATTERNS 371

Fig. 19. The resulting example for NPRF. Fig. 21. The resulting example for QPM.

Fig. 20. The resulting example for QR.

[5] D. Harman, “Relevance Feedback Revisited,” Proc. 15th Ann. Int’lACM SIGIR Conf. Research and Development in Information Retrieval,pp. 1-10, 1992.

[6] Y. Ishikawa, R. Subramanya, and C. Faloutsos, “MindReader:Querying Databases through Multiple Examples,” Proc. 24th Int’lConf. Very Large Data Bases (VLDB), pp. 218-227, 1998.

[7] X. Jin and J.C. French, “Improving Image Retrieval Effectivenessvia Multiple Queries,” Multimedia Tools and Applications, vol. 26,pp. 221-245, June 2005.

[8] D.H. Kim and C.W. Chung, “Qcluster: Relevance Feedback UsingAdaptive Clustering for Content-Based Image Retrieval,” Proc.ACM SIGMOD, pp. 599-610, 2003.

[9] K. Porkaew, K. Chakrabarti, and S. Mehrotra, “Query Refinementfor Multimedia Similarity Retrieval in MARS,” Proc. ACM Int’lMultimedia Conf. (ACMMM), pp. 235-238, 1999.

[10] J. Liu, Z. Li, M. Li, H. Lu, and S. Ma, “Human BehaviourConsistent Relevance Feedback Model for Image Retrieval,” Proc.15th Int’l Conf. Multimedia, pp. 269-272, Sept. 2007.

[11] A. Pentalnd, R.W. Picard, and S. Sclaroff, “Photobook: Content-Based Manipulation of Image Databases,” Int’l J. Computer Vision(IJCV), vol. 18, no. 3, pp. 233-254, June 1996.

[12] T. Qin, X.D. Zhang, T.Y. Liu, D.S. Wang, W.Y. Ma, and H.J. Zhang,“An Active Feedback Framework for Image Retrieval,” PatternRecognition Letters, vol. 29, pp. 637-646, Apr. 2008.

[13] J.J. Rocchio, “Relevance Feedback in Information Retrieval,” TheSMART Retrieval System—Experiments in Automatic DocumentProcessing, pp. 313-323, Prentice Hall, 1971.

[14] Y. Rui, T. Huang, and S. Mehrotra, “Content-Based ImageRetrieval with Relevance Feedback in MARS,” Proc. IEEE Int’lConf. Image Processing, pp. 815-818, Oct. 1997.

[15] Y. Rui, T. Huang, M. Ortega, and S. Mehrotra, “RelevanceFeedback: A Power Tool for Interactive Content-Based ImageRetrieval,” IEEE Trans. Circuits and Systems for Video Technology,vol. 8, no. 5, pp. 644-655, Sept. 1998.

[16] J.R. Smith and S.F. Chang, “VisualSEEK: A Fully AutomatedContent-Based Image Query System,” Proc. ACM Multimedia Conf.,Nov. 1996.

[17] G. Salton and C. Buckley, “Improving Retrieval Performance byRelevance Feedback,” J. Am. Soc. Information Science, vol. 41, no. 4,pp. 288-297, 1990.

[18] H.T. Shen, S. Jiang, K.L. Tan, Z. Huang, and X. Zhou, “Speed upInteractive Image Retrieval,” VLDB J., vol. 18, no. 1, pp. 329-343,Jan. 2009.

[19] V.S. Tseng, J.H. Su, J.H. Huang, and C.J. Chen, “IntegratedMining of Visual Features, Speech Features and FrequentPatterns for Semantic Video Annotation,” IEEE Trans. Multimedia,vol. 10, no. 2, pp. 260-267, Feb. 2008.

[20] V.S. Tseng, J.H. Su, B.W. Wang, and Y.M. Lin, “Web ImageAnnotation by Fusing Visual Features and Textual Information,”Proc. 22nd ACM Symp. Applied Computing, Mar. 2007.

[21] K. Vu, K.A. Hua, and N. Jiang, “Improving Image RetrievalEffectiveness in Query-by-Example Environment,” Proc. 2003ACM Symp. Applied Computing, pp. 774-781, 2003.

[22] L. Wu, C. Faloutsos, K. Sycara, and T.R. Payne, “FALCON:Feedback Adaptive Loop for Content-Based Retrieval,” Proc. 26thInt’l Conf. Very Large Data Bases (VLDB), pp. 297-306, 2000.

[23] P.Y. Yin, B. Bhanu, K.C. Chang, and A. Dong, “IntegratingRelevance Feedback Techniques for Image Retrieval UsingReinforcement Learning,” IEEE Trans. Pattern Analysis and MachineIntelligence, vol. 27, no. 10, pp. 1536-1551, Oct. 2005.

[24] H. You, E. Chang, and B. Li, “NNEW: Nearest NeighborExpansion by Weighting in Image Database Retrieval,” Proc. IEEEInt’l Conf. Multimedia and Expo, pp. 245-248, Aug. 2001.

[25] X.S. Zhou and T.S. Huang, “Relevance Feedback for ImageRetrieval: A Comprehensive Review,” Multimedia Systems, vol. 8,no. 6, pp. 536-544, Apr. 2003.

Ja-Hwung Su received the BS and MS degreesfrom the Department of Information Manage-ment at I-Shou University in 2000 and 2002,respectively. He is currently working toward thePhD degree in the Department of ComputerScience and Information Engineering at NationalCheng Kung University. His research interestsinclude data mining and data warehousing.

Wei-Jyun Huang received the BS and MSdegrees from the Department of ComputerScience and Information Engineering at NationalChengchi University and National Cheng KungUniversity, respectively. Her research interestincludes data mining.

Philip S. Yu received the BS degree in electricalengineering from National Taiwan University, theMS and PhD degrees in electrical engineeringfrom Stanford University, and the MBA degreefrom New York University. He is a professor inthe Department of Computer Science at theUniversity of Illinois at Chicago and also holdsthe Wexler chair of information technology. Hespent most of his career at IBM Thomas J.Watson Research Center and was the manager

of the Software Tools and Techniques Group. His research interestsinclude data mining, database systems, and privacy. He has publishedmore than 560 papers in refereed journals and conferences. He holds orhas applied for more than 300 US patents. He is an associate editor oftheACM Transactions on the Internet Technology and ACM Transac-tions on Knowledge Discovery from Data. He is on the steeringcommittee of the IEEE Conference on Data Mining and was a memberof the IEEE Data Engineering steering committee. He was the editor-in-chief of the IEEE Transactions on Knowledge and Data Engineering(2001-2004). He had received several IBM honors including two IBMOutstanding Innovation Awards, an Outstanding Technical AchievementAward, two Research Division Awards, and the 94th plateau of InventionAchievement Awards. He was an IBM master inventor. He received aResearch Contributions Award from the IEEE International Conferenceon Data Mining in 2003 and also an IEEE Region 1 Award for “promotingand perpetuating numerous new electrical engineering concepts” in1999. He is a fellow of the ACM and the IEEE.

Vincent S. Tseng received the PhD degreefrom National Chiao Tung University, Taiwan,R.O.C., in 1997, with major in computer science.He is currently a professor in the Department ofComputer Science and Information Engineeringat National Cheng Kung University (NCKU),Taiwan. Before this, he was a postdoctoralresearch fellow in the Computer Science Divi-sion at the University of California at Berkeley,during January 1998 and July 1999. He has

acted as the director of the Institute of Medical Informatics of NCKUsince August 2008. During February 2004 and July 2007, he had alsoserved as the director of the Informatics Center at National Cheng KungUniversity Hospital, Taiwan. He is on the editorial board of theInternational Journal of Data Mining and Bioinformatics and served asa board member for various societies like Taiwan Association forArtificial Intelligence, Taiwan HL-7 Society, and Taiwan BioinformaticsSociety. His research interests include data mining, biomedicalinformatics, mobile web technology, and multimedia databases. Hehas published more than 170 research papers in referred journals andinternational conferences. He has also held (or filed) more than15 patents in the US and R.O.C. He has also served as the chair orprogram committee member for various premier conferences related todata mining and biomedical informatics, including the 2009 IEEEInternational Conference on Data Mining, 2009 SIAM InternationalConference on Data Mining, the 2009 IEEE International Conference onMobile Data Management, 2009 ACM CIKM, PAKDD 2009 (PC memberand tutorial chair), PAKDD 2010, DASFAA 2010, the IEEE BIBM 2008and 2009, the IEEE CBMS 2006-2009, etc. He is a member of the IEEE,the ACM, and an honorary member of Phi Tau Phi Society.

. For more information on this or any other computing topic,please visit our Digital Library at www.computer.org/publications/dlib.

372 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 23, NO. 3, MARCH 2011