
Im-O-Ret: Immersive Object Retrieval

Pedro B. Pascoal
INESC-ID/IST/TU Lisbon
Lisbon, Portugal
[email protected]

Alfredo Ferreira
INESC-ID/IST/TU Lisbon
Lisbon, Portugal
[email protected]

Joaquim Jorge
INESC-ID/IST/TU Lisbon
Lisbon, Portugal
[email protected]

ABSTRACT

The growing number of three-dimensional (3D) objects stored in digital libraries brought forth the challenge of search in 3D model collections. To address it, several approaches have been developed for 3D object retrieval. However, these approaches traditionally present query results as a list of thumbnails, and fail to take advantage of recent visualization and interaction technologies. In this paper, we propose an approach to 3D object retrieval using immersive VR for query result visualization. Query results are shown in a three-dimensional virtual space as 3D objects, and users can explore these results by navigating in this virtual space and manipulating the scattered objects.

Index Terms: Multimedia Information Retrieval; 3D Object Retrieval; Immersive Virtual Environment

1 INTRODUCTION

As a response to the growing number of three-dimensional (3D) objects available in digital libraries, a few search engines have been proposed. Still, most existing solutions exhibit major drawbacks and challenges that need to be tackled. In his survey, Datta [4] identified these drawbacks extensively; we highlight two. First, queries traditionally rely on meta-information, often keyword-based, reducing search to text information retrieval of 3D objects. Second, results are presented as a list of items on a screen. These items are usually thumbnails, often combined with metadata. While solutions to query specification have been proposed based on queries by example [13], sketched queries [9], or gesture-based queries [7], the thumbnail approach to result presentation remains the norm and is clearly inadequate. Indeed, thumbnails may not provide the best view of the model, as shown in Dutagaci's study [5].

In this paper, we propose a novel approach to query result visualization for 3D object retrieval. Instead of using thumbnails, we display the retrieved models in a virtual 3D space. After a query, retrieved objects are shown in a virtual reality (VR) environment, organized according to their degree of similarity. The user can then navigate through the search results and explore them.

2 RELATED WORK

In recent years, many 3D shape search engines have been presented. In 2001, Funkhouser et al. [6] introduced the Princeton 3D model search engine, providing content-based retrieval of 3D models. Queries can be specified using text, by example, by 2D sketch, or by 3D sketch. The results are presented as an array of model thumbnails. After a search, it is also possible to choose a result model as a query-by-example to initiate a new search.

More recently, the FOX-MIIRE search engine by Ansary et al. [1] introduced query by photo. This tool retrieves 3D models similar to an object in a photo. Additionally, this system provides both standard and mobile device interfaces. However, similarly to previous solutions, results are displayed as a list of thumbnails.

Outside the research field, Google 3D Warehouse offers a search engine for 3D models. This repository contains a very large number of different models, but searching the collection is limited to textual queries or, when available, geo-references. Query results are displayed as model images in a list, with the option to additionally manipulate a 3D view of a selected model.

Despite the growing capabilities of current hardware and software, these and other approaches do not take advantage of advances in computer graphics or interaction paradigms to improve result visualization. Indeed, to the extent of our knowledge, the single approach that took advantage of immersive virtual environments for information retrieval was Nakazato's 3D MARS [10]. Recent advances in post-WIMP (windows, icons, menus, pointer) human-computer interaction (HCI) paradigms brought new possibilities for multi-modal interaction. Devices like the Nintendo Wiimote or Microsoft Kinect and 3DTV sets provide low-cost approaches that brought immersive experiences from the labs into our homes.

In this context, the system by Holz and Wilson [7] allows users to describe spatial objects through gestures. The system captures gestures with a Kinect camera and then finds the most closely matching object in a database of physical objects. This presents a good combination of new interaction paradigms in 3D object retrieval. However, it does not address query result visualization, since it focuses on query specification, the first issue we highlighted in the previous section.

3 IM-O-RET

Taking advantage of the new paradigms in HCI, we propose an immersive VR system for 3D object retrieval (Im-O-Ret). Here, the query results are displayed in a three-dimensional space as 3D objects, instead of the traditional list of thumbnails. The user can then explore the results, navigate in the three-dimensional space, and manipulate the scattered objects. The combined use of virtual environments and devices with six degrees of freedom (DOF) provides a complete visualization of models with natural interaction, enabled by direct object manipulation, as illustrated in Figure 1.

3.1 Spatial Distribution of Results

Query results are distributed in the virtual 3D space according to their similarity. Each axis is assigned a different shape-matching algorithm, and the coordinate value along that axis is determined by the corresponding similarity to the query. When performing a search, the query model is used to find similar models using each assigned algorithm. The results are then merged, giving a 3D position for each retrieved model.
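The axis-per-algorithm placement above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the descriptor functions, model names, and similarity values are hypothetical stand-ins for the actual shape-matching algorithms.

```python
# Sketch: placing retrieved models in 3D, one shape-matching
# algorithm per axis. Descriptors and scores are illustrative.

def place_results(query, models, descriptors):
    """Assign each retrieved model a 3D position whose x, y, z
    coordinates are its similarity to the query under the
    descriptor mapped to that axis."""
    assert len(descriptors) == 3  # one algorithm per axis
    return {m: tuple(sim(query, m) for sim in descriptors)
            for m in models}

# Toy stand-ins for three descriptors, each returning a
# similarity in [0, 1] (higher means more similar).
lightfield = lambda q, m: 0.9 if m == "chair_a" else 0.2
histogram  = lambda q, m: 0.8 if m == "chair_a" else 0.3
harmonics  = lambda q, m: 0.7 if m == "chair_a" else 0.1

pos = place_results("chair_query", ["chair_a", "lamp_b"],
                    [lightfield, histogram, harmonics])
print(pos["chair_a"])  # → (0.9, 0.8, 0.7)
```

With this layout, models similar to the query under all three algorithms cluster near one corner of the space, while models matching only one descriptor spread along that descriptor's axis.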


IEEE Virtual Reality 2012, 4-8 March, Orange County, CA, USA. 978-1-4673-1246-2/12/$31.00 ©2012 IEEE

Figure 1: Immersive 3DOR experience

In the current version of Im-O-Ret, the Light-Field Descriptors [3], the Coord and Angle Histogram [11], and the Spherical Harmonics Descriptor [8] are used, since each targets a different set of features [12]. However, this estimation mechanism can be adapted to specific domains, improving distribution precision.

3.2 Multimodal Interaction

To complement the pointing and direct manipulation of objects, we use voice commands to specify actions. For instance, to use an object as a query, the user points to it and says "search similar to this", thus triggering a new query. Furthermore, our system can be configured to work with a wide range of interaction devices and displays (e.g., 3DTV, display walls, HMD glasses). By combining different visualization and interaction devices, we can create multiple interaction paradigms for our system.
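The pointing-plus-voice combination described above amounts to fusing two input streams: the pointing device keeps track of the currently referenced object, and a recognized voice phrase triggers an action on it. A minimal sketch of that dispatch logic, with hypothetical class, phrase, and handler names:

```python
# Sketch of multimodal dispatch: pointing selects the referent,
# voice selects the action. Names and phrases are assumptions,
# not the paper's actual API.

class MultimodalController:
    def __init__(self, search_fn):
        self.search = search_fn      # query-by-example backend
        self.pointed_object = None   # updated by the pointing device

    def on_point(self, obj):
        """Called whenever the pointing device hovers an object."""
        self.pointed_object = obj

    def on_voice(self, phrase):
        """Called with a recognized voice phrase."""
        if phrase == "search similar to this" and self.pointed_object:
            # the pointed object becomes the query-by-example
            return self.search(self.pointed_object)
        return None

ctrl = MultimodalController(search_fn=lambda q: [q + "_match"])
ctrl.on_point("mug_12")
print(ctrl.on_voice("search similar to this"))  # → ['mug_12_match']
```

Keeping the referent and the command separate is what lets the same voice vocabulary work unchanged across different pointing devices.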

3.3 Exploration of Query Results

To explore the query results, users navigate in the 3D space and directly manipulate the objects, as depicted in Figure 2. We divided navigation into two types based on movement, as suggested by Bowman et al. [2]. The user can either navigate freely through the models, using the Wiimote Nunchuk or tracking tools, or navigate by points of interest, where the camera is moved to the referenced object using a combination of pointing and voice commands.
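For the point-of-interest mode, one common way to move the camera to the referenced object is to glide it a fixed fraction of the remaining distance each frame. The sketch below illustrates that idea; the step size and the simple per-coordinate interpolation are assumptions for illustration, not the system's documented behavior.

```python
# Sketch of point-of-interest navigation: each frame the camera
# covers a fraction of the remaining distance to the target, so
# it decelerates smoothly as it arrives.

def fly_to(camera, target, step=0.25):
    """Return the camera position after one frame of travel."""
    return tuple(c + step * (t - c) for c, t in zip(camera, target))

camera, target = (0.0, 0.0, 0.0), (4.0, 0.0, 8.0)
for _ in range(3):
    camera = fly_to(camera, target)
print(camera)  # → (2.3125, 0.0, 4.625)
```

Because each frame closes a constant fraction of the gap, the motion is fast when the object is far and gentle on approach, which helps the user keep spatial orientation during travel.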

4 CONCLUSIONS

We believe that recent advances in low-cost post-WIMP enabling technology can be seen as an opportunity to overcome some drawbacks of current multimedia information retrieval solutions. Focusing on 3DOR, we presented a solution for query result visualization and exploration.

Im-O-Ret offers users an immersive virtual environment for browsing the results of a query to a collection of 3D objects. The query results are displayed as 3D models in a 3D space, instead of the traditional list of thumbnails. The user can explore the results, navigating in that space and directly manipulating the objects. Although exploratory, we believe the approach presented here is fertile ground for search engines for multimedia information retrieval, not only for 3D objects, but also for image, audio, and video.

Figure 2: Exploring a query result

ACKNOWLEDGEMENTS

The work described in this paper was partially supported by the Portuguese Foundation for Science and Technology (FCT) through the project 3DORuS, reference PTDC/EIA-EIA/102930/2008, and by the INESC-ID multiannual funding, through the PIDDAC Program funds.

REFERENCES

[1] T. F. Ansary, J.-P. Vandeborre, and M. Daoudi. 3D-model search engine from photos. In Proc. of the 6th ACM Int. Conf. on Image and Video Retrieval, CIVR '07. ACM, 2007.

[2] D. A. Bowman, E. T. Davis, L. F. Hodges, and A. N. Badre. Maintaining spatial orientation during travel in an immersive virtual environment. Presence: Teleoperators and Virtual Environments, pages 618-631, 1999.

[3] D.-Y. Chen, X.-P. Tian, Y.-T. Shen, and M. Ouhyoung. On visual similarity based 3D model retrieval. In EUROGRAPHICS 2003 Proceedings, volume 22, pages 223-232, 2003.

[4] R. Datta, D. Joshi, J. Li, and J. Z. Wang. Image retrieval: Ideas, influences, and trends of the new age. ACM Comput. Surv., 40:5:1-5:60, 2008.

[5] H. Dutagaci, C. P. Cheung, and A. Godil. A benchmark for best view selection of 3D objects. In Proc. of the ACM Workshop on 3D Object Retrieval, 3DOR '10, pages 45-50. ACM, 2010.

[6] T. Funkhouser, P. Min, M. Kazhdan, J. Chen, A. Halderman, D. Dobkin, and D. Jacobs. A search engine for 3D models. ACM Trans. Graph., 22:83-105, 2003.

[7] C. Holz and A. Wilson. Data miming: inferring spatial object descriptions from human gesture. In Proc. of the 2011 Annual Conf. on Human Factors in Computing Systems, CHI '11, pages 811-820. ACM, 2011.

[8] M. Kazhdan, T. Funkhouser, and S. Rusinkiewicz. Rotation invariant spherical harmonic representation of 3D shape descriptors. In Proc. of the 2003 Eurographics Symposium on Geometry Processing, SGP '03, pages 156-164. Eurographics, 2003.

[9] J. Lee and T. Funkhouser. Sketch-based search and composition of 3D models. In EUROGRAPHICS Workshop on Sketch-Based Interfaces and Modeling, June 2008.

[10] M. Nakazato and T. S. Huang. 3D MARS: Immersive virtual reality for content-based image retrieval. In Proc. of the 2001 IEEE Int. Conference on Multimedia and Expo (ICME 2001), 2001.

[11] E. Paquet and M. Rioux. Nefertiti: a query by content software for three-dimensional models databases management. In Proc. of the Int. Conf. on Recent Advances in 3-D Digital Imaging and Modeling, NRC '97, pages 345-. IEEE Computer Society, 1997.

[12] J. W. Tangelder and R. C. Veltkamp. A survey of content based 3D shape retrieval methods. Multimedia Tools Appl., 39:441-471, 2008.

[13] M. M. Zloof. Query-by-example: A data base language. IBM Systems Journal, 16(4):324-343, 1977.
