
[IEEE 2014 IEEE Virtual Reality (VR) - Minneapolis, MN, USA (2014.03.29-2014.04.2)] 2014 IEEE Virtual Reality (VR) - AR jigsaw puzzle with RGB-D based detection of texture-less pieces

AR Jigsaw Puzzle with RGB-D Based Detection of Texture-Less Pieces

João Paulo Lima
DEINFO-UFRPE, Brazil / Voxar Labs, CIn-UFPE, Brazil

João Marcelo Teixeira
DEINFO-UFRPE, Brazil / Voxar Labs, CIn-UFPE, Brazil

Veronica Teichrieb
Voxar Labs, CIn-UFPE, Brazil

ABSTRACT

This demo presents an AR application that helps the user to solve a jigsaw puzzle that consists of non-textured pieces with a discriminative shape. The pieces are detected, their poses are estimated and the ones that are correctly assembled are highlighted. In order to detect the pieces, the Depth-Assisted Rectification of Contours (DARC) method is used, which performs detection and pose estimation of texture-less planar objects using an RGB-D camera.

Keywords: Jigsaw puzzles, pose estimation, texture-less objects, augmented reality, RGB-D cameras.

Index Terms: I.4.8 [Image Processing and Computer Vision]: Scene Analysis—Depth Cues, Range Data, Tracking; H.5.1 [Information Interfaces and Presentation]: Multimedia Information Systems—Artificial, Augmented, and Virtual Realities

1 DEMO DESCRIPTION

In this demonstration, an AR jigsaw puzzle is created by performing detection and pose estimation of texture-less pieces with arbitrary discriminative shapes. The pieces are detected using the Depth-Assisted Rectification of Contours (DARC) method [6][7]. DARC is a technique for detection and pose estimation of texture-less planar objects, which are not handled by most existing object detection methods. It uses the depth information available in consumer devices such as the Microsoft Kinect to obtain a rectified representation of contours extracted from the RGB image. This normalized representation is invariant to rotation, scale and perspective distortions, and is obtained by transforming the contour points to a canonical view. Once the contours are rectified, they can be matched directly by computing their similarity with the Hamming distance. This allows finding correspondences between contours extracted from a query image and previously obtained rectified contours from a single template image of each object, without needing to compute perspective warps of the reference images. Based on these correspondences, accurate pose estimation and augmentation of texture-less planar objects becomes possible in real time. To the best of the authors' knowledge, no other method focusing on 6DOF pose estimation of planar texture-less objects makes use of depth images.
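The matching step described above can be sketched as follows. This is a minimal illustration, not the DARC implementation: it assumes each rectified contour has already been encoded as a fixed-length binary descriptor (here an 8-bit integer), and the names `hamming` and `best_match` as well as the district keys are hypothetical.

```python
# Sketch of DARC-style matching: rectified contours encoded as binary
# descriptors are compared with the Hamming distance. Descriptor length
# and all names are illustrative assumptions, not the paper's parameters.

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")

def best_match(query: int, templates: dict) -> tuple:
    """Return (name, distance) of the template descriptor closest to the query."""
    return min(((name, hamming(query, desc)) for name, desc in templates.items()),
               key=lambda pair: pair[1])

# Hypothetical template descriptors, one per puzzle piece.
templates = {"pina": 0b10110010, "boa_viagem": 0b01101101}
print(best_match(0b10110011, templates))  # ('pina', 1)
```

Because the representation is already rectified, a single template descriptor per object suffices; no perspective warping of the reference image is needed at query time.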

A puzzle can be seen as a graph where the vertices correspond to the pieces and the edges represent connections between pieces. This graph must be provided to the application. In order to determine whether two pieces fit together, the relative positions of the template points that belong to each pair of connecting pieces are learnt beforehand. Using this information, it is possible to obtain, for a given piece, the expected positions of the template points of each neighboring piece. The expected pose is compared with the actual pose of a piece by calculating the RMS error between the expected and actual locations of the template points that belong to that piece. A pair of pieces is considered correctly assembled when the RMS reprojection error is below a given threshold in pixels.
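The assembly check can be sketched in a few lines. This is an illustrative reconstruction under stated assumptions: the graph keys, point coordinates, and the 5-pixel threshold are all hypothetical (the paper's actual threshold value is not given in this copy).

```python
import math

# Sketch of the assembly check: the puzzle is a graph of pieces, and a pair
# of connecting pieces is accepted when the RMS error between expected and
# detected template-point positions falls below a pixel threshold.
# District names and threshold_px=5.0 are assumptions for this sketch.

PUZZLE_GRAPH = {"pina": ["boa_viagem"], "boa_viagem": ["pina"]}  # vertices: pieces

def rms_error(expected, actual):
    """RMS distance in pixels between expected and detected template points."""
    sq = [(ex - ax) ** 2 + (ey - ay) ** 2
          for (ex, ey), (ax, ay) in zip(expected, actual)]
    return math.sqrt(sum(sq) / len(sq))

def correctly_assembled(expected, actual, threshold_px=5.0):
    """True when the piece pair's reprojection error is under the threshold."""
    return rms_error(expected, actual) < threshold_px

expected = [(10.0, 10.0), (20.0, 10.0)]   # template points predicted from a neighbor
actual   = [(11.0, 10.0), (20.0, 11.0)]   # template points of the detected piece
print(rms_error(expected, actual))  # 1.0
```

In the application, `expected` would be computed from the detected pose of a neighboring piece via the pre-learnt relative positions, and the green/red highlighting would follow directly from the boolean result.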

The jigsaw puzzle used in the application consists of a map of the south region of Recife, capital of the state of Pernambuco, Brazil. This map was obtained from the open data portal of the Recife City Council [8]. The south region of Recife has eight districts and each district map is used as a puzzle piece, as depicted in Figure 1. This way, all the pieces detected by the application are non-textured and have an arbitrary shape. The pieces that are correctly assembled are highlighted in green, while the other ones are highlighted in red. In addition, the application also draws the name of each detected district over the corresponding piece¹. A screenshot of the application is shown in Figure 2.

Figure 1: Map of districts of the south region of Recife, which was used as a jigsaw puzzle by the AR application.

Figure 2: AR jigsaw puzzle application using DARC, where the correctly assembled pieces are highlighted in green and the other ones are highlighted in red.

1 URL of the demo video: http://goo.gl/gnJO4g

email: {jpsml, jmxnt, vt}@cin.ufpe.br


IEEE Virtual Reality 2014, 29 March - 2 April, Minneapolis, Minnesota, USA. 978-1-4799-2871-2/14/$31.00 ©2014 IEEE


2 RESEARCH LAB DESCRIPTION

Voxar Labs is a research group that develops and transfers technology related to visualization, tracking and interaction techniques, focusing on augmented reality in multi-disciplinary application domains. The laboratory is located in the Informatics Center (CIn) building, at the Recife campus of the Federal University of Pernambuco (UFPE). The group has many ongoing projects, including international cooperation efforts, industry partnerships, and academic research.

Since 2005, a group of researchers with common interests has been investigating virtual reality and augmented reality technologies at CIn, leading to the creation of Voxar Labs. This team, headed by Professor Veronica Teichrieb, is composed of multi-disciplinary researchers, from professors to PhD, master and undergraduate students, currently numbering twenty-five. They work in diverse knowledge domains such as computer science, computer engineering and design, as well as application-driven areas such as education and physiotherapy.

The laboratory's mission is developing people by augmenting experiences. The values representing the core priorities of its culture are creativity, cooperation, reliability, responsibility, flexibility, and enjoyment. Detailed information about Voxar Labs can be found at http://www.cin.ufpe.br/voxarlabs.

2.1 Research Areas

Voxar Labs performs research on three major subjects, which are visualization, tracking and interaction, focusing on augmented reality. The laboratory team has been involved with augmented reality research for almost nine years.

2.1.1 Visualization

Ideally, augmented reality proposes that the user should not be able to distinguish between real and virtual information, demanding that the virtual elements exhibit both geometric (correct placement, correct size, occlusion identification) and photometric (shadowing, mutual reflections, chromatic adaptation to scene illumination) consistency. To accomplish this, Voxar Labs puts great effort into real-time computer graphics algorithms [10] for massive data visualization and photorealistic rendering. Other projects on this subject aim to build tools for 3D reconstruction from images and, in collaboration with the Brazilian Navy, for aiding the hydrodynamic design of submerged ships advancing and maneuvering, with physics simulation.

2.1.2 Tracking

The problem of correctly positioning virtual information relative to the real environment is solved by tracking the environment, so that the synthetic elements can be adequately registered with the real scene. Among the diverse tracking technologies available, optical tracking is often used due to its low cost, accuracy and robustness. Two types of optical tracking can be cited: marker-based and markerless; Voxar Labs conducts research on both. For the ARBlocks [9], a frame marker was proposed and its tracker was developed. Markerless augmented reality has received more attention from researchers in recent years and presents important challenges to be overcome. Three examples of markerless tracking work at the laboratory are: a hand tracking technique [1]; a texture-based 3D tracking algorithm, part of a markerless augmented reality framework developed at Voxar Labs that was successfully tested during the tracking competition at ISMAR 2012 [11]; and a method for generating projective rectifications of patches around keypoints using RGB-D sensors in order to improve keypoint matching of perspectively distorted textured images [5].

2.1.3 Natural Interaction

Natural interaction is a powerful tool for achieving intuitiveness and usability in human-computer interfaces. In fact, interfaces are constantly evolving to provide users with easier ways to interact with machines. In this field, Voxar Labs is conducting research focused on natural interaction applied to diverse application domains. One of them is education, with a dynamic blocks platform using tangible user interfaces [9]. Another is physiotherapy, with an environment for rehabilitation and accessibility purposes based on virtual and augmented reality techniques using the Microsoft Kinect [4]. The group also researches natural gestures for in-place, effortless navigation in large industrial scenarios [2] and improved viewing experiences of 3D content using a Fishtank technique [3].

3 ACKNOWLEDGMENTS

The authors would like to thank CAPES/INRIA/CONICYT STIC-AmSud project ARVS and CNPq (process 141705/2010-8) for partially funding this research.

REFERENCES

[1] L. Figueiredo, J. Lindoso, R. Roberto, V. Teichrieb, R. Anjos Filho, E. Vilar Neto, and M. Silva. Bare hand natural interaction with augmented objects. In ISMAR Works in Progress Talks, ISMAR ’13, 6 p., Adelaide, Australia, 2013.

[2] L. Figueiredo, M. Pinheiro, E. Vilar Neto, T. Menezes, J. Teixeira, and V. Teichrieb. In-place natural and effortless navigation for large industrial scenarios. In HCII ’14, Crete, Greece, 2014 (to be published).

[3] L. Figueiredo, E. Vilar Neto, E. Arruda, J. Teixeira, and V. Teichrieb. Fishtank everywhere: improving viewing experience over 3D content. In HCII ’14, Crete, Greece, 2014 (to be published).

[4] A. da Gama, T. Chaves, L. Figueiredo, and V. Teichrieb. Improving motor rehabilitation process through a natural interaction based system using Kinect sensor. In 3DUI ’12, pages 145–146, Costa Mesa, California, 2012.

[5] J. Lima, F. Simões, H. Uchiyama, V. Teichrieb, and E. Marchand. Depth-assisted rectification of patches: using RGB-D consumer devices to improve real-time keypoint matching. In VISAPP ’13, pages 651–656, Barcelona, Spain, 2013.

[6] J. Lima, H. Uchiyama, V. Teichrieb, and E. Marchand. Object detection and pose estimation from natural features using consumer RGB-D sensors: applications in augmented reality. In ISMAR Doctoral Consortium, ISMAR ’12, 4 p., Atlanta, Georgia, 2012.

[7] J. Lima, H. Uchiyama, V. Teichrieb, and E. Marchand. Texture-less planar object detection and pose estimation using depth-assisted rectification of contours. In ISMAR ’12, pages 297–298, Atlanta, Georgia, 2012.

[8] Recife City Council. Open data portal. URL: http://dados.recife.pe.gov.br/. Last access: January 2014.

[9] R. Roberto, D. Freitas, F. Simões, and V. Teichrieb. A dynamic blocks platform based on projective augmented reality and tangible interfaces for educational activities. In SVR ’13, pages 1–9, Cuiabá, Brazil, 2013.

[10] A. Santos, J. Teixeira, T. Farias, V. Teichrieb, and J. Kelner. Understanding the efficiency of kD-tree ray-traversal techniques over a GPGPU architecture. In IJPP, volume 40, issue 3, pages 331–352, 2012.

[11] F. Simões, R. Roberto, L. Figueiredo, J. Lima, M. Almeida, and V. Teichrieb. 3D tracking in industrial scenarios: a case study at the ISMAR tracking competition. In SVR ’13, pages 97–106, 2013.
