VirtualizeMe: Real-time Avatar Creation for Tele-Immersion Environments

Daniel Knoblauch∗, University of California, San Diego
Pau Moreno Font†, La Salle, Universitat Ramon Llull
Falko Kuester‡, University of California, San Diego
ABSTRACT
VirtualizeMe introduces a new design for a fully immersive Tele-Immersion system for remote collaboration and virtual world interaction. The system introduces a new avatar creation approach fulfilling four main attributes: high resolution, scalability, flexibility and affordability. This is achieved by completely separating reconstruction from rendering and by exploiting the capabilities of modern graphics cards. High resolution is achieved by using as much of the input information as possible, through lossless compression of the input data and a focused volumetric visual hull reconstruction. The resulting avatar allows eye-to-eye collaboration between remote users. Interaction with the virtual world is facilitated by the volumetric avatar model and allows a fully immersive system. This paper shows a proof of concept based on publicly available pre-recorded data to allow easier comparison.
Index Terms: I.4.8 [Image Processing and Computer Vision]: Scene Analysis—Shape; I.3.7 [Computer Graphics]: Three-Dimensional Graphics and Realism—Virtual Reality; H.5.1 [Information Interfaces and Presentation]: Multimedia Information Systems—Artificial, augmented, and virtual realities
1 INTRODUCTION
The objective of Tele-Immersion systems is to allow remote users to interact with each other and with virtual worlds. This can be achieved by allowing face-to-face, viewpoint-corrected interaction with the virtual environment via avatars representing the users. The main constituent parts of such systems are the acquisition and extraction of avatars and their integration and rendering in virtual worlds. For remote collaboration to work, adequate data compression and networking have to be integrated. The VirtualizeMe system, shown in Figure 1, introduces an approach in which avatar reconstruction and its integration into the virtual world are modularized to increase flexibility and scalability.

Acquisition and extraction of avatars has been studied in several projects. Blue-c [1] and Hasenfratz et al. [2] introduced relatively low-resolution visual hull reconstructions. The Tele-Immersion project at UC Berkeley [5] produces avatars based on 12 stereo rigs, allowing the reconstruction of concave objects but introducing more noise into the reconstruction. VirtualizeMe introduces a focused volumetric visual hull reconstruction with color extraction [3] for high spatial resolution avatar creation.
∗e-mail: [email protected]  †e-mail: [email protected]  ‡e-mail: [email protected]

Figure 1: VirtualizeMe system diagram, separating avatar creation and virtual world interaction and rendering to allow high flexibility.

Existing techniques produce avatars in the form of point clouds, polynomials or polygonal meshes, requiring different rendering strategies. Most systems that produce point data [1][5] use point-based rendering. The Tele-Immersion project at UC Berkeley [5] recently introduced an alternative based on triangulation. Hasenfratz et al. [2] perform marching-cubes-based smoothing of the visual hull surface, but do not use any color information for the rendering. The VirtualizeMe approach uses a marching cubes approach optimized for structured surface point clouds, with color interpolation performed in geometry shaders on the GPU. The structured surface point cloud consists of the midpoints of the surface voxels obtained from the focused visual hull reconstruction.
2 SYSTEM OVERVIEW
The VirtualizeMe system (Figure 1) is designed to separate avatar creation from virtual world rendering. The avatar reconstruction pipeline is split into acquisition and reconstruction stages, which are mapped to multiple acquisition nodes and a single reconstruction node. Each acquisition node can capture a variable number of camera feeds. All image-based calculations, such as color silhouette extraction, are performed locally and transferred to the reconstruction stage after lossless, background-aware compression is applied to retain the best possible input resolution for the reconstruction.

High-quality avatars are extracted using a focused visual hull approach, providing higher spatial resolution and a near one-to-one voxel-to-input-pixel ratio. This approach focuses the voxel grid on the target object, which increases the spatial resolution of the visual hull and reduces the total number of evaluated voxels for faster reconstructions [3]. Because of the high spatial resolution of the reconstruction, only surface voxels are transferred to the remote rendering node. The design of the avatar creation module is shown in Figure 2.
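The paper does not include code for this stage; the following is a minimal CPU sketch of the two ideas above, written in NumPy with illustrative function names. A voxel survives the visual hull test only if it projects inside every camera's foreground silhouette, and only voxels with at least one empty 6-neighbor are kept as surface voxels for transfer. The focusing of the grid on the target object is represented here simply by the `grid_origin`/`voxel_size` parameters; the actual GPU implementation and camera calibration details are not shown.

```python
import numpy as np

def carve_visual_hull(silhouettes, projections, grid_origin, voxel_size, dims):
    """Illustrative visual hull carving: keep a voxel only if its centre
    projects inside the foreground silhouette of every camera.
    `silhouettes` are boolean HxW masks, `projections` are 3x4 camera matrices."""
    occupied = np.ones(dims, dtype=bool)
    # Voxel centres in world coordinates (grid focused on the target object).
    idx = np.indices(dims).reshape(3, -1).T
    centres = grid_origin + (idx + 0.5) * voxel_size
    homog = np.hstack([centres, np.ones((len(centres), 1))])
    for mask, P in zip(silhouettes, projections):
        pix = homog @ P.T                          # project into the image
        uv = (pix[:, :2] / pix[:, 2:3]).astype(int)
        h, w = mask.shape
        inside = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
        keep = np.zeros(len(uv), dtype=bool)
        keep[inside] = mask[uv[inside, 1], uv[inside, 0]]
        occupied &= keep.reshape(dims)
    return occupied

def surface_voxels(occupied):
    """Only voxels with at least one empty 6-neighbour are surface voxels."""
    padded = np.pad(occupied, 1, constant_values=False)
    neighbours = np.zeros(occupied.shape, dtype=int)
    for axis in range(3):
        for shift in (-1, 1):
            neighbours += np.roll(padded, shift, axis=axis)[1:-1, 1:-1, 1:-1]
    return occupied & (neighbours < 6)
```

Transferring only the surface voxels (their midpoints and colors) is what keeps the network payload small enough for the remote rendering node.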
The input for the rendering node consists of position and color information for each surface voxel. The output is a mesh created with a marching cubes approach. The system converts the input data into a 3D matrix of boolean voxel information. Instead of sending the entire matrix to the GPU to generate the mesh, only the surface points are sent as vertices. Each vertex's color, together with the eight adjacent voxels and their respective colors, is transferred as multi-texture coordinates. The process is repeated for each adjacent voxel. This data is then processed in the geometry shader following the original marching cubes idea. Colors are evaluated by linear interpolation.
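The per-cube work that the paper assigns to the geometry shader can be illustrated on the CPU. This hypothetical Python sketch shows the two core operations: packing the eight corner occupancies (the voxel and its neighbors, as passed in the vertex attributes) into a standard marching-cubes case index, and linearly interpolating a vertex color between two adjacent corners. The function names are illustrative, and the full 256-entry triangle table lookup is omitted.

```python
def cube_index(corners):
    """Pack eight boolean corner occupancies into a marching-cubes
    case index in [0, 255], one bit per corner."""
    index = 0
    for bit, occupied in enumerate(corners):
        if occupied:
            index |= 1 << bit
    return index

def interpolate_color(c0, c1, t=0.5):
    """Linear colour interpolation between two corner colours, as done
    when a mesh vertex is placed on a cube edge."""
    return tuple((1 - t) * a + t * b for a, b in zip(c0, c1))
```

In the actual system this index would select a triangle configuration from the classic marching-cubes lookup table inside the geometry shader, which emits the triangles directly without any CPU round trip.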
IEEE Virtual Reality 2010, 20-24 March, Waltham, Massachusetts, USA. 978-1-4244-6238-4/10/$26.00 ©2010 IEEE
Figure 2: VirtualizeMe avatar creation setup, consisting of 8 cameras, two acquisition nodes and one reconstruction node. This layout allows total separation of acquisition and reconstruction.
2.1 Results

As the focused visual hull avatar creation is independent of the number of input cameras and the acquisition environment, a test environment based on the visual hull data of Vlasic et al. [4] has been introduced to create verifiable and repeatable results. Background extraction for all eight input cameras is done on one acquisition node, and the resulting color silhouettes are transferred to the reconstruction node. The resulting reconstruction is sent to a rendering node. The visual results for a 256³ focused voxel grid are shown in Figure 3. The focused visual hull improves the spatial resolution of the avatar, and the introduced rendering approach improves the visual results of the avatar even further. Performance tests show that avatar creation runs at 8 fps with a 256³ focused voxel grid and at 30 fps with a 128³ grid. Avatar rendering, including simple interaction with the virtual world, exceeds these speeds. Thanks to the volumetric visual hull approach, this interaction consists of simple collisions with virtual spheres. This proof of concept opens the door to more sophisticated interaction with virtual objects. Figure 4 shows the rendering result on a tiled-display Tele-Immersion system.
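The sphere collision test mentioned above is straightforward with a volumetric avatar; a minimal sketch, assuming the avatar is represented by its surface-voxel midpoints (the function name and the half-voxel inflation are illustrative choices, not taken from the paper):

```python
import numpy as np

def sphere_voxel_collision(surface_points, sphere_center, sphere_radius, voxel_size):
    """Illustrative avatar/sphere interaction test: the volumetric avatar
    collides with a virtual sphere when any surface-voxel midpoint lies
    within the sphere, inflated by half a voxel diagonal to account for
    voxel extent."""
    pts = np.asarray(surface_points, dtype=float)
    reach = sphere_radius + 0.5 * voxel_size * np.sqrt(3.0)
    d2 = np.sum((pts - np.asarray(sphere_center, dtype=float)) ** 2, axis=1)
    return bool(np.any(d2 <= reach * reach))
```

Because the test is a per-point distance check, it scales linearly with the number of surface voxels and runs comfortably within the reported rendering frame budget.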
3 CONCLUSION AND FUTURE WORK
This paper introduced the design of a fully immersive Tele-Immersion system for remote collaboration and interaction with the virtual world. The system has been designed to fulfill the following attributes: high-quality avatar creation in real time, scalability and flexibility, while also being cost effective. The flexibility of the system is based on the total separation of avatar creation and rendering. The system consists of high-quality focused visual hull based avatar creation; networking between acquisition, reconstruction and rendering nodes with lossless compression algorithms; and efficient marching-cubes-based rendering of the resulting structured surface point clouds. To create verifiable and repeatable results, a publicly available set of video sequences is used to create deterministic input for the processing pipeline. Tests performed with this data serve as a proof of concept and show the real-time capabilities of the system.
ACKNOWLEDGEMENTS
This publication is based in part on work supported by Award No. US 2008-107, made by King Abdullah University of Science and Technology (KAUST).
Figure 3: a) Unfocused visual hull; b) Focused visual hull; c) Marching-cubes-based rendering.
Figure 4: Avatar rendering on tiled display Tele-Immersion system.
REFERENCES
[1] M. Gross, S. Wurmlin, M. Naef, E. Lamboray, C. Spagno, A. Kunz, E. Koller-Meier, T. Svoboda, L. V. Gool, S. Lang, K. Strehlke, A. V. Moere, and O. Staadt. blue-c: A spatially immersive display and 3D video portal for telepresence. In Proceedings of ACM SIGGRAPH, pages 819–827, 2003.
[2] J. Hasenfratz, M. Lapierre, and F. Sillion. A real-time system for full body interaction with virtual worlds. In Eurographics Symposium on Virtual Environments, pages 147–156, 2004.
[3] D. Knoblauch and F. Kuester. Focused volumetric visual hull with color extraction. In 5th International Symposium on Visual Computing (ISVC '09), accepted for publication, 2009.
[4] D. Vlasic, I. Baran, W. Matusik, and J. Popovic. Articulated mesh animation from multi-view silhouettes. ACM Trans. Graph., 27(3):1–9, 2008.
[5] W. Wu, Z. Yang, K. Nahrstedt, G. Kurillo, and R. Bajcsy. Towards multi-site collaboration in tele-immersive environments. In MULTIMEDIA '07: Proceedings of the 15th International Conference on Multimedia, pages 767–770, New York, NY, USA, 2007. ACM.